How open source paved the way for computing from anywhere
IBM’s Priya Nagpurkar explains how open-source communities work, why they foster innovation, and whether open technologies will underpin AI’s future.
IBM built its hybrid cloud business with open-source technologies like Linux and Kubernetes. Today, they allow enterprises to process workloads from anywhere, shifting easily between public and private servers in the cloud. Those technologies, plus newer ones like Ray and PyTorch, now play a role in the future of AI.
IBM designed watsonx, its next-generation AI and data platform, to run on the hybrid cloud to meet the demands of modern industry. We caught up with Priya Nagpurkar, who oversees the research driving innovations in IBM’s hybrid cloud platform, to talk about the past, present, and future of open-source technologies.
I did my PhD in programming language runtimes in the 2000s. Java was big back then, and just-in-time compilation was a new concept. IBM had just put Jikes RVM, a virtual machine built for research and testing, out in the open. If you wanted to try a new compilation technique, or just-in-time optimization, you used Jikes RVM. Because it was backed by IBM, you knew that your work would find a commercial application and be useful to real users. As a grad student, you don’t have time to build something from scratch and then create something new on top. That was how the value of open source became clear to me.
It allows you to build on a vast prior body of work and to harness the collective, not just the talent at one company. That's how innovation goes faster. You can build on top of, and benefit from, what others are doing.
Someone from IBM may team up with somebody at Google because we believe in a particular technology. But we also have separate goals. Governance, the structure that allows collaborators to work together, lets you find the common ground. The downside is that things can take more time. Finding agreement in an open community takes longer than convincing the product team at your company. Many more people need to be persuaded that the problem you're trying to solve is the right one and that the solution will have broad value. There’s another positive aspect of governance, especially in AI: Many more people have a chance to find the flaws. Everybody talks about AI for good, but AI can go bad. If the whole community has eyes on a technology, it can keep problems in check.
As mentioned earlier, finding common ground takes time and energy, so that’s one challenge. Another is how you differentiate yourself. If you’re a company, how do you monetize the technology? But we have seen, at least with big systems and platforms, that there’s plenty of opportunity to put technologies together to create a seamless user experience with ongoing support. For IBM, if we have a presence, we can shape the technology and influence it for the benefit of our clients and their use cases. We can leverage the collective innovation for their benefit.
Most people are probably familiar with Linux, the operating system that launched the open-source software movement in the 1990s. IBM was an early supporter and contributor. Earlier, I mentioned Java and Jikes RVM. I worked on two related technologies, OpenWhisk, for serverless computing, and Istio, the service mesh. Both underpin the cloud-native ecosystem available through Red Hat OpenShift, now a high-growth area for IBM. The entire AI-software ecosystem builds on OpenShift. Watsonx will bring these technologies together seamlessly for enterprises to grow productivity. Software is increasingly open, from operating systems and distributed systems to AI frameworks like Ray and PyTorch.
We look for a good match between the problems that we and the open-source community are trying to solve. Is the “problem-solution” fit right? Sometimes we might participate with existing communities. Other times we might incubate a technology and invite others to join. Foundation models are a great example. This was a big opportunity for IBM, Hugging Face, which offers thousands of open-source AI models, and the Ray and PyTorch communities, which are trying to make training and inferencing more efficient. Healthy, open governance is also important. Are other players allowed to participate and influence the outcome, or is a single entity in charge? We want to be able to exercise influence to ensure that IBM’s interests are represented.
They are more like a meritocracy. You must first show that you bring technical value. If you earn the community’s trust as a contributor, you can advance to “maintainer” and make changes. A technical oversight committee often makes the major decisions. If a company like IBM gets involved, we often provide funding and have a seat on the board, which gives us a say in important decisions. When we back and participate in a project officially, it usually means we hope to derive some commercial benefit.
It goes back to finding a way to differentiate. The OpenShift platform is a good example. OpenShift has been integrated with Ray, on top of Kubernetes, and with PyTorch, to run AI jobs. IBM helped to integrate these communities to get the technology to where we needed it. We helped package and seamlessly integrate the technologies to provide a great user experience.
There’s nothing to stop someone from picking up the technology without contributing, but that also means their influence is limited. If they need a new feature, it may not get prioritized. Open governance, the meritocracy I mentioned earlier, has checks and balances. Anyone can take open-source software, but what happens when you need support? Red Hat offers a commercial version of OpenShift, based on Kubernetes, the open-source container-orchestration system. They fix bugs for their clients and merge the changes back to Kubernetes so the community benefits.
There’s value in standardization, especially when a new area of technology is moving fast. It’s not always beneficial to keep a project inside, especially if interest in an open-source project is growing. Containers are a good example. When Kubernetes was put in the open, it led to standardization, which benefited the entire industry.
In the platform world, open and closed can coexist. A classic example is Kubernetes. Instead of having just one place to put workloads, I can optimize the placement of a job by adding a plug-in. If I have a better algorithm, I can plug it in. Everyone's interested because maybe they have a better algorithm to drive your platform. That's perfectly legitimate, to offer a better user experience or way of optimizing the platform and how it's used. That's one concrete example of how both can coexist. Having a platform based on common standards is fine. But now I can customize my own infrastructure and optimize the full stack.
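The plug-in idea can be illustrated with a small sketch. This is not Kubernetes code or its API; it is a toy Python model, and every name in it (Node, Job, place, spread_score, binpack_score) is hypothetical. It only shows the shape of the arrangement: a shared placement routine that everyone uses, plus a scoring function that anyone can swap for their own.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Node:
    name: str
    free_cpu: float      # CPU cores still available (illustrative unit)
    free_mem_gb: float   # memory still available, in GB


@dataclass(frozen=True)
class Job:
    name: str
    cpu: float
    mem_gb: float


# A "plug-in" here is just a function that scores a node for a job.
ScorePlugin = Callable[[Job, Node], float]


def fits(job: Job, node: Node) -> bool:
    """Common rule shared by everyone: the node must have enough room."""
    return node.free_cpu >= job.cpu and node.free_mem_gb >= job.mem_gb


def spread_score(job: Job, node: Node) -> float:
    """Default plug-in: prefer the node with the most CPU left over."""
    return node.free_cpu - job.cpu


def binpack_score(job: Job, node: Node) -> float:
    """Alternative plug-in: prefer the tightest fit to keep big nodes free."""
    return -(node.free_cpu - job.cpu)


def place(job: Job, nodes: List[Node],
          score: ScorePlugin = spread_score) -> Optional[str]:
    """Shared placement routine; only the scoring step is pluggable."""
    candidates = [n for n in nodes if fits(job, n)]
    if not candidates:
        return None
    return max(candidates, key=lambda n: score(job, n)).name


nodes = [Node("a", 8, 32), Node("b", 2, 8), Node("c", 1, 4)]
job = Job("train", cpu=2, mem_gb=4)

default_choice = place(job, nodes)                      # uses spread_score
custom_choice = place(job, nodes, score=binpack_score)  # uses the custom plug-in
```

With the default plug-in the job goes to the roomiest node; with the custom one it goes to the tightest fit. The platform and its rules stay common, and the differentiation lives in the function that gets plugged in.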
Yeah, the secret recipe is that I can give you a better way to customize or tune your models. I can give you better guardrails to ensure that the model outputs are tailored to your needs. These don't have to be strictly open features; they can be value-adds on top.
The big debate now is whether AI will be dominated by a few big players and their humongous models, or whether the larger community will have influence. Will value creation shift from the big models and big model learners to something more community driven? Or will the community team up and figure out how to build larger models? These are big questions. Many predict that the community approach will win because you can leverage many more people. But the hardware investment is huge. I think people will come up with new, innovative ways to derive value beyond creating the big base model. Communities like PyTorch and Ray are already making platform improvements in the open.
Bad governance usually happens when a project is dominated by a single player. A good indicator is whether others can contribute and make decisions. If I've made contributions, can I provide input? Healthy communities often have multiple people making contributions and having influence.
That might be a good analogy. It’s self-governing. I don't know their process intimately, but I think there's a structure for accepting edits.
Yeah, that can happen for multiple reasons. One reason is a poor governance model that makes it risky for others to invest in that community. If a community with poor governance or a single dominant player decides to change the terms or licensing, or doesn't let us submit critical fixes, we can’t risk building on it. If people decide to stop participating, other participants also start looking elsewhere. At IBM Research we try to be strategic about our open-source investments because they're not free. Once you put something out there, you have to invest in the community.
People still don't write it off completely. But from Google’s perspective, will it be worth keeping it going? Will it ever turn the corner, or has PyTorch definitively won? An analogy is cloud-native platforms, where Kubernetes emerged as the winner. I think Mesos has all but disappeared.
Since these AI models are not open, there's a lack of scrutiny from the broader community, making it difficult for others to replicate results and run additional experiments that could improve safety.