Thursday, January 10, 2019

The cloud vs. open source redux

If you’re reading this, you’re probably aware that there is a fracas going on around open source licensing. Quite a bit has been written on the topic and I won’t rehash the specific details here; they have been well covered by:

However, to net l’affaire out, long-simmering issues associated with building businesses on the back of open source software are boiling over. In particular, cloud providers like Amazon Web Services (AWS) not only rely on vast amounts of open source software to run their infrastructure but also increasingly offer cloud services that directly compete with the companies that created much of that open source software in the first place. Furthermore, there’s a widespread (and largely justified) perception that some of these providers are taking far more from the open source commons than they’re giving back.

Some thoughts.

This is not a new concern

As an industry analyst, I wrote a research paper titled “The Cloud vs. Open Source” in 2008. That’s just two years after AWS debuted. Much of the paper cautions against getting too fixated on source code when thinking about user freedoms and openness generally. This remains true today and is the subject of an entire chapter of my book How Open Source Ate Software that I published last year.

However, I also argued that throwing up roadblocks to making use of open source software was ultimately unproductive.

Today, Open Source is widely embraced by all manner of technology companies because they’ve found that, for many purposes, Open Source is a great way to engage with developer and user communities—and even with competitors. Therefore, the concern that, left to their own devices, companies will wholesale strip-mine Open Source projects and “take it all private” seems anachronistic. That’s not to say that everyone will always contribute as much code without copyleft as with it, but the suggestion that copyleft is all that’s holding the whole Open Source process together just doesn’t square with the facts.

Was I just wrong?

Now, at this point, you might turn around and say: “But wholesale strip-mining is exactly what’s happening. We need even stronger protections if the commons is not to be ruthlessly exploited!”

One problem is that all the evidence suggests this doesn’t work. Permissive licenses like Apache, MIT, and BSD have gained in popularity over time. There’s a reason for this. Much of modern open source’s success isn’t about the ability to view source code. It’s about its collaborative development model. And the Eclipse Foundation's Ian Skerrett argues that "projects use a permissive license to get as many users and adopters, to encourage potential contributions. They aren't worried about trying to force anyone. You can't force anyone to contribute to your project; you can only limit your community through a restrictive license."

Another data point is the AGPL. At the time the new version of the copyleft GPL came out (GPLv3 in 2007), the Affero General Public License was introduced as a new GPL variant. Copyleft basically says that if you distribute software, you have to make the source code available. This includes any changes you made. The rub is that, under the GPL’s terms, “distributing” basically means shipping software on a disc or offering it for download. This creates what some saw as a loophole because offering the software as a cloud service isn’t distribution as traditionally defined.

(Is this starting to sound familiar? I told you none of this was new.)

Enter the AGPL, which was just like the GPL except the definition of distribution was broadened to include offering software as a service.

However, the AGPL hasn’t been much used. Ironically, one of its users was MongoDB, which is one of the companies that have now relicensed their software to prevent its use by cloud providers. In general, lots of companies are nervous that the AGPL could potentially interact with internal code that they don’t want to make publicly available. So it’s often on the license no-fly list for application development.

Which is all to say that the overall direction in open source has been away from restrictive licenses. Leaving aside whether an even more restrictive license could still be reasonably considered “open source,” there just seems to be very little appetite for such a creature.

Words matter

In my view, a lot of the heat around licenses like the Commons Clause comes about because the companies involved seem to be trying to gain the perceived value of a proprietary license while also getting credit for still being open source. “Open core” arguably plays the same parlor trick.

It still might have been news if one or more of these companies simply relicensed some or all of their software to a license that was unabashedly proprietary even if it retained some aspects of open source. But I suspect it would have been much less of a tempest.

Whether or not doing so would have been a good idea is a separate question. But it’s their software. Their business challenges are real. Own it. If you’re not going to have an open source development model, I’m not sure why you particularly even care if it’s technically open source or not.

Can we make cloud providers do better?

While the software vendors are taking heat from one side, cloud providers are taking it from another. There is indeed a widespread view that most cloud providers are takers rather than givers. AWS, as the #1 cloud provider, takes particular heat. It’s mostly deserved. Although Adrian Cockcroft’s team has arguably moved the needle in making AWS play better with open source communities, much more could be done.

However, publicly shaming Amazon will not be a very effective strategy to drive change. If you can't sell the business value of participating in open source, you've pretty much lost the battle. Shaming might net you some contributions for the PR value but certainly no real commitment. Pinning your hopes on Jeff Bezos’ altruism is not a winning move.

Instead, as Linux Foundation Executive Director Jim Zemlin told me during an interview at the Open Source Leadership Summit last year:

The epiphany that many companies have had over the last three to four years, in particular, has been, "Wow. If I have processes where I can bring code in, modify it for my purposes, and then, most importantly, share those changes back, those changes will be maintained over time.

"When I build my next project or a product, I should say, that project will be in line with, in a much more effective way, the products that I'm building.

"To get the value, it's not just consumed, it is to share back and that there's not some moral obligation, although I would argue that that's also important. There's an actual incredibly large business benefit to that as well." The industry has gotten that, and that's a big change.

In closing

None of this is to dismiss the underlying challenges that these changes came in response to. There will always be challenges at the level of the individual company trying to build a business no matter what the product. But there are more macro dynamics here as well.

This shift of computing towards public clouds recreates a new type of vertically integrated stack. One-time chief technology officer of Sun Microsystems, Greg Papadopoulos, one suspects hyperbolically and with an eye towards something IBM founder Thomas J. Watson probably never said, suggested that “the world only needs five computers,” which is to say there would be “more or less, five hyperscale, pan-global broadband computing services giants” each on the order of a Google.  

Some cloud giants have indeed made significant contributions to open source projects. For example, Google originally created Kubernetes, the leading open source project for managing software containers, based on the infrastructure it had built for its internal use. Facebook has open sourced both software and hardware projects.

But, for the most part, these dominant companies use open source to create what are largely proprietary platforms far more than they reinvest to perpetuate ongoing development in the commons. And they’re sufficiently large and well-resourced that they mostly don’t depend on cooperative invention at this point.

It’s easy to dismiss free-riding as a problem given that organizations are missing out in some ways if they do so. However, to the degree that large tech companies, both cloud providers and others such as Apple, take far more from the open source commons than they contribute back, this at least raises concerns about open source sustainability.

The last section is based in part on content from How Open Source Ate Software (Apress 2018)
