Connections

Friday, February 28, 2025

Red Hat OpenShift Virtualization: There's no one workload type

We're at one of the more interesting periods for virtualization in 25 years or so.

That's about the time that Diane Greene, a co-founder of VMware and its CEO, was making the IT industry analyst rounds to talk up VMware's push into the x86 enterprise server virtualization space that didn't really exist at the time. Large Unix servers (and mainframes) had various ways to carve up the systems for different workloads. But x86 servers mostly didn't.

This was a problem, especially on servers running Microsoft Windows, because servers couldn't keep workloads from getting in each other's way. The result was that many servers were only running about 15% utilized or so.

It also just so happened that the bottom was about to drop out of IT spending with the popping of the dot-com bubble. The ability to make less hardware do more work was about to look very attractive. That virtualization didn't really demand a fundamentally different approach to applications and management than physical servers was attractive too.

VMware went through various ownership changes but, overall, it was a great success story.

Competition was limited

One consequence of this success is that potential competition, especially open source Xen and KVM, never got a huge amount of enterprise traction. VMware was just too entrenched.

VMware had also developed a rich set of tools (and a large partner ecosystem) to complement and enhance its core virtualization products. This made its offering even stickier. A potentially cheaper virtualization hypervisor was hard to make a case for in the enterprise.

Not everything was rosy. VMware never made especially great strides with cloud providers. It also arguably underinvested or underexecuted in application management relative to infrastructure management. Nonetheless VMware maintained a strong position.

Enter containers

However, about a decade ago, containers and container orchestration (especially in the guise of Kubernetes) were becoming important.

Containers weren't necessarily attractive to everyone. Especially at first, they didn't partition workloads as completely as virtualization did. Furthermore, taking advantage of containers and Kubernetes to their fullest benefited from a variety of new application architectural patterns such as microservices.

VMware recognized this shift but their virtualization cash cow was something of an anchor and they never had a strong container story. In late 2023, Broadcom completed its acquisition of VMware. Broad changes in pricing and licensing are underway.

Changes in pricing and shifts in the technology landscape can easily lead to changes in default buyer behavior.

A new dynamic

I've been watching how this plays out in the context of Red Hat OpenShift, Red Hat's Kubernetes-based application development platform. OpenShift Virtualization 4.18 was just released.

OpenShift Virtualization uses an upstream open source project called KubeVirt. OpenShift Virtualization provides a unified environment for VMs, containers and serverless technologies for maximum flexibility.

This is important because past studies such as the Konveyor community's State of Application Modernization Report 2024 have shown that organizations take a variety of approaches to modernizing different parts of their application portfolios. They may rewrite as cloud-native, just lift and shift, or take other approaches in between.

As a result, there's often a benefit to a unified platform that can accommodate a number of different styles of workloads at different points in an organization's application modernization journey. I think the industry has sometimes been too quick to try to push everyone to the latest technology hotness. It's fine to advocate for new approaches. But you also have to meet people where they are.

Wednesday, February 12, 2025

Making AI more open: It's not all or nothing

I had a discussion with someone at the Linux Foundation Member Summit after a talk by Red Hat's Richard Fontana. They didn't buy any argument that a model as a whole could still be considered open even if its training data wasn't.

But I agree with the pragmatism angle. I'm something of an absolutist about the open source definition when it comes to code. But there are clearly limitations in the degree to which you can open training data in many cases because of privacy and other concerns. This 2023 article I wrote for Red Hat Research Quarterly goes into some of the ways even supposedly anonymized data can be less anonymized than you may think.

Thus, while open data training sets are certainly a good goal, an absolutist position that open models (model weights and code) don't have significant value in the absence of open training data isn't a helpful one given that we know certain types of data, such as healthcare records, are going to be challenging to open up.

It strikes me that more measured approaches to AI openness that embody principles from the open source definition, such as not restricting how an open AI model can be used, are more practical and therefore more useful than insisting it's all or nothing.

CTO Chris Wright has recently shared Red Hat's perspective. I encourage you to read the whole piece, which goes into more detail than I will here. But here are a couple of salient excerpts.

The majority of improvements and enhancements to AI models now taking place in the community do not involve access to or manipulation of the original training data. Rather, they are the result of modifications to model weights or a process of fine tuning which can also serve to adjust model performance. Freedom to make those model improvements requires that the weights be released with all the permissions users receive under open source licenses.

This strikes me as an important point. While there are underlying principles as they relate to openness and open source, the actual outcomes usually matter more than philosophical rightness or open source as marketing.

While model weights are not software, in some respects they serve a similar function to code. It’s easy to draw the comparison that data is, or is analogous to, the source code of the model. In open source, the source code is commonly defined as the “preferred form” for making modifications to the software. Training data alone does not fit this role, given its typically vast size and the complicated pre-training process that results in a tenuous and indirect connection any one item of training data has to the trained weights and the resulting behavior of the model.

Model weights have a much closer analog to open source software than the training data does. That's of course not to say that training data shouldn't be opened up where practical. But for the many cases where it can't be, the perfect shouldn't be made the enemy of the good.

To be fair, we could make similar arguments about some of the newer "business open source" licenses that skirt the open source definition, but it's about drawing lines. After all, many large commercial open source products don't also release all their own build system code, test suites, and other information that would be useful for someone to reliably clone the delivered binary. Nonetheless, very few people object to calling the final product open source so long as its source code is under an approved open source license.

Steven Vaughan-Nichols also tackles this topic over at ZDNET in Red Hat's take on open-source AI: Pragmatism over utopian dreams.

Monday, February 03, 2025

What I'm keeping my eye on for computing's future

A bit over a year ago, I gave a presentation at the Linux Foundation Member Summit, where I took a look at some of the technologies catching a lot of attention and how they might develop. A lot of this was based on various work around Red Hat, including Red Hat Research, where I was working at the time. Think of this as both a fleshing out of the presentation, for those who weren't there, and a minor updating.

Here's the list. It mostly has to do with software and hardware infrastructure with an emphasis on open source. I'll be diving more deeply into some of the individual technologies over time. I don't want to turn this into a 10,000 word+ essay but mostly want to lay out some of the things I currently see as notable.

First, let me say that tech development has frequently surprised us. Here are a few items I came up with:

Microsoft Windows on servers (as seen from ~1990) and, really, Microsoft was going to dominate everything, right?

It looked poised to dominate. But it "merely" became important. Colorful language from Bryan Cantrill on this particular topic. And, really, it transitioned to Azure over time.

Linux and open source (as seen from mid-90s)

Linux was a curiosity and, really, so were things like BSD and the Apache web server until the first big Internet buildout got going in earnest. (And probably even then as far as enterprises were concerned, until IBM made its very public Linux endorsement.)

Server virtualization (as seen from 1999)

Server virtualization had been around for a long time. VMware made it mainstream. But it didn't become a mainstream production server tech until the economy soured and tech spending dried up to a significant degree at the beginning of the aughts.

Public cloud (as seen mostly incorrectly from mid-2000s)

Public clouds more or less came out of nowhere even if they had antecedents like timesharing.

Back-end containers/orchestration/cloud-native (from ~2010)

This came out of nowhere too from the perspective of most of the industry. Containers had been around a long time too. But they mostly settled into a largely failed lightweight alternative to virtualization role. Then Docker came along and made them a really useful tool for developers. Then Kubernetes came along and made them essentially the next generation alternative to enterprise virtualization—and a whole ecosystem developed around them.

Smartphones (as seen from ~2005)

I won't belabor the point here but the smartphones of 2005 were nothing like, and much less interesting than, the Apple and Android smartphones that came to dominate most of the industry.

The heavy hitters

These are the technology and related areas I'm going to hit on today. A few others for a future time. I also wrote about artificial intelligence recently although it impinges on a couple other topics (and, indeed, maybe all of them):

Data
Automation
Heterogeneous infrastructure

Data

Data touches on so many other things. The enormous training datasets of AI, observability for intelligent automation, the legal and regulatory frameworks that govern what data can be used, what data can be opened up, and even what open source really means in a world where data is at least as important as code. Security and data sovereignty play in too.

Let's break some of this down starting with security and data sovereignty.

There are a ton of subtopics here. What am I specifically focused on? Zero trust, confidential computing, digital sovereignty, software supply chain… ? And what are the tradeoffs?

For example, with confidential computing, though it's early, can we project that, in addition to protecting data at rest and in transit, it will increasingly be considered important to protect data in use in a way that makes it very hard to learn anything about a running workload?

As digital sovereignty gains momentum, what will the effects on hyperscalers and local partnering requirements be—and what requirements will regulations impose?

There are major open legal questions around the sources of training data. After all, most of the contents of the web, even where public, are copyrighted non-permissively. Not a few people consider the use of such data to be theft, though most of the IP lawyers I have spoken with are skeptical. There are ongoing lawsuits however, especially around the use of open source software for tools like GitHub Copilot.

There is also a whole category of license geek concerns around what "open" means in a data context, both for AI models and the data itself. Some of these concerns also play into releasing anonymized data (which is probably less anonymous than you think) under open licenses.

Automation

Although some principles have remained fairly constant, automation has changed a lot since it was often lumped into configuration management within the domain of IT. (I'm reminded of this on the eve of post-FOSDEM CfgMgmtCamp in Ghent, Belgium which I attended for a number of years when Ansible was often viewed as another flavor of configuration management in the vein of Puppet and Chef, the latter two being mostly focused on managing standard operating environments.)

The bottom line is that complexity is increasingly overwhelming even just partially manual operations.

This is one of the many areas where AI comes into play. While it's early days, with the AI term being applied to a lot of fairly traditional filtering and scripting tools, AIOps is nonetheless an area of active research and development.

But there are many questions:

Where are the big wins in dynamic automated system tuning that will most improve IT infrastructure efficiency? What’s the scope of the automated environment? How much autonomy will we be prepared to give to the automation and what circuit breakers and fallbacks will be considered best practice?

Maybe another relevant question is: What shouldn’t we automate? (Or where can we differentiate ourselves by not automating?) Think, for example, overly automated systems to serve customers either in-person or over the phone. Costs matter of course but every exec should be aware of systems causing customer frustration (which are very common today). Many aspects of automation can be good and effective but, today, there's a ton of sub-par automation determined to keep enabled humans from getting involved.

Heterogeneous infrastructure

I was maybe 75% a processor/server analyst for close to 10 years so some of the revitalized interest in hardware as something other than a commodity warms my heart.

Some of this comes about because simply cranking the process on x86 no longer works well—or at least works more slowly than over the past few decades. There are still some levers in interconnects and potentially techniques like 3D stacking of chips but it's not the predictable tick-tock cadence (to use Intel's term) that it was in years past.

GPUs, especially from NVIDIA and especially as applied to AI workloads, have demonstrated that alternative more-or-less workload-specific hardware architectures are often worth the trouble. And it's helped generate enormous value for the company.

Cloud APIs and open source play into this dynamic as well in a way that helps allow a transition away from a single architecture that evolves in lockstep with what independent software vendors (ISVs) are willing to support. Not that software doesn't still matter. NVIDIA's CUDA Toolkit has arguably been an important ingredient of its success.

But the twilight of the CMOS process scaling that has helped cement x86 as the architecture almost certainly goes beyond CPUs. Google's Tensor Processing Units (TPUs) and the various x86 Single Instruction Multiple Data (SIMD) extensions have not had the impact of x86 but there are interesting options out there.

RISC-V is an open-standard instruction set architecture (ISA) being explored and adopted by major silicon vendors and cloud providers. This article from Red Hat Research Quarterly discusses RISC-V more deeply particularly in a Field Programmable Gate Array context.

More broadly, although RISC-V, so far, has been deployed primarily in relatively small, inexpensive systems, the architecture is fully capable of both server and desktop/laptop deployments. Though a few years old—so don't pay too much attention to the specifics—this interview I did with RISC-V's CTO gets into a lot of the approach that RISC-V is taking that applies today.

While longer term and more speculative, quantum computing is another important area of hardware innovation. A quantum computer replaces bits with qubits—controllable units of computing that display quantum properties. Qubits are typically made out of either an engineered superconducting component or a naturally occurring quantum object such as an electron. Qubits can be placed into a "superposition" state that is a complex combination of the 0 and 1 states. You sometimes hear that qubits are both 0 and 1, but that's not really accurate. What is true is that, when a measurement is made, the qubit state will collapse into a 0 or 1. Mathematically, the (unmeasured) quantum state of the qubit is a point on a geometric representation called the Bloch sphere.
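
As a rough illustration of the Bloch sphere picture, here is a minimal Python sketch. It assumes the standard parametrization of a single-qubit state by two angles, theta and phi; the function names `qubit_state` and `measurement_probabilities` are made up for this example and don't come from any quantum library.

```python
import cmath
import math


def qubit_state(theta, phi):
    """Amplitudes (alpha, beta) of the state at Bloch-sphere angles theta, phi.

    Uses the usual convention:
    |psi> = cos(theta/2)|0> + e^(i*phi) * sin(theta/2)|1>
    """
    alpha = math.cos(theta / 2)
    beta = cmath.exp(1j * phi) * math.sin(theta / 2)
    return alpha, beta


def measurement_probabilities(alpha, beta):
    """Probability of reading 0 or 1 when the qubit is measured."""
    return abs(alpha) ** 2, abs(beta) ** 2


# A point on the equator of the sphere: an equal superposition of 0 and 1.
alpha, beta = qubit_state(math.pi / 2, 0.0)
p0, p1 = measurement_probabilities(alpha, beta)
print(p0, p1)
```

The point of the sketch is that the unmeasured state is a pair of complex amplitudes, not "both 0 and 1"; only the squared magnitudes show up as measurement probabilities.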

While superposition is a novel property for anyone used to classical computing, one qubit by itself isn't very interesting. The next unique quantum computational property is "interference." Real quantum computers are essentially statistical in nature. Quantum algorithms encode an interference pattern that increases the probability of measuring a state encoding the solution.
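
Interference is easier to see with a toy calculation. The sketch below assumes real-valued amplitudes and uses the Hadamard gate, a standard single-qubit operation; the helper name `hadamard` and the tuple representation of the state are just choices made for this example.

```python
import math

INV_SQRT2 = 1 / math.sqrt(2)


def hadamard(state):
    """Apply a Hadamard gate to a single-qubit state (a, b) = a|0> + b|1>."""
    a, b = state
    return (INV_SQRT2 * (a + b), INV_SQRT2 * (a - b))


zero = (1.0, 0.0)

# One Hadamard: equal superposition, so a measurement gives 0 or 1 with 50% odds.
once = hadamard(zero)

# A second Hadamard: the two contributions to the |1> amplitude have opposite
# signs and cancel (destructive interference), while the contributions to |0>
# add up (constructive interference).
twice = hadamard(once)

print([round(x ** 2, 6) for x in once])
print([round(x ** 2, 6) for x in twice])
```

Quantum algorithms do a far more elaborate version of this: they arrange the amplitudes so that wrong answers tend to cancel and the right answer is reinforced.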

While novel, superposition and interference do have some analogs in the physical world. The quantum mechanical property "entanglement" doesn't, and it's the real key to exponential quantum speedups. With entanglement, measurements on one particle can affect the outcome of subsequent measurements on any entangled particles—even ones not physically connected.
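
To make the correlation part of this concrete, here is a small sketch that samples measurement outcomes from a two-qubit Bell state, (|00> + |11>)/sqrt(2). The dictionary representation and the `sample` helper are assumptions made for illustration. Note the limits of the example: a classical program like this can reproduce the matching outcomes, but it does not capture everything that makes entanglement non-classical.

```python
import math
import random

# Amplitudes for each two-qubit measurement outcome of the Bell state.
amplitudes = {
    "00": 1 / math.sqrt(2),
    "01": 0.0,
    "10": 0.0,
    "11": 1 / math.sqrt(2),
}


def sample(amplitudes, rng):
    """Draw one measurement outcome with probability |amplitude|^2."""
    outcomes = list(amplitudes)
    weights = [abs(a) ** 2 for a in amplitudes.values()]
    return rng.choices(outcomes, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so runs are repeatable
samples = [sample(amplitudes, rng) for _ in range(1000)]

# Each qubit on its own looks like a coin flip, but the two always agree.
print(sorted(set(samples)))
```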

While a lot of attention has focused on the potential impact of quantum on cryptography, it's more broadly imagined (assuming various advances) to potentially increase efficiencies or even to solve problems that are just not practically solvable in many fields.

Conclusion

Broadly, there's this idea that we'll go beyond x86 to what goes by various names but amounts to aggregating various types of resources dynamically to meet workload needs. Those resources will be increasingly diverse: to optimize for different workloads, to work with large amounts of data, and to automate wherever it makes sense to do so.

Wednesday, January 29, 2025

What we got wrong about the cloud

Not everyone bought my comparison of how the cloud developed and AI is developing that I posted the other day. But I wanted to flesh out some thoughts about cloud from that post and my presentation at the Linux Foundation Member Summit in 2023.

First, some housekeeping. When I write "we got wrong," I don't mean everyone; some, including myself, never fully bought into a number of the widely believed assumptions. Furthermore, the aim of this post is not to belittle the important (and perhaps growing) role that public clouds play. Rather, it's to chart how and why the cloud has evolved over about the past 20 years.

20 years is a convenient timeframe. That's about when Sun Microsystems started talking up Sun Grid. (Coincidentally, it's about when I was getting really established as an IT industry analyst and cloud matters fit neatly into the topics I was covering.) Amazon Web Services (AWS) would roll out its first three services in 2006. This is, in part, just a slice of history. But there are some lessons buried in the assumptions that were generally flawed.

The early narrative

Cloud computing would supposedly follow a trajectory similar to the distribution of electricity over a grid (this was before the deployment of solar power at any scale). As I wrote in CNET in 2009:

The vision of cloud computing, as originally broached by its popularizers, wasn't just about more loosely coupled applications being delivered over networks in more standardized and interoperable ways—a sort of next-generation service-oriented architecture, if you would. Rather, that vision was about a fundamental change to the economics of computing.
As recounted by, among others, Nicholas Carr in his The Big Switch, cloud computing metaphorically mirrors the evolution of power generation and distribution. Industrial-revolution factories—such as those that once occupied many of the riverside brick buildings I overlook from my Nashua, N.H., office—built largely customized systems to run looms and other automated tools, powered by water and other sources.
These power generation and distribution systems were a competitive differentiator; the more power you had, the more machines you could run, and the more you could produce for sale. Today, by contrast, power (in the form of electricity) is just a commodity for most companies—something that they pull off the grid and pay for based on how much they use.

The same article was titled "There is no 'Big Switch' for cloud computing." Go ahead and read it. But I'll summarize some of the key points here and add a few that became clear as cloud computing developed over time.

Economics

One of the supposedly strongest arguments for cloud computing was that of course it would be cheaper. Economies of scale and all that. Quoting myself again (hey, I'm lazy):

Some companies may indeed generate power in a small way [again, pre large-scale solar]—typically as backup in outages or as part of a co-generation setup—but you'll find little argument that mainstream power requirements are best met by the electric utility. The Big Switch argues that computing is on a similar trajectory.
And that posits cloud computing being a much more fundamentally disruptive economic model than a mostly gradual shift toward software being delivered as a service and IT being incrementally outsourced to larger IT organizations. It posits having the five "computers" (which is to say complexes of computers) in the world that Sun CTO Greg Papadopoulos hyperbolically referred to—or at least far, far fewer organizations doing computing than today.

It wasn't clear even at the time that, once you got to the size of a large datacenter, you were at that much of a disadvantage in economies of scale for the equipment in, and operations of, that datacenter. And, over time, even though there are a variety of other reasons to use clouds, "the cloud is cheaper" often rings hollow, especially for predictable workloads and company trajectories. Claims of mass repatriation of applications to on-prem are probably overstated, but many organizations are being more careful about what they run where.

Computing as a utility

“Computing may someday be organized as a public utility just as the telephone system is a public utility,” Professor John McCarthy said at MIT’s centennial celebration in 1961. That, along with the economics, is what underpinned much of the early thinking about cloud computing.

Not only would computing delivered that way be cheaper (so the thinking went), but it would be a simple matter of selling units of undifferentiated compute and storage even if AWS's initial messaging service gave an initial hint that maybe it wasn't as simple as all that.

But a number of early projects had the implicit assumption that cloud was going to be a fairly simple set of services even if some specifics might differ from provider to provider.

The Deltacloud API imagined abstracting away the differences of the APIs of individual cloud providers. Various management products imagined the (always mostly mythical) single-pane-of-glass management across multiple clouds.

In reality though, the major cloud providers came out with an astonishing array of differentiated services. There are ways to provide some level of commonality by keeping things simple and by using certain third-party products. For example my former employer, Red Hat, sells the OpenShift application development platform, based on Kubernetes, that provides some level of portability between cloud providers and on-prem deployments.

However, the world in which you had the same sort of transparency in switching cloud providers that you largely have with electricity never came to pass. Which brings us to...

Cloudbursting

The goal of cloud portability and interoperability largely got obscured by the chimera of cloudbursting, the dynamic movement of workloads from one cloud to another, including on-prem. As I wrote in 2011:

Cloudbursting debates are really about the dynamic shifting of workloads. Indeed, in their more fevered forms, they suggest movement of applications from one cloud to another in response to real-time market pricing. The reasoned response to this sort of vision is properly cool. Not because it isn't a reasonable rallying point on which to set our sights and even architect for, but because it's not a practical and credible near- or even mid-term objective.

Since I wrote that article, it's become even clearer that there are many obstacles to automagical cloud migration—not least of which are the laws of physics especially as data grows in an AI-driven world and, often, the egress charges associated with shipping that data from one cloud to someplace else.

While companies do sometimes move workloads (especially new ones) to public clouds or repatriate certain workloads, usually because they think they can save money, very few workloads are scurrying around from one cloud to another minute to minute, day to day, or even month to month.

The edge

Dovetailing with some of the above is the concept of edge computing, i.e. computing that happens close to users and data.

Some aspects of edge computing are mostly a rebadging of remote and branch office (ROBO) computing environments. Think the computers in the back of a Home Depot big box store. But edge has also come into play with pervasive sensors associated with the Internet of Things (IoT), associated data, and AI operating on that data. Network limitations (and the cost of transmitting data) imply filtering and analyzing data near to where it is collected much of the time, even if the original models are developed and trained in some central location.

Essentially, edge is one more reason that the idea that everything will move to a cloud was a flawed concept.

Security

There was a lot of angst and discussion early on about cloud security. The reality is that security is almost certainly now seen as less of a concern but that some nuances of governance more broadly have probably increased in importance.

As for security in the more narrow sense, it's come to be generally seen that the major public clouds don't present much in the way of unique challenges. Your application security still matters. As do the usual matters of system access and so forth. You also need to have people who understand how to secure your applications and access in a public cloud environment which may be different from on-prem. But public clouds are not inherently less secure than on-prem systems connected to the public internet.

What has come to be seen as an issue, especially given geo-political conflicts, is where the data resides. While distribution of cloud computing centers to different regions was originally viewed as mostly a matter of redundancy and protecting against natural disasters and the like, increasingly it's about storing data and providing the services operating on that data in a particular legal and governmental jurisdiction.

Conclusion

None of the above should be taken as a takedown of cloud computing. For the right use cases, public clouds have a lot going for them, some of which many saw early on. For example, companies starting out don't need to spend capital on racks of servers when, odds are, they may not need that many servers (or they may need more to handle workload spikes). Flexibility matters.

So does the ability to focus on your business idea rather than managing servers—though public clouds come with their own management complexities.

However, the development of cloud computing over the past 20 years is also a useful lesson. Certainly some technical innovations just don't work out. But others like cloud do—just not in many of the ways that we expect them to. Perhaps that's an obvious point. But still one worth remembering.

Why AI reminds me of cloud computing

Even if you stipulated that cloud computing was going to be a big deal, the early cloud narrative got a lot of things wrong.

To name just a few which I'll deal with in a subsequent post: Cloud wasn't a utility, security really wasn't the key differentiator versus on-premise, and cost savings weren't a slam dunk. Much deeper discussion for another day. Cloud computing was an important movement but the details of that movement were often unclear and a lot of people got a lot of those details wrong.

I posit that the same is the case with AI.

I'm pretty sure that, as someone who was in the industry through the second AI winter, I'd be foolish (probably) to paint AI as yet another passing fad. But I'm also pretty sure that any picture I paint of the five to ten year-out future is going to miss some important details.

Certainly, there's a lot of understandable enthusiasm (and some fear) around large language models (LLMs). My take is that it's hard to dispute that there is some there there. Talking to ex-IBM exec Irving Wladawsky-Berger at the MIT Sloan CIO Symposium in 2023, we jumped straight to AI. To Irving, “There’s no question in my mind that what’s happening with AI now is the most exciting/transformative tech since the internet. But it takes a lot of additional investment, applications, and lots and lots of [other] stuff.” (Irving also led IBM’s internet strategy prior to Linux.) I agree.

But, and here's where the comparison to cloud comes in, the details of that evolution seem a bit fuzzy.

AI has a long history. The origin of the field is often dated to a 1956 summer symposium at Dartmouth College although antecedents go back to at least Alan Turing.

It's been a bumpy ride. There have probably been at least two distinct AI winters as large investments in various technologies didn't produce commensurate value. The details are also a topic for another day. Where do we stand now?

AI today

The current phase of AI derives, to a large degree, from deep learning which, in turn, is largely based on deep neural networks (NNs) of increasing size (measured in # weights/parameters) trained on increasingly large datasets. There are ongoing efforts to downsize models because of the cost and energy consumption associated with training models but, suffice it to say, it's a resource-intensive process.

Much of this ultimately derives from work done by Geoffrey Hinton in the 1980s on backpropagation and NNs, but it became much more interesting once plentiful storage, GPUs, and other specialized and fast computing components became available. Remember, a 1980s computer was typically chugging along at a few MHz and disk drives were sized in the MBs.

The latest enthusiasm around deep learning is in generative AI, of which large language models (LLMs) are the most visible subcategory. One of the innovations here is that they can answer questions and solve problems in a way that doesn't require human-supervised labeling of all the data fed into the model training. A side effect of this is that the answers are sometimes nonsense. But many find LLMs an effective tool that's continually getting better.

Let's take AI as a starting point just as we could take cloud of 20 years ago as a starting point. What are some lessons we can apply?

Thoughts and Questions about AI's Evolution

I've been talking about some of these ideas for a while, since before there were mutterings of another AI winter. For the record, I don't think that's going to happen, at least not at the scale of prior winters. However, I do think we can safely say that things will veer off in directions we don't expect and most people aren't predicting.

One thing I have some confidence in reflects a line from Russell and Norvig's AI textbook, which predates LLMs but I think still applies. “We can report steady progress. All the way to the top of the tree,” they wrote.

The context of this quote is that the remarkable advance of AI over maybe the last 15 years has been largely the result of neural networks and hardware that's sufficiently powerful to train and run models that are large enough to be useful. That's Russell and Norvig's tree.

However, AI is a broader field, especially when you consider that it is closely related to and, arguably, intertwined with Cognitive Science. This latter field got its start at a different event a few months after the Dartmouth College AI conference, which is often taken to mark the birth of AI (though the "Cognitive Science" moniker came later). Cognitive Science concerns itself with matters like how people think, how children learn, linguistics, reasoning, and so forth.

What’s the computational basis for learning concepts, judging similarity, inferring causal connections, forming perceptual representations, learning word meanings and syntactic principles in natural language, and developing physical world intuitions?

In other words, questions that are largely divorced from commercial AI today for the simple reason that studies of these fields have historically struggled to make clear progress and certainly to produce commercially interesting results. But many of us strongly suspect that they ultimately will have to become part of the AI story.

There are also questions related to LLMs.

How genuinely useful will they be, and in what domains, given that they can output nonsense (hallucinations)? Related are a variety of bias and explainability questions. I observe that reactions to LLMs on tech forums differ considerably, with some claiming huge productivity improvements and others mostly giving a shrug. Personally, my observation with writing text is that they do a decent job of spitting out largely boilerplate introductory text and definitions of terms and thereby can save some time. But they're not useful today for more creative content.

Of course, what LLMs can do effectively has implications for the labor market as a paper by MIT economist David Autor and co-authors Levy and Murnane argues.

Autor’s basic argument is as follows. Expertise is what makes labor valuable in a market economy. That expertise must have market value and be scarce but non-expert work, in general, pays poorly.

With that context, Autor classifies three eras of demand for expertise. The industrial revolution first displaced artisanal expertise with mass production. But as the industry advanced it demanded mass expertise. Then the computer revolution started, really going back to the Jacquard loom. The computer is a symbolic processor and it carries out tasks efficiently—but only those that can be codified.

Which brings us to the AI revolution. Artificially intelligent computers can do things we can’t codify. And they know more than they can tell us. Autor asks “Will AI complement or commodify expertise? The promise is enabling less expert workers to do more expert tasks.” Autor has also argued, though, that policy plays an important role. As he told NPR: “[We need] the right policies to prepare and assist Americans to succeed in this new AI economy, we could make a wider array of workers much better at a whole range of jobs, lowering barriers to entry and creating new opportunities.”

The final wild card that could have significant implications for LLMs (and generative AI more broadly) revolves around various legal questions. The most central one is whether LLMs are violating copyright by training on public but copyrighted content like web pages and books. (This is an issue even with open source software, which generally still requires attribution in some form. There are a variety of other open source-related concerns as well, such as whether the training data is open.)

Court decisions that limit the access of LLMs to copyrighted material would have significant implications. IP lawyers I know are skeptical that things would go this way but lawsuits have been filed and some people feel strongly that most LLMs are effectively stealing.

We Will Be Surprised

When I gave a presentation at the Linux Foundation Member Summit in 2023 in which I tried to answer what the next decade will bring for computing, AI was on the technologies list of course and I talked about some of the things I've discussed in this post. But the big takeaway I tried to leave attendees with was that the details are hard to predict.

After all, LLMs weren't part of the AI conversation until a couple years ago; ChatGPT's first public release was just in 2022. Many were confident that their pre-teens wouldn't need to learn to drive even if some skeptics, like MIT's John Leonard, were saying they didn't expect autonomous driving to arrive in their lifetimes. Certainly, there's progress, probably most notably by Waymo's taxi service in a few locations. But it's hard to see the autonomous equivalent of Uber/Lyft's ubiquity anytime soon. Much less assistive driving systems that are a trim option when you buy a car. (Tesla's full self-driving doesn't really count. You still need to pay attention and be ready to take over.)

Friday, February 23, 2024

The Sinking of the Itanic: free ebook

Throughout my stint as an IT industry analyst during the 2000s, one of my significant interests was Intel's Itanium processor, a 64-bit design intended to succeed the ubiquitous 32-bit x86 family. I wrote my first big research report on Itanium and it seemed like something of an inevitability given Intel's dominance at the time.

But there were storm clouds. Intel took an approach to Itanium's design that was not wholly novel but it had never been commercially successful. The dot-com era was also drawing to a close even as Itanium's schedule slipped out. Furthermore, the initial implementation was not ready for primetime for a variety of reasons.

Especially with the benefit of hindsight, there were other problems with the way Intel and its partner, Hewlett-Packard, approached the market with Itanium as well. Itanium would ultimately fail, replaced by a much more straightforward extension to the existing x86 architecture.

This short book draws six lessons from Itanium's demise:

Lesson #1: It’s all about the timing, kid
Lesson #2: Don’t unnecessarily proliferate risk
Lesson #3: Don’t fight the last war
Lesson #4: The road to hell is paved with critical dependencies
Lesson #5: Your brand may not give you a pass
Lesson #6: Some animals can’t be more equal than others

While Itanium is the study point for this book, many of the lessons apply to other projects as well.

Download your free PDF ebook today.

Wednesday, May 24, 2023

AI is looking summer-y

It never got to the point where the whispers about an impending AI winter got that commonplace, loud, or confident. However, as widespread commercialization of some of the most prominent AI applications (think autonomous vehicles) slipped well past earlier projections, doubts were inevitable. At the same time, the level of commercial investment relative to past AI winters made betting against it wholesale seem unwise.

It’s the technology in the moment’s spotlight. On May 23, it was foundational to products announced at Red Hat Summit in Boston such as Ansible Lightspeed. However, the surprise today would be were AI not to have a prominent position at a technology vendor’s show.

But, as a way to get a perspective that’s less of a pure technologist take, consider the prior week’s MIT Sloan CIO Symposium, “Driving Resilience in a Turbulent World,” held in Cambridge, MA. This event tends to take a higher-level view of the world, albeit one flavored by technology. Panels this year about how the CIO has increasingly evolved to a chief regulation officer, chief resilience officer, and chief transformation officer are typical of the sort of lenses this event uses to examine the current important trends for IT decision makers. As most large organizations become technology companies, and software companies in particular, it’s up to the CIO to partner with the rest of the C-suite to help chart strategy in the face of changing technological forces. And that means considering tech in the context of other forces and concerns. For example, supply chain optimization is a broad company business challenge even if it needs technology as part of the puzzle.

AI rears its head

But even if AI was a relatively modest part of the agenda on paper, mostly in the afternoon, everyone was talking about it to a greater or lesser degree.

For example, Tom Peck, Executive Vice President & Chief Information Officer and Digital Officer, Sysco, said that they were still “having trouble finding a SKU of AI in the store. We’re trying to figure out how to pluck AI and apply it to our business. Bullish on it but still trying to figure out build vs. buy.”

If I were to summarize the overall attitude towards AI at the event, it was something like: really interesting, really early, and we’re mostly just starting to figure out the best ways to get business value from it.

A discussion with Irving Wladawsky-Berger

I’ve known Irving Wladawsky-Berger since the early 2000s when he was running IBM’s Linux Initiative; he’s now a Research Affiliate at MIT’s Sloan School of Management, a Fellow of MIT’s Initiative on the Digital Economy and of MIT Connection Science, and Adjunct Professor at the Imperial College Business School. He’s written a fair bit on AI; I encourage you to check out his long-running blog.

There were lots of things on the list to talk about. But we jumped straight to AI. It was that sort of day. To Irving, “There’s no question in my mind that what’s happening with AI now is the most exciting/transformative tech since the internet. But it takes a lot of additional investment, applications, and lots and lots of [other] stuff.” (Irving also led IBM’s internet strategy prior to Linux.)

At the same time, Irving warns that major effects will probably not be seen overnight. “It’s very important to realize that many things will take years of development if not decades. I’m really excited about the generative AI opportunity but [the technology is] only about 3 years old,” he told me.

We also discussed The Economist’s How to Worry Wisely about AI issue, especially an excellent essay by Ludwig Siegele titled “How AI could change computing, culture and history.” One particularly thought-provoking statement from that essay is “For a sense of what may be on the way, consider three possible analogues, or precursors: the browser, the printing press and practice of psychoanalysis. One changed computers and the economy, one changed how people gained access and related to knowledge, and one changed how people understood themselves.”

Psychoanalysis? Freud? It’s easy to see the role the browser and the printing press have had as world-changing inventions. Siegele goes on to write: “Freud takes as his starting point the idea that uncanniness stems from ‘doubts [as to] whether an apparently animate being is really alive; or conversely, whether a lifeless object might not be in fact animate’. They are the sort of doubts that those thinking about llms [Large Language Models] are hard put to avoid.”

This in turn led to more thought-provoking conversation about linguistic processing, how babies learn, and emergent behaviors (“a bad thing and a bug that has nothing to do with intelligence”). Irving concluded by saying “We shouldn’t stop research on this stuff because it’s the only way to make it better. It’s super complex engineering but it’s engineering. It’s wonderful. I think it will happen but stay tuned.”

The economics

“The Impact of AI on Jobs and the Economy” closed out the day with a keynote by David Autor, Professor of Economics, MIT.

If you want to dive into an academic paper on the topic, here’s the paper by Autor and co-authors Levy and Murnane.

However, Autor’s basic argument is as follows. Expertise is what makes labor valuable in a market economy. That expertise must have market value and be scarce but non-expert work, in general, pays poorly.

With that context, Autor classifies three eras of demand for expertise. The industrial revolution first displaced artisanal expertise with mass production. But as the industry advanced it demanded mass expertise. Then the computer revolution started, really going back to the Jacquard loom. The computer is a symbolic processor and it carries out tasks efficiently—but only those that can be codified.

Which brings us to the AI revolution. Artificially intelligent computers can do things we can’t codify. And they know more than they can tell us. The question Autor posits is “Will AI complement or commodify expertise? The promise is enabling less expert workers to do more expert tasks.” Autor has also argued, though, that policy plays an important role. As he told NPR in early May: “[We need] the right policies to prepare and assist Americans to succeed in this new AI economy, we could make a wider array of workers much better at a whole range of jobs, lowering barriers to entry and creating new opportunities.”

Sunday, April 23, 2023

KubeCon: From contributors to AI

I find that large industry shows like KubeCon + CloudNativeCon (henceforth referred to as just KubeCon for short) are often at least as useful for plugging into the overall zeitgeist of the market landscape and observing the trajectory of various trends as they are for diving deep on any single technology. This event, held in late April in Amsterdam, was no exception. Here are a few things that I found particularly noteworthy; they may help inform your IT planning.

Contributors! Contributors! Contributors!

Consider first who attended. With about 10,000 in-person registrations it was the largest KubeCon Europe ever. Another 2,000 never made it off the waiting list. Especially if you factor in tight travel budgets at many tech companies, it’s an impressive number by any measure. By comparison, last year’s edition in Valencia had 7,000 in-person attendees; hesitancy to attend physical events has clearly waned.

Some other numbers. There are now 159 projects within the Cloud Native Computing Foundation (CNCF), which puts on this event; the CNCF is under the broader Linux Foundation umbrella. It started with one, Kubernetes, and even as recently as 2017 had just seven. This highlights how the cloud native ecosystem has become about so much more than Kubernetes. (It also indirectly suggests that a lot of complaints about Kubernetes complexity are really complaints about the complexity of trying to implement cloud-native platforms from scratch. Hence, the popularity of commercial Kubernetes-based platforms that do a lot of the heavy lifting with respect to curation and integration.)

Perhaps the most striking stat of all though was the percentage of first-timers at KubeCon: 58%. Even allowing for KubeCon’s growth, that’s a great indicator of new people coming into the cloud-native ecosystem. So all’s good, right?

Mostly. I’d note that the theme of the conference was “Communities in Bloom.” (The conference took place with tulips in bloom around Amsterdam.) VMware’s Dawn Foster and Apple’s Emily Fox also gave keynotes on building a sustainable contributor base and saving knowledge as people transition out of a project, respectively. This all has a common theme. New faces are great but having a torrent of new faces can stress maintainers and various support systems. The torrent needs to be channeled.

Liz Rice, Chief Open Source Officer at Isovalent and Emeritus chair of the Technical Oversight Committee, put it to me this way. The deliberate focus on community at this KubeCon doesn’t indicate a crisis by any means. But the growth of the CNCF ecosystem and the corresponding level of activity is something to be monitored and perhaps some steps taken in response.

It’s about the platform

The rise of the platform engineer and the “platform” term generally has really come into the spotlight over the past couple of years. Panelists on the media panel about platform engineering described platforms as documentable, secure, able to connect to different systems like authentication, debuggable and observable, and, perhaps most of all, flexible.

From my perspective, platform engineering hasn’t replaced DevOps as a concept but it’s mostly a more appropriate term in the context of Kubernetes and the many products and projects surrounding it. DevOps started out as something that was as much about culture as technology; at least the popular shorthand was that of breaking down the wall between developers and operations. While communicating across silos is (mostly) a good thing, at scale, operations mostly provisions a platform for developers — perhaps incorporating domain-specific touches relevant to the business — and then largely gets out of the way. Site Reliability Engineers (SRE) shoulder much of the responsibility for keeping the platform running rather than sharing that responsibility with developers. The concept isn’t new but “DevOps” historically got used for both breaking down walls between the two groups and creating an abstraction that allowed the two groups to largely act autonomously. Platform engineering is essentially co-opting the latter meaning.

The latest abstraction that we’re just starting to see is the Internal Developer Platform (IDP), such as the open source Backstage that came out of Spotify. “Freedom with guardrails” is how one panelist described the concept. An IDP provides developers with all the tools they need under an IT governance umbrella; this can create a better experience for developers by giving them an out-of-the-box setup that includes everything they need to start developing. It’s a win for IT too. It cuts onboarding time and means that development organizations across the company can use the same tools, have access to the same documentation, and adhere to the same standards.

Evolving security (a bit)

Last fall, security was pervasive at pretty much every IT industry event I attended, including KubeCon North America in Detroit. It featured in many keynotes. Security vendor booths were omnipresent on the show floor.

It’s hard to quantify the security presence at this KubeCon by comparison. To be clear, security was well-represented both in terms of booths and breakouts. And security is so part and parcel of both platforms and technology discussions generally that I’m not sure if it would even be possible to quantify how much security was present.

However, after making myself a nuisance with several security vendors on the show floor, I’ll offer the following assessment. Security is as hot a topic as ever but the DevSecOps and supply chain security messages are getting out there after a somewhat slow start. So there may be less need to bang the drum quite so loudly. One security vendor also suggested that there may be more of a focus on assessing overall application risk rather than making security quite so much about shifting certain specific security processes earlier in the life cycle. Continuous post-deployment monitoring and remediation of the application as a whole is at least as important. (They also observed that the biggest security focus remains in regulated industries such as financial services.)

An AI revolution?

What of the topic of the moment: Large Language Models (LLMs) and generative AI more broadly? These technologies were even the featured topic of The Economist weekly magazine that I read on my way back to the US from Europe.

The short answer is that they were an undercurrent but not a theme of the event. I had a number of hallway track discussions about the state of AI but the advances, which are hard to ignore or completely dismiss even for the most cynical, have happened so quickly that there simply hasn’t been time to plug them into something like the cloud-native ecosystem. That will surely change.

It did crop up in some specific contexts. For example, in the What’s Next in Cloud Native panel, there was an observation that Day 2 operations (i.e. after deployment) are endlessly complex. AI could be a partial answer to having a more rapid response to the detection of anomalies. (To my earlier point about security not being an island relative to other technologies and processes.) AIOps is already an area of rapid research and product development, but there’s the potential for much more. And indeed, a necessity, as attackers will certainly make use of these technologies as well.

Monday, January 17, 2022

Hardware data security with John Shegerian

John Shegerian is the co-founder and executive chairman of recycling firm ERI and the author of The Insecurity of Everything. In this podcast, we talk about both the sustainability aspects of electronic waste and the increasing issue of the security risk associated with sensitive data stored on products that are no longer in use. Shegerian argues that CISOs and others responsible for the security and risk mitigation at companies have historically been mostly focused on application security.

However, today, hardware data security is very important as well. He cites one study where almost 42% of 159 hard drives purchased online still held sensitive data. And the problem extends beyond hard drives and USB sticks to a wide range of devices such as copiers that store data that has passed through them.

Shegerian details some of the steps that companies (and individuals) can take to reduce the waste they send to landfills and to prevent their data from falling into the wrong hands.

Fill out this form for a free copy of The Insecurity of Everything: https://eridirect.com/insecurity-of-everything-book/

Listen to the podcast [MP3 - 26:55]

Tuesday, January 04, 2022

RackN CEO Rob Hirschfeld on managing operational complexity

There's a lot of complexity, both necessary and unnecessary, in the environments where we deploy our software. The open source development model has proven to be a powerful tool for software development. How can we help people better collaborate in the open around operations? How can we create a virtuous cycle for operations?

Rob and I talked about these and other topics before the holidays. We also covered related topics including the skills shortage, complexity of the software supply chain, and building infrastructure automation pipelines.