A bit over a year ago, I gave a presentation at the Linux Foundation Member Summit where I took a look at some of the technologies catching a lot of attention and how they might develop. At the time, a lot of this was based on various work around Red Hat, including Red Hat Research, where I was working at the time. Think of this as both a fleshing-out of the presentation, for those who weren't there, and a minor update.
Here's the list. It mostly has to do with software and hardware infrastructure, with an emphasis on open source (related: https://bitmason.blogspot.com/2020/04/podcast-if-linux-didnt-exist-would-we.html). I'll be diving more deeply into some of the individual technologies over time. I don't want to turn this into a 10,000-word-plus essay; I mostly want to lay out some of the things I currently see as notable.
First, let me say that tech development has frequently surprised us. Here are a few items I came up with:
- Microsoft Windows on servers (as seen from ~1990) and, really, Microsoft was going to dominate everything, right?
It looked poised to dominate but "merely" became important. (Bryan Cantrill has some colorful language on this particular topic.) And, really, Microsoft's emphasis transitioned to Azure over time.
- Linux and open source (as seen from mid-90s)
Linux was a curiosity and, really, so were the likes of BSD and the Apache Web Server until the first big Internet buildout got going in earnest. (And probably even then as far as enterprises were concerned, until IBM made its very public Linux endorsement.)
- Server virtualization (as seen from 1999)
- Public cloud (as seen mostly incorrectly from mid-2000s)
Public clouds more or less came out of nowhere even if they had antecedents like timesharing.
- Back-end containers/orchestration/cloud-native (from ~2010)
This came out of nowhere too, from the perspective of most of the industry. Containers had been around for a long time as well, but they had mostly settled into the role of a largely failed lightweight alternative to virtualization. Then Docker came along and made them a really useful tool for developers. Then Kubernetes came along and made them essentially the next-generation alternative to enterprise virtualization, and a whole ecosystem developed around them.
- Smartphones (as seen from ~2005)
I won't belabor the point here, but the smartphones of 2005 were nothing like, and much less interesting than, the Apple and Android smartphones that came to dominate most of the industry.
The heavy hitters
These are the technology and related areas I'm going to hit on today; a few others will have to wait for a future time. I also wrote about artificial intelligence recently, although it impinges on a couple of these topics (and, indeed, maybe all of them):
- Data
- Automation
- Heterogeneous infrastructure
Data

Data touches on so many other things: the enormous training datasets of AI, observability for intelligent automation, the legal and regulatory frameworks that govern what data can be used, what data can be opened up, and even what open source really means in a world where data is at least as important as code. Security and data sovereignty play in too.
Let's break some of this down starting with security and data sovereignty.
There are a ton of subtopics here. What am I specifically focused on? Zero trust, confidential computing, digital sovereignty, software supply chain… ? And what are the tradeoffs?
For example, with confidential computing: though it's early, can we project that, in addition to protecting data at rest and in transit, it will increasingly be considered important to protect data in use, in a way that makes it very hard to learn anything about a running workload?
As digital sovereignty gains momentum, what will the effects on hyperscalers and local partnering be, and what requirements will regulations impose?
There are major open legal questions around the sources of training data. After all, most of the contents of the web, even where public, are copyrighted non-permissively. Not a few people consider the use of such data theft, though most of the IP lawyers I have spoken with are skeptical. There are ongoing lawsuits, however, especially around the use of open source software for tools like GitHub's Copilot.
There is also a whole category of license-geek concerns around what "open" means in a data context, both for AI models and for the data itself. Some of these concerns also play into releasing anonymized data (which is probably less anonymous than you think) under open licenses.
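To make that anonymization point a bit more concrete, here's a minimal Python sketch with made-up records. The dataset and fields are invented for illustration: the names are gone, but the combination of ZIP code, birth year, and gender (classic quasi-identifiers) can still single individual records out.

```python
from collections import Counter

# Made-up "anonymized" records: names removed, but quasi-identifiers kept.
records = [
    {"zip": "02138", "birth_year": 1971, "gender": "F", "diagnosis": "asthma"},
    {"zip": "02138", "birth_year": 1971, "gender": "F", "diagnosis": "flu"},
    {"zip": "02139", "birth_year": 1985, "gender": "M", "diagnosis": "diabetes"},
    {"zip": "02140", "birth_year": 1962, "gender": "M", "diagnosis": "hypertension"},
]

# Count how many records share each quasi-identifier combination.
groups = Counter((r["zip"], r["birth_year"], r["gender"]) for r in records)

# k-anonymity is the smallest group size. A group of 1 means that record is
# unique on its quasi-identifiers and could be re-identified by joining
# against another dataset (a voter roll, a social profile, and so on).
k = min(groups.values())
unique = [combo for combo, count in groups.items() if count == 1]

print(f"k-anonymity of this release: k = {k}")
print(f"uniquely identifiable combinations: {unique}")
```

Real re-identification attacks are more sophisticated than this, but the basic join-against-another-dataset mechanic is the same.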
Automation
Although some principles have remained fairly constant, automation has changed a lot since the days when it was often lumped into configuration management within the domain of IT. (I'm reminded of this on the eve of post-FOSDEM CfgMgmtCamp in Ghent, Belgium, which I attended for a number of years back when Ansible was often viewed as just another flavor of configuration management in the vein of Puppet and Chef, the latter two being mostly focused on managing standard operating environments.)
The bottom line is that complexity is increasingly overwhelming operations that are even partially manual.
This is one of the many areas that AI plays into. While it's early days, with the AI term being applied to a lot of fairly traditional filtering and scripting tools, AIOps is nonetheless an area of active research and development.
But there are many questions:
Where are the big wins in dynamic automated system tuning that will most improve IT infrastructure efficiency? What’s the scope of the automated environment? How much autonomy will we be prepared to give to the automation and what circuit breakers and fallbacks will be considered best practice?
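To make the circuit-breaker question a bit more concrete, here's a minimal Python sketch of what bounded autonomy might look like for an automated remediation loop. The restart_service and page_human callables, the thresholds, and the cooldown are hypothetical placeholders rather than any particular product's API: the automation keeps applying its fix until it fails repeatedly, then the breaker opens and a human becomes the fallback.

```python
import time


class RemediationCircuitBreaker:
    """Stops automated remediation after repeated failures, for a cooldown period."""

    def __init__(self, max_failures=3, cooldown_seconds=300):
        self.max_failures = max_failures
        self.cooldown_seconds = cooldown_seconds
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed (automation allowed)

    def allow(self):
        if self.opened_at is None:
            return True
        # After the cooldown, reset and let automation try again (simplified half-open).
        if time.time() - self.opened_at > self.cooldown_seconds:
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record(self, success):
        if success:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.time()  # trip the breaker


def remediate(restart_service, page_human, breaker):
    """Try the automated fix only while the breaker allows it; otherwise escalate."""
    if not breaker.allow():
        page_human("circuit breaker open: automated restarts suspended")
        return
    ok = restart_service()  # hypothetical remediation action returning True/False
    breaker.record(ok)
    if not ok and not breaker.allow():
        page_human("automated remediation failed repeatedly; human attention needed")
```

The interesting best-practice questions are mostly about the parameters: how many failures, how long a cooldown, and what counts as an escalation path.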
Maybe another relevant question is: what shouldn't we automate? (Or where can we differentiate ourselves by not automating?) Think, for example, of overly automated systems for serving customers, whether in person or over the phone. Costs matter, of course, but every exec should be aware of systems causing customer frustration (which are very common today). Many aspects of automation can be good and effective but, today, there's a ton of sub-par automation determined to keep capable humans from getting involved.
Heterogeneous infrastructure
I was maybe 75% a processor/server analyst for close to 10 years, so some of the revitalized interest in hardware as something other than a commodity warms my heart.
Some of this comes about because simply cranking the process on x86 no longer works well, or at least works more slowly than it did over the past few decades. There are still some levers in interconnects and potentially in techniques like 3D stacking of chips, but it's not the predictable tick-tock cadence (to use Intel's term) of years past.
GPUs, especially from NVIDIA and especially as applied to AI workloads, have demonstrated that alternative, more-or-less workload-specific hardware architectures are often worth the trouble. And that's helped generate enormous value for the company.
Cloud APIs and open source play into this dynamic as well, in a way that helps allow a transition away from a single architecture that evolves in lockstep with what independent software vendors (ISVs) are willing to support. Not that software doesn't still matter: NVIDIA's CUDA Toolkit has arguably been an important ingredient of its success.
But the twilight of the CMOS process scaling that helped cement x86 as the architecture almost certainly has effects beyond CPUs. Google's Tensor Processing Units (TPUs) and the various x86 Single Instruction Multiple Data (SIMD) extensions have not had anything like the impact of x86 itself, but there are interesting options out there.
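For a rough feel for why data-parallel hardware is attractive, here's a small CPU-side Python sketch. It isn't the hardware story itself: NumPy's vectorized kernels (which can use the CPU's SIMD instructions) merely stand in for dedicated accelerators, and the array size is arbitrary. The point is the gap between doing one element at a time and expressing the work as a single data-parallel operation.

```python
import time
import numpy as np

n = 2_000_000
x = np.random.rand(n)
y = np.random.rand(n)

# Element-at-a-time Python loop: one multiply-add per iteration.
start = time.perf_counter()
total = 0.0
for i in range(n):
    total += x[i] * y[i]
loop_seconds = time.perf_counter() - start

# The same dot product expressed as a single data-parallel operation.
start = time.perf_counter()
total_vec = np.dot(x, y)
vec_seconds = time.perf_counter() - start

print(f"loop: {loop_seconds:.3f}s  vectorized: {vec_seconds:.5f}s  "
      f"(same result: {np.isclose(total, total_vec)})")
```

GPUs and TPUs push this same idea much further by throwing thousands of parallel execution units at exactly this kind of regular, data-parallel work.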
RISC-V is an open-standard instruction set architecture (ISA) being explored and adopted by major silicon vendors and cloud providers. This article from Red Hat Research Quarterly discusses RISC-V more deeply, particularly in a Field Programmable Gate Array (FPGA) context.
More broadly, although RISC-V has so far been deployed primarily in relatively small, inexpensive systems, the architecture is fully capable of both server and desktop/laptop deployments. Though it's a few years old (so don't pay too much attention to the specifics), this interview I did with RISC-V's CTO gets into a lot of the approach RISC-V is taking that still applies today.
While longer term and more speculative, quantum computing is another important area of hardware innovation. A quantum computer replaces bits with qubits—controllable units of computing that display quantum properties. Qubits are typically made out of either an engineered superconducting component or a naturally occurring quantum object such as an electron. Qubits can be placed into a "superposition" state that is a complex combination of the 0 and 1 states. You sometimes hear that qubits are both 0 and 1, but that's not really accurate. What is true is that, when a measurement is made, the qubit state will collapse into a 0 or 1. Mathematically, the (unmeasured) quantum state of the qubit is a point on a geometric representation called the Bloch sphere.
While superposition is a novel property for anyone used to classical computing, one qubit by itself isn't very interesting. The next unique quantum computational property is "interference." Real quantum computers are essentially statistical in nature. Quantum algorithms encode an interference pattern that increases the probability of measuring a state encoding the solution.
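For readers who like to see the mechanics, here's a minimal state-vector sketch in plain NumPy (no quantum SDK) of the single-qubit behavior described above: a Hadamard gate puts |0> into an equal superposition, measurement sampling collapses it to 0 or 1 with the squared amplitudes as probabilities, and applying the Hadamard twice makes the two paths interfere so that |0> comes back every time.

```python
import numpy as np

rng = np.random.default_rng(0)
ket0 = np.array([1, 0], dtype=complex)                       # the |0> state
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)  # Hadamard gate

def measure(state, shots=1000):
    """Sample measurement outcomes; each probability is the squared amplitude."""
    probs = np.abs(state) ** 2
    outcomes = rng.choice(len(state), size=shots, p=probs)
    return np.bincount(outcomes, minlength=len(state))

superposed = H @ ket0            # amplitudes (1/sqrt(2), 1/sqrt(2)): a superposition
print("H|0>  counts:", measure(superposed))   # roughly half 0s, half 1s

interfered = H @ (H @ ket0)      # the two paths interfere; applying H twice undoes it
print("HH|0> counts:", measure(interfered))   # all 0s: the interference is deterministic
```

Real quantum algorithms do the same kind of thing at much larger scale, arranging the interference so that the amplitudes of wrong answers cancel and the amplitude of the right answer grows.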
While novel, superposition and interference do have some analogs in the physical world. The quantum mechanical property "entanglement" doesn't, and it's the real key to exponential quantum speedups. With entanglement, measurements on one particle can affect the outcome of subsequent measurements on any entangled particles—even ones not physically connected.
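And a similarly minimal NumPy sketch of the entanglement point: prepare the two-qubit Bell state (|00> + |11>)/sqrt(2) and sample joint measurements. Each qubit on its own looks like a fair coin flip, but the two outcomes always agree.

```python
import numpy as np

# Amplitudes over the basis states |00>, |01>, |10>, |11>.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

rng = np.random.default_rng(1)
probs = np.abs(bell) ** 2                      # [0.5, 0, 0, 0.5]
outcomes = rng.choice(4, size=1000, p=probs)   # joint measurement of both qubits

first = outcomes // 2    # first qubit's measured bit
second = outcomes % 2    # second qubit's measured bit

print("first qubit 1s: ", int(first.sum()))              # roughly 500: individually random
print("second qubit 1s:", int(second.sum()))             # roughly 500: individually random
print("agreements:     ", int((first == second).sum()))  # 1000: perfectly correlated
```

A classical simulation like this has to track the joint state explicitly, which is exactly what becomes exponentially expensive as the number of entangled qubits grows.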
While a lot of attention has focused on quantum computing's potential impact on cryptography, it's more broadly imagined (assuming various advances) as a way to increase efficiencies, or even to solve problems in many fields that just aren't practically solvable today.
Conclusion