Wednesday, January 21, 2015

Don't skeuomorph your containers

Containers were initially pitched as more or less just another form of partitioning. A way to split large systems into smaller ones in which workloads not requiring a complete system by themselves could coexist without interfering with each other. Server/hardware virtualization is the most familiar form of partitioning today but, in its x86 form, it was only the latest in a long series of partitioning techniques initially applied mostly to mainframes and Unix servers. 

The implementation details of these various approaches differed enormously and even within a single vendor—nay, within a single system design—multiple techniques hit different points along a continuum which mostly traded off flexibility against degree of isolation between workloads. For example, the HP Superdome had a form of physical partitioning using hardware, a more software-based partitioning approach, as well as a server virtualization variant for HP-UX on the system’s Itanium processors. 

But, whatever their differences, these approaches didn’t really change much about how one used and interacted with the individual partitions. They were like the original pre-partitioned systems; there were just more of them and they were correspondingly smaller. Indeed, that was sort of the point. Partitioning was fundamentally about efficiency and was logically just an extension of the resource management approaches that had historically allowed multiple workloads to coexist.


At a financial industry luncheon discussion I attended last December, one of the participants coined a term that I promptly told him I was going to steal. And I did. That term was “skeuomorphic virtualization,” which he used to describe hardware/server virtualization. Skeuomorphism is usually discussed in the context of industrial design. Wikipedia describes a skeuomorph as “a derivative object that retains ornamental design cues from structures that were necessary in the original.” The term has entered the popular lexicon because of the shift away from shadows and other references to the physical world, such as leather-patterned icons, in recent versions of Apple’s iOS.

However, the concept of skeuomorphism can be thought of as applying more broadly—to the idea that existing patterns and modes of interaction can be retained even though they’re not necessarily required for a new technology. In the case of “skeuomorphic virtualization,” a hypervisor abstracts the underlying hardware. While this abstraction was employed over time to enable new capabilities like live migration that were difficult and expensive to implement on bare metal, virtualized servers still largely look and feel like physical ones to their users. Large pools of virtualized servers do require new management software and techniques—think the VMware administrator role—but the fundamental units under management still have a lot in common with a physical box: independent operating system instances that are individually customizable and which are often relatively large and long-lived. Think of all the work that has gone into scaling up individual VMs in both proprietary virtualization and open source KVM/Red Hat Enterprise Virtualization. 

In fact, I’ll go so far as to argue that the hardware virtualization approach largely won out over the alternatives available circa 2000 because of skeuomorphism. Hardware virtualization let companies use their servers more efficiently by placing more workloads on each server. But it also let them continue to use whatever hodgepodge of operating system versions they were using and to continue to treat individual instances as unique “snowflake” servers if they so chose. The main OS virtualization (a.k.a. containers) alternative at the time, SWSoft’s Virtuozzo, wasn’t as good a match for highly heterogeneous enterprise environments because it required all the workloads on a server to run atop a single OS kernel. In other words, it imposed requirements that went beyond the typical datacenter reality of the day. (Lots more on that background.)

Today, however, as containers enjoy a resurgence of interest, it would be a mistake to continue to treat this form of virtualization as essentially a different flavor of physical server. As my Red Hat colleague Mark Lamourine noted on a recent podcast:

One of the things I've hit so far, repeatedly, and I didn't really expect it at first because I'd already gotten myself immersed in this was that everybody's first response when they say, "Oh, we're going to move our application to containers," is that they're thinking of their application as the database, the Web server, the communications pieces, the storage. They're like, "Well, we'll take that and we'll put it all in one container because we're used to putting it all on one host or all in one virtual machine. That'll be the simplest way to start leveraging containers." In every case, it takes several days to a week or two for the people looking at it to suddenly realize that it's really important to start thinking about decomposition, to start thinking about their application as a set of components rather than as a unit.

In other words, modern containers can be thought of and approached as “fat containers” that are essentially a variant of legacy virtual machines. But it’s far more fruitful and useful to approach them as something fundamentally new and enabling that’s part and parcel of an environment including containerized operating systems, container packaging systems, container orchestration like Kubernetes, DevOps practices, microservices architectures, “cattle” workloads, software-defined everything, and pervasive open source as part of a new platform for cloud apps. 

 

 

Wednesday, January 14, 2015

Links for 01-14-2015

Wednesday, January 07, 2015

Photo: Zabriskie Point from last fall

After Amazon re:Invent in Las Vegas I spent a few days in Death Valley (which is one of the few redeeming things about going to Las Vegas). On my last morning, I got an interesting mix of sun and clouds. Zabriskie Point was actually supposed to be closed for various reconstruction work but the closing had been moved out a month.

Podcast: Containerized operating systems with Mark Lamourine

Packaging applications using containers is a hot trend that goes well beyond containers as just operating system virtualization. In this podcast, we discuss the benefits of a containerized operating system like Red Hat Enterprise Linux Atomic Host, how it works from a technical perspective, and how containers aren't just another take on virtualization.


Listen to MP3 (0:17:08)
Listen to OGG (0:17:08)

[Transcript]

Gordon Haff:  Hi, everyone. This is Gordon Haff with the New Year's edition of the Cloudy Chat Podcast. I'm once again here with my colleague, Mark Lamourine.
Welcome, Mark.
Mark Lamourine:  Hello.
Gordon:  We've been talking a lot about containers and the orchestration of containers. For this session, we'll talk about where those containers run. We're going to talk a little bit about the generic containerized operating systems, and then get into some specifics about the Red Hat Enterprise Linux 7 Atomic Host which is now in beta.
We have an upstream project, Project Atomic, and we have the beta of our commercial offering. For the rest of this podcast, we're going to talk about Atomic or Atomic host or RHEL Atomic, and that's just shorthand for this technology in general. Mark, maybe you could start us off with talking a little bit about the idea behind container hosts in general.
Mark:  One of the important things about containers is that they make it possible to do some things that you can't do if you're running on the bare host. It allows you to include libraries and things that might not be resident on every host. You don't have to worry about it. As the application designer, you just include the parts you need.
They run inside this virtual container. One of the things people noticed right away is that if you start doing this, suddenly a lot of the things that are on the host on a general purpose host aren't really necessary there because the containers all bring along whatever they need. It very quickly became evident that you could pare out a lot of the general purpose applications leaving only a minimal host which is designed specifically to run containers.
Gordon:  Because you can basically put the specific things that a given application needs, package it up, and essentially be part of that application.
Mark:  It turns out that once you have a set of containers, the way you work with them is merely to start and stop the containers. You don't have to run a lot of other commands to make the applications run. There's no point in having those commands there at all.
Gordon:  This really comes back to the theme of one of our earlier podcasts that we did towards the end of last year that effectively containers have almost evolved from this thinking around being a way of virtualizing the operating system. Which they are from a technical standpoint, but the thing that's really interesting people about them is they're a way to essentially virtualize and package up applications.
Mark:  That's really one of the more critical aspects of containerization is it's really a new software delivery mechanism. We started off with tarballs and GZIPs and graduated, although people curse them sometimes, to packaging systems, whether it's RPM, Debian or SysV packaging on Solaris.
We've got some other things if you go to different languages. Ruby has its own Ruby packaging mechanism, the Ruby Gem. Python has their own. But each of those is language specific and application specific. They load more stuff on your host.
What containers bring is the ability to keep your host clean, to not have all of that extra stuff burdening the host that's running the application. Those parts are actually in the container with your application. Your host doesn't even have to know they're there.
Gordon:  Conceptually, everything we've been talking about really applies to modern containers as a general concept. Mark, may we talk about what some of the differences or flavors or differing approaches or philosophies there are around some of the different containerized operating system approaches out there?
We've got Atomic. We've got CoreOS. There are various other types of projects that are in the works.
Mark:  In some cases, they're very similar. They're all getting at the idea of a very minimalized base operating system that is designed and tuned for running containers. CoreOS had that where before Docker, they had a means of logging on and just running individual pieces, but Docker was the thing that really brought it all to life by making it so that you could create new containers easily.
You could create images easily, then instantiate them easily, and it created an ecosystem that has started really driving this concept. At their base, they're very similar. They do have some differing philosophies when it comes to management. That's, I think, where some of the differentiation is going to come in.
Gordon:  Could you maybe go into some details about that, about how, let's say, Atomic does things differently from CoreOS, for example?
Mark:  Atomic started with a couple of different projects. It started first with Colin Walters' OSTree. One of the ideas about these containerized hosts is that because you are not installing a bunch of applications and then having to maintain them each individually, you can create a system where you can do your updates to the operating system. You can do them and be able to roll back.
With an RPM‑based maintenance system, once you install the RPMs or the Debian packages, you can remove them, but you can't easily go back to the previous state. Both CoreOS and Project Atomic have this idea that when you replace or update the operating system image, you do it in an atomic fashion. You're doing this in a way which allows you to have nearly perfect roll back to a previous state.
Now they have very different ways of accomplishing this, but, in essence, they have the same goal. That's actually a secondary goal to the container host, but it's something that becomes possible and reasonable when you have a fairly small host versus a more generalized one which has lots and lots of packages to maintain.
Gordon:  Could you describe under the cover how Atomic is doing things?
Mark:  The way Atomic does this is that the author of OSTree, Colin Walters, created a mechanism where instead of having a single file system tree, he has a hidden file system tree that's controlled by the boot loader. What he really has is one that has all of the files in it. Then he has two separate file system trees which contain hard links to those actual files.
When you do an Atomic update, you're only updating one of those trees. You're running the other one. But because the one you're running remains unchanged while you're doing an update, you can reboot forward to your new environment. If that fails, you can reboot back to your old one which hasn't been modified.
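
[Editor's sketch] The hard-link idea Mark describes can be illustrated with a few lines of Python. This is only a toy model under stated assumptions, not how OSTree itself is implemented: the directory names (`objects`, `deploy-a`, `deploy-b`), the file names, and the helper functions are all hypothetical, and the "reboot" is just a variable pointing at one tree or the other.

```python
import os
import tempfile


def write_object(store, name, data):
    """Store one file's content a single time in the shared object directory."""
    with open(os.path.join(store, name), "w") as f:
        f.write(data)


def build_tree(root, store, manifest):
    """Create a deployment tree whose files are hard links into the store."""
    os.makedirs(root)
    for filename, object_name in manifest.items():
        os.link(os.path.join(store, object_name), os.path.join(root, filename))


def read(path):
    with open(path) as f:
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as base:
        store = os.path.join(base, "objects")
        os.makedirs(store)
        write_object(store, "bash-v1", "bash 1")
        write_object(store, "libc-v1", "libc 1")
        write_object(store, "bash-v2", "bash 2")  # the only file the update changes

        tree_a = os.path.join(base, "deploy-a")  # the tree we are "running"
        tree_b = os.path.join(base, "deploy-b")  # the tree being updated
        build_tree(tree_a, store, {"bash": "bash-v1", "libc": "libc-v1"})
        build_tree(tree_b, store, {"bash": "bash-v2", "libc": "libc-v1"})

        current = tree_b  # "reboot forward" into the new tree
        new_bash = read(os.path.join(current, "bash"))
        current = tree_a  # "reboot back": the old tree was never modified
        old_bash = read(os.path.join(current, "bash"))

        # Unchanged files are the same underlying file in both trees.
        libc_shared = os.path.samefile(
            os.path.join(tree_a, "libc"), os.path.join(tree_b, "libc")
        )
        return {"new_bash": new_bash, "old_bash": old_bash, "libc_shared": libc_shared}


if __name__ == "__main__":
    print(demo())
```

Because the update only ever writes into the second tree, rolling back is just a matter of pointing at the first one again; the unchanged files cost no extra space since both trees link to the same stored content.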
Gordon:  That's really quite a change from what you've traditionally done because you build up an operating system, and you don’t have much choice but to just wipe everything clean and start over again, which you couldn’t easily do because all your applications were in there and all your customizations, and it was really hard to do this kind of thing.
Mark:  If you break your system rolling forward, it leaves you with a few very limited choices, which were things like rebuild the machine from kickstart and configuration management. Atomic allows you to do it in a much cleaner fashion and with a lot more assurance that you're going to get what you expect.
Gordon:  The other point we probably ought to highlight with respect to Atomic is that this is built from Red Hat Enterprise Linux 7. All of the certifications, the hardware certifications and other types of certifications, and support mechanisms and everything associated with RHEL 7 still apply to Atomic Host. You get all these benefits you're talking about, but it's still the RHEL that you know and love.
Mark:  If you logged into one, you wouldn't be able to tell it's not RHEL, unless you know where to look to find the Atomic label. If you log into one as a user, the only thing you're going to notice is that there are very few actual user accounts because all the applications run in a container. There's no need for lots of special user accounts.
You log in as root and you do your work, and usually you'll use some orchestration or distributed control system to actually start and stop the containers. It looks like RHEL. It is RHEL.
Gordon:  From the orchestration perspective, it actually includes Kubernetes which is the framework for managing clusters of containers. We've been collaborating with Google on this.
Mark:  Kubernetes is a work in progress. Google is working with us developing, as you call it, an orchestration system. It's a way for you to, instead of saying what you want to do, saying, "Start this Docker command," or, "Stop this Docker command," you get to decide, "I want to build an application with a database and a Web server and a messaging server."
You describe all this and say, "Go," and Kubernetes makes it happen. You don't have to worry about the actual placement.
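
[Editor's sketch] The shift Mark describes, from issuing start and stop commands to describing the application you want, can be shown with a toy example. This is not the Kubernetes API; the `reconcile` function, the component names, and the replica counts are hypothetical and only illustrate the general idea of comparing a desired description against what is currently running.

```python
def reconcile(desired, running):
    """Return the start/stop actions that move `running` toward `desired`.

    Both arguments map a component name to a number of instances.
    """
    actions = []
    for name in sorted(set(desired) | set(running)):
        want = desired.get(name, 0)
        have = running.get(name, 0)
        if want > have:
            actions.extend(("start", name) for _ in range(want - have))
        elif have > want:
            actions.extend(("stop", name) for _ in range(have - want))
    return actions


# The user only describes the application: a database, a Web server, messaging.
app = {"database": 1, "web-server": 2, "messaging": 1}

if __name__ == "__main__":
    for action in reconcile(app, running={}):
        print(action)
```

The user never writes the individual start commands; they fall out of the difference between the description and the current state, and a real orchestrator would also decide which host each container lands on.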
Gordon:  One of the really interesting things, from my perspective of Atomic, is we are getting into all of this cool, new containerized type of stuff. We're getting the container portability across hybrid cloud deployments, different physical hardware, certified hypervisors, and so forth, public cloud providers like Amazon.
But, at the same time, from a sys admin perspective, this is not really a radical change to how they've conventionally done things.
Mark:  There are ways in which it's the same. The hosts are going to be deployed using PXE or some kind of install‑to‑disc mechanism. The user management of those hosts will probably be very similar. There actually is going to be some significant change and some significant learning in how to use these things and where the boundaries are.
That's one of the things that's going to change is where the boundaries are between the admins and the users, the application developers, the operations people. I think that's going to settle out. The boundaries are going to shift. It might not come out the same way it would three years ago.
Gordon:  That is a fair point. I was at a luncheon in New York City before the holidays, and we were having a discussion about containers. One of the points that really came out of the discussion that I think is an important one is you can use containers to look like a slightly different, maybe a little less isolated, a little bit more efficient version of server virtualization.
But, and I think this is really the key point, using containers most efficiently really requires thinking about applications, application development, and application architectures in a lot of different ways.
Mark:  One of the things I've hit so far, repeatedly, and I didn't really expect it at first because I'd already gotten myself immersed in this was that everybody's first response when they say, "Oh, we're going to move our application to containers," is that they're thinking of their application as the database, the Web server, the communications pieces, the storage.
They're like, "Well, we'll take that and we'll put it all in one container because we're used to putting it all on one host or all in one virtual machine. That'll be the simplest way to start leveraging containers." In every case, it takes several days to a week or two for the people looking at it to suddenly realize that it's really important to start thinking about decomposition, to start thinking about their application as a set of components rather than as a unit.
That's one of the places where the orchestration models are going to come in because they're going to allow you to, first, decompose your application from the traditional model, and then recompose it and still treat it as an application, but now using these container components.
Gordon:  One of the folks I was having lunch with, I forget who it was actually, but I told him I was going to steal this term of his. He referred to server virtualization as "skeuomorphic virtualization." What he meant by that was that when server virtualization really came in, one of the reasons it was so successful was that it made physical servers better utilized, and therefore more cost effective.
But, by and large to a first approximation, it didn't change the whole operational and management model of servers. As you say, you can, in principle, use containers the same way. In fact, service providers pretty much have done that. It's a more efficient form of virtualization.
The reason everyone's so excited here, and the reason we're having this series of podcasts, is that it enables things like DevOps. It enables new operational models. It enables new application architectures.
Mark:  The last one is really the interesting one. I like the skeuomorphic metaphor because the reason virtualized operating systems, virtualized hardware was adopted so easily is because everybody went, "Oh. Oh, I get it. That's just like my hardware. Once I get past the first piece."
Containers really aren't. Containers really are a little different. To get the best advantage from them, it's going to take a little bit different thinking along the way. One of the things, the Holy Grail of software development has been the idea of hardware store of objects where you could walk down the aisles of your hardware store, pick up a hammer and a bunch of plywood and two‑by‑fours, and build something.
All of those things were standardized. You'd have all the standard plumbing, heating, and whatever. All of the efforts so far have failed to some degree or other. You look at object‑oriented programming. People thought, "Oh, this is going to completely change the way we program." It's had some effect, but it hasn't had the effect of, "Oh, this is just a hardware store where I go in and pick the components I want and it all works."
Containerization, I don't know if it's going to be successful, but I think it has more promise than the previous ones did. That you can create a database container that is generic enough that it only exposes the relevant variables and that somebody can come along, once they have a certified one, and say, "I would like a database container. I need to give the five variables to the consumer of the container and initialize the database, and I'm ready to go."
The component will be reusable to the point where the user no longer has to really think beyond, "Here are my inputs."
Gordon:  Great, Mark. For those listeners who want to take a look at this, as I said at the beginning, the Red Hat Enterprise Linux 7 Atomic Host is available in beta. It's on both Amazon Web Services and the Google Compute platform. If you want to take a look at the upstream project, that is Project Atomic and links to all that stuff will be in the show notes.

This gets us off to a great start in the New Year. We're going to be talking much more about these and related topics in upcoming podcasts. Thank you, everyone. Thank you, Mark.