Friday, February 20, 2015

Podcast: Configuration Management with Red Hat's Mark Lamourine

We take things up a level from the prior podcasts about container management in this series to discuss the goals of configuration management and how things change (and don't) with containers, what's the meaning of state, promise theory, and containerized operating systems such as Project Atomic.

Previous container podcasts with Mark:
Other shownotes:
Listen to MP3 (0:16:52)
Listen to OGG (0:16:52)

[Transcript]

Gordon Haff:  Hi, everyone. This is Gordon Haff with another edition of the Cloudy Chat podcast, here once again with my colleague, Mark Lamourine.
For today, we're going to take things up a level and talk about what configuration management is at a conceptual level and some of the ways that configuration management is changing in a cloud and containerized world.
One of the reasons this is an interesting topic today, and it really is an interesting topic--I was at configuration management camp in Ghent, Belgium a couple of weeks ago. The event sold out. It was absolutely packed.
All the different configuration management systems out there, enormous amount of interest in this space. The reason there is so much interest is because what has been classic configuration management is changing in pretty fundamental ways.
Mark, you've got a lot of experience as a system admin. You're very familiar with classic or conventional configuration management systems. Maybe a good way to start would be to talk from your perspective as a system admin what is or was configuration management classically.
Mark Lamourine:  It's going to stay. I don't think that's changing. It's finding new places to be applicable. Most people, when they talk about configuration management, they talk about managing the configuration of individual hosts as a bigger system.
Allowing you to create either a portion or a complete enterprise specification for how all of your machines should be configured and then defining that specification and then using the configuration management system to realize that.
You make it so that each machine, as it comes up, joins your configuration management system. Then the processes run on the box to make it fit, to make it configured like your definition, your specification of what that machine should be.
One of the elements of this, usually a controversial one, is whether there's an agent running on each host that listens for changes. There are discussions about whether this is a good or a bad thing and what to do about it.
But the other big thing is that there is some global state definition for what the larger system, the group of hosts should look like and how it should behave.
Gordon:  This gets into a lot of the, again, classic thinking about systems in general, certainly in a pre‑cloud world.
This really applied, whether we're talking physical servers or virtualized servers, that there is some correct state that everything is not only driven toward to start with, but is constantly monitored to keep in tune with that correct truth, if you would.
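The "define a specification, then make each machine fit it" idea discussed here can be sketched in a few lines. This is a minimal illustration only, not how any particular configuration management tool works; the settings in `DESIRED` and the `plan` and `converge` helpers are hypothetical names invented for the example.

```python
# Hypothetical desired-state specification for one host (illustrative values).
DESIRED = {
    "hostname": "web01.example.com",
    "resolv.conf": "nameserver 192.0.2.53",
    "ntp": "running",
}


def plan(desired, actual):
    """Return only the settings that must change for actual to match desired."""
    return {key: value for key, value in desired.items() if actual.get(key) != value}


def converge(desired, actual):
    """Apply the plan in place and return the changes that were made."""
    changes = plan(desired, actual)
    actual.update(changes)
    return changes


# A freshly installed machine that does not yet match the specification.
host = {"hostname": "localhost", "ntp": "stopped"}

first_run = converge(DESIRED, host)    # brings the host into line
second_run = converge(DESIRED, host)   # nothing left to do: the run is idempotent

host["ntp"] = "stopped"                # simulate drift on a long-running host
drift_fix = converge(DESIRED, host)    # the next run corrects only what drifted

print(sorted(first_run))
print(second_run)
print(drift_fix)
```

Running the same step repeatedly is what keeps a long-lived host "in tune" with the specification: a run that finds nothing to change does nothing.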
Mark:  It started out when I was the young cub sysadmin. We'd go and we had a set of manual procedures that started out as things in our head: set the network, set resolv.conf, set the host name, make sure time services were running.
It would start off when you only had a short list of these things that you would do and then hand it over, it wasn't really a big deal. You'd go to each one, you install the machine, you'd spend 15 minutes making it fit into your network and then you'd hand it off to some developer or user.
Over time, we realized that we were doing an awful lot of this and we were hiring lots of people to do this, so we needed to write scripts to do it. Eventually, people started writing configuration management systems, starting with Mark Burgess and CFengine. That was the origin of that.
There were a number of them during that time. Then CFengine and Puppet became the de facto ones for a while; as everyone knows, that's changing a lot now. The idea was that we were doing these tasks manually, and then, when we started automating them, we were automating them in a custom way.
These people recognized the patterns and said, "We can do this. There's a pattern here that we can automate, that we can take one step higher." That led to these various systems which would make your machines work a certain way. The specifications we had, the settings we had were fairly static. That made a lot of sense.
Gordon:  One of the things that's probably worth mentioning, and this gets into the "pets versus cattle" model of the state of systems, is that these systems were pets, i.e., you didn't shut one down and start it up with a clean version of the same thing.
Really, you try to keep the running system running properly. One of the traditional jobs of configuration management was to take care of things like drift. As these systems change, again, bring them back to the correct state of truth.
Mark:  In some sense, the pets versus cattle model, that way of thinking, was enabled by the invention of configuration management systems. People look at it the other way now. When things were pets, it was because they had to be.
The rate of change was slow enough that drift was less of an important thing than just not having to send someone to spend an hour to bring a new machine online.
The fact that you could use these things to prevent drift or to drive change over large groups of systems, I think that was a side effect and something that people realized after they started using the tools to stop doing manual labor.
The cattle versus pets distinction is one that was enabled when all of a sudden, you realize...We used to measure the difficulty of working in an environment by the number of machines per administrator.
When I was first starting, it's like 10 to 1 or 15 to 1 was a good ratio because of the amount of manual labor that went into it.
Then with the start of CM systems, 100 to 1 or 200 to 1 in data center environments was a good ratio. Now, you don't even look at that anymore. Why would you? You've got thousands of VMs.
You get a system like OpenStack or Amazon, you don't even look at the ratio of hosts to sysadmins anymore. It's become irrelevant. It's become irrelevant because these systems made cattle versus pets possible.
Gordon:  You mentioned Mark Burgess. You mentioned this idea of state. Let's talk a little bit more about this. How do you think about state as we move to these containerized cloud‑type systems?
Mark:  I'm confused a little bit. We're finding how this older idea, which made a lot of sense when the machines changed very slowly or relatively slowly, how does that fit when the machines are changing?
The case of a small enterprise, it might be tens of machines started and stopped per day, or hundreds, to something like Amazon or OpenStack, where it's thousands, maybe even thousands per minute. I don't know.
I've seen numbers from Google where they have thousands of machine starts and stops per minute over the entire world. Maybe even that's the wrong scale. The original idea was something where you had something that was essentially stable.
Your machines didn't change. When they changed, it was because you changed them. Again, you had users, who are these other people.
The idea of state made a lot of sense in that context. The idea of a state is static. That's the root of the word. Life has become much more dynamic. We expect change. We expect drift. We expect that our definition of what is correct changes. It changes faster than we can apply it to the machines we have.
We've gone from this idea where I could define a state and the machines would settle on that state using the configuration management system, and then we would come along later and we'd tweak the state.
We'd update some packages or we'd change some specification or we add or remove a user to a point where you almost never expect it to settle, you never expect to reach the state that you've defined as your correct state.
You change things gradually, and the term for that is eventual consistency. Things will eventually get there, but we're changing the state now so fast that, in some senses, if you have this single central state, you're never going to achieve consistency across the entire system before you change the state again. In that sense, I start wondering whether this state really makes sense.
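The "never settles" argument can be made concrete with a little arithmetic. The sketch below is a deliberately crude model, assuming every change to the central specification makes all hosts out of date again and that a fixed number of hosts can be updated per tick; the function name and the numbers are hypothetical.

```python
def max_converged_fraction(hosts, hosts_per_tick, ticks_between_spec_changes):
    """Largest fraction of hosts that match the current spec before it changes again.

    Assumes each spec change invalidates every host and that hosts are
    updated at a constant rate (both are simplifying assumptions).
    """
    updated = min(hosts, hosts_per_tick * ticks_between_spec_changes)
    return updated / hosts


# The spec changes rarely: the fleet has time to settle completely.
slow_change = max_converged_fraction(hosts=1000, hosts_per_tick=50,
                                     ticks_between_spec_changes=30)

# The spec changes faster than the fleet can absorb it: it never settles.
fast_change = max_converged_fraction(hosts=1000, hosts_per_tick=50,
                                     ticks_between_spec_changes=5)

print(slow_change, fast_change)
```

In the second case at most a quarter of the fleet ever matches the definition of "correct" at any moment, which is the situation being described.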
Gordon:  What replaces it?
Mark:  This is where Mark Burgess, some of his work over the last couple of decades, is starting to come into its own. He's a proponent of something called promise theory.
Whether or not the theory holds, there is a kernel of an idea that's really, really important there, which is that...He says this is impossible. He's thinking, this becomes so complex at so many different scales that reaching that state, or sometimes even defining that state, doesn't make sense.
He wants to flip the state definition on its side or upside down. He wants to say, "Let's treat all these things locally. Let's figure out what the little tiny piece is."
The old way would be to say, "I eventually reached some state." What he's saying is that the new piece, you teach it some promises. I promise I will be on the net. I promise that I will serve web pages. I promise that I will take files from a certain location.
You define the promises well down in the scaling. You try and define a system based on, "If all these things fulfill their promises, then some desired behavior will come about at a much higher scale." I'm not yet convinced that this is an engineering model.
This is one of the things that I've talked to you about it and I've talked to a couple of other people about it, that this is a great idea, I like this. What I don't know is how to do engineering with it yet.
We'll see whether or not there are people who are ignoring the state using...Some of the newer configuration management systems, some of them have state built in like Salt does. Ansible really doesn't. Ansible really is more about applying changes to something than reaching a certain state.
There is fuzziness in all of this, whether or not it's true. People are starting to recognize that this is a problem, and people are starting to find ways to define the behaviors of the system without necessarily defining the low level states one piece at a time.
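As a rough illustration of the "local promises" idea described above, here is a small sketch. It is not an implementation of promise theory or of any tool; the `Agent` class, the fact names, and `system_healthy` are all hypothetical and only show each piece checking its own promises, with the larger behavior inferred from those local results.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """A small component that makes promises about its own behavior only."""
    name: str
    facts: Dict[str, bool]
    promises: Dict[str, Callable[[Dict[str, bool]], bool]] = field(default_factory=dict)

    def promise(self, label: str, check: Callable[[Dict[str, bool]], bool]) -> None:
        self.promises[label] = check

    def broken(self) -> List[str]:
        """Return the labels of promises this agent is not currently keeping."""
        return [label for label, check in self.promises.items() if not check(self.facts)]


def system_healthy(agents: List[Agent]) -> bool:
    """The higher-level behavior holds if every agent keeps its own promises."""
    return all(not agent.broken() for agent in agents)


web = Agent("web", {"on_network": True, "serving_pages": True})
web.promise("I will be on the net", lambda facts: facts["on_network"])
web.promise("I will serve web pages", lambda facts: facts["serving_pages"])

files = Agent("files", {"on_network": True, "source_mounted": False})
files.promise("I will be on the net", lambda facts: facts["on_network"])
files.promise("I will take files from a certain location",
              lambda facts: facts["source_mounted"])

print(web.broken())
print(files.broken())
print(system_healthy([web, files]))
```

Note that there is no central state definition anywhere in the sketch; whether this style scales into an engineering practice is exactly the open question raised in the conversation.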
Gordon:  That's probably a pretty good segue to bring this particular podcast home. As I mentioned, I was at config management camp a couple of weeks ago. There was a huge amount of interest in Chef, in Puppet, in Salt, in Ansible, in Foreman, in CFengine.
Maybe we could close this out with some comments about some of the different approaches being taken here and some of your thoughts on some of these different tools.
Mark:  The first thing I want to say with respect to that is that while I describe this fast‑moving dynamic environment, there are lots of companies that are still and will continue to run in a more conventional environment for a long time.
I'm not saying that these configuration management systems are, in any sense, obsolete. They still have a place, because the environments that they are designed for still exist.
That said, there are several different things that seem to happen. One is the push versus pull model. You get systems like Puppet, which are a strong push model. You get something like CFEngine, which uses a strong pull model.
In both cases, they have had to create feedback mechanisms, which really amount to the other model. That leads me to believe that push versus pull is probably a straw man, and that there probably have to be feedback loops in both directions regardless of which emphasis you take.
Then you get the agent versus agentless discussion. There are people who would say, "Adding this new thing that runs on each host that listens for changes is an overhead, which isn't really necessary." The strongest proponent of an agentless system that I've heard of is Ansible.
Ansible uses SSH, which is, in some senses, its agent. Then the SSH login triggers some Ansible behavior on the host. Again, I think this is a muddy distinction.
But it's fair that this additional agent doesn't run in Ansible's case but also Ansible, it seems to me, defines more the means of creating the state while ignoring the state engine itself. I'm probably going to get hate mail and corrections for that. Corrections are welcome, hate mail, not so much.
These are the distinctions that are there now. There are still people now who are looking at the cloud environment, and they're looking at these configuration management systems in trying to figure out how to use them. They're still trying to apply them in the same way. I'm a little suspicious of that as well.
I'm interested in seeing how configuration management systems get used in an Atomic environment [Red Hat Enterprise Linux Atomic Host] or in a CoreOS environment or a minimalized operating system environment, where the whole point of that is to eliminate the need for this configuration management and where they move the configuration out to the containers.
Put a container here, put another container here, make the containers work together, that's what the configuration management system would have done. Now we got orchestration systems doing that for the most part.
I'm interested in seeing how this evolves: how the conventional system administration systems fit, how people end up using them, and whether or not they turn out to be more or less useful than they would be in a conventional environment.
Gordon:  If someone wants to learn some more about this stuff, what do you recommend?
Mark:  First is to look at the various configuration management systems and largely avoid the hype. There are people who are advocates, not so much pundits. I'm skeptical of people who will say, "This is the right way. This is the best way." If you want to learn about promise theory, Mark Burgess's books certainly cover that. Mark's the only person I know who is publishing in an academic sense.

This is one of the things I'm personally interested in: system administration as something worthy of academic study. Mark is the only person I know who's doing that and publishing.

Friday, February 13, 2015

Recording podcasts in-person 2015 edition

For my Cloudy Chat podcast series, I’ve been focusing lately on repeat guests drawn heavily from local Red Hat colleagues in Westford. I find it’s a great way to get interesting material out there without a whole lot of logistical overhead. Especially with all the activity going on with containers, docker, kubernetes, configuration management, and containerized operating systems like Project Atomic, there’s no shortage of things to cover without going too far afield.

I describe an earlier setup here. (See also how I use Google+ Hangouts for remote recording.) However, over time, I’ve experimented with some different setups for in-person recording to simplify the process while maintaining good quality. I’m pretty happy with where I’ve ended up—with the caveat that I’m always learning and tweaking things. 

For recordings in the office:

In my earlier post, I describe recording using a laptop and a USB microphone. I’ve also done recordings using a Peavey PV6 USB Mixing Console and XLR dynamic microphones connected to a laptop. I still use the latter setup if there are more than two of us and/or I want to control the individual microphone levels. However, in the interests of simplicity, I now use a digital recorder connected to two dynamic microphones on desktop stands. Here’s the specific gear list:

You’ll probably also want a larger SD card (the recorder comes with a 2GB one), a mini-USB cable and power adaptor, and some spare AA batteries.

With this setup, you can just sit the recorder on the table, plug in the microphones, and sit one in front of yourself and one in front of your guest. I’m not going to go into every detail of the recorder, but here are a few tips and tricks.

  • You may want to plug the recorder into power (using its mini-USB) if possible. It’s fairly battery hungry and doesn’t give a lot of warning when it’s about to go.
  • Make sure you have recording space left. (You may be noticing a theme here. It’s called personal experience.)
  • The Tascam input can be set to auto-level. In my experience it’s only somewhat effective but I still find it better to use it than to not use it. 
  • The two external microphones will record into different stereo channels, which offers another opportunity to balance the recording with a bit of manipulation in Audacity or your audio processing software of choice. You can split the stereo channels into separate mono tracks, process them individually, and then recombine into a single stereo track. 
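The split, process, and recombine step in the last tip can be sketched in code. This is only an illustration of the idea on raw sample values, assuming interleaved stereo samples (left, right, left, right, ...); it is not how Audacity does it, and the function names, sample values, and gain factors are hypothetical.

```python
def split_stereo(interleaved):
    """Split interleaved [L0, R0, L1, R1, ...] samples into two mono tracks."""
    return interleaved[0::2], interleaved[1::2]


def apply_gain(samples, gain):
    """Scale every sample in one mono track by the same factor."""
    return [int(round(sample * gain)) for sample in samples]


def merge_stereo(left, right):
    """Recombine two mono tracks into one interleaved stereo track."""
    merged = []
    for left_sample, right_sample in zip(left, right):
        merged.extend((left_sample, right_sample))
    return merged


# Left channel: a quiet host microphone. Right channel: a loud guest microphone.
recording = [1000, 4000, 2000, 8000, -1000, -4000]

host_track, guest_track = split_stereo(recording)
balanced = merge_stereo(apply_gain(host_track, 2.0), apply_gain(guest_track, 0.5))

print(balanced)
```

Because each microphone lands in its own channel, the two voices can be levelled independently before the tracks are put back together.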

For recordings on the road:

While the above setup is relatively compact, it’s more than I really want to travel with most of the time. Furthermore, it requires that you be able to find a table in a relatively quiet area, which is often far easier said than done at the conferences I attend. You can’t really use the Tascam as a handheld recorder with its internal mics. They’re just too sensitive and pick up the noise of you handling the recorder.

Instead, I use my iPhone or iPad and plug in a handheld iRig microphone. There's a corresponding iOS application but there's no reason you couldn't use any other recording application; the microphone just plugs into a standard 3.5mm jack. One nice detail of the iRig is that it comes with a splitter built into the jack. This means that you can easily monitor the recording with headphones, which can be useful if you're dealing with intermittent background noise.

I then just hold the microphone and move it up close to whoever is speaking at the moment. This generally works quite well for the style of interview podcasts that I do. I then transfer the recording to my laptop using whatever mechanism the recording app provides—in the case of iRig, I send it up to a server with FTP, then download it. I then edit the recording using Audacity in the usual way.

The same company also makes a small microphone that plugs directly into the jack of an iPhone. I don’t find handling the iPhone like a microphone quite as natural as handling a cylindrical microphone—but this mic lives in my accessory bag so it’s always with me in case an opportunity to make a recording pops up.

Thursday, February 12, 2015

Podcast: Architecting for Containers with Mark Lamourine


In the latest episode in my containers series, I talk with my Red Hat colleague Mark Lamourine about how containers change the way we architect applications. No longer do we aggregate all of an application's services together. Rather we decompose them and encapsulate individual services within loosely coupled containers. We also talk about extending the container model across multiple hosts and orchestrating them using Kubernetes.


Prior container podcasts with Mark:



Listen to MP3 (0:18:41)
Listen to OGG (0:18:41)

[Transcript]

Gordon Haff:  Hi, everyone. Welcome to another edition of the Cloudy Chat Podcast. I'm Gordon Haff, and I'm here with my colleague in the cloud product strategy group, Mark Lamourine.
Welcome, Mark.
Mark Lamourine:  Thank you.
Gordon:  For today's topic, we're going to dive into a little more technical detail. If you go to the show notes, you'll find pointers to some of our earlier discussions, but for today, we're going to dive into a bit more detail. Specifically, we're going to talk about adapting complex applications for Docker and Kubernetes, Kubernetes being the orchestration engine we're working on together with Google.
Many of these points are also going to be more generally applicable to complex containerized applications in general. Let's start off. First topic we'll talk about, Mark, is decomposition, this idea of microservices, the heart of what microservices are.
Mark:  One of the things that we found when we did a hackathon several months ago, with a number of groups within Red Hat, each of them with an eye toward containerizing their applications, everybody's first impulse was to try and take all of the parts and put them in one container. Thinking this is how we do it when we have a single host. We put the database and the messaging and all of the things on one host.
It only took a few hours well into the second day for some people to discover that they really had to take the different parts and put them into separate containers. It's not just an ideal "Oh, gee, we'll have micro containers. Won't that be wonderful? Or microservices." When you start working with it, you actually find that that's the easiest and the best way to manage your applications.
Gordon:  We talked about this in a little more detail in the last podcast. As you say, everyone's first reaction is "This is just a different form of virtualization." And arguably one of the reasons virtualization became so popular was that people could, to a first approximation, just treat it like physical servers.
Containers really are allied, if you would, with a different architectural approach.
Mark:  They are. It's something that's very new for a lot of people, pretty much for everybody. It's a pretty new way of thinking about how to work with your applications. It enforces something that designers, and especially system administrators like me, have been asking for for a long time, which is establishing clear boundaries between the various sub‑services of your applications.
As system administrators, we wanted that because we didn't want to have any hidden interactions between the parts. When we went to diagnose one, we'd know what configuration files and what settings to look for. If the configuration files were used, for instance, by two different components, there would be a bleed over.
Some data might change in one, and it would appear in the other. That would cause confusion for us.
We've been asking for something like this for a long time. It becomes really important when you start working with containers because of the isolation the containers give you. It means you really have to think carefully about what information is shared and what information is really for only one component.
Gordon:  Mark, the idea here with these microservices is that you can just pick them off the shelf and reuse them. In fact, that's one of the big arguments for having a microservices type architecture. In practice, you still need to get configuration information specific to a given environment into those containers. How do you go about doing that?
Mark:  Docker offers two major ways of getting things in like that. The first way is to provide environment variables on the command line. Understand, when I talk about getting the stuff in, I say provide an environment variable. What you're really doing on the Docker command line is saying, "Make this environment variable exist inside the container."
You're putting in a command line argument so that when your container runs, the container will see an environment variable with the corresponding value. One way is to pass in these environment variables. A second way using Docker is to put the arguments on the Docker command line itself.
If the container is designed well, those arguments will show up as command line arguments to your process. For instance, if you're starting a MongoDB database, if the container is designed to accept them, you could put the MongoDB arguments on the end of your Docker command line, and the MongoDB process will get those as if they were passed in as command line arguments to it.
A third way using Docker is to provide a file system volume for some existing file on the outside. That's not really recommended for something like Kubernetes, because that means you have to push the file to whatever your target is and then to tell Kubernetes to pick it up.
Most containers should probably use either environment variables or command line arguments for passing values into a container, especially when you're using Kubernetes.
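The two recommended approaches can be seen side by side by building the `docker run` command line. The sketch below only assembles the argument list and does not run Docker. The `-e NAME=value` flag and trailing arguments after the image name are standard `docker run` usage; the image name `example/mongodb`, the variable `APP_ENV`, and the helper `docker_run_argv` are hypothetical, and whether trailing arguments reach the process depends on how the image is built.

```python
def docker_run_argv(image, env=None, args=None):
    """Build a `docker run` argument list.

    env:  variables that should exist inside the container (passed with -e)
    args: arguments placed after the image name, which a suitably designed
          image hands to its main process as command line arguments
    """
    argv = ["docker", "run"]
    for name, value in sorted((env or {}).items()):
        argv += ["-e", f"{name}={value}"]
    argv.append(image)
    argv += list(args or [])
    return argv


cmd = docker_run_argv(
    "example/mongodb",                 # hypothetical image name
    env={"APP_ENV": "staging"},        # hypothetical environment variable
    args=["--port", "27018"],          # arguments intended for the database process
)

print(" ".join(cmd))
```

Everything before the image name is addressed to Docker; everything after it is addressed to the process inside the container.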
Gordon:  That probably goes without saying, but I might say anyway that these environment variables, for example, are going to be isolated within that user's container.
Mark:  I'm sorry. Yes. When you set the environment variables using Docker, there's an argument to the Docker command line. That causes those environment variables to appear to the processes inside the container but not to anything else.
Kubernetes also has a means of setting environment variables or command line arguments to the containers inside the Kubernetes pod.
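For the Kubernetes side, the same values go into the pod definition. The sketch below builds a pod manifest as a plain dictionary and prints it as JSON. The field names follow the Kubernetes `v1` Pod schema, which may differ from the API version in use when this was recorded; the image name, variable name, and `pod_manifest` helper are hypothetical.

```python
import json


def pod_manifest(name, image, env=None, args=None, labels=None):
    """Build a single-container pod definition with env variables and arguments."""
    container = {"name": name, "image": image}
    if env:
        # Each entry becomes an environment variable inside the container.
        container["env"] = [{"name": key, "value": value}
                            for key, value in sorted(env.items())]
    if args:
        # Passed to the container's main process as command line arguments.
        container["args"] = list(args)
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": dict(labels or {})},
        "spec": {"containers": [container]},
    }


db_pod = pod_manifest(
    "db",
    "example/mongodb",                 # hypothetical image name
    env={"DB_NAME": "inventory"},      # hypothetical variable
    args=["--port", "27018"],
    labels={"app": "db"},
)

print(json.dumps(db_pod, indent=2))
```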
Gordon:  Let's talk about the process of getting started here. You want to have these containers, these microservices, eventually running as part of an orchestrated Kubernetes application. How do you get started with this process?
Mark:  This is part of what we were talking about with decomposition: by breaking your application down into component services, you can make sure that they're working inside Kubernetes and inside Docker one piece at a time.
In the examples I've used so far, I typically started with the database. I'll build the database into a Docker container and I will build a Kubernetes pod around it, make sure that the pod runs in Kubernetes. Then I'll go to the next piece, whether it's messaging or a web server or something like that.
As I design the next piece, I will make sure that it can talk to the container running inside Kubernetes even though the testing one is running outside. When I'm satisfied with that, I'll build a Kubernetes pod around that and move it inside it and gradually iteratively build up a complete application running in Kubernetes.
Gordon:  You're coming at this with your sys admin hat on. We're talking about this from an infrastructure perspective. It's probably worth noting that this is a very DevOps‑y kind of approach, or a world in terms of these small services loosely coupled, incrementally added, incrementally updated.
Mark:  It is a different idea from more conventional application design. It's been very conventional especially in the VM world to create a complex application by just propping the software on and then tweaking the configuration files on your database and your Web server and writing some script that does all this stuff but it all runs on one host.
What this fosters is the ability to blur the lines between the different parts. Sometimes, things bleed over. You can't do that with a container. You can't trust those bleed over pieces to work. You really have to build up the pieces one at a time. I suppose I would call that DevOps, but as a system admin, I would've always called that good development practice.
Gordon:  A real world example here. Amazon, for example...I'm not talking Amazon Web Services now but really Amazon, the retail entity. Their practice in terms of how they put things together is that everything must have an API and only talk through that API. That's really a way of institutionalizing that type of approach to development.
Mark:  As a system admin, this just makes my little heart go pit-a-pat. More times than I care to think about, I've been faced with a situation where someone said, "My application worked." It turned out that there was some file being changed by one service of the application that was used by another service. The change was blind.
It wasn't until we started looking at this whole thing as a system that we discovered it. Containers really enforce this, because by setting the boundaries, they're not only setting the boundaries to the outside world as a security boundary but they're also setting the boundaries between the different components of your application and enforcing those boundaries.
This is going to make for much more robust applications in the long run, because a lot of these side effects that slip through in traditional application design are going to become exposed. It's a lot more work up front. There are a lot of people now where their first thought is, we'll just do it the way we've always done it as if it was running on a VMware host.
They very soon find that in a container environment, this is very difficult because it turns out you can't get inside easily once you put all of that stuff inside. Whereas if you build each of your components into a separate container, you can observe the interactions between them in a way that becomes much more clear.
Gordon:  Mark, let's get back to talking specifically about Kubernetes and specifically how you create Kubernetes service and pod definitions and if you could start out by defining what pod means in the context of Kubernetes.
Mark:  The pod is the basic computational unit for Kubernetes. It wraps containers. It wraps Docker containers. It actually recognizes something that some people had been wondering about for a while. A common use of Docker containers includes punching holes between containers that are sharing information.
You don't want the information to get out further, but you want the processes to be able to share information.
In Docker, you're actually punching holes out of one container, into another. With Kubernetes, what they do instead is they allow you to create this thing called the pod, which by definition, contains multiple containers, multiple Docker containers. The pod is used to define the resources which are shared by all the containers inside.
By definition, they will share a network name space. They may also share some external volume space. There are other components that a pod can allow them to share. The main important thing is, the pod and the container are somewhat analogous but the pod can be bigger than one container.
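A pod that is "bigger than one container" might look like the sketch below: two containers in one pod, both mounting the same volume. Sharing the network namespace needs no field of its own because it comes with being in the same pod. The field names follow the Kubernetes `v1` Pod schema (which may differ from the API at the time of this recording); the container names, image names, and `shared_pod` helper are hypothetical.

```python
import json


def shared_pod(name, containers, volume_name="shared-data", mount_path="/data"):
    """Build a pod whose containers all mount one shared scratch volume.

    containers: list of (container_name, image) pairs
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # A pod-level volume that exists for the lifetime of the pod.
            "volumes": [{"name": volume_name, "emptyDir": {}}],
            "containers": [
                {
                    "name": container_name,
                    "image": image,
                    "volumeMounts": [{"name": volume_name, "mountPath": mount_path}],
                }
                for container_name, image in containers
            ],
        },
    }


pod = shared_pod("web", [("server", "example/httpd"), ("content-sync", "example/sync")])

print(json.dumps(pod, indent=2))
```

The shared resources are declared once at the pod level and then referenced by each container, rather than punching holes from one container into another.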
The other important component is a service. What a Kubernetes service does is make the processes running inside a container available over a network. What a service does is it defines an IP address and a port that are well‑known that you can then attach to your database or your Web server so that other containers and processes outside can reach them.
Gordon:  It's a type of an abstraction?
Mark:  It is. It's actually implemented as a proxy and a load balancer. Each of the Kubernetes minions has this proxy process running on it. When you create the service object, all of those proxies will start listening on the ports you define and forward any packets they get to the pods, which are ready to accept them.
Gordon:  Lets you implement load balancing, HA, things like that?
Mark:  It does, but the more important thing right up front is merely letting processes in one container know how to communicate with the processes in another. If you've got a Web server that wants to have a database back end, if you just had the containers out there in the cloud, there would be no way for the Web server to find the database.
Using a service object, you can say it will create an IP address that your Web server pod can pick up, and it can send packets there. Port forwarders will make sure that the database actually receives them.
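A service definition for the database in this example could be sketched as follows. The field names follow the Kubernetes `v1` Service schema, which may differ from the API version current when this was recorded; the names, labels, port number, and the `service_manifest` and `selects` helpers are hypothetical. The `selects` function is only a local illustration of label matching, not a Kubernetes call.

```python
def service_manifest(name, selector, port, target_port=None):
    """Build a service that forwards a well-known port to matching pods."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": dict(selector),
            "ports": [{"port": port, "targetPort": target_port or port}],
        },
    }


def selects(service, pod):
    """Return True if every selector label on the service matches the pod's labels."""
    selector = service["spec"]["selector"]
    labels = pod.get("metadata", {}).get("labels", {})
    return bool(selector) and all(labels.get(key) == value
                                  for key, value in selector.items())


db_service = service_manifest("db", selector={"app": "db"}, port=27017)

db_pod = {"metadata": {"name": "db-1", "labels": {"app": "db"}}}
web_pod = {"metadata": {"name": "web-1", "labels": {"app": "web"}}}

print(selects(db_service, db_pod), selects(db_service, web_pod))
```

The Web server only needs to know the service's address and port; which database pod actually receives the packets is decided by the label match.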
Gordon:  We’ve been mostly talking about the server side of things so far. There are a few other topics I'd like to touch on: identity, networking, and storage. First of all, identity.
Mark:  There are two aspects of identity with respect to Kubernetes. One is, for a given container, who am I? What's my IP address? What's my host name? What server name am I going to use? Those values are generally passed in on the command line and will be passed in as part of the Kubernetes pod definition.
The other aspect, which is really unresolved, is what user am I when you're running inside the container? There is probably an Apache user defined inside the container. That's not necessarily going to be the same between containers or if there's shared storage, the storage may have a user ID on it that may or may not match.
There's no concrete resolution for that right now. There are people working on things like either using a service like IPA to create a universal user mechanism or using something like Kerberos. How those are going to be integrated with Docker and Kubernetes is unclear still.
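The user ID mismatch Mark mentions can be sketched in a few lines of Python. The example assumes two containers that each define an "apache" user but map it to different numeric IDs. All names and numbers here are hypothetical and chosen only to show why a file on shared storage can look foreign to one of them.

```python
# Each container has its own user table; the same name can map to a
# different numeric UID. All values below are made up for illustration.
web_container_users = {"apache": 48}
batch_container_users = {"apache": 1001}

# Shared storage records only the numeric owner, not the user name.
SHARED_FILE_OWNER_UID = 48


def owns_file(container_users, user_name, file_owner_uid):
    """Return True if user_name in this container maps to the file's owner UID."""
    return container_users.get(user_name) == file_owner_uid


for label, users in [("web", web_container_users), ("batch", batch_container_users)]:
    verdict = "matches" if owns_file(users, "apache", SHARED_FILE_OWNER_UID) else "does not match"
    print(f"In the {label} container, 'apache' {verdict} the owner of the shared file")
```

A shared identity service of the kind Mark mentions would, in effect, make both tables agree on one mapping.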
Gordon:  Software-defined networking, how does that intersect with what we've been talking about?
Mark:  If you're running Kubernetes in a cloud service, the cloud services all have software‑defined networks. This is how they run. This is something that they do to make sure that all of their services are available. Kubernetes has a mechanism right now where if you're running in Google Cloud, you can say for a given service, "Please create me an external IP address."
That'll get published. You can request it from Kubernetes, and then you can tell your users what that address is and you can assign a host name.
If you're not running in a cloud environment where a software‑defined network is part of the infrastructure, right now, there's no good solution. Most networking groups are not amenable to just granting /24 blocks to development groups.
There are people who are working on it and people who are thinking about how best to do this. I know one thing that's being used now is a project from CoreOS called Flannel, which provides networking within the Kubernetes cluster. It could probably be used as well to provide an external interface. It's still fairly limited, the problem being that if you only have one or two external-facing addresses, then you have competition for ports from all of the services inside the cluster.
It's unclear yet how that is perceived.
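The port competition Mark describes follows from simple bookkeeping: each external (address, port) pair can belong to only one service. The Python sketch below makes that concrete. The class `ExternalAddressPool`, the service names, and the address are hypothetical and do not correspond to Flannel or Kubernetes APIs.

```python
class ExternalAddressPool:
    """Tracks which service holds each external (ip, port) pair.

    Hypothetical bookkeeping model, not part of Flannel or Kubernetes.
    """

    def __init__(self, addresses):
        self._addresses = list(addresses)
        self._claims = {}  # (ip, port) -> service name

    def claim(self, service, port):
        """Give the service the first address where the port is still free."""
        for ip in self._addresses:
            if (ip, port) not in self._claims:
                self._claims[(ip, port)] = service
                return ip, port
        raise RuntimeError(f"port {port} is taken on every external address")


# One external-facing address (made up) shared by the whole cluster.
pool = ExternalAddressPool(["203.0.113.5"])
print("blog gets", pool.claim("blog", 80))

try:
    pool.claim("shop", 80)
except RuntimeError as err:
    print("shop cannot be exposed:", err)
```

With a single address, the second service that wants port 80 has nowhere to go. Adding addresses to the pool relieves the conflict, which is why the size of the block a networking group will grant matters.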
Gordon:  We've talked compute, we've talked security, we’ve talked networking. You can probably guess what's coming next. How does this intersect with software‑defined storage?
Mark:  Again, in Kubernetes, they're really focusing on the process control, still, and on the things that are happening at the host-to-container interface. Both Red Hat and Google are working on adding the ability for Kubernetes to manage the host storage so that you could say, put into your Kubernetes pod definition, "Oh, I need storage from some Ceph or from some Google Cloud storage area." Kubernetes will be able to make that happen.
Right now, that's not available. If you wanted to create a Kubernetes cluster that has shared storage, you pretty much have to define and configure the storage on each of the minion hosts first and make it so that...Essentially, any process running on the host can reach the storage by a path, for instance, using an NFS automount.
Making it so that the same path gets you the same storage no matter which host you're on. It's doable. It's not impossible. It's not something that's going to scale up in the long term. There are people working on it.
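The manual setup Mark describes boils down to an invariant: the same path must lead to the same storage on every minion host. A small Python sketch of a check for that invariant follows. The host names, the path, and the NFS export string are all hypothetical, and the mount tables are plain dictionaries rather than anything read from a real system.

```python
def hosts_with_wrong_mount(host_mounts, path, expected_source):
    """Return the hosts where `path` is missing or backed by a different source."""
    return sorted(
        host
        for host, mounts in host_mounts.items()
        if mounts.get(path) != expected_source
    )


# Made-up mount tables: host name -> {mount path: storage source}.
minion_mounts = {
    "minion-1": {"/mnt/shared": "nfs.example.com:/export/data"},
    "minion-2": {"/mnt/shared": "nfs.example.com:/export/data"},
    "minion-3": {},  # storage was never configured on this host
}

bad_hosts = hosts_with_wrong_mount(
    minion_mounts, "/mnt/shared", "nfs.example.com:/export/data"
)
print("Hosts that would break a pod using /mnt/shared:", bad_hosts)
```

Keeping this invariant true by hand on every host is the part that does not scale, which is the gap the work Mark mentions aims to close.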
Gordon:  Thank you, Mark. This has been very educational. We've got a bunch more topics I'd like to get into, but I think we'll maybe leave those for an upcoming podcast. Thanks, everyone. Thanks, Mark.

Mark:  Thank you.

Wednesday, February 11, 2015

Links for 02-17-2015

Presentation: Devising a Practical Approach to the Internet of Things

The Internet of Things (IoT) is hot. It’s also hard to get your head around given the proliferation of wildly different use cases, types of devices, interconnection mechanisms, and data patterns. What’s more, IoT is also intertwined with all the other big computing trends from clouds to data analysis to DevOps processes. In this session, Red Hat’s Gordon Haff will draw on a wide range of research, in addition to user examples, to help you structure your thinking about and approach to IoT as a technology enabler and a business opportunity. This discussion will include the types of platforms associated with IoT, connectivity characteristics, the intersection with data analytics and social, and a snapshot of current standards work.

Originally delivered for MD&M West, Anaheim, CA on 10 Feb 2015

Wednesday, January 21, 2015

Don't skeuomorph your containers

Containers were initially pitched as more or less just another form of partitioning: a way to split large systems into smaller ones in which workloads not requiring a complete system by themselves could coexist without interfering with each other. Server/hardware virtualization is the most familiar form of partitioning today but, in its x86 form, it was only the latest in a long series of partitioning techniques initially applied mostly to mainframes and Unix servers.

The implementation details of these various approaches differed enormously and even within a single vendor—nay, within a single system design—multiple techniques hit different points along a continuum which mostly traded off flexibility against degree of isolation between workloads. For example, the HP Superdome had a form of physical partitioning using hardware, a more software-based partitioning approach, as well as a server virtualization variant for HP-UX on the system’s Itanium processors. 

But, whatever their differences, these approaches didn’t really change much about how one used and interacted with the individual partitions. They were like the original pre-partitioned systems. There were just more of them and they were correspondingly smaller. Indeed, that was sort of the point. Partitioning was fundamentally about efficiency and was logically just an extension of resource management approaches that had historically allowed for the co-existence of multiple workloads.

At a financial industry luncheon discussion I attended last December, one of the participants coined a term that I promptly told him I was going to steal. And I did. That term was “skeuomorphic virtualization,” which he used to describe hardware/server virtualization. Skeuomorphism is usually discussed in the context of industrial design. Wikipedia describes a skeuomorph as “a derivative object that retains ornamental design cues from structures that were necessary in the original.” The term has entered the popular lexicon because of the shift away from shadows and other references to the physical world, such as leather-patterned icons, in recent versions of Apple’s iOS.

However, the concept of skeuomorphism can be thought of as applying more broadly—to the idea that existing patterns and modes of interaction can be retained even though they’re not necessarily required for a new technology. In the case of “skeuomorphic virtualization,” a hypervisor abstracts the underlying hardware. While this abstraction was employed over time to enable new capabilities like live migration that were difficult and expensive to implement on bare metal, virtualized servers still largely look and feel like physical ones to their users. Large pools of virtualized servers do require new management software and techniques—think the VMware administrator role—but the fundamental units under management still have a lot in common with a physical box: independent operating system instances that are individually customizable and which are often relatively large and long-lived. Think of all the work that has gone into scaling up individual VMs in both proprietary virtualization and open source KVM/Red Hat Enterprise Virtualization. 

In fact, I’ll go so far as to argue that the hardware virtualization approach largely won out over the alternatives of the time in c. 2000 because of skeuomorphism. Hardware virtualization let companies use their servers more efficiently by placing more workloads on each server. But it also let them continue to use whatever hodgepodge of operating system versions they were using and to continue to treat individual instances as unique “snowflake” servers if they so chose. The main OS virtualization (a.k.a. containers) alternative at the time—SWSoft’s Virtuozzo—wasn’t as good a match for highly heterogeneous enterprise environments because it required all the workloads on a server to run atop a single OS kernel. In other words, it imposed requirements that went beyond the typical datacenter reality of the day. (Lots more on that background.)

Today, however, as containers enjoy a resurgence of interest, it would be a mistake to continue to treat this form of virtualization as essentially a different flavor of physical server. As my Red Hat colleague Mark Lamourine noted on a recent podcast:

One of the things I've hit so far, repeatedly, and I didn't really expect it at first because I'd already gotten myself immersed in this was that everybody's first response when they say, "Oh, we're going to move our application to containers," is that they're thinking of their application as the database, the Web server, the communications pieces, the storage. They're like, "Well, we'll take that and we'll put it all in one container because we're used to putting it all on one host or all in one virtual machine. That'll be the simplest way to start leveraging containers." In every case, it takes several days to a week or two for the people looking at it to suddenly realize that it's really important to start thinking about decomposition, to start thinking about their application as a set of components rather than as a unit.

In other words, modern containers can be thought of and approached as “fat containers” that are essentially a variant of legacy virtual machines. But it’s far more fruitful and useful to approach them as something fundamentally new and enabling that’s part and parcel of an environment including containerized operating systems, container packaging systems, container orchestration like Kubernetes, DevOps practices, microservices architectures, “cattle” workloads, software-defined everything, and pervasive open source as part of a new platform for cloud apps.