Thursday, February 12, 2015

Podcast: Architecting for Containers with Mark Lamourine

In the latest episode in my containers series, I talk with my Red Hat colleague Mark Lamourine about how containers change the way we architect applications. No longer do we aggregate all of an application's services together. Rather we decompose them and encapsulate individual services within loosely coupled containers. We also talk about extending the container model across multiple hosts and orchestrating them using Kubernetes.

Prior container podcasts with Mark:

Listen to MP3 (0:18:41)
Listen to OGG (0:18:41)


Gordon Haff:  Hi, everyone. Welcome to another edition of the Cloudy Chat Podcast. I'm Gordon Haff, and I'm here with my colleague in the cloud product strategy group, Mark Lamourine.
Welcome, Mark.
Mark Lamourine:  Thank you.
Gordon:  For today's topic, we're going to dive into a little more technical detail. If you go to the show notes, you'll find pointers to some of our earlier discussions, but for today, we're going to dive into a bit more detail. Specifically, we're going to talk about adapting complex applications for Docker and Kubernetes, Kubernetes being the orchestration engine we're working on together with Google.
Many of these points are also going to be more generally applicable to complex containerized applications in general. Let's start off. First topic we'll talk about, Mark, is decomposition, this idea of microservices, the heart of what microservices are.
Mark:  One of the things that we found when we did a hackathon several months ago, with a number of groups within Red Hat, each of them with an eye toward containerizing their applications, everybody's first impulse was to try and take all of the parts and put them in one container. Thinking this is how we do it when we have a single host. We put the database and the messaging and all of the things on one host.
It only took a few hours well into the second day for some people to discover that they really had to take the different parts and put them into separate containers. It's not just an ideal "Oh, gee, we'll have micro containers. Won't that be wonderful? Or microservices." When you start working with it, you actually find that that's the easiest and the best way to manage your applications.
Gordon:  We talked about this in a little more detail in the last podcast. As you say, everyone's first reaction is "This is just a different form of virtualization." And arguably one of the reasons virtualization became so popular was that people could, to a first approximation, just treat it like physical servers.
Containers really are allied, if you would, with a different architectural approach.
Mark:  They are. It's something that's very new for a lot of people, pretty much for everybody. It's a pretty new way of thinking about how to work with your applications. It enforces something that designers, and especially system administrators like me, have been asking for for a long time, which is establishing clear boundaries between the various sub‑services of your applications.
As system administrators, we wanted that because we didn't want to have any hidden interactions between the parts. When we went to diagnose one, we would know what configuration files and what settings to look for. If the configuration files were used, for instance, by two different components, there would be a bleed over.
Some data might change in one, and it would appear in the other. That would cause confusion for us.
We've been asking for something like this for a long time. It becomes really important when you start working with containers because of the isolation the containers give you. It means you really have to think carefully about what information is shared and what information is really for only one component.
Gordon:  Mark, the idea here with these microservices is that you can just pick them off the shelf and reuse them. In fact, that's one of the big arguments for having a microservices type architecture. In practice, you still need to get configuration information specific to a given environment into those containers. How do you go about doing that?
Mark:  Docker offers two major ways of getting things in like that. The first way is to provide environment variables on the command line. Understand, when I talk about getting the stuff in, I say provide an environment variable. What you're really doing on the Docker command line is saying, "Make this environment variable exist inside the container."
You're putting in a command line argument so that, when your container runs, the container will see an environment variable with the corresponding value. One way is to pass in these environment variables. A second way using Docker is that you can actually put the arguments on the Docker command line.
If the container is designed well, those arguments will show up as command line arguments to your process. For instance, if you're starting a MongoDB database, if the container is designed to accept them, you could put the MongoDB arguments on the end of your Docker command line, and the MongoDB process will get those as if they were passed in as command line arguments to it.
A third way using Docker is to provide a file system volume for some existing file on the outside. That's not really recommended for something like Kubernetes, because that means you have to push the file to whatever your target is and then to tell Kubernetes to pick it up.
Most containers should probably use either environment variables or command line arguments for passing values into a container, especially when you're using Kubernetes.
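As an illustration of the two mechanisms Mark recommends, here is a minimal sketch of a container entrypoint that accepts settings both ways. It assumes the values arrive through `docker run -e NAME=value` and through arguments placed after the image name; the variable names `APP_DB_HOST` and `APP_PORT` and the `--port` flag are hypothetical, chosen only for the example.

```python
import argparse


def load_settings(argv, environ):
    """Merge settings from command line arguments and environment variables."""
    parser = argparse.ArgumentParser(description="hypothetical container entrypoint")
    # Arguments placed after the image name on the docker command line land here.
    parser.add_argument("--port", type=int, default=None)
    args = parser.parse_args(argv)

    # Variables set with `docker run -e NAME=value` are visible in the environment.
    db_host = environ.get("APP_DB_HOST", "localhost")

    # Precedence: explicit argument, then environment variable, then default.
    if args.port is not None:
        port = args.port
    else:
        port = int(environ.get("APP_PORT", "8080"))
    return {"db_host": db_host, "port": port}
```

A real entrypoint would call `load_settings(sys.argv[1:], os.environ)`. Passing both in as parameters keeps the function easy to exercise outside a container.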
Gordon:  That probably goes without saying, but I might say anyway that these environment variables, for example, are going to be isolated within that user's container.
Mark:  I'm sorry. Yes. When you set the environment variables using Docker, there's an argument to the Docker command line. That causes those environment variables to appear to the processes inside the container but not to anything else.
Kubernetes also has a means of setting environment variables or command line arguments to the containers inside the Kubernetes pod.
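A rough sketch of what that looks like in a pod definition, built here as a plain Python dictionary and printed as JSON. The field names follow the Kubernetes `v1` API, which may differ from the schema in use when this conversation took place; the image name and the `APP_DB_NAME` variable are hypothetical.

```python
import json


def pod_manifest(name, image, env, args):
    """Build a single-container pod definition as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"name": name}},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Each entry becomes an environment variable inside the container.
                    "env": [{"name": k, "value": v} for k, v in sorted(env.items())],
                    # Handed to the container's entrypoint as command line arguments.
                    "args": list(args),
                }
            ]
        },
    }


manifest = pod_manifest(
    name="db",
    image="example/mongodb",           # hypothetical image name
    env={"APP_DB_NAME": "inventory"},  # hypothetical variable
    args=["--port", "27017"],
)
print(json.dumps(manifest, indent=2))
```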
Gordon:  Let's talk about the process of getting started here. You want to have these containers, these microservices, eventually running as part of an orchestrated Kubernetes application. How do you get started with this process?
Mark:  This is part of what we were talking about with decomposition: breaking your application down into component services allows you to make sure that they're working inside Kubernetes and inside Docker one piece at a time.
In the examples I've used so far, I typically started with the database. I'll build the database into a Docker container and I will build a Kubernetes pod around it, make sure that the pod runs in Kubernetes. Then I'll go to the next piece, whether it's messaging or a web server or something like that.
As I design the next piece, I will make sure that it can talk to the container running inside Kubernetes even though the testing one is running outside. When I'm satisfied with that, I'll build a Kubernetes pod around that and move it inside it and gradually iteratively build up a complete application running in Kubernetes.
Gordon:  You're coming at this with your sys admin hat on. We're talking about this from an infrastructure perspective. It's probably worth noting that this is a very DevOps‑y kind of approach, or a world in terms of these small services loosely coupled, incrementally added, incrementally updated.
Mark:  It is a different idea from more conventional application design. It's been very conventional, especially in the VM world, to create a complex application by just dropping the software on and then tweaking the configuration files on your database and your Web server and writing some script that does all this stuff, but it all runs on one host.
What this fosters is the ability to blur the lines between the different parts. Sometimes, things bleed over. You can't do that with a container. You can't trust those bleed over pieces to work. You really have to build up the pieces one at a time. I suppose I would call that DevOps, but as a system admin, I would've always called that good development practice.
Gordon:  A real world example here. Amazon, for example...I'm not talking Amazon Web Services now but really Amazon, the retail entity. Their practice in terms of how they put things together is that everything must have an API and only talk through that API. That's really a way of institutionalizing that type of approach to development.
Mark:  As a system admin, this just makes my little heart go pit-a-pat. More times than I care to think about, I've been faced with a situation where someone said, "My application worked." It turned out that there was some file being changed by one service of the application that was used by another service. The change was blind.
It wasn't until we started looking at this whole thing as a system that we discovered it. Containers really enforce this, because by setting the boundaries, they're not only setting the boundaries to the outside world as a security boundary but they're also setting the boundaries between the different components of your application and enforcing those boundaries.
This is going to make for much more robust applications in the long run, because a lot of these side effects that slip through in traditional application design are going to become exposed. It's a lot more work up front. There are a lot of people now where their first thought is, we'll just do it the way we've always done it as if it was running on a VMware host.
They very soon find that in a container environment, this is very difficult because it turns out you can't get inside easily once you put all of that stuff inside. Where if you build each of your components into a separate container, you can observe the interactions between them in a way that becomes much more clear.
Gordon:  Mark, let's get back to talking specifically about Kubernetes and specifically how you create Kubernetes service and pod definitions. Could you start out by defining what pod means in the context of Kubernetes?
Mark:  The pod is the basic computational unit for Kubernetes. It wraps containers. It wraps Docker containers. It actually recognizes something that some people had been wondering about for a while. A common use of Docker containers includes punching holes between containers that are sharing information.
You don't want the information to get out further, but you want the processes to be able to share information.
In Docker, you're actually punching holes out of one container, into another. With Kubernetes, what they do instead is they allow you to create this thing called the pod, which by definition, contains multiple containers, multiple Docker containers. The pod is used to define the resources which are shared by all the containers inside.
By definition, they will share a network name space. They may also share some external volume space. There are other components that a pod can allow them to share. The main important thing is, the pod and the container are somewhat analogous but the pod can be bigger than one container.
The other important component is a service. What a Kubernetes service does is it makes the processes running inside a container available over a network. What a service does is it defines an IP address and a port that are well‑known that you can then attach to your database or your Web server so that other containers and processes outside can reach them.
Gordon:  It's a type of an abstraction?
Mark:  It is. It's actually implemented as a proxy and a load balancer. Each of the Kubernetes minions has this proxy process running on it. When you create the service object, all of those proxies will start listening on the ports you define and forward any packets they get to the pods, which are ready to accept them.
Gordon:  Lets you implement load balancing, HA, things like that?
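To make the service idea concrete, here is a hedged sketch of a service definition as a Python dictionary, along with a small helper showing how a label selector ties the service to the pods behind it. The field names follow the Kubernetes `v1` API, which may differ from the schema available at the time of this conversation; the names `db` and `web` and the `selects` helper are hypothetical.

```python
def service_manifest(name, selector, port, target_port):
    """Build a service definition that exposes matching pods on a fixed port."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            # Pods whose labels include all of these pairs sit behind the service.
            "selector": dict(selector),
            # `port` is what clients connect to; `targetPort` is the container's port.
            "ports": [{"port": port, "targetPort": target_port}],
        },
    }


def selects(service, pod_labels):
    """Return True if every selector pair of the service appears in the pod labels."""
    selector = service["spec"]["selector"]
    return all(pod_labels.get(key) == value for key, value in selector.items())


db_service = service_manifest("db", {"name": "db"}, port=27017, target_port=27017)
print(selects(db_service, {"name": "db", "tier": "backend"}))
print(selects(db_service, {"name": "web"}))
```

The first call matches because the pod carries the `name: db` label; the second does not, so a Web server pod would never receive the database's traffic.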
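As a toy model of the behavior Mark describes, the sketch below hands each incoming connection to the next pod in a list. It is not the Kubernetes proxy itself: the round-robin policy, the class name, and the addresses are all assumptions made for illustration, since the conversation does not say how the real proxy chooses a pod.

```python
import itertools


class ToyServiceProxy:
    """Pick a backend pod for each new connection, cycling through the list."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("a service needs at least one ready pod")
        # itertools.cycle repeats the sequence forever, giving round-robin order.
        self._cycle = itertools.cycle(backends)

    def pick_backend(self):
        """Return the address that should receive the next connection."""
        return next(self._cycle)


proxy = ToyServiceProxy(["10.0.0.1:27017", "10.0.0.2:27017"])  # hypothetical pod addresses
picks = [proxy.pick_backend() for _ in range(3)]
print(picks)
```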
Mark:  It does, but the more important thing right up front is merely letting processes in one container know how to communicate with the processes in another. If you've got a Web server that wants to have a database back end, if you just had the containers out there in the cloud, there would be no way for the Web server to find the database.
Using a service object, you can say it will create an IP address that your Web server pod can pick up, and it can send packets there. Port forwarders will make sure that the database actually receives them.
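The conversation does not say how the Web server pod picks up that address. One mechanism Kubernetes provides is a pair of environment variables named after the service, `<NAME>_SERVICE_HOST` and `<NAME>_SERVICE_PORT`, set in containers started after the service exists. The sketch below assumes that mechanism; the function name and the service name `db` are hypothetical.

```python
import os


def service_address(service_name, environ=None):
    """Look up a service's host and port from Kubernetes-style environment variables."""
    environ = os.environ if environ is None else environ
    # Service names are upper-cased and dashes become underscores.
    prefix = service_name.upper().replace("-", "_")
    host = environ[prefix + "_SERVICE_HOST"]
    port = int(environ[prefix + "_SERVICE_PORT"])
    return host, port


# Simulated environment of a Web server container; the values are made up.
fake_env = {"DB_SERVICE_HOST": "10.0.0.11", "DB_SERVICE_PORT": "27017"}
print(service_address("db", fake_env))
```

A missing variable raises `KeyError`, which makes it obvious at startup that the service was not defined before the pod was created.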
Gordon:  We’ve been mostly talking about the server side of things so far. There are a few other topics I'd like to touch on: identity, networking, and storage. First of all, identity.
Mark:  There are two aspects of identity with respect to Kubernetes. One is, for a given container, who am I? What's my IP address? What's my host name? What server name am I going to use? Those values are generally passed in the command line and will be passed in as part of the Kubernetes pod definition.
The other aspect, which is really unresolved, is what user am I when you're running inside the container? There is probably an Apache user defined inside the container. That's not necessarily going to be the same between containers or if there's shared storage, the storage may have a user ID on it that may or may not match.
There's no concrete resolution for that right now. There are people working on things like either using a service like IPA to create a universal user mechanism or using something like Kerberos. How those are going to be integrated with Docker and Kubernetes is unclear still.
Gordon:  Software-defined networking, how does that intersect with what we've been talking about?
Mark:  If you're running Kubernetes in a cloud service, the cloud services all have software‑defined networks. This is how they run. This is something that they do to make sure that all of their services are available. Kubernetes has a mechanism right now where if you're running in Google Cloud, you can say for a given service, "Please create me an external IP address."
That'll get published. You can request it from Kubernetes, and then you can tell your users what that address is and you can assign a host name.
If you're not running in a cloud environment where a software‑defined network is part of the infrastructure, right now, there's no good solution. Most networking groups are not amenable to just granting /24 blocks to development groups.
There are people who are working on it and people who are thinking about how best to do this. I know one thing that's being used now is a project from CoreOS called Flannel, which provides networking within the Kubernetes cluster. It could probably be used as well to provide an external interface. It's still fairly limited, the problem being that if you only have one or two external facing addresses, then you have competition for ports from all of the services inside the cluster.
It's still unclear how that will be resolved.
Gordon:  We've talked compute, we've talked security, we’ve talked networking. You can probably guess what's coming next. How does this intersect with software‑defined storage?
Mark:  Again, in Kubernetes, they're really focusing on the process control, still, and on the things that are happening at the host to container interface. Both Red Hat and Google are working on adding the ability for Kubernetes to manage the host storage so that you could say, put into your Kubernetes pod definition, "Oh, I need storage from some Ceph or from some Google Cloud storage area." Kubernetes will be able to make that happen.
Right now, that's not available. If you wanted to create a Kubernetes cluster that has a shared storage, you pretty much have to define and configure the storage on each of the minion hosts first and make it so that...Essentially, any process running on the host can reach the storage by a path, for instance, using an NFS automount.
Making it so that the same path gets you the same storage no matter which host you're on. It's doable. It's not impossible. It's not something that's going to scale up in the long term. There are people working on it.
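One way to express the workaround Mark describes is a pod definition that mounts a host directory into the container, relying on every host having the shared storage at the same path. The sketch below uses the `hostPath` volume type from the Kubernetes `v1` API, which may not match the schema available at the time; the paths, image name, and function name are hypothetical.

```python
def pod_with_host_storage(name, image, host_path, mount_path):
    """Build a pod definition that mounts a directory from the host into the container."""
    volume_name = name + "-data"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # The same host path must lead to the same storage on every host.
            "volumes": [{"name": volume_name, "hostPath": {"path": host_path}}],
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Where the host directory appears inside the container.
                    "volumeMounts": [{"name": volume_name, "mountPath": mount_path}],
                }
            ],
        },
    }


storage_pod = pod_with_host_storage(
    name="db",
    image="example/mongodb",       # hypothetical image name
    host_path="/mnt/shared/db",    # hypothetical path prepared on each host
    mount_path="/data/db",
)
print(storage_pod["spec"]["volumes"])
```

Nothing in the definition checks that the host path really is shared storage, which is part of why this approach does not scale well.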
Gordon:  Thank you, Mark. This has been very educational. We got a bunch more topics I'd like to get into, but I think we'll maybe leave those for our upcoming podcast. Thanks, everyone. Thanks, Mark.

Mark:  Thank you.
