In the latest episode in my containers series, I talk with my Red Hat colleague Mark Lamourine about how containers change the way we architect applications. No longer do we aggregate all of an application's services together. Rather, we decompose them and encapsulate individual services within loosely coupled containers. We also talk about extending the container model across multiple hosts and orchestrating those containers using Kubernetes.
Prior container podcasts with Mark:
Listen to MP3 (0:18:41)
Listen to OGG (0:18:41)
[Transcript]
Gordon Haff: Hi, everyone. Welcome to another edition of the Cloudy Chat Podcast. I'm Gordon Haff, and I'm here with my colleague in the cloud product strategy group, Mark Lamourine.
Welcome, Mark.
Mark Lamourine: Thank you.
Gordon: For today's topic, we're going to dive into a bit more technical detail. If you go to the show notes, you'll find pointers to some of our earlier discussions. Specifically, we're going to talk about adapting complex applications for Docker and Kubernetes, Kubernetes being the orchestration engine we're working on together with Google.
Many of these points are also going to be applicable to complex containerized applications in general. Let's start off. The first topic we'll talk about, Mark, is decomposition, this idea of microservices and the heart of what microservices are.
Mark: One of the things that we found when we did a hackathon several months ago, with a number of groups within Red Hat, each of them with an eye toward containerizing their applications, was that everybody's first impulse was to try and take all of the parts and put them in one container, thinking, this is how we do it when we have a single host: we put the database and the messaging and all of the things on one host.
It only took a few hours, or for some people well into the second day, to discover that they really had to take the different parts and put them into separate containers. It's not just an ideal, "Oh, gee, we'll have micro containers, won't that be wonderful? Or microservices." When you start working with it, you actually find that that's the easiest and the best way to manage your applications.
Gordon: We talked about this in a little more detail in the last podcast. As you say, everyone's first reaction is, "This is just a different form of virtualization." And arguably one of the reasons virtualization became so popular was that people could, to a first approximation, just treat virtual machines like physical servers.
Containers really are allied, if you will, with a different architectural approach.
Mark: They are. It's something that's very new for a lot of people; it's a pretty new way of thinking about how to work with your applications for pretty much everybody. It enforces something that designers and system administrators, especially system administrators like me, have been asking for for a long time, which is establishing clear boundaries between the various sub-services of your applications.
As system administrators, we wanted that because we didn't want to have any hidden interactions between the parts. When we went to diagnose one, we wanted to know what configuration files and what settings to look for. If the configuration files were used, for instance, by two different components, there would be bleed-over.
Some data might change in one, and it would appear in the other. That would cause confusion for us.
We've been asking for something like this for a long time. It becomes really important when you start working with containers because of the isolation the containers give you. It means you really have to think carefully about what information is shared and what information is really for only one component.
Gordon: Mark, the idea here with these microservices is that you can just pick them off the shelf and reuse them. In fact, that's one of the big arguments for having a microservices-type architecture. In practice, you still need to get configuration information specific to a given environment into those containers. How do you go about doing that?
Mark: Docker offers two major ways of getting things in like that. The first way is to provide environment variables on the command line. Understand, when I talk about getting this stuff in and say "provide an environment variable," what you're really doing on the Docker command line is saying, "Make this environment variable exist inside the container."
You're putting an argument on the command line so that when your container runs, the container will see an environment variable with a corresponding value. That's one way: passing in these environment variables. A second way using Docker is that you can actually put the arguments on the Docker command line, if the container is designed well.
Those arguments will show up as command line arguments to your process. For instance, if you're starting a MongoDB database, and if the container is designed to accept them, you could put the MongoDB arguments on the end of your Docker command line, and the MongoDB process will get those as if they were passed in as command line arguments to it.
A third way using Docker is to provide a file system volume for some existing file on the outside. That's not really recommended for something like Kubernetes, because it means you have to push the file to whatever your target is and then tell Kubernetes to pick it up.
Most containers should probably use either environment variables or command line arguments for passing values into a container, especially when you're using Kubernetes.
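To make those three approaches concrete, here is a rough sketch using the Docker CLI; the image name, variable names, and option values are hypothetical.

    # 1. Environment variables: -e makes the variable visible to processes
    #    inside this container, and only there.
    docker run -d -e MONGODB_USER=appuser -e MONGODB_PASSWORD=secret myreg/mongodb-image

    # 2. Command-line arguments: anything after the image name is handed to the
    #    container's entrypoint process, here as extra mongod flags.
    docker run -d myreg/mongodb-image --smallfiles --auth

    # 3. A volume mapping an existing file from the host; it works, but the file
    #    has to be pushed to whatever host the container ends up running on.
    docker run -d -v /srv/mongod.conf:/etc/mongod.conf:ro myreg/mongodb-image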
Gordon: That probably goes without saying, but I might say it anyway: these environment variables, for example, are going to be isolated within that user's container.
Mark: I'm sorry. Yes. When you set the environment variables using Docker, there's an argument to the Docker command line that causes those environment variables to appear to the processes inside the container but not to anything else.
Kubernetes also has a means of setting environment variables or command line arguments for the containers inside a Kubernetes pod.
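As a rough sketch of what that looks like, here is a minimal pod definition that sets an environment variable and passes command line arguments to a container. The names are hypothetical, and the syntax shown is the current v1 API rather than the earlier API available when this episode was recorded.

    # mongodb-pod.yaml (hypothetical); load with: kubectl create -f mongodb-pod.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: mongodb
      labels:
        app: mongodb
    spec:
      containers:
      - name: mongodb
        image: myreg/mongodb-image
        env:
        - name: MONGODB_USER               # visible only to processes in this container
          value: appuser
        args: ["--smallfiles", "--auth"]   # handed to the container's entrypoint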
Gordon: Let's talk about the process of getting started here. You want to have these containers, these microservices, eventually running as part of an orchestrated Kubernetes application. How do you get started with this process?
Mark: This is part of what we were talking about with decomposition: by breaking your application down into component services, you can make sure that they're working inside Kubernetes and inside Docker one piece at a time.
In the examples I've used so far, I typically started with the database. I'll build the database into a Docker container and I will build a Kubernetes pod around it, and make sure that the pod runs in Kubernetes. Then I'll go to the next piece, whether it's messaging or a web server or something like that.
As I design the next piece, I will make sure that it can talk to the container running inside Kubernetes even though the one I'm testing is running outside. When I'm satisfied with that, I'll build a Kubernetes pod around that piece and move it inside too, and gradually, iteratively build up a complete application running in Kubernetes.
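A rough outline of that iterative workflow might look like the following; the image names, manifest files, and the DB_SERVICE_IP placeholder are all hypothetical.

    # Build and sanity-check the first piece (the database) as a plain Docker container.
    docker build -t myreg/mongodb-image ./mongodb
    docker run -d --name mongodb-test myreg/mongodb-image

    # Wrap it in a Kubernetes pod (and usually a service) and confirm it runs in the cluster.
    kubectl create -f mongodb-pod.yaml
    kubectl get pods

    # Develop the next piece (say, the web tier) outside the cluster first,
    # pointing it at the database already running inside Kubernetes.
    docker run -d -e DB_HOST="$DB_SERVICE_IP" myreg/webapp-image

    # Once that works, give it a pod definition of its own and move it inside as well.
    kubectl create -f webapp-pod.yaml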
Gordon: You're coming at this with your sysadmin hat on. We're talking about this from an infrastructure perspective. It's probably worth noting that this is a very DevOps-y kind of approach, a world of small services, loosely coupled, incrementally added, incrementally updated.
Mark: It is a different idea from more conventional application design. It's been very conventional, especially in the VM world, to create a complex application by just dropping the software on, then tweaking the configuration files on your database and your Web server and writing some script that does all this stuff, but it all runs on one host.
What this fosters is the ability to blur the lines between the different parts. Sometimes, things bleed over. You can't do that with a container. You can't trust those bleed-over pieces to work. You really have to build up the pieces one at a time. I suppose I would call that DevOps, but as a system admin, I would've always called that good development practice.
Gordon: A real-world example here: Amazon, for example...I'm not talking Amazon Web Services now but really Amazon, the retail entity. Their practice in terms of how they put things together is that everything must have an API and only talk through that API. That's really a way of institutionalizing that type of approach to development.
Mark: As a system admin, this just makes my little heart go pit-a-pat. More times than I care to think about, I've been faced with a situation where someone said, "My application worked." It turned out that there was some file being changed by one service of the application that was used by another service. The change was blind.
It wasn't until we started looking at this whole thing as a system that we discovered it. Containers really enforce this, because by setting the boundaries, they're not only setting a boundary to the outside world as a security boundary, but they're also setting the boundaries between the different components of your application and enforcing those boundaries.
This is going to make for much more robust applications in the long run, because a lot of these side effects that slip through in traditional application design are going to become exposed. It's a lot more work up front. There are a lot of people now whose first thought is, we'll just do it the way we've always done it, as if it were running on a VMware host.
They very soon find that in a container environment this is very difficult, because it turns out you can't get inside easily once you put all of that stuff inside. Whereas if you build each of your components into a separate container, you can observe the interactions between them in a way that is much more clear.
Gordon: Mark, let's get back to talking specifically about Kubernetes, and specifically how you create Kubernetes service and pod definitions. Could you start out by defining what a pod means in the context of Kubernetes?
Mark: The pod is the basic computational unit for Kubernetes. It wraps containers; it wraps Docker containers. It actually recognizes something that some people had been wondering about for a while: a common use of Docker containers involves punching holes between containers that are sharing information.
You don't want the information to get out further, but you want the processes to be able to share information.
In Docker, you're actually punching holes out of one container into another. With Kubernetes, what they do instead is allow you to create this thing called the pod, which by definition contains multiple containers, multiple Docker containers. The pod is used to define the resources which are shared by all the containers inside.
By definition, they will share a network namespace. They may also share some external volume space. There are other components that a pod can allow them to share. The important thing is that the pod and the container are somewhat analogous, but the pod can be bigger than one container.
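A minimal sketch of a pod that wraps two containers sharing one network namespace, so they can talk to each other over localhost without punching holes between separate Docker containers; the image names are hypothetical and the syntax is the current v1 API.

    # webapp-pod.yaml (hypothetical); load with: kubectl create -f webapp-pod.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: webapp
      labels:
        app: webapp
    spec:
      containers:
      - name: frontend
        image: myreg/webapp-image
        ports:
        - containerPort: 8080
      - name: cache
        image: myreg/cache-image   # same network namespace, so frontend reaches it on localhost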
The other important component is a service. What a Kubernetes service does is make the processes running inside a container available over a network. A service defines a well-known IP address and port that you can attach to your database or your Web server, so that other containers and processes outside can reach them.
Gordon: It's a type of abstraction?
Mark: It is. It's actually implemented as a proxy and a load balancer. Each of the Kubernetes minions has this proxy process running on it. When you create the service object, all of those proxies will start listening on the ports you define and forward any packets they get to the pods which are ready to accept them.
Gordon: Lets you implement load balancing, HA, things like that?
Mark: It does, but the more important thing right up front is simply letting processes in one container know how to communicate with the processes in another. If you've got a Web server that wants to have a database back end, and you just had the containers out there in the cloud, there would be no way for the Web server to find the database.
Using a service object, Kubernetes will create an IP address that your Web server pod can pick up, and it can send packets there. The port forwarders will make sure that the database actually receives them.
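As a sketch, a service definition for the database might look like this; the selector label and port values are hypothetical, and the syntax is the current v1 API.

    # mongodb-service.yaml (hypothetical); load with: kubectl create -f mongodb-service.yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: mongodb
    spec:
      selector:
        app: mongodb        # forward traffic to any pod carrying this label
      ports:
      - port: 27017         # the well-known port the Web tier connects to
        targetPort: 27017   # the port the database container actually listens on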
Gordon: We've been mostly talking about the server side of things so far. There are a few other topics I'd like to touch on: identity, networking, and storage. First of all, identity.
Mark: There are two aspects of identity with respect to Kubernetes. One is, for a given container: who am I? What's my IP address? What's my host name? What server name am I going to use? Those values are generally passed in on the command line and will be passed in as part of the Kubernetes pod definition.
The other aspect, which is really unresolved, is what user am I when I'm running inside the container? There is probably an Apache user defined inside the container. That's not necessarily going to be the same between containers, or, if there's shared storage, the storage may have a user ID on it that may or may not match.
There's no concrete resolution for that right now. There are people working on things like either using a service like IPA to create a universal user mechanism or using something like Kerberos. How those are going to be integrated with Docker and Kubernetes is still unclear.
Gordon: Software-defined networking: how does that intersect with what we've been talking about?
Mark: If you're running Kubernetes in a cloud service, the cloud services all have software-defined networks. This is how they run. This is something that they do to make sure that all of their services are available. Kubernetes has a mechanism right now where, if you're running in Google Cloud, you can say for a given service, "Please create an external IP address for me."
That'll get published. You can request it from Kubernetes, and then you can tell your users what that address is and you can assign a host name.
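In today's Kubernetes that request is expressed as a service type, as in the sketch below; the exact field differed in the early releases being discussed here, and the names shown are hypothetical.

    # webapp-public-service.yaml (hypothetical)
    apiVersion: v1
    kind: Service
    metadata:
      name: webapp-public
    spec:
      type: LoadBalancer    # ask the cloud provider's SDN for an external address
      selector:
        app: webapp
      ports:
      - port: 80
        targetPort: 8080

Once the cloud provider has provisioned it, the external address shows up in kubectl get service webapp-public, and that is what you would hand to your users or put behind a host name.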
If you're not running in a cloud environment where a software-defined network is part of the infrastructure, right now there's no good solution. Most networking groups are not amenable to just granting /24 blocks to development groups.
There are people who are working on it and people who are thinking about how best to do this. I know one thing that's being used now is a project from CoreOS called Flannel, which provides networking within the Kubernetes cluster. It could probably be used as well to provide an external interface. It's still fairly limited, the problem being that if you only have one or two external-facing addresses, then you have competition for ports from all of the services inside the cluster.
It's unclear yet how that is perceived.
Gordon: We've talked compute, we've talked security, we've talked networking. You can probably guess what's coming next. How does this intersect with software-defined storage?
Mark: Again, in Kubernetes, they're really focusing on the process control still, and on the things that are happening at the host-to-container interface. Both Red Hat and Google are working on adding the ability for Kubernetes to manage the host storage, so that you could put into your Kubernetes pod definition, "Oh, I need storage from some Ceph or some Google Cloud storage area," and Kubernetes will be able to make that happen.
Right now, that's not available. If you wanted to create a Kubernetes cluster that has shared storage, you pretty much have to define and configure the storage on each of the minion hosts first and make it so that, essentially, any process running on the host can reach the storage by a path, for instance using a manifest auto mount, so that the same path gets you the same storage no matter which host you're on. It's doable. It's not impossible. It's not something that's going to scale up in the long term. There are people working on it.
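One way to express that "same path on every host" arrangement in a pod definition is a host-path volume, assuming the shared storage has already been mounted at the same location on every minion; the paths and image name below are hypothetical.

    # mongodb-shared-pod.yaml (hypothetical)
    apiVersion: v1
    kind: Pod
    metadata:
      name: mongodb-shared
    spec:
      containers:
      - name: mongodb
        image: myreg/mongodb-image
        volumeMounts:
        - name: shared-data
          mountPath: /var/lib/mongodb
      volumes:
      - name: shared-data
        hostPath:
          path: /mnt/shared/mongodb   # must already be mounted on every minion host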
Gordon: Thank you, Mark. This has been very educational. We've got a bunch more topics I'd like to get into, but I think we'll leave those for an upcoming podcast. Thanks, everyone. Thanks, Mark.
Mark: Thank you.