The Need for Orchestration in Container-based Deployments
In real-world modern deployments that use containers (especially cloud-based ones), it’s common to need some degree of failover, load balancing, and, in general, service clustering. This is where container orchestrators enter the scene. Deploying on medium to large scale platforms implies resource scheduling, which is only practical when some kind of orchestration engine is in use.
In this brief article, we’ll take a look at the most widely used container orchestration solutions, how they compare to each other, and the environment each option suits best.
Knowing Our Actors: Docker Swarm, Google Kubernetes, and Marathon/Apache-Mesos
In order to better understand our actors, let’s briefly list them and their core concepts first.
Swarm is Docker’s in-house orchestration solution. Because it was designed by Docker and is already included as a core Docker feature, we can argue that Swarm is the most compatible and natural way to orchestrate Docker container infrastructures.
Like many orchestration solutions (OpenStack, for example), Swarm includes a control layer called “managers” and a worker layer called “workers”.
All services on both layers run inside containers. A very important part of the whole orchestration solution is the discovery layer, which is also container based and runs a key-value store like “etcd” or “consul”. This discovery layer is the registry part of the solution (you can compare it to OpenStack Keystone or AWS IAM). It knows where container-based services are running. Basically, this is the “yellow pages” of your orchestration infrastructure, providing both state and discovery for your cluster.
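The “yellow pages” idea can be illustrated with a minimal sketch. The in-memory `Registry` class below is a hypothetical stand-in for a real key-value store such as etcd or consul, and the service names and addresses are made up; it only shows the register/lookup pattern, not any real client API.

```python
class Registry:
    """Toy service registry: maps a service name to the addresses serving it."""

    def __init__(self):
        self._entries = {}

    def register(self, service, address):
        # A container announces where it can be reached.
        self._entries.setdefault(service, set()).add(address)

    def deregister(self, service, address):
        # A stopped or failed container is removed from the registry.
        self._entries.get(service, set()).discard(address)

    def lookup(self, service):
        # Clients ask the registry instead of hard-coding addresses.
        return sorted(self._entries.get(service, set()))


registry = Registry()
registry.register("web", "10.0.0.11:8080")
registry.register("web", "10.0.0.12:8080")
registry.deregister("web", "10.0.0.11:8080")
print(registry.lookup("web"))
```

A real discovery layer adds replication, health-based expiry, and change notifications on top of this basic pattern.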
Our final deployed services run on the worker nodes (which you can compare to OpenStack compute nodes or the AWS EC2 service).
Finally, you have services and tasks. Services are sets of Docker containers running across your Swarm infrastructure. Those services are composed of tasks, which are (mostly) your individual Docker containers but can also include specific commands running alongside the containers.
To recap the Swarm components:
Managers: Control layer. This layer should be redundant in your architecture. Normally, each manager is deployed on an independent node.
Discovery: Your state and service discovery layer. You can set up your discovery services on the same manager nodes or on an independent set of nodes. Always make them redundant.
Worker: Your beast of burden. All your end-services will run along your worker nodes. This layer should have as many workers as you need. Think about horizontal growth here.
Services: Your final deployed service composed of tasks.
Tasks: The Docker containers and commands inside a service.
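The relationship between services, tasks, and workers can be sketched in a few lines. This is only an illustration: the `Worker` and `Service` classes and the least-loaded placement rule are assumptions made for the example, not Swarm’s actual scheduler or API.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    name: str
    tasks: list = field(default_factory=list)


@dataclass
class Service:
    name: str
    image: str
    replicas: int


def place_service(service, workers):
    """Create one task per replica and put it on the least-loaded worker."""
    placement = {}
    for index in range(1, service.replicas + 1):
        task = f"{service.name}.{index}"
        # Ties are broken by list order, so placement is deterministic.
        target = min(workers, key=lambda w: len(w.tasks))
        target.tasks.append(task)
        placement[task] = target.name
    return placement


workers = [Worker("worker-1"), Worker("worker-2"), Worker("worker-3")]
web = Service(name="web", image="example/web:latest", replicas=5)
placement = place_service(web, workers)
for task, node in placement.items():
    print(task, "->", node)
```

Adding a fourth worker to the list is all it takes for new tasks to land there, which is the horizontal growth mentioned for the worker layer.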
Note: You can start with Swarm using an “all-in-one” setup, running everything on a single server at first.
Google offers its own container orchestration solution too: Kubernetes. It is more complex than Swarm, but it also offers more features. The first component of a Kubernetes setup is the master. The master is our control layer, which runs an “API” service that controls the whole orchestrator. While a single master can control our complete setup, multiple masters are the norm in production environments. The master runs many services inside containers, each with a very specific function. Note that, because the masters expose their API through REST, a load balancer is required in front of the masters in order to have true high availability and multi-server load balancing.
Again, a discovery layer is needed here, with “etcd” being one of the big players. Our etcd layer should ideally run on its own nodes, with at least three servers in the etcd cluster in order to provide redundancy (etcd needs a majority of its members to be up, so a two-server cluster cannot survive the loss of either one). You can run etcd either inside containers or directly on the host operating system. Linux distributions like CoreOS already include etcd in their core packages. Like in Swarm, etcd keeps a registry of what is running, and where, across the entire Kubernetes infrastructure.
The next component is the “node”. The nodes (previously known as “minions”) run all our end services, in the form of “pods” (another Kubernetes concept). You can compare the nodes to OpenStack compute nodes.
Then, you have the pods, which are the “basic construct” of your deployed services. A pod is a set of containers (or a single one) that runs encapsulated with its own storage resources and IP address. As an example, think about a LAMP service, or in this case a LAMP pod: inside the pod, one container runs Apache and PHP, and another one runs the MySQL/MariaDB database.
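As a rough sketch of that LAMP pod, the snippet below builds a two-container pod manifest as a Python dictionary and prints it as JSON (Kubernetes accepts manifests in JSON as well as YAML). The pod name, label values, and image names are placeholders invented for the example.

```python
import json


def lamp_pod_manifest():
    """Return a pod manifest with a web container and a database container."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": "lamp",  # hypothetical pod name
            "labels": {"app": "lamp", "env": "development"},
        },
        "spec": {
            "containers": [
                {
                    "name": "web",
                    "image": "example/apache-php:latest",  # placeholder image
                    "ports": [{"containerPort": 80}],
                },
                {
                    "name": "db",
                    "image": "example/mariadb:latest",  # placeholder image
                    "ports": [{"containerPort": 3306}],
                },
            ]
        },
    }


manifest = lamp_pod_manifest()
print(json.dumps(manifest, indent=2))
```

Both containers share the pod’s IP address, so in a pod like this the web container would reach the database on `localhost:3306`.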
Labels: Another very important concept in Kubernetes. Labels are key-value pairs (much like AWS tags) that you can attach to any object in your Kubernetes-based infrastructure. You can (and should) set labels on your pods in order to organize them by, for example, environment (production, development) or application layer (frontend, backend), and for other uses similar to what you normally do with tags in AWS. Labels are also used by the replication controllers (replica sets) described below.
Selectors: Selectors are used to find a group of objects sharing specific labels. Selectors and labels are used by replication controllers.
Replication controllers (replica sets): This is the base of the high-redundancy features offered by Kubernetes. Replication controllers (and their newer form, replica sets) ensure that a specific number of pod instances (or pod replicas) are running at a given time. If a node containing some of your pods goes down, replication controllers will ensure that new pods are started, or more precisely scheduled, on any of the surviving nodes. Replication controllers (and replica sets) use selectors and labels to define which specific pods to manage.
Services: Services are the frontend (and load-balancing layer) for our pods, especially the ones running replicated. Normally, you need to access your pods through a single endpoint, especially when you have multiple pods running the same application. A good example here is a microservice exposing a REST API through the service component in Kubernetes; behind this service, you can have many replicated pods running in parallel across different hosts in your Kubernetes infrastructure.
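The way labels, selectors, and replication controllers fit together can be sketched as follows. This is a simplified model, not the Kubernetes implementation: the `matches` and `reconcile` functions, the pod dictionaries, and the round-robin placement of replacements are all assumptions made for illustration, and the sketch only handles missing replicas, not surplus ones.

```python
def matches(selector, labels):
    """A pod matches when it carries every key/value pair in the selector."""
    return all(labels.get(key) == value for key, value in selector.items())


def reconcile(pods, selector, desired, live_nodes):
    """Drop pods on failed nodes, then schedule replacements for managed pods."""
    survivors = [pod for pod in pods if pod["node"] in live_nodes]
    managed = [pod for pod in survivors if matches(selector, pod["labels"])]
    missing = max(desired - len(managed), 0)
    nodes = sorted(live_nodes)
    if not nodes:
        return survivors  # nowhere to schedule replacements
    for i in range(missing):
        survivors.append({
            "name": f"replacement-{i}",
            "labels": dict(selector),
            "node": nodes[i % len(nodes)],
        })
    return survivors


pods = [
    {"name": "api-1", "labels": {"app": "api", "env": "production"}, "node": "node-a"},
    {"name": "api-2", "labels": {"app": "api", "env": "production"}, "node": "node-b"},
    {"name": "db-1", "labels": {"app": "db", "env": "production"}, "node": "node-b"},
]
selector = {"app": "api", "env": "production"}

# node-b has failed; node-c has joined the cluster.
after = reconcile(pods, selector, desired=2, live_nodes={"node-a", "node-c"})
for pod in after:
    print(pod["name"], pod["node"])
```

Note that `db-1` is not replaced here: it does not match the selector, so this particular controller never considered it its responsibility.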
To recap the Kubernetes components:
Master: The base of your orchestrator. The master (or masters) runs and exposes the Kubernetes API. All management functions go through this API.
Discovery Layer: Of course, your etcd-based key/value store where all your components will register. Your etcd services can run on the same hosts where your Kubernetes master runs.
Nodes/Minions: The workload runs here. All your final services and pods will run inside your many nodes.
Labels and Selectors: The way you organize your objects, and the way replication controllers and replica sets know which specific set of pods to manage.
Replication controllers / replica sets: The base of your high redundancy and of pod rescheduling when failures happen.
Services: Your service endpoint and load-balancing layer for your deployed pods.
Note: Kubernetes was designed for medium to big deployments. At the very least, you’ll need a server for the Master, and another one for the Node/Minion. If you want a highly available setup, you’ll need at least twice the servers (two masters and two nodes).
Apache Mesos is considerably older than Docker. Mesos is really a platform that manages computer clusters, using Linux cgroups to provide isolated CPU, I/O, file-system, and memory resources. Mesos is described as a distributed systems kernel or, in simpler terms, a cluster platform that provides computing resources to frameworks.
Marathon is a framework that uses Mesos to orchestrate Docker containers. Marathon can really do more than containers (a lot more), but for now, we’ll concentrate on the container part. Here are the basic concepts of the Marathon/Mesos combo:
Masters: Our control layer, which includes ZooKeeper for storing state. In a clustered, highly available environment, a minimum of three servers is the norm for the masters’ service layer. Marathon (our container orchestrator) runs on top of Mesos/ZooKeeper. High availability is quorum based across the masters: a “leader” is elected, and if it fails, another master takes over so that the control layer stays available.
Slaves: Your workload goes here. The slaves run all your tasks, passed and controlled by the framework layer (Marathon).
Service discovery: Marathon uses the Mesos-DNS service for all service-discovery related functions. Through the smart use of DNS SRV records, Mesos (and Marathon) registers all tasks and application instances in the DNS record database.
Load balancing: Marathon load balancer (Marathon-lb) provides port services through HAProxy. Marathon-lb is Docker-based and can also provide the service discovery layer if we don’t want to use Mesos-DNS.
Constraints: One of the strongest features of Marathon. Marathon is host and rack aware. Application launching can be limited per node, per rack, or by other useful constraints. Example: In a complex mail platform, we can ensure that no more than two smtp-inbound applications are launched in the same rack.
Metrics: Marathon includes metrics that can be easily integrated with solutions like Graphite and StatsD.
Applications: Our final deployed services. Pods (like in Kubernetes) are also supported since Marathon release 1.4, but they are not a core part of Marathon.
Complete REST API, with health checks and event subscription: The Mesos/Marathon REST API exposes all cluster management functions. It also includes the ability to run health checks against specific applications and to set up event subscriptions defined by the administrator. This is very useful for integration with external applications (like an external load balancer or external monitoring platforms).
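To make the constraints and REST API items more concrete, here is a hedged sketch of how the smtp-inbound example might be expressed as a Marathon application definition and prepared for submission to the `/v2/apps` endpoint. The Marathon URL, application id, image name, resource values, and the `rack_id` agent attribute are all assumptions for illustration; the request is built but deliberately not sent.

```python
import json
import urllib.request


def smtp_inbound_app(instances, max_per_rack):
    """Build an app definition limited to max_per_rack instances per rack."""
    return {
        "id": "/mail/smtp-inbound",  # hypothetical application id
        "instances": instances,
        "cpus": 0.5,
        "mem": 256,
        "container": {
            "type": "DOCKER",
            "docker": {"image": "example/smtp-inbound:latest"},  # placeholder
        },
        # Assumes every agent was started with a "rack_id" attribute.
        "constraints": [["rack_id", "MAX_PER", str(max_per_rack)]],
    }


def build_request(marathon_url, app):
    """Prepare (but do not send) the POST that would create the application."""
    body = json.dumps(app).encode("utf-8")
    return urllib.request.Request(
        url=f"{marathon_url}/v2/apps",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


app = smtp_inbound_app(instances=6, max_per_rack=2)
request = build_request("http://marathon.example.internal:8080", app)
print(request.get_method(), request.full_url)
print(json.dumps(app, indent=2))
```

Passing `request` to `urllib.request.urlopen` would submit it to a reachable Marathon instance; that step is left out so the sketch runs without a cluster.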
To recap the Mesos/Marathon components:
Master: Our control layer, highly available with a minimum of 3 servers.
Slaves: Our workload goes here. All services (and pods) are deployed on the slaves.
Service discovery: Using Mesos-DNS or Marathon-lb.
Load balancing: Marathon-lb, an HAProxy-based load balancer.
Constraints: A way to fine-control the way applications are deployed.
Metrics: Monitoring information available through the REST API to third party components.
Applications: Our deployed services, pods included (with some limitations).
REST API: All functions are available using Mesos/Marathon REST calls.
Note: Mesos was designed for very big platforms and to be extremely reliable. You can scale Mesos to thousands of nodes.
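The three-master minimum mentioned earlier follows from simple quorum arithmetic: a majority-based cluster stays available only while more than half of its members are up. The helper names below are made up for this sketch.

```python
def quorum(members):
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


for size in (1, 2, 3, 5):
    print(size, "masters: quorum", quorum(size),
          "tolerates", tolerated_failures(size), "failure(s)")
```

Two masters tolerate no failures at all, which is why three is the smallest cluster size that actually adds redundancy, and why odd sizes are the usual choice.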
Ok, We Know the Actors and Their Specifics. Now, How Can We Compare Them?
In order to compare Swarm, Kubernetes, and Mesos/Marathon, let’s use the following comparison, with the points we want to compare for each orchestration solution:
Installation and Setup
Docker Swarm: Very easy to install and set up. All components are mostly Docker-based and can be integrated with “systemd”.
Kubernetes: Somewhat complex to set up. Extensive use of YAML files to define all services in the cluster. The YAML configuration is unique to Kubernetes.
Mesos: Generally easy to install and set up with small clusters, but considerably more complex with larger setups. Repositories are available for some Linux distributions.
Application Definition
Docker Swarm: Completely Docker based and very easy to set up. Completely native to Docker.
Kubernetes: YAML based for all components in a deployed application (pods, services, and replication controllers).
Mesos: JSON based. All application definitions go inside a JSON file, which is passed to the Mesos/Marathon REST API.
Minimum Size (Cluster)
Docker Swarm: One server running everything (for test purposes). In practical production environments, both the discovery and manager services need to be in highly available setups with at least two servers on each layer. Multiple workers are also needed for service distribution and replication.
Kubernetes: One server for the master (and discovery), and one server for the node/minion. In production setups, discovery services and master services should be clustered with at least 3 servers on each layer, and as many minions as your infrastructure requires.
Mesos: One master and one slave. In practical production environments, at least 3 masters and as many slaves as needed.
Scalability
Docker Swarm: This is an evolving point in Swarm. Consider Swarm for small to medium scale setups.
Kubernetes: Medium to large clusters. Very well suited for complex applications with many containers inside pods.
Mesos: Large to very large scale clusters. Best choice if you want to combine containers and normal applications in the same cluster.
Maturity
Docker Swarm: Mature, but still evolving.
Kubernetes: Very mature. Direct descendant of Google’s internal Borg platform.
Mesos: Very mature, especially for very big clusters counting in the thousands of servers.
Best Features
Docker Swarm: Easy to use, and more native to Docker.
Kubernetes: Best pod scheduling features when complex applications need to be deployed.
Mesos: Scales to thousands of nodes, with rack- and host-based constraints available to fine-tune where applications are deployed.
Then, is There a Winner?
No. There is no winner here. Each of the container orchestration solutions we compared has its own set of benefits. If you are looking for something easy to use and more native to Docker, and you don’t need to go big scale, then use Swarm. If you need to do more complex things using pod-like setups, and don’t need to scale up to thousands of servers, Kubernetes is for you. Finally, if you are a very big player and need to scale into the thousands, and/or need to set up scheduling constraints based on hosts, racks, or any other specific variable, then Mesos/Marathon is the right option for you.
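The rule of thumb above can be written down as a small decision helper. This is only a sketch of the article’s own guidance; the function name and its three yes/no inputs are invented for the example, and real decisions will involve more factors.

```python
def recommend(needs_pods, thousands_of_nodes, needs_placement_constraints):
    """Map the three questions from the paragraph above to an orchestrator."""
    if thousands_of_nodes or needs_placement_constraints:
        return "Mesos/Marathon"
    if needs_pods:
        return "Kubernetes"
    return "Docker Swarm"


print(recommend(needs_pods=False, thousands_of_nodes=False,
                needs_placement_constraints=False))
print(recommend(needs_pods=True, thousands_of_nodes=False,
                needs_placement_constraints=False))
print(recommend(needs_pods=True, thousands_of_nodes=True,
                needs_placement_constraints=False))
```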
Thanks for Clarifying All the Ideas Behind Container Orchestration, but What About the Clouds?
Modern clouds also offer orchestration options. Let’s mention some of them here:
OpenStack now comes with its own component for Docker orchestration: Magnum. Magnum is not really an orchestrator. It is a component that creates Docker orchestration clusters based on three models: Swarm, Kubernetes, and, of course, Mesos. OpenStack Magnum just automates all the tasks behind cluster deployment and configuration.
Amazon Web Services
Amazon, the cloud giant, offers its very own Docker orchestration solution, which they call ECS (EC2 Container Service). It is not based on any of the aforementioned solutions (swarm, Kubernetes, or Mesos) and uses its very own API, highly integrated with other AWS services.
Microsoft opted for an approach similar to OpenStack’s. Its container service offers clusters with Swarm, Kubernetes and Mesos. They call their offering Azure Container Service.
Google Cloud Engine
The guys from GCE also offer a container solution, which they call Google Container Engine or “GKE” (yes, a “K” instead of a “C”). GKE is based, with no surprise at all, on Kubernetes. Being the inventors of Kubernetes, it makes sense for them to use their own technology for their container offering.
And our final conclusion is…
Understanding all three options is a must if your business depends on containers that need to live in a clustered environment. If you are considering a cloud-based deployment (with the possible exception of AWS ECS, which uses its own orchestrator), you will benefit from choosing the right option for your deployment size and required features. We are not declaring a winner here. Instead, we are pinpointing when and where to use the container orchestrators described and compared in this article. Set your priorities right and you’ll know which of these options to use in your environment.