Kubernetes runtimes
Kubernetes originally supported only Docker as a container runtime engine, but that is no longer the case. Kubernetes now supports several different runtimes:
- Docker (via a CRI shim)
- rkt (direct integration to be replaced with Rktlet)
- CRI-O
- Frakti (Kubernetes on the Hypervisor, previously Hypernetes)
- rktlet (CRI implementation for rkt)
- CRI-containerd
The guiding design principle is that Kubernetes itself should be completely decoupled from specific runtimes. The Container Runtime Interface (CRI) makes this decoupling possible.
In this section, you'll get a closer look at the CRI and get to know the individual runtime engines. At the end of this section, you'll be able to make a well-informed decision about which runtime engine is appropriate for your use case and under what circumstances you may switch or even combine multiple runtimes in the same system.
The container runtime interface (CRI)
The CRI consists of a gRPC API, specifications and requirements, and libraries that container runtimes use to integrate with the kubelet on a node. In Kubernetes 1.7, the internal Docker integration in Kubernetes was replaced with a CRI-based integration. This is a big deal. It opened the door to multiple implementations that can take advantage of advances in the container world. The kubelet doesn't need to interface directly with multiple runtimes. Instead, it can talk to any CRI-compliant container runtime. The following diagram illustrates the flow:
Figure 1.2: The container runtime interface (CRI) flow diagram
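To make the decoupling concrete, here is a minimal Python sketch of the idea. It is not the real CRI, which is a gRPC API; the ContainerRuntime, ShimRuntime, NativeRuntime, and Kubelet classes are hypothetical names invented for illustration. The point is only that the node agent depends on an interface and never on a concrete runtime:

```python
from abc import ABC, abstractmethod


class ContainerRuntime(ABC):
    """Hypothetical, much-reduced stand-in for a CRI endpoint."""

    @abstractmethod
    def version(self) -> str:
        """Return a runtime name and version string."""

    @abstractmethod
    def run_pod_sandbox(self, pod_name: str) -> str:
        """Create a pod sandbox and return its ID."""


class ShimRuntime(ContainerRuntime):
    """Pretend runtime reached through a shim (illustrative only)."""

    def version(self) -> str:
        return "shim-runtime 0.1"

    def run_pod_sandbox(self, pod_name: str) -> str:
        return f"shim-{pod_name}"


class NativeRuntime(ContainerRuntime):
    """Pretend runtime that implements the interface directly."""

    def version(self) -> str:
        return "native-runtime 0.1"

    def run_pod_sandbox(self, pod_name: str) -> str:
        return f"native-{pod_name}"


class Kubelet:
    """Toy node agent: it knows the interface, never a concrete runtime."""

    def __init__(self, runtime: ContainerRuntime) -> None:
        self.runtime = runtime

    def start_pod(self, pod_name: str) -> str:
        sandbox_id = self.runtime.run_pod_sandbox(pod_name)
        return f"{pod_name} -> {sandbox_id} ({self.runtime.version()})"


# Swapping the runtime requires no change to the Kubelet class.
for runtime in (ShimRuntime(), NativeRuntime()):
    print(Kubelet(runtime).start_pod("web"))
```

The toy Kubelet never changes when a new runtime is added, which is the property the CRI gives the real kubelet.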
There are two gRPC service interfaces, ImageService and RuntimeService, that CRI container runtimes (or shims) must implement. ImageService is responsible for managing images. Here is the gRPC/protobuf interface (this is Google's Protobuf specification language and not Go):
service ImageService {
    rpc ListImages(ListImagesRequest) returns (ListImagesResponse) {}
    rpc ImageStatus(ImageStatusRequest) returns (ImageStatusResponse) {}
    rpc PullImage(PullImageRequest) returns (PullImageResponse) {}
    rpc RemoveImage(RemoveImageRequest) returns (RemoveImageResponse) {}
    rpc ImageFsInfo(ImageFsInfoRequest) returns (ImageFsInfoResponse) {}
}
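To get a feel for what an implementation of these five calls has to track, consider the following in-memory sketch in Python. It is a toy, not a real runtime: the InMemoryImageService and Image names, the size_bytes field, and the simplified return values are all assumptions made for illustration, and nothing is actually downloaded:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Image:
    ref: str         # image reference, for example "nginx:1.25" (illustrative)
    size_bytes: int  # pretend on-disk size


@dataclass
class InMemoryImageService:
    """Toy image store with one method per ImageService RPC."""

    images: Dict[str, Image] = field(default_factory=dict)

    def pull_image(self, ref: str, size_bytes: int = 1024) -> str:
        # A real runtime would download and verify the image here.
        self.images[ref] = Image(ref, size_bytes)
        return ref

    def list_images(self) -> List[Image]:
        return sorted(self.images.values(), key=lambda image: image.ref)

    def image_status(self, ref: str) -> Optional[Image]:
        # In this sketch, None means "image not present".
        return self.images.get(ref)

    def remove_image(self, ref: str) -> None:
        # In this sketch, removing a missing image is a no-op.
        self.images.pop(ref, None)

    def image_fs_info(self) -> int:
        # Total bytes used by all stored images.
        return sum(image.size_bytes for image in self.images.values())


svc = InMemoryImageService()
svc.pull_image("nginx:1.25", size_bytes=2048)
svc.pull_image("redis:7", size_bytes=1024)
print([image.ref for image in svc.list_images()])
print(svc.image_fs_info())
svc.remove_image("redis:7")
print(svc.image_status("redis:7"))
```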
RuntimeService is responsible for managing pods and containers. Here is the gRPC/protobuf interface:
service RuntimeService {
    rpc Version(VersionRequest) returns (VersionResponse) {}
    rpc RunPodSandbox(RunPodSandboxRequest) returns (RunPodSandboxResponse) {}
    rpc StopPodSandbox(StopPodSandboxRequest) returns (StopPodSandboxResponse) {}
    rpc RemovePodSandbox(RemovePodSandboxRequest) returns (RemovePodSandboxResponse) {}
    rpc PodSandboxStatus(PodSandboxStatusRequest) returns (PodSandboxStatusResponse) {}
    rpc ListPodSandbox(ListPodSandboxRequest) returns (ListPodSandboxResponse) {}
    rpc CreateContainer(CreateContainerRequest) returns (CreateContainerResponse) {}
    rpc StartContainer(StartContainerRequest) returns (StartContainerResponse) {}
    rpc StopContainer(StopContainerRequest) returns (StopContainerResponse) {}
    rpc RemoveContainer(RemoveContainerRequest) returns (RemoveContainerResponse) {}
    rpc ListContainers(ListContainersRequest) returns (ListContainersResponse) {}
    rpc ContainerStatus(ContainerStatusRequest) returns (ContainerStatusResponse) {}
    rpc UpdateContainerResources(UpdateContainerResourcesRequest) returns (UpdateContainerResourcesResponse) {}
    rpc ExecSync(ExecSyncRequest) returns (ExecSyncResponse) {}
    rpc Exec(ExecRequest) returns (ExecResponse) {}
    rpc Attach(AttachRequest) returns (AttachResponse) {}
    rpc PortForward(PortForwardRequest) returns (PortForwardResponse) {}
    rpc ContainerStats(ContainerStatsRequest) returns (ContainerStatsResponse) {}
    rpc ListContainerStats(ListContainerStatsRequest) returns (ListContainerStatsResponse) {}
    rpc UpdateRuntimeConfig(UpdateRuntimeConfigRequest) returns (UpdateRuntimeConfigResponse) {}
    rpc Status(StatusRequest) returns (StatusResponse) {}
}
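The sandbox and container calls imply a lifecycle: a pod sandbox is created first, containers are created and started inside it, and teardown runs in reverse. The following Python sketch models that ordering with a small in-memory state machine. It covers only a subset of the RPCs, and the ToyRuntimeService class, the ID format, and the short state labels (READY, CREATED, RUNNING, and so on) are simplifications assumed for illustration rather than the real CRI types:

```python
from itertools import count


class ToyRuntimeService:
    """In-memory sketch of the sandbox and container lifecycle calls."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.sandboxes = {}   # sandbox_id -> "READY" or "NOTREADY"
        self.containers = {}  # container_id -> dict with sandbox, image, state

    def run_pod_sandbox(self, pod_name: str) -> str:
        sandbox_id = f"sb-{next(self._ids)}"
        self.sandboxes[sandbox_id] = "READY"
        return sandbox_id

    def create_container(self, sandbox_id: str, image: str) -> str:
        # Containers can only be created inside a ready sandbox.
        if self.sandboxes.get(sandbox_id) != "READY":
            raise RuntimeError(f"sandbox {sandbox_id} is not ready")
        container_id = f"ctr-{next(self._ids)}"
        self.containers[container_id] = {
            "sandbox": sandbox_id,
            "image": image,
            "state": "CREATED",
        }
        return container_id

    def start_container(self, container_id: str) -> None:
        container = self.containers[container_id]
        if container["state"] != "CREATED":
            raise RuntimeError(f"cannot start from state {container['state']}")
        container["state"] = "RUNNING"

    def stop_container(self, container_id: str) -> None:
        self.containers[container_id]["state"] = "EXITED"

    def remove_container(self, container_id: str) -> None:
        self.containers.pop(container_id, None)

    def container_status(self, container_id: str) -> str:
        return self.containers[container_id]["state"]

    def stop_pod_sandbox(self, sandbox_id: str) -> None:
        # Stopping a sandbox also stops the containers running inside it.
        for container in self.containers.values():
            if container["sandbox"] == sandbox_id and container["state"] == "RUNNING":
                container["state"] = "EXITED"
        self.sandboxes[sandbox_id] = "NOTREADY"

    def remove_pod_sandbox(self, sandbox_id: str) -> None:
        # Removing a sandbox also removes its containers.
        self.containers = {
            cid: c for cid, c in self.containers.items() if c["sandbox"] != sandbox_id
        }
        self.sandboxes.pop(sandbox_id, None)


rt = ToyRuntimeService()
sandbox = rt.run_pod_sandbox("web")
container = rt.create_container(sandbox, "nginx")
rt.start_container(container)
print(rt.container_status(container))
rt.stop_pod_sandbox(sandbox)
print(rt.container_status(container))
rt.remove_pod_sandbox(sandbox)
print(rt.containers)
```

Calling create_container on a stopped or unknown sandbox raises an error in this sketch, which mirrors the ordering dependency between the two groups of calls.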
The data types used as arguments and return types are called messages and are also defined as part of the API. Here is one of them:
message CreateContainerRequest {
    string pod_sandbox_id = 1;
    ContainerConfig config = 2;
    PodSandboxConfig sandbox_config = 3;
}
As you can see, messages can be embedded inside each other. The CreateContainerRequest message has one string field and two other fields, which are themselves messages: ContainerConfig and PodSandboxConfig.
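If you think in terms of ordinary data structures, the nesting looks like this. The Python dataclasses below mirror the three field names of CreateContainerRequest, but the fields inside ContainerConfig and PodSandboxConfig are reduced to a couple of illustrative placeholders; the real messages carry many more fields and more deeply nested types:

```python
from dataclasses import asdict, dataclass


@dataclass
class ContainerConfig:
    # Placeholder fields for illustration; not the full message definition.
    name: str
    image: str


@dataclass
class PodSandboxConfig:
    # Placeholder fields for illustration; not the full message definition.
    name: str
    namespace: str


@dataclass
class CreateContainerRequest:
    pod_sandbox_id: str               # plain string field
    config: ContainerConfig           # embedded message
    sandbox_config: PodSandboxConfig  # embedded message


request = CreateContainerRequest(
    pod_sandbox_id="sb-1",
    config=ContainerConfig(name="web", image="nginx"),
    sandbox_config=PodSandboxConfig(name="web-pod", namespace="default"),
)

# asdict() expands the embedded messages into nested dictionaries.
print(asdict(request))
```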
Now that you are familiar at the code level with what Kubernetes considers a runtime engine, let's look at the individual runtime engines briefly.
Docker
Docker is, of course, the 800-pound gorilla of containers. Kubernetes was originally designed to manage only Docker containers. Multi-runtime capability was first introduced in Kubernetes 1.3, and the CRI followed in Kubernetes 1.5.
I assume you're very familiar with Docker and what it brings to the table if you are reading this book. Docker enjoys tremendous popularity and growth, but there is also a lot of criticism of it. Critics often mention the following concerns:
- Security
- Difficulty setting up multi-container applications (in particular, networking)
- Development, monitoring, and logging
- The limitations of Docker containers running one command
- Releasing half-baked features too fast
Docker is aware of the criticisms and has addressed some of these concerns. In particular, Docker invested in its Docker Swarm product. Docker Swarm is a Docker-native orchestration solution that competes with Kubernetes. It is simpler to use than Kubernetes, but it's not as powerful or mature.
Starting with Docker 1.12, swarm mode is included in the Docker daemon natively, which upset some people due to bloat and scope creep. As a result, more people turned to CoreOS rkt as an alternative solution.
Starting with Docker 1.11, released in April 2016, Docker changed the way it runs containers. The runtime now uses containerd and runC to run Open Container Initiative (OCI) images in containers:
Figure 1.3: Architecture of Docker 1.11 after building it on runC and containerd
rkt
rkt is a container manager from CoreOS (the developers of the CoreOS Linux distro, etcd, flannel, and more). It is no longer developed, as CoreOS was acquired by Red Hat, which was later acquired by IBM. However, the legacy of rkt is the proliferation of container runtimes beyond Docker and the push it gave Docker toward the standardized OCI effort.
The rkt runtime prides itself on its simplicity and a strong emphasis on security and isolation. It doesn't have a daemon like Docker Engine and relies on the OS init system, such as systemd, to launch the rkt executable. rkt can download images (both App Container (appc) images and OCI images), verify them, and run them in containers. Its architecture is much simpler than Docker's.
App container
CoreOS started a standardization effort in December 2014 called appc. This includes a standard image format (ACI, Application Container Image), runtime, signing, and discovery. A few months later, Docker started its own standardization effort with OCI. At this point, it seems these efforts will converge. This is a great thing, as tools, images, and runtimes will be able to interoperate freely. We're not there yet.
CRI-O
CRI-O is a Kubernetes incubator project. It is designed to provide an integration path between Kubernetes and OCI-compliant container runtimes like Docker. CRI-O provides the following capabilities:
- Support for multiple image formats, including the existing Docker image format
- Support for multiple means to download images, including trust and image verification
- Container image management (managing image layers, overlay filesystems, and so on)
- Container process lifecycle management
- Monitoring and logging required to satisfy the CRI
- Resource isolation as required by the CRI
It supports runc and Kata Containers right now, but any OCI-compliant container runtime can be plugged in and integrated with Kubernetes.
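One way to picture this pluggability is as a lookup from a handler name to the runtime binary that should launch the container. The following Python sketch is a toy model of that idea only; it is not CRI-O's configuration format or code, and the RuntimeRegistry class, the handler names, and the binary paths are all assumptions chosen for illustration:

```python
from typing import Dict


class RuntimeRegistry:
    """Toy registry mapping handler names to OCI runtime binaries."""

    def __init__(self, default: str) -> None:
        self.default = default
        self.handlers: Dict[str, str] = {}

    def register(self, handler: str, binary_path: str) -> None:
        self.handlers[handler] = binary_path

    def resolve(self, handler: str = "") -> str:
        # An empty handler name falls back to the default runtime.
        name = handler or self.default
        if name not in self.handlers:
            raise KeyError(f"no runtime registered for handler {name!r}")
        return self.handlers[name]


registry = RuntimeRegistry(default="runc")
registry.register("runc", "/usr/bin/runc")          # assumed path
registry.register("kata", "/usr/bin/kata-runtime")  # assumed path

print(registry.resolve())        # workload with no special requirement
print(registry.resolve("kata"))  # workload that asks for VM-based isolation
```

Adding another OCI-compliant runtime in this model is a single register call, and nothing that calls resolve has to change.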
Hyper containers
Hyper containers are another option. A Hyper container runs in a lightweight VM with its own guest kernel, and it can run on bare metal. Instead of relying on Linux cgroups for isolation, it relies on a hypervisor. This approach presents an interesting middle ground between standard bare-metal clusters, which are difficult to set up, and public clouds, where containers are deployed on heavyweight VMs.
Frakti
Frakti lets Kubernetes use hypervisors via the OCI-compliant runV project to run its pods and containers. It's a lightweight, portable, and secure approach. Because each pod gets its own kernel, it provides stronger isolation than the traditional Linux namespace-based approaches, but it is not as heavyweight as a full-fledged VM.
Stackube
Stackube (previously called Hypernetes) is a multi-tenant distribution that uses Hyper containers as well as some OpenStack components for authentication, persistent storage, and networking. Since containers don't share the host kernel, it is safe to run containers of different tenants on the same physical host. Stackube uses Frakti, of course, as its container runtime.
In this section, we've covered the various runtime engines that Kubernetes supports as well as the trend toward standardization, convergence, and externalizing the runtime support from core Kubernetes. In the next section, we'll take a step back and look at the big picture, and how Kubernetes fits into the CI/CD pipeline.