Communicating inside a pod
Containers inside the same pod share the same pod IP address. Usually, it is up to application developers to bundle the container images together and to resolve any possible resource usage conflicts, such as two containers listening on the same port. In this section, we will dive into the technical details of how communication happens among the containers inside a pod and will also highlight the communication that takes place beyond the network level.
Linux namespaces and the pause container
Linux namespaces are a feature of the Linux kernel that partitions resources for isolation purposes. With namespaces assigned, one set of processes sees one set of resources, while another set of processes sees a different set of resources. Namespaces are a fundamental aspect of modern container technology, and understanding them is important for knowing Kubernetes in depth, so we list all of the Linux namespaces here with explanations. As of Linux kernel version 4.7, there are seven kinds of namespaces (a time namespace was added later, in kernel 5.6), listed as follows:
- cgroup: Isolate the cgroup root directory. cgroup namespaces virtualize the view of a process's cgroups. Each cgroup namespace has its own set of cgroup root directories.
- IPC: Isolate System V Interprocess Communication (IPC) objects or Portable Operating System Interface (POSIX) message queues.
- Network: Isolate network devices, protocol stacks, ports, IP routing tables, firewall rules, and more.
- Mount: Isolate mount points. Thus, the processes in each of the mount namespace instances will see distinct single-directory hierarchies.
- PID: Isolate process IDs (PIDs). Processes in different PID namespaces can have the same PID.
- User: Isolate user IDs and group IDs, the root directory, keys, and capabilities. A process can have a different user and group ID inside and outside a user namespace.
- Unix Time Sharing (UTS): Isolate the two system identifiers: the hostname and Network Information Service (NIS) domain name.
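On a Linux host, the namespaces a process belongs to are exposed as symbolic links under /proc/<pid>/ns. The following is a minimal sketch, assuming a Linux host with /proc mounted, that prints the namespace identifiers of the current process. The function name list_namespaces is our own choice for illustration; two processes that show the same identifier for a given type are in the same namespace of that type.

```python
import os

# The seven namespace types listed above, by their /proc/<pid>/ns entry names.
NAMESPACE_TYPES = ("cgroup", "ipc", "mnt", "net", "pid", "user", "uts")


def list_namespaces(pid="self"):
    """Return {namespace type: identifier}, e.g. {'net': 'net:[<inode>]'}.

    Entries that are missing or unreadable (older kernel, non-Linux host,
    insufficient permissions) are skipped rather than treated as errors.
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for ns_type in NAMESPACE_TYPES:
        try:
            # Each entry is a symlink whose target names the namespace.
            result[ns_type] = os.readlink(os.path.join(ns_dir, ns_type))
        except OSError:
            continue
    return result


if __name__ == "__main__":
    for ns_type, ident in list_namespaces().items():
        print(f"{ns_type:7s}{ident}")
```

Running this once on the node itself and once inside a container should show different identifiers for the namespaces that isolate the container from the host.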
Though each of these namespaces is powerful and serves an isolation purpose on different resources, not all of them are applied between containers inside the same pod. Containers inside the same pod share at least the same IPC namespace and network namespace; as a result, two containers in a pod cannot listen on the same port, and whoever bundles the containers together has to resolve such conflicts. A loopback interface is created in the pod, as well as a virtual network interface with the IP address assigned to the pod. A more detailed diagram will look like this:
In this diagram, there is one Pause container running inside the pod alongside containers A and B. If you use Secure Shell (SSH) to log in to a Kubernetes cluster node and run the docker ps command on the node (assuming Docker is the container runtime), you will see at least one container that was started with the pause command. The pause command suspends the current process until a signal is received. Basically, these containers do nothing but sleep. Despite the lack of activity, the Pause container plays a critical role in the pod. It serves as a placeholder that holds the network namespace for all other containers in the same pod. Meanwhile, the Pause container acquires an IP address for the virtual network interface that will be used by all other containers to communicate with each other and the outside world.
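Because all containers in a pod live in the network namespace held by the Pause container, they can reach each other over the loopback interface, and they compete for the same set of ports. The following sketch only simulates this inside a single Python process: the two "containers" are just sockets on one host, and the helper names listen_on_loopback and demo are ours for illustration. It shows a second listener failing on a port that is already taken, followed by a successful exchange over localhost.

```python
import socket


def listen_on_loopback(port=0):
    """Open a TCP listener on 127.0.0.1; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


def demo():
    """Return (port_conflict_detected, bytes_received_by_A)."""
    # "Container A" starts listening on a loopback port.
    server_a = listen_on_loopback()
    port = server_a.getsockname()[1]

    # "Container B" tries to listen on the same port. Within one network
    # namespace the port is already in use, so the bind is expected to fail.
    conflict = False
    try:
        listen_on_loopback(port).close()
    except OSError:
        conflict = True

    # "Container B" can still talk to A through localhost.
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    conn, _ = server_a.accept()
    conn.settimeout(5)
    try:
        client.sendall(b"hello from B")
        received = conn.recv(1024)
    finally:
        conn.close()
        client.close()
        server_a.close()
    return conflict, received


if __name__ == "__main__":
    print(demo())
```

In a real pod, the two sides would be separate processes in separate containers, but the behavior is the same because the port space belongs to the shared network namespace rather than to an individual container.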
Beyond network communication
Here, we go a little beyond network communication among the containers in the same pod. The reason for doing so is that a communication path can sometimes become part of the kill chain, so it is important to know every way in which entities can communicate. You will see more coverage of this in Chapter 3, Threat Modeling.
Inside a pod, all containers share the same IPC namespace, so they can communicate via System V IPC objects or POSIX message queues. Besides the IPC channel, containers inside the same pod can also communicate via a shared mounted volume. The mounted volume could be temporary memory-backed storage, the host filesystem, or cloud storage. If the volume is mounted by several containers in the pod, those containers can read and write the same files in the volume. Last but not least, the shareProcessNamespace feature, which had been in beta since the Kubernetes 1.12 release, graduated to stable in 1.17. To allow containers within a pod to share a common PID namespace, users can simply set the shareProcessNamespace option in the pod spec. The result of this is that Application A in Container A is now able to see Application B in Container B. Since they are both in the same PID namespace, they can communicate using signals such as SIGTERM and SIGKILL. This communication can be seen in the following diagram:
As the previous diagram shows, containers inside the same pod can communicate with each other via the network, an IPC channel, a shared volume, and signals.
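To make two of these channels concrete, here is a minimal sketch of a pod manifest that turns on the shared PID namespace and mounts one memory-backed volume into both containers. It is built as a Python dictionary and printed as JSON. The pod name, container names, image references, and mount path are hypothetical placeholders; the field names (shareProcessNamespace, emptyDir, volumeMounts) come from the Kubernetes Pod API.

```python
import json


def build_pod_manifest(name="shared-pod"):
    """Build a Pod manifest whose two containers share a PID namespace and a volume."""
    # Hypothetical mount: both containers see the same files under /shared.
    shared_mount = {"name": "shared-data", "mountPath": "/shared"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Containers in this pod can see, and signal, each other's processes.
            "shareProcessNamespace": True,
            # Memory-backed temporary volume that lives as long as the pod.
            "volumes": [{"name": "shared-data", "emptyDir": {"medium": "Memory"}}],
            "containers": [
                {
                    "name": "container-a",
                    "image": "example.com/app-a:1.0",  # placeholder image
                    "volumeMounts": [shared_mount],
                },
                {
                    "name": "container-b",
                    "image": "example.com/app-b:1.0",  # placeholder image
                    "volumeMounts": [shared_mount],
                },
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_pod_manifest(), indent=2))
```

The printed JSON can be saved to a file and passed to kubectl apply -f, which accepts JSON as well as YAML. From a threat modeling point of view, every option enabled here widens what a compromised container can reach inside the pod.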