Defining Pods through declarative syntax
Even though a Pod can contain any number of containers, the most common use case is to use the single-container-in-a-Pod model. In such a case, a Pod is a wrapper around one container. From Kubernetes' perspective, a Pod is the smallest unit. We cannot tell Kubernetes to run a container. Instead, we ask it to create a Pod that wraps around a container.
Let's take a look at a simple Pod definition:
cat pod/db.yml
The output is as follows:
apiVersion: v1
kind: Pod
metadata:
  name: db
  labels:
    type: db
    vendor: MongoLabs
spec:
  containers:
  - name: db
    image: mongo:3.3
    command: ["mongod"]
    args: ["--rest", "--httpinterface"]
We're using v1 of the Kubernetes Pods API. Both apiVersion and kind are mandatory. That way, Kubernetes knows what we want to do (create a Pod) and which API version to use.
The next section is metadata. It provides information that does not influence how the Pod behaves. We used metadata to define the name of the Pod (db) and a few labels. Later on, when we move into Controllers, labels will have a practical purpose. For now, they are purely informational.
The last section is the spec in which we defined a single container. As you might have guessed, we can define multiple containers in a Pod. Otherwise, the section would be named in singular (container instead of containers). We'll explore multi-container Pods later.
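As a preview, a hedged sketch of what a two-container Pod definition might look like follows. The sidecar name, its image, and its command are purely illustrative, not taken from the book's examples.

```yaml
# Hypothetical sketch of a multi-container Pod. The log-sidecar
# container is illustrative; both containers share the Pod's
# network namespace and lifecycle.
apiVersion: v1
kind: Pod
metadata:
  name: db-with-sidecar
spec:
  containers:
  - name: db
    image: mongo:3.3
  - name: log-sidecar
    image: busybox
    command: ["sh", "-c", "tail -f /dev/null"]
```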
In our case, the container is defined with the name (db), the image (mongo), the command that should be executed when the container starts (mongod), and, finally, the set of arguments. The arguments are defined as an array with, in this case, two elements (--rest and --httpinterface).
We won't go into details of everything you can use to define a Pod. Throughout the book, you'll see quite a few other commonly (and not so commonly) used things we should define in Pods. Later on, when you decide to learn all the possible arguments you can apply, explore the official, and ever-changing, Pod v1 core (https://v1-9.docs.kubernetes.io/docs/reference/generated/kubernetes-api/v1.9/#pod-v1-core) documentation.
Let's create the Pod defined in the db.yml file.
kubectl create -f pod/db.yml
You'll notice that we did not need to specify pod in the command. The command will create the kind of resource defined in the pod/db.yml file. Later on, you'll see that a single YAML file can contain definitions of multiple resources.
Let's take a look at the Pods in the cluster:
kubectl get pods
The output is as follows:
NAME READY STATUS    RESTARTS AGE
db   1/1   Running   0        11s
Our Pod named db is up and running.
In some cases, you might want to retrieve a bit more information by specifying wide output.
kubectl get pods -o wide
The output is as follows:
NAME READY STATUS    RESTARTS AGE IP         NODE
db   1/1   Running   0        1m  172.17.0.4 minikube
As you can see, we got two additional columns: the IP and the node.
If you'd like to parse the output, using json format is probably the best option.
kubectl get pods -o json
The output is too big to be presented in the book, especially since we won't go through all the information provided through the json output format.
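To illustrate the kind of parsing the json format enables without needing a live cluster, here is a hedged sketch that extracts a few fields from a heavily trimmed, hypothetical stand-in for that output.

```shell
# Simulate a fragment of `kubectl get pods -o json` output and parse it.
# The JSON here is abridged and illustrative, not real cluster output.
pods_json='{"items":[{"metadata":{"name":"db"},"status":{"podIP":"172.17.0.4","phase":"Running"}}]}'
echo "$pods_json" | python3 -c '
import json, sys
# Print name, phase, and IP for every Pod in the list.
for pod in json.load(sys.stdin)["items"]:
    print(pod["metadata"]["name"], pod["status"]["phase"], pod["status"]["podIP"])
'
```

For simple extractions, kubectl also offers a jsonpath output format (for example, -o jsonpath='{.items[*].metadata.name}'), which avoids external parsing tools altogether.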
When we want more information than provided with the default output, but still in a format that is human-friendly, yaml output is probably the best choice.
kubectl get pods -o yaml
Just as with the json output, we won't go into details of everything we got from Kubernetes. With time, you'll become familiar with all the information related to Pods. For now, we want to focus on the most important aspects.
Let's introduce a new kubectl sub-command.
kubectl describe pod db
The describe sub-command returned details of the specified resource. In this case, the resource is the Pod named db.
The output is too big for us to go into every detail. Besides, most of it should be self-explanatory if you're familiar with containers. Instead, we'll briefly comment on the last section called events.
...
Events:
  Type    Reason                 Age  From               Message
  ----    ------                 ---  ----               -------
  Normal  Scheduled              2m   default-scheduler  Successfully assigned db to minikube
  Normal  SuccessfulMountVolume  2m   kubelet, minikube  MountVolume.SetUp succeeded for volume "default-token-x27md"
  Normal  Pulling                2m   kubelet, minikube  pulling image "mongo:3.3"
  Normal  Pulled                 2m   kubelet, minikube  Successfully pulled image "mongo:3.3"
  Normal  Created                2m   kubelet, minikube  Created container
  Normal  Started                2m   kubelet, minikube  Started container
We can see that the Pod was created and went through several stages as shown in the following sequence diagram. Even though the process was simple from a user's perspective, quite a few things happened in the background.
This might be a good moment to pause our exercises, discuss some of the details of Kubernetes components, and try to understand how Pod scheduling works.
Three major components were involved in the process.
The API server is the central component of a Kubernetes cluster and it runs on the master node. Since we are using Minikube, both master and worker nodes are baked into the same virtual machine. However, a more serious Kubernetes cluster should have the two separated on different hosts.
All other components interact with the API server and watch for changes. Most of the coordination in Kubernetes consists of one component writing to an API server resource that another component is watching. The second component then reacts to changes almost immediately.
The scheduler is also running on the master node. Its job is to watch for unassigned pods and assign them to a node which has available resources (CPU and memory) matching Pod requirements. Since we are running a single-node cluster, specifying resources would not provide much insight into their usage so we'll leave them for later.
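Although we leave resources for later, a hedged preview of the fields the scheduler consults when matching a Pod to a node might look like this. The request values are arbitrary examples, not recommendations.

```yaml
# Illustrative only: resource requests the scheduler matches against
# a node's available capacity. The values are arbitrary examples.
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  containers:
  - name: db
    image: mongo:3.3
    resources:
      requests:
        cpu: 500m
        memory: 256Mi
```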
Kubelet runs on each node. Its primary function is to make sure that assigned pods are running on the node. It watches for any new Pod assignments for the node. If a Pod is assigned to the node Kubelet is running on, it will pull the Pod definition and use it to create containers through Docker or any other supported container engine.
The sequence of events that transpired with the kubectl create -f pod/db.yml command is as follows:
- Kubernetes client (kubectl) sent a request to the API server requesting creation of a Pod defined in the pod/db.yml file.
- Since the scheduler is watching the API server for new events, it detected that there is an unassigned Pod.
- The scheduler decided which node to assign the Pod to and sent that information to the API server.
- Kubelet is also watching the API server. It detected that the Pod was assigned to the node it is running on.
- Kubelet sent a request to Docker requesting the creation of the containers that form the Pod. In our case, the Pod defines a single container based on the mongo image.
- Finally, Kubelet sent a request to the API server notifying it that the Pod was created successfully.
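The write-and-watch coordination behind those steps can be caricatured in plain shell, using a temporary file in place of the API server. This is purely illustrative; the real components, of course, use the Kubernetes API and watch streams, not files.

```shell
# Toy sketch of the write-then-watch pattern: one "component" writes
# desired state, the next one notices it and reacts.
state=$(mktemp)
echo "unassigned" > "$state"            # kubectl "creates" an unassigned Pod

# The "scheduler" reacts to the unassigned Pod and writes an assignment:
if grep -q unassigned "$state"; then
  echo "node=minikube" > "$state"
fi

# The "kubelet" reacts to the assignment and "starts" the container:
if grep -q node= "$state"; then
  echo "container started on $(cut -d= -f2 "$state")"
fi
rm -f "$state"
```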
The process might not make much sense right now since we are running a single-node cluster. If we had more VMs, scheduling might have happened somewhere else, and the complexity of the process would be easier to grasp. We'll get there in due time.
In many cases, it is more useful to describe resources by referencing the file that defines them. That way there is no confusion nor need to remember the names of resources. We could have executed the command that follows:
kubectl describe -f pod/db.yml
The output should be the same since, in both cases, kubectl sent a request to the Kubernetes API requesting information about the Pod named db.
Just as with Docker, we can execute a new process inside a running container inside a Pod.
kubectl exec db ps aux
The output is as follows:
USER PID %CPU %MEM    VSZ   RSS TTY STAT START TIME COMMAND
root   1  0.5  2.9 967452 59692 ?   Ssl  21:47 0:03 mongod --rest --httpinterface
root  31  0.0  0.0  17504  1980 ?   Rs   21:58 0:00 ps aux
We told Kubernetes that we'd like to execute a process inside the first container of the Pod db. Since our Pod defines only one container, this container and the first container are one and the same. The --container (or -c) argument can be set to specify which container should be used. That is particularly useful when running multiple containers in a Pod.
Apart from using Pods as the reference, kubectl exec is almost the same as the docker container exec command. The significant difference is that kubectl allows us to execute a process in a container running in any node inside a cluster, while docker container exec is limited to containers running on a specific node.
Instead of executing a new short-lived process inside a running container, we can enter it. For example, we can make the execution interactive with the -i (stdin) and -t (terminal) arguments and run a shell inside the container.
kubectl exec -it db sh
We're inside the sh process inside the container. Since the container hosts a Mongo database, we can, for example, execute db.stats() to confirm that the database is indeed running.
echo 'db.stats()' | mongo localhost:27017/test
We used mongo client to execute db.stats() for the database test running on localhost:27017. Since we're not trying to learn Mongo (at least not in this book), the only purpose of this exercise was to prove that the database is up-and-running. Let's get out of the container.
exit
Logs should be shipped from containers to a central location. However, since we did not yet explore that subject, it would be useful to be able to see logs of a container in a Pod.
The command that outputs logs of the only container in the db Pod is as follows:
kubectl logs db
The output is too big and not that important in its entirety. One of the last lines is as follows:
...
2017-11-10T22:06:20.039+0000 I NETWORK  [thread1] waiting for connections on port 27017
...
With the -f (or --follow) we can follow the logs in real-time. Just as with the exec sub-command, if a Pod defines multiple containers, we can specify which one to use with the -c argument.
What happens when a container inside a Pod dies? Let's simulate a failure and observe what happens.
kubectl exec -it db pkill mongod

kubectl get pods
We killed the main process of the container and listed all the Pods. The output is as follows:
NAME READY STATUS    RESTARTS AGE
db   1/1   Running   1        13m
The container is running (1/1). Kubernetes guarantees that the containers inside a Pod are (almost) always running. Please note that the RESTARTS field now has the value of 1. Every time a container fails, Kubernetes will restart it.
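This restart behaviour is governed by the Pod's restartPolicy field, which defaults to Always. A hedged sketch of setting it explicitly follows; we did not set it in db.yml, so the default applied.

```yaml
# Sketch: restartPolicy applies to all containers in the Pod.
# Always is the default; OnFailure and Never are the alternatives.
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  restartPolicy: Always
  containers:
  - name: db
    image: mongo:3.3
```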
Finally, we can delete a Pod if we don't need it anymore.
kubectl delete -f pod/db.yml

kubectl get pods
We removed the Pod defined in db.yml and retrieved the list of all the Pods in the cluster. The output of the latter command is as follows:
NAME READY STATUS        RESTARTS AGE
db   0/1   Terminating   1        3h
The number of ready containers dropped to 0, and the status of the db Pod is terminating.
When we sent the instruction to delete a Pod, Kubernetes tried to terminate it gracefully. The first thing it did was send the TERM signal to the main processes inside the containers that form the Pod. From there on, Kubernetes gives each container a period of thirty seconds so that the processes in those containers can shut down gracefully. Once the grace period expires, the KILL signal is sent to terminate all the main processes forcefully and, with them, all the containers. The default grace period can be changed through the terminationGracePeriodSeconds field in the YAML definition or the --grace-period argument of the kubectl delete command.
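As a sketch, shortening the grace period in the Pod definition would look roughly like this:

```yaml
# Sketch: a db Pod that gives its processes only ten seconds
# to shut down after the TERM signal before KILL is sent.
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  terminationGracePeriodSeconds: 10
  containers:
  - name: db
    image: mongo:3.3
```

Equivalently, the period can be overridden at deletion time with kubectl delete -f pod/db.yml --grace-period=10.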
If we repeat the get pods command thirty seconds after we issued the delete instruction, the Pod should be removed from the system:
kubectl get pods
This time, the output is different.
No resources found.
The only Pod we had in the system is no more.