Create and Manage Pods in Kubernetes

A Pod is a group of one or more containers and is the smallest deployable unit of compute in Kubernetes. The containers in a Pod live in their own cgroups but share a number of Linux namespaces. Applications running in the same Pod share the same IP address and port space (network namespace), have the same hostname (UTS namespace), and can communicate using native inter-process communication channels such as System V IPC or POSIX message queues (IPC namespace).

The containers in a Pod are not managed individually; they are managed at the Pod level. Besides the containers running the actual application processes, a Pod may also include init containers, sidecar containers and ephemeral containers.

Creating Pods

Pods are seldom created directly, as Pods are designed to be ephemeral and disposable. When a Pod is created, it is scheduled to one of the worker nodes in the cluster. It remains there until it completes execution, is deleted, or is evicted (for lack of resources, or because the node fails).

Pods are mostly created using workload resources such as Deployments, Jobs or DaemonSets. These resources create Pods from a pod template and manage those Pods on your behalf. A pod template is simply a specification for creating Pods, embedded inside the workload resource.
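For instance, a Deployment carries its pod template under spec.template; the fields there are the same as in a standalone Pod spec. This is a minimal sketch, reusing the same image as the standalone example below (the Deployment name and label values are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pod-demo-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: pod-demo
  template:            # pod template - same fields as a standalone Pod spec
    metadata:
      labels:
        app: pod-demo
    spec:
      containers:
      - image: docker.io/mohitgoyal/demo-ui03:1.0.0
        name: pod-demo
        ports:
        - containerPort: 80
```

The Deployment then creates and manages the two replica Pods on your behalf.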

However, if you do want to create a singleton Pod, you can do so by defining a Pod object in a resource manifest. For example, we can create a pod manifest as below:

apiVersion: v1
kind: Pod
metadata:
  name: pod-demo
spec:
  containers:
  - image: docker.io/mohitgoyal/demo-ui03:1.0.0
    name: pod-demo
    ports:
    - containerPort: 80
      name: http
      protocol: TCP

In the above template, there are a few things to note:

apiVersion: Kubernetes follows an agile development process. Features can be shipped, created and managed at different points of maturity through API versions. Between API versions, the resource specification can vary considerably. Be sure to check that you are using the correct API version for the resource you are creating.

kind: It specifies the type of resource to be created. In our case, we are creating a Pod object, so it is set accordingly.

metadata: It holds user-identifiable metadata about the object being created. Since Kubernetes is highly scalable, the number of deployed resources can be very high, and it can become daunting to identify the resource you need to work with. Using metadata, you can attach key-value pairs (labels) to your object and later use them to identify it.
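For example, the metadata section of our Pod could carry labels alongside the name (the label keys and values below are purely illustrative):

```yaml
metadata:
  name: pod-demo
  labels:
    app: demo-ui
    environment: dev
```

These labels can later be used to select the Pod, e.g. kubectl get pods -l app=demo-ui.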

spec: This section contains the specification of the object being created. Since a Pod is essentially a group of containers, we specify them under the containers sub-section.

Within that sub-section, we have specified the image to use for the container, given the container a name, and exposed a TCP port.

To create the Pod, save the above manifest to a file, say pod.yaml, and apply it with the kubectl apply command:

cloud_user@d7bfd02ab81c:~/workspace$ kubectl apply -f pod.yaml 
pod/pod-demo created

After this, kubectl passes the manifest to the Kubernetes API server, which validates it and schedules the Pod for execution on a healthy node in the cluster.

Modifying the resource manifest file after the Pod is created has no effect on the properties of the Pod that already exists. However, Kubernetes allows you to perform some non-critical modifications on the Pod using the kubectl patch and kubectl replace commands.
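As a sketch, one such non-critical change is adding a label to the running Pod with a strategic merge patch (the label key and value here are just an illustration):

```shell
kubectl patch pod pod-demo -p '{"metadata":{"labels":{"env":"dev"}}}'
```

Attempting to patch immutable fields of a running Pod (such as most of the container spec) will be rejected by the API server.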

Listing Pod and Pod Details

We can view the Pods running in a cluster with the kubectl get pods command:

cloud_user@d7bfd02ab81c:~/workspace$ kubectl get pods
NAME       READY   STATUS    RESTARTS   AGE
pod-demo   1/1     Running   0          11s

The above output shows the name of the Pod, its readiness (1/1), its status (Running, Pending, Terminating, etc.), its restart count, and the time elapsed since it was created. If we want more information, we can use the -o wide option:

cloud_user@d7bfd02ab81c:~/workspace$ kubectl get pods -o wide
NAME       READY   STATUS    RESTARTS   AGE   IP          NODE           NOMINATED NODE   READINESS GATES
pod-demo   1/1     Running   0          21m   10.42.1.3   k3d-worker-0   <none>           <none>

It shows extra information such as the node the Pod was scheduled on, the IP address of the Pod, and so on.

A lot more information can be obtained by using -o yaml or -o json, which prints the complete object in the specified format.
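You can also extract individual fields with -o jsonpath. For example, to print just the Pod's IP address (the field path matches the IP shown in the describe output later in this post):

```shell
kubectl get pod pod-demo -o jsonpath='{.status.podIP}'
```

This is handy in scripts, where parsing the full YAML or JSON object would be overkill.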

In addition, Kubernetes maintains numerous events about Pods in the event stream; these are not attached to the Pod object itself. To find out more about a Pod (or any Kubernetes object), you can use the kubectl describe command followed by the object name. For example, we can find more information about the Pod created above as below:

cloud_user@d7bfd02ab81c:~/workspace$ kubectl describe pod pod-demo
Name:         pod-demo
Namespace:    default
Priority:     0
Node:         k3d-worker-0/172.18.0.4
Start Time:   Sat, 01 May 2021 15:25:27 +0000
Labels:       <none>
Annotations:  <none>
Status:       Running
IP:           10.42.1.3

...
...
 
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  27m   default-scheduler  Successfully assigned default/pod-demo to k3d-worker-0
  Normal  Pulling    27m   kubelet            Pulling image "docker.io/mohitgoyal/demo-ui03:1.0.0"
  Normal  Pulled     26m   kubelet            Successfully pulled image "docker.io/mohitgoyal/demo-ui03:1.0.0" in 8.656787977s
  Normal  Created    26m   kubelet            Created container pod-demo
  Normal  Started    26m   kubelet            Started container pod-demo

Accessing Pod and Logs

As mentioned previously, kubectl describe is helpful for viewing events about Kubernetes resources (the Pod in this case). However, these are events as recorded by Kubernetes. To view application logs, use kubectl logs:

cloud_user@d7bfd02ab81c:~/workspace$ kubectl logs pod-demo
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf
10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up

This is the output written to stdout by the processes running inside the container. Adding the -f flag streams the logs continuously. Use --all-containers=true to return logs from all containers in the Pod.

Adding the --previous flag gets logs from a previous instance of the container. This is useful, for example, if your containers are continuously restarting due to a problem at container startup.
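Putting those flags together (on our single-container Pod, --all-containers has no visible effect, but it is harmless):

```shell
# Stream logs and keep following new output
kubectl logs -f pod-demo

# Logs from the previous (crashed) container instance
kubectl logs --previous pod-demo

# Logs from every container in the Pod
kubectl logs pod-demo --all-containers=true
```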

However, the limitation here is that not all applications write their logs to stdout, for security or other reasons. The recommended practice is to use a log aggregation service, or to write logs to a file and run a sidecar container that reads the log file and ships it to the log aggregation service.

At times, you may want to connect to a container and run certain commands. For example, you may want to read an application log file written by the application process inside the container, or inspect a config file. For this, you can use the kubectl exec command:

cloud_user@d7bfd02ab81c:~/workspace$ kubectl exec pod-demo -- cat /etc/nginx/nginx.conf

user  nginx;
worker_processes  1;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;


events {
    worker_connections  1024;
}


http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;
...
...

You can also get an interactive shell session with the -it flags.
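For example, to open an interactive shell inside the container (the image here is nginx-based, so /bin/sh is assumed to be available):

```shell
kubectl exec -it pod-demo -- /bin/sh
```

Type exit to leave the session; the container keeps running.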

Container Restart Policy

The spec of a Pod has a restartPolicy field with possible values as Always, OnFailure, and Never. The default value is Always.

The restartPolicy applies to all containers in the Pod. restartPolicy only refers to restarts of the containers by the kubelet on the same node. After containers in a Pod exit, the kubelet restarts them with an exponential back-off delay (10s, 20s, 40s, …), that is capped at five minutes. Once a container has executed for 10 minutes without any problems, the kubelet resets the restart backoff timer for that container.
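For example, a Pod whose containers should not be restarted after they exit can be declared as below (a sketch, reusing the image from the earlier examples):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-demo-once
spec:
  restartPolicy: Never   # default is Always; OnFailure restarts only on non-zero exit
  containers:
  - image: docker.io/mohitgoyal/demo-ui03:1.0.0
    name: pod-demo
```

OnFailure is typically used for batch workloads, while Always suits long-running services.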

Performing Health Check

By default, container health is determined simply by a process check. This check ensures only that the main process of your application is running; if it is not, Kubernetes restarts the container. If all the containers in the Pod are running, the Pod is marked as Running.

However, that is not usually ideal. While the Pod may be scheduled and marked as Running, the application process inside the container may not yet be ready to handle incoming requests. To overcome this issue, one can use probes. A probe is a way to tell the kubelet how to check application health and readiness.

There are 3 types of probes supported: livenessProbe, readinessProbe and startupProbe.

Startup Probe

Startup probes are useful for services that take a long time to start or to get into a working state; this is especially useful for database containers. If no startup probe is specified, it is considered successful. If one is specified, the other two probes, liveness and readiness, are disabled until the startup probe succeeds.

We can create a startup probe as in the below example manifest:

apiVersion: v1
kind: Pod
metadata:
  name: pod-demo
spec:
  containers:
  - image: docker.io/mohitgoyal/demo-ui03:1.0.0
    name: pod-demo
    ports:
    - containerPort: 80
      name: http
      protocol: TCP
    startupProbe:
      httpGet: 
        path: /
        port: http
      failureThreshold: 10
      periodSeconds: 5

In the above example, the kubelet performs an HTTP GET request to the application process on the port named http (i.e. 80) once the container is created. If the probe fails, it waits 5 seconds between attempts and allows up to 10 failures. If the startup probe never succeeds, the container is killed after 10 * 5 = 50 seconds and becomes subject to the Pod's restart policy.

Liveness Probe

It may happen that an application running for a long period of time becomes stale. It may also break due to external events, such as database or middleware queue unavailability, and stop performing as expected. Kubernetes provides liveness probes to detect and remedy such situations by restarting the container.

After the startup probe has succeeded, the liveness probe takes over. Consider the below simple manifest for an example of a liveness probe:

apiVersion: v1
kind: Pod
metadata:
  name: pod-demo
spec:
  restartPolicy: Always
  containers:
  - image: docker.io/mohitgoyal/demo-ui03:1.0.0
    name: pod-demo    
    ports:
    - containerPort: 80
      name: http
      protocol: TCP
    startupProbe:
      httpGet: 
        path: /
        port: http
      failureThreshold: 10
      periodSeconds: 5
    livenessProbe:
      httpGet:
        path: /
        port: http
      initialDelaySeconds: 3
      periodSeconds: 10

In above manifest, the periodSeconds field specifies that the kubelet should perform a liveness probe every 10 seconds. The initialDelaySeconds field tells the kubelet that it should wait 3 seconds before performing the first probe. 

In case of HTTP requests, any code greater than or equal to 200 and less than 400 indicates success. Any other code indicates failure.

Readiness Probe

The readiness probe indicates whether the container is ready to respond to requests. If the readiness probe fails, the endpoints controller removes the Pod's IP address from the endpoints of all Services that match the Pod. The default readiness state before the initial delay is Failure. If a container does not provide a readiness probe, the default state is Success.

Readiness probes are configured similarly to liveness probes. The only difference is that you use the readinessProbe field instead of the livenessProbe field:

apiVersion: v1
kind: Pod
metadata:
  name: pod-demo
spec:
  restartPolicy: Always
  containers:
  - image: docker.io/mohitgoyal/demo-ui03:1.0.0
    name: pod-demo    
    ports:
    - containerPort: 80
      name: http
      protocol: TCP
    startupProbe:
      httpGet: 
        path: /
        port: http
      failureThreshold: 10
      periodSeconds: 5
    livenessProbe:
      httpGet:
        path: /
        port: http
      initialDelaySeconds: 3
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /
        port: http
      initialDelaySeconds: 3
      periodSeconds: 10

As we can see, we can use all three probes together to make sure that traffic reaches the application container only when it is ready to handle incoming requests.

In the above examples, we have used HTTP probes only. However, we can also use an exec probe (to execute a particular command inside the container) or a TCP probe (to check whether a specific port accepts connections).
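As a sketch, the liveness probe from the earlier manifest could instead be expressed as a TCP check, and the readiness probe as a command (the /tmp/ready path in the exec probe is purely illustrative; an application would create such a file when it is ready):

```yaml
    livenessProbe:
      tcpSocket:
        port: http          # succeeds if a TCP connection can be opened
      initialDelaySeconds: 3
      periodSeconds: 10
    readinessProbe:
      exec:
        command:            # succeeds if the command exits with status 0
        - cat
        - /tmp/ready        # illustrative path, not part of the demo image
      initialDelaySeconds: 3
      periodSeconds: 10
```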

Deleting a Pod

We can delete a Pod imperatively with the kubectl delete pod command. Alternatively, we can pass the same manifest file to the kubectl delete command:

cloud_user@d7bfd02ab81c:~/workspace$ kubectl delete -f pod.yaml 
pod "pod-demo" deleted

When a Pod is deleted, it is not immediately killed. Instead, if you run kubectl get pods, you will see that the Pod is in the Terminating state. All Pods have a termination grace period; by default, it is 30 seconds. Once a Pod transitions to Terminating, it no longer receives new requests. In a serving scenario, the grace period is important for reliability because it allows the Pod to finish any active requests it may be in the middle of processing before it is terminated.

It is important to note that when you delete a Pod, any data stored in the containers associated with that Pod is deleted as well. If you want to persist data across multiple instances of a Pod, you need to use a PersistentVolume.

Forced Pod Termination

At times, it can be useful to terminate Pods forcefully. The kubectl delete command supports the --grace-period=&lt;seconds&gt; option, which allows you to override the default and specify your own value.

If you need to delete a Pod immediately, you need to use both --force and --grace-period=0. When a force deletion is performed, the API server does not wait for confirmation from the kubelet that the Pod has been terminated on the node it was running on.

However, a force deletion only removes the Pod object from the API server immediately; it may still take a few seconds for the kubelet to actually terminate the containers on the node.
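For example:

```shell
# Shorten the grace period to 5 seconds
kubectl delete pod pod-demo --grace-period=5

# Delete immediately, without waiting for kubelet confirmation
kubectl delete pod pod-demo --grace-period=0 --force
```

Force deletion should be a last resort, since the API server forgets about the Pod before the node has confirmed cleanup.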
