Working with ReplicaSets in Kubernetes

In Kubernetes, pods are the smallest deployable unit of compute. A pod can contain one or more containers for the application. However, you will often want several instances of a pod running in parallel for reasons such as scalability or sharding. A ReplicaSet ensures that a specified number of pod replicas (one or more) are running at any given time.

ReplicaSets are the building blocks used for higher-level concepts such as Deployments or horizontal pod autoscalers. By ensuring that the specified number of pod replicas is running, they provide self-healing for applications under certain failure conditions, such as node failures or network partitions. Most of the time, you would use higher-level concepts like Deployments instead of using ReplicaSets directly.

Define and Create a ReplicaSet

Like all objects in Kubernetes, a ReplicaSet is first defined in a manifest. Every ReplicaSet needs a unique name in the metadata.name field, the number of pods (replicas) that should be running in the spec.replicas field, a way to select and match pods in the spec.selector field, and a pod creation template in the spec.template field.

For example, consider the following manifest for creating a ReplicaSet named nginx-rs:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-rs
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      env: dev
  template:
    metadata:
      labels:
        env: dev
    spec:
      containers:
      - name: nginx
        image: nginx:latest
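
Note that the selector (env=dev) is matched against the pod labels in spec.template, not against the ReplicaSet's own label (app=nginx). Kubernetes rejects a ReplicaSet whose template labels do not satisfy its selector. The following is a minimal Python sketch of how a matchLabels selector is evaluated; the function name selector_matches is made up for illustration and is not part of any Kubernetes client library.

```python
def selector_matches(match_labels, labels):
    """Return True if every key/value pair in match_labels is present in labels.

    Simplified model of a matchLabels selector (hypothetical helper,
    not a Kubernetes API): extra labels on the pod are ignored.
    """
    return all(labels.get(key) == value for key, value in match_labels.items())


# Values copied from the nginx-rs manifest above.
selector = {"env": "dev"}
template_labels = {"env": "dev"}
replicaset_labels = {"app": "nginx"}

print(selector_matches(selector, template_labels))    # True
print(selector_matches(selector, replicaset_labels))  # False
```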

Let's save the manifest as rs-simple.yaml and create the ReplicaSet with kubectl apply in a separate namespace, rsdemo:

cloud_user@d7bfd02ab81c:~/workspace$ kubectl create namespace rsdemo
namespace/rsdemo created

cloud_user@d7bfd02ab81c:~/workspace$ kubectl apply -f rs-simple.yaml -n rsdemo
replicaset.apps/nginx-rs created

cloud_user@d7bfd02ab81c:~/workspace$ kubectl get all -n rsdemo
NAME                 READY   STATUS    RESTARTS   AGE
pod/nginx-rs-5nsm7   1/1     Running   0          22s
pod/nginx-rs-5v6nk   1/1     Running   0          22s

NAME                       DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-rs   2         2         2       22s

We can see more details of the ReplicaSet with kubectl describe:

cloud_user@d7bfd02ab81c:~/workspace$ kubectl describe rs nginx-rs -n rsdemo
Name:         nginx-rs
Namespace:    rsdemo
Selector:     env=dev
Labels:       app=nginx
Annotations:  <none>
Replicas:     2 current / 2 desired
Pods Status:  2 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  env=dev
  Containers:
   nginx:
    Image:        nginx:latest
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From                   Message
  ----    ------            ----  ----                   -------
  Normal  SuccessfulCreate  46s   replicaset-controller  Created pod: nginx-rs-5nsm7
  Normal  SuccessfulCreate  46s   replicaset-controller  Created pod: nginx-rs-wq99z

When pods are created by a ReplicaSet, each pod name is generated by appending a random suffix to the ReplicaSet name.
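
The naming scheme can be imitated with a few lines of Python. This is only a sketch: the five-character length and the alphabet below are assumptions chosen to resemble the suffixes seen in the output above (for example 5nsm7 or wq99z), and generate_pod_name is a hypothetical helper, not a Kubernetes function.

```python
import random

# Assumption: consonants and digits only, five characters long, which is
# consistent with the pod names shown in this post. Treat both as illustrative.
SUFFIX_ALPHABET = "bcdfghjklmnpqrstvwxz2456789"
SUFFIX_LENGTH = 5


def generate_pod_name(replicaset_name):
    """Append a random suffix to the ReplicaSet name (hypothetical helper)."""
    suffix = "".join(random.choice(SUFFIX_ALPHABET) for _ in range(SUFFIX_LENGTH))
    return f"{replicaset_name}-{suffix}"


# Prints something of the form nginx-rs-xxxxx; the suffix differs on every run.
print(generate_pod_name("nginx-rs"))
```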

Checking Pod failures with ReplicaSet

As mentioned previously, you can specify how many pods should run concurrently by setting spec.replicas. The ReplicaSet creates or deletes pods to match this number. If it is not specified, it defaults to 1.

To simulate a pod failure, let's delete one of the pods and see how the ReplicaSet behaves:

cloud_user@d7bfd02ab81c:~/workspace$ kubectl delete pod nginx-rs-5nsm7 -n rsdemo
pod "nginx-rs-5nsm7" deleted

# watch live status of the pods running in namespace rsdemo
cloud_user@d7bfd02ab81c:~/workspace$ kubectl get pods -n rsdemo --watch
NAME             READY   STATUS    RESTARTS   AGE
nginx-rs-wq99z   1/1     Running   0          21m
nginx-rs-5nsm7   1/1     Running   0          2m2s
nginx-rs-5nsm7   1/1     Terminating   0          2m38s
nginx-rs-7p8hl   0/1     Pending       0          0s
nginx-rs-7p8hl   0/1     Pending       0          0s
nginx-rs-7p8hl   0/1     ContainerCreating   0          0s
nginx-rs-5nsm7   0/1     Terminating         0          2m39s
nginx-rs-7p8hl   1/1     Running             0          3s
nginx-rs-5nsm7   0/1     Terminating         0          2m49s
nginx-rs-5nsm7   0/1     Terminating         0          2m49s
^C

# verify pods and rs after some time
cloud_user@d7bfd02ab81c:~/workspace$ kubectl get all -n rsdemo
NAME                 READY   STATUS    RESTARTS   AGE
pod/nginx-rs-wq99z   1/1     Running   0          22m
pod/nginx-rs-7p8hl   1/1     Running   0          60s

NAME                       DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-rs   2         2         2       22m

If we watch the status of the pods live, we can see that while pod nginx-rs-5nsm7 was being terminated, the ReplicaSet spun up a new pod, nginx-rs-7p8hl. After a short time, the ReplicaSet has reconciled and the number of running replicas is back to the desired count.
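
The reconciliation step can be pictured as comparing the desired count with the pods that are still active. The Python sketch below is a simplified model under that assumption, not the actual controller code; the reconcile function and the "terminating" key are invented for illustration.

```python
def reconcile(desired_replicas, matching_pods):
    """Return (pods_to_create, pods_to_delete) for one reconciliation pass.

    Simplified model: pods that are already terminating do not count
    toward the current total, so a replacement is started right away.
    """
    active = [pod for pod in matching_pods if not pod["terminating"]]
    diff = desired_replicas - len(active)
    if diff > 0:
        return diff, 0
    return 0, -diff


# State right after `kubectl delete pod nginx-rs-5nsm7` in the example above.
pods = [
    {"name": "nginx-rs-wq99z", "terminating": False},
    {"name": "nginx-rs-5nsm7", "terminating": True},
]

print(reconcile(2, pods))  # (1, 0): one replacement pod is needed
```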

Non-Template Pod Acquisitions

A ReplicaSet is not limited to creating pods from its template to make up for pod failures. It can also acquire other pods that match the selector in its manifest, whether they were already running or are created separately later, provided they are not already owned by another controller.

In our example above, the selector tells the ReplicaSet to acquire pods that carry the label env=dev. If we create more pods separately with the same label, the ReplicaSet acquires them and, depending on its current state, either keeps or terminates them. To see this in action, let's create a pod from the following manifest:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-standalone
  labels:
    env: dev
    version: "1"
spec:
  restartPolicy: Always
  containers:
  - name: nginx-pod
    image: nginx:latest
    ports:
    - name: http
      containerPort: 80
      protocol: TCP

As soon as we create it, the ReplicaSet acquires it. Since the ReplicaSet already has the desired number of replicas, it terminates the newly created standalone pod:

cloud_user@d7bfd02ab81c:~/workspace$ kubectl apply -f rs-extrapod.yaml -n rsdemo
pod/nginx-standalone created

cloud_user@d7bfd02ab81c:~/workspace$ kubectl get pod -n rsdemo --watch
NAME             READY   STATUS    RESTARTS   AGE
nginx-rs-wq99z   1/1     Running   0          41m
nginx-rs-7p8hl   1/1     Running   0          19m
nginx-standalone   0/1     Pending   0          0s
nginx-standalone   0/1     Pending   0          0s
nginx-standalone   0/1     Terminating   0          0s
nginx-standalone   0/1     Terminating   0          0s
^C

cloud_user@d7bfd02ab81c:~/workspace$ kubectl describe rs nginx-rs -n rsdemo
Name:         nginx-rs
Namespace:    rsdemo
Selector:     env=dev
Labels:       app=nginx
Annotations:  <none>
Replicas:     2 current / 2 desired
Pods Status:  2 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  env=dev
  Containers:
   nginx:
    Image:        nginx:latest
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From                   Message
  ----    ------            ----  ----                   -------
  Normal  SuccessfulCreate  44m   replicaset-controller  Created pod: nginx-rs-wq99z
  Normal  SuccessfulCreate  25m   replicaset-controller  Created pod: nginx-rs-5nsm7
  Normal  SuccessfulCreate  23m   replicaset-controller  Created pod: nginx-rs-7p8hl
  Normal  SuccessfulDelete  22s   replicaset-controller  Deleted pod: nginx-standalone

So when using ReplicaSets, make sure you do not create bare pods whose labels match the selector of an existing ReplicaSet, and do not run two ReplicaSets with overlapping selectors in the same namespace.
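
To make the acquisition rule concrete, here is a small Python sketch. It assumes a simplified model in which a pod is a candidate for acquisition when its labels satisfy the selector and it has no controlling owner yet; the "owner" key and the pods_to_adopt function are hypothetical names used only for this illustration.

```python
def pods_to_adopt(selector, pods):
    """Return names of pods a ReplicaSet with this selector would acquire.

    Simplified model: a pod qualifies if all selector labels match and
    the pod is not owned by any controller yet.
    """
    adopted = []
    for pod in pods:
        labels_match = all(pod["labels"].get(k) == v for k, v in selector.items())
        if labels_match and pod["owner"] is None:
            adopted.append(pod["name"])
    return adopted


pods = [
    # Already managed by the ReplicaSet from this post.
    {"name": "nginx-rs-wq99z", "labels": {"env": "dev"}, "owner": "nginx-rs"},
    # The bare pod from rs-extrapod.yaml: matching label, no owner.
    {"name": "nginx-standalone", "labels": {"env": "dev", "version": "1"}, "owner": None},
    # Hypothetical bare pod with a different label value, for contrast.
    {"name": "other-pod", "labels": {"env": "prod"}, "owner": None},
]

print(pods_to_adopt({"env": "dev"}, pods))  # ['nginx-standalone']
```

Once acquired, the standalone pod counts toward the replica total, which is why it is removed when the ReplicaSet is already at its desired size.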

Isolating Pods from a ReplicaSet

You can remove pods from a ReplicaSet by changing their labels so that they no longer match its selector. This technique may be used to take pods out of service for debugging, data recovery, etc. Pods removed in this way are replaced automatically, assuming that the number of replicas is not changed at the same time. Consider the following example:

# gets current state of pods and replicaset
cloud_user@d7bfd02ab81c:~/workspace$ kubectl get all -n rsdemo
NAME                 READY   STATUS    RESTARTS   AGE
pod/nginx-rs-7p8hl   1/1     Running   0          100m
pod/nginx-rs-wq99z   1/1     Running   0          122m

NAME                       DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-rs   2         2         2       122m


# remove the label env=dev from pod nginx-rs-wq99z
cloud_user@d7bfd02ab81c:~/workspace$ kubectl label pod nginx-rs-wq99z env- -n rsdemo
pod/nginx-rs-wq99z labeled

# verify that replicaset has created a new pod nginx-rs-698fz 
cloud_user@d7bfd02ab81c:~/workspace$ kubectl get all -n rsdemo
NAME                 READY   STATUS    RESTARTS   AGE
pod/nginx-rs-7p8hl   1/1     Running   0          101m
pod/nginx-rs-wq99z   1/1     Running   0          123m
pod/nginx-rs-698fz   1/1     Running   0          6s

NAME                       DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-rs   2         2         2       123m


# verify the same in replicaset events
cloud_user@d7bfd02ab81c:~/workspace$ kubectl describe rs nginx-rs -n rsdemo
Name:         nginx-rs
Namespace:    rsdemo
Selector:     env=dev
Labels:       app=nginx
Annotations:  <none>
Replicas:     2 current / 2 desired
Pods Status:  2 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  env=dev
  Containers:
   nginx:
    Image:        nginx:latest
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From                   Message
  ----    ------            ----  ----                   -------
  Normal  SuccessfulCreate  39s   replicaset-controller  Created pod: nginx-rs-698fz

Scaling a ReplicaSet

ReplicaSets are scaled up or down by updating the spec.replicas key on the ReplicaSet object stored in Kubernetes. The ReplicaSet controller ensures that a desired number of Pods with a matching label selector are available and operational.

If the number of replicas is increased, new pods are created from the pod template. If it is reduced, the ReplicaSet terminates existing pods to match the desired state. Make sure that application behavior is not affected by which pods are selected for termination.

We can also scale imperatively by using the kubectl scale command:
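
Which pods are removed on scale-down is decided by the controller, not by you. The Python sketch below illustrates one plausible ordering, assuming that pods that are not ready are removed before ready ones and that younger pods are removed before older ones. It is a deliberately reduced model of the real ranking, and the function name and dictionary keys are made up for this example.

```python
def pick_pods_to_delete(pods, count):
    """Return the names of `count` pods to remove on scale-down.

    Assumed ordering (simplified): not-ready pods first, then the
    youngest pods. sorted() is stable, so ties keep their input order.
    """
    ranked = sorted(pods, key=lambda pod: (pod["ready"], pod["age_seconds"]))
    return [pod["name"] for pod in ranked[:count]]


# Pods carrying the env=dev label, with ages rounded for illustration.
pods = [
    {"name": "nginx-rs-7p8hl", "ready": True, "age_seconds": 116 * 60},
    {"name": "nginx-rs-698fz", "ready": True, "age_seconds": 15 * 60},
    {"name": "nginx-rs-4tx8x", "ready": True, "age_seconds": 9},
    {"name": "nginx-rs-mltbf", "ready": True, "age_seconds": 9},
]

# Scaling from 4 replicas to 1 means three pods have to go.
print(pick_pods_to_delete(pods, 3))
# ['nginx-rs-4tx8x', 'nginx-rs-mltbf', 'nginx-rs-698fz']
```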

# get current number of pods and replicaset
cloud_user@d7bfd02ab81c:~/workspace$ kubectl get all -n rsdemo
NAME                 READY   STATUS    RESTARTS   AGE
pod/nginx-rs-7p8hl   1/1     Running   0          115m
pod/nginx-rs-wq99z   1/1     Running   0          137m
pod/nginx-rs-698fz   1/1     Running   0          14m

NAME                       DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-rs   2         2         2       137m

# increase number of replicas from 2 to 4
cloud_user@d7bfd02ab81c:~/workspace$ kubectl scale rs nginx-rs --replicas=4 -n rsdemo
replicaset.apps/nginx-rs scaled


# verify that number of desired and ready pods increased to 4
cloud_user@d7bfd02ab81c:~/workspace$ kubectl get all -n rsdemo
NAME                 READY   STATUS    RESTARTS   AGE
pod/nginx-rs-7p8hl   1/1     Running   0          116m
pod/nginx-rs-wq99z   1/1     Running   0          138m
pod/nginx-rs-698fz   1/1     Running   0          15m
pod/nginx-rs-4tx8x   1/1     Running   0          9s
pod/nginx-rs-mltbf   1/1     Running   0          9s

NAME                       DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-rs   4         4         4       138m


# scale down the number of replicas from 4 to 1
cloud_user@d7bfd02ab81c:~/workspace$ kubectl scale rs nginx-rs --replicas=1 -n rsdemo
replicaset.apps/nginx-rs scaled


# verify that desired and ready pods reduced to 1
# (nginx-rs-wq99z is the pod isolated earlier; it is no longer managed by the replicaset)
cloud_user@d7bfd02ab81c:~/workspace$ kubectl get all -n rsdemo
NAME                 READY   STATUS    RESTARTS   AGE
pod/nginx-rs-7p8hl   1/1     Running   0          116m
pod/nginx-rs-wq99z   1/1     Running   0          138m

NAME                       DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-rs   1         1         1       138m

Autoscaling a ReplicaSet

While you can scale a ReplicaSet manually to a desired number of replicas, more often than not you want just enough replicas to handle the current load. What counts as enough can vary a lot over the course of a day and from day to day. To avoid adjusting the replica count by hand as incoming traffic changes, we can use horizontal pod autoscalers (HPAs).

When defining an HPA, you specify the minimum and maximum number of replicas and the criteria the autoscaler uses to scale up or down.

To autoscale a ReplicaSet, you can run a command like the following:

cloud_user@d7bfd02ab81c:~/workspace$ kubectl autoscale rs nginx-rs --min=3 --max=10 --cpu-percent=80 -n rsdemo
horizontalpodautoscaler.autoscaling/nginx-rs autoscaled

cloud_user@d7bfd02ab81c:~/workspace$ kubectl get hpa -n rsdemo
NAME       REFERENCE             TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
nginx-rs   ReplicaSet/nginx-rs   10%/80%         3         10        3          20s

The autoscaling/v1 API supports scaling on CPU utilization only. Support for scaling on memory and custom metrics was introduced in the beta version autoscaling/v2beta2 and is available in the stable autoscaling/v2 API since Kubernetes v1.23. The fields introduced in the newer API version are preserved as annotations when working with autoscaling/v1.
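
To get a feel for how the autoscaler arrives at a replica count, the sketch below applies the ratio-based rule described in the Kubernetes documentation: the current replica count is multiplied by the ratio of current to target utilization, rounded up, and then clamped to the configured minimum and maximum. The function name and the tolerance default of 0.1 are assumptions for illustration, and real HPAs apply further rules (such as stabilization windows) that are left out here.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Estimate the replica count an HPA would ask for (simplified model)."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count unchanged.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Never go below --min or above --max.
    return max(min_replicas, min(max_replicas, desired))


# Values from the example above: 10% usage, 80% target, --min=3, --max=10.
print(desired_replicas(3, 10, 80, 3, 10))   # 3 (the raw result of 1 is raised to the minimum)
# Hypothetical load spike to 160% of the requested CPU.
print(desired_replicas(3, 160, 80, 3, 10))  # 6
```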

Deleting a ReplicaSet

We can delete the ReplicaSet with the kubectl delete command, either imperatively by name or by passing the manifest file of the ReplicaSet:

cloud_user@d7bfd02ab81c:~/workspace$ kubectl delete -f rs-simple.yaml  -n rsdemo
replicaset.apps "nginx-rs" deleted

cloud_user@d7bfd02ab81c:~/workspace$ kubectl  get all -n rsdemo
NAME                 READY   STATUS        RESTARTS   AGE
pod/nginx-rs-wq99z   1/1     Running       2          6h17m
pod/nginx-rs-b6gvd   0/1     Terminating   2          148m
pod/nginx-rs-ng6m4   0/1     Terminating   2          148m
pod/nginx-rs-7p8hl   0/1     Terminating   2          5h55m


cloud_user@d7bfd02ab81c:~/workspace$ kubectl  get all -n rsdemo
NAME                 READY   STATUS    RESTARTS   AGE
pod/nginx-rs-wq99z   1/1     Running   2          6h17m

If we want to delete just the ReplicaSet and keep its pods, we can use --cascade=orphan (the ReplicaSet below was re-created first):

cloud_user@d7bfd02ab81c:~/workspace$ kubectl  get all -n rsdemo
NAME                 READY   STATUS    RESTARTS   AGE
pod/nginx-rs-wq99z   1/1     Running   2          6h19m
pod/nginx-rs-cnfrp   1/1     Running   0          35s
pod/nginx-rs-l8252   1/1     Running   0          35s

NAME                       DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-rs   2         2         2       35s

cloud_user@d7bfd02ab81c:~/workspace$ kubectl delete -f rs-simple.yaml --cascade=orphan -n rsdemo
replicaset.apps "nginx-rs" deleted

cloud_user@d7bfd02ab81c:~/workspace$ kubectl  get all -n rsdemo
NAME                 READY   STATUS    RESTARTS   AGE
pod/nginx-rs-wq99z   1/1     Running   2          6h20m
pod/nginx-rs-cnfrp   1/1     Running   0          112s
pod/nginx-rs-l8252   1/1     Running   0          112s

After that, we need to delete the pods ourselves with kubectl delete pod. Since the namespace rsdemo was created specifically for this post, we'll simply delete the namespace, which deletes everything in it:

cloud_user@d7bfd02ab81c:~/workspace$ kubectl delete namespace rsdemo 
namespace "rsdemo" deleted
