By default, containers and pods in a Kubernetes cluster are allocated unbounded resources: they can consume as much CPU and memory as is available on the node they run on. This is not a desirable scenario for cluster administrators. With resource quotas, admins can restrict the amount of CPU and memory available on a per-namespace basis. Within a namespace, the resources per container or pod can be controlled using limit ranges.
A Limit Range policy can also be used to enforce minimum and maximum storage requests per PersistentVolumeClaim.
If a pod or container does not define resource requests or limits, a limit range policy can also supply default allocations.
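As a sketch of the PersistentVolumeClaim case, a limit range item of type PersistentVolumeClaim can bound the storage a claim may request (the name storage-lr-demo and the size bounds here are illustrative assumptions, not values from this walkthrough):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: storage-lr-demo   # illustrative name
spec:
  limits:
  - type: PersistentVolumeClaim
    min:
      storage: 1Gi    # smallest claim the namespace accepts
    max:
      storage: 10Gi   # largest claim the namespace accepts
```

A PVC in this namespace requesting, say, 20Gi would be rejected at admission time.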
Create a Limit Range for a namespace
Since a Limit Range policy is scoped at the namespace level, we need to create a namespace first. If no namespace is specified, the command is executed against the default namespace. We can create a new namespace using the kubectl create namespace command:
cloud_user@d7bfd02ab81c:~/workspace$ kubectl create namespace lrdemo
namespace/lrdemo created
To create a limit range, we need to describe the policy in a manifest. For example, we can define a limit range policy as below:
apiVersion: v1
kind: LimitRange
metadata:
  name: lr-demo
spec:
  limits:
  - type: Container
    max:
      memory: 1Gi
      cpu: 1
    min:
      memory: 200Mi
      cpu: "200m"
In the above manifest, min defines the minimum and max the maximum amount of resources that can be allocated to a resource of type Container.
It is preferable to define the policy at the container level, as different containers have different requirements depending on the underlying application workload. The total resource consumption of a pod is the sum of the resource consumption of the containers within it.
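If you do want to cap a pod as a whole rather than each container, a limit range item of type Pod bounds the sum across all of a pod's containers. A minimal sketch (the name and values are illustrative assumptions; the Pod type supports min/max but not defaults):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: pod-level-lr   # illustrative name
spec:
  limits:
  - type: Pod          # applies to the sum over all containers in a pod
    max:
      memory: 2Gi
      cpu: 2
```

A pod whose containers together request more than 2 CPU or 2Gi of memory would be rejected, even if each individual container stays within any per-container bounds.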
Let's go ahead and create the object with the kubectl apply command:
cloud_user@d7bfd02ab81c:~/workspace$ kubectl apply -f limit-range.yaml -n lrdemo
limitrange/lr-demo created

cloud_user@d7bfd02ab81c:~/workspace$ kubectl get limitrange lr-demo -n lrdemo
NAME      CREATED AT
lr-demo   2021-05-01T22:18:14Z

cloud_user@d7bfd02ab81c:~/workspace$ kubectl describe limitrange lr-demo -n lrdemo
Name:       lr-demo
Namespace:  lrdemo
Type        Resource  Min    Max  Default Request  Default Limit  Max Limit/Request Ratio
----        --------  ---    ---  ---------------  -------------  -----------------------
Container   cpu       200m   1    1                1              -
Container   memory    200Mi  1Gi  1Gi              1Gi            -
As you can see above, even though we did not specify default values for the resource type Container, they were automatically set to the maximum available values. This is again not a desired behavior. You would often want the default to sit somewhere between min and max, so that the namespace can accommodate more than one resource of its type.
To set default values for the allocation of CPU and memory, we can use the default property:
apiVersion: v1
kind: LimitRange
metadata:
  name: lr-demo
spec:
  limits:
  - type: Container
    max:
      memory: 1Gi
      cpu: 1
    min:
      memory: 200Mi
      cpu: "200m"
    default:
      memory: 250Mi
      cpu: "250m"
After this, save and apply the configuration:
cloud_user@d7bfd02ab81c:~/workspace$ kubectl apply -f limit-range.yaml -n lrdemo
limitrange/lr-demo configured

cloud_user@d7bfd02ab81c:~/workspace$ kubectl describe limitrange lr-demo -n lrdemo
Name:       lr-demo
Namespace:  lrdemo
Type        Resource  Min    Max  Default Request  Default Limit  Max Limit/Request Ratio
----        --------  ---    ---  ---------------  -------------  -----------------------
Container   cpu       200m   1    250m             250m           -
Container   memory    200Mi  1Gi  250Mi            250Mi          -
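Note that default sets the default limit, and the default request follows it unless set separately. Kubernetes also supports a defaultRequest field in a LimitRange item for when you want default requests lower than default limits. A sketch (the 200m/200Mi request values are illustrative assumptions):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: lr-demo
spec:
  limits:
  - type: Container
    max:
      memory: 1Gi
      cpu: 1
    min:
      memory: 200Mi
      cpu: "200m"
    default:            # default *limit* for containers without one
      memory: 250Mi
      cpu: "250m"
    defaultRequest:     # default *request*, set independently of the limit
      memory: 200Mi
      cpu: "200m"
```

With this, a container that declares nothing gets a 200m/200Mi request and a 250m/250Mi limit, leaving the scheduler a lower bar than the enforced ceiling.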
Create a Pod without specifying resource constraints
If we create a pod in a namespace where a limit range is defined, and we have not specified any resource constraints in the pod manifest, it will be allocated resources according to the default values in the policy. For example, consider the pod manifest below:
apiVersion: v1
kind: Pod
metadata:
  name: pod-lr-demo
spec:
  restartPolicy: Never
  containers:
  - name: ui-demo
    image: docker.io/mohitgoyal/demo-ui01:1.0.0
    ports:
    - name: http
      containerPort: 80
      protocol: TCP
Now, create the pod and view the limits applicable to it:
cloud_user@d7bfd02ab81c:~/workspace$ kubectl apply -f pod-limits.yaml -n lrdemo
pod/pod-lr-demo created

cloud_user@d7bfd02ab81c:~/workspace$ kubectl describe pod pod-lr-demo -n lrdemo
Name:         pod-lr-demo
Namespace:    lrdemo
Priority:     0
Node:         k3d-worker-0/172.18.0.3
Start Time:   Sat, 01 May 2021 22:41:31 +0000
Labels:       <none>
Annotations:  kubernetes.io/limit-ranger:
                LimitRanger plugin set: cpu, memory request for container ui-demo; cpu, memory limit for container ui-demo
Status:       Running
IP:           10.42.1.11
IPs:
  IP:  10.42.1.11
Containers:
  ui-demo:
    Container ID:   containerd://de28174f20e0f098dfe937fc1446da0eadbb0c1db7d7b1f5c37c1ae2e2a4c0f4
    Image:          docker.io/mohitgoyal/demo-ui01:1.0.0
    Image ID:       docker.io/mohitgoyal/demo-ui01@sha256:29fcddd0b6b1a6e70b1a40b19834ef783d27622bb92461a42686dac1d81dd514
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sat, 01 May 2021 22:41:39 +0000
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     250m
      memory:  250Mi
    Requests:
      cpu:        250m
      memory:     250Mi
    Environment:  <none>
Create a Pod with resource constraints defined
To specify a request for a container, include the resources.requests field in the container's manifest. To specify a limit, include resources.limits:
apiVersion: v1
kind: Pod
metadata:
  name: pod-lr-demo
spec:
  restartPolicy: Never
  containers:
  - name: ui-demo
    image: docker.io/mohitgoyal/demo-ui01:1.0.0
    ports:
    - name: http
      containerPort: 80
      protocol: TCP
    resources:
      requests:
        cpu: "250m"
        memory: "300Mi"
      limits:
        cpu: "500m"
        memory: "500Mi"
Note that requests specifies the minimum amount of resources that must be allocated to the container. The minimum amount of resources for the pod is the sum of the minimums of all the containers within it. The Kubernetes scheduler uses this value when deciding which node to place the pod on.
While requests specifies a minimum, limits specifies the maximum. If we do not define limits, the container is free to use resources up to the max defined in the limit range policy, or up to the maximum amount of resources available on the node if no limit range policy is defined.
Let's apply the pod manifest and check the allocated resources with kubectl describe:
cloud_user@d7bfd02ab81c:~/workspace$ kubectl apply -f pod-limits.yaml -n lrdemo
pod/pod-lr-demo created

cloud_user@d7bfd02ab81c:~/workspace$ kubectl describe pod pod-lr-demo -n lrdemo
Name:         pod-lr-demo
Namespace:    lrdemo
Priority:     0
Node:         k3d-worker-0/172.18.0.3
Start Time:   Sun, 02 May 2021 06:57:12 +0000
Labels:       <none>
Annotations:  <none>
Status:       Running
IP:           10.42.1.13
IPs:
  IP:  10.42.1.13
Containers:
  ui-demo:
    Container ID:   containerd://47396b9ce2c6a796ffb1ac0e291d1018abee484c86513b2ae5faefc922648a35
    Image:          docker.io/mohitgoyal/demo-ui01:1.0.0
    Image ID:       docker.io/mohitgoyal/demo-ui01@sha256:29fcddd0b6b1a6e70b1a40b19834ef783d27622bb92461a42686dac1d81dd514
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sun, 02 May 2021 06:57:13 +0000
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     500m
      memory:  500Mi
    Requests:
      cpu:        250m
      memory:     300Mi
    Environment:  <none>
Crossing the limits defined
As we discussed above, while requests specifies a minimum, limits specifies the maximum; without limits, a container can use resources up to the max defined in the limit range policy, or up to whatever the node has available if no policy is defined. If a container tries to use more memory than its limits allow, it becomes a target for termination by the kubelet. (Exceeding the CPU limit, by contrast, results in throttling rather than termination.)
Let's create a pod with the manifest below:
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo-2
spec:
  restartPolicy: Always
  containers:
  - name: memory-demo-2-ctr
    image: polinux/stress
    resources:
      requests:
        memory: "250Mi"
      limits:
        memory: "500Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "550M", "--vm-hang", "1"]
If we check the pod status after some time, it will be marked as OOMKilled:
cloud_user@d7bfd02ab81c:~/workspace$ kubectl apply -f pod-oom.yaml -n lrdemo
pod/memory-demo-2 created

cloud_user@d7bfd02ab81c:~/workspace$ kubectl get pod memory-demo-2 -n lrdemo
NAME            READY   STATUS      RESTARTS   AGE
memory-demo-2   0/1     OOMKilled   2          35s
The container can be restarted based on the restartPolicy defined for it. The default restart policy is Always, so the kubelet will try to restart it. However, since it will again cross the memory limit, it will be killed again, and so on. After some time, the kubelet will mark the status as BackOff and wait before attempting it all over again:
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  71s                default-scheduler  Successfully assigned lrdemo/memory-demo-2 to k3d-worker-0
  Normal   Pulled     69s                kubelet            Successfully pulled image "polinux/stress" in 1.761727056s
  Normal   Pulled     67s                kubelet            Successfully pulled image "polinux/stress" in 1.774770704s
  Normal   Pulled     49s                kubelet            Successfully pulled image "polinux/stress" in 1.772412722s
  Normal   Pulling    22s (x4 over 71s)  kubelet            Pulling image "polinux/stress"
  Normal   Pulled     20s                kubelet            Successfully pulled image "polinux/stress" in 1.766717291s
  Normal   Created    20s (x4 over 69s)  kubelet            Created container memory-demo-2-ctr
  Normal   Started    20s (x4 over 69s)  kubelet            Started container memory-demo-2-ctr
  Warning  BackOff    6s (x6 over 65s)   kubelet            Back-off restarting failed container
If we instead define restartPolicy as Never, the kubelet will not attempt to restart the container if it fails. For example, consider the manifest below:
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo-2
spec:
  restartPolicy: Never
  containers:
  - name: memory-demo-2-ctr
    image: polinux/stress
    resources:
      requests:
        memory: "250Mi"
      limits:
        memory: "500Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "550M", "--vm-hang", "1"]
Events associated with above pod would be:
Events:
  Type    Reason     Age  From               Message
  ----    ------     ---  ----               -------
  Normal  Scheduled  70s  default-scheduler  Successfully assigned lrdemo/memory-demo-2 to k3d-worker-0
  Normal  Pulling    69s  kubelet            Pulling image "polinux/stress"
  Normal  Pulled     67s  kubelet            Successfully pulled image "polinux/stress" in 1.78066592s
  Normal  Created    67s  kubelet            Created container memory-demo-2-ctr
  Normal  Started    67s  kubelet            Started container memory-demo-2-ctr
Specifying Requests that are too big
The memory request for a pod is the sum of the memory requests of all the containers in the pod. Likewise, the memory limit for the pod is the sum of the limits of all its containers. The same applies to CPU.
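To make the summing concrete, here is a sketch of a two-container pod (the names, the sidecar, and its request values are illustrative assumptions; the images are the ones used elsewhere in this walkthrough):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-ctr-demo      # illustrative name
spec:
  containers:
  - name: app
    image: docker.io/mohitgoyal/demo-ui01:1.0.0
    resources:
      requests:
        cpu: "250m"
        memory: "300Mi"
  - name: sidecar          # hypothetical second container
    image: polinux/stress
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "100M", "--vm-hang", "1"]
    resources:
      requests:
        cpu: "100m"
        memory: "200Mi"
```

For scheduling purposes this pod requests 350m of CPU and 500Mi of memory: the sum of the two containers' requests.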
Pod scheduling is based on requests: a pod is scheduled onto a node only if the node has enough available resources to satisfy the pod's resource requests, and only if those requests fall within the limit range defined for the namespace.
Let's consider the pod manifest below, where we request more resources than the limit range allows:
apiVersion: v1
kind: Pod
metadata:
  name: pod-lr-demo
spec:
  restartPolicy: Never
  containers:
  - name: ui-demo
    image: docker.io/mohitgoyal/demo-ui01:1.0.0
    ports:
    - name: http
      containerPort: 80
      protocol: TCP
    resources:
      requests:
        cpu: 2
        memory: "1500Mi"
      limits:
        cpu: 2
        memory: "2500Mi"
If we try to create the pod, the API server will deny the request, as it crosses the defined thresholds:
cloud_user@d7bfd02ab81c:~/workspace$ kubectl apply -f pod-oor.yaml -n lrdemo
Error from server (Forbidden): error when creating "pod-oor.yaml": pods "pod-lr-demo" is forbidden: [maximum cpu usage per Container is 1, but limit is 2, maximum memory usage per Container is 1Gi, but limit is 2500Mi]
This is an easy situation to avoid, as the error message is quite helpful. In a more likely scenario, several users or teams share a cluster and are segregated into different namespaces. To divide the resources fairly among the teams, an admin can use the ResourceQuota object.
A resource quota, defined by a ResourceQuota object, provides constraints that limit aggregate resource consumption per namespace. It can limit the number of objects that can be created in a namespace by type, as well as the total amount of compute resources that may be consumed by resources in that namespace.
In practice, you are more likely to cross the threshold defined for the namespace than the per-container limits of the policy.