Resource requests and limits are among the most neglected configuration settings in Kubernetes. But if you want to manage your Kubernetes cluster properly, you must devote time to understanding and observing the memory and CPU requirements of the applications running inside containers in your pods.
In a Kubernetes cluster, each node is like a virtual machine with a fixed number of CPU cores, a fixed amount of memory, some flavour of Linux installed as the OS, a kernel version, an architecture, etc. In short, it's just another machine, and it can get overloaded if the applications or services running on it demand more resources than the node has.
A picture is worth a thousand words, so let's have a picture to understand the basics of a Kubernetes cluster.
The diagram above is a very high-level picture of a Kubernetes cluster. As you can see, we have 3 nodes in this cluster, which means 3 virtual machines. All the nodes are on the same network but have different IP addresses. When we start a pod inside a namespace, the Kubernetes scheduler assigns the pod to a node that has enough free memory, CPU, and other resources to meet the requirements of the app.
In the diagram, we have shown the perfect scenario, where the 3 replicas of each of the applications App1, App2, and App3 run on different nodes. But in a real-life setup, it is possible that two replicas of App3 land on the same node, or two replicas of App2 land on the same node, etc.
Each small purple box labelled App1, App2, or App3 represents a pod, inside which the application containers run.
So if App1 represents an application, then it is running inside a container, which is inside a pod. A pod can have multiple containers running inside it, although it is recommended that each pod runs a single container.
Now that we have a basic understanding of Kubernetes architecture, let's see how we can manage resource allocation, such as CPU and memory, for applications and services running in Kubernetes pods, and what the effect of not specifying these resources correctly will be.
Requests and limits are the two mechanisms that Kubernetes uses to manage the CPU and memory requirements of pods. There are other resources as well, but CPU and memory are the main ones that must be configured properly for containers running in pods.
These resource requirements are configured at the container level, but because it is recommended that a pod runs a single container (although it can run multiple containers), we will talk in terms of pod resource requirements. If multiple containers run inside a pod, the resource requirement of the pod is the combined resource requirements of all the containers running inside it.
A request specifies the amount of a resource that Kubernetes reserves for the pod: the scheduler only places the pod on a node that has at least this much unreserved capacity, so the pod is assured of this much when it starts. A limit specifies the maximum amount of the resource the pod may use. For CPU, a container that reaches its limit is throttled; for memory, a container that exceeds its limit is terminated.
When we start a container in a pod using a YAML file with request and limit values specified, the Kubernetes scheduler looks for a node that has the resources the pod requests and assigns the pod to that node. If no node has enough free resources, the pod goes into the Pending state and waits for resources to become free.
If cluster autoscaling is enabled, a new node is added to the cluster when pods cannot be scheduled for lack of resources. If we do not configure requests and limits, and the namespace has no defaults configured, the container gets no request and no limit, which in most Kubernetes services is effectively a request of 0 for both memory and CPU.
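One way to avoid containers running with no request or limit at all is to give the namespace its own defaults with a LimitRange object. The following is a minimal sketch; the object name, the namespace name, and the chosen values are assumptions for illustration, not recommendations:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-resources      # hypothetical name
  namespace: my-namespace      # hypothetical namespace
spec:
  limits:
    - type: Container
      defaultRequest:          # used when a container sets no request
        cpu: 250m
        memory: 128Mi
      default:                 # used when a container sets no limit
        cpu: 500m
        memory: 256Mi
```

With something like this in place, a container created in that namespace without a resources section would receive these values instead of running with none.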
A lack of resources can lead to your pods being evicted frequently, most commonly because the node runs low on memory. You can use the kubectl describe command to find the reason for an Evicted pod in your Kubernetes cluster. Here is the command:
kubectl describe pod POD_NAME -n NAMESPACE_NAME
If you see something like this in the output, then it is a resource issue:
The node was low on resource: memory. Container XYZ was using 711252Ki, which exceeds its request of 0.
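A similar message is shown when the node runs low on other resources, such as ephemeral storage. A shortage of CPU does not evict pods; the containers are throttled and run slower instead.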
CPU resources refer to the number of CPU cores the pod requires to start and run smoothly. They are measured in CPU units, usually written in millicores (millicpu) on most cloud platforms. So a value of 2 means that your container requires 2 CPU cores, which is equivalent to 2000m in millicores.
We can specify fractional values too, like 0.5, which means half a CPU core is required, or 250m, which means a quarter of a CPU core is required.
One important point to note here is that if you request more CPU than the core count of your biggest node, your pod will never be scheduled; it will always stay in the Pending state. For example, if you have a pod that needs 4 cores but your Kubernetes cluster is made up of dual-core VMs, your pod will never be scheduled!
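The two notations can be mixed freely. As a small illustration (the pod name, container name, and image are assumptions picked for the example), the following pod uses the fractional form for the request and the whole-core form for the limit:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-units-demo   # hypothetical name
spec:
  containers:
    - name: app          # hypothetical container
      image: nginx
      resources:
        requests:
          cpu: "0.5"     # half a core, same as 500m
        limits:
          cpu: "2"       # two cores, same as 2000m
```

Writing small values in millicores (500m rather than 0.5) tends to be easier to read and compare across manifests.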
Below we have shown the part of the YAML which specifies the CPU limit and request values:
resources:
  limits:
    cpu: 1000m
  requests:
    cpu: 500m
The above configuration means the container is assured of half a CPU core, while it can never use more than 1 whole CPU core.
Just like CPU, we can specify the memory requirements of a container. Memory resources are defined in bytes. We can use a plain integer, a value with a decimal suffix such as P, T, G, M, or k, or a value with a binary two-letter suffix such as Pi, Ti, Gi, Mi, or Ki. Usually we give the value in mebibytes (Mi). Note that a mebibyte is not the same as a megabyte: 1Mi is 1,048,576 bytes, whereas 1M is 1,000,000 bytes.
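To make the difference between decimal and binary suffixes concrete, here is a small fragment that uses one of each. The values are chosen only to show the arithmetic, not as a suggested configuration:

```yaml
resources:
  requests:
    memory: 500M     # decimal: 500 * 1000 * 1000 = 500,000,000 bytes
  limits:
    memory: 500Mi    # binary: 500 * 1024 * 1024 = 524,288,000 bytes
```

The gap grows with the size of the value, so it is worth using one family of suffixes consistently across your manifests.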
Just like CPU, if you put in a memory request that is larger than the amount of memory on your nodes, the pod will never be scheduled.
Here's a sample YAML configuration:
resources:
  limits:
    memory: 1Gi
  requests:
    memory: 500Mi
Consider a pod which has two containers running in it. Here is the configuration YAML for them, with memory and CPU requests and limits defined.
apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: db
    image: mysql
    env:
    - name: MYSQL_ROOT_PASSWORD
      value: "password"
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
  - name: wp
    image: wordpress
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
The Pod above has two containers, as we can see in the YAML. Each container has a request of 0.25 CPU and 64MiB of memory, and each container has a limit of 0.5 CPU and 128MiB of memory. Hence the Pod has a request of 0.5 CPU and 128MiB of memory, and a limit of 1 CPU and 256MiB of memory. So when this pod is run, at least 0.5 CPU and 128MiB of memory must be available on a single node for the pod to be assigned to it.
So in this tutorial, we learned about two important configuration parameters, requests and limits, which should not be ignored, as they are the key to stable pods.