In Kubernetes, pod evictions are a common occurrence that can disrupt application availability. Understanding why pods are evicted is essential for maintaining a healthy cluster. This article provides insights into pod evictions, focusing on identifying the reasons behind them and exploring potential solutions.
- Evicted pods are terminated by Kubernetes, usually due to resource constraints like insufficient memory or disk space.
- To find the reason for eviction, use: kubectl describe pod <pod-id> | grep Message
- Evicted pods don't disappear immediately. They remain in the cluster until the number of terminated pods exceeds the terminated-pod-gc-threshold limit set in kube-controller-manager.
- Node pressure can lead to eviction. The kubelet monitors resources and evicts pods when thresholds are exceeded.
- Repeated evictions might occur if a deployment keeps trying to schedule pods on a node with persistent resource issues.
- Understanding eviction reasons is crucial for troubleshooting and ensuring application stability in Kubernetes.
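The threshold check mentioned in the node-pressure point can be pictured with a short sketch. This is a simplified model, not the kubelet's actual code: the function names parse_quantity and threshold_crossed are hypothetical, and the 8Gi node is an invented example. The threshold string follows the form of the kubelet's documented default hard threshold for memory on Linux (memory.available<100Mi).

```python
# Simplified model of an eviction threshold check (illustrative only).
UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def parse_quantity(value: str, capacity: int) -> int:
    """Convert '100Mi' or '10%' into bytes; percentages are relative to capacity."""
    if value.endswith("%"):
        return int(capacity * float(value[:-1]) / 100)
    for suffix, factor in UNITS.items():
        if value.endswith(suffix):
            return int(float(value[: -len(suffix)]) * factor)
    return int(value)  # plain number of bytes


def threshold_crossed(available: int, capacity: int, threshold: str) -> bool:
    """True when the available amount has dropped below the threshold."""
    return available < parse_quantity(threshold, capacity)


if __name__ == "__main__":
    node_memory = 8 * 1024**3   # hypothetical 8Gi node
    available = 80 * 1024**2    # only 80Mi still available
    print(threshold_crossed(available, node_memory, "100Mi"))  # prints True
```

With 80Mi available against a 100Mi threshold, the check reports that the threshold is crossed, which is the point at which the kubelet would start selecting pods to evict.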
The walkthrough below shows how to find the reason for a pod eviction with kubectl describe and grep, how to simulate memory pressure on a node with a stress pod defined in a YAML manifest, and how to watch the resulting eviction events with kubectl get events. It closes with notes on setting resource limits, ensuring sufficient node capacity, and using monitoring and alerting to catch resource constraints before they cause evictions.
1. Finding the Reason for Eviction:
# Replace <pod-id> with the name of the evicted pod
kubectl describe pod <pod-id> | grep Message
Example Output:
Message: The node was low on resource: memory.
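The same information is also available in structured form: an evicted pod carries the reason Evicted and an explanatory message in its status. The sketch below reads those fields from JSON shaped like the output of kubectl get pods -o json. The function name evicted_pods and the sample data are hypothetical and only serve as illustration.

```python
import json


def evicted_pods(pod_list_json: str) -> list[tuple[str, str]]:
    """Return (pod name, eviction message) pairs from a JSON pod list."""
    result = []
    for pod in json.loads(pod_list_json).get("items", []):
        status = pod.get("status", {})
        if status.get("reason") == "Evicted":
            result.append((pod["metadata"]["name"], status.get("message", "")))
    return result


if __name__ == "__main__":
    # Made-up sample shaped like `kubectl get pods -o json` output.
    sample = json.dumps({"items": [
        {"metadata": {"name": "app-1"},
         "status": {"phase": "Failed", "reason": "Evicted",
                    "message": "The node was low on resource: memory."}},
        {"metadata": {"name": "app-2"}, "status": {"phase": "Running"}},
    ]})
    for name, message in evicted_pods(sample):
        print(f"{name}: {message}")
```

To run this against a live cluster, the sample string could be replaced with real kubectl output, for example read from standard input.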
2. Simulating Node Pressure (Memory):
This manifest creates a pod that allocates about 1.5 GB of memory for 60 seconds. On a node with little free memory, this can push the node into memory pressure and lead to evictions of other pods on the same node.
# stress-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: stress-pod
spec:
  containers:
  - name: memory-hog
    image: polinux/stress
    resources:
      limits:
        memory: "2Gi"
    command: ["stress", "-m", "1", "--vm-bytes", "1500M", "--vm-keep", "-t", "60s"]
Apply the configuration:
kubectl apply -f stress-pod.yaml
Monitor Pod Evictions:
kubectl get events --field-selector type=Warning --watch
3. Analyzing Eviction Events:
Observe the output of the kubectl get events command. Look for events with the reason Evicted; their messages state why each pod was evicted.
Example Event:
LAST SEEN   TYPE      REASON    OBJECT      MESSAGE
3m42s       Warning   Evicted   pod/app-1   The node was low on resource: memory.
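When many events scroll by, counting them by message makes the dominant cause easier to see. The following sketch assumes JSON shaped like the output of kubectl get events -o json, where each item has a reason and a message field. The function name eviction_reasons and the sample events are hypothetical.

```python
import json
from collections import Counter


def eviction_reasons(events_json: str) -> Counter:
    """Count events with reason 'Evicted', grouped by their message."""
    counts = Counter()
    for event in json.loads(events_json).get("items", []):
        if event.get("reason") == "Evicted":
            counts[event.get("message", "").strip()] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample shaped like `kubectl get events -o json` output.
    sample = json.dumps({"items": [
        {"type": "Warning", "reason": "Evicted",
         "involvedObject": {"kind": "Pod", "name": "app-1"},
         "message": "The node was low on resource: memory."},
        {"type": "Warning", "reason": "Evicted",
         "involvedObject": {"kind": "Pod", "name": "app-2"},
         "message": "The node was low on resource: memory."},
        {"type": "Warning", "reason": "FailedScheduling",
         "involvedObject": {"kind": "Pod", "name": "app-3"},
         "message": "0/3 nodes are available."},
    ]})
    for message, count in eviction_reasons(sample).most_common():
        print(f"{count}x {message}")
```

Events of other kinds, such as the FailedScheduling entry in the sample, are ignored so that only evictions are counted.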
Important Notes:
- Resource Limits: Always set resource limits (CPU and memory) for your pods to prevent resource starvation and uncontrolled evictions.
- Node Capacity: Ensure your nodes have sufficient resources to accommodate the workloads you deploy.
- Monitoring and Alerting: Implement monitoring and alerting systems to proactively identify and address potential resource constraints and eviction issues.
This code example provides a starting point for understanding and troubleshooting pod evictions in Kubernetes. Remember to adapt the scripts and commands to your specific environment and requirements.
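- Evicted pods do not count toward your replica count: An evicted pod ends up in the Failed phase, so a Deployment's ReplicaSet creates a replacement for it, possibly on another node. The evicted pod object itself lingers until it is deleted or garbage-collected, which can clutter the output of kubectl get pods. A standalone pod that is not managed by a controller is not replaced, so its eviction does cause an outage if not addressed.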
- kubectl get pods can be misleading: A pod might show a status other than "Evicted" even if it was evicted. Always check the pod's events (kubectl describe pod <pod-name>) for a more accurate history.
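- Eviction thresholds are configurable: You can fine-tune the kubelet's eviction thresholds at the node level for signals such as available memory, disk space, inodes, and process IDs. CPU is not an eviction signal; CPU contention leads to throttling rather than eviction. This allows for customization based on your workload's needs and node resources.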
- Pod Priority: Using Pod Priority classes can influence eviction order. Lower-priority pods are more likely to be evicted first when resources are scarce.
- Resource requests vs. limits: While setting limits is crucial, setting appropriate resource requests helps the scheduler make better placement decisions, reducing the likelihood of evictions due to resource contention.
- Log aggregation is essential: Centralized logging is crucial for debugging evictions. Ensure your logs capture eviction events and relevant metrics for analysis.
- Consider using tools: Various tools can help monitor and analyze evictions, such as Prometheus, Grafana, and dedicated Kubernetes monitoring solutions.
- Proactive prevention is key: Don't wait for evictions to happen. Regularly monitor resource usage, set up alerts, and optimize your applications and deployments to minimize the risk of resource contention.
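The interplay between priority and requests in the notes above can be sketched in code. The Kubernetes node-pressure eviction documentation describes the kubelet ranking pods first by whether their usage exceeds their requests, then by Pod priority, then by usage relative to requests. The model below covers only that ordering, for memory; the PodUsage class, the eviction_order function, and the sample pods are all hypothetical.

```python
from dataclasses import dataclass

MI = 1024**2


@dataclass
class PodUsage:
    name: str
    priority: int        # value of the pod's PriorityClass
    memory_request: int  # bytes
    memory_usage: int    # bytes


def eviction_order(pods: list[PodUsage]) -> list[str]:
    """Rank pods from first-evicted to last-evicted (simplified model)."""
    def sort_key(pod: PodUsage):
        exceeds = pod.memory_usage > pod.memory_request
        overage = pod.memory_usage - pod.memory_request
        # Pods over their request come first, then lower priority,
        # then the largest usage above the request.
        return (not exceeds, pod.priority, -overage)
    return [pod.name for pod in sorted(pods, key=sort_key)]


if __name__ == "__main__":
    pods = [
        PodUsage("batch", 0, 100 * MI, 300 * MI),
        PodUsage("web", 1000, 200 * MI, 250 * MI),
        PodUsage("db", 1000, 500 * MI, 400 * MI),
        PodUsage("cache", 0, 100 * MI, 150 * MI),
    ]
    print(eviction_order(pods))  # prints ['batch', 'cache', 'web', 'db']
```

In this sample, db stays within its request and is ranked last, while the two low-priority pods that exceed their requests are ranked ahead of the higher-priority web pod.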
| Feature | Description |
| --- | --- |
| What is it? | Kubernetes terminating pods, usually due to resource constraints. |
| How to identify? | kubectl describe pod <pod-id> \| grep Message |
| Lifecycle | Evicted pods remain until reaching the terminated-pod-gc-threshold. |
| Common Cause | Node pressure exceeding resource thresholds monitored by kubelet. |
| Potential Issue | Repeated evictions due to persistent resource problems on a node. |
| Importance | Understanding eviction reasons is crucial for troubleshooting and application stability. |
In conclusion, Kubernetes pod evictions, while disruptive, are a mechanism for cluster stability. By understanding their causes, primarily resource constraints, and utilizing tools like kubectl for analysis, developers can mitigate their impact. Proactive measures like resource requests and limits, monitoring, and appropriate deployment strategies are essential for maintaining a healthy and resilient Kubernetes environment. Remember that evicted pods, though terminated, linger in the cluster until they are cleaned up, and that repeated evictions point to a resource problem that needs prompt action.