Scenario #24
Cluster Management
K8s v1.25, self-managed, containerd
Pod Eviction Storm Due to DiskPressure
A sudden spike in image pulls caused all nodes to hit disk pressure, leading to massive pod evictions.
What Happened
A nightly batch job triggered a container image update across thousands of pods. Pulling these images consumed all available space in /var/lib/containerd, which set the DiskPressure condition on the nodes and forced the eviction of critical workloads.
Diagnosis Steps
1. Used kubectl describe node – found DiskPressure=True.
2. Inspected /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/.
3. Checked image pull logs.
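The first diagnosis step can be automated across a whole cluster rather than checked node by node. The sketch below assumes the input has the shape of `kubectl get nodes -o json` output; the function name and the sample node names are hypothetical and only illustrate the idea.

```python
def nodes_under_disk_pressure(node_list: dict) -> list:
    """Return names of nodes whose DiskPressure condition has status "True"."""
    affected = []
    for node in node_list.get("items", []):
        for cond in node.get("status", {}).get("conditions", []):
            if cond.get("type") == "DiskPressure" and cond.get("status") == "True":
                affected.append(node["metadata"]["name"])
    return affected


# Hypothetical sample, trimmed to the fields the function reads.
sample = {
    "items": [
        {
            "metadata": {"name": "worker-1"},
            "status": {"conditions": [
                {"type": "DiskPressure", "status": "True"},
                {"type": "Ready", "status": "True"},
            ]},
        },
        {
            "metadata": {"name": "worker-2"},
            "status": {"conditions": [
                {"type": "DiskPressure", "status": "False"},
                {"type": "Ready", "status": "True"},
            ]},
        },
    ]
}

print(nodes_under_disk_pressure(sample))  # ['worker-1']
```

Against a live cluster, the same function could be fed `json.load(sys.stdin)` with the kubectl output piped in.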
Root Cause
With no image garbage collection in place, too many simultaneous image pulls filled the disk.
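For context, the kubelet's image garbage collection is driven by two settings, imageGCHighThresholdPercent and imageGCLowThresholdPercent (85 and 80 by default). The sketch below is a simplified model of that arithmetic, not the kubelet's actual code: once disk usage crosses the high threshold, unused images are deleted until usage falls back to the low threshold. The function name is hypothetical.

```python
GiB = 2**30


def bytes_to_free(capacity: int, used: int, high_pct: int = 85, low_pct: int = 80) -> int:
    """Bytes image GC would try to reclaim in this simplified model.

    Returns 0 while usage is below the high threshold.
    """
    usage_pct = used * 100 // capacity
    if usage_pct < high_pct:
        return 0
    # Target usage after GC is the low threshold.
    target_used = capacity * low_pct // 100
    return used - target_used


print(bytes_to_free(100 * GiB, 90 * GiB) // GiB)  # 10
print(bytes_to_free(100 * GiB, 50 * GiB))         # 0
```

Image GC only removes images that no container is using, so it cannot help when the images being pulled at the same time already exceed the free space. That is why pull concurrency matters as well.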
Fix/Workaround
• Pruned unused images.
• Enabled container runtime garbage collection.
Lessons Learned
DiskPressure can effectively take down entire nodes with little warning: once the condition is set, the kubelet starts evicting pods to reclaim disk.
How to Avoid
1. Set appropriate eviction thresholds in the kubelet configuration.
2. Enforce rolling update limits (maxUnavailable) so that only a bounded number of pods pull new images at once.
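Both steps above can be sanity-checked with a little arithmetic. The sketch below builds an illustrative KubeletConfiguration fragment (the kubelet accepts its config file as JSON or YAML) and checks that image GC would start at a lower disk usage than hard eviction. It then estimates how many new pods a Deployment rolling update may start at once, assuming percentage values for maxUnavailable (rounded down) and maxSurge (rounded up). The threshold values and both helper names are assumptions for illustration, not settings taken from this incident.

```python
import json

# Illustrative values only; tune for the actual disk layout.
kubelet_config = {
    "apiVersion": "kubelet.config.k8s.io/v1beta1",
    "kind": "KubeletConfiguration",
    "evictionHard": {"nodefs.available": "10%", "imagefs.available": "15%"},
    "imageGCHighThresholdPercent": 75,
    "imageGCLowThresholdPercent": 65,
}


def gc_starts_before_eviction(cfg: dict) -> bool:
    """True if image GC triggers at lower disk usage than hard eviction on imagefs."""
    signal = cfg["evictionHard"]["imagefs.available"]
    if not signal.endswith("%"):
        raise ValueError("this sketch only handles percentage thresholds")
    eviction_usage_pct = 100 - float(signal.rstrip("%"))
    return cfg["imageGCHighThresholdPercent"] < eviction_usage_pct


def max_new_pods_in_flight(replicas: int, max_unavailable_pct: int, max_surge_pct: int) -> int:
    """Upper bound on new pods a Deployment rolling update may start at once."""
    unavailable = replicas * max_unavailable_pct // 100  # percentage rounds down
    surge = -(-replicas * max_surge_pct // 100)          # percentage rounds up
    return unavailable + surge


print(json.dumps(kubelet_config, indent=2))
print(gc_starts_before_eviction(kubelet_config))  # True
print(max_new_pods_in_flight(1000, 25, 25))       # 500
print(max_new_pods_in_flight(1000, 5, 0))         # 50
```

With the Deployment defaults of 25% for both settings, a 1000-replica rollout may start up to 500 new pods at once. Tightening the limits to 5% and 0% caps that at 50, which spreads image pulls over time.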