Scenario #24
Cluster Management
K8s v1.25, self-managed, containerd

Pod Eviction Storm Due to DiskPressure

A sudden spike in image pulls caused all nodes to hit disk pressure, leading to massive pod evictions.

What Happened

A nightly batch job triggered a container image update across thousands of pods. Pulling the new images consumed all available space in /var/lib/containerd, which set the node condition DiskPressure=True and forced the kubelet to evict critical workloads.

Diagnosis Steps
  1. Ran kubectl describe node – found DiskPressure=True.
  2. Inspected /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/ to see what was consuming the disk.
  3. Checked the image pull logs (see the command sketch after this list).
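
A minimal command sketch for those checks (the node name is a placeholder, and the paths assume containerd's default root under /var/lib/containerd):

```bash
# 1. Confirm the node condition reported by the kubelet
kubectl describe node <node-name> | grep -A8 "Conditions:"

# Evictions show up as events with reason=Evicted
kubectl get events --all-namespaces --field-selector reason=Evicted

# 2. On the affected node, see what is eating the disk
df -h /var/lib/containerd
sudo du -sh /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/

# 3. Review images known to the runtime and the kubelet's eviction messages
crictl images
sudo journalctl -u kubelet --since "2 hours ago" | grep -iE "evict|disk pressure|pulling image"
```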
Root Cause

Image garbage collection was not configured, and too many simultaneous image pulls filled up the remaining disk space.

Fix/Workaround
• Pruned unused images on the affected nodes.
• Enabled container runtime image garbage collection (see the sketch below).
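
A hedged sketch of both steps. The prune flag assumes crictl v1.23 or newer on the node:

```bash
# On each node under DiskPressure: delete images not used by any container
crictl rmi --prune
```

Image garbage collection is driven by the kubelet, which asks the runtime to remove unused images once disk usage crosses a threshold. A fragment of the kubelet config (commonly /var/lib/kubelet/config.yaml) might look like the following; the percentages are illustrative values, not tuned recommendations:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
imageGCHighThresholdPercent: 80   # start image GC once disk usage exceeds 80%
imageGCLowThresholdPercent: 65    # keep freeing images until usage drops below 65%
```

The kubelet has to be restarted for a config-file change like this to take effect.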
Lessons Learned

DiskPressure can take down entire nodes without warning.

How to Avoid
  1. Set eviction thresholds properly in the kubelet configuration.
  2. Enforce rolling update limits (maxUnavailable) so image updates roll out gradually (see the config sketches after this list).
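
Sketches of both measures, with illustrative numbers (the thresholds and the Deployment below are hypothetical examples, not values from the incident). First, hard eviction thresholds in the kubelet config, so reclamation starts well before the image filesystem is completely full:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  nodefs.available: "10%"
  imagefs.available: "15%"
  nodefs.inodesFree: "5%"
```

Second, a Deployment that caps how much of the workload is replaced at once, so a nightly image update does not fan out into thousands of simultaneous pulls:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nightly-batch             # hypothetical workload name
spec:
  replicas: 50
  selector:
    matchLabels:
      app: nightly-batch
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 10%         # replace at most 10% of pods at a time
      maxSurge: 0                 # no extra pods pulling images in parallel
  template:
    metadata:
      labels:
        app: nightly-batch
    spec:
      containers:
        - name: worker
          image: registry.example.com/nightly-batch:latest   # placeholder image
```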