Scenario #52
Cluster Management
K8s v1.21, GKE
Node Draining Delay During Maintenance
Node draining took an unusually long time during maintenance because overly strict PodDisruptionBudgets blocked pod evictions.
What Happened
During a scheduled node maintenance, draining took longer than expected because pod evictions were blocked by PodDisruptionBudgets that left no allowed disruptions.
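For context, a drain for this kind of maintenance typically looks like the following; the node name and flags are illustrative assumptions, not taken from the incident.

```sh
# Cordon the node and evict its pods ahead of maintenance.
# The node name is a placeholder; --ignore-daemonsets is needed
# on nodes that run DaemonSet-managed pods.
kubectl drain gke-prod-pool-1-abcd1234 \
  --ignore-daemonsets \
  --timeout=10m
```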
Diagnosis Steps
1. Checked kubectl describe output for the affected pods and identified PodDisruptionBudget violations (inspection commands are sketched below).
2. Observed that some pods had hard disruption constraints due to attached storage.
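A minimal set of inspection commands for this diagnosis might look like the following; the names in angle brackets are placeholders.

```sh
# ALLOWED DISRUPTIONS of 0 means every eviction request is
# rejected and the drain keeps retrying.
kubectl get pdb --all-namespaces

# See which selector and threshold a specific budget applies.
kubectl describe pdb <pdb-name> -n <namespace>

# Eviction failures show up as events on the stuck pod.
kubectl describe pod <pod-name> -n <namespace>
```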
Root Cause
PodDisruptionBudget was too strict, preventing pods from being evicted quickly.
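As an illustration (not the incident's actual manifest), a budget like this blocks draining entirely when the workload runs exactly three replicas:

```yaml
# Hypothetical manifest: with minAvailable equal to the replica
# count (3), allowed disruptions is always 0, so every eviction
# during a drain is rejected.
apiVersion: policy/v1   # policy/v1 is GA as of Kubernetes v1.21
kind: PodDisruptionBudget
metadata:
  name: app-pdb
spec:
  minAvailable: 3
  selector:
    matchLabels:
      app: my-app
```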
Fix/Workaround
• Relaxed the PodDisruptionBudget to allow more eviction headroom (see the example below).
• Manually evicted the blocking pods to speed up the node drain.
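A sketch of the adjusted budget, reusing the hypothetical app-pdb from above; maxUnavailable gives the drain room to evict one pod at a time instead of blocking entirely:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: app-pdb
spec:
  maxUnavailable: 1   # always leaves headroom for one eviction
  selector:
    matchLabels:
      app: my-app
```

For the manual eviction, note that PodDisruptionBudgets are enforced only by the eviction API, so a direct kubectl delete pod bypasses the budget; this is best reserved for maintenance emergencies, since it skips the application's disruption safeguards.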
Lessons Learned
PodDisruptionBudgets should reflect an application's actual disruption tolerance rather than defaulting to zero allowed disruptions.
How to Avoid
1. Set reasonable disruption budgets for critical applications.
2. Test disruption scenarios during maintenance windows to surface issues early (a dry-run rehearsal is sketched below).
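One way to rehearse a drain before the maintenance window, again with a placeholder node name:

```sh
# List the pods a drain would evict without actually evicting them.
kubectl drain gke-prod-pool-1-abcd1234 --ignore-daemonsets --dry-run=client

# Confirm every budget leaves eviction headroom before maintenance:
# ALLOWED DISRUPTIONS should be >= 1 for workloads on the node.
kubectl get pdb --all-namespaces
```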