Scenario #85
Cluster Management
K8s v1.22, Azure AKS

Failed Node Drain Due to In-Use Pods

A node drain could not complete because some pods on the node were still busy and did not terminate in time.

What Happened

When attempting to drain a node, the operation failed because some pods were still doing work or were still within their termination grace periods.

Diagnosis Steps
  1. Ran kubectl describe node on the affected node and checked the status of the pod evictions.
  2. Identified pods that were in the middle of long-running processes or had insufficient termination grace periods.
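
The second diagnosis step can be sketched in Python. This is a minimal illustration, not the tooling used in the incident: it assumes the pod list was exported with kubectl get pods --all-namespaces -o json, and the function names, node name, pod names, and the 60-second threshold are all hypothetical.

```python
def pods_on_node(pod_list, node_name):
    """Return (namespace, name, grace_seconds) for every pod scheduled on node_name.

    pod_list is the parsed JSON from `kubectl get pods --all-namespaces -o json`.
    """
    rows = []
    for pod in pod_list.get("items", []):
        spec = pod.get("spec", {})
        if spec.get("nodeName") != node_name:
            continue
        meta = pod.get("metadata", {})
        # Kubernetes uses 30 seconds when terminationGracePeriodSeconds is unset.
        grace = spec.get("terminationGracePeriodSeconds", 30)
        rows.append((meta.get("namespace", "default"), meta.get("name", ""), grace))
    return rows


def flag_short_grace(rows, min_seconds):
    """Return the rows whose grace period is shorter than min_seconds."""
    return [row for row in rows if row[2] < min_seconds]


# Made-up sample data standing in for real kubectl output.
sample_pods = {
    "items": [
        {"metadata": {"namespace": "batch", "name": "report-job-1"},
         "spec": {"nodeName": "node-a", "terminationGracePeriodSeconds": 10}},
        {"metadata": {"namespace": "web", "name": "frontend-1"},
         "spec": {"nodeName": "node-a", "terminationGracePeriodSeconds": 120}},
        {"metadata": {"namespace": "web", "name": "frontend-2"},
         "spec": {"nodeName": "node-b"}},
    ]
}

rows = pods_on_node(sample_pods, "node-a")
for namespace, name, grace in flag_short_grace(rows, min_seconds=60):
    print(f"{namespace}/{name}: grace period {grace}s is below 60s")
```

The right threshold depends on how long each workload needs to finish or checkpoint its work, so in practice it would be chosen per workload rather than as one global value.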
Root Cause

Pods running long tasks, or configured with unsuitable termination grace periods, did not terminate promptly, so the drain hung while waiting for them.

Fix/Workaround
• Increased termination grace periods for the affected pods.
• Forced the node drain operation after ensuring that the pods could safely terminate.
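
As a rough sketch of the two fixes above, the Python helpers below build a patch that raises terminationGracePeriodSeconds on a Deployment's pod template and assemble a kubectl drain command line. The write-up does not say exactly which commands or flags were used, so the flag selection, the helper names, and the default timeout are assumptions. Note that kubectl drain --force only allows the drain to continue past pods that are not managed by a controller (those pods are deleted and not recreated), which may or may not be what "forced" meant in this incident.

```python
import json


def grace_period_patch(seconds):
    """Build a patch body that sets the grace period on a Deployment's pod template.

    Intended for: kubectl patch deployment <name> -p '<patch>'
    """
    patch = {"spec": {"template": {"spec": {"terminationGracePeriodSeconds": seconds}}}}
    return json.dumps(patch)


def build_drain_command(node_name, timeout_seconds=600, grace_period_seconds=None, force=False):
    """Assemble a kubectl drain argument list (hypothetical helper, assumed defaults)."""
    cmd = [
        "kubectl", "drain", node_name,
        "--ignore-daemonsets",           # DaemonSet-managed pods cannot be evicted by drain
        "--delete-emptydir-data",        # allow evicting pods that use emptyDir volumes
        f"--timeout={timeout_seconds}s", # give up instead of waiting indefinitely
    ]
    if grace_period_seconds is not None:
        # Overrides each pod's own terminationGracePeriodSeconds for this drain.
        cmd.append(f"--grace-period={grace_period_seconds}")
    if force:
        # Also removes pods that no controller will recreate; use with care.
        cmd.append("--force")
    return cmd


print(grace_period_patch(120))
print(" ".join(build_drain_command("node-a", grace_period_seconds=120)))
```

Passing the returned list to subprocess.run would execute the drain. Setting a finite --timeout makes a stuck drain fail visibly rather than hang.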
Lessons Learned

Ensure that pods with long-running tasks have adequate termination grace periods.

How to Avoid
  1. Configure appropriate termination grace periods for all pods.
  2. Monitor node draining and ensure pods can gracefully shut down.
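
A grace period only helps if the application reacts to the SIGTERM that the kubelet sends when a pod is evicted. The following Python sketch shows one way a long-running worker might do that: it finishes the current item and then stops, leaving the rest for a later run. The worker structure and all names here are hypothetical and are not taken from the affected workloads.

```python
import signal
import threading

# Set when the process has been asked to shut down.
stop_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Signal handler: record the shutdown request and return immediately."""
    stop_requested.set()


def install_handler():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run_worker(process_item, items):
    """Process items in order until done or until shutdown is requested.

    Returns the items that were not processed so the caller can requeue them.
    """
    remaining = list(items)
    while remaining and not stop_requested.is_set():
        process_item(remaining.pop(0))
    return remaining
```

With this shape, terminationGracePeriodSeconds needs to cover only the longest single item rather than the whole batch, which keeps drains short.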