Scenario #338
Storage
Kubernetes v1.22, iSCSI

Incomplete Volume Detach Breaks Node Scheduling

Scheduler skipped a healthy node due to a ghost VolumeAttachment that was never cleaned up.

What Happened

The node was marked Ready, but the scheduler skipped it for new pods because volumes from a deleted pod still carried an “in-use” flag.

Diagnosis Steps
  1. Described the unscheduled pod: binding failed because the volume was reported as already attached.
  2. Found a VolumeAttachment that still existed for the deleted pod's volume.
  3. Checked the CSI logs, which showed that no detach command had been received.
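
The first two diagnosis steps can be approximated offline. The sketch below is a minimal illustration, not the tooling used in this incident. It assumes the JSON output of `kubectl get volumeattachments -o json`, `kubectl get pods -A -o json`, and `kubectl get pvc -A -o json` has already been loaded into Python dicts, and the function name `find_ghost_attachments` is hypothetical. It flags any VolumeAttachment whose PersistentVolume is not claimed by a live pod on the same node.

```python
def find_ghost_attachments(volume_attachments, pods, pvcs):
    """Return names of VolumeAttachments with no live pod using the volume on that node."""
    # Map (namespace, claim name) -> bound PersistentVolume name.
    claim_to_pv = {}
    for pvc in pvcs.get("items", []):
        pv_name = pvc.get("spec", {}).get("volumeName")
        if pv_name:
            meta = pvc["metadata"]
            claim_to_pv[(meta["namespace"], meta["name"])] = pv_name

    # Collect the (node, PV) pairs that a live pod still needs.
    in_use = set()
    for pod in pods.get("items", []):
        node = pod.get("spec", {}).get("nodeName")
        if not node:
            continue  # pod has not been scheduled yet
        if pod.get("status", {}).get("phase") in ("Succeeded", "Failed"):
            continue  # assumption: terminated pods are treated as not needing the volume
        namespace = pod["metadata"]["namespace"]
        for volume in pod["spec"].get("volumes", []):
            claim = volume.get("persistentVolumeClaim")
            if claim:
                pv_name = claim_to_pv.get((namespace, claim["claimName"]))
                if pv_name:
                    in_use.add((node, pv_name))

    # Any attachment not backed by a live pod on its node is a ghost candidate.
    ghosts = []
    for va in volume_attachments.get("items", []):
        spec = va["spec"]
        pv_name = spec.get("source", {}).get("persistentVolumeName")
        if pv_name and (spec["nodeName"], pv_name) not in in_use:
            ghosts.append(va["metadata"]["name"])
    return ghosts
```

A result from this function is only a candidate list. Each entry should still be confirmed against the CSI logs before anything is deleted.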
Root Cause

A restart of the CSI controller dropped its queue of pending detach requests, so the detach for the deleted pod's volume was never issued.

Fix/Workaround
• Recreated the CSI controller pod.
• Requeued the detach operation by manually deleting the stale VolumeAttachment.
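
The manual deletion can be scripted. The following is a sketch under assumptions: the stale object is taken to be the VolumeAttachment, `kubectl` is assumed to be on the PATH with permission to delete it, and the helper names are hypothetical. Deleting the object sets its deletion timestamp, which is what normally prompts the CSI attacher to process a detach once it is running again.

```python
import subprocess

def build_delete_command(attachment_name):
    """Build the kubectl argv that deletes one VolumeAttachment."""
    if not attachment_name or attachment_name.startswith("-"):
        raise ValueError("invalid VolumeAttachment name")
    # --wait=false returns immediately instead of blocking on finalizers.
    return ["kubectl", "delete", "volumeattachment", attachment_name, "--wait=false"]

def delete_attachment(attachment_name, dry_run=True):
    """Print the command, and run it only when dry_run is False."""
    command = build_delete_command(attachment_name)
    print(" ".join(command))
    if not dry_run:
        subprocess.run(command, check=True)
    return command
```

The default is a dry run, so calling `delete_attachment(name)` only prints the command. Pass `dry_run=False` after confirming that no pod on the node still uses the volume.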
Lessons Learned

CSI components must be able to recover cleanly from a crash in the middle of an attach or detach operation.

How to Avoid
  1. Persist attach/detach queues so that pending operations survive a controller restart.
  2. Use cloud-level health checks to detect and clean up stale attachments.
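
One way to approximate such a health check, sketched here under assumptions rather than taken from the incident, is to scan VolumeAttachment objects periodically and flag any that have carried a deletion timestamp for longer than a threshold. That pattern suggests a detach was requested but never completed. The function name `find_stuck_detaches` and the 10-minute default are hypothetical, and the input is assumed to be the parsed output of `kubectl get volumeattachments -o json`.

```python
from datetime import datetime, timedelta, timezone

def find_stuck_detaches(volume_attachments, now=None, max_age=timedelta(minutes=10)):
    """Return (name, node) pairs for attachments pending deletion longer than max_age."""
    now = now or datetime.now(timezone.utc)
    stuck = []
    for va in volume_attachments.get("items", []):
        stamp = va["metadata"].get("deletionTimestamp")
        if not stamp:
            continue  # no deletion requested, so no detach is pending
        # Kubernetes serializes timestamps as RFC 3339 in UTC with a trailing "Z".
        deleted_at = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
        if now - deleted_at > max_age:
            stuck.append((va["metadata"]["name"], va["spec"]["nodeName"]))
    return stuck
```

This only catches detaches that were requested and then stalled. An attachment that was never marked for deletion would need a separate check that compares attachments against the pods actually running on each node.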