Scenario #369
Storage
Kubernetes v1.24, external CSI

CSI Driver Memory Leak on Volume Detach Loop

The CSI plugin leaked memory because a detach failure loop retried without cleaning up or garbage-collecting per-attempt state.

What Happened

Volume detach failed repeatedly because of stale metadata, and the plugin's memory use grew with every retry.

Diagnosis Steps
  1. Plugin memory usage exceeded 1 GB.
  2. Logs showed repeated "detach failed" errors with no backoff between attempts.
Root Cause

The driver retried failed detaches in a tight loop and never cleaned up or garbage-collected the state left behind by each failed attempt.

Fix/Workaround
• Restarted the CSI plugin.
• Patched the driver to implement exponential backoff.
Lessons Learned

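CSI error paths must release resources as reliably as success paths do; a retry loop that leaks a little on every failure will eventually exhaust memory.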

How to Avoid
  1. Stress-test CSI failure paths, not only the success paths.
  2. Add Prometheus memory alerts for CSI plugin pods.
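
As one way to approach the first point, a failure-path stress test might look like the sketch below. Everything here is hypothetical and simplified: `opTracker` stands in for whatever per-operation state a driver keeps, and `attemptDetach` shows the cleanup pattern (a `defer` that removes the entry whether or not detach succeeds). The check drives many failing detaches and confirms that no entries are left behind.

```go
package main

import (
	"errors"
	"fmt"
)

// opTracker is a hypothetical per-operation state table, the kind of
// structure that leaks if failed attempts never remove their entries.
type opTracker struct {
	pending map[int][]byte
	nextID  int
}

func newOpTracker() *opTracker {
	return &opTracker{pending: make(map[int][]byte)}
}

// attemptDetach records per-attempt state, calls detach, and always removes
// that state again via defer, even when detach returns an error.
func (t *opTracker) attemptDetach(detach func() error) error {
	id := t.nextID
	t.nextID++
	t.pending[id] = make([]byte, 1024) // simulated request context
	defer delete(t.pending, id)
	return detach()
}

// stressFailurePath drives n failing detach attempts and returns how many
// entries remain in the tracker afterwards.
func stressFailurePath(t *opTracker, n int) int {
	alwaysFail := func() error { return errors.New("stale metadata") }
	for i := 0; i < n; i++ {
		_ = t.attemptDetach(alwaysFail)
	}
	return len(t.pending)
}

func main() {
	leaked := stressFailurePath(newOpTracker(), 10000)
	fmt.Println("entries left after 10000 failed detaches:", leaked)
	// prints "entries left after 10000 failed detaches: 0"
}
```

Removing the `defer` line makes the count grow with every failed attempt, which is the pattern this kind of test is meant to catch before it reaches production.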