Scenario #313
Storage
Kubernetes v1.23, Longhorn

CSI Controller Pod Crash Due to Log Overflow

The CSI controller crashed repeatedly due to unbounded logging filling up ephemeral storage.

What Happened

An RPC error repeating in a tight loop generated thousands of log lines per second. On the node, the filesystem backing /var/log/containers reached 100% disk usage.
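To get a feel for how quickly a log loop can fill a disk, here is a rough back-of-the-envelope sketch. The numbers used (5,000 lines per second, 200 bytes per line, 20 GiB free) are illustrative assumptions, not measurements from this incident, and the function name is hypothetical.

```python
def seconds_to_fill(free_bytes: int, lines_per_sec: float, avg_line_bytes: float) -> float:
    """Estimate how long a steady log stream takes to consume the free space."""
    if lines_per_sec <= 0 or avg_line_bytes <= 0:
        raise ValueError("rate and line size must be positive")
    return free_bytes / (lines_per_sec * avg_line_bytes)


if __name__ == "__main__":
    # Assumed values for illustration only.
    free = 20 * 1024**3  # 20 GiB of free space
    hours = seconds_to_fill(free, lines_per_sec=5000, avg_line_bytes=200) / 3600
    print(f"disk full in about {hours:.1f} hours")
```

Under these assumed values the disk fills in roughly six hours, so a loop that starts overnight can take a node down before anyone looks at it.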

Diagnosis Steps
  1. kubectl describe pod: showed OOMKilled and a failure to write logs.
  2. Checked node disk: /var was full.
  3. Log rotation could not keep up with the write rate.
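The disk check in the steps above can be sketched as a small script run on the node. This is a minimal illustration, not the tooling used in the incident: the helper names are hypothetical, and it assumes the script can read /var/log/containers, where entries are normally symlinks to the real log files.

```python
import os
import shutil


def disk_used_percent(path: str) -> float:
    """Return the used percentage of the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def largest_files(directory: str, top_n: int = 5) -> list[tuple[int, str]]:
    """Return up to `top_n` (size_in_bytes, path) pairs, largest first."""
    sizes = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            full = os.path.join(root, name)
            try:
                # os.stat follows symlinks, so the target's size is reported.
                sizes.append((os.stat(full).st_size, full))
            except OSError:
                continue  # broken symlink or file rotated away mid-scan
    return sorted(sizes, reverse=True)[:top_n]


if __name__ == "__main__":
    print(f"/var used: {disk_used_percent('/var'):.1f}%")
    for size, path in largest_files("/var/log/containers"):
        print(f"{size / 1024**2:10.1f} MiB  {path}")
```

Sorting by size usually points straight at the offending container, which is quicker than reading the logs themselves when the disk is already full.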
Root Cause

Verbose logging, combined with no log throttling and a small node disk.

Fix/Workaround
• Added log rate limits via CSI plugin config.
• Increased node ephemeral storage.
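The source does not say which plugin setting provided the rate limit, so the following only illustrates the general idea of log throttling with a token bucket. It is a sketch in Python using the standard logging module; the class name and parameters are hypothetical and are not part of Longhorn or any CSI sidecar.

```python
import logging
import time


class RateLimitFilter(logging.Filter):
    """Token-bucket filter: allow `rate_per_sec` records on average, with bursts up to `burst`."""

    def __init__(self, rate_per_sec: float, burst: int, clock=time.monotonic):
        super().__init__()
        self.rate = rate_per_sec
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()
        self.dropped = 0  # count of suppressed records, useful as a metric

    def filter(self, record: logging.LogRecord) -> bool:
        now = self.clock()
        # Refill tokens for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        self.dropped += 1
        return False


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("csi-demo")
    limiter = RateLimitFilter(rate_per_sec=5, burst=10)
    log.addFilter(limiter)
    for _ in range(1000):  # simulate a looping RPC error
        log.error("rpc error: connection refused")
    print(f"suppressed {limiter.dropped} records")
```

Tracking the number of dropped records matters: a throttle that silently discards messages hides the underlying error loop, whereas a counter keeps it visible without filling the disk.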
Lessons Learned

Logging misconfigurations can become outages.

How to Avoid
  1. Monitor log volume and disk usage.
  2. Use log rotation and retention policies.
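A rotation and retention policy puts a hard ceiling on how much disk a log stream can use. The sketch below shows the principle with Python's standard RotatingFileHandler; the helper names, file path, and limits are assumptions for illustration. On Kubernetes nodes the kubelet settings containerLogMaxSize and containerLogMaxFiles serve the same purpose for container logs.

```python
import logging
import logging.handlers
import os


def make_bounded_logger(path: str, max_bytes: int, backups: int) -> logging.Logger:
    """Create a logger whose files rotate at `max_bytes`, keeping `backups` old files."""
    logger = logging.getLogger(f"bounded.{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    logger.addHandler(handler)
    return logger


def total_log_bytes(path: str, backups: int) -> int:
    """Sum the size of the active log file and its rotated backups (path.1, path.2, ...)."""
    candidates = [path] + [f"{path}.{i}" for i in range(1, backups + 1)]
    return sum(os.path.getsize(p) for p in candidates if os.path.exists(p))


if __name__ == "__main__":
    log_path = "demo.log"  # hypothetical path for the demonstration
    log = make_bounded_logger(log_path, max_bytes=1_000_000, backups=3)
    for i in range(100_000):
        log.info("rpc error: retry %d", i)
    print(f"total on disk: {total_log_bytes(log_path, 3)} bytes")
```

With these limits the total stays below roughly max_bytes × (backups + 1) no matter how long the loop runs, provided no single message exceeds max_bytes.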