Etcd Disk Full Causing API Server Timeout

etcd ran out of disk space, making the API server unresponsive.


What Happened

The cluster started failing API requests. The etcd logs showed disk space errors, and the API server logs showed failed storage operations.

Diagnosis Steps
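
The exact checks run during this incident are not recorded. As a minimal sketch, symptoms like the ones described above could be confirmed with something like the following. The helper name etcd_diagnose, the variables ETCD_DATA_DIR and ETCD_ENDPOINTS, and the default path /var/lib/etcd are assumptions for illustration. A TLS-enabled cluster would also need --cacert, --cert and --key on each etcdctl call, and older etcdctl releases need ETCDCTL_API=3 set.

```shell
# Hypothetical helper: show disk usage and basic etcd health signals.
# ETCD_DATA_DIR and ETCD_ENDPOINTS are assumed names; adjust to your setup.
etcd_diagnose() {
  data_dir="${ETCD_DATA_DIR:-/var/lib/etcd}"
  endpoints="${ETCD_ENDPOINTS:-127.0.0.1:2379}"

  echo "== disk usage for ${data_dir} =="
  df -h "${data_dir}"

  echo "== endpoint status (DB size, revision) =="
  etcdctl --endpoints="${endpoints}" endpoint status --write-out=table

  echo "== active alarms (NOSPACE means the backend quota was exceeded) =="
  etcdctl --endpoints="${endpoints}" alarm list
}
```

Calling etcd_diagnose on a control-plane node prints the three sections in order. A nearly full filesystem or a NOSPACE alarm would be consistent with the storage errors seen in the API server logs.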

Root Cause

Without regular compaction and snapshotting, the disk filled up with historical revisions and WAL files.

Fix/Workaround

etcdctl compact <rev>
etcdctl defrag
• Cleaned up logs and snapshots, and temporarily increased disk space.
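
The <rev> placeholder in the commands above has to come from somewhere. A minimal sketch of the full sequence, assuming the current revision is read from the endpoint status output, might look like this. The function name etcd_reclaim_space and the ETCD_ENDPOINTS variable are hypothetical. The final alarm disarm step is an assumption as well: it only matters if a NOSPACE alarm was raised, which this write-up does not state.

```shell
# Hypothetical helper: compact history up to the current revision, then defragment.
# ETCD_ENDPOINTS is an assumed variable name; adjust to your setup.
etcd_reclaim_space() {
  endpoints="${ETCD_ENDPOINTS:-127.0.0.1:2379}"

  # Extract the current revision from the JSON status output.
  rev="$(etcdctl --endpoints="${endpoints}" endpoint status --write-out=json \
    | grep -Eo '"revision":[0-9]+' | grep -Eo '[0-9]+' | head -n 1)"
  if [ -z "${rev}" ]; then
    echo "could not determine current revision" >&2
    return 1
  fi

  # Discard key history older than the current revision.
  etcdctl --endpoints="${endpoints}" compact "${rev}" || return 1

  # Release the space freed by compaction back to the filesystem.
  etcdctl --endpoints="${endpoints}" defrag || return 1

  # Clear any raised alarm (assumption: only needed if NOSPACE was set).
  etcdctl --endpoints="${endpoints}" alarm disarm
}
```

Defragmentation blocks the member it runs against, so on a multi-member cluster it is usually safer to point ETCD_ENDPOINTS at one member at a time.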

Lessons Learned

etcd requires periodic maintenance: history must be compacted and the database defragmented regularly, or disk usage keeps growing.
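
One way to make that maintenance routine is a small disk check run from cron or a monitoring agent. The sketch below is illustrative only: the function name disk_usage_check and the example threshold are assumptions, not something used in this incident.

```shell
# Hypothetical helper: warn when the filesystem holding a path is too full.
# Usage: disk_usage_check <path> <threshold-percent>
disk_usage_check() {
  path="$1"
  threshold="$2"

  # df -P prints a stable POSIX layout; column 5 of row 2 is the use percentage.
  used="$(df -P "${path}" | awk 'NR==2 { sub(/%/, "", $5); print $5 }')"
  if [ -z "${used}" ]; then
    echo "UNKNOWN: could not read disk usage for ${path}" >&2
    return 2
  fi

  if [ "${used}" -ge "${threshold}" ]; then
    echo "WARN: ${path} is ${used}% full (threshold ${threshold}%)"
    return 1
  fi
  echo "OK: ${path} is ${used}% full"
}
```

For example, disk_usage_check /var/lib/etcd 80 returns a non-zero status once the filesystem holding the (assumed) etcd data directory reaches 80% usage, which leaves time to compact and defragment before writes start failing.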

How to Avoid
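
Since the root cause was missing compaction, one plausible safeguard is to let etcd compact on its own schedule and to cap the backend size. The sketch below prints a set of etcd server flags for that purpose. The flags themselves are real etcd options, but the function name etcd_maintenance_flags, the ETCD_QUOTA_BYTES variable, and the chosen values (1h retention, 8 GiB quota) are assumptions to be tuned per cluster.

```shell
# Hypothetical helper: print etcd flags that enable automatic compaction
# and set an explicit backend quota. All values are illustrative.
etcd_maintenance_flags() {
  printf '%s\n' \
    "--auto-compaction-mode=periodic" \
    "--auto-compaction-retention=1h" \
    "--quota-backend-bytes=${ETCD_QUOTA_BYTES:-8589934592}"
}
```

The output can be appended to the etcd command line or copied into a static pod manifest. Automatic compaction does not defragment the database, so a scheduled etcdctl defrag is still needed to return freed space to the filesystem.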