Scenario #286
Security
Kubernetes v1.22, EKS

Webhook Authentication Timing Out, Causing Denial of Service

The authentication webhook used for custom RBAC timed out under load, rejecting valid users and causing cluster-wide issues.

What Happened

A spike in API requests caused the external webhook server to time out, which led to mass access denials and degraded API server performance.

Diagnosis Steps
  1. Checked API server logs for webhook timeout messages.
  2. Monitored the external auth service and saw 5xx errors.
  3. Replayed the request load to reproduce the failure.
Root Cause

The auth webhook could not scale with API server traffic.

Fix/Workaround
• Increased the webhook timeout and scaled the webhook service horizontally.
• Added local caching for frequently seen identities.
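
The local caching fix can be illustrated with a minimal sketch. It assumes the webhook resolves a bearer token to a username through some slow backend call; `IdentityCache`, `lookup`, and `ttl_seconds` are hypothetical names for illustration, not part of the original setup or of any Kubernetes API.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class IdentityCache:
    """TTL cache in front of a slow identity lookup (hypothetical helper)."""

    def __init__(
        self,
        lookup: Callable[[str], Optional[str]],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup      # assumed backend call: token -> username or None
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def _key(token: str) -> str:
        # Key on a digest so raw bearer tokens are not held in the cache.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def resolve(self, token: str) -> Optional[str]:
        key = self._key(token)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]          # fresh entry: skip the backend entirely
        user = self._lookup(token)
        if user is not None:
            # Only successful lookups are cached; denials always reach the backend.
            self._entries[key] = (now + self._ttl, user)
        return user
```

The trade-off in a design like this is that a revoked identity can remain accepted until its entry expires, so the TTL should be kept short relative to how quickly revocations must take effect.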
Lessons Learned

External dependencies in the authentication path can introduce denial-of-service risks.

How to Avoid
  1. Stress-test webhooks.
  2. Use token-based or in-cluster auth where possible.
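
A webhook stress test can be sketched roughly as follows. This is an illustration under stated assumptions, not the tooling used in the incident: `stress` and `fake_webhook` are hypothetical names, and the stand-in function would need to be replaced with a real HTTP call to the auth service. The sketch classifies calls by measured latency; it does not enforce a timeout on the call itself.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def stress(
    call: Callable[[], bool],
    requests: int,
    concurrency: int,
    timeout_seconds: float,
) -> Dict[str, int]:
    """Run `requests` calls across `concurrency` threads and tally outcomes."""

    def one(_: int) -> str:
        start = time.monotonic()
        try:
            allowed = call()
        except Exception:
            return "error"
        # Calls slower than the budget are counted as timeouts after the fact.
        if time.monotonic() - start > timeout_seconds:
            return "timeout"
        return "ok" if allowed else "denied"

    tally = {"ok": 0, "denied": 0, "timeout": 0, "error": 0}
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for outcome in pool.map(one, range(requests)):
            tally[outcome] += 1
    return tally


def fake_webhook() -> bool:
    # Stand-in for a real request to the auth webhook (assumption for the sketch).
    time.sleep(0.01)
    return True


if __name__ == "__main__":
    print(stress(fake_webhook, requests=200, concurrency=20, timeout_seconds=0.5))
```

Raising `concurrency` step by step until the `timeout` or `error` counts start to climb gives a rough idea of the load at which the webhook stops keeping up.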