Scenario #462
Scaling & Load
Kubernetes v1.26, Azure AKS

Unstable Scaling During Traffic Spikes

Pod scaling became unstable during traffic spikes due to delayed scaling responses.

What Happened

During high-traffic periods, the Horizontal Pod Autoscaler (HPA) did not scale the workload out fast enough, leading to slow response times.

Diagnosis Steps
  • Reviewed HPA logs and metrics and found that scaling triggers were evaluated on roughly 5-minute intervals, which delayed reactions to rapid traffic increases (see the configuration sketch after this list).
  • Observed increased latency and 504 Gateway Timeout errors during the spikes.
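
The delayed reaction described above is what you would see from an HPA that scales only on CPU and carries a long scale-up stabilization window. The manifest below is a minimal sketch of such a configuration, not the actual one from this incident; the workload name `web-frontend`, the replica bounds, and the 70% CPU target are assumptions for illustration.

```yaml
# Sketch of an HPA prone to delayed scale-up (names and values are assumptions).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-frontend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-frontend            # hypothetical workload name
  minReplicas: 3
  maxReplicas: 30
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # CPU-only trigger; lags behind request latency
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 300   # ~5-minute window delays reaction to spikes
```
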
Root Cause

The autoscaler was not responsive enough to rapidly changing traffic: scaling decisions were evaluated on long intervals and driven only by slow-moving resource metrics, so scale-up lagged behind demand.

Fix/Workaround
• Adjusted the scaling policy to use shorter intervals and windows for triggering scale-up (see the sketch after this list).
• Introduced custom metrics so pods also scale on response times and traffic patterns, not just resource utilization.
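
Both fixes can be expressed directly on an autoscaling/v2 HPA, as sketched below: a scaleUp policy with a short period and no stabilization delay, plus a per-pod latency metric. This assumes a custom-metrics pipeline (for example, Prometheus with prometheus-adapter) already exposes a metric named `http_request_duration_seconds_avg`; that metric name, the workload name, and all thresholds are illustrative, not values from the incident.

```yaml
# Sketch: faster scale-up plus a latency-based custom metric (names/thresholds are assumptions).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-frontend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-frontend
  minReplicas: 3
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
    - type: Pods
      pods:
        metric:
          name: http_request_duration_seconds_avg   # hypothetical metric exposed via prometheus-adapter
        target:
          type: AverageValue
          averageValue: "250m"                       # scale up above ~0.25 s average latency
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0    # react immediately to spikes
      policies:
        - type: Percent
          value: 100                   # allow doubling the replica count...
          periodSeconds: 15            # ...every 15 seconds
        - type: Pods
          value: 4
          periodSeconds: 15
      selectPolicy: Max                # use whichever policy adds more pods
    scaleDown:
      stabilizationWindowSeconds: 300  # keep scale-down conservative to avoid flapping
```

Keeping a long scaleDown stabilization window while removing it from scaleUp lets the HPA add capacity immediately but release it conservatively, which is what keeps scaling stable during bursty traffic.
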
Lessons Learned

Autoscaling should be sensitive to real-time traffic patterns and latency.

How to Avoid
  • Tune the HPA scale-up behavior (shorter stabilization windows, more aggressive policies) so it reacts quickly during traffic spikes.
  • Use more advanced metrics, such as response time, rather than just CPU and memory, for autoscaling decisions (see the adapter sketch below).
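
For response-time-based scaling to work, the metric has to reach the custom metrics API in the first place. The snippet below is a sketch of a kubernetes-sigs/prometheus-adapter rule that would expose an average-latency-per-pod metric such as the one used above; it assumes the application already exports a standard `http_request_duration_seconds` histogram to Prometheus, and the query and naming are illustrative.

```yaml
# Sketch of a prometheus-adapter rule (assumes an http_request_duration_seconds histogram exists).
rules:
  - seriesQuery: 'http_request_duration_seconds_sum{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      matches: "^(.*)_sum$"
      as: "${1}_avg"
    # Average latency over the last minute: rate(sum) / rate(count).
    metricsQuery: >-
      sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
      /
      sum(rate(http_request_duration_seconds_count{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
```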