Scenario #465
Scaling & Load
Kubernetes v1.23, AWS EKS
Resource Starvation During Infrequent Scaling Events
Pods could not be scheduled during infrequent, burst-driven scale-ups because nodes ran out of CPU and memory.
What Happened
Traffic bursts triggered infrequent scale-up events. During these events the nodes ran short of CPU and memory, so newly created pods could not be scheduled.
Diagnosis Steps
1. Analyzed the scaling logs and found that the resources allocated during scaling events were insufficient for the traffic demand.
2. Observed that the shortage was most severe for CPU and memory while scaling was in progress.
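The check behind these steps can be sketched in a few lines: given the free CPU and memory on each node and the per-pod requests, estimate how many pods of a burst would be left unschedulable. This is a simplified illustration, not the scheduler's actual algorithm. The node figures, field names (`free_cpu_m`, `free_mem_mi`), and function names are hypothetical, and it ignores taints, affinity, and other scheduling constraints.

```python
def pods_that_fit(free_cpu_m, free_mem_mi, req_cpu_m, req_mem_mi):
    """Pods of one shape that fit on a node, limited by the scarcer resource."""
    return min(free_cpu_m // req_cpu_m, free_mem_mi // req_mem_mi)


def unschedulable_pods(nodes, req_cpu_m, req_mem_mi, new_pods):
    """Estimate how many of `new_pods` cannot be placed on the given nodes.

    `nodes` is a list of dicts with hypothetical keys `free_cpu_m`
    (millicores) and `free_mem_mi` (MiB): allocatable minus existing requests.
    """
    capacity = sum(
        pods_that_fit(n["free_cpu_m"], n["free_mem_mi"], req_cpu_m, req_mem_mi)
        for n in nodes
    )
    return max(0, new_pods - capacity)


# Hypothetical cluster state during a burst.
nodes = [
    {"free_cpu_m": 1000, "free_mem_mi": 2048},
    {"free_cpu_m": 500, "free_mem_mi": 4096},
]
pending = unschedulable_pods(nodes, req_cpu_m=500, req_mem_mi=1024, new_pods=5)
print(f"pods left pending: {pending}")
```

The second node shows why both resources matter: it has memory for four pods but CPU for only one.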
Root Cause
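Pod resource requests and limits did not reflect actual usage under burst load, and the node pool lacked headroom for the additional pods created during scaling events.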
Fix/Workaround
• Adjusted resource requests and limits to match the usage observed during scaling events.
• Increased the node pool size to provide more headroom for burst scaling.
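One way to approach both adjustments is sketched below, assuming usage samples are already available from monitoring. It derives a request from a high percentile of observed usage plus a margin, then estimates how many nodes a peak replica count needs. The percentile, the 20% margin, the sample values, and the node allocatable figures are all illustrative assumptions, not values from this incident.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_request(samples, p=95, headroom_pct=20):
    """Request = p-th percentile of observed usage plus a safety margin."""
    return math.ceil(percentile(samples, p) * (100 + headroom_pct) / 100)


def nodes_for_burst(peak_replicas, req_cpu_m, req_mem_mi, node_cpu_m, node_mem_mi):
    """Nodes needed to hold `peak_replicas` pods, limited by the scarcer resource."""
    per_node = min(node_cpu_m // req_cpu_m, node_mem_mi // req_mem_mi)
    if per_node == 0:
        raise ValueError("a single pod does not fit on this node size")
    return math.ceil(peak_replicas / per_node)


# Hypothetical CPU usage samples (millicores) collected during bursts.
cpu_samples = [100, 120, 150, 200, 400, 180, 160, 140, 130, 110]
cpu_request = suggest_request(cpu_samples)
print(f"suggested CPU request: {cpu_request}m")

# Hypothetical node allocatable: 3920m CPU, 7000Mi memory.
print("nodes for 20 replicas:", nodes_for_burst(20, cpu_request, 512, 3920, 7000))
```

A high percentile is used here so that burst usage drives the request, since averages tend to hide the peaks that caused the starvation.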
Lessons Learned
Resource requests must align with actual usage during scaling events to prevent starvation.
How to Avoid
1. Monitor resource usage more accurately and adjust scaling policies based on real traffic patterns.
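If scaling is driven by the Horizontal Pod Autoscaler (an assumption, since the scenario does not name the autoscaler), it helps to see how an observed metric translates into a replica count. The sketch below follows the formula in the Kubernetes documentation, `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, with the documented default tolerance of 0.1. The function name, the replica bounds, and the example metric values are illustrative.

```python
import math


def desired_replicas(current, metric, target, min_r=1, max_r=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a single metric.

    Returns the current count when the metric is within `tolerance` of the
    target, otherwise the scaled count clamped to [min_r, max_r].
    """
    ratio = metric / target
    if abs(ratio - 1.0) <= tolerance:
        return current
    return max(min_r, min(max_r, math.ceil(current * ratio)))


# Hypothetical values: 4 replicas at 90% average CPU against a 60% target.
print(desired_replicas(current=4, metric=90, target=60))
```

Running this against recorded burst metrics gives the peak replica count that the node pool must be able to absorb.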