Back to all scenarios
Scenario #495
Scaling & Load
Kubernetes v1.22, AWS EKS
HPA and Node Pool Scaling Conflict
Horizontal Pod Autoscaler (HPA) conflicted with Node Pool autoscaling, causing resource exhaustion.
Find this helpful?
What Happened
The Horizontal Pod Autoscaler scaled up pods too quickly, but the node pool autoscaler was slow to react, resulting in a resource bottleneck and pod eviction.
Diagnosis Steps
- 1Checked HPA and Cluster Autoscaler logs, where it was found that HPA rapidly increased the number of pods, while the Cluster Autoscaler was not scaling up the nodes at the same pace.
- 2Observed that the pod eviction policy was triggered because the cluster ran out of resources.
Root Cause
Mismatched scaling policies between HPA and the node pool autoscaler.
Fix/Workaround
• Adjusted the scaling policies of both the HPA and the Cluster Autoscaler to ensure they are aligned.
• Increased resource limits on the node pools to accommodate the increased load from scaling.
Lessons Learned
Scaling policies for pods and nodes should be coordinated to avoid resource contention.
How to Avoid
- 1Synchronize scaling policies for HPA and Cluster Autoscaler to ensure a smooth scaling process.
- 2Continuously monitor scaling behavior and adjust policies as needed.