HPA and Node Pool Scaling Conflict

Horizontal Pod Autoscaler (HPA) conflicted with Node Pool autoscaling, causing resource exhaustion.

Find this helpful?

What Happened

The Horizontal Pod Autoscaler scaled up pods too quickly, but the node pool autoscaler was slow to react, resulting in a resource bottleneck and pod eviction.

Diagnosis Steps

1Checked HPA and Cluster Autoscaler logs, where it was found that HPA rapidly increased the number of pods, while the Cluster Autoscaler was not scaling up the nodes at the same pace.
2Observed that the pod eviction policy was triggered because the cluster ran out of resources.

Root Cause

Mismatched scaling policies between HPA and the node pool autoscaler.

Fix/Workaround

• Adjusted the scaling policies of both the HPA and the Cluster Autoscaler to ensure they are aligned.
• Increased resource limits on the node pools to accommodate the increased load from scaling.

Lessons Learned

Scaling policies for pods and nodes should be coordinated to avoid resource contention.

How to Avoid

1Synchronize scaling policies for HPA and Cluster Autoscaler to ensure a smooth scaling process.
2Continuously monitor scaling behavior and adjust policies as needed.

Previous Scenario Next Scenario