Scenario #471
Scaling & Load
Kubernetes v1.24, Google Cloud

Node Scaling Delayed Due to Cloud Provider API Limits

Node scaling was delayed because the cloud provider’s API rate limits were exceeded, preventing automatic node provisioning.

What Happened

During a scaling event, the cloud provider's API rate limits were exceeded. The Kubernetes Cluster Autoscaler could not provision new nodes in time, which delayed pod scheduling.

Diagnosis Steps
  1. Checked the autoscaler logs and found that the scaling action was queued due to API rate limit restrictions.
  2. Observed that new nodes were not added promptly, leading to pod scheduling failures.
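The log check in step 1 can be sketched as a small filter over exported autoscaler log lines. This is a minimal illustration, not the tooling used in the incident: the marker strings are assumptions about what a rate-limit error typically contains (`rateLimitExceeded` and `RESOURCE_EXHAUSTED` are error identifiers used by Google APIs, and 429 is the HTTP "Too Many Requests" status), and the sample log lines are invented for the example.

```python
import re

# Assumed markers of a rate-limit error; adjust to match the real log format.
RATE_LIMIT_PATTERN = re.compile(
    r"rateLimitExceeded|RESOURCE_EXHAUSTED|\b429\b|rate limit", re.IGNORECASE
)


def find_rate_limit_lines(log_lines):
    """Return the log lines that mention a rate-limit condition."""
    return [line for line in log_lines if RATE_LIMIT_PATTERN.search(line)]


# Hypothetical log excerpt, made up for illustration only.
sample_log = [
    "scale-up: requesting 3 nodes for node group pool-a",
    "cloud API call failed: 429 rateLimitExceeded",
    "scale-up: no change in node count",
]

matches = find_rate_limit_lines(sample_log)
for line in matches:
    print(line)
```

Counting such matches per time window gives a rough picture of when throttling started relative to the scaling event.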
Root Cause

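The cloud provider's API rate limits were exceeded during the scaling event, so the Cluster Autoscaler's node provisioning requests were not processed promptly.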

Fix/Workaround
• Worked with the cloud provider to increase API rate limits.
• Configured autoscaling to use multiple API keys to distribute the API requests and avoid hitting rate limits.
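One way to picture the request distribution described above is a round-robin rotation over a set of credentials. The sketch below is hypothetical: `CredentialRotator` and the key names are placeholders invented for illustration, and it does not show how the Cluster Autoscaler itself is configured. It also only helps if the provider meters the limit per credential; where limits are enforced per project or account, rotation does not raise the ceiling, so that should be confirmed with the provider first.

```python
import itertools


class CredentialRotator:
    """Hand out credentials in round-robin order so requests are spread evenly."""

    def __init__(self, credentials):
        credentials = list(credentials)
        if not credentials:
            raise ValueError("at least one credential is required")
        self._cycle = itertools.cycle(credentials)

    def next_credential(self):
        """Return the credential to use for the next API request."""
        return next(self._cycle)


# Placeholder key names; real credentials would come from a secret store.
rotator = CredentialRotator(["key-a", "key-b", "key-c"])
assigned = [rotator.next_credential() for _ in range(6)]
print(assigned)
```

Each outgoing request would call `next_credential()` once, so with three keys every key carries roughly a third of the request volume.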
Lessons Learned

Cloud infrastructure APIs enforce rate limits, and exceeding them during a scaling event can delay node provisioning.

How to Avoid
  1. Monitor cloud API rate limits and set up alerting for approaching thresholds.
  2. Use multiple API keys for autoscaling operations to avoid hitting rate limits.
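The alerting idea in step 1 amounts to comparing current usage against each limit and flagging anything above a chosen fraction. The sketch below assumes usage and limit figures have already been collected from the provider's quota metrics. The `QuotaSample` type, the quota names, the numbers, and the 80% threshold are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class QuotaSample:
    name: str     # hypothetical quota label
    used: float   # requests consumed in the current window
    limit: float  # maximum requests allowed in that window


def quotas_near_limit(samples, threshold=0.8):
    """Return (name, utilization) pairs for quotas at or above the threshold."""
    alerts = []
    for sample in samples:
        if sample.limit <= 0:
            continue  # skip unmetered or misreported quotas
        utilization = sample.used / sample.limit
        if utilization >= threshold:
            alerts.append((sample.name, utilization))
    return alerts


# Made-up figures for illustration only.
samples = [
    QuotaSample("read-requests", used=300, limit=1500),
    QuotaSample("mutate-requests", used=1350, limit=1500),
]

for name, utilization in quotas_near_limit(samples):
    print(f"WARNING: {name} at {utilization:.0%} of its limit")
```

Alerting well below 100% leaves time to request a limit increase or slow down scale-up before provisioning calls start failing.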