Scenario #66
Cluster Management
K8s v1.21, GKE

Insufficient Cluster Capacity Due to Unchecked CronJobs

The cluster ran out of resources because several CronJobs ran in parallel with no resource limits or concurrency controls.

What Happened

Several CronJobs were triggered simultaneously, causing the cluster to run out of CPU and memory resources.

Diagnosis Steps
  1. Checked CronJob schedules and found multiple jobs running at the same time.
  2. Monitored resource usage and identified high CPU and memory consumption from the CronJobs.
Root Cause

The CronJobs had no resource requests or limits, and nothing prevented multiple jobs from running concurrently.

Fix/Workaround
• Added resource requests and limits for CronJobs.
• Staggered the CronJob schedules so that the jobs no longer start at the same time.
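
As a rough illustration of the fix described above, the manifest below shows one way a CronJob might declare resource requests and limits and use an offset schedule. It is a sketch only: the name `report-a`, the image, the schedule, and every CPU and memory value are hypothetical placeholders, not the values used in this incident.

```yaml
# Hypothetical CronJob; name, image, schedule, and resource values are illustrative only.
apiVersion: batch/v1            # CronJob is served under batch/v1 from Kubernetes v1.21
kind: CronJob
metadata:
  name: report-a
spec:
  schedule: "5 * * * *"         # minute 5 of every hour; another job could use "20 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: report
              image: example.com/report:1.0   # placeholder image
              resources:
                requests:       # what the scheduler reserves on a node for this pod
                  cpu: 250m
                  memory: 256Mi
                limits:         # upper bound enforced on the container at runtime
                  cpu: 500m
                  memory: 512Mi
```

With requests set, the scheduler only places a job pod on a node that has the requested capacity free, so pods that do not fit stay Pending instead of overcommitting the node. Offsetting the minute field across CronJobs spreads their start times.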
Lessons Learned

Always set resource requests and limits on CronJobs, and schedule or configure them so that they cannot run in parallel and exhaust cluster resources.

How to Avoid
  1. Set appropriate resource requests and limits for CronJobs.
  2. Use concurrencyPolicy to control parallel executions of CronJobs.
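
The fragment below sketches how `concurrencyPolicy` might be set on a CronJob spec. The schedule is an assumed placeholder and the job template is omitted. Note that `concurrencyPolicy` only governs overlapping runs of the same CronJob. It does not stop different CronJobs from running at the same time, so staggered schedules are still needed for that.

```yaml
# Fragment of a CronJob spec; the schedule is a placeholder and jobTemplate is omitted.
spec:
  schedule: "5 * * * *"
  # Allow   (default): a new run may start while the previous run is still active
  # Forbid:  skip the new run if the previous run has not finished
  # Replace: cancel the running job and start the new one
  concurrencyPolicy: Forbid
```

`Forbid` suits jobs where a late run is acceptable but piling up is not. `Replace` may fit better when only the most recent run matters.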