Scenario #485
Scaling & Load
Kubernetes v1.24, Azure AKS
Memory Resource Overload During Scaling
Node memory resources were exhausted during a scaling event, causing pods to crash.
What Happened
As the cluster scaled out, the nodes did not have enough memory to accommodate the newly scheduled pods. The resulting memory pressure triggered evictions and pod crashes.
Diagnosis Steps
1. Checked pod resource usage and found that memory limits were being exceeded, leading to pod evictions.
2. Observed that the scaling event did not factor memory usage into node resource calculations.
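The checks above can be reproduced with standard kubectl commands such as the following (illustrative; `kubectl top` requires metrics-server, which AKS enables by default):

```shell
# Show per-pod memory consumption, highest first (requires metrics-server).
kubectl top pods --all-namespaces --sort-by=memory | head -20

# Check node conditions for MemoryPressure.
kubectl describe nodes | grep -A 5 "Conditions:"

# List recent pod evictions across the cluster.
kubectl get events --all-namespaces --field-selector reason=Evicted
```

A node reporting `MemoryPressure: True` together with `Evicted` events is the typical signature of the failure described here.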
Root Cause
Insufficient memory on nodes during scaling events, leading to pod crashes.
Fix/Workaround
• Adjusted pod memory requests and limits so the scheduler would not over-commit node memory.
• Increased memory resources on the nodes to handle the scaled workload.
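A minimal sketch of the first fix, setting explicit memory requests and limits on a container (the pod name, image, and values are illustrative; right-size them from observed usage):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: myapp:1.0          # placeholder image
    resources:
      requests:
        memory: "256Mi"       # amount the scheduler reserves on the node
        cpu: "250m"
      limits:
        memory: "512Mi"       # container is OOM-killed above this
        cpu: "500m"
```

Setting requests close to real usage is what lets the scheduler place pods only on nodes with enough free memory during a scale-out.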
Lessons Learned
Memory pressure is a critical factor in scaling, and it should be carefully considered during node provisioning.
How to Avoid
1. Monitor memory usage closely during scaling events.
2. Ensure that scaling policies account for both CPU and memory resources.
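One way to make scaling policies memory-aware is a HorizontalPodAutoscaler that targets both CPU and memory utilization (a sketch using the `autoscaling/v2` API, which is GA in v1.24; names and thresholds are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app          # hypothetical target deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 75
```

With multiple metrics, the HPA scales to the largest replica count any metric demands, so sustained memory pressure triggers a scale-out even when CPU is idle. Note that utilization is computed against the pod's memory *request*, so the fix above and this policy work together.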