Scenario #77
Cluster Management
K8s v1.24, GKE

Failed Pod Restart Due to Inadequate Node Affinity

Pods failed to restart on available nodes due to overly strict node affinity rules.

What Happened

After a node failure, a pod could not be rescheduled: its node affinity rules were so strict that none of the remaining available nodes satisfied them.

Diagnosis Steps
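  1. Checked the pod's status and events (scheduling failures are reported there rather than in container logs) and observed node affinity errors.
  2. Reviewed the affinity settings in the pod spec and found that the rules were too restrictive for the nodes still available.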
Root Cause

The node affinity rules were too strict: no available node satisfied them, so the scheduler could not place the pod.
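
To make the failure mode concrete, here is a minimal sketch of how a hard (required) node affinity rule filters nodes. It is a simplified model, not the scheduler's implementation: it handles only the `In` operator and a single selector term, and the rule, node names, and label values are hypothetical, since the source does not show the actual pod spec.

```python
# Simplified model of a *required* node affinity term.
# Assumption: only the "In" operator is modelled; all labels and nodes are hypothetical.

def matches_required(node_labels, match_expressions):
    """Return True if the node's labels satisfy every expression (logical AND)."""
    for expr in match_expressions:
        if expr["operator"] != "In":
            raise ValueError("only the In operator is modelled in this sketch")
        if node_labels.get(expr["key"]) not in expr["values"]:
            return False
    return True

ZONE = "topology.kubernetes.io/zone"
INSTANCE_TYPE = "node.kubernetes.io/instance-type"

# Hypothetical hard rule pinning the pod to one zone AND one instance type.
strict_rule = [
    {"key": ZONE, "operator": "In", "values": ["us-central1-a"]},
    {"key": INSTANCE_TYPE, "operator": "In", "values": ["n2-standard-8"]},
]

# Hypothetical nodes left in the cluster after the failure.
remaining_nodes = {
    "node-b": {ZONE: "us-central1-b", INSTANCE_TYPE: "n2-standard-8"},
    "node-c": {ZONE: "us-central1-c", INSTANCE_TYPE: "n2-standard-4"},
}

eligible = [
    name for name, labels in remaining_nodes.items()
    if matches_required(labels, strict_rule)
]
print(eligible)  # [] : no remaining node satisfies the rule
```

Because every expression in the term must match, a single over-specific label is enough to exclude all surviving nodes.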

Fix/Workaround
• Relaxed the node affinity rules in the pod spec.
• Redeployed the pod, and it successfully restarted on an available node.
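
The source does not show how the rules were relaxed. One common approach, sketched below under that assumption, is to widen the hard requirement and move the non-essential constraint into a soft preference. The structure follows the `nodeAffinity` fields of the pod spec; the zones, instance type, and weight are hypothetical values chosen for illustration.

```python
import json

# Hypothetical relaxed affinity for the pod spec.
# Hard rule: any of three zones is acceptable (previously a single zone).
# Soft rule: prefer a given instance type, but do not require it.
relaxed_affinity = {
    "nodeAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [
                {
                    "matchExpressions": [
                        {
                            "key": "topology.kubernetes.io/zone",
                            "operator": "In",
                            "values": ["us-central1-a", "us-central1-b", "us-central1-c"],
                        }
                    ]
                }
            ]
        },
        "preferredDuringSchedulingIgnoredDuringExecution": [
            {
                "weight": 100,  # valid weights range from 1 to 100
                "preference": {
                    "matchExpressions": [
                        {
                            "key": "node.kubernetes.io/instance-type",
                            "operator": "In",
                            "values": ["n2-standard-8"],
                        }
                    ]
                },
            }
        ],
    }
}

# Emit the fragment as JSON so it can be merged into a manifest under spec.affinity.
print(json.dumps({"affinity": relaxed_affinity}, indent=2))
```

With this shape, the scheduler still favors the preferred instance type when one is available, but the pod remains schedulable on any node in the listed zones.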
Lessons Learned

Carefully configure node affinity rules to allow flexibility during pod rescheduling.

How to Avoid
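  1. Use less restrictive affinity rules so pods can be rescheduled: reserve hard requirements (requiredDuringSchedulingIgnoredDuringExecution) for constraints that are truly mandatory, and express the rest as soft preferences (preferredDuringSchedulingIgnoredDuringExecution).
  2. Test affinity rules during node maintenance and scaling operations to confirm that pods can still be placed when nodes are removed.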
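
As a rough illustration of the second point, the sketch below checks which single-node losses would leave a pod with no eligible node. It is an offline model under stated assumptions: only the `In` operator is considered, capacity, taints, and other scheduling constraints are ignored, and the helper name, nodes, and rules are hypothetical.

```python
# Offline check: which nodes are single points of failure for a given affinity rule?
# Assumption: only "In" expressions are modelled; capacity and taints are ignored.

def matches(node_labels, match_expressions):
    """True if every In-expression is satisfied by the node's labels."""
    return all(node_labels.get(e["key"]) in e["values"] for e in match_expressions)

def nodes_whose_loss_strands_pod(nodes, match_expressions):
    """Return the nodes whose removal leaves no eligible node for the pod."""
    critical = []
    for drained in nodes:
        survivors = [name for name in nodes if name != drained]
        if not any(matches(nodes[name], match_expressions) for name in survivors):
            critical.append(drained)
    return critical

ZONE = "topology.kubernetes.io/zone"

# Hypothetical three-node cluster, one node per zone.
nodes = {
    "node-a": {ZONE: "us-central1-a"},
    "node-b": {ZONE: "us-central1-b"},
    "node-c": {ZONE: "us-central1-c"},
}

strict = [{"key": ZONE, "operator": "In", "values": ["us-central1-a"]}]
relaxed = [{"key": ZONE, "operator": "In", "values": ["us-central1-a", "us-central1-b"]}]

print(nodes_whose_loss_strands_pod(nodes, strict))   # ['node-a']
print(nodes_whose_loss_strands_pod(nodes, relaxed))  # []
```

A check like this does not replace draining real nodes in a test environment, but it can flag rules that depend on a single node before maintenance begins.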