Scenario #62
Cluster Management
K8s v1.21, AWS EKS

Failed Pod Rescheduling Due to Node Affinity Misconfiguration

Pods failed to reschedule after a node became unavailable because their node affinity rules were too restrictive.

What Happened

When a node was taken down for maintenance, the pod running on it could not be rescheduled because its restrictive node affinity rules ruled out the remaining nodes.

Diagnosis Steps
  1. Checked the pod events and noticed affinity rule errors preventing the pod from scheduling on other nodes.
  2. Analyzed the node affinity configuration in the pod spec.
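
The affinity analysis in step 2 can be approximated offline. The sketch below is a simplified model of how required node affinity is matched against node labels, not the scheduler's actual code. The node names and the hostname pin are hypothetical, since the original rules are not shown in this write-up.

```python
# Sketch: which nodes satisfy a pod's *required* node affinity?
# Only the In, NotIn, Exists and DoesNotExist operators are handled here;
# Gt, Lt and matchFields are left out to keep the example short.

def expression_matches(expr, labels):
    key, op = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if op == "In":
        return labels.get(key) in values
    if op == "NotIn":
        return labels.get(key) not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError(f"operator not covered by this sketch: {op}")


def term_matches(term, labels):
    # Expressions inside one term are ANDed; an empty term matches nothing.
    exprs = term.get("matchExpressions", [])
    return bool(exprs) and all(expression_matches(e, labels) for e in exprs)


def eligible_nodes(pod_spec, nodes):
    required = (
        pod_spec.get("affinity", {})
        .get("nodeAffinity", {})
        .get("requiredDuringSchedulingIgnoredDuringExecution")
    )
    if not required:
        return sorted(nodes)  # no hard rule: every node passes this check
    terms = required.get("nodeSelectorTerms", [])
    # nodeSelectorTerms are ORed: one matching term is enough.
    return sorted(
        name for name, labels in nodes.items()
        if any(term_matches(t, labels) for t in terms)
    )


# Hypothetical pod spec pinned to a single host (assumed for illustration).
pinned_pod = {
    "affinity": {"nodeAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [{"matchExpressions": [{
                "key": "kubernetes.io/hostname",
                "operator": "In",
                "values": ["node-a"],
            }]}]
        }
    }}
}

# Hypothetical nodes left in the cluster while node-a is under maintenance.
remaining_nodes = {
    "node-b": {"kubernetes.io/hostname": "node-b",
               "topology.kubernetes.io/zone": "us-east-1a"},
    "node-c": {"kubernetes.io/hostname": "node-c",
               "topology.kubernetes.io/zone": "us-east-1b"},
}

print(eligible_nodes(pinned_pod, remaining_nodes))  # prints []
```

An empty result means no remaining node can host the pod, which matches the symptom described above. Taints, resource requests, and other scheduling constraints are deliberately ignored here.
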
Root Cause

The pod's required node affinity rules were too restrictive: no node other than the one under maintenance satisfied them, so the scheduler had nowhere else to place the pod.

Fix/Workaround
• Relaxed the node affinity rules so that more nodes satisfied them.
• Rescheduled the pods onto the available nodes.
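
One way to relax the rules is to keep only a broad hard requirement and move the narrow one into a soft preference. The sketch below builds such a patch as JSON. The zone names, instance type, weight, and the choice of a Deployment are all assumptions for illustration; the actual adjusted rules are not given in this write-up.

```python
import json


def relaxed_node_affinity(zones, preferred_instance_type):
    """Hard rule: any node in the given zones. Soft rule: favor one instance type."""
    return {
        "nodeAffinity": {
            # Hard requirement kept broad so several nodes stay eligible.
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [{
                    "matchExpressions": [{
                        "key": "topology.kubernetes.io/zone",
                        "operator": "In",
                        "values": list(zones),
                    }]
                }]
            },
            # Soft preference: influences scoring but never blocks scheduling.
            "preferredDuringSchedulingIgnoredDuringExecution": [{
                "weight": 50,  # valid range is 1-100; 50 is an arbitrary choice
                "preference": {
                    "matchExpressions": [{
                        "key": "node.kubernetes.io/instance-type",
                        "operator": "In",
                        "values": [preferred_instance_type],
                    }]
                },
            }],
        }
    }


# Hypothetical values; substitute the zones and instance type of your cluster.
patch = {
    "spec": {"template": {"spec": {
        "affinity": relaxed_node_affinity(["us-east-1a", "us-east-1b"], "m5.large")
    }}}
}

print(json.dumps(patch, indent=2))
```

The printed JSON could be passed to `kubectl patch deployment <name> --type merge -p '<json>'`. Changing the pod template triggers a rollout, so replacement pods are scheduled under the relaxed rules.
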
Lessons Learned

Affinity rules should leave enough eligible nodes for pods to be rescheduled when any single node becomes unavailable.

How to Avoid
  1. Set node affinity rules based on node availability and workload requirements.
  2. Regularly test affinity and anti-affinity rules during node maintenance windows.
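
Before a maintenance window, a dry check like the one below can flag workloads that would be stranded by draining a single node. It is a minimal sketch that assumes the set of eligible nodes per workload has already been worked out from the affinity rules. The workload and node names are hypothetical, and capacity, taints, and pod anti-affinity are not modeled.

```python
def stranded_by_drain(eligible_by_workload, all_nodes):
    """Map each node to the workloads left with no eligible node if it is drained."""
    report = {}
    for node in sorted(all_nodes):
        stranded = sorted(
            workload
            for workload, eligible in eligible_by_workload.items()
            # Stranded when removing this node leaves nothing to schedule on.
            if not (set(eligible) - {node})
        )
        if stranded:
            report[node] = stranded
    return report


# Hypothetical cluster: "api" is pinned to one node, "worker" can use two.
eligible_by_workload = {
    "api": {"node-a"},
    "worker": {"node-a", "node-b"},
}
all_nodes = {"node-a", "node-b", "node-c"}

print(stranded_by_drain(eligible_by_workload, all_nodes))
# prints {'node-a': ['api']}
```

A non-empty report points at rules worth loosening before the node is actually drained. A workload with no eligible nodes at all shows up under every node.
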