Scenario #119
Networking
K8s v1.20, Google GKE

Pod Disconnection During Network Partition

Pods were disconnected during a network partition between nodes in the cluster.

What Happened

A temporary network partition between nodes led to pods becoming disconnected from other services.

Diagnosis Steps
  1. Ran kubectl get events to locate the events that coincided with the network partition.
  2. Checked the network logs and found that a temporary routing failure had caused the partition.
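
The first diagnosis step can be sketched in Python as a filter over the JSON output of kubectl get events. This is a minimal sketch, not the procedure used in the incident: the function names are hypothetical, and treating Node-scoped events and the NodeNotReady reason as the signal of a partition is an assumption that may need adjusting for a given cluster.

```python
import json
import subprocess

# Assumption: a partition mostly surfaces as events with this reason.
NODE_REASONS = {"NodeNotReady"}


def node_warning_events(event_list):
    """Return sorted (timestamp, object name, reason, message) tuples
    for events that involve a Node or carry a node-related reason."""
    rows = []
    for item in event_list.get("items", []):
        involved = item.get("involvedObject") or {}
        if involved.get("kind") == "Node" or item.get("reason") in NODE_REASONS:
            rows.append((
                item.get("lastTimestamp") or item.get("eventTime") or "",
                involved.get("name") or "",
                item.get("reason") or "",
                item.get("message") or "",
            ))
    # ISO 8601 timestamps sort chronologically as plain strings.
    return sorted(rows)


def fetch_events():
    """Fetch events from all namespaces via kubectl (needs cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "events", "--all-namespaces", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Calling node_warning_events(fetch_events()) against a live cluster returns the matching events in chronological order, which makes it easier to line them up with the timestamps in the network logs.
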
Root Cause

A network partition, triggered by a temporary routing failure, cut the affected pods off from the rest of the cluster.

Fix/Workaround
• Re-established network connectivity and ensured all nodes could communicate with each other.
• Re-scheduled the disconnected pods to different nodes to restore connectivity.
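
One possible way to carry out the rescheduling step is to cordon the affected nodes and delete their controller-managed pods so that the owning controllers recreate them elsewhere. The sketch below only builds the kubectl argument lists and does not run them. The function names are hypothetical, and the source does not say which method was actually used.

```python
def pods_to_reschedule(pod_list, affected_nodes):
    """Return (namespace, name) for controller-managed pods on affected nodes.

    pod_list is the parsed output of `kubectl get pods -A -o json`.
    """
    targets = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata") or {}
        node = (pod.get("spec") or {}).get("nodeName")
        owner_kinds = {o.get("kind") for o in meta.get("ownerReferences") or []}
        # Pods without an owner are not recreated after deletion, and
        # DaemonSet pods are bound to their node, so both are skipped.
        if node in affected_nodes and owner_kinds and "DaemonSet" not in owner_kinds:
            targets.append((meta.get("namespace", "default"), meta.get("name", "")))
    return targets


def reschedule_commands(pod_list, affected_nodes):
    """Build kubectl commands: cordon each node, then delete its pods."""
    cmds = [["kubectl", "cordon", node] for node in sorted(affected_nodes)]
    for namespace, name in pods_to_reschedule(pod_list, affected_nodes):
        cmds.append(["kubectl", "delete", "pod", name, "-n", namespace])
    return cmds
```

Cordoning first keeps the replacement pods from landing on the same nodes. Once connectivity is confirmed, the nodes can be returned to service with kubectl uncordon.
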
Lessons Learned

Even a temporary network partition between nodes can cut pods off from the services they depend on.

How to Avoid
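  1. Provide redundant network paths between nodes and monitor node-to-node connectivity so that routing failures are detected early.
  2. Define pod disruption budgets so that voluntary disruptions during recovery, such as draining affected nodes, cannot take too many replicas offline at once. A pod disruption budget does not prevent the partition itself.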
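
As an illustration of the pod disruption budget item, the sketch below builds a PodDisruptionBudget manifest as a Python dict and serializes it to JSON, which kubectl apply accepts alongside YAML. The function names and the example values (web-pdb, prod, app=web) are hypothetical. The apiVersion assumes the Kubernetes v1.20 cluster from this scenario, where the budget is served under policy/v1beta1.

```python
import json


def pdb_manifest(name, namespace, match_labels, min_available):
    """Build a PodDisruptionBudget manifest as a plain dict."""
    return {
        # policy/v1beta1 is the PDB API in Kubernetes v1.20;
        # clusters on v1.21 or later can use policy/v1 instead.
        "apiVersion": "policy/v1beta1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Evictions are refused if they would leave fewer
            # than this many matching pods available.
            "minAvailable": min_available,
            "selector": {"matchLabels": dict(match_labels)},
        },
    }


def to_json(manifest):
    """Serialize the manifest so it can be piped to `kubectl apply -f -`."""
    return json.dumps(manifest, indent=2, sort_keys=True)
```

For example, to_json(pdb_manifest("web-pdb", "prod", {"app": "web"}, 2)) yields a manifest that keeps at least two app=web pods available while nodes are drained.
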