Scenario #119
Networking
K8s v1.20, Google GKE
Pod Disconnection During Network Partition
Pods were disconnected during a network partition between nodes in the cluster.
What Happened
A temporary network partition between nodes led to pods becoming disconnected from other services.
Diagnosis Steps
1. Used kubectl get events to identify the network partition event.
2. Checked network logs and found that the partition was caused by a temporary routing failure.
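The first diagnosis step can be sketched as a small filter over the JSON output of kubectl get events. This is an illustration, not the procedure used in the incident: the set of event reasons and the function names are assumptions, and the reasons that actually show up during a partition depend on the cluster.

```python
import json
import subprocess

# Assumption: event reasons that often accompany lost node connectivity.
# This set is illustrative and is not taken from the original incident.
SUSPECT_REASONS = {"NodeNotReady", "NodeNotSchedulable"}


def fetch_events():
    """Return all cluster events as a dict by shelling out to kubectl."""
    out = subprocess.run(
        ["kubectl", "get", "events", "--all-namespaces", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def partition_suspects(event_list, reasons=SUSPECT_REASONS):
    """Pick events whose reason is in `reasons`, oldest first."""
    hits = [e for e in event_list.get("items", []) if e.get("reason") in reasons]
    # lastTimestamp can be null on some events, so fall back to "".
    hits.sort(key=lambda e: e.get("lastTimestamp") or "")
    rows = []
    for e in hits:
        obj = e.get("involvedObject", {})
        rows.append((e.get("lastTimestamp"), obj.get("kind"),
                     obj.get("name"), e.get("reason")))
    return rows
```

Calling partition_suspects(fetch_events()) against a live cluster would list the matching events in time order, which helps line them up with the routing failure seen in the network logs.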
Root Cause
A network partition, triggered by a temporary routing failure, caused pods to lose communication with the rest of the cluster.
Fix/Workaround
• Re-established network connectivity and ensured all nodes could communicate with each other.
• Rescheduled the disconnected pods onto different nodes to restore connectivity.
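One possible way to plan the rescheduling step is to cordon the affected nodes and then delete the controller-managed pods on them, so that their controllers recreate them elsewhere. The sketch below only builds the kubectl commands for review. The function name and the node and pod names in the usage note are hypothetical, and the original write-up does not say how the pods were actually moved.

```python
def reschedule_plan(pod_list, affected_nodes):
    """Return kubectl argv lists: cordon each affected node, then delete
    controller-managed pods on those nodes so they are recreated elsewhere.

    `pod_list` is the parsed output of `kubectl get pods -A -o json`.
    """
    commands = [["kubectl", "cordon", node] for node in sorted(affected_nodes)]
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        if pod.get("spec", {}).get("nodeName") not in affected_nodes:
            continue
        owner_kinds = {o.get("kind") for o in meta.get("ownerReferences", [])}
        # Bare pods are not recreated after deletion, and DaemonSet pods are
        # recreated on the same node, so deleting either does not move them.
        if not owner_kinds or "DaemonSet" in owner_kinds:
            continue
        commands.append(["kubectl", "delete", "pod", meta["name"],
                         "--namespace", meta.get("namespace", "default")])
    return commands
```

For example, reschedule_plan(pods, {"node-b"}) returns a cordon command for node-b followed by one delete command per eligible pod. Once connectivity is confirmed, kubectl uncordon makes the node schedulable again.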
Lessons Learned
Even a temporary network partition between nodes can cut pods off from the services they depend on.
How to Avoid
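1. Use redundant network paths and monitor network stability.
2. Define pod disruption budgets so that voluntary disruptions, such as draining nodes or evicting pods during recovery, cannot take too many replicas offline at once. They do not protect against the partition itself.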
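As a rough illustration of the pod disruption budget item, the sketch below builds a PodDisruptionBudget manifest as JSON, which kubectl accepts as well as YAML. The names web-pdb and app=web and the replica count are hypothetical. The default API version is policy/v1beta1 because the scenario ran on Kubernetes v1.20; policy/v1 is available from v1.21 onward.

```python
import json


def build_pdb(name, namespace, match_labels, min_available,
              api_version="policy/v1beta1"):
    """Build a PodDisruptionBudget manifest as a plain dict."""
    return {
        "apiVersion": api_version,
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Keep at least this many matching pods running during
            # voluntary disruptions such as node drains.
            "minAvailable": min_available,
            "selector": {"matchLabels": dict(match_labels)},
        },
    }


# Hypothetical workload: pods labelled app=web, at least 2 kept available.
manifest = build_pdb("web-pdb", "default", {"app": "web"}, 2)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped to kubectl apply -f - to create the budget. A budget only constrains evictions requested through the API, so it complements redundant network paths rather than replacing them.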