Scenario #19
Cluster Management
K8s v1.22, Bare-metal, bonded NICs
Multiple Nodes Marked Unreachable Due to Flaky Network Interface
A flapping switch port caused several nodes to be marked NotReady intermittently.
What Happened
A network switch port was flapping (its link repeatedly went down and came back up), so the affected nodes periodically failed to deliver heartbeats to the control plane.
Diagnosis Steps
1. Node status flapped between Ready and NotReady.
2. Checked kernel link messages with dmesg and NIC link state with ethtool.
3. Observed link flaps in the switch logs.
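The log check in step 2 can be sketched as a small Python helper that counts link up/down messages per interface in kernel log text. This is a minimal sketch, not the tooling used in the incident: the function name `count_link_events` is hypothetical, and the assumed message shape ("<iface>: ... Link is Up/Down") varies by NIC driver, so the pattern would likely need adjusting.

```python
import re
from collections import Counter

# Assumption: link messages look like "<iface>: ... Link is Up" / "Link is Down".
# Wording differs between NIC drivers; adapt the pattern to the logs at hand.
LINK_EVENT = re.compile(
    r"\b(?P<iface>[A-Za-z0-9_.-]+): .*?Link is (?P<state>Up|Down)\b",
    re.IGNORECASE,
)


def count_link_events(log_text):
    """Return a Counter keyed by (interface, "up" | "down")."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINK_EVENT.search(line)
        if match:
            key = (match.group("iface"), match.group("state").lower())
            counts[key] += 1
    return counts
```

Feeding it the captured output of `dmesg` gives a quick per-interface tally; many down/up pairs on one interface in a short period would be consistent with the flapping described here.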
Root Cause
A faulty cable or switch port caused intermittent loss of link, so node heartbeats did not reach the control plane and the nodes were marked NotReady.
Fix/Workaround
• Replaced cable and switch port.
• Set up redundant bonding with failover.
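Whether a bond actually has a healthy standby link can be checked from the Linux bonding driver's status file, `/proc/net/bonding/<bond>`. The following is a rough sketch under stated assumptions: the helper names `parse_bond_status` and `has_failover` are hypothetical, and only a few of the file's fields are read.

```python
def parse_bond_status(text):
    """Summarize the text of /proc/net/bonding/<bond> as a dict."""
    summary = {"mode": None, "active_slave": None, "slaves": {}}
    current = None  # per-slave dict once a "Slave Interface" line is seen
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            summary["mode"] = value
        elif key == "Currently Active Slave":
            summary["active_slave"] = value
        elif key == "Slave Interface":
            current = summary["slaves"].setdefault(value, {})
        elif current is not None and key == "MII Status":
            current["mii_status"] = value
        elif current is not None and key == "Link Failure Count":
            current["link_failures"] = int(value)
    return summary


def has_failover(summary):
    """True when at least two member links report MII status "up"."""
    up = [name for name, info in summary["slaves"].items()
          if info.get("mii_status") == "up"]
    return len(up) >= 2
```

A non-zero `link_failures` value on one member, or `has_failover` returning False, would suggest the bond is running without redundancy and that the affected cable or port deserves a look.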
Lessons Learned
Physical-layer faults can surface as intermittent NotReady nodes, so check link state early when several nodes flap together.
How to Avoid
1. Monitor NIC link status and configure bonding.
2. Proactively audit switch port health.
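One possible way to monitor link status on a node is to sample the kernel's per-interface `carrier_changes` counter under `/sys/class/net/<iface>/` and flag interfaces whose counter advances quickly. This is only a sketch: the function names and the `max_changes` threshold are illustrative assumptions, and a real setup would more likely export the counter to an existing monitoring system.

```python
from pathlib import Path


def read_carrier_changes(iface, sysfs_root="/sys/class/net"):
    """Read the cumulative count of carrier (link) state changes for iface."""
    path = Path(sysfs_root) / iface / "carrier_changes"
    return int(path.read_text().strip())


def is_flapping(previous, current, max_changes=2):
    """True if the counter rose by more than max_changes between two samples.

    max_changes is an arbitrary illustrative threshold, not a recommended value.
    """
    return (current - previous) > max_changes
```

Calling `read_carrier_changes` on a fixed interval and comparing consecutive samples with `is_flapping` gives a simple per-node signal that can be correlated with NotReady events.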