Scenario #19
Cluster Management
K8s v1.22, Bare-metal, bonded NICs

Multiple Nodes Marked Unreachable Due to Flaky Network Interface

A flapping switch interface caused nodes to be marked NotReady intermittently.

What Happened

A network switch port was flapping, so node heartbeats were periodically lost before they reached the control plane.
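
To make the link between lost heartbeats and NotReady status concrete, here is a minimal sketch that estimates when a node would be marked NotReady from a list of heartbeat arrival times. It assumes a 40-second grace period, which is believed to be the kube-controller-manager default for `--node-monitor-grace-period` in this release; verify the value on your own cluster. The function name and the sample timestamps are hypothetical, and the model ignores the controller's own polling interval.

```python
from typing import List, Tuple

# Assumption: 40 s grace period (check --node-monitor-grace-period on your cluster).
GRACE_PERIOD_S = 40.0


def not_ready_windows(
    heartbeats: List[float], grace_s: float = GRACE_PERIOD_S
) -> List[Tuple[float, float]]:
    """Return (start, end) windows in which a node would be considered NotReady.

    A window opens grace_s seconds after the last heartbeat that arrived and
    closes when the next heartbeat arrives.
    """
    windows = []
    ordered = sorted(heartbeats)
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt - prev > grace_s:
            windows.append((prev + grace_s, nxt))
    return windows


# Hypothetical arrival times in seconds; the gap from 30 to 95 models a link flap.
sample_heartbeats = [0, 10, 20, 30, 95, 105, 115]
print(not_ready_windows(sample_heartbeats))
```

A short flap that ends inside the grace period produces no window at all, which is why only the longer outages showed up as NotReady.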

Diagnosis Steps
  1. Node status flapped between Ready and NotReady.
  2. Checked NIC logs via dmesg and ethtool.
  3. Observed link flaps in switch logs.
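
The dmesg check can be sketched as a small script that counts link-down events per interface. This is only an illustration: the sample lines are hypothetical, not taken from the incident, and the exact kernel message wording differs between NIC drivers, so the `"link is down"` match is an assumption to adapt.

```python
import re
from collections import Counter
from typing import Iterable


def count_link_down_events(lines: Iterable[str], ifaces: Iterable[str]) -> Counter:
    """Count kernel log lines that report a link-down event for each interface.

    Assumption: the driver logs the phrase "Link is Down" together with the
    interface name; adjust the match for your driver.
    """
    wanted = {name.lower() for name in ifaces}
    counts: Counter = Counter()
    for line in lines:
        lowered = line.lower()
        if "link is down" not in lowered:
            continue
        # Split into name-like tokens so "eno1:" still matches "eno1".
        tokens = set(re.findall(r"[\w.-]+", lowered))
        for name in wanted & tokens:
            counts[name] += 1
    return counts


# Hypothetical dmesg excerpt; interface names eno1/eno2 are placeholders.
SAMPLE_LINES = [
    "[ 1021.4] igb 0000:01:00.0 eno1: igb: eno1 NIC Link is Down",
    "[ 1024.9] igb 0000:01:00.0 eno1: igb: eno1 NIC Link is Up 1000 Mbps Full Duplex",
    "[ 1990.2] igb 0000:01:00.0 eno1: igb: eno1 NIC Link is Down",
    "[ 2001.0] igb 0000:01:00.1 eno2: igb: eno2 NIC Link is Up 1000 Mbps Full Duplex",
]
print(count_link_down_events(SAMPLE_LINES, ["eno1", "eno2"]))
```

Comparing these counts across nodes that share a switch can help separate a bad port or cable from a node-local problem.
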
Root Cause

A hardware fault in the cable or switch port caused intermittent loss of connectivity.

Fix/Workaround
• Replaced cable and switch port.
• Set up redundant bonding with failover.
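
Once bonding is in place, it helps to confirm that both members are actually healthy. The sketch below parses the text the Linux bonding driver exposes under `/proc/net/bonding/<bond>` and reports each member's `MII Status` and `Link Failure Count`. The bond name `bond0`, the interface names, and the abbreviated sample text are assumptions for illustration; the helper names are hypothetical.

```python
from typing import Dict, List


def parse_bond_status(text: str) -> Dict[str, Dict[str, str]]:
    """Map each bond member to its 'MII Status' and 'Link Failure Count'."""
    slaves: Dict[str, Dict[str, str]] = {}
    current = None
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
            slaves[current] = {}
        elif current is not None and key in ("MII Status", "Link Failure Count"):
            # Fields before the first "Slave Interface" describe the bond itself.
            slaves[current][key] = value
    return slaves


def degraded_slaves(status: Dict[str, Dict[str, str]]) -> List[str]:
    """Return members whose MII status is anything other than 'up'."""
    return sorted(n for n, f in status.items() if f.get("MII Status") != "up")


# Hypothetical, abbreviated status text for an active-backup bond.
SAMPLE_STATUS = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eno1
MII Status: up

Slave Interface: eno1
MII Status: up
Link Failure Count: 7

Slave Interface: eno2
MII Status: down
Link Failure Count: 1
"""
status = parse_bond_status(SAMPLE_STATUS)
print(status)
print(degraded_slaves(status))
```

On a real node, the input would come from reading `/proc/net/bonding/bond0` (assuming that bond name). A rising failure count on one member points at its cable or port even while the bond as a whole stays up.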
Lessons Learned

Physical-layer faults can look like node flakiness, so check links and cabling when nodes flap between Ready and NotReady.

How to Avoid
  1. Monitor NIC link status and configure bonding.
  2. Proactively audit switch port health.
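
One way to monitor link status from the node side is to sample the per-interface `carrier_changes` counter that Linux exposes under `/sys/class/net/<iface>/` and alert when it grows between samples. This is a rough sketch rather than a full monitor: the function names, the threshold, and the interface names are assumptions, and the `root` parameter exists only so the reader can point it at a test directory.

```python
from pathlib import Path
from typing import Dict, Iterable, List

SYSFS_NET = Path("/sys/class/net")


def read_carrier_changes(ifaces: Iterable[str], root: Path = SYSFS_NET) -> Dict[str, int]:
    """Read the cumulative carrier transition counter for each interface."""
    return {
        name: int((root / name / "carrier_changes").read_text().strip())
        for name in ifaces
    }


def flapping_interfaces(
    before: Dict[str, int], after: Dict[str, int], threshold: int = 2
) -> List[str]:
    """Return interfaces whose counter grew by at least `threshold`.

    The counter increments on both down and up transitions, so the assumed
    default of 2 corresponds to one complete flap between the two samples.
    """
    return sorted(
        name
        for name in after
        if after[name] - before.get(name, after[name]) >= threshold
    )
```

A typical use would be to call `read_carrier_changes(["eno1", "eno2"])` on a schedule, keep the previous result, and pass both snapshots to `flapping_interfaces`; the interface names here are placeholders.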