Scenario #160
Networking
K8s v1.21, AWS EKS

Network Performance Degradation Due to Overloaded CNI Plugin

Network performance degraded due to the CNI plugin being overwhelmed by high traffic volume.

What Happened

A sudden spike in traffic caused the CNI plugin to become overloaded, resulting in significant packet loss and network latency between pods.

Diagnosis Steps
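  1. Checked pod resource usage with kubectl top pods and observed unusually high traffic to and from a few specific pods. Because kubectl top pods reports only CPU and memory, the traffic figures have to come from network metrics such as the cAdvisor counters container_network_receive_bytes_total and container_network_transmit_bytes_total.
  2. Inspected the CNI plugin logs and found errors related to resource exhaustion.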
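
The first diagnosis step can be sketched in Python as follows. This is a minimal illustration, not the tooling used in the incident: it assumes two scrapes of kubelet cAdvisor metrics in Prometheus text format, taken a known number of seconds apart, and the helper names bytes_per_pod and top_talkers are hypothetical.

```python
import re
from collections import defaultdict

# Matches one sample line from the Prometheus text format, e.g.
# container_network_receive_bytes_total{interface="eth0",pod="web-1"} 1200 1690000000000
SAMPLE = re.compile(
    r'^(container_network_(?:receive|transmit)_bytes_total)\{(.*)\}\s+(\S+)'
)
POD_LABEL = re.compile(r'(?:^|,)pod="([^"]*)"')


def bytes_per_pod(scrape_text):
    """Sum receive and transmit byte counters per pod for one scrape."""
    totals = defaultdict(float)
    for line in scrape_text.splitlines():
        sample = SAMPLE.match(line.strip())
        if not sample:
            continue  # comments, blank lines, unrelated metrics
        pod = POD_LABEL.search(sample.group(2))
        if pod and pod.group(1):
            totals[pod.group(1)] += float(sample.group(3))
    return dict(totals)


def top_talkers(earlier, later, interval_seconds, limit=3):
    """Return (pod, bytes per second) pairs, busiest first, between two scrapes."""
    before, after = bytes_per_pod(earlier), bytes_per_pod(later)
    rates = {
        pod: (after[pod] - before[pod]) / interval_seconds
        for pod in after
        if pod in before and after[pod] >= before[pod]  # skip counter resets
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)[:limit]
```

One way to obtain the input text is kubectl get --raw /api/v1/nodes/<node>/proxy/metrics/cadvisor, run twice against the same node. A cluster with Prometheus already scraping these counters would make the script unnecessary.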
Root Cause

The CNI plugin lacked sufficient resources to handle the spike in traffic, leading to packet loss and network degradation.

Fix/Workaround
• Increased resource limits for the CNI plugin pods.
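• Used network policies to restrict which sources could send traffic to the affected services. A NetworkPolicy allows or denies connections rather than throttling them, so it limits who can contribute to a spike, not the rate itself.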
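
The resource-limit change can be sketched as a strategic merge patch generated in Python. The write-up does not name the CNI plugin or the values used, so everything specific here is an assumption: the container name aws-node (the Amazon VPC CNI DaemonSet on EKS), the CPU and memory figures, and the hypothetical helper cni_resource_patch.

```python
import json


def cni_resource_patch(container, cpu_request, memory_request, cpu_limit, memory_limit):
    """Build a strategic merge patch that sets resources on one DaemonSet container."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            # Merge key: must match the existing container name.
                            "name": container,
                            "resources": {
                                "requests": {"cpu": cpu_request, "memory": memory_request},
                                "limits": {"cpu": cpu_limit, "memory": memory_limit},
                            },
                        }
                    ]
                }
            }
        }
    }


if __name__ == "__main__":
    # Assumed values for illustration only; size them from observed peak usage.
    patch = cni_resource_patch("aws-node", "100m", "128Mi", "500m", "256Mi")
    print(json.dumps(patch, indent=2))
```

Saved to patch.json, the output could be applied with kubectl patch daemonset aws-node -n kube-system --patch-file patch.json, assuming the VPC CNI is the plugin in use.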
Lessons Learned

Ensure that the CNI plugin is properly sized to handle peak traffic loads, and monitor its health regularly.

How to Avoid
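  1. Set up traffic rate limiting to prevent sudden spikes from overwhelming the network.
  2. Set resource requests and limits for critical CNI components. Node-level CNI agents usually run as a DaemonSet (one pod per node), which the Horizontal Pod Autoscaler cannot scale, so horizontal pod autoscaling applies only to Deployment-based components.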
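
The write-up does not say where or how rate limiting should be applied. As one common approach, a token bucket admits short bursts while capping the sustained rate. The sketch below is a generic Python illustration with a hypothetical TokenBucket class, not a Kubernetes or CNI feature; in practice the same idea is usually configured in an ingress controller, proxy, or service mesh.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1.0):
        """Return True and spend tokens if the request fits, else False."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill for the time that has passed, never beyond the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A caller would create one bucket per client or service, for example TokenBucket(rate=100, capacity=200), and reject or queue a request whenever allow() returns False.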