Back to all scenarios
Scenario #182
Networking
K8s v1.19, GKE
Sporadic DNS Failures Due to Resource Contention in CoreDNS Pods
Sporadic DNS resolution failures occurred due to resource contention in CoreDNS pods, which were not allocated enough CPU resources.
Find this helpful?
What Happened
CoreDNS pods were experiencing sporadic failures due to high CPU utilization. DNS resolution intermittently failed during peak load times.
Diagnosis Steps
- 1Used kubectl top pod to monitor resource usage and found that CoreDNS pods were CPU-bound.
- 2Monitored DNS query logs and found a correlation between high CPU usage and DNS resolution failures.
Root Cause
CoreDNS pods were not allocated sufficient CPU resources to handle the DNS query load during peak times.
Fix/Workaround
• Increased CPU resource requests and limits for CoreDNS pods.
• Enabled horizontal pod autoscaling for CoreDNS to scale during high demand.
Lessons Learned
CoreDNS should be adequately resourced, and autoscaling should be enabled to handle varying DNS query loads.
How to Avoid
- 1Set proper resource requests and limits for CoreDNS.
- 2Implement autoscaling for DNS services based on real-time load.