Back to all scenarios
Scenario #184
Networking
K8s v1.20, On-premise
Service Discovery Issues Due to DNS Cache Staleness
Service discovery failed due to stale DNS cache entries that were not updated when services changed IPs.
Find this helpful?
What Happened
The DNS resolver cached the old IP addresses for services, causing service discovery failures when the IPs of the services changed.
Diagnosis Steps
- 1Used kubectl exec to verify DNS cache entries.
- 2Observed that the cached IPs were outdated and did not reflect the current service IPs.
Root Cause
The DNS cache was not being properly refreshed, causing stale DNS entries.
Fix/Workaround
• Cleared the DNS cache manually and implemented shorter TTL (Time-To-Live) values for DNS records.
• Restarted CoreDNS pods to apply changes.
Lessons Learned
Ensure that DNS TTL values are appropriately set to avoid stale cache issues.
How to Avoid
- 1Regularly monitor DNS cache and refresh TTL values to ensure up-to-date resolution.
- 2Implement a caching strategy that works well with Kubernetes service discovery.