Scenario #7
Cluster Management
K8s v1.20, On-prem
Node Goes NotReady Due to Clock Skew
One node dropped out of the cluster because clock skew caused TLS errors.
What Happened
TLS handshakes between the API server and a node started failing, and the node became NotReady. Investigation showed that the NTP daemon on the node was down.
Diagnosis Steps
1. Checked logs for TLS errors: “certificate expired or not yet valid”.
2. Used timedatectl to check drift: the node was 45s behind.
3. Found that the NTP service was inactive.
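The sync check from steps 2 and 3 can be sketched in Python. This is a minimal sketch that assumes the node's systemd is recent enough to support `timedatectl show`, which prints `KEY=VALUE` properties such as `NTP` and `NTPSynchronized`. The function names and the sample text are hypothetical and are not taken from the incident.

```python
from typing import Dict, List


def parse_timedatectl_show(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines in the style printed by `timedatectl show`."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that are not KEY=VALUE
            props[key.strip()] = value.strip()
    return props


def time_sync_problems(props: Dict[str, str]) -> List[str]:
    """Return human-readable findings; an empty list means no problem was seen."""
    problems: List[str] = []
    if props.get("NTP") != "yes":
        problems.append("NTP service is not active")
    if props.get("NTPSynchronized") != "yes":
        problems.append("system clock is not synchronized")
    return problems


# Hypothetical sample resembling an unsynchronized node (not real incident data).
SAMPLE_OUTPUT = """Timezone=Etc/UTC
LocalRTC=no
CanNTP=yes
NTP=no
NTPSynchronized=no
"""

for finding in time_sync_problems(parse_timedatectl_show(SAMPLE_OUTPUT)):
    print(finding)
```

On a real node, the text could come from `subprocess.run(["timedatectl", "show"], capture_output=True, text=True).stdout` instead of the sample string.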
Root Cause
Large clock skew between the node and the control plane caused certificate validity checks to fail during TLS handshakes.
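To illustrate why a lagging clock can produce a "not yet valid" error, the sketch below applies the usual validity-window check (notBefore <= now <= notAfter) with two different clocks. The timestamps, the one-year lifetime, and the function name are hypothetical; the source does not say which certificate was involved in this incident.

```python
from datetime import datetime, timedelta, timezone


def cert_time_status(now: datetime, not_before: datetime, not_after: datetime) -> str:
    """Classify a certificate's validity window against the local clock."""
    if now < not_before:
        return "not yet valid"
    if now > not_after:
        return "expired"
    return "valid"


# Hypothetical certificate issued at control-plane time, valid for one year.
issued_at = datetime(2021, 3, 1, 12, 0, 0, tzinfo=timezone.utc)
not_before = issued_at
not_after = issued_at + timedelta(days=365)

# Ten seconds after issuance by the control-plane clock...
control_plane_now = issued_at + timedelta(seconds=10)
# ...the node's clock, 45 seconds behind, still reads a time before issuance.
node_now = control_plane_now - timedelta(seconds=45)

print(cert_time_status(control_plane_now, not_before, not_after))  # valid
print(cert_time_status(node_now, not_before, not_after))           # not yet valid
```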
Fix/Workaround
• Restarted the NTP service to resynchronize the node's clock.
• Restarted kubelet once the clock was back in sync.
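The ordering matters here: time sync is restored first, and kubelet is restarted afterwards. Below is a minimal sketch of that sequence, assuming a systemd host where the time service unit is named `chronyd` (the source does not name the unit, so pass whichever one the node runs). The helper names are hypothetical. By default the script only prints what it would run.

```python
import subprocess
from typing import List


def remediation_plan(time_service: str = "chronyd") -> List[List[str]]:
    """Return the commands in the order the fix applies them."""
    return [
        ["systemctl", "restart", time_service],  # restore time sync first
        ["systemctl", "restart", "kubelet"],     # then restart kubelet
    ]


def apply_plan(plan: List[List[str]], dry_run: bool = True) -> None:
    """Print each command, or execute it when dry_run is False."""
    for argv in plan:
        if dry_run:
            print("would run:", " ".join(argv))
        else:
            # check=True stops the sequence if a command fails.
            subprocess.run(argv, check=True)


apply_plan(remediation_plan())
```

In practice it would be sensible to confirm that the clock reports as synchronized before the kubelet restart. That check is left out of this sketch.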
Lessons Learned
Clock sync is critical in TLS-based distributed systems.
How to Avoid
1. Use chronyd or systemd-timesyncd.
2. Monitor clock skew across nodes.
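One possible way to monitor skew is sketched below, assuming nodes run chrony and that the output of `chronyc tracking` is collected from each node by some existing mechanism. The parser reads the "System time" line, and a helper flags nodes whose offset exceeds a threshold. The function names, node names, sample text, and the 1-second default threshold are all hypothetical choices.

```python
import re
from typing import Dict, List, Optional

# Matches e.g. "System time     : 0.000006523 seconds slow of NTP time"
_SYSTEM_TIME = re.compile(
    r"^System time\s*:\s*([0-9.]+) seconds (fast|slow) of NTP time",
    re.MULTILINE,
)


def offset_from_chronyc_tracking(text: str) -> Optional[float]:
    """Return the signed offset in seconds (negative = behind), or None if absent."""
    match = _SYSTEM_TIME.search(text)
    if match is None:
        return None
    seconds = float(match.group(1))
    return seconds if match.group(2) == "fast" else -seconds


def nodes_over_threshold(offsets: Dict[str, float], max_skew_seconds: float = 1.0) -> List[str]:
    """Return the sorted names of nodes whose absolute offset exceeds the threshold."""
    return sorted(
        name for name, offset in offsets.items() if abs(offset) > max_skew_seconds
    )


# Hypothetical sample line for a node that is 45 seconds behind.
SAMPLE_TRACKING = "System time     : 45.000000000 seconds slow of NTP time\n"

offsets = {
    "node-a": 0.002,
    "node-b": offset_from_chronyc_tracking(SAMPLE_TRACKING) or 0.0,
}
print(nodes_over_threshold(offsets))
```

The resulting list could feed an alert in whatever monitoring system is already in use.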