Scenario #7
Cluster Management
K8s v1.20, On-prem

Node Goes NotReady Due to Clock Skew

One node dropped out of the cluster because clock skew caused TLS errors.

What Happened

TLS handshakes between the API server and a node started failing, and the node became NotReady. Investigation showed that the NTP daemon on the node was down.

Diagnosis Steps
  1. Checked logs for TLS errors: “certificate expired or not yet valid”.
  2. Used timedatectl to check drift; the node was 45s behind.
  3. Found that the NTP service was inactive.
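
The check in steps 2 and 3 can be scripted. The sketch below is a minimal example, assuming a systemd host where `timedatectl show` prints `KEY=VALUE` properties such as `NTP` and `NTPSynchronized`. The function names and the sample text are illustrative and not taken from the incident.

```python
import subprocess


def parse_timedatectl_show(output: str) -> dict:
    """Parse the KEY=VALUE lines printed by `timedatectl show`."""
    props = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that carry no '='
            props[key.strip()] = value.strip()
    return props


def clock_sync_problems(props: dict) -> list:
    """Return human-readable findings; an empty list means no problem seen."""
    problems = []
    if props.get("NTP") != "yes":
        problems.append("NTP service is not active")
    if props.get("NTPSynchronized") != "yes":
        problems.append("system clock is not synchronized")
    return problems


def read_timedatectl() -> dict:
    """Run timedatectl on the local node (requires systemd)."""
    result = subprocess.run(
        ["timedatectl", "show"], capture_output=True, text=True, check=True
    )
    return parse_timedatectl_show(result.stdout)


# Illustrative sample, not captured from the affected node.
SAMPLE = "Timezone=UTC\nCanNTP=yes\nNTP=no\nNTPSynchronized=no\n"
print(clock_sync_problems(parse_timedatectl_show(SAMPLE)))
```

On a real node, `clock_sync_problems(read_timedatectl())` would run the same check against live output.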
Root Cause

A large clock skew between the node and the control plane caused TLS certificate validity checks to fail, so handshakes were rejected.
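
To see why a lagging clock can produce “certificate expired or not yet valid”, consider the time-window check a TLS peer performs. The sketch below is a simplified model of that check. It assumes a hypothetical certificate issued ten seconds before the handshake; the source does not say which certificate was rejected, so the timestamps are illustrative only.

```python
from datetime import datetime, timedelta, timezone


def cert_time_check(now: datetime, not_before: datetime, not_after: datetime) -> str:
    """Classify a certificate against the local clock, as a TLS peer would."""
    if now < not_before:
        return "not yet valid"
    if now > not_after:
        return "expired"
    return "valid"


# Hypothetical values: a certificate issued 10 s before the handshake.
true_now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
not_before = true_now - timedelta(seconds=10)
not_after = not_before + timedelta(days=365)

# The node's clock lags the control plane by 45 s, as in this scenario.
node_clock = true_now - timedelta(seconds=45)

print(cert_time_check(true_now, not_before, not_after))    # valid
print(cert_time_check(node_clock, not_before, not_after))  # not yet valid
```

The certificate itself is fine. Only the node's view of the current time places it outside the validity window.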

Fix/Workaround
• Restarted the NTP service so the node clock could resynchronize.
• Restarted kubelet once the clock was back in sync.
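
The order of these steps matters, since kubelet should reconnect only after the clock is correct. A minimal sketch of the sequence follows, assuming the node uses chrony under a systemd unit named `chronyd` (the unit name differs between distributions, and the source does not name the daemon involved). It defaults to a dry run that only prints the commands.

```python
import subprocess


def remediation_commands(ntp_unit: str = "chronyd") -> list:
    """Build the ordered command list; ntp_unit is an assumed unit name."""
    return [
        ["systemctl", "restart", ntp_unit],   # bring time sync back up
        ["chronyc", "makestep"],              # step the clock now instead of slewing
        ["systemctl", "restart", "kubelet"],  # reconnect with a correct clock
    ]


def run_steps(commands: list, dry_run: bool = True) -> None:
    """Print each command, or execute it when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop at the first failure


run_steps(remediation_commands())  # dry run: prints the three commands
```

Calling `run_steps(remediation_commands(), dry_run=False)` would execute the commands and requires root on the affected node.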
Lessons Learned

Clock sync is critical in TLS-based distributed systems.

How to Avoid
  1. Use chronyd or systemd-timesyncd.
  2. Monitor clock skew across nodes.
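
As one way to monitor skew, the sketch below parses the `System time` line of `chronyc tracking` output and flags nodes whose offset exceeds a threshold. How the output is collected from each node is left out, and the 1-second threshold and the node names are assumptions chosen for illustration.

```python
import re

# Matches e.g. "System time     : 0.000012345 seconds slow of NTP time"
_SYSTEM_TIME = re.compile(
    r"^System time\s*:\s*([0-9.]+) seconds (fast|slow) of NTP time",
    re.MULTILINE,
)


def offset_seconds(tracking_output: str) -> float:
    """Return the signed offset: positive if fast, negative if slow."""
    match = _SYSTEM_TIME.search(tracking_output)
    if match is None:
        raise ValueError("no 'System time' line found")
    value = float(match.group(1))
    return value if match.group(2) == "fast" else -value


def nodes_over_threshold(offsets: dict, max_skew: float = 1.0) -> list:
    """List nodes whose absolute offset exceeds max_skew seconds."""
    return sorted(node for node, off in offsets.items() if abs(off) > max_skew)


# Illustrative data; node names and offsets are hypothetical.
sample = "System time     : 45.000000000 seconds slow of NTP time\n"
offsets = {"node-a": 0.002, "node-b": offset_seconds(sample)}
print(nodes_over_threshold(offsets))  # ['node-b']
```

In practice the result would feed an alerting system, so a stopped time daemon is noticed before TLS handshakes start failing.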