Scenario #27
Cluster Management
K8s v1.21, on-prem, private registry

Node Bootstrap Failure Due to Unavailable Container Registry

New nodes failed to join the cluster because the container runtime timed out while pulling base images.

What Happened

The internal Docker registry was down during node provisioning, so containerd couldn't pull the pause and CNI images. The new nodes stayed in the NotReady state.

Diagnosis Steps
  1. journalctl -u containerd: repeated image pull failures.
  2. Node conditions showed ContainerRuntimeNotReady.
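
The diagnosis steps above can be sketched as a small helper plus a few commands. This is a minimal sketch, not the exact procedure used in the incident: count_pull_failures is a hypothetical helper name, and the grep pattern is an assumption about how pull errors are worded, so adjust it to the messages your containerd version actually logs.

```shell
#!/usr/bin/env bash
# Count log lines on stdin that look like image pull failures.
# The pattern is an assumption; tune it to your containerd log format.
count_pull_failures() {
  # grep -c prints the number of matching lines; "|| true" keeps the
  # exit status zero when there are no matches (grep would return 1).
  grep -ciE 'failed to pull|pull.*(timeout|timed out)|failed to resolve' || true
}

# Usage on an affected node (not executed by this snippet):
#   journalctl -u containerd --no-pager --since "1 hour ago" | count_pull_failures
#
# From a machine with cluster access, inspect the node's Conditions section:
#   kubectl get nodes
#   kubectl describe node NODE_NAME
```

A non-zero count on a freshly provisioned node, combined with a not-ready condition in the kubectl describe output, points toward the registry rather than the kubelet itself.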
Root Cause

The node bootstrap process depended on pulling the pause and CNI images from the internal registry, which was unavailable at provisioning time.

Fix/Workaround
• Brought the internal registry back online.
• Pre-pulled the pause and CNI images into the node image templates.
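
The pre-pull step might look roughly like the sketch below. It assumes crictl is installed and configured for containerd on the host used to build the node template. The function names image_refs and prepull_images are hypothetical, and the registry host and image names in the usage comment are placeholders, not values from this incident.

```shell
#!/usr/bin/env bash
# Print one fully qualified image reference per line.
# $1 is the registry host; remaining arguments are image:tag names.
image_refs() {
  local registry="$1"
  shift
  local image
  for image in "$@"; do
    printf '%s/%s\n' "$registry" "$image"
  done
}

# Pull every reference into containerd through the CRI,
# stopping at the first failure so a broken template is not published.
prepull_images() {
  local ref
  for ref in $(image_refs "$@"); do
    crictl pull "$ref" || return 1
  done
}

# Usage while building the template (placeholder registry and tags):
#   prepull_images registry.internal.example:5000 pause:3.4.1 my-cni:1.0
```

Running this while the registry is healthy and then snapshotting the machine means new nodes already hold the images they need to start, so a later registry outage no longer blocks bootstrap.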
Lessons Learned

Registry availability is a bootstrap dependency.

How to Avoid
  1. Preload all essential images (pause, CNI) into the node base image, such as a VM template or AMI.
  2. Monitor registry uptime independently of the cluster.
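
Independent registry monitoring can be as simple as probing the registry's HTTP API from a host outside the cluster. The sketch below assumes the registry implements the Docker Registry HTTP API V2, where a GET on /v2/ returns 200, or 401 when authentication is required. The function names and the URL in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Map the HTTP status code returned by GET /v2/ to a health verdict.
registry_status() {
  case "$1" in
    200|401) echo "up" ;;    # 401: the registry answered but wants credentials
    *)       echo "down" ;;  # includes 000, which curl reports when no response arrived
  esac
}

# Probe the registry base URL given as $1 with a 5 second timeout.
probe_registry() {
  local code
  code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$1/v2/") || code=000
  registry_status "$code"
}

# Usage from cron or a monitoring host (placeholder URL):
#   probe_registry https://registry.internal.example:5000
```

Because the probe runs outside the cluster, it keeps reporting even when the outage it detects is the reason cluster nodes cannot start.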