Kubernetes Errors
20 error patterns
Pod in CrashLoopBackOff state
CrashLoopBackOff
- Check pod logs: kubectl logs pod-name --previous
- Verify the container entrypoint/command is correct
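The two checks above can be combined in a small shell helper. This is a minimal sketch: the function name crashloop_inspect and the "default" namespace fallback are assumptions made for illustration, not part of kubectl.

```shell
# Hypothetical helper: show why a crash-looping pod's last container run ended.
# Usage: crashloop_inspect <pod-name> [namespace]
crashloop_inspect() {
  local pod="$1" ns="${2:-default}"
  # Logs from the previous (crashed) container instance
  kubectl logs "$pod" -n "$ns" --previous
  # Exit code recorded for the last terminated run of each container
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{": exit code "}{.lastState.terminated.exitCode}{"\n"}{end}'
}
```

A non-zero exit code together with an immediate exit in the logs often points at a wrong entrypoint or command rather than an application bug.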
Cannot pull container image
ImagePullBackOff|ErrImagePull
- Verify the image name and tag exist in the registry
- Check imagePullSecrets is configured for private registries
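For a private registry, one possible way to set up imagePullSecrets is sketched below. The helper name add_pull_secret and its argument order are assumptions; attaching the secret to the "default" service account is only one option, and the secret can instead be listed under imagePullSecrets in the pod spec.

```shell
# Hypothetical helper: create a registry credential and attach it to the
# "default" service account of a namespace.
# Usage: add_pull_secret <secret-name> <registry-server> <username> <password> [namespace]
add_pull_secret() {
  local name="$1" server="$2" user="$3" pass="$4" ns="${5:-default}"
  kubectl create secret docker-registry "$name" -n "$ns" \
    --docker-server="$server" \
    --docker-username="$user" \
    --docker-password="$pass"
  # Pods created afterwards with this service account pull using the secret
  kubectl patch serviceaccount default -n "$ns" \
    -p "{\"imagePullSecrets\": [{\"name\": \"$name\"}]}"
}
```

Passing a password as a command-line argument leaves it in shell history, so in practice it is usually read from a file or an environment variable.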
Container killed due to out-of-memory
OOMKilled
- Increase memory limits in the pod spec: resources.limits.memory
- Profile the application to find memory leaks
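If the workload is managed by a Deployment, the memory limit can be raised without editing the manifest by hand. A minimal sketch, assuming a Deployment-managed pod; the helper name raise_memory_limit and the example value are hypothetical.

```shell
# Hypothetical helper: raise resources.limits.memory on every container
# of a deployment.
# Usage: raise_memory_limit <deployment> <limit, e.g. 512Mi> [namespace]
raise_memory_limit() {
  local deploy="$1" limit="$2" ns="${3:-default}"
  # Changing the pod template starts a new rollout of the deployment
  kubectl set resources deployment "$deploy" -n "$ns" --limits=memory="$limit"
}
```

For example, raise_memory_limit api 512Mi would apply a 512Mi limit in the default namespace. Raising the limit only hides a leak, so profiling is still worthwhile.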
Pod stuck in Pending due to insufficient resources
Pending.*Insufficient (cpu|memory)|FailedScheduling.*Insufficient
- Scale up the cluster or add nodes with more resources
- Reduce resource requests in the pod spec
Service not found or has no endpoints
service.*not found|no endpoints available
- Verify the service exists: kubectl get svc -n namespace
- Check service selector matches pod labels exactly
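Comparing the selector with the pod labels is easier when both are printed side by side. The helper below is a sketch under that idea; its name, check_service_endpoints, is an assumption.

```shell
# Hypothetical helper: show a service's selector, the labels on pods in the
# same namespace, and the endpoints the service currently resolves to.
# Usage: check_service_endpoints <service> [namespace]
check_service_endpoints() {
  local svc="$1" ns="${2:-default}"
  # Selector the service uses to find pods
  kubectl get svc "$svc" -n "$ns" -o jsonpath='{.spec.selector}{"\n"}'
  # Labels actually present on the pods
  kubectl get pods -n "$ns" --show-labels
  # No addresses listed here means no Ready pod currently matches the selector
  kubectl get endpoints "$svc" -n "$ns"
}
```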
ConfigMap referenced but not found
configmap.*not found|Error.*configmaps.*not found
- Create the ConfigMap: kubectl create configmap name --from-file=path
- Check the ConfigMap name and namespace match the pod spec
Ingress returning 404 for requests
ingress.*404|default backend - 404
- Verify ingress rules match the request host and path
- Check the backend service and port exist and have endpoints
TLS certificate or cert-manager failure
certificate.*not found|tls.*secret.*not found|cert-manager.*Failed
- Check Certificate resource status: kubectl describe certificate name
- Verify cert-manager is running and ClusterIssuer is configured
Volume mount failed - PVC not found
MountVolume.*failed.*not found|PersistentVolumeClaim.*not found
- Create the PersistentVolumeClaim before deploying the pod
- Verify PVC name in pod spec matches the actual PVC name
Cannot connect to Kubernetes API server
connection refused.*:6443|Unable to connect to the server
- Check that the kubelet is running on the control-plane node: systemctl status kubelet
- Verify kubeconfig points to the correct server: kubectl config view
Health check probe failure
Readiness probe failed|Liveness probe failed
- Verify the probe endpoint returns a success status (HTTP 200 to 399) when the app is healthy
- Increase initialDelaySeconds if the app needs more startup time
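The startup delay can be changed in place with a JSON patch. This is a sketch that assumes the probe in question is a readinessProbe already defined on the first container of a Deployment; the helper name set_probe_delay is hypothetical.

```shell
# Hypothetical helper: set readinessProbe.initialDelaySeconds on the first
# container of a deployment. Assumes that container already has a
# readinessProbe; otherwise the patch is rejected.
# Usage: set_probe_delay <deployment> <seconds> [namespace]
set_probe_delay() {
  local deploy="$1" seconds="$2" ns="${3:-default}"
  kubectl patch deployment "$deploy" -n "$ns" --type=json \
    -p "[{\"op\": \"add\", \"path\": \"/spec/template/spec/containers/0/readinessProbe/initialDelaySeconds\", \"value\": $seconds}]"
}
```

For a liveness probe, the same patch applies with livenessProbe in the path.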
RBAC permission denied
forbidden.*User.*cannot.*resource
- Create a RoleBinding/ClusterRoleBinding for the user/service account
- Check which permissions are currently granted: kubectl auth can-i --list
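As an illustration of the binding step, the sketch below grants a service account read access to pods and then verifies it. The role name pod-reader, the chosen verbs, and the helper name grant_pod_reader are assumptions; the actual role should match whatever the forbidden message says is missing.

```shell
# Hypothetical helper: give a service account read-only access to pods in
# its namespace, then check the result.
# Usage: grant_pod_reader <service-account> [namespace]
grant_pod_reader() {
  local sa="$1" ns="${2:-default}"
  kubectl create role pod-reader -n "$ns" --verb=get,list,watch --resource=pods
  kubectl create rolebinding "${sa}-pod-reader" -n "$ns" \
    --role=pod-reader --serviceaccount="${ns}:${sa}"
  # Impersonate the service account to confirm the permission is in effect
  kubectl auth can-i list pods -n "$ns" --as="system:serviceaccount:${ns}:${sa}"
}
```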
Container repeatedly restarting
Back-off restarting failed container|restartCount:\s*[5-9]|restartCount:\s*\d{2,}
- Check previous container logs: kubectl logs pod --previous
- Inspect exit code: kubectl describe pod name (look for Exit Code)
HPA unable to fetch metrics
error.*Horizontal Pod Autoscaler.*unable to.*metric
- Verify metrics-server is running: kubectl get pods -n kube-system
- Check that resource requests are set on the deployment (required for CPU HPA)
Network policy blocking pod communication
NetworkPolicy.*denied|connection timed out.*between pods
- Check NetworkPolicy rules: kubectl get networkpolicy -n namespace
- Add an ingress/egress rule allowing traffic from the source pod's labels
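An ingress rule of this kind might look like the sketch below. It assumes both pods live in the same namespace and are identified by an "app" label; the helper name allow_from_label and the generated policy name are hypothetical.

```shell
# Hypothetical helper: allow ingress to pods labelled app=<target> from pods
# labelled app=<source> in the same namespace.
# Usage: allow_from_label <namespace> <target-app> <source-app>
allow_from_label() {
  local ns="$1" target="$2" src="$3"
  kubectl apply -n "$ns" -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-${src}-to-${target}
spec:
  podSelector:
    matchLabels:
      app: ${target}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: ${src}
EOF
}
```

If the source namespace also has a default-deny egress policy, a matching egress rule is needed on that side as well.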
Secret referenced but not found in namespace
secret.*not found|Error.*secrets.*not found
- Create the secret: kubectl create secret generic name --from-literal=key=value
- Verify secret exists in the same namespace as the pod
Deployment rollout stuck/timed out
Deployment.*exceeded its progress deadline|ProgressDeadlineExceeded
- Check the new pods to see why they are not Ready: kubectl get pods -l app=name
- Increase progressDeadlineSeconds if startup is legitimately slow
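Both steps can be scripted. The sketch below uses hypothetical helper names (rollout_debug, extend_progress_deadline), and the 60-second wait is an arbitrary choice.

```shell
# Hypothetical helper: wait briefly for a rollout, then print the
# deployment's conditions and events.
# Usage: rollout_debug <deployment> [namespace]
rollout_debug() {
  local deploy="$1" ns="${2:-default}"
  kubectl rollout status "deployment/$deploy" -n "$ns" --timeout=60s
  kubectl describe deployment "$deploy" -n "$ns"
}

# Hypothetical helper: raise spec.progressDeadlineSeconds.
# Usage: extend_progress_deadline <deployment> <seconds> [namespace]
extend_progress_deadline() {
  local deploy="$1" seconds="$2" ns="${3:-default}"
  kubectl patch deployment "$deploy" -n "$ns" \
    -p "{\"spec\": {\"progressDeadlineSeconds\": $seconds}}"
}
```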
Pod waiting for PersistentVolume to be bound
pod has unbound immediate PersistentVolumeClaims|waiting for.*volume
- Check if a PV matching the PVC exists: kubectl get pv
- Verify StorageClass provisioner is working
Admission webhook rejected the request
error.*admission webhook.*denied the request
- Check the webhook error message for specific policy violations
- Review the ValidatingWebhookConfiguration for the blocking webhook
Node at pod capacity limit
too many pods.*node|DaemonSet.*nodes.*misscheduled
- Increase max-pods in kubelet configuration
- Add more nodes to the cluster
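The patterns in this list are regular expressions, so they can be applied to log or event text directly. The sketch below shows one possible way to do that with grep -E for a handful of them. The function name classify_k8s_error is hypothetical, and case-sensitive matching is an assumption about how the patterns are meant to be used. Patterns containing \d (such as the restartCount one) would need a regex engine that supports that escape.

```shell
# Hypothetical classifier: map one line of log or event text to the
# description of the first matching pattern from the list above.
# Usage: classify_k8s_error "<line of text>"
classify_k8s_error() {
  local line="$1"
  if printf '%s\n' "$line" | grep -Eq 'CrashLoopBackOff'; then
    echo "Pod in CrashLoopBackOff state"
  elif printf '%s\n' "$line" | grep -Eq 'ImagePullBackOff|ErrImagePull'; then
    echo "Cannot pull container image"
  elif printf '%s\n' "$line" | grep -Eq 'OOMKilled'; then
    echo "Container killed due to out-of-memory"
  elif printf '%s\n' "$line" | grep -Eq 'Readiness probe failed|Liveness probe failed'; then
    echo "Health check probe failure"
  elif printf '%s\n' "$line" | grep -Eq 'forbidden.*User.*cannot.*resource'; then
    echo "RBAC permission denied"
  else
    echo "unknown"
  fi
}
```

Because the first match wins, more specific patterns should be tested before broader ones if the list is extended.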