Kubernetes Pod Crashes
Pod crashes in Kubernetes follow the same diagnostic path as Docker container crashes, with some Kubernetes-specific additions: restart policies and backoff timing, liveness and readiness probes, and the distinction between container crashes (application errors) and pod scheduling failures (resource or taint issues). The first tool to reach for is kubectl describe pod: its Events section shows the recent event history, including why a container was killed and what Kubernetes did about it.
Pod crash states
Pod crash timeline (CrashLoopBackOff):
Pod created → container starts → crashes (exit code != 0)
Kubernetes: restart container (immediately)
Container starts → crashes (exit code != 0)
Kubernetes: wait 10s, then restart
Container starts → crashes
Kubernetes: wait 20s, then restart (doubling backoff)
Container starts → crashes
Kubernetes: wait 40s, then restart
... (maximum backoff: 5 minutes)
STATUS: CrashLoopBackOff ← Container keeps failing, Kubernetes backing off
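The doubling pattern in the timeline above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `backoff_delays` is hypothetical, and the 10-second base and 300-second cap are taken from the timeline rather than read from a live cluster.

```python
from itertools import islice


def backoff_delays(base=10, cap=300):
    """Yield the wait (in seconds) before each restart: base, 2*base, ... capped at cap."""
    delay = base
    while True:
        yield min(delay, cap)
        # Double the delay for the next crash, never exceeding the cap.
        delay = min(delay * 2, cap)


# First seven waits after the initial immediate restart.
print(list(islice(backoff_delays(), 7)))
```

Under these assumptions the wait reaches the 5-minute ceiling on the sixth delayed restart, which is why a pod that has been crashing for a while appears to sit idle for long stretches.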
To check current backoff state:
kubectl get pod mypod → the RESTARTS column shows how many restart attempts have been made
CrashLoopBackOff diagnosis
# Step 1: get the detailed event history:
kubectl describe pod mypod-abc123
kubectl describe pod Events section
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 5m default-scheduler Successfully assigned default/mypod
Normal Pulling 5m kubelet Pulling image "myapp:latest"
Normal Started 4m55s kubelet Started container myapp
Warning BackOff 4m (x12 over 4m) kubelet Back-off restarting failed container
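# "x12 over 4m" = the BackOff event was recorded 12 times in 4 minutes (an event count, not the restart count)
# The restart count itself is in the RESTARTS column of kubectl get pod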
# Step 2: get logs from the PREVIOUS (crashed) container instance:
kubectl logs mypod-abc123 --previous # Critical: --previous shows crash logs
# Step 3: get current container logs (may just show startup):
kubectl logs mypod-abc123
# Step 4: check exit code:
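# The Python script is wrapped in single quotes so the shell leaves its double quotes alone
kubectl get pod mypod-abc123 -o json | python3 -c '
import json, sys
pod = json.load(sys.stdin)
# lastState.terminated holds details of the previous (crashed) container instance
for c in pod["status"].get("containerStatuses", []):
    s = c.get("lastState", {}).get("terminated", {})
    if s:
        print("Container:", c["name"])
        print("Exit code:", s.get("exitCode"))
        print("Reason:", s.get("reason"))
'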
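Once you have the exit code from Step 4, it helps to know how to read it. By shell convention, codes above 128 mean the process was terminated by a signal whose number is the code minus 128. The sketch below illustrates that rule; the helper name `describe_exit_code` and the small signal table are illustrative assumptions (signal numbers as on Linux), not part of kubectl.

```python
# Common Linux signal numbers; a deliberately small, illustrative table.
SIGNAL_NAMES = {6: "SIGABRT", 9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def describe_exit_code(code):
    """Return a short human-readable reading of a container exit code."""
    if code == 0:
        return "exited cleanly"
    if code > 128:
        signum = code - 128
        name = SIGNAL_NAMES.get(signum, f"signal {signum}")
        return f"killed by {name}"
    return "application error"


for code in (0, 1, 137, 143):
    print(code, describe_exit_code(code))
```

Exit code 137 (128 + 9, SIGKILL) is the value that typically accompanies an OOMKilled reason, while 143 (128 + 15, SIGTERM) usually indicates the container was asked to stop.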
OOMKilled in Kubernetes
# OOMKilled = container exceeded its memory limit:
kubectl describe pod mypod-abc123 | grep -A5 "Last State"
kubectl describe Last State showing OOMKilled
Last State: Terminated
Reason: OOMKilled ← Memory limit exceeded
Exit Code: 137
Started: Mon, 09 Jun 2025 14:30:05 +0000
Finished: Mon, 09 Jun 2025 14:32:18 +0000
# Check current memory usage vs limit:
kubectl top pod mypod-abc123
# NAME CPU(cores) MEMORY(bytes)
# mypod-abc123 50m 480Mi
# If the pod spec says limits.memory: 512Mi and usage is 480Mi = near limit
# Increase memory limit in the Deployment:
kubectl edit deployment myapp
# Under spec.template.spec.containers[0].resources.limits:
# memory: "1Gi" (increase from 512Mi)
# Or patch directly:
kubectl patch deployment myapp -p '{"spec":{"template":{"spec":{"containers":[{"name":"myapp","resources":{"limits":{"memory":"1Gi"}}}]}}}}'
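To judge how close a pod is to its limit, the kubectl top reading can be compared with limits.memory numerically. The following is a minimal sketch, assuming both values use the binary suffixes Ki, Mi, or Gi; the helper names `to_bytes` and `memory_usage_ratio` are hypothetical, and other Kubernetes quantity formats (such as decimal suffixes) are not handled.

```python
# Binary suffixes only; other Kubernetes quantity formats are out of scope here.
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def to_bytes(quantity):
    """Convert a quantity string such as '512Mi' into a byte count."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # no suffix: treat as a plain byte count


def memory_usage_ratio(usage, limit):
    """Return the fraction of the memory limit currently in use."""
    return to_bytes(usage) / to_bytes(limit)


ratio = memory_usage_ratio("480Mi", "512Mi")
print(f"{ratio:.0%} of the limit in use")
```

With the figures from the example above (480Mi used against a 512Mi limit) the ratio is about 94%, which leaves very little room for a spike before the kernel OOM killer steps in.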
Liveness and readiness probe failures
# Liveness probe failure → Kubernetes kills and restarts the container
# Readiness probe failure → pod removed from Service endpoints but NOT killed
# Check probe configuration and status:
kubectl describe pod mypod-abc123 | grep -A15 "Liveness\|Readiness"
kubectl describe probe section
Liveness: http-get http://:8080/health delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:8080/ready delay=5s timeout=1s period=10s #success=1 #failure=3
# If liveness probe is failing every 10 seconds:
# → Container gets killed every ~30 seconds (3 failures × 10s)
# → CrashLoopBackOff even if the app itself is healthy
# Debug probe: exec into the container and test the endpoint manually:
kubectl exec -it mypod-abc123 -- wget -qO- http://localhost:8080/health
# If this returns 200: probe config is wrong (wrong port, path, or timing)
# If this fails: application health endpoint is broken
# Common fix: increase initialDelaySeconds for slow-starting apps:
# livenessProbe:
# initialDelaySeconds: 30 # Give app 30s to start before first probe
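The interaction between startup time and probe settings can be estimated with simple arithmetic. The sketch below is an approximation under stated assumptions: it ignores the probe timeout and any scheduling jitter, and the function names `earliest_liveness_kill` and `survives_startup` are hypothetical helpers, not Kubernetes APIs.

```python
def earliest_liveness_kill(initial_delay, period, failure_threshold):
    """Approximate seconds after container start at which a probe that
    always fails reaches failure_threshold consecutive failures."""
    # The first probe fires at initial_delay; each later probe is one period apart.
    return initial_delay + (failure_threshold - 1) * period


def survives_startup(startup_seconds, initial_delay, period, failure_threshold):
    """True if the app is ready before the liveness probe would give up."""
    return startup_seconds < earliest_liveness_kill(
        initial_delay, period, failure_threshold
    )


# Settings from the describe output above: delay=10s, period=10s, #failure=3.
print(earliest_liveness_kill(10, 10, 3))
# The same probe with initialDelaySeconds raised to 30.
print(earliest_liveness_kill(30, 10, 3))
# An app that needs 45s to start, checked against both settings.
print(survives_startup(45, 10, 10, 3), survives_startup(45, 30, 10, 3))
```

Under these assumptions, an application that needs 45 seconds to start is killed at roughly 30 seconds with the original settings and just survives with initialDelaySeconds: 30, so leaving extra margin above the measured startup time is the safer choice.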
Conclusion
The --previous flag of kubectl logs is the most important flag for debugging CrashLoopBackOff. Without it, you see only the logs from the current (often just-started) container instance, not the logs from the container that actually crashed. Set initialDelaySeconds in liveness probes to be longer than your application's startup time. A liveness probe that fires before the application finishes starting will repeatedly kill and restart the container, producing a restart loop even though the application code is perfectly healthy.
FAQ
Is troubleshooting Kubernetes pod crashes important for Ubuntu administrators?
Yes. It supports practical Ubuntu administration because it connects directly to server reliability, security, troubleshooting, or daily operations.
Should I practice this on a live server?
Use a lab VM first. After you understand the command output and rollback path, apply the workflow carefully on real systems.
What should I do after reading this article?
Run the practice commands, write down what each one shows, and continue to the next article in the Ubuntu roadmap.