Kubernetes Pod Crashes

Pod crashes in Kubernetes follow the same diagnostic path as Docker container crashes, with a few Kubernetes-specific concepts on top: restart policies and backoff timing, liveness and readiness probes, and the distinction between container crashes (application errors) and pod scheduling failures (resource or taint issues). The first tool to reach for is kubectl describe pod: it shows the pod's recent event history, including why a container was killed and what Kubernetes did about it.

Pod crash states

Pod crash timeline (CrashLoopBackOff):

  Pod created → container starts → crashes (exit code != 0)
  Kubernetes: restart container (immediately)
  Container starts → crashes (exit code != 0)
  Kubernetes: wait 10s, then restart
  Container starts → crashes
  Kubernetes: wait 20s, then restart (doubling backoff)
  Container starts → crashes
  Kubernetes: wait 40s, then restart
  ... (the delay keeps doubling, capped at 5 minutes)
  STATUS: CrashLoopBackOff  ← Container keeps failing, Kubernetes is backing off between restarts

  To check current backoff state:
  kubectl get pod mypod  →  STATUS shows CrashLoopBackOff; RESTARTS shows how many restarts have happened
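
The timeline above can be sketched as a small calculation. This is a minimal illustration that assumes the schedule exactly as described here (first restart immediate, then 10s doubling up to a 5-minute cap); the function name and parameters are hypothetical, and real kubelet timing is only approximately this regular.

```python
def restart_delays(restarts, base=10, cap=300):
    """Return the wait in seconds before each of the first `restarts` restarts.

    Assumes the timeline described above: the first restart is immediate,
    then the delay starts at `base` seconds and doubles until it hits `cap`.
    """
    delays = []
    for attempt in range(restarts):
        if attempt == 0:
            delays.append(0)  # first restart happens right away
        else:
            delays.append(min(base * 2 ** (attempt - 1), cap))
    return delays


if __name__ == "__main__":
    # Show the waits before the first eight restarts.
    print(restart_delays(8))
```

Summing the list gives a rough idea of how long a pod has been failing for a given restart count.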

CrashLoopBackOff diagnosis

# Step 1: get the detailed event history:
kubectl describe pod mypod-abc123

kubectl describe pod Events section

Events:
  Type     Reason     Age                From     Message
  ----     ------     ----               ----     -------
  Normal   Scheduled  5m                 default-scheduler  Successfully assigned default/mypod
  Normal   Pulling    5m                 kubelet  Pulling image "myapp:latest"
  Normal   Started    4m55s              kubelet  Started container myapp
  Warning  BackOff    4m (x12 over 4m)   kubelet  Back-off restarting failed container
# x12 over 4m = the BackOff event was recorded 12 times in 4 minutes: the container keeps failing (CrashLoopBackOff)

# Step 2: get logs from the PREVIOUS (crashed) container instance:
kubectl logs mypod-abc123 --previous    # Critical: --previous shows crash logs

# Step 3: get current container logs (may just show startup):
kubectl logs mypod-abc123

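# Step 4: check the exit code of the last terminated container instance.
# The Python code is wrapped in single quotes for the shell, so it uses only double quotes internally:
kubectl get pod mypod-abc123 -o json | python3 -c '
import json, sys
pod = json.load(sys.stdin)
for c in pod["status"].get("containerStatuses", []):
    # lastState.terminated is present only if a previous instance has exited
    s = c.get("lastState", {}).get("terminated", {})
    if s:
        print("Container:", c["name"])
        print("Exit code:", s.get("exitCode"))
        print("Reason:", s.get("reason"))
'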
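
Once Step 4 has printed an exit code, it helps to know how to read it. The sketch below is a hypothetical helper, not part of kubectl: it relies on the common convention that exit codes above 128 mean "128 plus the signal number", and it covers only three signals chosen for illustration.

```python
# Signals commonly seen behind container exit codes (illustrative subset).
KNOWN_SIGNALS = {9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def explain_exit_code(code):
    """Return a rough interpretation of a container exit code."""
    if code == 0:
        return "exited cleanly"
    if code > 128:
        # Convention: 128 + N means the process was terminated by signal N.
        signum = code - 128
        name = KNOWN_SIGNALS.get(signum, f"signal {signum}")
        return f"terminated by {name}"
    return "application error (check kubectl logs --previous)"


if __name__ == "__main__":
    for code in (0, 1, 137, 143):
        print(code, "->", explain_exit_code(code))
```

Exit code 137 maps to SIGKILL, which is the code that appears alongside the OOMKilled reason.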

OOMKilled in Kubernetes

# OOMKilled = container exceeded its memory limit:
kubectl describe pod mypod-abc123 | grep -A5 "Last State"

kubectl describe Last State showing OOMKilled

Last State:     Terminated
  Reason:       OOMKilled     ← Memory limit exceeded
  Exit Code:    137
  Started:      Mon, 09 Jun 2025 14:30:05 +0000
  Finished:     Mon, 09 Jun 2025 14:32:18 +0000

# Check current memory usage against the limit (kubectl top needs metrics-server running in the cluster):
kubectl top pod mypod-abc123
# NAME            CPU(cores)   MEMORY(bytes)
# mypod-abc123    50m          480Mi
# If the pod spec sets limits.memory: 512Mi, usage of 480Mi is about 94% of the limit
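
The "near limit" comparison can be made explicit with a short calculation. This is a minimal sketch with hypothetical function names; it assumes whole-number quantities with the binary suffixes Ki, Mi, and Gi only, and does not handle the full Kubernetes quantity syntax.

```python
# Binary suffixes used in Kubernetes memory quantities (subset only).
UNITS = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def to_bytes(quantity):
    """Convert a quantity such as '480Mi' or '1Gi' to bytes."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: already a plain byte count


def usage_fraction(usage, limit):
    """Return memory usage as a fraction of the configured limit."""
    return to_bytes(usage) / to_bytes(limit)


if __name__ == "__main__":
    print(f"{usage_fraction('480Mi', '512Mi'):.0%} of the limit in use")
```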

# Increase memory limit in the Deployment:
kubectl edit deployment myapp
# Under spec.template.spec.containers[0].resources.limits:
# memory: "1Gi"   (increase from 512Mi)

# Or patch directly:
kubectl patch deployment myapp -p '{"spec":{"template":{"spec":{"containers":[{"name":"myapp","resources":{"limits":{"memory":"1Gi"}}}]}}}}'
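
Hand-writing nested JSON on the command line is easy to get wrong. As an illustrative alternative, the patch body can be generated; the helper name below is hypothetical, and the structure simply mirrors the patch shown above.

```python
import json


def memory_limit_patch(container_name, memory_limit):
    """Build the JSON patch body that sets one container's memory limit."""
    patch = {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": container_name,
                            "resources": {"limits": {"memory": memory_limit}},
                        }
                    ]
                }
            }
        }
    }
    return json.dumps(patch)


if __name__ == "__main__":
    # Print the patch so it can be passed to kubectl patch with -p.
    print(memory_limit_patch("myapp", "1Gi"))
```

The printed string can then be supplied as the -p argument in place of the hand-written JSON.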

Liveness and readiness probe failures

# Liveness probe failure → Kubernetes kills and restarts the container
# Readiness probe failure → pod removed from Service endpoints but NOT killed

# Check probe configuration and status:
kubectl describe pod mypod-abc123 | grep -A15 "Liveness\|Readiness"

kubectl describe probe section

Liveness:      http-get http://:8080/health delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness:     http-get http://:8080/ready  delay=5s  timeout=1s period=10s #success=1 #failure=3
# If liveness probe is failing every 10 seconds:
# → Container gets killed every ~30 seconds (3 failures × 10s)
# → CrashLoopBackOff even if the app itself is healthy

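# Debug probe: exec into the container and test the endpoint manually (assumes wget exists in the image):
kubectl exec -it mypod-abc123 -- wget -qO- http://localhost:8080/health
# If this succeeds (HTTP 200): the probe config is likely wrong (wrong port, path, or timing),
# or the app listens only on localhost while the kubelet probes the pod IP
# If this fails: the application's health endpoint is broken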

# Common fix: increase initialDelaySeconds for slow-starting apps:
# livenessProbe:
#   initialDelaySeconds: 30   # Give app 30s to start before first probe
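
To judge whether a probe leaves enough time for startup, the timing can be estimated. This is a rough sketch with hypothetical function names: it assumes the first probe runs at initialDelaySeconds and then every periodSeconds, and it ignores probe timeouts and kubelet scheduling jitter.

```python
def earliest_liveness_kill(initial_delay, period, failure_threshold):
    """Approximate seconds after container start at which a probe that
    never passes reaches its failure threshold."""
    return initial_delay + (failure_threshold - 1) * period


def survives_startup(startup_seconds, initial_delay, period, failure_threshold):
    """True if the app is ready before the failure threshold is reached."""
    kill_at = earliest_liveness_kill(initial_delay, period, failure_threshold)
    return startup_seconds < kill_at


if __name__ == "__main__":
    # Probe from the example above: delay=10s, period=10s, #failure=3.
    print(earliest_liveness_kill(10, 10, 3))
    # An app that needs 45s to start, before and after raising the delay to 30s.
    print(survives_startup(45, 10, 10, 3))
    print(survives_startup(45, 30, 10, 3))
```

Under these assumptions, the example probe kills a container that is still starting at around 30 seconds, while raising initialDelaySeconds to 30 gives a 45-second startup enough room.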

Conclusion

The --previous flag of kubectl logs is the most important flag for debugging CrashLoopBackOff. Without it, you see only the logs from the current (often just-started) container instance, not the logs from the instance that actually crashed. Set initialDelaySeconds in liveness probes to be longer than your application's startup time: a liveness probe that fires before the application finishes starting will cause Kubernetes to kill the container over and over, even though the application code is perfectly healthy.

FAQ

Is diagnosing Kubernetes pod crashes important for Ubuntu administrators?

Yes. It supports practical Ubuntu administration because it connects directly to server reliability, security, troubleshooting, or daily operations.

Should I practice this on a live server?

Use a lab VM first. After you understand the command output and rollback path, apply the workflow carefully on real systems.

What should I do after reading this article?

Run the practice commands, write down what each one shows, and continue to the next article in the Ubuntu roadmap.

Need help with Ubuntu administration?

Work directly with Muhammad Irfan Aslam for Ubuntu Server, Linux, cloud, Docker, DevOps, CI/CD, or infrastructure troubleshooting support.
