Troubleshooting Failed Services

When a service fails, systemd records the exact error in the journal and the service enters "failed" state. The systematic approach: find the failed service, read its status fully, check the full journal for that service, and identify the specific error. Skipping straight to "try restarting it" without reading the logs wastes time and misses the root cause.

Finding failed services

# Show all failed units
systemctl --failed
systemctl list-units --state=failed

Output when services have failed

  UNIT           LOAD   ACTIVE SUB    DESCRIPTION
● myapp.service  loaded failed failed My Application
● mysql.service  loaded failed failed MySQL Database Server

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state.
SUB    = The low-level unit activation state.
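
For scripting, the same listing can be reduced to bare unit names. The following is a minimal sketch, assuming the `--plain` and `--no-legend` options are used so the output has no bullet or header; the function name `failed_unit_names` is hypothetical and not part of systemd.

```shell
# Read `systemctl list-units --plain --no-legend` output on stdin and
# print the name of every unit whose ACTIVE column (field 3) is "failed".
failed_unit_names() {
    awk '$3 == "failed" { print $1 }'
}

# Typical use on a systemd host (not run here):
#   systemctl list-units --state=failed --plain --no-legend | failed_unit_names
```

Piping the result into a loop makes it easy to run `systemctl status` on each failed unit in turn.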

Reading systemctl status fully

sudo systemctl status myapp.service

Failed service status output (annotated)

● myapp.service - My Application
     Loaded: loaded (/etc/systemd/system/myapp.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Tue 2026-06-09 14:30:05 UTC; 5min ago
#           ↑                ↑
#       failed state     why it failed: non-zero exit code
    Process: 5678 ExecStart=/usr/bin/node /opt/myapp/server.js (code=exited, status=1/FAILURE)
   Main PID: 5678 (code=exited, status=1/FAILURE)
#   ↑ PID and exact exit code

Jun 09 14:30:05 server node[5678]: Error: Cannot find module '/opt/myapp/server.js'
Jun 09 14:30:05 server systemd[1]: myapp.service: Main process exited, code=exited, status=1/FAILURE
Jun 09 14:30:05 server systemd[1]: myapp.service: Failed with result 'exit-code'.
Jun 09 14:30:05 server systemd[1]: Failed to start My Application.
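
The fields annotated above are also available as properties through `systemctl show`, which prints one KEY=VALUE pair per line. As a rough sketch, assuming the properties `ActiveState`, `Result`, and `ExecMainStatus` are requested, a small helper (the name `summarize_props` is hypothetical) can condense them into a single line:

```shell
# Read KEY=VALUE lines (as printed by `systemctl show -p ...`) on stdin
# and print the state, result, and main-process exit status on one line.
summarize_props() {
    awk -F= '
        $1 == "ActiveState"    { state  = $2 }
        $1 == "Result"         { result = $2 }
        $1 == "ExecMainStatus" { status = $2 }
        END { printf "state=%s result=%s exit_status=%s\n", state, result, status }
    '
}

# Typical use on a systemd host (not run here):
#   systemctl show myapp.service -p ActiveState -p Result -p ExecMainStatus | summarize_props
```

This is handy in monitoring scripts, where parsing the human-readable status output would be fragile.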

Using journalctl for diagnostics

# Get ALL logs for a service (status only shows last 10 lines)
sudo journalctl -u myapp.service --no-pager

# Logs from the current boot only (-b), which normally includes the most recent start attempt
sudo journalctl -u myapp.service -b --no-pager

# Filter by priority: -p err shows errors and worse (-p debug matches every level, the same as no filter)
sudo journalctl -u myapp.service -p err --no-pager

# Show logs across multiple services to see if a dependency failed first
sudo journalctl -b -p err --no-pager | head -50

# Check what happened right before the failure
sudo journalctl -u myapp.service --since "14:29:00" --until "14:31:00"
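
Rather than working out the window by hand, the `--since` and `--until` values can be computed from the failure timestamp shown in `systemctl status`. This is a sketch only: the helper name `journal_window` and the 60-second default margin are assumptions, and it relies on GNU `date` (the default on Ubuntu).

```shell
# Print two lines: the start and end of a window around a timestamp.
#   $1 = timestamp, e.g. "2026-06-09 14:30:05"
#   $2 = margin in seconds on each side (assumed default: 60)
journal_window() {
    ts=$1
    margin=${2:-60}
    epoch=$(date -d "$ts" +%s) || return 1
    date -d "@$((epoch - margin))" '+%Y-%m-%d %H:%M:%S'
    date -d "@$((epoch + margin))" '+%Y-%m-%d %H:%M:%S'
}

# Typical use on a systemd host (not run here):
#   w=$(journal_window "2026-06-09 14:30:05")
#   sudo journalctl -u myapp.service --no-pager \
#       --since "$(echo "$w" | sed -n 1p)" --until "$(echo "$w" | sed -n 2p)"
```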

Common failure patterns

Symptom in logs	Root cause	Fix
No such file or directory	ExecStart path wrong or binary not installed	Check path with `which`, verify install
Permission denied	Wrong User= in unit, or file not readable	Check `User=`, fix ownership or mode with `chown` or `chmod`
Address already in use	Port already bound by another process	`ss -tlnp | grep PORT` to find the conflict
Start limit hit	Service crashed and restarted too many times	Fix the cause of the crash, then `systemctl reset-failed servicename`
Dependency failed	A required service failed to start first	Check the required service with journalctl
Timeout on start	Service never reported ready (wrong Type= or slow start)	Increase `TimeoutStartSec=` or fix Type=
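
The table can be turned into a quick first-pass filter over journal output. The sketch below is illustrative only: the function name `classify_failure` and its labels are hypothetical, and the matched strings are typical wordings rather than an exhaustive list, so unmatched lines still need to be read by hand.

```shell
# Read log lines on stdin and print a short label for each line that
# matches one of the common failure patterns (case-insensitive).
classify_failure() {
    tr '[:upper:]' '[:lower:]' | while IFS= read -r line; do
        case "$line" in
            *"no such file or directory"*)          echo "path" ;;
            *"permission denied"*)                  echo "permissions" ;;
            *"address already in use"*)             echo "port-conflict" ;;
            *"start request repeated too quickly"*) echo "start-limit" ;;
            *"start-limit-hit"*)                    echo "start-limit" ;;
            *"dependency failed"*)                  echo "dependency" ;;
            *"timed out"*)                          echo "timeout" ;;
        esac
    done
}

# Typical use on a systemd host (not run here):
#   sudo journalctl -u myapp.service -b --no-pager | classify_failure | sort | uniq -c
```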

# Investigate a dependency failure chain
# Step 1: See what failed
systemctl --failed

# Step 2: Check if it was a dependency that caused it
systemctl status myapp.service | grep -i "dependency\|require\|failed"

# Step 3: Check the dependency's own logs
sudo journalctl -u postgresql.service -b --no-pager | tail -30

# Step 4: After fixing, reset failed state and retry
sudo systemctl reset-failed myapp.service
sudo systemctl start myapp.service
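
Steps 2 and 3 can also be approached from the unit's declared requirements instead of grepping the status text. The following sketch assumes a systemd version that supports `systemctl show --value`; the function name `failed_requirements` is hypothetical. It prints each unit listed in `Requires=` that is currently in the failed state.

```shell
# Print every unit required by "$1" that systemd reports as failed.
failed_requirements() {
    unit=$1
    # Requires= is a space-separated list, so word splitting is intentional.
    for dep in $(systemctl show -p Requires --value "$unit"); do
        if systemctl is-failed --quiet "$dep"; then
            echo "$dep"
        fi
    done
}

# Typical use on a systemd host (not run here):
#   failed_requirements myapp.service
```

Any unit it prints is a good candidate for the `journalctl -u` check in Step 3.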

Analysing boot performance

# How long did boot take and where was the time spent?
systemd-analyze
# Output: Startup finished in 1.234s (kernel) + 3.456s (userspace) = 4.690s

# Which services took the longest to start?
systemd-analyze blame | head -20

# Visualize the boot sequence (writes an SVG timeline)
systemd-analyze plot > /tmp/boot.svg
# Open the SVG in a browser to see the parallel service activation timeline

# Check for critical chain (the slowest sequential path)
systemd-analyze critical-chain
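
To focus on the units worth investigating, the blame list can be filtered down to entries that took a second or longer. This is a minimal sketch that assumes the usual blame layout (a duration followed by the unit name, with sub-second durations printed in `ms`); the function name `slow_units` is hypothetical.

```shell
# Read `systemd-analyze blame` output on stdin and print the names of
# units whose duration is not expressed in milliseconds (1s or longer).
slow_units() {
    awk 'NF >= 2 && $1 !~ /ms$/ { print $NF }'
}

# Typical use on a systemd host (not run here):
#   systemd-analyze blame | slow_units
```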

Conclusion

The diagnostic sequence for any failed service: systemctl --failed to list failures, systemctl status servicename for a summary, journalctl -u servicename -b --no-pager for complete logs, and then fix the specific error shown. The most common causes are wrong file paths, permission issues, port conflicts, and dependency failures. After fixing the root cause, start the service again with systemctl start servicename. If the unit hit its start limit, run systemctl reset-failed servicename first to clear the rate-limit counter; this also clears the failed state so the unit no longer appears in systemctl --failed.

FAQ

Is troubleshooting failed services important for Ubuntu administrators?

Yes. Finding and reading the error behind a failed unit is a routine part of keeping Ubuntu servers reliable, and it is the starting point for most service troubleshooting.

Should I practice this on a live server?

Use a lab VM first. Once you understand the command output and have a rollback path, apply the workflow carefully on real systems.

What should I do after reading this article?

Run the practice commands, write down what each one shows, and continue to the next article in the Ubuntu roadmap.
