Server Load Suddenly Increased

A sudden spike in server load is one of the most common production alerts. The key is to work quickly through a logical diagnostic sequence: confirm the load is real, identify which process is consuming resources, determine whether the load is CPU-bound or I/O-bound, and then take appropriate action. The wrong response is to restart services immediately: that destroys the evidence, and the problem often returns within minutes.

Understanding load average

uptime

uptime output showing high load

14:32:05 up 45 days, 2:10, 2 users, load average: 12.45, 8.23, 4.12
#                                                   1min  5min  15min
# On a 4-CPU server:
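#   load average: 4.0 = fully utilized (about 1 runnable task per CPU)
#   load average: 12.45 = about 3x overloaded (roughly 12 tasks competing for 4 CPUs)
# On Linux the figure also counts tasks blocked in uninterruptible I/O, not only CPU demand.
# High 1min + low 15min = recent spike (just started)
# High all three = sustained problem (at least 15 minutes old, possibly much longer)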
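
The arithmetic in the comments above can be scripted so that the same check works on any machine. This is a minimal sketch: load_per_cpu is a hypothetical helper name, and it assumes a Linux host where /proc/loadavg and nproc are available. A result near 1.00 means the CPUs are fully used; values well above 1.00 mean tasks are queuing.

```shell
#!/usr/bin/env bash
# Sketch: divide the 1-minute load average by the CPU count.
# load_per_cpu is a hypothetical helper, not a standard tool.
load_per_cpu() {
    # With no arguments, read the live values from /proc/loadavg and nproc.
    # With arguments, use the numbers given: load_per_cpu LOAD CPUS
    local load="${1:-$(cut -d ' ' -f1 /proc/loadavg)}"
    local cpus="${2:-$(nproc)}"
    awk -v l="$load" -v c="$cpus" 'BEGIN { printf "%.2f\n", l / c }'
}

# The 4-CPU example from the uptime output above:
load_per_cpu 12.45 4    # prints 3.11
```

Run load_per_cpu with no arguments to check the current machine.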

Identifying the culprit process

# Interactive view — press P to sort by CPU, M for memory:
top
htop    # More user-friendly alternative

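# Non-interactive: show the top 10 CPU consumers.
# Note: ps reports %CPU averaged over each process's lifetime, not the current instant; use top for live figures.
ps aux --sort=-%cpu | head -n 11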

ps aux output (abridged: TTY, STAT, START, and TIME columns omitted)

USER       PID %CPU %MEM    VSZ   RSS COMMAND
mysql    12345 245  8.2  1234567 342123 /usr/sbin/mysqld
www-data 12400  45  1.2    89234  23456 php-fpm: pool www
www-data 12401  44  1.1    88123  22890 php-fpm: pool www
# mysql is consuming 245% CPU on an 8-core server (100% = one full core, so about 30% of total capacity)

# Identify what the high-CPU process is doing:
# For MySQL: list the running queries and look for long-running ones (Time column)
sudo mysqladmin -u root -p processlist

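# For web server: check recent access log entries for hot URLs
# (assumes the default combined log format, where field 7 is the request path)
sudo tail -n 100 /var/log/nginx/access.log | awk '{print $7}' | sort | uniq -c | sort -rn | head -20
# Shows which URLs received the most hits among the last 100 requests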

# Look for runaway cron jobs:
ps aux | grep '[c]ron'    # The [c] keeps grep from matching its own process
sudo journalctl -u cron --since "30 minutes ago"
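
When load is spread across many workers of the same program (such as the php-fpm processes in the ps output above), no single line stands out. One way to see the total is to sum %CPU per command name. This is a sketch only: sum_cpu_by_command is a hypothetical helper, and because it relies on ps it inherits the lifetime-average behavior of the %CPU column.

```shell
#!/usr/bin/env bash
# Sketch: total %CPU per command name.
# sum_cpu_by_command is a hypothetical helper; it expects "PCPU COMMAND" lines on stdin.
sum_cpu_by_command() {
    awk '{
             pct = $1                # first field is the CPU percentage
             $1 = ""                 # drop it, leaving the command name
             sub(/^ +/, "")
             cpu[$0] += pct
         }
         END { for (name in cpu) printf "%.1f %s\n", cpu[name], name }' |
        sort -rn
}

# The trailing "=" after each column name suppresses the header line.
ps -eo pcpu=,comm= | sum_cpu_by_command | head -n 10
```

Fed the three sample rows shown earlier, this would report mysqld at 245.0 and php-fpm at 89.0 combined.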

CPU load vs I/O wait

# CPU load and I/O wait look the same in load average but need different fixes:
vmstat 1 5

vmstat output — high I/O wait

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2 15      0 456789 123456 2345678    0    0    45 12456  234  567  8  3 12 77  0
#  ^b=15 processes blocked on I/O   wa=77 = CPUs sat idle waiting on disk I/O 77% of the time
# This is an I/O problem, not CPU. Adding CPUs won't help.
# Find what is causing heavy disk I/O:
iostat -x 1 5    # Look for high %util (device utilization)
iotop -o         # Show only processes doing active I/O (interactive)
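
The b column in vmstat counts processes in uninterruptible sleep, which ps shows as state D. Listing them is a quick way to see who is stuck waiting on disk when iotop is not installed. This is a sketch under that assumption; list_blocked is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: show processes in uninterruptible sleep (state D).
# list_blocked is a hypothetical helper; it expects "STATE PID COMMAND" lines on stdin.
list_blocked() {
    awk '$1 ~ /^D/ { $1 = ""; sub(/^ +/, ""); print }'
}

# Prints "PID COMMAND" for each blocked process; no output means none right now.
ps -eo state=,pid=,comm= | list_blocked
```

Processes move in and out of state D quickly, so it can help to run this a few times and look for names that keep reappearing.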

# Common I/O causes:
# MySQL doing full table scan (no index)
# Heavy log file writes (also check whether the disk is full: df -h)
# Backup job running during business hours
# Excessive swap use (server running out of RAM)
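
The CPU-versus-I/O decision can be reduced to a small check on one vmstat data line. The sketch below assumes the standard column order shown in the sample above (us, sy, id, wa in fields 13 to 16). The 30% and 70% thresholds are illustrative assumptions, not established limits, and classify_vmstat_line is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: label one vmstat data line as I/O-bound or CPU-bound.
# Assumed thresholds (tune for your workload): wa >= 30 or us+sy >= 70.
classify_vmstat_line() {
    awk '{
        busy = $13 + $14    # us + sy
        wa   = $16          # I/O wait
        if (wa >= 30)        print "io-bound (wa=" wa "%)"
        else if (busy >= 70) print "cpu-bound (us+sy=" busy "%)"
        else                 print "unclear (us+sy=" busy "%, wa=" wa "%)"
    }'
}

# The sample line from the vmstat output above:
echo " 2 15      0 456789 123456 2345678    0    0    45 12456  234  567  8  3 12 77  0" |
    classify_vmstat_line    # prints: io-bound (wa=77%)

# Live use (the first vmstat line is an average since boot, so take the last one):
#   vmstat 1 5 | tail -n 1 | classify_vmstat_line
```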

Remediation steps

# If MySQL is the culprit:
# PROCESS_ID is the Id column from processlist; this terminates that connection and its running query:
sudo mysqladmin -u root -p kill PROCESS_ID
# Then investigate: EXPLAIN the slow query, add missing index

# If runaway process:
# First check if it is legitimate (backup, import, deploy):
ps -fp PID              # Full details for exactly this PID (grep would also match unrelated lines)
ls -la /proc/PID/exe    # What binary is running?

# If confirmed runaway:
kill -15 PID    # Graceful termination (SIGTERM)
# If it doesn't stop within 10 seconds:
kill -9 PID     # Force kill (SIGKILL) — last resort
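
The SIGTERM, wait, SIGKILL escalation above can be wrapped in one function so the waiting period is not skipped under pressure. This is a sketch, not a hardened tool: stop_gracefully is a hypothetical helper name, the 10-second default mirrors the comment above, and it should be run as a user allowed to signal the target process.

```shell
#!/usr/bin/env bash
# Sketch: send SIGTERM, wait up to TIMEOUT seconds, then fall back to SIGKILL.
# Usage: stop_gracefully PID [TIMEOUT_SECONDS]
stop_gracefully() {
    local pid="$1" timeout="${2:-10}" waited=0

    # Fails if the process does not exist or we lack permission to signal it.
    kill -TERM "$pid" 2>/dev/null || return 1

    # kill -0 sends no signal; it only tests whether the PID can still be signaled.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null
            echo "killed"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "terminated"
}

# Example (replace 12345 with the PID you confirmed is a runaway):
#   stop_gracefully 12345 10
```

The function prints which path it took, which is worth noting in the incident record: a process that ignored SIGTERM may have left locks or partial files behind.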

# If load is from legitimate traffic surge:
# Scale horizontally (add more app servers)
# Or scale vertically (resize to more CPU/RAM, which is simplest on cloud instances)

Conclusion

The diagnostic sequence for high load: (1) uptime to confirm load is elevated and estimate duration; (2) top or ps aux --sort=-%cpu to identify the highest CPU consumer; (3) vmstat to determine if load is CPU-bound (high us+sy) or I/O-bound (high wa); (4) act based on root cause. Restarting a process without identifying why it is consuming high CPU almost always results in the same problem recurring, because the underlying cause (missing database index, infinite loop, traffic spike) is still present.

FAQ

Is diagnosing a sudden load increase important for Ubuntu administrators?

Yes. Load spikes are among the most common production alerts, and a repeatable diagnostic sequence bears directly on server reliability and day-to-day troubleshooting.

Should I practice this on a live server?

Use a lab VM first. After you understand the command output and rollback path, apply the workflow carefully on real systems.

What should I do after reading this article?

Run the practice commands, write down what each one shows, and continue to the next article in the Ubuntu roadmap.
