Finding Large Files on Ubuntu
When a disk fills up on a production server, the first question is always "what is using all this space?" Knowing how to quickly locate and identify large files is a core diagnostic skill. The usual suspects are log files that rotated but weren’t compressed, database dumps that were forgotten, core dump files from crashed processes, and Docker image layers.
Using find for large files
# Find all files larger than 100 MB
# (-xdev stays on the root filesystem, so /proc and other mounts are skipped)
find / -xdev -type f -size +100M 2>/dev/null
# Find large files in a specific directory
find /var -type f -size +50M 2>/dev/null
# Show size and path, sorted by size (largest first)
find / -xdev -type f -size +100M -exec ls -lh {} + 2>/dev/null | sort -k5 -rh | head -20
# Find files larger than 1 GB
find / -xdev -type f -size +1G 2>/dev/null -exec ls -lh {} \; | sort -k5 -rh
# Use printf for a clean output
find / -xdev -type f -size +100M -printf '%s %p\n' 2>/dev/null | sort -rn | awk '{printf "%.1f MB %s\n", $1/1024/1024, $2}' | head -20
Example output: largest files on system
22542.3 MB /var/log/mysql/mysql-slow.log
4096.0 MB /var/lib/docker/overlay2/abc123.../merged/var/log/app.log
2048.0 MB /home/irfan/backup-2024-01-15.tar.gz
1024.0 MB /tmp/core.12345
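The -printf pipeline above prints only the first word of any path that contains spaces, because awk splits fields on whitespace. The sketch below is one way around that, assuming GNU find, sort, and awk as shipped on Ubuntu. The function name largest_files and its default values are illustrative, not a standard tool.

```shell
#!/usr/bin/env bash
# largest_files is a hypothetical helper name, not a standard command.
# Usage: largest_files DIR [MIN_SIZE] [COUNT]
largest_files() {
    local dir="${1:-.}" min_size="${2:-100M}" count="${3:-20}"
    # %s = size in bytes, %p = path. A tab separates the two fields so
    # that paths containing spaces stay intact.
    find "$dir" -xdev -type f -size +"$min_size" -printf '%s\t%p\n' 2>/dev/null |
        sort -rn |
        head -n "$count" |
        awk -F'\t' '{ printf "%.1f MB\t%s\n", $1 / 1024 / 1024, $2 }'
}
```

A call such as largest_files /var 50M 10 would list up to ten files over 50 MB under /var, largest first. Run it from a root shell if unreadable directories should be included.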
Finding old log files
# Find log files older than 30 days
find /var/log -type f -mtime +30 -name "*.log"
# Find compressed old logs (rotated and compressed)
find /var/log -type f -mtime +7 -name "*.gz"
# Find log files modified more than 7 days ago AND larger than 10 MB
find /var/log -type f -mtime +7 -size +10M
# Check what logrotate is managing
ls /etc/logrotate.d/
cat /etc/logrotate.d/nginx
# Force logrotate immediately (for testing)
sudo logrotate -f /etc/logrotate.conf
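Before compressing or deleting old logs, it helps to know how much space they actually hold. The following is a minimal sketch that totals the size of files older than a given age, assuming GNU find. The name old_log_usage and its defaults are made up for this example.

```shell
#!/usr/bin/env bash
# old_log_usage is a hypothetical helper: report how many files under DIR
# were last modified more than DAYS days ago and how much space they hold.
# Usage: old_log_usage [DIR] [DAYS]
old_log_usage() {
    local dir="${1:-/var/log}" days="${2:-30}"
    # -printf '%s\n' emits one size in bytes per matching file.
    find "$dir" -type f -mtime +"$days" -printf '%s\n' 2>/dev/null |
        awk '{ total += $1; n++ }
             END { printf "%d files, %.1f MB\n", n, total / 1024 / 1024 }'
}
```

If the total is small, old logs are probably not the cause of the full disk and the search can move on to other directories.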
Finding duplicate files
# Install fdupes (finds duplicate files by content)
sudo apt install -y fdupes
# Find duplicates in a directory
fdupes -r /home/irfan/
# Find duplicates and show total wasted space
fdupes -rS /home/
# Find and delete duplicates interactively
fdupes -rd /home/irfan/backups/
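On a server where installing packages is not an option, duplicates can be approximated with tools that are already present. The sketch below is not how fdupes works internally; it simply groups files by SHA-256 checksum, assuming GNU coreutils (sha256sum, sort, uniq). The function name list_duplicates is illustrative.

```shell
#!/usr/bin/env bash
# list_duplicates is a hypothetical helper: print groups of files under DIR
# whose SHA-256 checksums are identical. Empty files are ignored.
# Usage: list_duplicates [DIR]
list_duplicates() {
    local dir="${1:-.}"
    find "$dir" -type f -size +0 -exec sha256sum {} + 2>/dev/null |
        sort |
        # A SHA-256 digest is 64 hex characters, so compare only that prefix
        # and print every line belonging to a repeated digest.
        uniq -w 64 --all-repeated=separate
}
```

Hashing every file is slow on large trees, so it is worth pointing this at a specific directory such as a backups folder rather than at /home as a whole.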
Common disk hogs on Ubuntu
| Location | What fills up | How to clean |
|---|---|---|
| /var/log/ | Application logs not rotated, slow query logs | Truncate with > file.log, configure logrotate |
| /var/lib/docker/ | Unused images, volumes, build cache | docker system prune |
| /tmp/ | Core dumps, upload temp files | Delete files older than 7 days: find /tmp -type f -mtime +7 -delete |
| /home/ | User data, forgotten backups | du -sh /home/*, investigate per-user |
| /boot/ | Old kernels | sudo apt autoremove |
| /var/cache/apt/ | Downloaded packages | sudo apt clean |
| /var/lib/snapd/ | Old snap revisions | List with snap list --all, then remove disabled revisions: sudo snap remove NAME --revision=REV |
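The locations in the table can be checked in one pass. Here is a small sketch that prints the size of each directory, largest first, assuming GNU du and sort. The array hog_dirs and the function disk_hog_report are names invented for this example.

```shell
#!/usr/bin/env bash
# Directories taken from the table above; adjust the list for your system.
hog_dirs=(/var/log /var/lib/docker /tmp /home /boot /var/cache/apt /var/lib/snapd)

# disk_hog_report is a hypothetical helper: print the total size of each
# directory passed as an argument, sorted from largest to smallest.
disk_hog_report() {
    local d
    for d in "$@"; do
        # Skip locations that do not exist on this machine (for example, no Docker).
        [ -d "$d" ] || continue
        # -x stays on one filesystem, -s prints a single total, -h is human-readable.
        du -xsh "$d" 2>/dev/null
    done | sort -rh
}

# Example call (run from a root shell for complete totals):
# disk_hog_report "${hog_dirs[@]}"
```

Without root privileges, du cannot read directories such as /var/lib/docker, so the reported totals may be lower than the real usage.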
Safe cleanup procedures
# Truncate a log file safely (preserves the file, clears content)
# NEVER delete a log file that a process still has open: the process keeps
# writing to the unlinked inode, and the space is not freed until it closes the file
sudo truncate -s 0 /var/log/mysql/mysql-slow.log
# Find core dump files (review the list before removing anything)
find / -xdev -type f \( -name "core" -o -name "core.[0-9]*" \) -exec ls -lh {} + 2>/dev/null
# Check if a file is open by any process before deleting
sudo lsof /var/log/mysql/mysql-slow.log
# Clean Docker disk usage
docker system df # Show Docker disk usage breakdown
docker system prune # Remove stopped containers, unused networks, dangling images, and build cache
docker volume prune # Remove unused volumes
docker image prune -a # Remove all unused images
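# Purge leftover config files of removed packages (dpkg state "rc")
# -r skips the run when nothing matches; -y is needed because xargs gives apt no terminal to prompt on
dpkg -l | awk '/^rc/ {print $2}' | xargs -r sudo apt-get purge -y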
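The truncate-or-delete decision above can be scripted. The sketch below checks whether a file is held open and picks the safe action. Note that it swaps lsof for a scan of the /proc/PID/fd symlinks, which needs no extra package but only sees processes the current user may inspect and does not detect memory-mapped files. The names is_open and reclaim_log are hypothetical.

```shell
#!/usr/bin/env bash
# is_open and reclaim_log are hypothetical helper names.

# is_open FILE: succeed if a process visible to this user holds FILE open.
is_open() {
    local target
    target=$(readlink -f -- "$1") || return 1
    # Every /proc/PID/fd entry is a symlink to the file behind that descriptor.
    find /proc/[0-9]*/fd -maxdepth 1 -lname "$target" -print -quit 2>/dev/null |
        grep -q .
}

# reclaim_log FILE: truncate when the file is still open, delete otherwise.
reclaim_log() {
    if is_open "$1"; then
        truncate -s 0 -- "$1"   # the writer keeps its descriptor; blocks are freed now
    else
        rm -f -- "$1"           # nobody has it open, so deleting frees the space
    fi
}
```

Run it as root so that descriptors of other users' processes are visible. When in doubt, the sudo lsof check shown above remains the more thorough test.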
Conclusion
When a disk fills unexpectedly, start with df -hT to identify the full partition, then du -h --max-depth=1 /partition | sort -rh to drill down, and find /suspicious-dir -type f -size +100M -exec ls -lh {} + | sort -k5 -rh to find specific large files. The most common causes on production Ubuntu servers are slow query logs, Docker image accumulation, core dumps, and old kernel files. Never rm a currently open log file — use truncate -s 0 instead to free space without killing the writing process.
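The du drill-down step can be automated as a rough first pass. This sketch repeatedly follows the largest subdirectory, assuming GNU du, sort, and awk. The names biggest_subdir and drill_down are invented here, and the approach is a heuristic: it can walk past a directory whose space is taken by files at that level rather than by a subdirectory.

```shell
#!/usr/bin/env bash
# biggest_subdir and drill_down are hypothetical helper names.

# biggest_subdir DIR: print the largest immediate subdirectory of DIR.
biggest_subdir() {
    # du prints "size<TAB>path"; after sorting, DIR itself comes first,
    # so skip it and print the next entry.
    du -x --max-depth=1 "$1" 2>/dev/null |
        sort -rn |
        awk -F'\t' -v top="$1" '$2 != top { print $2; exit }'
}

# drill_down DIR [DEPTH]: follow the largest subdirectory up to DEPTH levels
# and print the directory that was reached.
drill_down() {
    local dir="${1:-/}" depth="${2:-3}" next
    while [ "$depth" -gt 0 ]; do
        next=$(biggest_subdir "$dir")
        [ -n "$next" ] || break      # no subdirectories left
        dir="$next"
        depth=$((depth - 1))
    done
    printf '%s\n' "$dir"
}
```

The directory it prints is a reasonable starting point for the find -size command from the conclusion.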
FAQ
Is finding large files important for Ubuntu administrators?
Yes. A full disk can stop databases, logging, and package updates, so locating large files quickly is a routine part of keeping a server reliable.
Should I practice this on a live server?
Use a lab VM first. After you understand the command output and rollback path, apply the workflow carefully on real systems.
What should I do after reading this article?
Run the practice commands, write down what each one shows, and continue to the next article in the Ubuntu roadmap.