Finding Large Files in Linux: What Most People Get Wrong

You're staring at a terminal. The disk usage is at 99%. Your server is choking, and you have no idea why. We’ve all been there, panicking while trying to remember that one specific command string we saw on Stack Overflow three years ago. Honestly, finding large files in Linux isn't just about running a command; it's about understanding how your filesystem hides data in plain sight.

Linux is a bit of a hoarder. Log files grow until they consume gigabytes of space. Docker containers leave "dangling" volumes that sit like lead weights on your NVMe drive. Old kernels pile up in /boot. If you just run a basic search, you might miss the real culprits.

The Find Command is Your Best Friend (and Worst Enemy)

Most people jump straight to find. It's the standard. But if you don't use the right flags, you'll be sitting there for ten minutes while it crawls through /proc and /sys, which is a total waste of time. You want the heavy hitters. You want the files that are actually causing the "Disk Full" alerts.

Try this: find / -type f -size +100M -exec ls -lh {} + 2>/dev/null.

Let's break that down because it looks like alphabet soup. The / tells it to start at the root. The -type f ensures you're only looking at files, not directories (which are technically files in Linux, but you get what I mean). The -size +100M is the filter. It says, "Only show me the big boys." Finally, that 2>/dev/null at the end is vital. It silences all those annoying "Permission Denied" errors that pop up when you're searching directories you don't own.
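
One tweak is worth knowing here. This is a sketch using GNU find's standard -xdev flag, which stops the search at filesystem boundaries, so /proc, /sys, and any network mounts get skipped in one stroke:

# -xdev keeps find on the starting filesystem, so pseudo-filesystems and remote mounts are ignored
find / -xdev -type f -size +100M -exec ls -lh {} + 2>/dev/null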

But wait. There's a catch.

What if the file is 99MB? You missed it. What if you have a million 1MB files? The find command won't help you there, even though they’re collectively killing your storage. This is where du comes in.

Why Disk Usage (du) Usually Wins

I personally prefer du for a quick health check. It doesn't just find files; it finds the bloat. If you have a directory full of tiny log files, find won't flag it, but du will show that the folder is a 10GB monster.

Run du -ah / 2>/dev/null | sort -rh | head -n 20. The 2>/dev/null is there for the same reason as before: du complains about every directory you can't read.

This gives you the top 20 biggest things on your system, ranked. It’s human-readable (-h), starts from the root, and sorts them so the biggest is at the top. It’s simple. It works. It’s basically the "who is eating my lunch?" command.

The nuance here is the -x flag. If you have network mounts or external drives plugged in, du will try to scan those too. That can take forever. Use du -axh / to keep the search restricted to the actual local filesystem you're worried about.
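
Putting the pieces together, the version worth memorizing is the same pipeline as above, just fenced in to the local filesystem and with the error noise silenced:

du -axh / 2>/dev/null | sort -rh | head -n 20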

The Mystery of the Deleted File

Here is something that trips up even experienced sysadmins. You find a massive 50GB log file. You rm it. You check df -h and... the space didn't come back.

Why?

Because a process (like Nginx or a database) still has that file open. In Linux, a file isn't truly deleted until the last process closes its "file descriptor." The directory entry is gone, but the data stays on disk for as long as the app holds the file open; if the app is still writing, the "deleted" file can even keep growing.

To find these "ghost" files, use lsof +L1. This shows files that have a link count of less than one but are still open. If you see a massive file there, you'll need to restart the associated service to actually reclaim that space. Just deleting the file isn't enough; you've gotta kill the ghost.
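
If a restart isn't an option right now, there's a well-known escape hatch: truncating the deleted file through /proc. The PID and descriptor number below are placeholders you'd read off your own lsof output, and the process loses whatever was in that file, so only do this for data you can afford to drop (logs, usually):

lsof +L1                 # unlinked-but-open files; on Linux the NAME column usually ends in "(deleted)"
: > /proc/1234/fd/7      # example PID 1234, FD 7: truncates the ghost to zero bytes, freeing the space now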

The Modern Way: ncdu and dust

If you're tired of typing long strings of commands, you should probably install ncdu. It’s an interactive version of du built with ncurses. It’s fast. It lets you navigate through your folders using arrow keys and delete things on the fly with the 'd' key.

For the Rust fans out there, dust is a newer alternative that provides a much prettier, more intuitive visual representation of where your bytes are going. It uses colored bars to show which directories are the greediest. Honestly, once you use dust or ncdu, going back to raw find commands feels like using a typewriter in a Tesla factory.
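
Getting them installed is quick. Package names vary by distro, so treat these lines as examples rather than gospel; ncdu is in the default Debian/Ubuntu repos, and dust is published as the du-dust crate:

sudo apt install ncdu    # then run: ncdu -x /  (the same -x trick: stay on one filesystem)
cargo install du-dust    # installs a binary named dust
dust -d 3 /var           # colored bars, three directory levels deep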

Finding Large Files in Linux via Logs and Cache

Usually, the problem isn't your personal photos. It's the system.

  1. Check /var/log. This is the #1 culprit. If a service is erroring out, it might be writing millions of lines of "I'm broken" to a log file every hour.
  2. Look at /var/cache/apt (on Debian/Ubuntu) or /var/cache/dnf (on Fedora/RHEL). Every time you update your system, the package manager keeps copies of the downloaded packages. Clear them out with sudo apt clean (or sudo dnf clean packages on Fedora/RHEL).
  3. Docker. Oh boy, Docker. If you use containers, run docker system df. You’ll probably find gigabytes of "Images" and "Local Volumes" that haven't been touched in months. docker system prune is your friend here, but be careful: it’s a sledgehammer, not a scalpel. All three checks are sketched as commands right after this list.
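
For reference, here is that triage pass in one place. It's a sketch: the apt line assumes Debian/Ubuntu, the journal check assumes systemd, and the prune flag deletes unused volumes for real, so read before you run.

sudo du -sh /var/log/* 2>/dev/null | sort -rh | head   # which logs are the heavy ones?
sudo journalctl --disk-usage                           # the systemd journal often hides in /var/log too
sudo apt clean                                         # flush the cached packages
docker system df                                       # images / containers / volumes breakdown
docker system prune --volumes                          # the sledgehammer; removes unused volumes as well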

Sorting by Time Instead of Size

Sometimes the biggest file isn't the problem; it's the newest large file. If your disk filled up in the last hour, you need to sort by modification time.

find / -type f -mmin -60 -size +10M 2>/dev/null

This finds files larger than 10MB that were modified in the last 60 minutes. It’s incredibly useful for catching a runaway process in the act. You’re basically playing detective, looking for the most recent footprint in the snow.
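
If you want those hits ranked newest-first instead of in directory order, GNU find's -printf can emit a sortable timestamp. BSD and macOS find don't have -printf, so this is a Linux-flavored sketch:

# %T* prints modification-time fields, %s the size in bytes, %p the path; sort -r puts the newest on top
find / -type f -mmin -60 -size +10M -printf '%TY-%Tm-%Td %TH:%TM %10s %p\n' 2>/dev/null | sort -r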

Actionable Next Steps for a Clean System

Don't just clear the space and walk away. You’ll be back here in a month if you do.

First, set up logrotate. It’s a standard utility that automatically compresses and deletes old logs. If it's not configured correctly, your /var/log will always be a ticking time bomb.
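
For a sense of what "configured correctly" means, a minimal rule looks like the sketch below. The path is a made-up example; real rules live as small files under /etc/logrotate.d/:

/var/log/myapp/*.log {
    daily             # rotate once a day
    rotate 7          # keep a week of old logs, delete anything older
    compress          # gzip rotated logs
    delaycompress     # but leave the most recent rotation uncompressed
    missingok         # don't error if the log is absent
    notifempty        # skip rotation when the log is empty
}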

Second, use df -i. This checks inodes. Sometimes you have plenty of disk space, but you've run out of "file slots" because some weird script created ten million empty text files. If your inode usage is at 100%, you can't create new files, even if you have 1TB of free space.
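
When df -i does show you're starved, the next question is which directory holds the millions of files. A crude but effective sketch; point it at whichever parent you suspect (/var here is just an example):

# count regular files under each subdirectory, biggest offenders first
for d in /var/*; do
  printf '%8d %s\n' "$(find "$d" -xdev -type f 2>/dev/null | wc -l)" "$d"
done | sort -rn | head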

Third, check your trash. Seriously. If you’re using a desktop environment like GNOME or KDE, files you "delete" in the GUI go to ~/.local/share/Trash. They aren't gone. They're just moved. Empty that folder manually or through your file manager to actually see the space return.
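
One line tells you whether it's worth emptying:

du -sh ~/.local/share/Trash 2>/dev/null   # how much "deleted" data is still sitting here?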

Lastly, make a habit of running a "disk audit" once a quarter. Linux is robust, but it isn't self-cleaning. A little bit of proactive maintenance with ncdu or a quick find script can save you from a middle-of-the-night server crash.
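
If you want to make the audit painless, a throwaway script along these lines covers every check from this article; the filename and the head counts are just examples:

#!/usr/bin/env bash
# disk-audit.sh: a quick disk health check (example script, tune to taste)
df -h /                                          # overall space on the root filesystem
df -i /                                          # inode headroom
du -axh / 2>/dev/null | sort -rh | head -n 10    # ten biggest local items
sudo lsof +L1 2>/dev/null | head -n 10           # any deleted-but-still-open ghosts?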

Keep your partitions lean, watch those log files, and always remember to check for those open file descriptors before you assume the space is gone for good.