How to find large files

One of the beauties of Linux is that you can combine multiple commands to get exact results. For example, if you have loads of files in your home directory, here’s how to find the ones larger than a certain size and list them by size:

$ find ~/ -type f -size +10000k -exec ls -lh {} \; 2> /dev/null | awk '{ print $NF ": " $5 }'  | sort -nrk 2,2 | more

Breakdown

A translation of this command would be: look in my home directory for files that are larger than roughly 10MB, list them while discarding error messages, show only the path/filename and the size, sort the whole lot by size, and page through the list.

Let’s look at each part to see exactly what it does.

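find – The program that walks a directory tree and reports the files matching the given tests. It lives at /usr/bin/find on a Debian-based distro.

~/ – A short way of pointing to the currently logged in user’s home directory, e.g. /home/orfels/

-type f – Test that restricts the results to regular files, so directories, symlinks and the like are skipped.

-size +10000k – Test that matches only files larger than 10000k. The k suffix means kibibytes (units of 1024 bytes), so the threshold is 10,240,000 bytes, or roughly 10MB.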
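To make the threshold concrete, here is a minimal sketch that builds two throwaway files in a scratch directory and runs the same -size test against them. The file names big.bin and small.bin are made up for the demonstration, and it assumes a find that accepts the k suffix (GNU find does).

```shell
# Scratch directory so nothing in the real home directory is touched.
demo_dir=$(mktemp -d)

# One file just over the threshold and one well under it (hypothetical names).
dd if=/dev/zero of="$demo_dir/big.bin" bs=1024 count=10001 2> /dev/null
dd if=/dev/zero of="$demo_dir/small.bin" bs=1024 count=100 2> /dev/null

# 10000k means 10000 units of 1024 bytes.
threshold_bytes=$((10000 * 1024))
echo "threshold: $threshold_bytes bytes"

# Only the file larger than 10000k is reported.
matches=$(find "$demo_dir" -type f -size +10000k)
echo "$matches"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

The + sign means "more than", so a file of exactly 10000k would not be listed.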

-exec ls -lh {} \; – For each file found, find runs ls -lh on it: {} is replaced by the file’s path and \; marks the end of the command. The l option gives the long listing, and h prints sizes in human-readable units such as 806M.

2> /dev/null – This means that error output will be discarded. Because it sits before the first pipe, it applies to find and to the ls commands that find launches. This would only really matter for paths that the account doesn’t have permission to access, but it is included here anyway.

| – The output of the preceding command is piped to the command after the pipe | character.
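The effect of the redirect can be seen with a small sketch that points find at a path assumed not to exist (the path below is made up) and captures the output once with the error messages kept and once with them discarded.

```shell
# A path that is assumed not to exist on this machine (hypothetical).
missing_dir="/no/such/directory/for-this-demo"

# find exits non-zero here; "|| true" keeps a strict (set -e) shell going.
# 2>&1 folds the error message into the captured output.
visible=$(find "$missing_dir" -type f 2>&1) || true

# 2> /dev/null throws the error message away, so nothing is captured.
hidden=$(find "$missing_dir" -type f 2> /dev/null) || true

echo "with errors kept: $visible"
echo "with errors discarded: $hidden"
```

The second line prints nothing after the colon, which is what keeps a long listing free of "Permission denied" noise.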

awk – awk prints only specific fields of each line, in this case path/filename: size. $NF is the last field (column) of the line, while $5 is the fifth column. Try running ls -lh and counting the columns to see what we’re getting at, which is to show just the files and their sizes, skipping details like permissions, ownership, etc. Note that awk splits fields on whitespace, so for a file name containing spaces $NF holds only the last word of the name.
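To see what awk keeps, the sketch below feeds it a single made-up line in the shape that ls -lh prints. The line is illustrative, not real output from the system above.

```shell
# A made-up line in the shape that `ls -lh` prints (not real output).
sample='-rw-r--r-- 1 orfels orfels 806M May 14 2012 /home/orfels/backups/14052012.gz'

# $5 is the size column and $NF is the last field, here the path.
formatted=$(printf '%s\n' "$sample" | awk '{ print $NF ": " $5 }')
echo "$formatted"
```

Counting across the sample line, the permissions, link count, owner and group take up the first four columns, which is why the size lands in $5.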

| – Again we pipe the output to the following commands.

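sort -nrk 2,2 – Sort the output by the second column, which is the size: n compares numerically, r reverses the order so the largest files come first, and k 2,2 restricts the sort key to field 2. Note that n looks only at the leading number and ignores the unit suffix, so the order is only right when every size carries the same unit, as in the example below where all sizes are in M; a 1.2G file would be listed after an 806M one. GNU sort has an h option (sort -hrk 2,2) that understands these suffixes.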
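The difference between numeric and human-readable sorting shows up as soon as the units are mixed. This is a minimal sketch on made-up "path: size" lines, assuming GNU sort, since the h option is not guaranteed on every system.

```shell
# Made-up "path: size" lines with mixed units (hypothetical paths).
lines='/tmp/a.iso: 1.2G
/tmp/b.tgz: 806M
/tmp/c.ogg: 403M'

# -n reads only the leading number, so 1.2G counts as 1.2 and sorts last.
numeric=$(printf '%s\n' "$lines" | LC_ALL=C sort -nrk 2,2)

# -h (GNU sort) takes the K/M/G suffix into account, so 1.2G sorts first.
human=$(printf '%s\n' "$lines" | LC_ALL=C sort -hrk 2,2)

echo "numeric sort:"
echo "$numeric"
echo "human-readable sort:"
echo "$human"
```

With -n the 806M file heads the list; with -h the 1.2G file does, which is the order most people expect.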

| – Once again we will pipe the output to the final command.

more – more pauses once the screen is full and waits for a key press before showing the next page. This is useful when a large number of results is returned and the list would otherwise run past the terminal’s scroll buffer.
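As a variation on the pipeline broken down above, the sketch below lets find print the size and path itself, so the sort works on plain byte counts and file names with spaces stay intact. It is not the command from this post: it assumes GNU find (for -printf), uses a scratch directory with made-up file names, and lowers the threshold to 100k to keep the demo files small.

```shell
# Scratch directory with two small demo files (hypothetical names).
demo_dir=$(mktemp -d)
dd if=/dev/zero of="$demo_dir/holiday video.ogg" bs=1024 count=300 2> /dev/null
dd if=/dev/zero of="$demo_dir/pics.tgz" bs=1024 count=200 2> /dev/null

# GNU find: %s is the size in bytes and %p is the path.
# Sorting on plain byte counts avoids the unit-suffix problem.
listing=$(find "$demo_dir" -type f -size +100k -printf '%s %p\n' 2> /dev/null | sort -nr)
echo "$listing"

# The first line is the largest file.
largest=$(printf '%s\n' "$listing" | head -n 1)

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

The trade-off is that sizes appear in bytes rather than as 806M-style figures, so this form suits scripts better than quick reading.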

Example:

orfels@vmorfels01:~$ find ~/ -type f -size +10000k -exec ls -lh {} \; 2> /dev/null | awk '{ print $NF ": " $5 }'  | sort -nrk 2,2 | more
/home/orfels/backups/14052012.gz: 806M
/home/orfels/backups/pics.tgz: 544M
/home/orfels/vids/christmas.ogg: 403M