1. sort
-
Let's try these commands and compare their output:
du -s /usr/share/* | less
du -s /usr/share/* | sort | less
du -s /usr/share/* | sort -r | less
du -s /usr/share/* | sort -nr | less
du -s /usr/share/* | sort -nr | head
The command du gets the size (disk usage) of the files and directories in /usr/share, and head keeps only the first 10 lines of the result.
Then we try to sort them with sort and sort -r (reverse), but the results are not ordered by size as we might expect. This is because sort by default compares lines as text, character by character, so 2 sorts after 10 (because the character 2 comes after the character 1 in the character set).
With the option -n we tell sort to do a numerical sort. So, the last command returns the 10 biggest files and directories in /usr/share.
-
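The difference between the text sort and the numerical sort can be seen without du, on a few numbers. This is only a small sketch: the four values are made up, and LC_ALL=C is set as an assumption so that the text comparison follows the plain character set.

```shell
# Four made-up "sizes" in no particular order.
sizes=$(printf '10\n9\n2\n100\n')

# Default sort: lines are compared as text, character by character.
text_sorted=$(printf '%s\n' "$sizes" | LC_ALL=C sort)

# With -n the lines are compared by their numeric value.
num_sorted=$(printf '%s\n' "$sizes" | LC_ALL=C sort -n)

printf 'text order:\n%s\n' "$text_sorted"
printf 'numeric order:\n%s\n' "$num_sorted"
```

In the text order, 10 and 100 come before 2 and 9, which is the same effect we saw with the output of du.
-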
This example works because the numerical values happen to be in the first column of the output. What if we want to sort a list based on another column? For example the result of ls -l:
ls -l /usr/bin | head
Ignoring for the moment that ls can sort its results by size, we could use sort to sort them like this:
ls -l /usr/bin | sort -nr -k 5 | head
The option -k 5 tells sort to use the fifth field as the key for sorting. By the way, ls -l pads its columns with spaces rather than separating them with a single TAB. This is not a problem, because by default sort treats any run of blanks (spaces or tabs) as the separator between fields.
-
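To see the effect of -k 5 on something small, we can feed sort a few lines in the style of ls -l. This is a sketch only: the three lines (names, sizes, dates) are made up, and awk is used just to print the file names at the end.

```shell
# Made-up lines in the style of `ls -l`; the columns are padded with spaces.
listing='-rwxr-xr-x 1 root root   35000 Jan  1 12:00 alpha
-rwxr-xr-x 1 root root 1200000 Jan  1 12:00 beta
-rwxr-xr-x 1 root root     900 Jan  1 12:00 gamma'

# Sort on the fifth field (the size), numerically, biggest first.
by_size=$(printf '%s\n' "$listing" | LC_ALL=C sort -nr -k 5)

# Keep only the file names (the ninth field) in the resulting order.
names=$(printf '%s\n' "$by_size" | awk '{print $9}')
printf '%s\n' "$names"
```

The varying number of spaces in front of the sizes does not disturb the result, since a numerical sort skips the leading blanks of the key.
-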
For testing we are going to use the file distros.txt, which is like a history of some Linux distributions (containing their versions and release dates).
wget https://linux-cli.fs.al/examples/lesson07/distros.txt
cat distros.txt
cat -A distros.txt
The option -A makes cat show any special characters. The tab character is represented by ^I, and the $ marks the end of each line.
-
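The same notation can be checked on a single line, without downloading the file. This sketch assumes the GNU version of cat (other versions may not have the option -A), and the line itself is made up.

```shell
# One made-up line with two fields separated by a tab.
visible=$(printf 'Fedora\t10\n' | cat -A)

# The tab is shown as ^I and the end of the line as $.
printf '%s\n' "$visible"
```
-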
Let's try to sort it:
sort distros.txt
The result is almost correct, but the Fedora version numbers are not in the correct order (since 1 comes before 5 in the character set, 10 is placed before 5).
To fix this we are going to sort on multiple keys. We want an alphabetic sort on the first field and a numeric sort on the second field. These three commands are equivalent:
sort --key=1,1 --key=2n distros.txt
sort -k 1,1 -k 2n distros.txt
sort -k1,1 -k2n distros.txt
Notice that if we don't use a range of fields (like 1,1, which means start at field 1 and end at field 1), it is not going to work as expected:
sort -k 1 -k 2n distros.txt
This is because in this case the first key starts at field 1 and extends to the end of the line. It already compares the whole line as text, so the second key is consulted only when two lines compare as equal under the first key.
The modifier n stands for numerical sort. Other modifiers are r for reverse, b for ignoring leading blanks, etc.
-
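The difference between -k 1,1 and -k 1 can be tried on a few lines. This is a sketch under assumptions: the four lines only imitate the tab-separated layout of distros.txt (they are not taken from the downloaded file), and LC_ALL=C is set so that the text comparison follows the plain character set.

```shell
# Made-up sample in the layout: name <TAB> version <TAB> release date.
sample=$(printf 'Fedora\t10\t11/25/2008\nSUSE\t10.3\t10/04/2007\nFedora\t5\t03/20/2006\nFedora\t9\t05/13/2008\n')

# First key limited to field 1, then a numeric comparison on field 2.
good=$(printf '%s\n' "$sample" | LC_ALL=C sort -k 1,1 -k 2n | cut -f 1,2)

# First key runs from field 1 to the end of the line, so it decides everything.
bad=$(printf '%s\n' "$sample" | LC_ALL=C sort -k 1 -k 2n | cut -f 1,2)

printf 'with -k 1,1:\n%s\n' "$good"
printf 'with -k 1:\n%s\n' "$bad"
```

With -k 1,1 the Fedora versions come out as 5, 9, 10, while with -k 1 they come out as 10, 5, 9, the same as a plain sort.
-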
Suppose that we want to sort the list in reverse chronological order (by release date). We can do it like this:
sort -k 3.7nbr -k 3.1nbr -k 3.4nbr distros.txt
The --key option allows specification of offsets within fields. So 3.7 means start sorting from the 7th character of the 3rd field, which is the year (the dates have the form MM/DD/YYYY). The modifier n makes it a numerical sort, r does reverse sorting, and with b we are skipping any leading blanks of the third field.
In a similar way, the second sort key 3.1 sorts by the month, and the third key 3.4 sorts by day.
-
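The three date keys can be checked on a small sample where the year, the month, and the day each have to break a tie. This is only a sketch: the five lines imitate the layout of distros.txt (name, version, and a date in the form MM/DD/YYYY, separated by tabs) and are an assumption, not the content of the downloaded file.

```shell
# Made-up sample: name <TAB> version <TAB> MM/DD/YYYY.
sample=$(printf 'Fedora\t5\t03/20/2006\nSUSE\t10.3\t10/04/2007\nFedora\t10\t11/25/2008\nUbuntu\t7.10\t10/18/2007\nUbuntu\t8.04\t04/24/2008\n')

# Year (from character 7), then month (character 1), then day (character 4),
# each one numeric, reversed, and counted after skipping leading blanks.
newest_first=$(printf '%s\n' "$sample" | LC_ALL=C sort -k 3.7nbr -k 3.1nbr -k 3.4nbr)

# Show only the dates, to make the order easy to read.
dates=$(printf '%s\n' "$newest_first" | cut -f 3)
printf '%s\n' "$dates"
```

The two dates of 2008 are ordered by month, and the two dates of October 2007 are ordered by day.
-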
Some files don't use tabs and spaces as delimiters, for example the file /etc/passwd:
head /etc/passwd
In this case we can use the option -t to define the field separator character. For example to sort /etc/passwd on the seventh field (the account's default shell), we could do this:
sort -t ':' -k 7 /etc/passwd | head
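Since the content of /etc/passwd differs from system to system, the option -t can also be tried on a few lines written by hand. This is a sketch: the three accounts below are made up in the usual layout of the file, and cut is used only to show the first and the seventh field of the result.

```shell
# Made-up lines in the colon-separated layout of /etc/passwd.
accounts='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync'

# Sort on the seventh field (the shell), with ':' as the field separator.
by_shell=$(printf '%s\n' "$accounts" | LC_ALL=C sort -t ':' -k 7 | cut -d ':' -f 1,7)
printf '%s\n' "$by_shell"
```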