Using the comm command (with examples)

1: Producing three tab-separated columns

comm file1 file2

Motivation: The comm command allows us to compare the lines in two files and identify the lines that are present only in one file, as well as the lines that are common to both files. By default, comm produces three tab-separated columns.

Explanation: This command takes two sorted files, file1 and file2, as input and compares them line by line. Each output line belongs to exactly one of three columns: the first column holds lines found only in file1, the second holds lines found only in file2, and the third holds lines common to both files. The columns are marked by leading tabs rather than aligned side by side: a line in column 2 is preceded by one tab, and a line in column 3 by two tabs.

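Example Output (for file1 containing apple, grape and lemon, and file2 containing banana, grape, kiwi, lemon, pineapple and watermelon, the contents implied by the outputs of the later examples; column 2 is indented by one tab and column 3 by two):

apple
        banana
                grape
        kiwi
                lemon
        pineapple
        watermelon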
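The three-column layout can be reproduced with a small self-contained sketch. The sample file contents, the workdir variable and the use of mktemp -d are assumptions made for illustration; LC_ALL=C is set so that comm compares lines bytewise, which matches the order the sample lines are written in.

```shell
# Hypothetical sample data in a throwaway directory (names are assumptions).
workdir=$(mktemp -d)
printf '%s\n' apple grape lemon > "$workdir/file1"
printf '%s\n' banana grape kiwi lemon pineapple watermelon > "$workdir/file2"

# No options: all three columns are printed.
# Column 2 lines start with one tab, column 3 lines with two tabs.
three_cols=$(LC_ALL=C comm "$workdir/file1" "$workdir/file2")
printf '%s\n' "$three_cols"

rm -rf "$workdir"
```

Piping the result through cat -A (GNU) or cat -et (BSD) is one way to make the leading tabs visible.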

2: Printing only lines common to both files

comm -12 file1 file2

Motivation: There are situations where we only need to identify the lines that are common to both files, without the need for additional information. This use case helps in achieving that.

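Explanation: The options -1, -2 and -3 each suppress the corresponding output column. Here -12 combines -1 and -2, suppressing the lines unique to file1 and the lines unique to file2, so only column 3 (the lines common to both files) is printed, without any leading tabs. The input files must be sorted for this command to work correctly.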

Example Output:

grape
lemon
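Because the result depends on sorted input, it can help to verify the order before comparing. The following is a minimal sketch, assuming hypothetical sample files in a temporary directory; it uses sort -c, which only checks whether a file is already sorted and exits with a non-zero status if it is not. The same LC_ALL=C setting is applied to both sort and comm so that they agree on the collation order.

```shell
# Hypothetical sample data (file names and contents are assumptions).
workdir=$(mktemp -d)
printf '%s\n' apple grape lemon > "$workdir/file1"
printf '%s\n' banana grape kiwi lemon pineapple watermelon > "$workdir/file2"

# sort -c checks the order without writing any sorted output.
if LC_ALL=C sort -c "$workdir/file1" && LC_ALL=C sort -c "$workdir/file2"; then
    common=$(LC_ALL=C comm -12 "$workdir/file1" "$workdir/file2")
    printf '%s\n' "$common"
else
    echo "inputs are not sorted; sort them first" >&2
fi

rm -rf "$workdir"
```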

3: Printing only lines common to both files, reading one file from stdin

cat file1 | comm -12 - file2

Motivation: Sometimes we may need to read one of the input files from stdin instead of specifying it as a command-line argument. This use case demonstrates how we can achieve that.

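Explanation: In this command, cat reads the contents of file1 and pipes them to comm. The hyphen (-) in place of the first file name tells comm to read that input from stdin. The -12 option still suppresses columns 1 and 2, so only the lines common to both inputs are printed. The data arriving on stdin must be sorted, just like a regular file. The same result can be obtained without cat by redirecting the file: comm -12 - file2 < file1.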

Example Output:

grape
lemon
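Reading from stdin is most useful when the first input is produced by another command rather than stored in a file. As a rough sketch, assuming hypothetical sample data, the pipeline below generates unsorted lines, sorts them, and hands the stream to comm through the hyphen argument.

```shell
# Hypothetical second file (name and contents are assumptions).
workdir=$(mktemp -d)
printf '%s\n' banana grape kiwi lemon pineapple watermelon > "$workdir/file2"

# The first input comes from a pipeline: sort orders the unsorted lines,
# and "-" makes comm read that stream in place of a first file.
common_from_stdin=$(printf '%s\n' lemon apple grape \
    | LC_ALL=C sort \
    | LC_ALL=C comm -12 - "$workdir/file2")
printf '%s\n' "$common_from_stdin"

rm -rf "$workdir"
```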

4: Getting lines only found in first file, saving the result to a third file

comm -23 file1 file2 > file1_only

Motivation: Sometimes we may need to extract the lines that are only found in one file and save them to another file. This use case helps in accomplishing that.

Explanation: This command uses the -23 option to suppress columns 2 and 3, leaving only column 1, the lines found only in file1. The > operator then redirects that output to a file named file1_only, creating it or overwriting it if it already exists.

Example Output: (Contents of file1_only)

apple
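The same pattern extends to the other two columns. The sketch below, which again assumes hypothetical sample files in a temporary directory, splits a comparison into three result files by suppressing a different pair of columns each time; the output file names file2_only and common are illustrative choices, not part of comm.

```shell
# Hypothetical sample data (file names and contents are assumptions).
workdir=$(mktemp -d)
printf '%s\n' apple grape lemon > "$workdir/file1"
printf '%s\n' banana grape kiwi lemon pineapple watermelon > "$workdir/file2"

# Each option pair suppresses two columns, leaving exactly one.
LC_ALL=C comm -23 "$workdir/file1" "$workdir/file2" > "$workdir/file1_only"
LC_ALL=C comm -13 "$workdir/file1" "$workdir/file2" > "$workdir/file2_only"
LC_ALL=C comm -12 "$workdir/file1" "$workdir/file2" > "$workdir/common"

# Read the results back so they can be inspected after cleanup.
file1_only=$(cat "$workdir/file1_only")
file2_only=$(cat "$workdir/file2_only")
both=$(cat "$workdir/common")
printf 'only in file1:\n%s\n' "$file1_only"

rm -rf "$workdir"
```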

5: Printing lines only found in second file, when the files aren’t sorted

comm -13 <(sort file1) <(sort file2)

Motivation: The comm command requires the input files to be sorted. However, there may be situations where the files are not sorted. This use case demonstrates a workaround for that by using the sort command in conjunction with comm.

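Explanation: This command uses process substitution, <(sort file1) and <(sort file2), to sort the contents of file1 and file2 on the fly and pass the sorted results to comm as if they were files. The -13 option suppresses columns 1 and 3, so only the lines found only in file2 are printed. Process substitution is a feature of shells such as bash, zsh and ksh; it is not available in a plain POSIX sh.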

Example Output:

banana
kiwi
pineapple
watermelon
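Where process substitution is not available, sorted temporary files give the same result. This is a minimal sketch under stated assumptions: the sample contents are hypothetical and deliberately unsorted, the .sorted file names are illustrative, and mktemp -d is assumed to be present (it is common but not required by POSIX).

```shell
# Deliberately unsorted hypothetical inputs (names and contents are assumptions).
workdir=$(mktemp -d)
printf '%s\n' lemon apple grape > "$workdir/file1"
printf '%s\n' watermelon grape banana pineapple lemon kiwi > "$workdir/file2"

# Sort each input into its own temporary file, using the same locale as comm.
LC_ALL=C sort "$workdir/file1" > "$workdir/file1.sorted"
LC_ALL=C sort "$workdir/file2" > "$workdir/file2.sorted"

# -13 suppresses columns 1 and 3, leaving the lines unique to the second file.
file2_only=$(LC_ALL=C comm -13 "$workdir/file1.sorted" "$workdir/file2.sorted")
printf '%s\n' "$file2_only"

rm -rf "$workdir"
```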

These examples demonstrate different use cases of the comm command, ranging from simple comparisons to extracting specific lines to third files. By understanding and utilizing the various options and arguments of comm, we can efficiently compare and analyze the contents of two files.
