How to use the command 'join' (with examples)
The join command combines lines from two sorted files that share a common field. It is particularly useful when you need to merge data from multiple files based on a shared value. This article walks through several use cases of the join command, each demonstrating a different option.
Use case 1: Join two files on the first (default) field
Code:
join file1 file2
Motivation:
This is the most basic form of the join command. It pairs each line of file1 with each line of file2 that has the same value in the join field. By default, join uses the first field of each file as the join field and treats blanks as field separators. Both files must already be sorted on the join field.
Explanation:
file1 and file2: The names of the files you want to join.
Example output:
Assuming file1 contains the following lines:
Jane 456 Elm St
John 123 Main St
And file2 contains the following lines:
Jane 67890
John 12345
Running the command join file1 file2 will produce the following output:
Jane 456 Elm St 67890
John 123 Main St 12345
Each output line starts with the join field, followed by the remaining fields of the matching line in file1 and then the remaining fields of the matching line in file2. Here “Jane” and “John” appear as the first field in both files, so both lines are paired. Lines whose first field appears in only one file are omitted.
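Because join expects both inputs to be ordered on the join field, unsorted files usually need a sort step first. The following is a minimal sketch of that workflow, not a definitive recipe: the file names addresses.txt and ids.txt and their contents are hypothetical sample data, and it assumes mktemp -d is available on your system.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)

# Hypothetical sample data, deliberately left unsorted.
printf '%s\n' 'John 123 Main St' 'Jane 456 Elm St' > "$workdir/addresses.txt"
printf '%s\n' 'John 12345' 'Jane 67890' > "$workdir/ids.txt"

# Sort both inputs on field 1, the default join field.
sort -k1,1 "$workdir/addresses.txt" > "$workdir/addresses.sorted"
sort -k1,1 "$workdir/ids.txt" > "$workdir/ids.sorted"

# Join on the first field and keep the result in a variable.
joined=$(join "$workdir/addresses.sorted" "$workdir/ids.sorted")
printf '%s\n' "$joined"

rm -r "$workdir"
```

Run sort and join under the same locale settings, since join expects the collation order that sort produced.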
Use case 2: Join two files using a comma (instead of a space) as the field separator
Code:
join -t ',' file1 file2
Motivation:
Not every file separates its fields with blanks. The -t option lets you specify a different character as the field separator. This example shows how to join two files whose fields are separated by commas.
Explanation:
-t ',': Specifies the comma (,) character as the field separator.
Example output:
Assuming file1 contains the following lines:
Jane,456 Elm St
John,123 Main St
And file2 contains the following lines:
Jane,67890
John,12345
Running the command join -t ',' file1 file2 will produce the following output:
Jane,456 Elm St,67890
John,123 Main St,12345
The lines are matched on the first comma-separated field, and the output uses the comma as its separator as well. Because blanks no longer delimit fields, “456 Elm St” is treated as a single field.
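To see why the separator matters, the sketch below runs join on comma-separated data with and without -t. The file contents are hypothetical sample data, and mktemp -d is assumed to be available.

```shell
workdir=$(mktemp -d)

# Hypothetical comma-separated sample data, already sorted on field 1.
printf '%s\n' 'Jane,456 Elm St' 'John,123 Main St' > "$workdir/file1"
printf '%s\n' 'Jane,67890' 'John,12345' > "$workdir/file2"

# With -t ',', the first field is the text before the first comma.
with_comma=$(join -t ',' "$workdir/file1" "$workdir/file2")

# Without -t, fields are split on blanks, so the keys become
# "Jane,456" and "Jane,67890" (and so on), and nothing matches.
without_t=$(join "$workdir/file1" "$workdir/file2")

printf 'with -t:\n%s\n' "$with_comma"
printf 'without -t: [%s]\n' "$without_t"

rm -r "$workdir"
```

With this data, the second command prints nothing, which is a common symptom of a mismatched separator.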
Use case 3: Join field3 of file1 with field1 of file2
Code:
join -1 3 -2 1 file1 file2
Motivation:
Sometimes the join field sits in a different position in each file. The -1 and -2 options let you specify the field number to use in the first and second file, respectively. This example demonstrates how to join the third field of file1 with the first field of file2.
Explanation:
-1 3: Specifies the third field of file1 as the join field.
-2 1: Specifies the first field of file2 as the join field.
Example output:
Assuming file1 contains the following lines:
John Doe 123 Main St
Jane Smith 456 Elm St
And file2 contains the following lines:
123 12345
456 67890
Running the command join -1 3 -2 1 file1 file2 will produce the following output:
123 John Doe Main St 12345
456 Jane Smith Elm St 67890
The join field (123 or 456) is printed first, followed by the remaining fields of file1 and then the remaining fields of file2. Note that file1 must be sorted on its third field and file2 on its first field.
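When the join field is not the first field, the file has to be sorted on that field rather than on the start of the line. The sketch below illustrates this with hypothetical files people.txt and ids.txt, assuming mktemp -d is available.

```shell
workdir=$(mktemp -d)

# Hypothetical data: people.txt is ordered by name, not by field 3.
printf '%s\n' 'Jane Smith 456 Elm St' 'John Doe 123 Main St' > "$workdir/people.txt"
printf '%s\n' '123 12345' '456 67890' > "$workdir/ids.txt"

# Re-sort people.txt on its third field, the join field.
sort -k3,3 "$workdir/people.txt" > "$workdir/people.by_number"

# Match field 3 of the first file against field 1 of the second.
by_number=$(join -1 3 -2 1 "$workdir/people.by_number" "$workdir/ids.txt")
printf '%s\n' "$by_number"

rm -r "$workdir"
```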
Use case 4: Produce a line for each unpairable line from file1
Code:
join -a 1 file1 file2
Motivation:
When joining files, some lines may have no match in the other file, and join drops them by default. The -a option additionally prints the unpairable lines from the specified file. In this example, we include unpairable lines from file1.
Explanation:
-a 1: Includes unpairable lines from file1 in the output.
Example output:
Assuming file1 contains the following lines:
Jane 456 Elm St
John 123 Main St
Mark 789 Maple St
And file2 contains the following lines:
John 12345
Smith 67890
Running the command join -a 1 file1 file2 will produce the following output:
Jane 456 Elm St
John 123 Main St 12345
Mark 789 Maple St
This output includes every line from file1, even the ones without a match in file2. The “Smith” line of file2 has no match in file1 and is still omitted, because only -a 1 was given.
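A related option, -v, prints only the unpairable lines and suppresses the joined ones, which can help when checking which keys are missing. The sketch below compares -a 1 and -v 1 on hypothetical sample data, assuming mktemp -d is available.

```shell
workdir=$(mktemp -d)

# Hypothetical sample data, sorted on field 1.
printf '%s\n' 'Jane 456 Elm St' 'John 123 Main St' 'Mark 789 Maple St' > "$workdir/file1"
printf '%s\n' 'John 12345' 'Smith 67890' > "$workdir/file2"

# -a 1: joined lines plus the unpairable lines of file1.
all_left=$(join -a 1 "$workdir/file1" "$workdir/file2")

# -v 1: only the unpairable lines of file1.
only_unpaired=$(join -v 1 "$workdir/file1" "$workdir/file2")

printf 'join -a 1:\n%s\n' "$all_left"
printf 'join -v 1:\n%s\n' "$only_unpaired"

rm -r "$workdir"
```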
Use case 5: Join a file from stdin
Code:
cat path/to/file1 | join - path/to/file2
Motivation:
The join command can also read one of its inputs from stdin when - is given in place of a file name. This is useful when the data comes from a pipe or from the output of another command.
Explanation:
cat path/to/file1: Reads the contents of file1 and writes them to the pipe, which becomes the stdin of join.
join -: The - in place of a file name tells join to read that input from stdin, here the output of cat path/to/file1.
path/to/file2: The name of the second file to join.
Example output:
Assuming file1 contains the following lines:
Jane 456 Elm St
John 123 Main St
And file2 contains the following lines:
Jane 67890
John 12345
Running the command cat file1 | join - file2 will produce the following output:
Jane 456 Elm St 67890
John 123 Main St 12345
This output is the same as running join file1 file2 directly, but here cat reads the contents of file1 and passes them to join through stdin.
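Reading from stdin is most useful when another command prepares the data on the fly. As a sketch under assumed sample data (and assuming mktemp -d is available), the pipeline below sorts an unsorted file and feeds the result straight into join, without writing an intermediate file.

```shell
workdir=$(mktemp -d)

# Hypothetical data: file1 is unsorted, file2 is already sorted.
printf '%s\n' 'John 123 Main St' 'Jane 456 Elm St' > "$workdir/file1"
printf '%s\n' 'Jane 67890' 'John 12345' > "$workdir/file2"

# sort writes to the pipe; join reads it as its first input via "-".
piped=$(sort -k1,1 "$workdir/file1" | join - "$workdir/file2")
printf '%s\n' "$piped"

rm -r "$workdir"
```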