How to use the command mlr (with examples)
Miller (mlr) is a command-line utility that combines features of tools like awk, sed, cut, join, and sort. It is designed for name-indexed data such as CSV, TSV, and tabular JSON, so fields are addressed by name rather than by column position. In this article, we will explore several practical use cases of the mlr command.
Use case 1: Pretty-print a CSV file in a tabular format
Code:
mlr --icsv --opprint cat example.csv
Motivation: Raw CSV is hard to read because values of different widths do not line up from one row to the next. The mlr command can pretty-print the file so that every column is aligned under its header.
Explanation:
--icsv: Reads the input as CSV.
--opprint: Writes the output as pretty-printed, column-aligned text.
cat: Passes every record through unchanged, so the only visible change is the output format.
Example output:
field1 field2 field3
A      10     20
B      30     40
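To try this without a file of your own, the sketch below builds a small sample file and prints it twice. The file contents are made up for illustration. The second command adds the --barred option, which draws a border around the pretty-printed table.

```shell
# Create a small sample file (hypothetical data, for illustration only)
cat > example.csv <<'EOF'
field1,field2,field3
A,10,20
B,30,40
EOF

# Column-aligned output
mlr --icsv --opprint cat example.csv

# Same data with a border drawn around the table
mlr --icsv --opprint --barred cat example.csv
```

Because cat does not modify records, the two commands differ only in how the table is drawn.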
Use case 2: Receive JSON data and pretty print the output
Code:
echo '{"hello":"world"}' | mlr --ijson --opprint cat
Motivation: JSON received from another command or an API is often hard to scan by eye. By piping it into mlr, we can print it as an aligned table in which each key becomes a column.
Explanation:
--ijson: Reads the input as JSON.
--opprint: Writes the output as pretty-printed, column-aligned text.
cat: Passes every record through unchanged. With no file name given, mlr reads from standard input.
Example output:
hello
world
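The table form becomes more useful with several records. As a minimal sketch, assuming two made-up JSON objects with the same keys, each key becomes a column header and each object becomes a row:

```shell
# Two hypothetical JSON records, one per line, piped into mlr
printf '%s\n' '{"name":"alice","age":30}' '{"name":"bob","age":25}' \
  | mlr --ijson --opprint cat
# name  age
# alice 30
# bob   25
```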
Use case 3: Sort alphabetically on a field
Code:
mlr --icsv --opprint sort -f field example.csv
Motivation: Sometimes it is necessary to sort the data based on a specific field for better analysis or organization. The mlr command allows us to easily sort the data alphabetically based on a chosen field.
Explanation:
--icsv: Reads the input as CSV.
--opprint: Writes the output as pretty-printed, column-aligned text.
sort: Sorts the records by one or more fields.
-f field: Sorts on the named field in ascending lexical (alphabetical) order. Replace field with a column name from your file.
Example output:
field1 field2 field3
A      10     20
B      30     40
Use case 4: Sort in descending numerical order on a field
Code:
mlr --icsv --opprint sort -nr field example.csv
Motivation: Sorting in descending numerical order puts the largest values first, which is handy for spotting top entries or outliers. A plain alphabetical sort would order numbers incorrectly (for example, 9 after 10), so mlr provides a dedicated numerical sort.
Explanation:
--icsv: Reads the input as CSV.
--opprint: Writes the output as pretty-printed, column-aligned text.
sort: Sorts the records by one or more fields.
-nr field: Sorts on the named field in descending numerical order. Replace field with a column name from your file.
Example output:
field1 field2 field3
B      30     40
A      10     20
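The sort flags can also be combined, in which case later fields break ties among earlier ones. The following is a sketch on a made-up file (staff.csv and its columns are hypothetical): records are ordered alphabetically by dept, and within each department by salary from highest to lowest.

```shell
# Hypothetical sample data, for illustration only
cat > staff.csv <<'EOF'
name,dept,salary
carol,ops,50
alice,eng,70
bob,eng,90
dave,ops,65
EOF

# Primary key: dept ascending (lexical); secondary key: salary descending (numerical)
mlr --icsv --opprint sort -f dept -nr salary staff.csv
# name  dept salary
# bob   eng  90
# alice eng  70
# dave  ops  65
# carol ops  50
```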
Use case 5: Convert CSV to JSON, perform calculations, and display those calculations
Code:
mlr --icsv --ojson put '$newField1 = $oldFieldA/$oldFieldB' example.csv
Motivation: Data analysis workflows often need a derived value, such as a ratio of two columns, delivered as JSON for the next tool in the chain. The mlr command can compute the new field and convert the format in a single step.
Explanation:
--icsv: Reads the input as CSV.
--ojson: Writes the output as JSON.
put '$newField1 = $oldFieldA/$oldFieldB': Evaluates the expression for each record, dividing the value of oldFieldA by the value of oldFieldB and storing the quotient in a new field named newField1.
Example output (Miller 6, with field2 and field3 standing in for oldFieldA and oldFieldB):
[
{
  "field1": "A",
  "field2": 10,
  "field3": 20,
  "newField1": 0.5
},
{
  "field1": "B",
  "field2": 30,
  "field3": 40,
  "newField1": 0.75
}
]
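For a version that runs as written, the sketch below uses concrete column names from a made-up sample file and chains a second verb with then. The field name ratio is an arbitrary choice for illustration.

```shell
# Hypothetical sample data, for illustration only
cat > example.csv <<'EOF'
field1,field2,field3
A,10,20
B,30,40
EOF

# Compute field2 / field3 into a new field, then keep only field1 and ratio
mlr --icsv --ojson put '$ratio = $field2 / $field3' then cut -f field1,ratio example.csv
```

Each output record should contain only field1 and ratio, with ratio equal to 0.5 for A and 0.75 for B. The exact JSON layout and number formatting depend on the Miller version.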
Use case 6: Receive JSON and format the output as vertical JSON
Code:
echo '{"hello":"world", "foo":"bar"}' | mlr --ijson --ojson --jvstack cat
Motivation: A JSON record printed on a single line is hard to read once it has more than a few keys. The mlr command can print each record vertically, with one key-value pair per line.
Explanation:
--ijson: Reads the input as JSON.
--ojson: Writes the output as JSON.
--jvstack: Stacks the JSON output vertically, putting each key-value pair on its own line.
cat: Passes every record through unchanged.
Example output (Miller 6; Miller 5 omits the enclosing [ and ] lines):
[
{
  "hello": "world",
  "foo": "bar"
}
]
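Note that the vertical layout splits key-value pairs across lines, not across objects: each input record stays one object. A minimal sketch with two made-up records illustrates this.

```shell
# Two hypothetical records, each with two keys
printf '%s\n' '{"id":1,"color":"red"}' '{"id":2,"color":"blue"}' \
  | mlr --ijson --ojson --jvstack cat
```

The output should contain two objects, each with its id and color pairs on separate lines.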
Use case 7: Filter lines of a compressed CSV file treating numbers as strings
Code:
mlr --prepipe 'gunzip' --csv filter -S '$fieldName =~ "regular_expression"' example.csv.gz
Motivation: Filtering a large compressed CSV file normally means decompressing it first. With --prepipe, mlr decompresses the file on the fly and filters its records by regular expression in a single command.
Explanation:
--prepipe 'gunzip': Pipes the input file through gunzip before Miller reads it, so the compressed file never has to be unpacked on disk.
--csv: Uses CSV for both input and output.
filter: Keeps only the records for which the expression is true.
-S: Treats field values as strings instead of inferring numeric types.
'$fieldName =~ "regular_expression"': Matches the value of fieldName against the regular expression. Records that match are kept and all others are dropped.
Example output:
field1,field2,field3
A,10,20
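The following sketch reproduces this end to end on a made-up sample file. The file contents and the pattern ^A are assumptions chosen for illustration.

```shell
# Hypothetical sample data, compressed with gzip
cat > example.csv <<'EOF'
field1,field2,field3
A,10,20
B,30,40
EOF
gzip -c example.csv > example.csv.gz

# Decompress on the fly and keep records whose field1 starts with "A"
mlr --prepipe 'gunzip' --csv filter -S '$field1 =~ "^A"' example.csv.gz
# field1,field2,field3
# A,10,20
```

The header line is always written; only the data records are subject to the filter.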
Conclusion:
The mlr command covers formatting, sorting, computing new fields, and filtering for name-indexed data such as CSV, TSV, and tabular JSON. Because it handles these tasks in one tool and addresses fields by name, it can replace pipelines that would otherwise combine several separate utilities. The use cases above give a starting point for using mlr in your own data processing workflows.