How to use the command `duckdb` (with examples)
The duckdb
command-line client is used to interact with DuckDB, an in-process analytical SQL engine. It allows users to start an interactive shell, run SQL queries, execute scripts, and perform various database operations. In this article, we will explore the different use cases of the duckdb
command with examples.
Use case 1: Start an interactive shell with a transient in-memory database
Code:
duckdb
Motivation: Starting an interactive shell with a transient in-memory database is useful when you want to quickly experiment with SQL queries without the need to connect to an external database. This is particularly helpful for testing and prototyping.
Explanation:
This use case invokes the duckdb
command without any arguments. It starts an interactive shell connected to an in-memory database. Any changes made to the database during the session will not persist beyond the shell session.
Example output:
DuckDB Command Line Client
Shell connected to transient in-memory database
duckdb >
Use case 2: Start an interactive shell on a database file
Code:
duckdb path/to/dbfile
Motivation: By providing the path to a database file, you can start an interactive shell connected to an existing database. If the file does not exist, a new database will be created at the specified location.
Explanation:
In this use case, the duckdb
command is followed by the path to a database file. If the file exists, the shell will connect to the existing database. If the file does not exist, a new database will be created at the specified location.
Example output:
DuckDB Command Line Client
Shell connected to database: path/to/dbfile
duckdb >
Use case 3: Directly query a CSV, JSON, or Parquet file
Code:
duckdb -c "SELECT * FROM 'data_source.[csv|csv.gz|json|json.gz|parquet]'"
Motivation: Directly querying a data file without the need to import it into a database can save time and simplify the data analysis process. This is particularly useful when dealing with large datasets in CSV, JSON, or Parquet formats.
Explanation:
In this use case, the -c
option is used to directly query a CSV, JSON, or Parquet file without connecting to a database. The query passed as an argument selects all columns (*
) from the specified file.
Example output:
DuckDB Command Line Client
SELECT * FROM 'data_source.csv'
...
Use case 4: Run a SQL script
Code:
duckdb -c ".read path/to/script.sql"
Motivation: Running SQL scripts allows for the execution of multiple SQL statements at once, simplifying complex database operations and automation tasks. This use case is helpful when you have a predefined set of SQL statements in a script file that you want to execute.
Explanation:
In this use case, the -c
option is used to run a SQL script file. The .read
command within the script file specifies the file path to be executed.
Example output:
DuckDB Command Line Client
Script path/to/script.sql executed successfully.
duckdb >
Use case 5: Run a query on a database file and keep the interactive shell open
Code:
duckdb path/to/dbfile -cmd "SELECT DISTINCT * FROM tbl"
Motivation: Running a query on a database file and keeping the interactive shell open enables you to perform ad-hoc queries and further analyze the query results interactively. This use case is especially useful when you want to explore the data in a specific table within the database.
Explanation:
In this use case, the duckdb
command is followed by the path to a database file. The -cmd
option is used to specify a query to be executed on the provided database. After executing the query, the interactive shell remains open for further interaction.
Example output:
DuckDB Command Line Client
Shell connected to database: path/to/dbfile
SELECT DISTINCT * FROM tbl
...
duckdb >
Use case 6: Run SQL queries in a file on a database and keep the interactive shell open
Code:
duckdb path/to/dbfile -init path/to/script.sql
Motivation: Running SQL queries in a file on a database and keeping the interactive shell open allows for the execution of a pre-defined set of SQL statements on a specific database. This use case is beneficial when you have a script file containing multiple SQL queries that you want to execute on a database.
Explanation:
In this use case, the duckdb
command is followed by the path to a database file. The -init
option is used to specify a script file containing SQL queries. After executing the queries, the interactive shell remains open for further interaction.
Example output:
DuckDB Command Line Client
Shell connected to database: path/to/dbfile
Script path/to/script.sql executed successfully.
duckdb >
Use case 7: Read CSV from stdin
and write CSV to stdout
Code:
cat path/to/source.csv | duckdb -c "COPY (FROM read_csv_auto('/dev/stdin')) TO '/dev/stdout' WITH (FORMAT CSV, HEADER)"
Motivation:
Reading CSV data from stdin
and writing CSV data to stdout
allows for efficient piping and integration with other command-line tools, enabling data processing workflows.
Explanation:
In this use case, the cat
command is used to read a CSV file and pipe the content to the duckdb
command. The -c
option is used to specify the query within double-quotes. The query uses the COPY
command to read the CSV data from read_csv_auto('/dev/stdin')
and write it to /dev/stdout
in CSV format with a header.
Example output:
DuckDB Command Line Client
SELECT * FROM 'data_source.csv'
...
Use case 8: Display help
Code:
duckdb -help
Motivation:
Sometimes, it is necessary to refer to the command’s documentation or understand the available options. The -help
option provides detailed information about the duckdb
command and its available options.
Explanation:
In this use case, the -help
option is used to display the help message, which contains information about the duckdb
command and its available options.
Example output:
DuckDB Command Line Client
Usage: duckdb [OPTIONS...] [path/to/database]
-h|help | Show this help
-c [SQL] | Execute a query and exit
-init [script.sql] | Execute SQL queries from a script file
-cmd [SQL] | Execute a query and continue to interactive shell
...
Conclusion
The duckdb
command-line client provides a versatile set of options for interacting with DuckDB. Whether you need to start an interactive shell, run SQL queries on database files, execute scripts, or perform data import and export operations, the duckdb
command offers a flexible and efficient solution. By understanding the different use cases and command arguments, you can effectively leverage the power of DuckDB for analytical SQL tasks.