How to use the command `duckdb` (with examples)

How to use the command `duckdb` (with examples)

The duckdb command-line client is used to interact with DuckDB, an in-process analytical SQL engine. It allows users to start an interactive shell, run SQL queries, execute scripts, and perform various database operations. In this article, we will explore the different use cases of the duckdb command with examples.

Use case 1: Start an interactive shell with a transient in-memory database

Code:

duckdb

Motivation: Starting an interactive shell with a transient in-memory database is useful when you want to quickly experiment with SQL queries without the need to connect to an external database. This is particularly helpful for testing and prototyping.

Explanation: This use case invokes the duckdb command without any arguments. It starts an interactive shell connected to an in-memory database. Any changes made to the database during the session will not persist beyond the shell session.

Example output:

DuckDB Command Line Client
Shell connected to transient in-memory database
duckdb >

Use case 2: Start an interactive shell on a database file

Code:

duckdb path/to/dbfile

Motivation: By providing the path to a database file, you can start an interactive shell connected to an existing database. If the file does not exist, a new database will be created at the specified location.

Explanation: In this use case, the duckdb command is followed by the path to a database file. If the file exists, the shell will connect to the existing database. If the file does not exist, a new database will be created at the specified location.

Example output:

DuckDB Command Line Client
Shell connected to database: path/to/dbfile
duckdb >

Use case 3: Directly query a CSV, JSON, or Parquet file

Code:

duckdb -c "SELECT * FROM 'data_source.[csv|csv.gz|json|json.gz|parquet]'"

Motivation: Directly querying a data file without the need to import it into a database can save time and simplify the data analysis process. This is particularly useful when dealing with large datasets in CSV, JSON, or Parquet formats.

Explanation: In this use case, the -c option is used to directly query a CSV, JSON, or Parquet file without connecting to a database. The query passed as an argument selects all columns (*) from the specified file.

Example output:

DuckDB Command Line Client
SELECT * FROM 'data_source.csv'
...

Use case 4: Run a SQL script

Code:

duckdb -c ".read path/to/script.sql"

Motivation: Running SQL scripts allows for the execution of multiple SQL statements at once, simplifying complex database operations and automation tasks. This use case is helpful when you have a predefined set of SQL statements in a script file that you want to execute.

Explanation: In this use case, the -c option is used to run a SQL script file. The .read command within the script file specifies the file path to be executed.

Example output:

DuckDB Command Line Client
Script path/to/script.sql executed successfully.
duckdb >

Use case 5: Run a query on a database file and keep the interactive shell open

Code:

duckdb path/to/dbfile -cmd "SELECT DISTINCT * FROM tbl"

Motivation: Running a query on a database file and keeping the interactive shell open enables you to perform ad-hoc queries and further analyze the query results interactively. This use case is especially useful when you want to explore the data in a specific table within the database.

Explanation: In this use case, the duckdb command is followed by the path to a database file. The -cmd option is used to specify a query to be executed on the provided database. After executing the query, the interactive shell remains open for further interaction.

Example output:

DuckDB Command Line Client
Shell connected to database: path/to/dbfile
SELECT DISTINCT * FROM tbl
...
duckdb >

Use case 6: Run SQL queries in a file on a database and keep the interactive shell open

Code:

duckdb path/to/dbfile -init path/to/script.sql

Motivation: Running SQL queries in a file on a database and keeping the interactive shell open allows for the execution of a pre-defined set of SQL statements on a specific database. This use case is beneficial when you have a script file containing multiple SQL queries that you want to execute on a database.

Explanation: In this use case, the duckdb command is followed by the path to a database file. The -init option is used to specify a script file containing SQL queries. After executing the queries, the interactive shell remains open for further interaction.

Example output:

DuckDB Command Line Client
Shell connected to database: path/to/dbfile
Script path/to/script.sql executed successfully.
duckdb >

Use case 7: Read CSV from stdin and write CSV to stdout

Code:

cat path/to/source.csv | duckdb -c "COPY (FROM read_csv_auto('/dev/stdin')) TO '/dev/stdout' WITH (FORMAT CSV, HEADER)"

Motivation: Reading CSV data from stdin and writing CSV data to stdout allows for efficient piping and integration with other command-line tools, enabling data processing workflows.

Explanation: In this use case, the cat command is used to read a CSV file and pipe the content to the duckdb command. The -c option is used to specify the query within double-quotes. The query uses the COPY command to read the CSV data from read_csv_auto('/dev/stdin') and write it to /dev/stdout in CSV format with a header.

Example output:

DuckDB Command Line Client
SELECT * FROM 'data_source.csv'
...

Use case 8: Display help

Code:

duckdb -help

Motivation: Sometimes, it is necessary to refer to the command’s documentation or understand the available options. The -help option provides detailed information about the duckdb command and its available options.

Explanation: In this use case, the -help option is used to display the help message, which contains information about the duckdb command and its available options.

Example output:

DuckDB Command Line Client
Usage: duckdb [OPTIONS...] [path/to/database]

  -h|help             | Show this help
  -c [SQL]            | Execute a query and exit
  -init [script.sql]  | Execute SQL queries from a script file
  -cmd [SQL]          | Execute a query and continue to interactive shell
  ...

Conclusion

The duckdb command-line client provides a versatile set of options for interacting with DuckDB. Whether you need to start an interactive shell, run SQL queries on database files, execute scripts, or perform data import and export operations, the duckdb command offers a flexible and efficient solution. By understanding the different use cases and command arguments, you can effectively leverage the power of DuckDB for analytical SQL tasks.

Related Posts

How to use the command 'go tool' (with examples)

How to use the command 'go tool' (with examples)

The ‘go tool’ command is a powerful tool in the Go programming language that allows users to run specific Go tools or commands.

Read More
How to use the command gh gist (with examples)

How to use the command gh gist (with examples)

The gh gist command allows users to work with GitHub Gists, a way to share and collaborate on code snippets, notes, and more.

Read More
How to use the command cksum (with examples)

How to use the command cksum (with examples)

The cksum command is used to calculate CRC checksums and byte counts of a file.

Read More