How to use the command blastn (with examples)

blastn is the nucleotide-to-nucleotide search program of the NCBI BLAST+ suite. It is a command-line tool that aligns nucleotide query sequences against a nucleotide database or against subject sequences supplied in a file, and it is widely used in bioinformatics for sequence similarity search, sequence alignment, and annotation.

Use case 1: Align two or more sequences using megablast

Code:

blastn -query query.fa -subject subject.fa -evalue 1e-9

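Motivation: This use case aligns the sequences in one file against those in another using the megablast algorithm, which is the default for blastn and is tuned for highly similar sequences. The -evalue parameter sets the E-value threshold: the E-value of a hit is the number of alignments with an equal or better score expected by chance, and hits with an E-value above the threshold are not reported.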

Explanation:

  • -query query.fa: Specifies the query sequence file.
  • -subject subject.fa: Specifies the subject sequence file.
  • -evalue 1e-9: Sets the E-value threshold to 1e-9.

Example output: Because no -out file is given, the command prints the alignment results to standard output in the default pairwise format.

Use case 2: Align two or more sequences using blastn

Code:

blastn -task blastn -query query.fa -subject subject.fa

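Motivation: This use case runs the traditional blastn algorithm explicitly instead of relying on the default (megablast). The blastn task uses a shorter default word size (11 instead of 28), so it is slower but more sensitive to divergent sequences.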

Explanation:

  • -task blastn: Specifies the blastn algorithm explicitly.
  • -query query.fa: Specifies the query sequence file.
  • -subject subject.fa: Specifies the subject sequence file.

Example output: The output of this command will be the alignment results using the blastn algorithm.
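To see how the choice of -task affects a particular pair of files, the same alignment can be run under several tasks and the reported hits counted. The following is a minimal sketch, not a benchmark: the function name compare_tasks is hypothetical, and it assumes blastn is on the PATH and that counting tabular lines (one per reported alignment with -outfmt 6) is a good enough proxy for sensitivity.

```shell
# compare_tasks QUERY_FASTA SUBJECT_FASTA   (hypothetical helper)
# Runs the same alignment under three -task values and prints, for each
# task, the number of tabular hit lines that blastn reports.
compare_tasks() {
  for task in megablast dc-megablast blastn; do
    # -outfmt 6 prints one tab-separated line per reported alignment
    hits=$(blastn -task "$task" -query "$1" -subject "$2" -outfmt 6 | wc -l | tr -d ' ')
    printf '%s\t%s\n' "$task" "$hits"
  done
}
```

Calling compare_tasks query.fa subject.fa prints one line per task. Tasks that use a shorter word size will often, though not always, report more lines for divergent sequences.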

Use case 3: Align two or more sequences with custom tabular output format

Code:

blastn -query query.fa -subject subject.fa -outfmt '6 qseqid qlen qstart qend sseqid slen sstart send bitscore evalue pident' -out output.tsv

Motivation: This use case aligns two or more sequences, but instead of the default output format, it generates a custom tabular output format. This can be useful for specific analysis or downstream processing.

Explanation:

  • -query query.fa: Specifies the query sequence file.
  • -subject subject.fa: Specifies the subject sequence file.
  • -outfmt '6 qseqid qlen qstart qend sseqid slen sstart send bitscore evalue pident': Selects tabular output (format 6) and lists the columns to print, in order: query sequence ID, query sequence length, start and end positions of the alignment on the query, subject sequence ID, subject sequence length, start and end positions of the alignment on the subject, bit score, E-value, and percentage of identical matches.
  • -out output.tsv: Specifies the output file to save the results in.

Example output: The output of this command will be a tab-separated values (TSV) file containing the alignment results in the specified custom format.
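As an illustration of downstream processing, the tabular file can be filtered with awk. The sketch below assumes the exact column order from the -outfmt string above. The function name filter_hits and the two thresholds are hypothetical, and query coverage is approximated per alignment as (qend - qstart + 1) / qlen.

```shell
# Column order follows the -outfmt string used above:
#   1 qseqid  2 qlen  3 qstart  4 qend  5 sseqid  6 slen
#   7 sstart  8 send  9 bitscore  10 evalue  11 pident

# filter_hits FILE MIN_PIDENT MIN_QCOV   (hypothetical helper)
# Prints rows whose percent identity is at least MIN_PIDENT and whose
# aligned query span covers at least MIN_QCOV percent of the query.
# The computed coverage is appended as a 12th column.
filter_hits() {
  awk -F'\t' -v OFS='\t' -v min_id="$2" -v min_cov="$3" '
    {
      qcov = 100 * ($4 - $3 + 1) / $2   # percent of the query in this alignment
      if ($11 >= min_id && qcov >= min_cov)
        print $0, sprintf("%.1f", qcov)
    }
  ' "$1"
}
```

For example, filter_hits output.tsv 90 80 keeps alignments with at least 90% identity that span at least 80% of the query.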

Use case 4: Search nucleotide databases with specific parameters

Code:

blastn -query query.fa -db path/to/blast_db -num_threads 16 -max_target_seqs 10

Motivation: This use case demonstrates how to perform a nucleotide sequence search against a specific nucleotide database with custom parameters. It specifies the number of threads (CPU cores) to use and limits the maximum number of aligned sequences to keep.

Explanation:

  • -query query.fa: Specifies the query sequence file.
  • -db path/to/blast_db: Specifies the path to the nucleotide database for the search.
  • -num_threads 16: Sets the number of threads (CPU cores) to use during the search. Increasing the number of threads can speed up the search process.
  • -max_target_seqs 10: Limits the maximum number of aligned sequences to keep in the output. This can help to focus on the top hits or reduce the output size.

Example output: The output of this command will be the alignments for at most 10 database sequences per query.
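The -db option expects a database that has already been formatted for BLAST. If only a FASTA file is available, one way to prepare it is with makeblastdb, which ships with BLAST+. The following sketch assumes that situation. The function name build_and_search, the thread count of 4, and the tabular output format are illustrative choices, not part of the original command.

```shell
# build_and_search SUBJECT_FASTA DB_PREFIX QUERY_FASTA   (hypothetical helper)
# Formats SUBJECT_FASTA as a nucleotide BLAST database named DB_PREFIX,
# then searches it with QUERY_FASTA. Stops if the database build fails.
build_and_search() {
  # -dbtype nucl marks the input as nucleotide sequences
  makeblastdb -in "$1" -dbtype nucl -out "$2" || return 1
  # Thread count and output format are assumptions for this sketch
  blastn -query "$3" -db "$2" -num_threads 4 -max_target_seqs 10 -outfmt 6
}
```

A call such as build_and_search subject.fa path/to/blast_db query.fa builds the database once. Later searches can then skip the makeblastdb step and pass the same prefix to -db directly.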

Use case 5: Search the remote non-redundant nucleotide database

Code:

blastn -query query.fa -db nt -remote

Motivation: This use case illustrates how to search the remote non-redundant nucleotide database using a nucleotide query. The remote option allows accessing the database hosted on the NCBI server.

Explanation:

  • -query query.fa: Specifies the query sequence file.
  • -db nt: Selects NCBI's nt nucleotide collection (listed as nr/nt on the BLAST website) as the database to search.
  • -remote: Enables remote BLAST search to access the database hosted on the NCBI server.

Example output: The output of this command will be the aligned sequences from the remote non-redundant nucleotide database.
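Remote searches against the whole database can be slow, so it may help to narrow the search and save the results to a file. The sketch below uses the -entrez_query option, which blastn accepts only together with -remote. The function name remote_search, the choice of tabular output, and the example Entrez query are assumptions made for illustration.

```shell
# remote_search QUERY_FASTA ENTREZ_QUERY OUT_FILE   (hypothetical helper)
# Runs a remote search against nt, restricted to the database entries
# matched by ENTREZ_QUERY, and writes tabular results to OUT_FILE.
remote_search() {
  blastn -query "$1" -db nt -remote \
    -entrez_query "$2" \
    -outfmt 6 -out "$3"
}
```

For instance, remote_search query.fa 'Homo sapiens[Organism]' remote_hits.tsv would limit the search to human entries. The search runs on NCBI's servers, so it needs network access and may take some time.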

Use case 6: Display help

Code:

blastn -h

Motivation: This use case shows how to display the help information for the blastn command. The -h flag prints a brief usage summary and description; the -help flag prints the same information plus a description of every option.

Example output: The output of this command will be the help information for the blastn command.

Conclusion:

In this article, we have covered various use cases of the blastn command. Blastn is a versatile tool for aligning nucleotide sequences and offers a wide range of options to customize the search and output formats. By understanding these use cases, you can effectively utilize blastn for your nucleotide sequence analysis and annotation tasks.
