How to use the command 'wget' (with examples)

Wget is a command-line utility for downloading files from the web. It supports the HTTP, HTTPS, and FTP protocols and is commonly used to download web pages, individual files, images, and more. In this article, we will explore different use cases of the ‘wget’ command with examples.

Use case 1: Download the contents of a URL to a file

Code:

wget https://example.com/foo

Motivation: This use case is useful when you need to download a file from a specific URL. In this example, we are downloading the contents of the URL ‘https://example.com/foo’ and saving it as a file named ‘foo’.

Explanation:

  • wget: The command itself that initiates the download process.
  • https://example.com/foo: The URL from which the file will be downloaded.

Example output: The file ‘foo’ will be downloaded and saved in the current directory.
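
If you only want to inspect the contents without saving a file, a common variation (shown here as a sketch) uses the short options -q to suppress progress output and -O - to write the response to stdout instead of a file:

wget -qO- https://example.com/foo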

Use case 2: Download the contents of a URL to a specific file name

Code:

wget --output-document bar https://example.com/foo

Motivation: Sometimes, we may want to specify a custom file name for the downloaded content. In this use case, we are downloading the contents of the URL ‘https://example.com/foo’ and saving it as a file named ‘bar’.

Explanation:

  • --output-document bar: This option specifies the name of the output file (in this case, ‘bar’).

Example output: The content from ‘https://example.com/foo’ will be downloaded and saved in the file ‘bar’ in the current directory.
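
-O is the short form of --output-document, so the same command is often written more compactly as:

wget -O bar https://example.com/foo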

Use case 3: Download a web page and all its resources with intervals between requests

Code:

wget --page-requisites --convert-links --wait=3 https://example.com/somepage.html

Motivation: This use case is helpful when you want to download a web page along with all its resources, like images, stylesheets, and scripts. The ‘--page-requisites’ flag is used to download these resources, and the ‘--wait’ option ensures a 3-second delay between consecutive requests to avoid overloading the server.

Explanation:

  • --page-requisites: This option tells wget to download all resources required to render the web page correctly.
  • --convert-links: With this option, wget converts the links in the downloaded file to point to the local versions of the resources.
  • --wait=3: This option adds a 3-second delay between consecutive requests to the server.

Example output: The web page at ‘https://example.com/somepage.html’ will be downloaded along with its resources. The links in the downloaded file will be modified to point to the locally downloaded versions.
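
The same command can typically be written with the short options -p (--page-requisites), -k (--convert-links), and -w (--wait); adding --random-wait varies the delay around the base value so the requests look less mechanical:

wget -p -k -w 3 --random-wait https://example.com/somepage.html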

Use case 4: Download all listed files within a directory and its sub-directories

Code:

wget --mirror --no-parent https://example.com/somepath/

Motivation: If you want to download all the files within a directory, including its sub-directories, this use case is suitable. The ‘--mirror’ option enables a recursive download, while the ‘--no-parent’ option prevents wget from ascending into the parent directory, so only files under the given path are fetched.

Explanation:

  • --mirror: This option enables a recursive download, downloading files and directories within the given URL.
  • --no-parent: With this option, wget avoids ascending to the parent directory and only focuses on the specified URL.

Example output: All files within ‘https://example.com/somepath/’ and its sub-directories will be downloaded into a local directory tree (mirroring the remote structure) under the current directory.
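
Using the short forms -m (--mirror) and -np (--no-parent), a common refinement of this recipe also skips the auto-generated directory listing pages with --reject:

wget -m -np --reject "index.html*" https://example.com/somepath/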

Use case 5: Limit download speed and connection retries

Code:

wget --limit-rate=300k --tries=100 https://example.com/somepath/

Motivation: In situations where you want to limit the download speed to avoid overloading the network or the server, and also control the number of connection retries, this use case can be handy.

Explanation:

  • --limit-rate=300k: This option sets the maximum download rate to 300 kilobytes per second.
  • --tries=100: With this option, wget retries failed connections up to 100 times.

Example output: The files from ‘https://example.com/somepath/’ will be downloaded with a maximum speed of 300 kilobytes per second, and wget will attempt to establish a connection up to 100 times if necessary.
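
-t is the short form of --tries, and the rate limit also accepts an m suffix for megabytes. Adding --waitretry introduces a growing pause between retries, which is gentler on a struggling server:

wget --limit-rate=300k -t 100 --waitretry=5 https://example.com/somepath/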

Use case 6: Download a file from an HTTP server using Basic Auth

Code:

wget --user=username --password=password https://example.com

Motivation: If you need to download a file from an HTTP server that requires basic authentication, this use case provides the necessary options to pass the username and password.

Explanation:

  • --user=username: This option specifies the username for basic authentication.
  • --password=password: With this option, the password for basic authentication is provided.

Example output: Wget will authenticate using the specified username and password and download the file from the authenticated URL.
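
Passing the password on the command line exposes it in shell history and process listings. A safer variation uses --ask-password, which prompts for the password interactively instead:

wget --user=username --ask-password https://example.com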

Use case 7: Continue an incomplete download

Code:

wget --continue https://example.com

Motivation: If a previous download was interrupted or incomplete, this use case allows you to continue the download from where it left off.

Explanation:

  • --continue: This option instructs wget to continue the download if the file already exists, picking up from the last downloaded byte.

Example output: If the file exists, wget resumes the download from where it was interrupted. Otherwise, it performs a normal download.
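
-c is the short form of --continue. Pairing it with --tries=0 (retry indefinitely) is a common recipe for resuming large downloads over unreliable connections:

wget -c --tries=0 https://example.com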

Use case 8: Download all URLs stored in a text file to a specific directory

Code:

wget --directory-prefix path/to/directory --input-file URLs.txt

Motivation: When you have a list of URLs stored in a text file and want to download all of them to a specific directory, this use case is useful.

Explanation:

  • --directory-prefix path/to/directory: This option sets the download directory to ‘path/to/directory’.
  • --input-file URLs.txt: With this option, wget reads the list of URLs from the ‘URLs.txt’ file.

Example output: Wget will read the URLs from the ‘URLs.txt’ file and download all of them to the specified directory (‘path/to/directory’).
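
Using the short options -P (--directory-prefix) and -i (--input-file), the same command can be written as:

wget -P path/to/directory -i URLs.txt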

Conclusion:

The ‘wget’ command is a powerful tool for downloading files from the web. It can handle various use cases, such as downloading a single file, mirroring a website, limiting download speed, and resuming incomplete downloads. By understanding the different options and arguments it supports, you can make the most of wget and efficiently download files for your tasks.
