How to use the command 'wget' (with examples)

Wget is a widely used command-line utility for non-interactive downloading of files from the web. Supporting protocols such as HTTP, HTTPS, and FTP, wget is highly versatile: it can handle large volumes of data, automate downloads, and reliably retrieve web content. Its ability to manage recursive downloads, convert links, and handle authentication makes it an essential tool for network administrators and data enthusiasts.

Use case 1: Download the contents of a URL to a file (named “foo” in this case)

Code:

wget https://example.com/foo

Motivation: This basic download command is a staple for anyone needing to quickly grab a file hosted on the internet. Whether it be a document, image, or binary file, using wget in this way is straightforward and efficient.

Explanation:

  • wget: Invokes the wget command.
  • https://example.com/foo: The URL of the file to download. wget issues an HTTP GET request to this address and saves the response body to disk.

Example Output: A file named “foo” is created in the current directory, containing the data from the specified URL.
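
Note that wget will not overwrite an existing file by default: if “foo” is already present, the new copy is saved as “foo.1”. When running this in scripts, the progress output can be suppressed with the --quiet flag, for example:

wget --quiet https://example.com/foo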

Use case 2: Download the contents of a URL to a file (named “bar” in this case)

Code:

wget --output-document bar https://example.com/foo

Motivation: Sometimes, the file name on the server is not descriptive or conflicts with existing files. Specifying an output file ensures that you control the naming at the time of download.

Explanation:

  • --output-document bar: Writes the downloaded content to a file named “bar” instead of using the name from the URL.
  • https://example.com/foo: The source URL of the file to download.

Example Output: A file named “bar” is created in the current directory, filled with the downloaded content.
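
The short form of --output-document is -O. One variant worth knowing: passing - as the output file streams the download to standard output instead of a file, which lets you pipe the content into other tools, for example:

wget --output-document - https://example.com/foo | less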

Use case 3: Download a single web page and all its resources with 3-second intervals between requests

Code:

wget --page-requisites --convert-links --wait=3 https://example.com/somepage.html

Motivation: When needing to download a web page with all its associated resources like images, scripts, and stylesheets, it’s useful to use these options to ensure the page is fully functional offline.

Explanation:

  • --page-requisites: Instructs wget to download all resources needed to properly display the HTML page, such as images, stylesheets, and scripts.
  • --convert-links: Rewrites links in the downloaded documents so they point to the local copies and work correctly offline.
  • --wait=3: Introduces a 3-second delay between each request to the server, to prevent potential server overload.

Example Output: A directory structure is created with the HTML page and its associated resources, ready to be viewed offline.
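
The same command can be written with the short options -p, -k, and -w. To make the delay between requests less predictable, wget also offers --random-wait, which varies each pause by a random factor:

wget -p -k -w 3 --random-wait https://example.com/somepage.html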

Use case 4: Download all listed files within a directory and its sub-directories

Code:

wget --mirror --no-parent https://example.com/somepath/

Motivation: Ideal for backing up a website or downloading a complete directory structure from a provided URL, this setup is useful for website replication and archiving purposes.

Explanation:

  • --mirror: Shorthand for a combination of recursion, timestamping, and infinite-depth options suited to mirroring a website (the expanded form is shown below).
  • --no-parent: Prevents wget from following links that ascend to the parent of the directory specified. This confines the operation to the specified directory and its subfolders.

Example Output: A mirror of the whole directory and its subdirectories is downloaded, preserving the original structure.
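
For reference, --mirror is shorthand for a bundle of individual options, so the same operation can be requested explicitly; spelling them out makes it easier to adjust one piece of the behavior:

wget --recursive --timestamping --level=inf --no-remove-listing --no-parent https://example.com/somepath/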

Use case 5: Limit the download speed and the number of connection retries

Code:

wget --limit-rate=300k --tries=100 https://example.com/somepath/

Motivation: Essential when you need to manage bandwidth usage or cope with persistent errors during downloads. This command keeps the process controlled and resilient against intermittent network interruptions.

Explanation:

  • --limit-rate=300k: Limits the download speed to 300 kilobytes per second to keep bandwidth within a specified threshold.
  • --tries=100: Sets the number of reattempts to 100 in case of transient download failures.

Example Output: The specified targets download at a capped speed, with wget retrying up to 100 times on failure before giving up with an error.
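
If the connection is known to be especially unreliable, --tries=0 retries indefinitely, and --waitretry caps the backoff between attempts in seconds. A variant along those lines:

wget --limit-rate=300k --tries=0 --waitretry=10 https://example.com/somepath/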

Use case 6: Download a file from an HTTP server using Basic Auth

Code:

wget --user=username --password=password https://example.com

Motivation: Many web services and repositories require a username and password before granting access to files, making this command critical for authenticated downloads.

Explanation:

  • --user=username: Provides the username for HTTP or FTP authentication.
  • --password=password: Provides the password for the specified user.
  • https://example.com: The URL to download once the credentials are accepted.

Example Output: The requested file is downloaded after successful authentication.
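
Be aware that a password passed on the command line can be exposed through the shell history and the system's process list. Where that matters, wget can prompt for it interactively instead:

wget --user=username --ask-password https://example.com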

Use case 7: Continue an incomplete download

Code:

wget --continue https://example.com

Motivation: Disconnections can happen, and the ability to resume interrupted downloads without restarting from scratch saves time and bandwidth.

Explanation:

  • --continue: Instructs wget to resume a partially completed download, appending to the existing local file instead of starting over.

Example Output: The download resumes from the point it was stopped, completing the file if the server supports resuming.
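
The short form is -c. Resuming pairs naturally with the retry options from use case 5; assuming the server supports HTTP range requests, a command like the following will keep picking up where it left off:

wget --continue --tries=0 https://example.com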

Use case 8: Download all URLs stored in a text file to a specific directory

Code:

wget --directory-prefix path/to/directory --input-file URLs.txt

Motivation: This is efficient for bulk downloads. When managing a long list of URLs, storing them in a file and specifying a download directory keeps everything organized.

Explanation:

  • --directory-prefix path/to/directory: Specifies the directory where the downloaded files should be saved.
  • --input-file URLs.txt: Points to a text file containing the URLs to download, one per line.

Example Output: All files from the URLs listed in URLs.txt are downloaded into the specified directory.
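
If the same list is processed more than once, adding --no-clobber tells wget to skip files that already exist locally, which makes the command safe to re-run:

wget --no-clobber --directory-prefix path/to/directory --input-file URLs.txt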

Conclusion:

Wget is an indispensable tool for anyone working with online content. From simple downloads to managing large datasets and automating file retrieval, mastering wget and its diverse options can greatly enhance productivity and data-handling capabilities. With the practical examples and insights above, users can harness wget's full potential for their own tasks and objectives.
