
Mastering the wget Command: A Comprehensive Guide
A Comprehensive Guide to the wget Command
wget is a powerful command-line utility, widely used on Linux, for downloading files from the web. It handles HTTP, HTTPS, and FTP, making it a versatile tool for developers and system administrators. This post covers the essential functionality of wget, with practical examples for common download tasks.
Basic Usage
To download a file from a specific URL, simply execute:
wget https://example.com/foo
This command downloads the contents of the URL and saves them as “foo” in your current directory. The file name is taken from the last component of the URL path.
Customizing Output File Names
If you want to save the downloaded contents under a different name, use the -O option:
wget -O bar https://example.com/foo
This command will save the contents from the URL to a file named “bar”.
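A related trick: passing - as the file name to -O makes wget write the download to standard output, so you can pipe it into another program. The sketch below assumes the URL returns text; the helper name count_lines is hypothetical and only there for illustration.

```shell
# Hypothetical helper: stream a URL to standard output and count its lines.
# -q silences wget's progress output; "-O -" writes the body to stdout.
count_lines() {
    wget -q -O - "$1" | wc -l
}
```

You would call it as count_lines https://example.com/foo. Nothing is written to disk in this case.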
Downloading Complete Web Pages
To download a single web page along with the resources it needs to display properly, such as scripts, stylesheets, and images, use the following command, which also pauses between requests:
wget -p -k -w 3 https://example.com/somepage.html
Here:
-p ensures all resources are downloaded.
-k converts links for offline viewing.
-w 3 introduces a 3-second wait between requests to avoid overwhelming the server.
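The short flags are easy to forget, so in scripts it can help to spell them out. Here is the same command sketched with wget's long option names, wrapped in a function; the name save_page is hypothetical.

```shell
# Hypothetical wrapper around the page download shown above.
# --page-requisites is -p, --convert-links is -k, --wait=3 is -w 3.
save_page() {
    wget --page-requisites --convert-links --wait=3 "$1"
}
```

Calling save_page https://example.com/somepage.html should behave the same as the short-flag version.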
Mirroring Websites
If you want to download an entire directory and its sub-directories without grabbing embedded page elements, use:
wget -m -np https://example.com/somepath/
-m enables mirroring, which effectively replicates the site structure locally.
-np prevents wget from ascending to the parent directory.
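Mirroring can generate a lot of requests, so you may want to add a pause and keep the result in its own directory. The sketch below is one way to do that; the function name mirror_path, the one-second wait, and the default directory name "mirror" are all assumptions for illustration.

```shell
# Hypothetical wrapper: mirror a path politely into a chosen directory.
# $1: URL of the directory to mirror
# $2: local target directory (defaults to "mirror")
mirror_path() {
    # --mirror is -m, --no-parent is -np, --directory-prefix is -P.
    wget --mirror --no-parent --wait=1 --directory-prefix="${2:-mirror}" "$1"
}
```

For example, mirror_path https://example.com/somepath/ site would place the mirrored files under ./site.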
Managing Download Speed and Retries
To limit your download speed and specify a number of retries in case of failures, use:
wget --limit-rate=300k -t 100 https://example.com/somepath/
This command caps the download speed at 300 KB/s and makes up to 100 attempts before giving up.
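When a server is flaky, retrying immediately is rarely useful. The sketch below adds two more wget options to the command above: --waitretry, which backs off between retries up to the given number of seconds, and --retry-connrefused, which treats a refused connection as a temporary failure. The function name fetch_patiently and the 10-second ceiling are assumptions.

```shell
# Hypothetical wrapper: rate-limited download with patient retries.
fetch_patiently() {
    # --tries=100 is the long form of -t 100.
    # --waitretry=10 waits 1s, 2s, ... up to 10s between retries.
    # --retry-connrefused retries even when the connection is refused.
    wget --limit-rate=300k --tries=100 --waitretry=10 --retry-connrefused "$1"
}
```

Usage would look like fetch_patiently https://example.com/somepath/.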
Downloading with Authentication
For resources protected by HTTP Basic Authentication, you can specify the username and password on the command line:
wget --user username --password password https://example.com
This allows you to access protected resources seamlessly.
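One caveat: a password typed on the command line can end up in your shell history and may be visible to other users in the process list. As a sketch of an alternative, wget's --ask-password option prompts for the password instead. The function name fetch_protected is hypothetical.

```shell
# Hypothetical wrapper: authenticate without putting the password on the
# command line.
# $1: username
# $2: URL of the protected resource
fetch_protected() {
    # --ask-password makes wget prompt for the password interactively.
    wget --user="$1" --ask-password "$2"
}
```

Running fetch_protected username https://example.com should prompt for the password before the download starts.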
Continuing Incomplete Downloads
If your download was interrupted, you can resume it using:
wget -c https://example.com
The -c option tells wget to continue from where the download left off.
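If the connection keeps dropping, you can wrap -c in a small loop so each attempt picks up where the last one stopped. This is only a sketch: the function name resume_until_done, the default of 5 attempts, and the 5-second pause are assumptions, and it relies on wget exiting with a non-zero status when a download fails.

```shell
# Hypothetical helper: keep resuming a download until it succeeds.
# $1: URL to download
# $2: maximum number of attempts (defaults to 5)
resume_until_done() {
    url=$1
    max_attempts=${2:-5}
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # -c resumes a partial file instead of starting over.
        wget -c "$url" && return 0
        attempt=$((attempt + 1))
        # Short pause before the next attempt.
        sleep 5
    done
    # Every attempt failed.
    return 1
}
```

For example, resume_until_done https://example.com/foo 10 would try up to ten times.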
Downloading Multiple Files from a Text File
To download all URLs listed in a text file, directing them to a specific directory, use:
wget -P path/to/directory -i URLs.txt
The -P option specifies the target directory, while -i indicates the input file containing the URLs.
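The URL list does not have to live in a file. When - is given to -i, wget reads the URLs from standard input, one per line. The sketch below uses that to download a handful of URLs passed as arguments; the function name download_all is hypothetical.

```shell
# Hypothetical helper: download several URLs into one directory.
# Usage: download_all DIR URL...
download_all() {
    if [ "$#" -lt 2 ]; then
        echo "usage: download_all DIR URL..." >&2
        return 2
    fi
    dir=$1
    shift
    # Print one URL per line and let wget read the list from stdin ("-i -").
    printf '%s\n' "$@" | wget -P "$dir" -i -
}
```

A call such as download_all path/to/directory https://example.com/foo https://example.com/bar should fetch both files into the given directory.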
Conclusion
wget is an invaluable utility for anyone needing to download files, whether you’re a seasoned developer or a casual user. With its multitude of options, you can customize your downloads to suit your needs, from simple file retrieval to complex web scraping tasks. For more information, visit the official GNU wget documentation. Happy downloading!