Wget

Wget is a command-line tool built for one thing: getting files from the Internet when conditions are bad.

Where other tools assume your connection is stable, wget assumes it will fail. It retries automatically. It resumes interrupted downloads. It waits politely between requests so servers don't block you. It runs in the background while you do other things.

Wget assumes your download will fail. That paranoia is its gift.

The Simplest Case

Download a file:

wget https://example.com/file.pdf

The file saves to your current directory. If the download fails, wget retries. If your connection drops partway through, add -c and wget picks up where it left off:

wget -c https://example.com/largefile.zip

That -c flag embodies wget's philosophy: networks are unreliable, but your download should still complete.

Downloading in Hostile Conditions

Resume Interrupted Downloads

wget -c https://example.com/ubuntu.iso

Run this after a failed download. Wget checks how much you have, asks the server for the rest, and continues. No wasted bandwidth.

Retry Until It Works

wget --tries=0 --waitretry=10 https://example.com/file.pdf

--tries=0 means unlimited retries (fatal errors such as a 404 or a refused connection still stop it). --waitretry=10 backs off between attempts, waiting 1 second, then 2, up to 10. Walk away. Come back to a completed download.
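
The retry flags combine naturally with resume. A sketch for an unattended download (the URL and timeout value are illustrative):

# Resume a partial file and keep retrying until it completes
wget -c --tries=0 --waitretry=10 --timeout=30 https://example.com/largefile.zip

--timeout=30 keeps a stalled connection from hanging forever; each timeout counts as a failure and triggers the next retry.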

Background Downloads

wget -b https://example.com/largefile.zip

Wget forks to the background and logs progress to wget-log. Check on it with tail -f wget-log.
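If you want the log somewhere other than wget-log, -o names it; a sketch (filenames are illustrative):

wget -b -o iso-download.log -c https://example.com/ubuntu.iso
tail -f iso-download.log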

Limit Bandwidth

wget --limit-rate=500k https://example.com/largefile.zip

Don't saturate your connection. This caps wget at 500 KB/sec.

Recursive Downloads

Wget's real power: downloading entire directory structures.

Download Everything Under a Path

wget --recursive --no-parent https://example.com/docs/

This follows every link under /docs/ and downloads what it finds. --no-parent prevents wget from wandering up to the parent directory.
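
By default wget follows links up to five levels deep. --level (or -l) changes that; a sketch limiting the crawl to two levels:

wget --recursive --level=2 --no-parent https://example.com/docs/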

Mirror an Entire Website

wget --mirror --convert-links --page-requisites --no-parent https://example.com
  • --mirror: recursive with infinite depth, timestamps for efficiency
  • --convert-links: rewrite links to work offline
  • --page-requisites: grab CSS, images, JavaScript—everything needed to render pages
  • --no-parent: stay within the target directory

You get a complete offline copy.
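
If you don't want the mirror dumped into your current directory, -P (--directory-prefix) puts it somewhere else; a sketch (the directory name is illustrative):

wget --mirror --convert-links --page-requisites --no-parent -P example-mirror https://example.com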

Filter by File Type

# Only PDFs
wget --recursive --accept pdf https://example.com/documents/

# Only images
wget --recursive --accept jpg,png,gif https://example.com/gallery/

# Everything except executables and archives
wget --recursive --reject exe,zip https://example.com/

Be Polite to Servers

wget --recursive --wait=2 --limit-rate=200k https://example.com

--wait=2 pauses 2 seconds between requests. Combined with rate limiting, this prevents you from hammering the server—and getting blocked.
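
--random-wait goes a step further, varying each pause between 0.5 and 1.5 times the --wait value so your requests look less like a machine on a timer; a sketch:

wget --recursive --wait=2 --random-wait --limit-rate=200k https://example.com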

Downloading Multiple Files

Create a file with URLs, one per line:

https://example.com/file1.pdf
https://example.com/file2.pdf
https://example.com/file3.pdf

Then:

wget -i urls.txt

Wget downloads each in sequence, with its usual retry logic.
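
The list works with the other flags; a sketch that resumes partial files and keeps everything in one directory (filenames are illustrative):

wget -c -i urls.txt -P downloads/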

Authentication

Basic Auth

wget --user=username --password=password https://example.com/secure/file.pdf

Or prompt for the password:

wget --user=username --ask-password https://example.com/secure/file.pdf
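
To keep the password out of your shell history and process listings entirely, wget can also read credentials from ~/.netrc; a sketch of an entry (hostname and credentials are illustrative):

machine example.com
login username
password secret

Restrict the file with chmod 600 ~/.netrc.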

Cookie-Based Sessions

# Login and capture session
wget --save-cookies cookies.txt \
     --post-data='user=me&pass=secret' \
     https://example.com/login

# Use session for subsequent downloads
wget --load-cookies cookies.txt https://example.com/secure/data.zip
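
One caveat: many sites issue session cookies (no expiry date), and --save-cookies skips those unless you ask. --keep-session-cookies fixes that; a sketch of the login step with it added:

wget --save-cookies cookies.txt --keep-session-cookies \
     --post-data='user=me&pass=secret' \
     https://example.com/login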

Common Options Reference

Option             Purpose
-O filename        Save with a specific filename
-c                 Continue/resume a partial download
-b                 Run in the background
-q                 Quiet, no output
-i file            Read URLs from a file
--tries=N          Try N times (0 = infinite)
--limit-rate=N     Limit bandwidth (e.g., 500k)
--recursive        Follow links and download
--mirror           Full site mirror with timestamps
--convert-links    Make links work offline
--no-parent        Don't ascend to parent directories
--accept=ext       Only download these extensions
--reject=ext       Skip these extensions
-N                 Only download if newer than the local copy

Scripting with Wget

Check if download succeeded:

if wget -q https://example.com/file.pdf; then
    echo "Download complete"
else
    echo "Download failed with exit code $?"
fi

Key exit codes (handled in the sketch below):

  • 0: Success
  • 4: Network failure
  • 5: SSL verification failure
  • 6: Authentication failure
  • 8: Server issued an error response (e.g., 404 or 500)
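
Scripts can branch on the specific code; a minimal sketch (the URL and messages are illustrative):

wget -q https://example.com/file.pdf
status=$?
case $status in
    0) echo "Download complete" ;;
    4) echo "Network failure - check your connection" ;;
    5) echo "SSL verification failed" ;;
    6) echo "Authentication failed" ;;
    8) echo "Server returned an error" ;;
    *) echo "wget failed with exit code $status" ;;
esac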

Configuration File

Store defaults in ~/.wgetrc:

timeout = 60
tries = 3
timestamping = on
wait = 1
limit_rate = 500k

Now every wget command uses these settings unless overridden.
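
Command-line flags still win for a single run, and -e (--execute) accepts any .wgetrc-style command ad hoc; a sketch (values are illustrative):

# Override the configured rate limit for this one download
wget --limit-rate=2m https://example.com/largefile.zip

# Same effect expressed as a .wgetrc command
wget -e 'limit_rate=2m' https://example.com/largefile.zip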

When to Use Wget vs. Curl

Use wget for:

  • Downloading files, especially large ones
  • Mirroring websites for offline use
  • Batch downloads from URL lists
  • Anything that might fail and need automatic retry

Use curl for:

  • API testing and development
  • Custom HTTP methods (PUT, DELETE, PATCH)
  • Uploading data
  • Examining response headers
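
For the overlap case, a plain file download, the two look like this (the flags shown are common choices, not the only way):

# wget: retries and resume are built in
wget -c https://example.com/file.pdf

# curl: -O keeps the remote filename, -L follows redirects, -C - resumes
curl -L -O -C - https://example.com/file.pdf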

Wget downloads. Curl transfers. Both tools. Different jobs.
