Fake Real Domains Locally — Docker Network Aliases + Caddy for E2E Testing
by Sylvain Artois on Feb 10, 2026
- #Testing
- #Docker
- #DevOps
When you build a crawler, you need websites to crawl. Hitting production sites during tests is slow, flaky, and rude. Mocking HTTP responses works for unit tests, but misses the point for end-to-end testing — you want your crawler to face real HTML, real navigation structures, real assets.
The solution is simple: mirror the website locally and make it available under its real domain name, so your crawler cannot tell the difference. Three tools are enough: wget, Caddy, and Docker Compose.
Step 1 — Mirror the Website with wget
wget --mirror downloads a full copy of a website, rewriting links for offline browsing. Here is the core command:
wget \
--mirror \
--convert-links \
--adjust-extension \
--page-requisites \
--no-parent \
--level="$depth" \
--directory-prefix="$SITES_DIR" \
--reject "*.exe,*.zip,*.tar.gz" \
--user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ..." \
--wait=0.5 \
--random-wait \
--tries=3 \
--timeout=30 \
"$url"
Key flags:
- `--mirror` enables recursion, timestamping, and infinite depth.
- `--convert-links` rewrites URLs in downloaded pages to point to local files.
- `--adjust-extension` adds `.html` extensions when needed.
- `--page-requisites` fetches images, stylesheets, and scripts referenced by each page.
- `--no-parent` prevents ascending into parent directories.
- `--wait` / `--random-wait` throttle requests to avoid hammering the server.
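The wget invocation above is easier to reuse wrapped in a small function. A minimal sketch, where the `mirror_site` name and the `DRY_RUN` convention are hypothetical additions (the `--user-agent` string from the original command is omitted here):

```shell
# mirror_site — hypothetical wrapper around the wget mirror command above.
# Set DRY_RUN=1 to print the assembled command instead of fetching anything.
mirror_site() {
  local url="${1:?usage: mirror_site <url> [depth]}"
  local depth="${2:-3}"                      # assumed default recursion depth
  local sites_dir="${SITES_DIR:-./sites}"    # wget creates a <domain>/ subdirectory here

  local cmd=(wget --mirror --convert-links --adjust-extension \
    --page-requisites --no-parent --level="$depth" \
    --directory-prefix="$sites_dir" \
    --reject "*.exe,*.zip,*.tar.gz" \
    --wait=0.5 --random-wait --tries=3 --timeout=30 "$url")

  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "${cmd[*]}"                # preview the command, do not run it
  else
    "${cmd[@]}"
  fi
}

# Preview what a depth-2 mirror would run:
# DRY_RUN=1 mirror_site https://example.com 2
```

The dry-run branch makes the wrapper safe to test without network access.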
The result is a static copy of the site under sites/<domain>/. For instance:
sites/
├── mgplomberie84.fr/
│   ├── index.html
│   ├── plomberie/index.html
│   ├── chauffage/index.html
│   └── ...
└── woemen.fr/
    ├── index.html
    └── ...
See the GNU Wget Manual for the full option reference.
Step 2 — Serve Each Site Under Its Real Domain with Caddy
Caddy supports virtual hosts out of the box. Each site block in the Caddyfile matches requests by hostname, so one Caddy instance can serve multiple domains simultaneously.
# --- Virtual hosts (one per mirrored site) ---
mgplomberie84.fr:80 {
    root * /srv/sites/mgplomberie84.fr
    try_files {path} {path}/ {path}/index.html /index.html
    file_server browse
    header {
        Access-Control-Allow-Origin *
        Access-Control-Allow-Methods "GET, POST, OPTIONS"
        Access-Control-Allow-Headers "*"
    }
    log {
        output stdout
        format console
    }
}

woemen.fr:80 {
    root * /srv/sites/woemen.fr
    try_files {path} {path}/ {path}/index.html /index.html
    file_server browse
    header {
        Access-Control-Allow-Origin *
        Access-Control-Allow-Methods "GET, POST, OPTIONS"
        Access-Control-Allow-Headers "*"
    }
    log {
        output stdout
        format console
    }
}

# --- Fallback: path-based browsing from host ---
:9080 {
    root * /srv/sites
    try_files {path} {path}/ {path}/index.html /index.html
    file_server browse
    log {
        output stdout
        format console
    }
}
Each virtual host block binds to port 80 and matches incoming requests by their Host header. The fallback block on :9080 provides path-based access from the host machine (useful for debugging).
See Caddyfile Concepts for details on site blocks and address matching.
Step 3 — Docker Network Aliases for DNS Resolution
This is the key trick. Docker Compose lets you declare network aliases for a service. Any container on the same network can resolve those aliases as hostnames — no /etc/hosts hacking, no custom DNS server.
services:
  sandbox:
    image: caddy:2-alpine
    container_name: wispra_website_sandbox
    restart: unless-stopped
    ports:
      - "9999:9080"  # Fallback path-based browsing from host
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - ./sites:/srv/sites:ro
    networks:
      wispra_network:
        aliases:
          - mgplomberie84.fr
          - woemen.fr

networks:
  wispra_network:
    external: true
    name: wispra_infrastructure_wispra_network
The aliases section is the key: every container attached to wispra_network can now resolve mgplomberie84.fr and woemen.fr to the sandbox container’s IP. Because Caddy uses virtual hosts, it routes each request to the correct site directory based on the Host header.
The network is declared external so the sandbox joins the same network as the API container — no extra configuration needed on the API side.
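For reference, joining from the API side is just a matter of attaching to the same external network. A minimal compose sketch, where the `api` service name and image are hypothetical:

```yaml
services:
  api:
    image: my-crawler-api:latest  # hypothetical image name
    networks:
      - wispra_network            # same network: the aliases resolve automatically

networks:
  wispra_network:
    external: true
    name: wispra_infrastructure_wispra_network
```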
The Result
From any container on the same Docker network:
curl http://mgplomberie84.fr/ # served by Caddy from local mirror
curl http://woemen.fr/ # same container, different virtual host
From the host machine (for debugging):
curl http://localhost:9999/mgplomberie84.fr/
The crawler sees a real website structure — full HTML pages, CSS, images, internal links — but everything runs locally. Tests are fast, deterministic, and fully offline. Adding a new site is straightforward: run wget, add a Caddyfile block, add a network alias, restart.
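The "add a Caddyfile block" step is mechanical enough to script. A sketch with a hypothetical `caddy_vhost` helper that prints a virtual-host block matching the pattern from Step 2:

```shell
# caddy_vhost — hypothetical helper that prints a Caddyfile virtual-host
# block for a newly mirrored domain, following the layout used above.
caddy_vhost() {
  local domain="${1:?usage: caddy_vhost <domain>}"
  cat <<EOF
${domain}:80 {
    root * /srv/sites/${domain}
    try_files {path} {path}/ {path}/index.html /index.html
    file_server browse
    log {
        output stdout
        format console
    }
}
EOF
}

# Typical use: append the block, then restart the sandbox
# (and remember to add the matching network alias in docker-compose.yml):
#   caddy_vhost newsite.fr >> Caddyfile
#   docker compose restart sandbox
```

Generating the block from one template keeps every mirrored site configured identically.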
Wrapping Up
This setup uses three standard, well-documented tools. Nothing exotic, yet the combination is surprisingly effective for E2E testing against real web content. No mocking frameworks, no fragile fixtures, no reliance on external servers: just static files, a web server with virtual hosts, and DNS aliases inside Docker.