Fake Real Domains Locally — Docker Network Aliases + Caddy for E2E Testing

by Sylvain Artois on Feb 10, 2026

  • #Testing
  • #Docker
  • #DevOps

Silver Sun, 1929 - Arthur Dove - www.nga.gov

When you build a crawler, you need websites to crawl. Hitting production sites during tests is slow, flaky, and rude. Mocking HTTP responses works for unit tests, but misses the point for end-to-end testing — you want your crawler to face real HTML, real navigation structures, real assets.

The solution is simple: mirror the website locally and make it available under its real domain name, so your crawler cannot tell the difference. Three tools are enough: wget, Caddy, and Docker Compose.

Step 1 — Mirror the Website with wget

wget can download a full copy of a website and rewrite its links for offline browsing. Here is the core command ($depth, $SITES_DIR, and $url are shell variables set by the surrounding script):

wget \
    --mirror \
    --convert-links \
    --adjust-extension \
    --page-requisites \
    --no-parent \
    --level="$depth" \
    --directory-prefix="$SITES_DIR" \
    --reject "*.exe,*.zip,*.tar.gz" \
    --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ..." \
    --wait=0.5 \
    --random-wait \
    --tries=3 \
    --timeout=30 \
    "$url"

Key flags:

  • --mirror enables recursion, timestamping, and infinite depth; the --level="$depth" flag that follows it then caps the depth.
  • --convert-links rewrites URLs in downloaded pages to point to local files.
  • --adjust-extension adds .html extensions when needed.
  • --page-requisites fetches images, stylesheets, and scripts referenced by each page.
  • --no-parent prevents ascending into parent directories.
  • --wait / --random-wait throttle requests to avoid hammering the server.

The result is a static copy of the site under sites/<domain>/. For instance:

sites/
├── mgplomberie84.fr/
│   ├── index.html
│   ├── plomberie/index.html
│   ├── chauffage/index.html
│   └── ...
└── woemen.fr/
    ├── index.html
    └── ...

See the GNU Wget Manual for the full option reference.
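
The directory name matters later, because the Caddy root and the Docker alias must both match it. As a minimal sketch, the helper below predicts where a mirror lands for a given URL. The function name mirror_dir and the ./sites default are assumptions for illustration, and it only handles plain http(s) URLs without credentials or an explicit port.

```shell
# Hypothetical helper: print the directory wget creates for a mirrored URL.
# Assumes a plain http(s) URL with no credentials and no explicit port.
# The body runs in a subshell, so its variables stay local to the call.
mirror_dir() (
    sites_dir="${SITES_DIR:-./sites}"   # same prefix passed to --directory-prefix
    host="${1#*://}"                    # drop the scheme
    host="${host%%/*}"                  # drop the path
    printf '%s/%s\n' "$sites_dir" "$host"
)
```

For example, with SITES_DIR=sites, calling mirror_dir "https://woemen.fr/contact" prints sites/woemen.fr.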

Step 2 — Serve Each Site Under Its Real Domain with Caddy

Caddy supports virtual hosts out of the box. Each site block in the Caddyfile matches requests by hostname, so one Caddy instance can serve multiple domains simultaneously.

# --- Virtual hosts (one per mirrored site) ---

mgplomberie84.fr:80 {
    root * /srv/sites/mgplomberie84.fr
    try_files {path} {path}/ {path}/index.html /index.html
    file_server browse
    header {
        Access-Control-Allow-Origin *
        Access-Control-Allow-Methods "GET, POST, OPTIONS"
        Access-Control-Allow-Headers "*"
    }
    log {
        output stdout
        format console
    }
}

woemen.fr:80 {
    root * /srv/sites/woemen.fr
    try_files {path} {path}/ {path}/index.html /index.html
    file_server browse
    header {
        Access-Control-Allow-Origin *
        Access-Control-Allow-Methods "GET, POST, OPTIONS"
        Access-Control-Allow-Headers "*"
    }
    log {
        output stdout
        format console
    }
}

# --- Fallback: path-based browsing from host ---

:9080 {
    root * /srv/sites
    try_files {path} {path}/ {path}/index.html /index.html
    file_server browse
    log {
        output stdout
        format console
    }
}

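Each virtual host block binds to port 80 and matches incoming requests by Host header. Because the addresses name port 80 explicitly, Caddy serves them over plain HTTP and does not attempt automatic HTTPS, so the crawler must use http:// URLs. The fallback block on :9080 provides path-based access from the host machine (useful for debugging).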

See Caddyfile Concepts for details on site blocks and address matching.
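
The two virtual host blocks above differ only in the domain name, so they can be generated rather than copied by hand. The following is a sketch, not part of the original setup: caddy_site_block is a hypothetical helper that prints one block for the domain passed as its first argument, assuming mirrors are mounted under /srv/sites/<domain>.

```shell
# Hypothetical helper: print one Caddyfile site block for a mirrored domain.
# Assumes the mirror is mounted at /srv/sites/<domain> inside the container.
caddy_site_block() {
    # Unquoted heredoc: only $1 is expanded, {path} is passed through as-is.
    cat <<EOF
$1:80 {
    root * /srv/sites/$1
    try_files {path} {path}/ {path}/index.html /index.html
    file_server browse
    header {
        Access-Control-Allow-Origin *
        Access-Control-Allow-Methods "GET, POST, OPTIONS"
        Access-Control-Allow-Headers "*"
    }
    log {
        output stdout
        format console
    }
}
EOF
}
```

Redirecting the output of caddy_site_block for each domain into the Caddyfile keeps the blocks identical apart from the hostname.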

Step 3 — Docker Network Aliases for DNS Resolution

This is the key trick. Docker Compose lets you declare network aliases for a service. Any container on the same network can resolve those aliases as hostnames — no /etc/hosts hacking, no custom DNS server.

services:
  sandbox:
    image: caddy:2-alpine
    container_name: wispra_website_sandbox
    restart: unless-stopped
    ports:
      - "9999:9080" # Fallback path-based browsing from host
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - ./sites:/srv/sites:ro
    networks:
      wispra_network:
        aliases:
          - mgplomberie84.fr
          - woemen.fr

networks:
  wispra_network:
    external: true
    name: wispra_infrastructure_wispra_network

With these aliases in place, every container attached to wispra_network can resolve mgplomberie84.fr and woemen.fr to the sandbox container's IP. Because Caddy uses virtual hosts, it routes each request to the correct site directory based on the Host header.

The network is declared external so the sandbox joins the same network as the API container — no extra configuration needed on the API side.
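
Since each alias has to match a directory under sites/, the alias list can be derived from the mirrors themselves. This is only a sketch: list_aliases is a hypothetical helper, and the ten-space indent is an assumption chosen to match the Compose file shown above.

```shell
# Hypothetical helper: print one Compose alias line per mirrored site directory.
# The body runs in a subshell, so its variables stay local to the call.
list_aliases() (
    sites_dir="${1:-./sites}"
    for dir in "$sites_dir"/*/; do
        [ -d "$dir" ] || continue   # skip the unexpanded pattern when nothing matches
        printf '          - %s\n' "$(basename "$dir")"
    done
)
```

Running list_aliases ./sites prints lines that can be pasted under the aliases key, which makes a missing alias easy to spot after adding a new mirror.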

The Result

From any container on the same Docker network:

curl http://mgplomberie84.fr/          # served by Caddy from local mirror
curl http://woemen.fr/                  # same container, different virtual host

From the host machine (for debugging):

curl http://localhost:9999/mgplomberie84.fr/

The crawler sees a real website structure — full HTML pages, CSS, images, internal links — but everything runs locally. Tests are fast, deterministic, and fully offline. Adding a new site is straightforward: run wget, add a Caddyfile block, add a network alias, restart.
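
Before pointing the crawler at the sandbox, it can help to confirm that every domain actually answers. The function below is a minimal smoke-test sketch; check_site is a hypothetical name, and it assumes curl is available in the container where it runs.

```shell
# Hypothetical smoke test: succeed only if the domain answers with HTTP 200.
# Meant to run inside a container attached to the same Docker network.
check_site() (
    # -w prints only the status code; the response body is discarded.
    status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "http://$1/")
    [ "$status" = "200" ]
)
```

Calling check_site mgplomberie84.fr and check_site woemen.fr from any container on the network that has curl installed returns a non-zero exit status as soon as a mirror, a Caddyfile block, or an alias is missing.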

Wrapping Up

This setup uses three well-documented, standard tools, nothing exotic. Yet the combination is surprisingly effective for E2E testing against real web content. No mocking frameworks, no fragile fixtures, no reliance on external servers. Just static files, a web server, and DNS aliases inside Docker.

