
# Self-Hosting

Run the webclaw server on your own infrastructure. Choose Docker for the fastest setup, build from source for maximum control, or deploy to Fly.io for managed hosting.

## Docker

The quickest way to run webclaw. The image includes the server binary and all dependencies.

### Run the server

```bash
docker run -p 3000:3000 ghcr.io/0xmassi/webclaw:latest
```

### With authentication

```bash
docker run -p 3000:3000 \
  -e WEBCLAW_API_KEY=mysecret \
  ghcr.io/0xmassi/webclaw:latest
```
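To keep the key out of shell history, it can also be supplied from an env file via Docker's standard `--env-file` flag. A sketch — the filename is arbitrary:

```bash
# webclaw.env (keep out of version control) contains one line:
#   WEBCLAW_API_KEY=mysecret
docker run -p 3000:3000 --env-file webclaw.env ghcr.io/0xmassi/webclaw:latest
```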

## Docker Compose with Ollama

For LLM features (extract, summarize), run Ollama alongside webclaw.

```yaml
# docker-compose.yml
version: "3.8"

services:
  webclaw:
    image: ghcr.io/0xmassi/webclaw:latest
    ports:
      - "3000:3000"
    environment:
      - WEBCLAW_API_KEY=mysecret
      - OLLAMA_HOST=http://ollama:11434
    depends_on:
      - ollama

  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama

volumes:
  ollama_data:
```
> **Tip:** After starting the compose stack, pull a model into Ollama: `docker compose exec ollama ollama pull qwen3:8b`. Addressing the service by its compose name works regardless of how Compose names the underlying container.

## From source

Build the binaries directly from the Rust source. Requires Rust 1.75+.

```bash
git clone https://github.com/0xMassi/webclaw.git
cd webclaw
cargo build --release
```

The build produces three binaries in `target/release/`:

| Binary | Description |
| --- | --- |
| `webclaw` | CLI tool for extraction, crawling, and more. |
| `webclaw-server` | REST API server (axum). |
| `webclaw-mcp` | MCP server for AI agents. |

### Start the server

```bash
./target/release/webclaw-server --port 3000 --api-key mysecret
```
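For unattended operation, the server can run under systemd. A minimal unit sketch — the install path `/opt/webclaw/` and the dedicated `webclaw` user are assumptions, not project conventions:

```ini
# /etc/systemd/system/webclaw.service (hypothetical path)
[Unit]
Description=webclaw server
After=network.target

[Service]
User=webclaw
ExecStart=/opt/webclaw/webclaw-server --port 3000
Environment=WEBCLAW_API_KEY=mysecret
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with `systemctl enable --now webclaw`.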

## Fly.io

Deploy to Fly.io for managed infrastructure with global edge distribution.

```toml
# fly.toml
app = "webclaw"
primary_region = "iad"

[build]
  image = "ghcr.io/0xmassi/webclaw:latest"

[env]
  WEBCLAW_PORT = "8080"
  WEBCLAW_HOST = "0.0.0.0"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true

[[vm]]
  size = "shared-cpu-1x"
  memory = "512mb"
```

### Deploy

```bash
fly launch
fly secrets set WEBCLAW_API_KEY=mysecret
```

## Environment variables

All configuration is done through environment variables. None are required; the server runs with sensible defaults.

### Server

| Variable | Default | Description |
| --- | --- | --- |
| `WEBCLAW_PORT` | `3000` | HTTP port to listen on. |
| `WEBCLAW_HOST` | `0.0.0.0` | Bind address. |
| `WEBCLAW_API_KEY` | -- | API key for authentication. If unset, no auth is required. |
| `WEBCLAW_MAX_CONCURRENCY` | `50` | Max concurrent extraction tasks. |
| `WEBCLAW_JOB_TTL_SECS` | `3600` | How long to keep completed crawl jobs (seconds). |
| `WEBCLAW_MAX_JOBS` | `100` | Maximum number of concurrent crawl jobs. |
| `WEBCLAW_LOG` | `info` | Tracing filter (e.g. `debug`, `webclaw=trace`). |
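For example, a run tuned down for a small machine might look like this (the values are illustrative, not recommendations):

```bash
WEBCLAW_PORT=8080 \
WEBCLAW_MAX_CONCURRENCY=10 \
WEBCLAW_LOG=webclaw=debug \
./target/release/webclaw-server
```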

### Proxy

| Variable | Default | Description |
| --- | --- | --- |
| `WEBCLAW_PROXY` | -- | Single proxy URL (`http`, `https`, or `socks5`). |
| `WEBCLAW_PROXY_FILE` | -- | Path to a file with one proxy URL per line. Rotated per-request. |
| `WEBCLAW_ANTIBOT_URL` | -- | Anti-bot service endpoint. |
| `WEBCLAW_ANTIBOT_KEY` | -- | API key for the anti-bot service. |
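A proxy file holds one upstream per line, and schemes can be mixed. A sketch against the source-built binary — the hosts and credentials are placeholders:

```bash
# proxies.txt contains, for example:
#   http://user:pass@proxy1.example.com:8080
#   socks5://proxy2.example.com:1080
WEBCLAW_PROXY_FILE=./proxies.txt ./target/release/webclaw-server
```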

### Auth and OAuth (cloud features)

| Variable | Default | Description |
| --- | --- | --- |
| `DATABASE_URL` | -- | PostgreSQL connection string. Enables OAuth and billing. |
| `GOOGLE_CLIENT_ID` | -- | Google OAuth client ID. |
| `GOOGLE_CLIENT_SECRET` | -- | Google OAuth client secret. |
| `GITHUB_CLIENT_ID` | -- | GitHub OAuth client ID. |
| `GITHUB_CLIENT_SECRET` | -- | GitHub OAuth client secret. |
| `WEBCLAW_JWT_SECRET` | -- | JWT signing secret for session tokens. |
| `WEBCLAW_BASE_URL` | -- | Public URL of the server (for OAuth callbacks). |
| `WEBCLAW_FRONTEND_URL` | -- | Frontend URL (for CORS and redirects). |

### LLM providers

| Variable | Default | Description |
| --- | --- | --- |
| `OLLAMA_HOST` | `http://localhost:11434` | Ollama API endpoint. |
| `OLLAMA_MODEL` | `qwen3:8b` | Default Ollama model for extraction and summarization. |
| `OPENAI_API_KEY` | -- | OpenAI API key. Enables OpenAI as a fallback provider. |
| `OPENAI_BASE_URL` | -- | Custom OpenAI-compatible endpoint (for proxies or local models). |
| `ANTHROPIC_API_KEY` | -- | Anthropic API key. Enables Anthropic as a fallback provider. |
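`OPENAI_BASE_URL` lets any OpenAI-compatible server stand in for OpenAI. Pointing at a local vLLM or llama.cpp endpoint might look like this — the URL and key value are illustrative:

```bash
export OPENAI_BASE_URL=http://localhost:8000/v1
export OPENAI_API_KEY=local-placeholder-key   # some local servers ignore the key but webclaw needs it set
```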
> **Warning:** The OAuth and billing variables (`DATABASE_URL`, `GOOGLE_CLIENT_*`, etc.) are only needed if you are building a multi-tenant deployment with user accounts. For standard self-hosted usage, only `WEBCLAW_API_KEY` and the LLM provider keys matter.