Self-Hosting
Run the webclaw server on your own infrastructure. Choose Docker for the fastest setup, build from source for maximum control, or deploy to Fly.io for managed hosting with your own binary.
Docker
The quickest way to run webclaw. The image includes the server binary and all dependencies.
With authentication
Docker Compose with Ollama
For LLM features (extract, summarize), run Ollama alongside webclaw.
Tip
After starting the compose stack, pull a model into Ollama:
docker exec -it ollama ollama pull qwen3:8b
From source
Build the binaries directly from the Rust source. Requires Rust 1.75+.
The build produces three binaries in target/release/:
| Binary | Description |
|---|---|
| webclaw | CLI tool for extraction, crawling, and more. |
| webclaw-server | REST API server (axum). |
| webclaw-mcp | MCP server for AI agents. |
Fly.io
Deploy to Fly.io for managed infrastructure with global edge distribution.
Environment variables
All configuration is done through environment variables. None are required; the server runs with sensible defaults.
Server
| Variable | Default | Description |
|---|---|---|
| WEBCLAW_PORT | 3000 | HTTP port to listen on. |
| WEBCLAW_HOST | 0.0.0.0 | Bind address. |
| WEBCLAW_API_KEY | -- | API key for authentication. If unset, no auth is required. |
| WEBCLAW_MAX_CONCURRENCY | 50 | Max concurrent extraction tasks. |
| WEBCLAW_JOB_TTL_SECS | 3600 | How long to keep completed crawl jobs (seconds). |
| WEBCLAW_MAX_JOBS | 100 | Maximum number of concurrent crawl jobs. |
| WEBCLAW_LOG | info | Tracing filter (e.g. debug, webclaw=trace). |
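To make the defaults concrete, here is a minimal sketch of a pre-flight script that resolves the server variables the way the table describes them. It is not part of webclaw: `resolve_server_config` and `auth_enabled` are hypothetical helpers, and treating an empty `WEBCLAW_API_KEY` as unset is an assumption.

```python
import os

# Defaults copied from the Server table; None marks a variable with no default.
SERVER_DEFAULTS = {
    "WEBCLAW_PORT": "3000",
    "WEBCLAW_HOST": "0.0.0.0",
    "WEBCLAW_API_KEY": None,
    "WEBCLAW_MAX_CONCURRENCY": "50",
    "WEBCLAW_JOB_TTL_SECS": "3600",
    "WEBCLAW_MAX_JOBS": "100",
    "WEBCLAW_LOG": "info",
}


def resolve_server_config(env=None):
    """Return the effective value of each server variable (hypothetical helper)."""
    env = os.environ if env is None else env
    return {name: env.get(name, default) for name, default in SERVER_DEFAULTS.items()}


def auth_enabled(config):
    # Per the table, no auth is required when WEBCLAW_API_KEY is unset.
    # Treating an empty string as unset is an assumption of this sketch.
    return bool(config["WEBCLAW_API_KEY"])


if __name__ == "__main__":
    config = resolve_server_config()
    for name, value in config.items():
        shown = "(unset)" if value is None else value
        if name == "WEBCLAW_API_KEY" and value:
            shown = "(set)"  # avoid printing the secret
        print(f"{name}={shown}")
    print("auth enabled:", auth_enabled(config))
```

Running it in the same shell you use to start the server shows which values the server would pick up and whether the API would be open to unauthenticated requests.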
Proxy
| Variable | Default | Description |
|---|---|---|
| WEBCLAW_PROXY | -- | Single proxy URL (http, https, or socks5). |
| WEBCLAW_PROXY_FILE | -- | Path to a file with one proxy URL per line. Rotated per-request. |
| WEBCLAW_ANTIBOT_URL | -- | Anti-bot service endpoint. |
| WEBCLAW_ANTIBOT_KEY | -- | API key for the anti-bot service. |
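The following sketch illustrates what "one proxy URL per line, rotated per-request" can look like in practice. The table does not say which rotation strategy webclaw uses, so round-robin is an assumption here, as is skipping blank lines. `parse_proxy_file` and `make_rotator` are hypothetical names, and the proxy hosts are placeholders.

```python
import itertools


def parse_proxy_file(text):
    """Return the proxy URLs found in the file contents, one per line.

    Skipping blank lines is an assumption of this sketch.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


def make_rotator(urls):
    """Return a function that yields the next proxy on each call (round-robin)."""
    if not urls:
        raise ValueError("proxy list is empty")
    cycle = itertools.cycle(urls)
    return lambda: next(cycle)


# Placeholder file contents in the format WEBCLAW_PROXY_FILE expects.
EXAMPLE_FILE = """\
http://proxy1.example.com:8080
socks5://proxy2.example.com:1080

https://proxy3.example.com:8443
"""

if __name__ == "__main__":
    next_proxy = make_rotator(parse_proxy_file(EXAMPLE_FILE))
    for request_number in range(1, 5):
        # The fourth request wraps around to the first proxy.
        print(request_number, next_proxy())
```

A script like this is also a cheap way to check a proxy file for stray whitespace or empty lines before pointing the server at it.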
Auth and OAuth (cloud features)
| Variable | Default | Description |
|---|---|---|
| DATABASE_URL | -- | PostgreSQL connection string. Enables OAuth and billing. |
| GOOGLE_CLIENT_ID | -- | Google OAuth client ID. |
| GOOGLE_CLIENT_SECRET | -- | Google OAuth client secret. |
| GITHUB_CLIENT_ID | -- | GitHub OAuth client ID. |
| GITHUB_CLIENT_SECRET | -- | GitHub OAuth client secret. |
| WEBCLAW_JWT_SECRET | -- | JWT signing secret for session tokens. |
| WEBCLAW_BASE_URL | -- | Public URL of the server (for OAuth callbacks). |
| WEBCLAW_FRONTEND_URL | -- | Frontend URL (for CORS and redirect). |
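An OAuth client ID is of little use without its secret, so a common misconfiguration is setting only one half of a pair. The sketch below flags that case. It is a hypothetical check, not webclaw behavior: how the server itself reacts to a half-configured provider is not documented here.

```python
import os

# Each OAuth provider from the table, with its ID and secret variable names.
OAUTH_PAIRS = {
    "google": ("GOOGLE_CLIENT_ID", "GOOGLE_CLIENT_SECRET"),
    "github": ("GITHUB_CLIENT_ID", "GITHUB_CLIENT_SECRET"),
}


def half_configured_providers(env):
    """Return providers where exactly one of the ID/secret pair is set."""
    flagged = []
    for provider, (id_var, secret_var) in OAUTH_PAIRS.items():
        has_id = bool(env.get(id_var))
        has_secret = bool(env.get(secret_var))
        if has_id != has_secret:
            flagged.append(provider)
    return flagged


if __name__ == "__main__":
    for provider in half_configured_providers(os.environ):
        print(f"warning: {provider} OAuth has only one of client ID and secret set")
```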
LLM providers
| Variable | Default | Description |
|---|---|---|
| OLLAMA_HOST | http://localhost:11434 | Ollama API endpoint. |
| OLLAMA_MODEL | qwen3:8b | Default Ollama model for extraction and summarization. |
| OPENAI_API_KEY | -- | OpenAI API key. Enables OpenAI as a fallback provider. |
| OPENAI_BASE_URL | -- | Custom OpenAI-compatible endpoint (for proxies or local models). |
| ANTHROPIC_API_KEY | -- | Anthropic API key. Enables Anthropic as a fallback provider. |
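The table implies a provider chain: Ollama is available by default, and OpenAI and Anthropic join as fallbacks once their keys are set. The sketch below derives that chain from the environment. `provider_chain` is a hypothetical helper, and the relative order of the two fallbacks is an assumption, since the table does not state which one is tried first.

```python
import os


def provider_chain(env):
    """Return the LLM providers that would be available, in assumed order."""
    chain = [
        {
            "name": "ollama",
            # Defaults taken from the LLM providers table.
            "endpoint": env.get("OLLAMA_HOST", "http://localhost:11434"),
            "model": env.get("OLLAMA_MODEL", "qwen3:8b"),
        }
    ]
    if env.get("OPENAI_API_KEY"):
        chain.append(
            {
                "name": "openai",
                # None means no custom endpoint was configured.
                "endpoint": env.get("OPENAI_BASE_URL"),
            }
        )
    if env.get("ANTHROPIC_API_KEY"):
        chain.append({"name": "anthropic"})
    return chain


if __name__ == "__main__":
    for position, provider in enumerate(provider_chain(os.environ), start=1):
        print(position, provider["name"])
```

With no variables set, the chain contains only Ollama at its default endpoint, which matches the Docker Compose setup described earlier.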
Warning
The OAuth and billing variables (DATABASE_URL, GOOGLE_CLIENT_*, etc.) are only needed if you are building a multi-tenant deployment with user accounts. For standard self-hosted usage, leave them unset; the only credentials worth setting are WEBCLAW_API_KEY and the keys for any LLM providers you use.