webclaw

REST API

The webclaw REST API gives you programmatic access to the full extraction engine. POST endpoints accept a JSON request body, and every endpoint returns JSON.

Base URL

Use the cloud endpoint for managed infrastructure, or point at your own instance when self-hosting.

bash
# Cloud (managed)
https://api.webclaw.io

# Self-hosted
http://localhost:3000
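A client that has to work against both deployments can keep the base URL configurable. The following is a minimal sketch, not part of webclaw itself: the environment variable name WEBCLAW_BASE_URL and both helper functions are hypothetical, chosen only for illustration. The two URLs are the ones listed above.

```python
import os

# The two base URLs listed in this section.
CLOUD_BASE_URL = "https://api.webclaw.io"
SELF_HOSTED_BASE_URL = "http://localhost:3000"


def resolve_base_url(env=os.environ):
    """Pick a base URL, defaulting to the cloud endpoint.

    WEBCLAW_BASE_URL is a hypothetical variable name used only in this sketch.
    """
    return env.get("WEBCLAW_BASE_URL", CLOUD_BASE_URL).rstrip("/")


def endpoint_url(base_url, path):
    """Join a base URL and an endpoint path with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")
```

For example, endpoint_url(SELF_HOSTED_BASE_URL, "/v1/scrape") yields http://localhost:3000/v1/scrape.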

Authentication

All cloud requests require an API key, sent in the Authorization header. Self-hosted instances require one only when a key is configured.

http
Authorization: Bearer <api_key>

Cloud: Create API keys from your dashboard at webclaw.io. Keys are prefixed with wc_.

Self-hosted: Pass --api-key when starting the server, or set the WEBCLAW_API_KEY environment variable. If neither is set, the server runs without authentication.

Note
Self-hosted instances with no API key configured accept all requests. Set one before exposing the server to the internet.
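On the client side, the header can be built from whatever key you hold. This is a small sketch under one assumption: the helper name auth_headers is hypothetical and not part of any webclaw SDK. It returns no header at all when no key is given, which matches a self-hosted server started without a key.

```python
def auth_headers(api_key=None):
    """Return the Authorization header for a key, or no headers without one.

    Hypothetical helper for illustration only.
    """
    if not api_key:
        # Only valid against a self-hosted instance with no key configured.
        return {}
    return {"Authorization": f"Bearer {api_key}"}
```

Pass the result as the headers of each request. A cloud key would look like wc_ followed by the key material.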

Request format

All POST endpoints accept a JSON body. Set the Content-Type header accordingly.

http
Content-Type: application/json
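As an illustration, a POST request with a JSON body can be assembled with Python's standard library alone. The function name build_json_post and its parameters are hypothetical. The sketch only constructs the request object; passing it to urllib.request.urlopen would send it.

```python
import json
import urllib.request


def build_json_post(url, payload, extra_headers=None):
    """Build (but do not send) a POST request carrying a JSON body."""
    headers = {"Content-Type": "application/json"}
    # Merge caller-supplied headers, e.g. an Authorization header.
    headers.update(extra_headers or {})
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Serializing with json.dumps and encoding to UTF-8 keeps the body consistent with the declared content type.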

Response format

All responses are JSON. Successful responses return the data directly. Errors use a consistent shape:

json
{
  "error": "Human-readable error message"
}
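Because errors share one shape, a client can detect them with a single check. The sketch below is a hypothetical helper, not webclaw's own code. It treats any JSON object with a string "error" field as an error, which is a heuristic: this page does not say which HTTP status codes accompany errors, so checking the status code as well is advisable.

```python
import json


def error_message(body):
    """Return the "error" string from a response body, or None if absent.

    Accepts str or bytes. Bodies that are not valid JSON also yield None.
    """
    try:
        parsed = json.loads(body)
    except (ValueError, TypeError):
        return None
    if isinstance(parsed, dict) and isinstance(parsed.get("error"), str):
        return parsed["error"]
    return None
```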

Rate limiting

Cloud API rate limits are based on your plan tier. Self-hosted instances have no rate limits by default. See the Cloud API page for plan details.

Endpoints

The API exposes the following endpoints.

Method  Path            Description
POST    /v1/scrape      Single URL extraction
POST    /v1/crawl       Start async crawl
GET     /v1/crawl/{id}  Poll crawl status
POST    /v1/batch       Multi-URL extraction
POST    /v1/map         Sitemap discovery
POST    /v1/extract     LLM JSON extraction
POST    /v1/summarize   LLM summarization
POST    /v1/diff        Content change tracking
POST    /v1/brand       Brand identity extraction
GET     /health         Health check + Ollama status
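The crawl endpoints form an asynchronous pair: POST /v1/crawl starts a job and GET /v1/crawl/{id} is polled until it finishes. This page does not document the response fields, so the sketch below keeps them out of the loop itself: the caller supplies a function that fetches the status and a predicate that decides when the job is done. All names here are hypothetical.

```python
import time


def poll_until(fetch_status, is_finished, interval_s=2.0, max_attempts=30,
               sleep=time.sleep):
    """Call fetch_status() until is_finished(result) is true, then return result.

    Raises TimeoutError after max_attempts polls. The sleep function is
    injectable so the loop can be exercised without real delays.
    """
    for attempt in range(max_attempts):
        result = fetch_status()
        if is_finished(result):
            return result
        if attempt < max_attempts - 1:
            # Wait between polls, but not after the final attempt.
            sleep(interval_s)
    raise TimeoutError(f"crawl not finished after {max_attempts} polls")
```

In practice fetch_status would issue the GET request for a given crawl id and decode the JSON, and is_finished would inspect whichever field the actual response uses to signal completion.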

Quick example

curl
curl -X POST https://api.webclaw.io/v1/scrape \
  -H "Authorization: Bearer wc_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "formats": ["markdown"]}'