RAW HTTP: NO HEADLESS BROWSER OVERHEAD | MARKDOWN · JSON · HTML · LLM-READY FORMATS | MCP SERVER FOR AI AGENTS | TLS FINGERPRINT IMPERSONATION | EXTRACT · SUMMARIZE · DIFF · BRAND | SITEMAP DISCOVERY & DEEP CRAWLING | SELF-HOST OR USE OUR CLOUD API | BUILT IN RUST: FAST BY DEFAULT

CLOUD API

Web extraction API.

REST API for production applications. Antibot bypass, JS rendering, LLM-optimized output, and structured data extraction. One key, every format.

Quick start

Three steps to your first extraction.

1. Get your API key

Sign up at webclaw.io/login and grab your key from the dashboard.

2. Make your first request
bash
curl -X POST https://api.webclaw.io/v1/scrape \
  -H "Authorization: Bearer $WEBCLAW_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "formats": ["markdown"]}'
3. Get clean results
Response
{
  "success": true,
  "data": {
    "url": "https://example.com",
    "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
    "metadata": {
      "title": "Example Domain",
      "description": "Example Domain",
      "status_code": 200,
      "response_time_ms": 118
    }
  }
}
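
The request and response above can also be handled from Python using only the standard library. The following is a minimal sketch, not the official client: the function names (build_scrape_request, parse_scrape_response, scrape) are hypothetical, and the handling of a failed call is an assumption, since the sample only shows a successful response.

```python
import json
import urllib.request

API_URL = "https://api.webclaw.io/v1/scrape"  # endpoint from the curl example


def build_scrape_request(url, api_key, formats=("markdown",)):
    """Build (without sending) the POST request that the curl example makes."""
    body = json.dumps({"url": url, "formats": list(formats)}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def parse_scrape_response(raw):
    """Extract the fields shown in the sample response.

    Assumption: a failed call sets "success" to false. The error payload
    is not shown in the sample, so this only raises a generic error.
    """
    payload = json.loads(raw)
    if not payload.get("success"):
        raise RuntimeError("scrape request did not succeed")
    data = payload["data"]
    meta = data["metadata"]
    return {
        "markdown": data["markdown"],
        "title": meta["title"],
        "status_code": meta["status_code"],
    }


def scrape(url, api_key, timeout_s=30):
    """Send the request and return the parsed result (performs network I/O)."""
    request = build_scrape_request(url, api_key)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return parse_scrape_response(response.read().decode("utf-8"))
```

Calling scrape("https://example.com", api_key) with a valid key would be expected to return a dictionary holding the markdown, title, and status code. Keeping request building and response parsing in separate functions lets both be checked without touching the network.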

SDK quickstart

Official clients for the languages you use.

import os

import webclaw

# Read the key from the environment instead of hard-coding it.
client = webclaw.Client(api_key=os.environ["WEBCLAW_API_KEY"])
result = client.scrape("https://example.com")
print(result.markdown)

9 endpoints

Everything you need for web extraction at scale.

POST /v1/scrape

Extract content from any URL in any format

POST /v1/crawl

Start a BFS crawl of an entire site

GET /v1/crawl/:id

Check progress and retrieve crawl results

POST /v1/map

Discover all URLs via sitemap and link parsing

POST /v1/batch

Extract multiple URLs in a single request

POST /v1/extract

LLM-powered structured data extraction

POST /v1/summarize

AI-generated page summaries

POST /v1/diff

Track content changes between snapshots

POST /v1/brand

Extract brand identity (colors, fonts, logos)
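
The endpoint list maps naturally onto a small lookup table, and the crawl pair (start a job, then check progress by id) suggests a polling loop. The sketch below illustrates both under stated assumptions: the methods and paths come from the list above, but the "status" field and its "completed" and "failed" values are hypothetical, since the crawl response shape is not documented here. The helper names are illustrative as well.

```python
import time
from typing import Callable, Tuple

BASE_URL = "https://api.webclaw.io"

# Method and path for each endpoint, copied from the list above.
ENDPOINTS = {
    "scrape": ("POST", "/v1/scrape"),
    "crawl": ("POST", "/v1/crawl"),
    "crawl_status": ("GET", "/v1/crawl/{id}"),
    "map": ("POST", "/v1/map"),
    "batch": ("POST", "/v1/batch"),
    "extract": ("POST", "/v1/extract"),
    "summarize": ("POST", "/v1/summarize"),
    "diff": ("POST", "/v1/diff"),
    "brand": ("POST", "/v1/brand"),
}


def endpoint_url(name: str, **params: str) -> Tuple[str, str]:
    """Return (HTTP method, full URL) for a named endpoint."""
    method, path = ENDPOINTS[name]
    return method, BASE_URL + path.format(**params)


def wait_for_crawl(
    fetch_status: Callable[[str], dict],
    crawl_id: str,
    interval_s: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll a crawl job until it reports a finished state.

    ASSUMPTION: the status payload carries a "status" field whose value
    becomes "completed" or "failed". These names are hypothetical.
    fetch_status is any callable that performs GET /v1/crawl/:id and
    returns the decoded JSON body.
    """
    for _ in range(max_polls):
        status = fetch_status(crawl_id)
        if status.get("status") in ("completed", "failed"):
            return status
        sleep(interval_s)
    raise TimeoutError(f"crawl {crawl_id} did not finish after {max_polls} polls")
```

Passing the HTTP call in as fetch_status keeps the loop independent of any particular client, so it can be exercised with a stub before being pointed at the real API.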

Built for production

Every request goes through battle-tested infrastructure.

Automatic antibot bypass

Cloudflare, DataDome, AWS WAF. Handled transparently on every request.

Built-in caching

Configurable TTL per request. Repeat requests for the same URL are served from cache until the TTL expires.

JS-rendered pages

Full support for SPAs, React, Next.js. No browser on your side.

LLM-optimized output

9-step pipeline strips noise. 67% fewer tokens than raw HTML.

Rate-limited and managed

Per-key rate limits, usage tracking, and automatic retries built in.
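
Because limits are applied per key, a client that sends bursts may still want to back off when it is told to slow down. The sketch below shows one common pattern, exponential backoff that prefers a server-supplied wait time. It assumes the API signals rate limiting with HTTP 429 and may send a Retry-After value. Neither is confirmed by this page, and the function name and parameters are hypothetical.

```python
import time
from typing import Callable, Optional, Tuple


def call_with_backoff(
    do_request: Callable[[], Tuple[int, Optional[float], object]],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
):
    """Retry a request while it is rate limited.

    do_request returns (status_code, retry_after_seconds_or_None, body).
    ASSUMPTION: status 429 means "rate limited"; any other status is
    returned to the caller unchanged.
    """
    delay = base_delay_s
    for attempt in range(1, max_attempts + 1):
        status, retry_after, body = do_request()
        if status != 429:
            return status, body
        if attempt == max_attempts:
            break
        # Prefer the server's hint; otherwise use our own growing delay.
        wait = retry_after if retry_after is not None else delay
        sleep(min(wait, max_delay_s))
        delay = min(delay * 2, max_delay_s)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

The delay doubles after each limited attempt and is capped at max_delay_s, so a long outage does not produce unbounded waits.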

Start building.

500 free pages per month. No credit card required. Scale when you need to.