RAW HTTP: NO HEADLESS BROWSER OVERHEAD
MARKDOWN · JSON · HTML · LLM-READY FORMATS
MCP SERVER FOR AI AGENTS
TLS FINGERPRINT IMPERSONATION
EXTRACT · SUMMARIZE · DIFF · BRAND
SITEMAP DISCOVERY & DEEP CRAWLING
SELF-HOST OR USE OUR CLOUD API
BUILT IN RUST: FAST BY DEFAULT
DEEP RESEARCH: AI SYNTHESIZES REPORTS FROM 50+ SOURCES
WEB SEARCH: QUERY AND SCRAPE SEARCH RESULTS IN ONE CALL
AGENT SCRAPE: GIVE A GOAL, AI EXTRACTS WHAT YOU NEED
URL MONITORING: WATCH PAGES FOR CHANGES WITH WEBHOOKS
BONUS CREDITS: EARN FREE CREDITS BY STARRING AND REFERRING

The extraction engine

The web scraper your AI agent actually deserves.

Clean structured data for your agents. In milliseconds, not seconds.

118ms avg · 67% fewer tokens · Drop-in Firecrawl replacement

One command setup

MCP + CLI

Give your AI agents web data with a single command. Auto-detects your tools and configures everything.

Learn more
Terminal
$ npx create-webclaw

Works with Claude Code, Cursor, Windsurf, Codex, OpenCode, and more.

118ms avg extraction
90% success rate
67% token reduction
14 API endpoints

Every page. Every defense.

01

Fast by default. Smart when needed.

Static pages extract in 118ms on average; Firecrawl's published P95 is 3.4s. JS-heavy sites go through a multi-layer rendering pipeline, and the engine picks the fastest path automatically. You configure nothing.
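An average and a P95 are different statistics, so the two latency figures quoted here are not a like-for-like comparison. The sketch below shows how each is computed. The samples are made up for illustration and are not measurements of webclaw or Firecrawl.

```python
import statistics

# Hypothetical latency samples in milliseconds, for illustration only.
samples_ms = [90, 95, 100, 105, 110, 115, 120, 125, 130, 900]

def mean_ms(samples):
    """Arithmetic mean of the samples."""
    return statistics.fmean(samples)

def p95_ms(samples):
    """95th percentile, interpolated between neighbouring samples."""
    # quantiles(n=100) returns the 99 cut points P1..P99; index 94 is P95.
    return statistics.quantiles(samples, n=100, method="inclusive")[94]

print(f"mean = {mean_ms(samples_ms):.1f} ms")
print(f"p95  = {p95_ms(samples_ms):.1f} ms")
```

A single slow request barely moves the mean but dominates the P95, which is why the two should be compared only with their own kind.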

02

Drop-in Firecrawl replacement.

Change your base URL. Keep your existing SDK code. The /v2 endpoints are fully compatible — same API shape, same response format, no rewrite needed. Better extraction quality, faster response times.
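As a rough sketch of what "change your base URL" could look like without an SDK: the request below follows the Firecrawl-style `POST /v2/scrape` shape (JSON body with `url` and `formats`, bearer token). The webclaw host shown is a placeholder, not a real endpoint, and the helper name `build_scrape_request` is hypothetical. The request is built but never sent.

```python
import json
import urllib.request

FIRECRAWL_BASE = "https://api.firecrawl.dev"
# Placeholder host (assumption): substitute the base URL of your webclaw deployment.
WEBCLAW_BASE = "https://webclaw.example.invalid"

def build_scrape_request(base_url, api_key, target_url):
    """Build (but do not send) a Firecrawl-style POST /v2/scrape request."""
    body = json.dumps({"url": target_url, "formats": ["markdown"]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v2/scrape",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

old = build_scrape_request(FIRECRAWL_BASE, "YOUR_API_KEY", "https://example.com")
new = build_scrape_request(WEBCLAW_BASE, "YOUR_API_KEY", "https://example.com")

# Only the host differs; path, headers, and payload are identical.
print(old.full_url)
print(new.full_url)
```

Keeping the base URL in one constant or environment variable makes the switch a one-line change.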

03

Best-in-class bot protection.

Challenge pages, CAPTCHAs, browser fingerprinting — handled transparently. No manual cookies, no config. Your requests just work, even on the hardest sites.

04

Every format, every extraction.

Markdown, JSON, plain text, LLM-optimized. Schema-based extraction, prompt-based extraction, summarization, brand identity, content diffing. 14 endpoints, one API key.
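To make "content diffing" concrete, here is a minimal sketch that compares two Markdown snapshots of the same page with Python's standard `difflib`. It illustrates the idea only and is not webclaw's diff implementation; the snapshot text and the `diff_markdown` helper are invented for the example.

```python
import difflib

def diff_markdown(old_md, new_md):
    """Return a unified diff between two Markdown snapshots of one page."""
    lines = difflib.unified_diff(
        old_md.splitlines(),
        new_md.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
    )
    return "\n".join(lines)

# Invented snapshots, for illustration only.
previous = "# Widget\n\nStatus: in stock\nShips in 2 days"
current = "# Widget\n\nStatus: sold out\nShips in 2 days"

print(diff_markdown(previous, current))
```

Diffing the extracted Markdown rather than raw HTML keeps markup churn (class names, script hashes) out of the result.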

05

Built for AI agents.

MCP server with 12 tools for Claude, Cursor, Windsurf, OpenCode, Codex, Antigravity, and any MCP client. REST API for everything else. Web search, batch processing, crawling, sitemap discovery.

06

67% fewer tokens.

The LLM format runs a 9-step optimization pipeline — strips nav, ads, boilerplate, repeated elements. The median page goes from 3,800 tokens raw to 950 tokens of actual content. Your agent gets more, spends less.
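Working through the quoted numbers: going from 3,800 to 950 tokens is a 75% reduction, which is higher than the 67% headline figure. The two presumably describe different statistics (for example a median page versus an overall average), though that reading is an assumption, since the page does not say.

```python
def reduction_pct(raw_tokens, optimized_tokens):
    """Percentage of tokens removed relative to the raw count."""
    return 100 * (raw_tokens - optimized_tokens) / raw_tokens

# Figures quoted above for the median page.
median = reduction_pct(3800, 950)

# Token count that a 67% reduction would leave from the same raw page.
at_67 = 3800 * (1 - 0.67)

print(f"3,800 -> 950 tokens is a {median:.0f}% reduction")
print(f"a 67% reduction of 3,800 tokens leaves about {at_67:.0f} tokens")
```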

07

Agentic scraping.

Give a goal, get structured data. The AI agent reasons about page content, clicks buttons, navigates, and extracts exactly what you asked for. Powered by the best available models.

08

Deep content recovery.

Embedded JSON, structured data, server-rendered payloads — extracted even when the visible DOM is empty. Auto-detects PDFs, DOCX, XLSX. Multiple fallback strategies. If the content exists, webclaw finds it.
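One way content can be recovered when the visible DOM is empty is to read the data-carrying `<script>` tags that many sites ship, such as JSON-LD blocks or a Next.js `__NEXT_DATA__` payload. The sketch below does that with Python's standard `html.parser`. It illustrates the general technique under that assumption and does not describe webclaw's actual fallback strategies; the class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser

class EmbeddedJsonParser(HTMLParser):
    """Collect JSON payloads from <script> tags that carry data, not code."""

    DATA_TYPES = {"application/ld+json", "application/json"}

    def __init__(self):
        super().__init__()
        self._capturing = False
        self._chunks = []
        self.payloads = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type in self.DATA_TYPES:
            self._capturing = True
            self._chunks = []

    def handle_data(self, data):
        if self._capturing:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._capturing:
            self._capturing = False
            try:
                self.payloads.append(json.loads("".join(self._chunks)))
            except json.JSONDecodeError:
                pass  # skip malformed payloads rather than failing the page

def extract_embedded_json(html):
    """Return every parseable JSON payload embedded in the document."""
    parser = EmbeddedJsonParser()
    parser.feed(html)
    parser.close()
    return parser.payloads

# The visible DOM is an empty <div>; the content lives in the scripts.
html = """
<html><body>
<div id="root"></div>
<script type="application/ld+json">{"@type": "Product", "name": "Widget"}</script>
<script id="__NEXT_DATA__" type="application/json">{"props": {"title": "Hello"}}</script>
</body></html>
"""
print(extract_embedded_json(html))
```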

One credit. One page.

No hidden multipliers. No per-feature charges. Pick a plan, start extracting.

FREE
$0/mo
PAGES: 500/mo
CONCURRENCY: 2
ANTIBOT:
JS RENDER:
LLM CALLS:
RESEARCH:
PROXY:
SUPPORT: COMMUNITY
JOIN WAITLIST

STARTER
$49/mo
PAGES: 10,000/mo
CONCURRENCY: 10
ANTIBOT:
JS RENDER:
LLM CALLS:
RESEARCH: 5/mo
PROXY:
SUPPORT: EMAIL
JOIN WAITLIST

PRO (POPULAR)
$99/mo
PAGES: 100,000/mo
CONCURRENCY: 50
ANTIBOT: 500/mo
JS RENDER: 2,000/mo
LLM CALLS: 1,000/mo
RESEARCH: 25/mo
PROXY: 2 GB
SUPPORT: PRIORITY
JOIN WAITLIST

SCALE
$399/mo
PAGES: 500,000/mo
CONCURRENCY: 100
ANTIBOT: 5,000/mo
JS RENDER: 10,000/mo
LLM CALLS: 10,000/mo
RESEARCH: 100/mo
PROXY: 10 GB
SUPPORT: PRIORITY + SLACK
JOIN WAITLIST

DEDICATED

Unlimited pages. Unlimited research. 200 concurrent. Single-tenant on your cloud, your proxies, your rules. Dedicated Slack channel + SLA.

CONTACT US
OPEN SOURCE

Self-host forever. AGPL-3.0 license. CLI + server + MCP server. No limits on your hardware.

VIEW ON GITHUB

1 CREDIT = 1 PAGE, ALWAYS · NO HIDDEN MULTIPLIERS · OPEN SOURCE
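Since one credit is one page, the effective price per page follows directly from the listed monthly prices and page quotas. A small worked calculation, assuming the full quota is used each month:

```python
# Monthly price (USD) and page quota for each paid plan, as listed above.
plans = {
    "Starter": (49, 10_000),
    "Pro": (99, 100_000),
    "Scale": (399, 500_000),
}

def cost_per_1k_pages(price_usd, pages):
    """Effective price per 1,000 pages when the full quota is used."""
    return price_usd / pages * 1000

for name, (price, pages) in plans.items():
    print(f"{name}: ${cost_per_1k_pages(price, pages):.2f} per 1,000 pages")
```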

Common questions

FAQ

Webclaw is a web extraction toolkit that converts any website into clean, structured data. It supports multiple output formats — Markdown, JSON, HTML, plain text, and an LLM-optimized format that strips noise and reduces token count by up to 67%.

Webclaw uses raw HTTP requests with TLS fingerprint impersonation instead of spinning up a headless browser. This means sub-200ms response times, zero browser overhead, and no Selenium or Playwright dependency. It achieves the same results through intelligent content extraction and readability scoring.

Yes. The Free plan costs $0 per month and includes 500 pages per month, 5 output formats, sitemap discovery, and full API access. No credit card required. You can upgrade anytime if you need higher limits or advanced features like LLM extraction.

Absolutely. Webclaw is open source under the AGPL-3.0 license. You can run the CLI, REST API server, or MCP server on your own infrastructure. Docker images and one-line deploy scripts are available for quick setup.

Six formats: Markdown (clean readable text), JSON (structured with metadata), HTML (sanitized), plain text, LLM-optimized (stripped of noise for AI consumption), and raw HTML. The LLM format runs a 9-step optimization pipeline to minimize token usage.
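The page does not list the nine steps, so the following is only a sketch of what a single noise-stripping stage might look like: it drops text inside elements that usually hold page chrome and keeps the rest. The tag list and the names `MainTextExtractor` and `strip_boilerplate` are assumptions made for illustration, not webclaw's pipeline.

```python
from html.parser import HTMLParser

class MainTextExtractor(HTMLParser):
    """Keep text outside of elements that usually hold page chrome."""

    # Assumed list of noisy containers; a real pipeline would be more careful.
    SKIP = {"nav", "header", "footer", "aside", "script", "style", "form"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self._parts.append(text)

    def text(self):
        return "\n".join(self._parts)

def strip_boilerplate(html):
    """Return the text content of html with chrome elements removed."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()

page = (
    "<nav><a href='/'>Home</a></nav>"
    "<main><h1>Title</h1><p>Body text.</p></main>"
    "<footer>Footer links</footer>"
)
print(strip_boilerplate(page))
```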

Webclaw ships a dedicated MCP (Model Context Protocol) server binary whose tools include scrape, crawl, map, batch, extract, summarize, diff, and brand. It works over stdio transport with any MCP-compatible client, such as Claude Desktop, Claude Code, Cursor, Windsurf, OpenCode, Codex, or Antigravity.
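For a sense of what travels over stdio: MCP messages are JSON-RPC 2.0 objects, one per line, and a tool is invoked with the `tools/call` method. The sketch below builds such a message for the `scrape` tool. The `url` argument name is an assumption, since the page does not document tool parameters, and a real client would first complete the MCP `initialize` handshake.

```python
import json

def tools_call_message(request_id, tool_name, arguments):
    """Serialize a JSON-RPC 2.0 tools/call request as one line for stdio."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # The stdio transport is newline-delimited, so the JSON itself must not
    # contain raw newlines; json.dumps without indent keeps it on one line.
    return json.dumps(message) + "\n"

# "scrape" is one of the tools named above; the "url" argument name is assumed.
line = tools_call_message(1, "scrape", {"url": "https://example.com"})
print(line, end="")
```

In practice an MCP client library handles this framing; the sketch only shows the shape of the message.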

Your extracted content is never stored or logged on our servers. Requests are processed in real-time and the response is returned directly to you. If you use LLM features, content is sent to the AI provider for processing but is not retained. For full control, self-host the entire stack.

Webclaw can use language models to extract structured JSON from pages using a schema you define, answer questions about page content with prompt-based extraction, or generate summaries. It tries a local Ollama instance first, then falls back to cloud providers.
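The local-first, cloud-second behaviour is an ordered fallback chain. A minimal sketch of that pattern follows, with stand-in provider functions instead of real network calls. The names `complete_with_fallback`, `local_ollama`, and `cloud_provider` are hypothetical, and the code does not reflect webclaw's internals.

```python
class ProviderError(Exception):
    """Raised by a provider that cannot serve the request."""

def complete_with_fallback(prompt, providers):
    """Try each (name, call) pair in order; return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))

# Stand-in providers for illustration; real ones would make network calls.
def local_ollama(prompt):
    raise ProviderError("connection refused")

def cloud_provider(prompt):
    return f"summary of: {prompt}"

used, result = complete_with_fallback(
    "page text", [("ollama", local_ollama), ("cloud", cloud_provider)]
)
print(used, "->", result)
```

Ordering the list local-first means page content leaves the machine only when no local model is reachable.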

Ready to build?

Start extracting.

Free tier. No credit card. Deploy in under a minute — or self-host forever. Open source.

Stay in the loop

Get notified when the webclaw API launches. Early subscribers get extended free tier access.

No spam. Unsubscribe anytime.