Migrating from Firecrawl: compatible API for AI agents
Firecrawl is a strong default for teams building with web data. It has good docs, a familiar API shape, and broad awareness in the LLM tooling world.
This post is not here to dunk on it.
It is for the more specific moment where you already have Firecrawl-shaped code in production or in a prototype, and you want to evaluate another API without rewriting the integration from zero.
Maybe you are building an AI agent that needs live web access. Maybe your RAG ingestion pipeline depends on scraped docs. Maybe you just want a second provider behind the same request shape so one vendor does not become a single point of failure.
That is the useful question:
Can you test a Firecrawl-compatible API with your existing scrape and crawl calls?
What Firecrawl compatibility means
Firecrawl's own v2 API docs describe a base URL of https://api.firecrawl.dev, bearer authentication, and endpoints like /v2/scrape for scraping a single URL. The scrape endpoint accepts a URL, a formats array, onlyMainContent, headers, wait options, location, cache controls, and other options. Output formats include markdown, summary, HTML, raw HTML, links, images, screenshot, JSON, change tracking, and branding.
webclaw exposes Firecrawl-compatible endpoints for the common migration path:
| Method | Path | Use |
|---|---|---|
| POST | /v2/scrape | Single URL scrape |
| POST | /v2/crawl | Start an async crawl |
| GET | /v2/crawl/{id} | Poll crawl status and results |
| DELETE | /v2/crawl/{id} | Cancel a crawl |
| POST | /v2/map | Discover URLs from a site |
| POST | /v2/search | Search the web and scrape results |
For many apps, the first test is intentionally boring: keep the same body, change the base URL, use a webclaw API key, and compare the response your app receives.
```bash
curl -X POST https://api.webclaw.io/v2/scrape \
  -H "Authorization: Bearer wc_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "formats": ["markdown"],
    "onlyMainContent": true
  }'
```

If your current integration already builds requests for https://api.firecrawl.dev/v2/scrape, this is the migration surface to test first.
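If your client code is in Python, the same test can be a one-line change. Here is a minimal sketch, assuming the `requests` library; `SCRAPE_BASE_URL` and `SCRAPE_API_KEY` are illustrative names, not real configuration:

```python
import os

import requests

# Swap providers by changing one base URL; the Firecrawl-shaped
# request body stays exactly the same.
BASE_URL = os.environ.get("SCRAPE_BASE_URL", "https://api.webclaw.io")
API_KEY = os.environ["SCRAPE_API_KEY"]  # illustrative variable name

resp = requests.post(
    f"{BASE_URL}/v2/scrape",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "url": "https://example.com",
        "formats": ["markdown"],
        "onlyMainContent": True,
    },
    timeout=90,
)
resp.raise_for_status()
print(resp.json())
```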
Why this matters for OpenClaw and Hermes agents
This is especially relevant if Firecrawl entered your stack through an agent runtime instead of through backend code.
OpenClaw's Firecrawl docs describe Firecrawl as a bundled web plugin: choosing Firecrawl during onboarding or running openclaw configure --section web enables the Firecrawl plugin. Firecrawl also publishes an OpenClaw quickstart for giving OpenClaw agents scrape, search, crawl, extract, and browser automation capabilities.
Hermes Agent's web search and extract docs describe three web tools: web_search, web_extract, and web_crawl, with provider backends including Firecrawl, Tavily, Exa, SearXNG, and Parallel. Firecrawl's own Hermes guide says Hermes can route web extraction through Firecrawl when a FIRECRAWL_API_KEY is configured.
The pattern is the same in both worlds: the agent does not care about a scraper brand. It cares that a tool can return clean page content, search results, or crawl output in a shape the agent can use.
That is why compatibility is useful. If your OpenClaw or Hermes workflow already assumes a Firecrawl-style scrape/crawl tool, you can test webclaw as a provider path without redesigning the whole agent.
When it is worth testing another compatible API
Do not switch tools because a blog post says so. Switch only if your real workload shows a reason.
The common reasons are practical: cost that no longer fits your usage, reliability problems on the specific URLs your product depends on, output quality that breaks your downstream parser, or wanting a second provider behind the same request shape so one vendor is not a single point of failure.
Those are all measurable. You can run the same URLs through both systems, inspect status, compare markdown, and check whether your downstream parser still works.
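A small harness makes that comparison concrete: send the same body to both providers and diff what comes back. A minimal sketch, assuming the `requests` library; the `data.markdown` response path is an assumption about the response shape, so verify it against your actual responses:

```python
import os

import requests

# Same Firecrawl-shaped body, two providers, side by side.
PROVIDERS = {
    "firecrawl": ("https://api.firecrawl.dev", os.environ["FIRECRAWL_API_KEY"]),
    "webclaw": ("https://api.webclaw.io", os.environ["WEBCLAW_API_KEY"]),
}

BODY = {
    "url": "https://docs.example.com/api",
    "formats": ["markdown"],
    "onlyMainContent": True,
}

for name, (base, key) in PROVIDERS.items():
    resp = requests.post(
        f"{base}/v2/scrape",
        headers={"Authorization": f"Bearer {key}"},
        json=BODY,
        timeout=90,
    )
    data = resp.json() if resp.ok else {}
    # Assumption: markdown lives under data.markdown in the response.
    markdown = (data.get("data") or {}).get("markdown", "")
    print(f"{name}: status={resp.status_code} markdown_chars={len(markdown)}")
```

Run this over your real URL list, not a demo page, and keep the outputs for the parser checks in step 3.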
A safe migration test plan
Do this before changing a production integration.
| Check | What to verify |
|---|---|
| Request shape | Does your existing body work unchanged? |
| Output fields | Do markdown, metadata, status, and errors match your parser? |
| Hard URLs | Do your flaky or high-value pages return useful content? |
| Agent fit | Does the result fit your context budget, MCP flow, and debug loop? |
Step 1: pick the URLs that matter
Do not start with https://example.com.
Start with the URLs that represent your actual app:
- the documentation pages your ingestion pipeline depends on
- JavaScript-heavy pages that need rendering before content appears
- the flaky or high-value pages from your incident history
- any page your agent cites or acts on in production

The goal is not to prove a point. The goal is to find out if the migration keeps your product behavior intact.
Step 2: compare the request body
Write down the exact options your app sends today.
For example:
```json
{
  "url": "https://docs.example.com/api",
  "formats": ["markdown"],
  "onlyMainContent": true,
  "waitFor": 1000,
  "timeout": 60000
}
```

Then test that body against the compatible endpoint.
If a field is not supported, you want to know during a controlled test, not from a failed customer workflow. The first pass should be about compatibility, not performance.
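If the full body fails, one controlled way to isolate the culprit is to drop optional fields one at a time and retry. A minimal sketch, assuming the `requests` library; which fields are actually optional depends on your app:

```python
import os

import requests

BASE_URL = "https://api.webclaw.io"
API_KEY = os.environ["WEBCLAW_API_KEY"]  # illustrative variable name

full_body = {
    "url": "https://docs.example.com/api",
    "formats": ["markdown"],
    "onlyMainContent": True,
    "waitFor": 1000,
    "timeout": 60000,
}

def try_body(body: dict) -> int:
    resp = requests.post(
        f"{BASE_URL}/v2/scrape",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json=body,
        timeout=90,
    )
    return resp.status_code

# Remove one optional field at a time to find an unsupported option.
for field in ["waitFor", "timeout", "onlyMainContent"]:
    reduced = {k: v for k, v in full_body.items() if k != field}
    print(f"without {field!r}: HTTP {try_body(reduced)}")
```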
Step 3: compare output shape, not just success
A 200 is not enough.
For AI agents and RAG pipelines, output quality is the product. Check:
| Check | Why it matters |
|---|---|
| Title and URL metadata | Your citations and audit logs depend on it |
| Markdown headings | Chunking and retrieval often use heading structure |
| Code blocks | Docs ingestion breaks when code formatting is lost |
| Links | Agents often need source links for follow-up actions |
| Empty or tiny output | This can mean a shell page, blocked page, or wrong render path |
| Repeated nav/footer text | This inflates tokens and hurts retrieval |
If you are using the response in an agent, paste the markdown into the agent's actual prompt path and see what happens. A page can look fine in a modal and still be too noisy for your context budget.
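Several of those checks are easy to automate before anything reaches a parser or a prompt. A minimal sketch over a scraped markdown string; the thresholds are illustrative starting points, not recommendations, and the code-block check only matters for pages that should contain code:

```python
def check_markdown(markdown: str, metadata: dict) -> list[str]:
    """Flag common output-quality problems from the checklist above."""
    warnings = []
    if len(markdown) < 200:
        warnings.append("tiny output: possible shell page, block, or wrong render path")
    if not metadata.get("title"):
        warnings.append("missing title: citations and audit logs will suffer")
    if "# " not in markdown and "## " not in markdown:
        warnings.append("no headings: chunking by heading structure will fail")
    if "```" not in markdown:
        warnings.append("no code blocks: check whether code formatting survived")
    lines = [ln for ln in markdown.splitlines() if ln.strip()]
    if lines and len(set(lines)) / len(lines) < 0.7:
        warnings.append("heavy repetition: possible nav/footer noise inflating tokens")
    return warnings
```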
Step 4: test crawl separately
Scrape migration and crawl migration are different.
Scrape is one request, one page. Crawl adds discovery, queueing, limits, depth, status polling, and result pagination. Even if /v2/scrape works immediately, test /v2/crawl with a small site before moving a larger ingestion job.
```bash
curl -X POST https://api.webclaw.io/v2/crawl \
  -H "Authorization: Bearer wc_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://docs.example.com",
    "limit": 10,
    "scrapeOptions": {
      "formats": ["markdown"],
      "onlyMainContent": true
    }
  }'
```

Then poll the crawl:
```bash
curl https://api.webclaw.io/v2/crawl/YOUR_CRAWL_ID \
  -H "Authorization: Bearer wc_live_YOUR_KEY"
```

For documentation ingestion, crawl quality usually matters more than single-page quality. You want the right pages, not just more pages.
Step 5: decide what to migrate
You do not need to move everything at once.
A practical migration often looks like this:
1. Keep Firecrawl as the existing path.
2. Add webclaw behind a feature flag or provider setting.
3. Send a small set of URLs to both providers.
4. Compare output shape and downstream success.
5. Move the endpoint that benefits first.
For some teams that is /v2/scrape. For others it is /v2/map or /v2/search. If your agent stack uses MCP, the first win may be giving Claude Code, Cursor, or another MCP client a direct web extraction tool while the backend still uses your existing provider.
For OpenClaw or Hermes-style setups, start even smaller: route only web_extract or scrape-like calls first. Leave search and browser automation alone until extraction quality is stable on your real URLs.
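The provider flag in step 2 can be as small as a dictionary lookup. A minimal sketch; the `SCRAPE_PROVIDER` flag name and the shape of the config are illustrative:

```python
import os

import requests

# Map a provider flag to its base URL and API key. Flip the flag per
# environment, tenant, or URL set while you compare outputs side by side.
PROVIDERS = {
    "firecrawl": {"base": "https://api.firecrawl.dev", "key_env": "FIRECRAWL_API_KEY"},
    "webclaw": {"base": "https://api.webclaw.io", "key_env": "WEBCLAW_API_KEY"},
}

def scrape(url: str, provider: str | None = None) -> dict:
    cfg = PROVIDERS[provider or os.environ.get("SCRAPE_PROVIDER", "firecrawl")]
    resp = requests.post(
        f"{cfg['base']}/v2/scrape",
        headers={"Authorization": f"Bearer {os.environ[cfg['key_env']]}"},
        json={"url": url, "formats": ["markdown"], "onlyMainContent": True},
        timeout=90,
    )
    resp.raise_for_status()
    return resp.json()
```

Because both providers accept the same request shape, the router stays this small; nothing downstream needs to know which vendor answered.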
When you should stay with Firecrawl
Stay where you are if Firecrawl is already reliable for your URLs, your team likes the SDKs, and your current costs are predictable enough. Migration work has a cost, even when the API shape is similar.
Firecrawl is also a good fit if your team already depends deeply on its ecosystem, templates, or product-specific workflows.
The point of a compatible API is not that everyone should switch. The point is that switching should be a testable engineering decision, not a rewrite.
Where webclaw fits
webclaw is built for teams that need web extraction inside AI products:
- Firecrawl-compatible /v2 endpoints for migration tests
- an MCP server that exposes scrape, crawl, search, extract, and summarize as tools for Claude Code, Cursor, and other MCP clients

Start with the Firecrawl comparison page if you want the product-level trade-offs. Start with the scrape API docs if you want the raw endpoint details.
FAQ
Is webclaw a drop-in Firecrawl replacement?
For the common v2 scrape, crawl, map, and search paths, webclaw exposes Firecrawl-compatible endpoints. You should still test your exact request body and response parser, especially if you use advanced options.
Can I keep using Firecrawl and test webclaw only for some URLs?
Yes. That is the safest way to evaluate it. Put the provider behind a small routing option and send a handful of known URLs through both paths.
Do I need to rewrite my AI agent?
Not necessarily. If your agent calls an HTTP endpoint, test the compatible REST path. If your agent uses MCP, webclaw also ships an MCP server, so Claude Code, Cursor, and other MCP clients can call scrape, crawl, search, extract, and summarize as tools.
What should I measure during the test?
Measure output shape, downstream parser success, token size, latency on your real URLs, and whether the result is useful to your agent or RAG pipeline. Do not stop at status code.
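A rough way to capture the last two numbers is a small wrapper around whatever scrape call you already have. A minimal sketch; the chars-per-token divisor is a crude heuristic, not a tokenizer, and `data.markdown` is an assumption about the response shape:

```python
import time

def measure(scrape_fn, url: str) -> dict:
    """Wrap any scrape call and record latency plus a rough token estimate."""
    start = time.perf_counter()
    result = scrape_fn(url)
    latency = time.perf_counter() - start
    markdown = (result.get("data") or {}).get("markdown", "")  # assumption
    return {
        "latency_s": round(latency, 2),
        "approx_tokens": len(markdown) // 4,  # ~4 chars per token, heuristic
        "chars": len(markdown),
    }
```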
Ready to test the migration path? Start with the 7-day Starter trial, then run your current scrape body against /v2/scrape. If you want the bigger picture first, read webclaw vs Firecrawl or the best web scraping APIs for LLMs.
Read next: MCP web scraping for Claude Code and Cursor | HTML to Markdown for LLMs | Cloudflare scraping checklist