Massi

Migrating from Firecrawl: compatible API for AI agents

Firecrawl is a strong default for teams building with web data. It has good docs, a familiar API shape, and broad awareness in the LLM tooling world.

This post is not here to dunk on it.

It is for the more specific moment where you already have Firecrawl-shaped code in production or in a prototype, and you want to evaluate another API without rewriting the integration from zero.

Maybe you are building an AI agent that needs live web access. Maybe your RAG ingestion pipeline depends on scraped docs. Maybe you just want a second provider behind the same request shape so one vendor does not become a single point of failure.

That is the useful question:

Can you test a Firecrawl-compatible API with your existing scrape and crawl calls?

Firecrawl-compatible migration flow: keep your app shape, swap the base URL, then compare output on the same URLs.

What Firecrawl compatibility means

Firecrawl's own v2 API docs describe a base URL of https://api.firecrawl.dev, bearer authentication, and endpoints like /v2/scrape for scraping a single URL. The scrape endpoint accepts a URL, a formats array, onlyMainContent, headers, wait options, location, cache controls, and other options. Output formats include markdown, summary, HTML, raw HTML, links, images, screenshot, JSON, change tracking, and branding.

webclaw exposes Firecrawl-compatible endpoints for the common migration path:

  • POST /v2/scrape: Single URL scrape
  • POST /v2/crawl: Start an async crawl
  • GET /v2/crawl/{id}: Poll crawl status and results
  • DELETE /v2/crawl/{id}: Cancel a crawl
  • POST /v2/map: Discover URLs from a site
  • POST /v2/search: Search the web and scrape results

For many apps, the first test is intentionally boring: keep the same body, change the base URL, use a webclaw API key, and compare the response your app receives.

curl -X POST https://api.webclaw.io/v2/scrape \
  -H "Authorization: Bearer wc_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "formats": ["markdown"],
    "onlyMainContent": true
  }'

If your current integration already builds requests for https://api.firecrawl.dev/v2/scrape, this is the migration surface to test first.
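The base-URL swap can be captured in one small helper, so the provider choice is a single string in your code. This is a minimal sketch, not an official SDK; `BASE_URLS` and `build_scrape_request` are illustrative names.

```python
# A minimal sketch of the base-URL swap. BASE_URLS and
# build_scrape_request are illustrative names, not part of either SDK.
BASE_URLS = {
    "firecrawl": "https://api.firecrawl.dev",
    "webclaw": "https://api.webclaw.io",
}

def build_scrape_request(provider: str, api_key: str, body: dict) -> dict:
    """Return the pieces an HTTP client needs for a /v2/scrape call."""
    return {
        "method": "POST",
        "url": f"{BASE_URLS[provider]}/v2/scrape",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "json": body,
    }

# The same body works against either provider; only the URL changes.
body = {"url": "https://example.com", "formats": ["markdown"], "onlyMainContent": True}
req = build_scrape_request("webclaw", "wc_live_YOUR_KEY", body)
```

Because the request body never changes, switching providers during a test is a one-argument change rather than a rewrite.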

Why this matters for OpenClaw and Hermes agents

This is especially relevant if Firecrawl entered your stack through an agent runtime instead of through backend code.

OpenClaw's Firecrawl docs describe Firecrawl as a bundled web plugin: choosing Firecrawl during onboarding or running openclaw configure --section web enables the Firecrawl plugin. Firecrawl also publishes an OpenClaw quickstart for giving OpenClaw agents scrape, search, crawl, extract, and browser automation capabilities.

Hermes Agent's web search and extract docs describe three web tools: web_search, web_extract, and web_crawl, with provider backends including Firecrawl, Tavily, Exa, SearXNG, and Parallel. Firecrawl's own Hermes guide says Hermes can route web extraction through Firecrawl when a FIRECRAWL_API_KEY is configured.

The pattern is the same in both worlds: the agent does not care about a scraper brand. It cares that a tool can return clean page content, search results, or crawl output in a shape the agent can use.

That is why compatibility is useful. If your OpenClaw or Hermes workflow already assumes a Firecrawl-style scrape/crawl tool, you can test webclaw as a provider path without redesigning the whole agent.

When it is worth testing another compatible API

Do not switch tools because a blog post says so. Switch only if your real workload shows a reason.

The common reasons are practical:

  • You want a backup provider with a similar request shape.
  • Your AI agent needs MCP access as well as REST access.
  • You run OpenClaw, Hermes, or another agent runtime where web extraction is a tool call, not a human browsing session.
  • Your app is sensitive to output size, because scraped content goes straight into an LLM context window.
  • Your team wants a predictable credit model for scrape, crawl, map, and batch usage.
  • You have a small set of hard URLs that deserve a provider-by-provider test.

Those are all measurable. You can run the same URLs through both systems, inspect status, compare markdown, and check whether your downstream parser still works.
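"Compare markdown" can be made concrete by profiling both outputs and diffing the coarse features. A rough sketch; the feature set and function names are illustrative, and you would tune them to your own pipeline.

```python
import re

def markdown_profile(md: str) -> dict:
    """Collect coarse, comparable features of one provider's markdown output."""
    return {
        "chars": len(md),
        "headings": re.findall(r"^#{1,6}\s.*$", md, flags=re.MULTILINE),
        "code_blocks": md.count("```") // 2,
        "links": len(re.findall(r"\[[^\]]*\]\([^)]+\)", md)),
    }

def compare_outputs(a: str, b: str) -> dict:
    """Diff the profiles of the same URL scraped through two providers."""
    pa, pb = markdown_profile(a), markdown_profile(b)
    return {
        "size_ratio": round(pa["chars"] / max(pb["chars"], 1), 2),
        "same_headings": pa["headings"] == pb["headings"],
        "code_block_delta": pa["code_blocks"] - pb["code_blocks"],
    }
```

A large `size_ratio` or a heading mismatch on the same URL is exactly the kind of signal worth investigating before any production switch.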

    A safe migration test plan

    Do this before changing a production integration.

    Request shape: Does your existing body work unchanged?
    Output fields: Do markdown, metadata, status, and errors match your parser?
    Hard URLs: Do your flaky or high-value pages return useful content?
    Agent fit: Does the result fit your context budget, MCP flow, and debug loop?

    Step 1: pick the URLs that matter

    Do not start with https://example.com.

    Start with the URLs that represent your actual app:

  • One simple marketing page
  • One documentation page with code blocks
  • One pricing or product page
  • One JavaScript-heavy page if your app depends on one
  • One page that has been flaky or slow in your current setup

    The goal is not to prove a point. The goal is to find out if the migration keeps your product behavior intact.

    Step 2: compare the request body

    Write down the exact options your app sends today.

    For example:

    {
      "url": "https://docs.example.com/api",
      "formats": ["markdown"],
      "onlyMainContent": true,
      "waitFor": 1000,
      "timeout": 60000
    }

    Then test that body against the compatible endpoint.

    If a field is not supported, you want to know during a controlled test, not from a failed customer workflow. The first pass should be about compatibility, not performance.
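A cheap pre-flight check catches unsupported fields before they reach a customer workflow. A sketch, assuming you maintain the supported-field list yourself; the `SUPPORTED` set here is an assumption to verify against the target provider's actual /v2/scrape docs.

```python
# Illustrative pre-flight check: flag request fields the target provider
# may not support. The SUPPORTED set is an assumption -- verify it against
# the provider's /v2/scrape documentation before relying on it.
SUPPORTED = {"url", "formats", "onlyMainContent", "headers", "waitFor", "timeout"}

def unsupported_fields(body: dict) -> set:
    """Return request-body keys not known to be supported."""
    return set(body) - SUPPORTED

body = {
    "url": "https://docs.example.com/api",
    "formats": ["markdown"],
    "onlyMainContent": True,
    "waitFor": 1000,
    "timeout": 60000,
}
assert unsupported_fields(body) == set()
```

Running this over your real request bodies turns "does my body work unchanged?" into a yes/no answer per field.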

    Step 3: compare output shape, not just success

    A 200 is not enough.

    For AI agents and RAG pipelines, output quality is the product. Check:

    Title and URL metadata: Your citations and audit logs depend on it.
    Markdown headings: Chunking and retrieval often use heading structure.
    Code blocks: Docs ingestion breaks when code formatting is lost.
    Links: Agents often need source links for follow-up actions.
    Empty or tiny output: This can mean a shell page, blocked page, or wrong render path.
    Repeated nav/footer text: This inflates tokens and hurts retrieval.

    If you are using the response in an agent, paste the markdown into the agent's actual prompt path and see what happens. A page can look fine in a modal and still be too noisy for your context budget.
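The last two checks in the list above can be wired into a small gate before the output reaches your parser or agent. A heuristic sketch; the thresholds are guesses to tune on your own pages, not provider behavior.

```python
def quality_flags(markdown: str, title):
    """Flag the failure modes above; thresholds are illustrative."""
    flags = []
    if not title:
        flags.append("missing-title")
    if len(markdown) < 200:
        flags.append("tiny-output")  # possible shell, blocked, or wrong render path
    lines = [ln.strip() for ln in markdown.splitlines() if ln.strip()]
    if lines and len(set(lines)) < 0.7 * len(lines):
        flags.append("repeated-text")  # nav/footer text inflating the token count
    return flags
```

Any non-empty flag list on a known-good URL is a reason to inspect the raw response before trusting the provider with that page class.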

    Step 4: test crawl separately

    Scrape migration and crawl migration are different.

    Scrape is one request, one page. Crawl adds discovery, queueing, limits, depth, status polling, and result pagination. Even if /v2/scrape works immediately, test /v2/crawl with a small site before moving a larger ingestion job.

    curl -X POST https://api.webclaw.io/v2/crawl \
      -H "Authorization: Bearer wc_live_YOUR_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "url": "https://docs.example.com",
        "limit": 10,
        "scrapeOptions": {
          "formats": ["markdown"],
          "onlyMainContent": true
        }
      }'

    Then poll the crawl:

    curl https://api.webclaw.io/v2/crawl/YOUR_CRAWL_ID \
      -H "Authorization: Bearer wc_live_YOUR_KEY"

    For documentation ingestion, crawl quality usually matters more than single-page quality. You want the right pages, not just more pages.
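The polling step can be wrapped in a small loop. A sketch with the status fetch injected as a callable (in production, the GET request above), so the loop itself stays provider-agnostic; the terminal status names are assumptions to check against the API docs.

```python
import time

def wait_for_crawl(fetch_status, crawl_id: str, interval: float = 2.0, max_polls: int = 30):
    """Poll until the crawl reaches a terminal state.

    fetch_status: any callable returning the status payload for a crawl id --
    in production an HTTP GET to /v2/crawl/{id}, injected here so the loop
    can be tested offline. Terminal status names below are assumptions.
    """
    for _ in range(max_polls):
        status = fetch_status(crawl_id)
        if status.get("status") in ("completed", "failed", "cancelled"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"crawl {crawl_id} did not finish within {max_polls} polls")
```

Keeping `interval` and `max_polls` explicit also gives you a natural place to compare crawl completion times across providers.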

    Step 5: decide what to migrate

    You do not need to move everything at once.

    A practical migration often looks like this:

    1. Keep Firecrawl as the existing path.

    2. Add webclaw behind a feature flag or provider setting.

    3. Send a small set of URLs to both providers.

    4. Compare output shape and downstream success.

    5. Move the endpoint that benefits first.

    For some teams that is /v2/scrape. For others it is /v2/map or /v2/search. If your agent stack uses MCP, the first win may be giving Claude Code, Cursor, or another MCP client a direct web extraction tool while the backend still uses your existing provider.

    For OpenClaw or Hermes-style setups, start even smaller: route only web_extract or scrape-like calls first. Leave search and browser automation alone until extraction quality is stable on your real URLs.
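One way to implement "route only some calls first" is a deterministic percentage split, so the same URL always takes the same path and comparisons stay stable across runs. The function name and split scheme are illustrative, not a prescribed rollout mechanism.

```python
import hashlib

def pick_provider(url: str, webclaw_percent: int = 10) -> str:
    """Deterministically route a stable slice of URLs to the new provider.

    Hashing the URL (instead of random sampling) means a given URL always
    hits the same provider, so before/after comparisons are apples to apples.
    """
    bucket = int(hashlib.sha256(url.encode()).hexdigest(), 16) % 100
    return "webclaw" if bucket < webclaw_percent else "firecrawl"
```

Raising `webclaw_percent` over time gives you a gradual migration without touching the call sites.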

    When you should stay with Firecrawl

    Stay where you are if Firecrawl is already reliable for your URLs, your team likes the SDKs, and your current costs are predictable enough. Migration work has a cost, even when the API shape is similar.

    Firecrawl is also a good fit if your team already depends deeply on its ecosystem, templates, or product-specific workflows.

    The point of a compatible API is not that everyone should switch. The point is that switching should be a testable engineering decision, not a rewrite.

    Where webclaw fits

    webclaw is built for teams that need web extraction inside AI products:

  • REST endpoints for scrape, crawl, map, search, batch, extract, summarize, brand, diff, and research
  • Firecrawl-compatible /v2 endpoints for migration tests
  • MCP server for Claude Code, Cursor, and other agent clients
  • Markdown, JSON, text, HTML, and LLM-ready output formats
  • A dashboard history view for inspecting previous runs
  • 7-day Starter trial so you can test real URLs before committing

    Start with the Firecrawl comparison page if you want the product-level trade-offs. Start with the scrape API docs if you want the raw endpoint details.

    FAQ

    Is webclaw a drop-in Firecrawl replacement?

    For the common v2 scrape, crawl, map, and search paths, webclaw exposes Firecrawl-compatible endpoints. You should still test your exact request body and response parser, especially if you use advanced options.

    Can I keep using Firecrawl and test webclaw only for some URLs?

    Yes. That is the safest way to evaluate it. Put the provider behind a small routing option and send a handful of known URLs through both paths.

    Do I need to rewrite my AI agent?

    Not necessarily. If your agent calls an HTTP endpoint, test the compatible REST path. If your agent uses MCP, webclaw also ships an MCP server, so Claude Code, Cursor, and other MCP clients can call scrape, crawl, search, extract, and summarize as tools.

    What should I measure during the test?

    Measure output shape, downstream parser success, token size, latency on your real URLs, and whether the result is useful to your agent or RAG pipeline. Do not stop at status code.
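Those measurements fit in one wrapper around whatever scrape call you already have. A sketch; the four-characters-per-token estimate is a rough stand-in for a real tokenizer, and `measure` is an illustrative name.

```python
import time

def measure(scrape, url: str) -> dict:
    """Wrap any scrape callable and record the metrics listed above.

    scrape: your provider call, returning a dict with a "markdown" field.
    The 4-chars-per-token estimate is a rough stand-in for a tokenizer.
    """
    start = time.perf_counter()
    result = scrape(url)
    latency = time.perf_counter() - start
    md = result.get("markdown", "")
    return {
        "latency_s": round(latency, 3),
        "approx_tokens": len(md) // 4,
        "parser_ok": bool(md.strip()),
    }
```

Running the same URL list through `measure` for each provider gives you a per-URL table instead of an overall impression.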

    Ready to test the migration path? Start with the 7-day Starter trial, then run your current scrape body against /v2/scrape. If you want the bigger picture first, read webclaw vs Firecrawl or the best web scraping APIs for LLMs.

    Read next: MCP web scraping for Claude Code and Cursor | HTML to Markdown for LLMs | Cloudflare scraping checklist