  • RAW HTTP: NO HEADLESS BROWSER OVERHEAD
  • MARKDOWN · JSON · HTML · LLM-READY FORMATS
  • MCP SERVER FOR AI AGENTS
  • TLS FINGERPRINT IMPERSONATION
  • EXTRACT · SUMMARIZE · DIFF · BRAND
  • SITEMAP DISCOVERY & DEEP CRAWLING
  • SELF-HOST OR USE OUR CLOUD API
  • BUILT IN RUST: FAST BY DEFAULT
  • DEEP RESEARCH: AI SYNTHESIZES REPORTS FROM 50+ SOURCES
  • WEB SEARCH: QUERY AND SCRAPE SEARCH RESULTS IN ONE CALL
  • AGENT SCRAPE: GIVE A GOAL, AI EXTRACTS WHAT YOU NEED
  • URL MONITORING: WATCH PAGES FOR CHANGES WITH WEBHOOKS
  • BONUS CREDITS: EARN FREE CREDITS BY STARRING AND REFERRING

DROP-IN REPLACEMENT

Web scraping for LangChain

Drop-in replacement for WebBaseLoader with bot protection bypass.

LangChain is one of the most widely used frameworks for building LLM applications in Python and TypeScript. webclaw plugs in as a document loader through its Firecrawl-compatible API, giving your LangChain agents access to real-time web data with automatic Cloudflare bypass.

Setup

LangChain Python — document loader

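# Requires: pip install langchain-community firecrawl-py
from langchain_community.document_loaders import FireCrawlLoader

# Point the Firecrawl loader at webclaw instead of the default Firecrawl endpoint
loader = FireCrawlLoader(
    api_key="wc_...",
    api_url="https://api.webclaw.io",
    url="https://example.com",
    mode="scrape",  # fetch a single page
)

# Returns a list of LangChain Document objects
docs = loader.load()

# Feed into your vector store
# (`vectorstore` is any LangChain vector store you have already created)
vectorstore.add_documents(docs)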

Why webclaw for LangChain

  • Drop-in Firecrawl v2 API compatibility
  • 118 ms average response time (faster than browser-based loaders)
  • Automatic Cloudflare, DataDome, and AWS WAF bypass
  • LLM-optimized markdown cuts token costs by 67%

Common use cases

  • RAG pipelines with fresh web content
  • Document loaders with bot protection bypass
  • LangChain agents with real-time web tools
  • Multi-source research chains

Frequently asked questions

Do I need to change my LangChain code to use webclaw?

No, beyond the loader's connection settings. webclaw implements Firecrawl's v2 API, so you point FireCrawlLoader at https://api.webclaw.io, pass your webclaw API key, and the rest of your existing code works unchanged.
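
One way to keep the switch a pure configuration change is to read the endpoint and key from the environment. The sketch below is illustrative only: the environment variable names `WEBCLAW_API_KEY` and `WEBCLAW_API_URL` and the helpers `loader_kwargs` and `load_page` are assumptions made for this example, not documented webclaw settings.

```python
import os
from typing import Mapping

# Assumption: these environment variable names are hypothetical,
# chosen for this sketch rather than taken from webclaw's docs.
DEFAULT_API_URL = "https://api.webclaw.io"


def loader_kwargs(url: str, env: Mapping[str, str] = os.environ) -> dict:
    """Build keyword arguments for the Firecrawl loader from the environment."""
    api_key = env.get("WEBCLAW_API_KEY")
    if not api_key:
        raise ValueError("WEBCLAW_API_KEY is not set")
    return {
        "url": url,
        "api_key": api_key,
        "api_url": env.get("WEBCLAW_API_URL", DEFAULT_API_URL),
        "mode": "scrape",
    }


def load_page(url: str):
    """Load one page through webclaw (needs langchain-community and firecrawl-py)."""
    # Imported lazily so the configuration helper works without LangChain installed.
    from langchain_community.document_loaders import FireCrawlLoader

    return FireCrawlLoader(**loader_kwargs(url)).load()
```

With this layout, moving between endpoints means changing two environment variables while the calling code stays the same.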

Can LangChain agents call webclaw as a tool?

Yes. Wrap the webclaw SDK or REST API as a LangChain Tool and register it with your agent. Your agent can then call scrape, crawl, search, and extract at runtime.
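
As a rough sketch of the REST route, the example below wraps a scrape call as a LangChain tool. Several details are assumptions rather than confirmed webclaw behavior: the `/v2/scrape` path, the bearer-token header, and the request and response shapes are modeled on Firecrawl-style APIs, and the names `scrape_markdown`, `make_scrape_tool`, and `webclaw_scrape` are hypothetical.

```python
import json
import urllib.request
from typing import Callable

API_URL = "https://api.webclaw.io"  # base URL from the setup snippet
SCRAPE_PATH = "/v2/scrape"          # assumption: Firecrawl-style scrape route


def build_scrape_request(url: str, api_key: str) -> urllib.request.Request:
    """Build a POST request asking for markdown output (assumed request shape)."""
    body = json.dumps({"url": url, "formats": ["markdown"]}).encode("utf-8")
    return urllib.request.Request(
        API_URL + SCRAPE_PATH,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def _send(request: urllib.request.Request) -> bytes:
    """Default transport: perform the HTTP call and return the raw body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def scrape_markdown(
    url: str,
    api_key: str,
    send: Callable[[urllib.request.Request], bytes] = _send,
) -> str:
    """Scrape one page and return markdown, or an error string the agent can read."""
    payload = json.loads(send(build_scrape_request(url, api_key)))
    # Assumed response shape: {"success": bool, "data": {"markdown": str}}
    if not payload.get("success"):
        return f"scrape failed: {payload.get('error', 'unknown error')}"
    return payload.get("data", {}).get("markdown", "")


def make_scrape_tool(api_key: str):
    """Wrap scrape_markdown as a LangChain tool (requires langchain-core)."""
    from langchain_core.tools import StructuredTool

    def scrape(url: str) -> str:
        """Fetch a web page and return its content as markdown."""
        return scrape_markdown(url, api_key)

    return StructuredTool.from_function(
        func=scrape,
        name="webclaw_scrape",
        description="Fetch a web page and return its content as markdown.",
    )
```

The tool returned by `make_scrape_tool` can be added to the tool list you pass to your agent. Returning an error string instead of raising lets the agent see the failure and decide whether to retry; the `send` parameter exists so the HTTP call can be swapped out in tests.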

Get started

500 pages/month free. No credit card. Open source.
