DROP-IN REPLACEMENT
Web scraping for LangChain
Drop-in replacement for WebBaseLoader with bot protection bypass.
LangChain is a widely used framework for building LLM applications in Python and TypeScript. webclaw integrates as a document loader through its Firecrawl-compatible API, giving your LangChain agents access to real-time web data with automatic Cloudflare bypass.
Setup
LangChain Python — document loader
from langchain_community.document_loaders import FirecrawlLoader
# Point the Firecrawl loader at webclaw
loader = FirecrawlLoader(
    api_key="wc_...",
    api_url="https://api.webclaw.io",
    url="https://example.com",
    mode="scrape",
)
docs = loader.load()
# Feed into your vector store (vectorstore is any LangChain vector store you have already created)
vectorstore.add_documents(docs)
Why webclaw for LangChain
- Drop-in Firecrawl v2 API compatibility
- 118 ms average response (faster than browser-based loaders)
- Automatic Cloudflare, DataDome, AWS WAF bypass
- LLM-optimized markdown cuts token costs by 67%
Common use cases
- RAG pipelines with fresh web content
- Document loaders with bot protection bypass
- LangChain agents with real-time web tools
- Multi-source research chains
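For the RAG use case, scraped markdown is usually split into smaller chunks before it is embedded. The sketch below shows one simple way to do that with the standard library only. It assumes the loader returns markdown with `#`-style headings. `split_markdown_by_heading` is a hypothetical helper written for this illustration, not part of webclaw or LangChain, and the sample text is made up.

```python
import re


def split_markdown_by_heading(markdown: str) -> list[str]:
    """Split markdown into chunks, starting a new chunk at each heading line."""
    chunks: list[str] = []
    current: list[str] = []
    for line in markdown.splitlines():
        # A heading is 1-6 '#' characters followed by a space.
        if re.match(r"#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    # Drop chunks that contained only blank lines.
    return [chunk for chunk in chunks if chunk]


# Made-up sample standing in for the page_content of a scraped document.
sample = "# Pricing\nFree tier available.\n\n## Limits\n100 requests per day."
chunks = split_markdown_by_heading(sample)
```

This naive version also splits on `#` lines inside fenced code blocks, so a production pipeline would likely want a markdown-aware splitter instead.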
Frequently asked questions
Do I need to change my LangChain code to use webclaw?
No code changes are needed if you already use FirecrawlLoader. webclaw implements Firecrawl's v2 API, so you only set the loader's api_url to https://api.webclaw.io and supply your webclaw API key. The rest of your code works unchanged.
Can LangChain agents call webclaw as a tool?
Yes. Wrap the webclaw SDK or REST API as a LangChain Tool and register it with your agent. Your agent can then call scrape, crawl, search, and extract at runtime.
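As a rough illustration of the REST route, the sketch below wraps a scrape call in a plain function and exposes it as a LangChain tool. It rests on assumptions not confirmed on this page: that webclaw serves Firecrawl's v2 scrape route at `/v2/scrape`, accepts a bearer token, and returns a Firecrawl-style `{"data": {"markdown": ...}}` envelope. The `WEBCLAW_API_KEY` environment variable and the function and tool names are hypothetical.

```python
import json
import os
import urllib.request

# Assumption: webclaw mirrors Firecrawl's v2 scrape endpoint and response shape.
WEBCLAW_SCRAPE_URL = "https://api.webclaw.io/v2/scrape"


def build_scrape_request(url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for markdown output."""
    body = json.dumps({"url": url, "formats": ["markdown"]}).encode("utf-8")
    return urllib.request.Request(
        WEBCLAW_SCRAPE_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def scrape_markdown(url: str) -> str:
    """Scrape a single web page and return its content as markdown."""
    # WEBCLAW_API_KEY is a hypothetical variable name chosen for this sketch.
    request = build_scrape_request(url, os.environ["WEBCLAW_API_KEY"])
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # Assumed Firecrawl-style envelope: {"success": true, "data": {"markdown": "..."}}
    return payload["data"]["markdown"]


try:
    from langchain_core.tools import StructuredTool
except ImportError:  # LangChain not installed; the plain function still works
    StructuredTool = None

if StructuredTool is not None:
    # The description is what the agent reads when deciding to call the tool.
    webclaw_scrape_tool = StructuredTool.from_function(
        func=scrape_markdown,
        name="webclaw_scrape",
        description="Scrape a single web page and return its content as markdown.",
    )
```

To register the tool, include `webclaw_scrape_tool` in the tools list you pass when constructing your agent. Crawl, search, and extract could be wrapped the same way, one function per endpoint.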