RAW HTTP: NO HEADLESS BROWSER OVERHEAD
MARKDOWN · JSON · HTML · LLM-READY FORMATS
MCP SERVER FOR AI AGENTS
TLS FINGERPRINT IMPERSONATION
EXTRACT · SUMMARIZE · DIFF · BRAND
SITEMAP DISCOVERY & DEEP CRAWLING
SELF-HOST OR USE OUR CLOUD API
BUILT IN RUST: FAST BY DEFAULT
DEEP RESEARCH: AI SYNTHESIZES REPORTS FROM 50+ SOURCES
WEB SEARCH: QUERY AND SCRAPE SEARCH RESULTS IN ONE CALL
AGENT SCRAPE: GIVE A GOAL, AI EXTRACTS WHAT YOU NEED
URL MONITORING: WATCH PAGES FOR CHANGES WITH WEBHOOKS
BONUS CREDITS: EARN FREE CREDITS BY STARRING AND REFERRING

POST /v1/brand + /v1/extract

Lead and contact enrichment from the web

Turn a company domain into structured firmographic data.

Given a company website, webclaw extracts brand identity, tech stack signals, team information, pricing tiers, and product features. Perfect for sales intelligence, ABM targeting, and enrichment pipelines without paying Clearbit or ZoomInfo prices.

The problem

Enrichment providers charge $0.20-$2 per contact lookup and provide stale data scraped months ago. Building your own scraping pipeline requires handling bot protection, parsing diverse site structures, and extracting fields consistently across thousands of sites.

The webclaw solution

The /v1/brand endpoint extracts a company's design identity and positioning. The /v1/extract endpoint, given a firmographic schema, pulls structured company data from About, Pricing, and Team pages. Combine them with /v1/map for sitemap discovery and /v1/crawl for full-site enrichment.
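One way to connect the discovery step to extraction is to filter the discovered URLs down to the pages likely to carry firmographic data. The sketch below is plain Python with no webclaw calls. The function name `pick_enrichment_pages`, the keyword list, and the sample URLs are illustrative assumptions, not part of the webclaw API.

```python
from urllib.parse import urlparse

# Hypothetical keyword list; tune it for the sites you enrich.
ENRICHMENT_KEYWORDS = ("about", "pricing", "team", "company")


def pick_enrichment_pages(urls, keywords=ENRICHMENT_KEYWORDS, limit=5):
    """Return up to `limit` URLs whose path contains one of the keywords."""
    picked = []
    seen = set()
    for url in urls:
        # Normalize the path so "/about" and "/about/" count as one page.
        path = urlparse(url).path.lower().rstrip("/")
        segments = [s for s in path.split("/") if s]
        # Substring match per path segment; simple, but it can over-match.
        if not any(k in seg for seg in segments for k in keywords):
            continue
        if path in seen:
            continue
        seen.add(path)
        picked.append(url)
        if len(picked) == limit:
            break
    return picked


# Sample input standing in for a sitemap discovery result.
discovered = [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/about/",
    "https://example.com/blog/why-we-built-this",
    "https://example.com/pricing",
    "https://example.com/company/team",
]
pages = pick_enrichment_pages(discovered)
```

Each URL in `pages` can then be passed to the extraction step, which keeps the page count per domain small and predictable.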

Why webclaw for lead enrichment

  • Brand identity extraction (logo, colors, fonts, tagline)
  • LLM structured extraction with custom schemas
  • Site mapping to discover About, Pricing, Team pages
  • Fresh data: scrape on demand, not stale databases
  • Predictable per-page pricing at scale

Code example

Python — enrich a lead from domain

from webclaw import Webclaw

client = Webclaw(api_key="wc_...")

domain = "example.com"

# Brand identity
brand = client.brand(url=f"https://{domain}")

# Firmographic extraction from the home page
firmographic = client.extract(
    url=f"https://{domain}",
    schema={
        "type": "object",
        "properties": {
            "company_name": {"type": "string"},
            "tagline": {"type": "string"},
            "industry": {"type": "string"},
            "product_categories": {"type": "array", "items": {"type": "string"}},
            "target_audience": {"type": "string"},
        },
    },
)

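# Merge both results into one record. On duplicate keys (for example
# "tagline"), the firmographic value overwrites the brand value.
lead = {**brand.data, **firmographic.data, "domain": domain}
print(lead)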
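The single-domain example extends naturally to a list of domains. The sketch below is one possible shape, assuming only what the example above shows: a client with `brand(url=...)` and `extract(url=..., schema=...)` methods whose results expose a dict-like `.data`. The helper name `enrich_domains` is hypothetical, and because the SDK's exception types are not described here, it catches `Exception` broadly.

```python
def enrich_domains(client, domains, schema):
    """Enrich each domain; collect failures instead of aborting the batch."""
    leads, failures = [], {}
    for domain in domains:
        url = f"https://{domain}"
        try:
            brand = client.brand(url=url)
            firmographic = client.extract(url=url, schema=schema)
        except Exception as exc:  # assumption: SDK error types are unknown
            failures[domain] = str(exc)
            continue
        # Keep the two results under separate keys so overlapping fields
        # (such as "tagline") do not overwrite each other.
        leads.append({
            "domain": domain,
            "brand": dict(brand.data),
            "firmographic": dict(firmographic.data),
        })
    return leads, failures
```

Nesting the results trades a flat record for unambiguous provenance. Flatten it later if your CRM needs a single level.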

webclaw features for this use case

  • Brand extraction (logo, colors, fonts)
  • LLM structured extraction with schemas
  • Sitemap and page discovery
  • Fresh data on every call
  • Volume-friendly pricing

Frequently asked questions

Can webclaw replace Clearbit or ZoomInfo for lead enrichment?

For firmographic data extracted from public websites, yes. webclaw gives you structured company data from home, about, pricing, and team pages at a fraction of the cost. It does not provide B2B contact emails; those come from separate datasets that providers such as Clearbit build themselves.

How do I extract the same fields across 10,000 different company sites?

Define a JSON schema once and use /v1/extract with that schema against every domain. The LLM extraction handles the variation in page structures and returns consistently shaped output you can load into your CRM.
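Model output can still omit a field or return a bare string where a list was expected, so a small normalization pass before the CRM load is a cheap safeguard. This is a minimal sketch, assuming each extraction result is available as a plain dict. `FIRMOGRAPHIC_SCHEMA` mirrors the schema from the code example, and `normalize_record` is a hypothetical helper, not a webclaw function.

```python
FIRMOGRAPHIC_SCHEMA = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "tagline": {"type": "string"},
        "industry": {"type": "string"},
        "product_categories": {"type": "array", "items": {"type": "string"}},
        "target_audience": {"type": "string"},
    },
}


def normalize_record(raw, schema=FIRMOGRAPHIC_SCHEMA):
    """Return a dict with exactly the schema's keys, in schema order."""
    record = {}
    for name, spec in schema["properties"].items():
        value = raw.get(name)
        if spec["type"] == "array":
            # Missing -> empty list; bare string -> one-item list.
            if value is None:
                value = []
            elif isinstance(value, str):
                value = [value]
            else:
                value = list(value)
            value = [str(v).strip() for v in value if str(v).strip()]
        elif value is not None:
            # Trim whitespace; treat empty strings as missing.
            value = str(value).strip() or None
        record[name] = value
    return record


raw = {"company_name": " Example Inc ", "product_categories": "Widgets", "extra": 1}
row = normalize_record(raw)
```

Keys outside the schema are dropped and missing string fields become `None`, so every row has the same columns regardless of which site it came from.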

What about privacy and GDPR compliance?

webclaw only scrapes publicly accessible content, respects robots.txt by default, and does not store customer data. You are responsible for complying with GDPR, CCPA, and other regulations when processing scraped data.

Start building

500 pages/month free. No credit card. Open source.
