POST /v1/extract
Point an LLM at any URL and get back the exact JSON shape you asked for.
Define a JSON schema or write a plain-English prompt, send a URL, and an LLM reads the page and returns typed, structured data ready to feed straight into your agent or pipeline. No selectors, no parsing code, no brittle scrapers to maintain.
Pass a JSON Schema and the response conforms to it exactly, so your code gets predictable, typed fields every time.
Skip the schema and describe what you want in plain English, and the LLM infers the structure from the page.
Pages are stripped to LLM-ready content before extraction, cutting roughly 90% of the tokens versus raw HTML.
JS rendering and bot protection are resolved automatically, so extraction works on the same sites a scrape would.
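To make the content-stripping idea concrete, here is a minimal sketch of removing boilerplate containers from HTML before handing text to a model. It uses only Python's standard `html.parser`; the list of skipped tags and the `clean_html` helper are assumptions for illustration and are not the service's actual cleaning rules.

```python
from html.parser import HTMLParser

# Tags treated as boilerplate in this sketch (an assumption, not the
# service's real rule set).
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class ContentExtractor(HTMLParser):
    """Collects visible text while skipping boilerplate containers."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def clean_html(html: str) -> str:
    """Return the text outside SKIP_TAGS, one chunk per line."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


raw = (
    "<html><head><style>body{margin:0}</style></head><body>"
    "<nav><a href='/'>Home</a><a href='/pricing'>Pricing</a></nav>"
    "<main><h1>Acme Widget</h1><p>Price: $19.99</p></main>"
    "<footer>Copyright Acme</footer><script>track()</script>"
    "</body></html>"
)
cleaned = clean_html(raw)
print(cleaned)
print(f"{len(cleaned)} characters kept out of {len(raw)}")
```

The savings on a real page depend on how much markup and navigation it carries; this toy input only shows the mechanism.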
01. POST the target URL with either a JSON schema or a natural language prompt describing what to pull.
02. The page is fetched in Rust and stripped of nav, ads, and boilerplate into compact content for the model.
03. An LLM reads the cleaned content and maps it onto your schema, or builds a structure that fits your prompt.
04. You get back a data object with the typed fields you defined, ready to use directly in your application.
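The first step above could look roughly like the following sketch, which builds the request without sending it. The base URL, the API key, the `Authorization: Bearer` header, and the body field names `url`, `schema`, and `prompt` are all assumptions for illustration; only the `POST /v1/extract` path comes from this page.

```python
import json
import urllib.request

# Hypothetical values: replace with the real base URL and key.
API_BASE = "https://api.example.com"
API_KEY = "YOUR_API_KEY"

# A JSON Schema describing the fields to extract from a product page.
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["name", "price"],
}


def build_extract_request(url, schema=None, prompt=None):
    """Build (but do not send) a POST /v1/extract request."""
    if schema is None and prompt is None:
        raise ValueError("provide a schema or a prompt")
    body = {"url": url}
    if schema is not None:
        body["schema"] = schema  # assumed field name
    if prompt is not None:
        body["prompt"] = prompt  # assumed field name
    return urllib.request.Request(
        f"{API_BASE}/v1/extract",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
        },
        method="POST",
    )


req = build_extract_request("https://example.com/product/42", schema=product_schema)
print(req.get_method(), req.full_url)
```

Sending it is a matter of passing `req` to `urllib.request.urlopen` and decoding the JSON response; that call is left out here so the sketch runs offline.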
Send a URL to POST /v1/extract along with a JSON schema describing the fields you want. The endpoint fetches and cleans the page, then an LLM maps the content onto your schema and returns a typed data object. No selectors or parsing code required.
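On the receiving side, a typed data object maps naturally onto a small typed structure in your own code. The sketch below assumes the response body wraps the result in a top-level `data` key and uses a hypothetical `Product` shape; the sample body is illustrative and was not captured from the real API.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Product:
    """Hypothetical target type matching a product schema."""
    name: str
    price: float
    in_stock: Optional[bool] = None


def parse_product(response_text: str) -> Product:
    """Turn a JSON response body into a Product instance."""
    payload = json.loads(response_text)
    data = payload["data"]  # assumed envelope key
    return Product(
        name=str(data["name"]),
        price=float(data["price"]),
        in_stock=data.get("in_stock"),
    )


# Illustrative response body, not real API output.
sample = '{"data": {"name": "Acme Widget", "price": 19.99, "in_stock": true}}'
product = parse_product(sample)
print(product)
```

Because the schema fixes the field names and types up front, the parsing code stays a plain mapping with no page-specific logic.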
Schema mode takes a JSON Schema and returns data conforming to it, giving you predictable typed output. Prompt mode takes a plain-English description and lets the LLM infer the structure from the page. If you send both, the schema wins.
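The precedence rule can be mirrored client-side so that logs and tests show which mode a request will run in. This is a sketch only: the `schema` and `prompt` field names and the `effective_mode` helper are assumptions, and the server remains the authority on how a request is interpreted.

```python
def effective_mode(body: dict) -> str:
    """Return the mode a request body would run in: the schema wins if both are set."""
    if body.get("schema") is not None:
        return "schema"
    if body.get("prompt"):
        return "prompt"
    raise ValueError("request needs a schema or a prompt")


schema_only = {"url": "https://example.com", "schema": {"type": "object"}}
prompt_only = {"url": "https://example.com", "prompt": "Get the product name and price"}
both = {**schema_only, "prompt": "Get the product name and price"}

for body in (schema_only, prompt_only, both):
    print(effective_mode(body))
```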
Extraction works on JavaScript-heavy and bot-protected sites. It runs on top of the same fetch path as scraping, so JS rendering and bot protection are handled automatically before the LLM ever sees the content.
Failed requests are not charged. Credits are only consumed on successful responses. A standard page is 1 credit; heavier work like JS rendering or protected-site access costs a few extra credits.
Self-hosting is supported. The core extraction engine is open source and can be self-hosted for free. The hosted API adds managed infrastructure, automatic JS rendering, and bot protection handling on top.
One credit pool, every endpoint. Cancel anytime, or self-host the open-source core for free.