POST /v1/summarize

Page summaries your agents can read.

Collapse any page into a tight summary with a single call.

Feed your LLM the gist, not the whole page. Send a URL and get back a clean summary your agents can act on without burning context on nav, ads, and boilerplate. Webclaw is built in Rust, scrapes and cleans the page before the model ever sees it, and lets you control the length down to the sentence.

What you get

Everything in one call.

Length you control

Set max_sentences and get a summary sized for your prompt, from a one-liner to a full paragraph.

Clean source content

The page is scraped and stripped of nav, ads, and boilerplate before the model reads it, so the summary stays on topic.

Token savings built in

Hand your LLM a summary instead of raw HTML and cut roughly 90% of the tokens you would otherwise spend on the page.

Handles the hard pages

JavaScript rendering and bot protection are dealt with automatically, so summaries come back even from pages a plain fetch can't load.

How it works

From URL to output in four steps.

01

Send a URL

POST the page URL and an optional max_sentences to set how long the summary should be.

02

Page gets fetched

Webclaw scrapes the page in Rust, rendering JavaScript and clearing bot protection when needed.

03

Content gets cleaned

Nav, ads, and boilerplate are stripped so only the real content is passed to the model.

04

Summary returned

An LLM condenses the cleaned text and you get back a single structured summary field.
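
The four steps above can be sketched from the client side as follows. This is a minimal Python sketch under stated assumptions, not an official client. Only the `/v1/summarize` path and the `max_sentences` parameter come from this page. The host `api.example.com`, the `url` request field, the bearer-token header, and the `summary` response field name are placeholders to check against the API docs.

```python
import json
import urllib.request

# Assumption: placeholder host; the real base URL is in the API docs.
BASE_URL = "https://api.example.com"


def build_summarize_request(page_url, api_key, max_sentences=None):
    """Build (but do not send) a POST /v1/summarize request."""
    # Assumption: the page URL is sent in a JSON field named "url".
    payload = {"url": page_url}
    if max_sentences is not None:
        payload["max_sentences"] = max_sentences
    return urllib.request.Request(
        BASE_URL + "/v1/summarize",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the real scheme may differ.
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


def parse_summary(body):
    """Pull the summary text out of a JSON response body (bytes or str)."""
    # Assumption: the structured field is named "summary".
    return json.loads(body)["summary"]


def summarize(page_url, api_key, max_sentences=None):
    """Send the request and return the summary text."""
    request = build_summarize_request(page_url, api_key, max_sentences)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_summary(response.read())
```

Keeping request building and response parsing separate from the network call makes the assumed field names easy to adjust once you have confirmed them in the docs.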

API

One request, a structured summary back.

Summary

Webclaw is a web scraping tool designed for AI agents that converts any URL into clean markdown or JSON format while reducing token usage by 90% compared to raw HTML. It offers multiple products including an MCP server for integration with Claude and Cursor, a cloud API with REST endpoints, and a CLI tool, all built on an open-source core that can be self-hosted. The service provides fast extraction with bot protection, supports multiple output formats, and offers pricing plans starting at $19/month with a credit-based system for page extraction.

Common questions

Frequently asked questions

How do I summarize a web page with an API?

Send a POST to /v1/summarize with the page URL and an optional max_sentences. Webclaw scrapes and cleans the page, then an LLM returns a structured summary field. No scraping or prompt setup on your side.

Can I control how long the summary is?

Yes. Pass max_sentences to cap the length, from a single sentence to a full paragraph. It defaults to 3 sentences if you leave it out.
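
As a rough illustration of the length control, the Python sketch below builds a request body and leaves `max_sentences` out when you do not set it, so the service applies its default of 3. The `url` field name and the client-side validation are assumptions for illustration; this page does not state an upper limit, so none is enforced here.

```python
DEFAULT_MAX_SENTENCES = 3  # default stated on this page when the field is omitted


def summarize_payload(page_url, max_sentences=None):
    """Return the JSON-ready body for a summarize call."""
    # Assumption: the page URL travels in a field named "url".
    payload = {"url": page_url}
    if max_sentences is None:
        return payload  # omit the field so the service uses its default
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_sentences, bool) or not isinstance(max_sentences, int):
        raise TypeError("max_sentences must be an integer")
    if max_sentences < 1:
        raise ValueError("max_sentences must be at least 1")
    payload["max_sentences"] = max_sentences
    return payload


def sentence_cap(max_sentences=None):
    """Sentence cap you should expect the summary to respect."""
    return DEFAULT_MAX_SENTENCES if max_sentences is None else max_sentences
```

For example, `summarize_payload("https://example.com", 1)` asks for a one-liner, while calling it without `max_sentences` leaves the choice to the service.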

Does the summarize API handle JavaScript-heavy or protected pages?

Yes. JavaScript rendering and bot protection are handled automatically, so you still get a summary from pages a plain HTTP fetch would fail to load.

Why summarize a page instead of sending raw HTML to my LLM?

Raw HTML wastes tokens on nav, ads, and markup. A summary front-loads the meaning and cuts roughly 90% of the tokens, so your agent reasons over the gist instead of the page source.
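
To check the savings on your own pages, a back-of-the-envelope comparison like the Python sketch below can help. It assumes about four characters per token, which is only a common rule of thumb; real counts depend on your model's tokenizer, and both function names are hypothetical helpers rather than part of the API.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Very rough token estimate; not a substitute for a real tokenizer."""
    return max(1, round(len(text) / chars_per_token))


def token_savings(raw_html, summary):
    """Fraction of estimated tokens saved by sending the summary instead of the HTML."""
    raw_tokens = estimate_tokens(raw_html)
    summary_tokens = estimate_tokens(summary)
    return 1 - summary_tokens / raw_tokens
```

Under this heuristic, a 4,000-character page reduced to a 400-character summary goes from about 1,000 tokens to about 100, a saving of about 90%.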

Am I billed for failed requests?

No. Credits are only consumed on successful responses. A standard page is 1 credit; heavier work like JS rendering or protected-site access costs a few extra credits.

Ship an agent that actually sees the web.

One credit pool, every endpoint. Cancel anytime, or self-host the open-source core for free.
