Extract
Extract structured JSON data from any URL. Provide a JSON schema for typed output, or a natural language prompt for flexible extraction. Both modes use an LLM to parse the page content.
POST /v1/extract
Extract structured data from a URL using a JSON schema or natural language prompt.
Note
This endpoint requires an LLM provider. The provider chain tries Ollama (local) first, then falls back to OpenAI, then Anthropic. At least one must be configured.
Schema mode
Provide a JSON Schema and the LLM will return data conforming to it. This gives you predictable, typed output.
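To make "conforming to it" concrete, here is a minimal Python sketch that checks a data object against the basic type keywords used in this section's schema. It is only an illustration: it is not a full JSON Schema validator, it does not describe how the service validates output, and `conforms` is a hypothetical helper name.

```python
from typing import Any

# Map simple JSON Schema type keywords to Python types.
_SIMPLE_TYPES = {"string": str, "boolean": bool}


def conforms(value: Any, schema: dict) -> bool:
    """Return True if value matches a very small subset of JSON Schema."""
    kind = schema.get("type")
    if kind == "object":
        if not isinstance(value, dict):
            return False
        props = schema.get("properties", {})
        # Only properties that are present are checked; "required" is ignored.
        return all(conforms(value[k], sub) for k, sub in props.items() if k in value)
    if kind == "array":
        item_schema = schema.get("items", {})
        return isinstance(value, list) and all(conforms(v, item_schema) for v in value)
    if kind == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if kind in _SIMPLE_TYPES:
        return isinstance(value, _SIMPLE_TYPES[kind])
    # Unknown or missing type keyword: accept, since this is not a full validator.
    return True


# Schema and data taken from the schema mode example in this section.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
        "currency": {"type": "string"},
        "features": {"type": "array", "items": {"type": "string"}},
    },
}
data = {
    "title": "Pro Plan",
    "price": 49,
    "currency": "USD",
    "features": ["Unlimited extractions", "Priority support", "Custom browser profiles"],
}
```

With these values, `conforms(data, schema)` returns `True`, while a payload whose `price` is the string `"49"` would be rejected.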
Request body
json
{
  "url": "https://example.com/pricing",
  "schema": {
    "type": "object",
    "properties": {
      "title": { "type": "string" },
      "price": { "type": "number" },
      "currency": { "type": "string" },
      "features": {
        "type": "array",
        "items": { "type": "string" }
      }
    }
  }
}
Response
json
{
  "data": {
    "title": "Pro Plan",
    "price": 49,
    "currency": "USD",
    "features": [
      "Unlimited extractions",
      "Priority support",
      "Custom browser profiles"
    ]
  }
}
Prompt mode
Describe what you want in plain English. The LLM will determine the structure based on your prompt and the page content.
Request body
json
{
  "url": "https://example.com/pricing",
  "prompt": "Extract all pricing tiers with name, price, and features"
}
Response
json
{
  "data": {
    "tiers": [
      {
        "name": "Hobby",
        "price": 9,
        "features": ["1 seat", "Community support"]
      },
      {
        "name": "Pro",
        "price": 49,
        "features": ["5 seats", "Priority support", "Custom profiles"]
      },
      {
        "name": "Scale",
        "price": 199,
        "features": ["500k pages/month", "Dedicated support", "SLA"]
      }
    ]
  }
}
Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL to extract data from. |
| schema | object | No* | JSON Schema defining the desired output structure. |
| prompt | string | No* | Natural language description of what to extract. |
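The parameter rules for this endpoint (url is required, and at least one of schema or prompt must be present) can be enforced client-side before a request is sent. A minimal Python sketch, where `build_extract_body` is a hypothetical helper and not part of any official SDK:

```python
from typing import Optional


def build_extract_body(
    url: str,
    schema: Optional[dict] = None,
    prompt: Optional[str] = None,
) -> dict:
    """Build a JSON-serializable body for POST /v1/extract."""
    if not url:
        raise ValueError("url is required")
    if schema is None and prompt is None:
        raise ValueError("provide either schema or prompt")
    body = {"url": url}
    if schema is not None:
        # The docs state that schema takes precedence when both are given,
        # so this sketch sends only the schema to keep the intent explicit.
        body["schema"] = schema
    else:
        body["prompt"] = prompt
    return body
```

Dropping the prompt when a schema is present is a design choice of this sketch; sending both is also accepted by the endpoint, with schema winning.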
Warning
You must provide either schema or prompt. If both are provided, schema takes precedence.
LLM provider chain
The extract endpoint tries LLM providers in this order:
- Ollama (local) -- free, no API key needed. Set OLLAMA_HOST if not running on localhost.
- OpenAI -- requires OPENAI_API_KEY.
- Anthropic -- requires ANTHROPIC_API_KEY.
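The fallback order above can be pictured as a simple ordered lookup over the environment. The sketch below is purely illustrative and is not the service's actual implementation: `candidate_providers` is a hypothetical function, and treating Ollama as always eligible (because it needs no API key) is an assumption.

```python
import os
from typing import List, Mapping


def candidate_providers(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return provider names in the documented fallback order."""
    # Assumption: Ollama needs no API key, so it is always listed first;
    # whether it is actually reachable is only known when it is called.
    providers = ["ollama"]
    if env.get("OPENAI_API_KEY"):
        providers.append("openai")
    if env.get("ANTHROPIC_API_KEY"):
        providers.append("anthropic")
    return providers
```

For example, with only ANTHROPIC_API_KEY set, this sketch yields `["ollama", "anthropic"]`, so a failed local Ollama call would fall through directly to Anthropic.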
Example
curl -- schema mode
curl -X POST https://api.webclaw.io/v1/extract \
  -H "Authorization: Bearer wc_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/product/widget",
    "schema": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "price": { "type": "number" },
        "in_stock": { "type": "boolean" }
      }
    }
  }'
curl -- prompt mode
curl -X POST https://api.webclaw.io/v1/extract \
  -H "Authorization: Bearer wc_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/team",
    "prompt": "Extract all team members with name, role, and LinkedIn URL"
  }'
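Python -- standard library
The same calls can be made from Python. The sketch below uses only the standard library and mirrors the curl examples; the helper names `make_extract_request` and `extract` are hypothetical, and the API key is the placeholder used above. Building the request is kept separate from sending it, so the request can be inspected without network access.

```python
import json
import urllib.request

API_URL = "https://api.webclaw.io/v1/extract"


def make_extract_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the extract endpoint."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract(api_key: str, body: dict) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(make_extract_request(api_key, body)) as resp:
        return json.load(resp)
```

Calling `extract("wc_your_api_key", {"url": "https://example.com/team", "prompt": "Extract all team members with name, role, and LinkedIn URL"})` would send the prompt mode request; judging by the response examples, the extracted object should then sit under the `data` key of the returned dictionary.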