Batch
Extract content from multiple URLs in a single request. Requests are processed concurrently on the server.
POST /v1/batch
Extract content from multiple URLs concurrently.
Request body
json
{
  "urls": [
    "https://example.com/page-1",
    "https://example.com/page-2",
    "https://example.com/page-3"
  ],
  "formats": ["markdown", "llm"],
  "concurrency": 5
}

| Field | Type | Required | Description |
|---|---|---|---|
| urls | string[] | Yes | Array of URLs to extract. |
| formats | string[] | No | Output formats. Options: markdown, llm, text, json. Defaults to ["markdown"]. |
| concurrency | number | No | Max concurrent requests. Default: 5. |
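
The field table above can be mirrored in a small client-side helper that assembles the request body. The following is a minimal sketch, not part of any official client: `build_batch_payload` and `ALLOWED_FORMATS` are hypothetical names, and the checks for an empty URL list and for a concurrency below 1 are assumptions about sensible input rather than documented server rules.

```python
from typing import Any, Dict, List, Optional

# Format names listed in the field table.
ALLOWED_FORMATS = {"markdown", "llm", "text", "json"}


def build_batch_payload(
    urls: List[str],
    formats: Optional[List[str]] = None,
    concurrency: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a /v1/batch request body, omitting optional fields left unset."""
    # Assumption: an empty list is rejected client-side because urls is required.
    if not urls:
        raise ValueError("urls is required and must not be empty")
    payload: Dict[str, Any] = {"urls": list(urls)}

    if formats is not None:
        unknown = sorted(set(formats) - ALLOWED_FORMATS)
        if unknown:
            raise ValueError(f"unsupported formats: {unknown}")
        payload["formats"] = list(formats)

    if concurrency is not None:
        # Assumption: values below 1 are treated as invalid input.
        if concurrency < 1:
            raise ValueError("concurrency must be at least 1")
        payload["concurrency"] = concurrency

    return payload
```

Leaving `formats` and `concurrency` out of the payload lets the server apply its documented defaults (["markdown"] and 5) instead of duplicating them in client code.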
Response
json
{
  "results": [
    {
      "url": "https://example.com/page-1",
      "markdown": "# Page One\n\nContent of the first page...",
      "metadata": {
        "title": "Page One",
        "word_count": 654
      },
      "error": null
    },
    {
      "url": "https://example.com/page-2",
      "markdown": "# Page Two\n\nContent of the second page...",
      "metadata": {
        "title": "Page Two",
        "word_count": 1102
      },
      "error": null
    },
    {
      "url": "https://example.com/page-3",
      "markdown": null,
      "metadata": null,
      "error": "Failed to fetch: 404 Not Found"
    }
  ]
}
Note
Individual URL failures do not fail the entire batch. Check the error field on each result to detect per-URL failures.
Example
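
The per-URL error check described in the note might look like this in Python. This is a sketch under the assumption that the response has already been parsed into a dictionary shaped like the sample response; `split_results` is a hypothetical helper name, not part of the API.

```python
from typing import Any, Dict, List, Tuple


def split_results(
    response: Dict[str, Any],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Split a batch response into (succeeded, failed) result lists."""
    succeeded: List[Dict[str, Any]] = []
    failed: List[Dict[str, Any]] = []
    for result in response.get("results", []):
        # A null (None) error means this URL was extracted successfully.
        if result.get("error") is None:
            succeeded.append(result)
        else:
            failed.append(result)
    return succeeded, failed


# Trimmed copy of the sample response from the Response section.
sample = {
    "results": [
        {"url": "https://example.com/page-1", "markdown": "# Page One", "error": None},
        {"url": "https://example.com/page-3", "markdown": None, "error": "Failed to fetch: 404 Not Found"},
    ]
}

ok, bad = split_results(sample)
for item in bad:
    print(f"{item['url']}: {item['error']}")
# prints "https://example.com/page-3: Failed to fetch: 404 Not Found"
```

Keeping the failed entries in their own list makes it straightforward to log them or resubmit only those URLs in a later batch.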
curl
curl -X POST https://api.webclaw.io/v1/batch \
  -H "Authorization: Bearer wc_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": [
      "https://openai.com/blog/gpt-4",
      "https://anthropic.com/research/claude-3",
      "https://deepmind.google/technologies/gemini"
    ],
    "formats": ["markdown", "llm"],
    "concurrency": 3
  }'
Tip
For large URL lists, keep concurrency reasonable (5-10) to avoid overwhelming target servers. The server-side default of 5 is a good starting point.
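
For callers using Python instead of curl, the same request could be prepared with the standard library along these lines. This is a sketch rather than an official client: `make_batch_request` and `send_batch` are hypothetical helper names, the 60-second timeout is an assumed value, and the endpoint URL, headers, and default concurrency of 5 are taken from the curl example and the tip.

```python
import json
import urllib.request
from typing import Any, Dict, List

BATCH_URL = "https://api.webclaw.io/v1/batch"  # endpoint from the curl example


def make_batch_request(
    api_key: str, urls: List[str], concurrency: int = 5
) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for the batch endpoint."""
    body = {"urls": urls, "formats": ["markdown"], "concurrency": concurrency}
    return urllib.request.Request(
        BATCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_batch(request: urllib.request.Request, timeout: float = 60.0) -> Dict[str, Any]:
    """Send the prepared request and decode the JSON response body."""
    # Assumption: 60 seconds is an illustrative timeout, not a documented limit.
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Splitting preparation from sending keeps the request easy to inspect or log before any network call is made; a call such as `send_batch(make_batch_request("wc_your_api_key", urls))` would then perform the actual request.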