Batch

Extract content from multiple URLs in a single request. Requests are processed concurrently on the server.

POST/v1/batch

Extract content from multiple URLs concurrently.

Request body

json

{
  "urls": [
    "https://example.com/page-1",
    "https://example.com/page-2",
    "https://example.com/page-3"
  ],
  "formats": ["markdown", "llm"],
  "concurrency": 5
}

Field	Type	Required	Description
`urls`	`string[]`	Yes	Array of URLs to extract.
`formats`	`string[]`	No	Output formats. Options: `markdown`, `llm`, `text`, `json`. Defaults to `["markdown"]`.
`concurrency`	`number`	No	Max concurrent requests. Default: 5.

Response

json

{
  "results": [
    {
      "url": "https://example.com/page-1",
      "markdown": "# Page One\n\nContent of the first page...",
      "metadata": {
        "title": "Page One",
        "word_count": 654
      },
      "error": null
    },
    {
      "url": "https://example.com/page-2",
      "markdown": "# Page Two\n\nContent of the second page...",
      "metadata": {
        "title": "Page Two",
        "word_count": 1102
      },
      "error": null
    },
    {
      "url": "https://example.com/page-3",
      "markdown": null,
      "metadata": null,
      "error": "Failed to fetch: 404 Not Found"
    }
  ]
}

Note

Individual URL failures do not fail the entire batch. Check the error field on each result to detect per-URL failures.

Example

curl

curl -X POST https://api.webclaw.io/v1/batch \
  -H "Authorization: Bearer wc_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": [
      "https://openai.com/blog/gpt-4",
      "https://anthropic.com/research/claude-3",
      "https://deepmind.google/technologies/gemini"
    ],
    "formats": ["markdown", "llm"],
    "concurrency": 3
  }'

Tip

For large URL lists, keep concurrency reasonable (5-10) to avoid overwhelming target servers. The server-side default of 5 is a good starting point.

Batch

Request body

Response

Example

Ready to build? Start extracting.