Map

Discover the URLs a site lists in its sitemaps by parsing robots.txt and sitemap.xml. The endpoint recursively resolves sitemap indexes to find every listed page.

POST /v1/map

Discover all URLs on a site via sitemap parsing.

Request body

json

{
  "url": "https://docs.example.com"
}

Field	Type	Required	Description
`url`	`string`	Yes	Base URL of the site to map.

Response

json

{
  "urls": [
    "https://docs.example.com",
    "https://docs.example.com/getting-started",
    "https://docs.example.com/api/reference",
    "https://docs.example.com/guides/authentication",
    "https://docs.example.com/guides/deployment"
  ],
  "count": 156
}

Field	Type	Description
`urls`	`string[]`	All discovered URLs from sitemap parsing. The sample response above shows only the first few entries.
`count`	`number`	Total number of URLs found.

Note

The map endpoint checks robots.txt for sitemap references first, then falls back to /sitemap.xml. Sitemap indexes are resolved recursively, so a single request can discover thousands of URLs.
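The discovery order described in the note can be sketched roughly as follows. This is an illustration, not the service's actual implementation: `sitemaps_from_robots`, `discover`, and the injected `fetch` callable are hypothetical names, and the sketch assumes the fallback to /sitemap.xml applies only when robots.txt lists no sitemaps. It walks sitemap indexes with a queue rather than literal recursion, which reaches the same set of pages.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional
from urllib.parse import urljoin


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the URLs found on `Sitemap:` lines of a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def _local(tag: str) -> str:
    """Strip the XML namespace prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def discover(base_url: str, fetch: Callable[[str], Optional[str]]) -> list[str]:
    """Collect page URLs. `fetch` returns a document body, or None if missing."""
    robots = fetch(urljoin(base_url, "/robots.txt")) or ""
    # Assumption: fall back to /sitemap.xml only when robots.txt names no sitemap.
    queue = sitemaps_from_robots(robots) or [urljoin(base_url, "/sitemap.xml")]
    seen_sitemaps: set[str] = set()
    pages: list[str] = []
    while queue:
        sitemap_url = queue.pop(0)
        if sitemap_url in seen_sitemaps:
            continue  # guard against indexes that reference each other
        seen_sitemaps.add(sitemap_url)
        body = fetch(sitemap_url)
        if body is None:
            continue
        root = ET.fromstring(body)
        locs = [
            loc.text.strip()
            for entry in root  # <sitemap> or <url> entries
            for loc in entry
            if _local(loc.tag) == "loc" and loc.text
        ]
        if _local(root.tag) == "sitemapindex":
            queue.extend(locs)  # child sitemaps still need resolving
        else:
            pages.extend(locs)
    return list(dict.fromkeys(pages))  # de-duplicate, keep first-seen order
```

Passing `fetch` in as a parameter keeps the traversal logic separate from HTTP, so it can be exercised with an in-memory dictionary (`discover(base, documents.get)`) before wiring it to a real client.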

Example

curl

curl -X POST https://api.webclaw.io/v1/map \
  -H "Authorization: Bearer wc_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://docs.stripe.com"}'
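The same request can be made from Python using only the standard library. The sketch below mirrors the curl call above; the helper names `build_map_request` and `map_site` and the 30-second timeout are illustrative assumptions, not part of the API. Error response formats are not covered in this section, so the sketch simply lets `urllib` raise `HTTPError` on a non-2xx status.

```python
import json
import urllib.request

API_BASE = "https://api.webclaw.io"  # host taken from the curl example


def build_map_request(site_url: str, api_key: str) -> urllib.request.Request:
    """Build the POST /v1/map request without sending it."""
    body = json.dumps({"url": site_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/v1/map",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def map_site(site_url: str, api_key: str, timeout: float = 30.0) -> list[str]:
    """Send the request and return the `urls` array from the response."""
    request = build_map_request(site_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["urls"]
```

Calling `map_site("https://docs.stripe.com", "wc_your_api_key")` would then return the list of discovered URLs, assuming the key is valid.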

Tip

Use /v1/map to build a URL list, then feed it to /v1/batch for bulk extraction. This is faster than crawling when the site has a comprehensive sitemap.
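Before handing a large map result to /v1/batch, it may help to narrow it to the section of the site you care about and split it into manageable groups. The helpers below are a hypothetical sketch: the /v1/batch request format and any limit on URLs per request are not described in this section, so the chunk size is a placeholder and the code stops short of building a batch payload.

```python
from typing import Iterator
from urllib.parse import urlsplit


def filter_by_path(urls: list[str], prefix: str) -> list[str]:
    """Keep URLs whose path starts with `prefix`, e.g. "/guides/"."""
    return [u for u in urls if urlsplit(u).path.startswith(prefix)]


def chunked(urls: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` URLs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


# Example with the URLs from the sample response; the chunk size of 2 is arbitrary.
sample = [
    "https://docs.example.com",
    "https://docs.example.com/getting-started",
    "https://docs.example.com/api/reference",
    "https://docs.example.com/guides/authentication",
    "https://docs.example.com/guides/deployment",
]
guides = filter_by_path(sample, "/guides/")
groups = list(chunked(sample, 2))
```

Here `guides` holds the two guide pages and `groups` holds three lists of at most two URLs each, ready to be sent one group at a time.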
