Map
Discover all URLs on a site by parsing robots.txt and sitemap.xml. Recursively resolves sitemap indexes to find every listed page.
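The discovery order described here can be pictured with a small sketch. It is an illustration of the described behaviour only, not the service's actual implementation. The names `sitemaps_from_robots`, `discover_urls`, and the caller-supplied `fetch` function are hypothetical, and the sketch assumes sitemaps use the standard sitemaps.org XML namespace.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional
from urllib.parse import urljoin

# Namespace used by standard sitemap and sitemap index files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the URLs named by `Sitemap:` lines in a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def discover_urls(base_url: str, fetch: Callable[[str], Optional[str]]) -> list[str]:
    """Collect page URLs, following sitemap indexes recursively.

    `fetch` (hypothetical) returns a document body as text, or None
    when the document is unavailable.
    """
    robots = fetch(urljoin(base_url, "/robots.txt"))
    start = sitemaps_from_robots(robots) if robots else []
    if not start:
        # No sitemap references in robots.txt: fall back to /sitemap.xml.
        start = [urljoin(base_url, "/sitemap.xml")]

    seen_sitemaps: set[str] = set()
    seen_pages: set[str] = set()
    pages: list[str] = []

    def visit(sitemap_url: str) -> None:
        if sitemap_url in seen_sitemaps:
            return  # guard against indexes that reference each other
        seen_sitemaps.add(sitemap_url)
        body = fetch(sitemap_url)
        if body is None:
            return
        try:
            root = ET.fromstring(body)
        except ET.ParseError:
            return  # skip malformed sitemaps
        locs = [
            el.text.strip()
            for el in root.iter(f"{SITEMAP_NS}loc")
            if el.text and el.text.strip()
        ]
        if root.tag == f"{SITEMAP_NS}sitemapindex":
            for child in locs:  # each <loc> is another sitemap
                visit(child)
        else:
            for loc in locs:  # each <loc> is a page URL
                if loc not in seen_pages:
                    seen_pages.add(loc)
                    pages.append(loc)

    for sitemap_url in start:
        visit(sitemap_url)
    return pages
```

Passing `fetch` in as a parameter keeps the traversal logic separate from networking, so it can be exercised with an in-memory dictionary such as `discover_urls("https://docs.example.com", documents.get)`.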
POST /v1/map
Discover all URLs on a site via sitemap parsing.
Request body
json
{
  "url": "https://docs.example.com"
}

| Field | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | Base URL of the site to map. |
Response
json
{
  "urls": [
    "https://docs.example.com",
    "https://docs.example.com/getting-started",
    "https://docs.example.com/api/reference",
    "https://docs.example.com/guides/authentication",
    "https://docs.example.com/guides/deployment"
  ],
  "count": 5
}

| Field | Type | Description |
|---|---|---|
| urls | string[] | All discovered URLs from sitemap parsing. |
| count | number | Total number of URLs found. |
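As a rough illustration of the request and response shapes documented above, the following Python sketch builds the POST request and parses the reply using only the standard library. The host and the bearer-token header are taken from the curl example on this page. The helper names (`build_map_request`, `parse_map_response`, `map_site`, `MapResult`) are hypothetical, and error responses are not handled because this section does not describe them.

```python
import json
import urllib.request
from dataclasses import dataclass

# Endpoint as shown in the curl example on this page.
MAP_ENDPOINT = "https://api.webclaw.io/v1/map"


@dataclass
class MapResult:
    urls: list[str]  # discovered URLs
    count: int       # total number of URLs found


def build_map_request(site_url: str, api_key: str) -> urllib.request.Request:
    """Build the POST request; the body carries the single `url` field."""
    payload = json.dumps({"url": site_url}).encode("utf-8")
    return urllib.request.Request(
        MAP_ENDPOINT,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def parse_map_response(body: str) -> MapResult:
    """Parse a response body into the documented `urls` and `count` fields."""
    data = json.loads(body)
    urls = data["urls"]
    if not isinstance(urls, list) or not all(isinstance(u, str) for u in urls):
        raise ValueError("expected 'urls' to be a list of strings")
    return MapResult(urls=urls, count=int(data["count"]))


def map_site(site_url: str, api_key: str) -> MapResult:
    """Send the request and parse the reply (performs a network call)."""
    # urlopen raises urllib.error.HTTPError on non-2xx statuses.
    with urllib.request.urlopen(build_map_request(site_url, api_key)) as resp:
        return parse_map_response(resp.read().decode("utf-8"))
```

Keeping request construction and response parsing in separate functions lets both be checked without touching the network; only `map_site` performs I/O.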
Note
The map endpoint checks robots.txt for sitemap references first, then falls back to /sitemap.xml. Sitemap indexes are resolved recursively, so a single request can discover thousands of URLs.
Example
curl
curl -X POST https://api.webclaw.io/v1/map \
  -H "Authorization: Bearer wc_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://docs.stripe.com"}'
Tip
Use /v1/map to build a URL list, then feed it to /v1/batch for bulk extraction. This is faster than crawling when the site has a comprehensive sitemap.
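Before handing a mapped URL list to a bulk step, it can help to narrow and split it. The sketch below is one possible way to do that in Python. The request format of /v1/batch is not described in this section, so the code stops short of building a batch payload. The helper names `select_urls` and `chunked` are hypothetical, and the chunk size of 100 in the usage note is an arbitrary placeholder, not a documented limit.

```python
from itertools import islice
from typing import Iterable, Iterator
from urllib.parse import urlsplit


def select_urls(urls: Iterable[str], path_prefix: str = "/") -> list[str]:
    """Keep URLs whose path starts with `path_prefix`, dropping duplicates.

    First-seen order is preserved.
    """
    matching = (
        u for u in urls
        # A bare origin such as https://docs.example.com has an empty path.
        if (urlsplit(u).path or "/").startswith(path_prefix)
    )
    return list(dict.fromkeys(matching))


def chunked(urls: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive lists of at most `size` URLs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(urls)
    while chunk := list(islice(it, size)):
        yield chunk
```

For example, `for group in chunked(select_urls(result_urls, "/guides/"), 100): ...` would iterate over groups of guide pages, where `result_urls` stands for the `urls` array returned by /v1/map.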