POST /v1/map

Every URL on a site, in one call.

Discover every page URL for a domain before you crawl or extract.

Point your agent at a domain and get back a flat list of the page URLs the site declares in its sitemaps, discovered through robots.txt and sitemap.xml. The endpoint is built in Rust and resolves sitemap indexes recursively, so one request can surface thousands of links, ready to feed into batch or crawl.

What you get

Everything in one call.

Full URL list

Get back a flat array of every page URL the site exposes, ready to feed straight into another endpoint.

Recursive sitemap resolution

Nested sitemap indexes are followed automatically, so a single request can surface thousands of links from one domain.

robots.txt aware

It reads robots.txt for sitemap references first, then falls back to /sitemap.xml when none are declared.

Fast crawl planning

Map a site before you scrape it to scope the work, skip dead routes, and feed the right URLs into batch.

How it works

From URL to output in four steps.

Send a domain

POST the base URL of the site you want to map with your API key.

Sources are checked

We read robots.txt for declared sitemaps, then look for /sitemap.xml if none are listed.

Indexes resolved

Sitemap indexes are expanded recursively to collect every listed page in one pass.

URL list returned

You get a deduplicated array of discovered URLs plus a total count.
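The four steps above can be sketched locally in a few lines of Python. This is only an illustration of the described approach, not the service's Rust implementation: the `fetch` callback, the function names, and the `urls` / `count` result keys are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional
from urllib.parse import urljoin

# Hypothetical callback: given a URL, return the body text, or None if missing.
Fetcher = Callable[[str], Optional[str]]


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Collect the targets of `Sitemap:` lines in a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def local_name(tag: str) -> str:
    """Strip the XML namespace: '{ns}loc' -> 'loc'."""
    return tag.rsplit("}", 1)[-1]


def collect_urls(sitemap_url: str, fetch: Fetcher, seen: set, out: dict) -> None:
    """Expand one sitemap, recursing into the children of a <sitemapindex>."""
    if sitemap_url in seen:  # guard against sitemaps that reference each other
        return
    seen.add(sitemap_url)
    body = fetch(sitemap_url)
    if body is None:
        return
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return
    # Read only <url>/<sitemap> -> <loc>, ignoring deeper extension elements.
    locs = [
        child.text.strip()
        for entry in root
        for child in entry
        if local_name(child.tag) == "loc" and child.text
    ]
    if local_name(root.tag) == "sitemapindex":
        for nested in locs:
            collect_urls(nested, fetch, seen, out)
    else:
        for loc in locs:
            out.setdefault(loc, None)  # dict keys dedupe and keep first-seen order


def map_site(base_url: str, fetch: Fetcher) -> dict:
    """robots.txt first, then /sitemap.xml as the fallback."""
    robots = fetch(urljoin(base_url, "/robots.txt")) or ""
    sitemaps = sitemaps_from_robots(robots) or [urljoin(base_url, "/sitemap.xml")]
    seen, out = set(), {}
    for sitemap_url in sitemaps:
        collect_urls(sitemap_url, fetch, seen, out)
    urls = list(out)
    return {"urls": urls, "count": len(urls)}
```

Passing the fetcher in as a callback keeps the sketch free of network code, so it can be exercised against a plain dictionary of canned responses (`pages.get`) before it is pointed at a real HTTP client.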

API

One request, structured data back.

POST /v1/map

Sitemap for webclaw.io

94 URLs discovered

https://webclaw.io
https://webclaw.io/pricing
https://webclaw.io/sponsor
https://webclaw.io/blog
https://webclaw.io/changelog
https://webclaw.io/products/cli
https://webclaw.io/products/api
https://webclaw.io/products/mcp
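A request along these lines might look like the Python sketch below. This page does not spell out the request format, so the host (`API_BASE` is a placeholder), the bearer-token header, and the `url` body field are all assumptions; check the API docs for the real names before relying on them.

```python
import json
import urllib.request

# Placeholder host: the real API base URL is not stated on this page.
API_BASE = "https://api.example.com"


def build_map_request(site_url: str, api_key: str) -> urllib.request.Request:
    """Build a POST /v1/map request (body field and auth scheme are assumed)."""
    payload = json.dumps({"url": site_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/v1/map",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


def call_map(site_url: str, api_key: str) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_map_request(site_url, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Splitting request construction from sending makes it easy to inspect the outgoing method, URL, and body without touching the network.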

Common questions

Frequently asked questions

What is a sitemap API?

A sitemap API takes a domain and returns the list of page URLs that site publishes, parsed from its robots.txt and sitemap.xml. It is how an agent learns what pages exist before deciding what to crawl or extract.

How do I get all the URLs of a website?

Send the base URL to POST /v1/map. It reads robots.txt for declared sitemaps, falls back to /sitemap.xml, resolves any sitemap indexes recursively, and returns a flat array of every discovered URL with a total count.

What is the difference between map and crawl?

Map only discovers URLs from a site's sitemaps and returns them as a list, so it is fast and cheap. Crawl actually visits pages and extracts their content. A common pattern is to map first, then feed the URLs into batch or crawl.
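As a rough sketch of the map-first pattern, the snippet below narrows a mapped URL list to the sections you care about and splits it into fixed-size groups for a follow-up job. The helper names and the group size are illustrative assumptions; how batch or crawl actually accepts URLs is not covered here.

```python
from typing import Iterator
from urllib.parse import urlparse


def select_urls(urls: list[str], path_prefixes: tuple[str, ...]) -> list[str]:
    """Keep URLs whose path starts with one of the prefixes, preserving order."""
    return [u for u in urls if (urlparse(u).path or "/").startswith(path_prefixes)]


def chunked(urls: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive groups of at most `size` URLs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


# URLs taken from the sample map output shown on this page.
mapped = [
    "https://webclaw.io",
    "https://webclaw.io/pricing",
    "https://webclaw.io/sponsor",
    "https://webclaw.io/blog",
    "https://webclaw.io/changelog",
    "https://webclaw.io/products/cli",
    "https://webclaw.io/products/api",
    "https://webclaw.io/products/mcp",
]

wanted = select_urls(mapped, ("/products/", "/blog"))
groups = list(chunked(wanted, 3))  # each group would become one follow-up request
```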

Does it work if a site has no sitemap?

Map only reads robots.txt and sitemaps, so its results are limited to what those sources expose. If a site publishes neither, there is little or nothing for map to return, and you will want crawl instead, which follows links from the pages themselves.

Am I billed for failed requests?

No. Credits are only consumed on successful responses. A standard page is 1 credit; heavier work like JS rendering or protected-site access costs a few extra credits.

Ship an agent that actually sees the web.

One credit pool, every endpoint. Cancel anytime, or self-host the open-source core for free.

Every endpoint

Web Scraping API
HTML to Markdown API
Web Crawler API
Web Search API
Batch Scraping API
AI Web Extraction API
Webpage Summarization API
Website Change Monitoring API
Brand Data API
Deep Research API
YouTube Transcript API
Lead Enrichment API