POST /v1/diff

Content change monitoring for terms, pricing, and policy pages

Snapshot any page, diff it on a schedule, and get alerted only when the content actually changes.

Track when a terms of service, pricing, policy, or docs page changes. webclaw snapshots the page as clean markdown, diffs it against the last version, and tells you exactly what changed.

How it works

Build it step by step.

The real flow, one step at a time.

  1. Scrape a baseline

    Call /v1/scrape with formats set to ["markdown"] to capture the page as clean text, then store the result as your first snapshot.

    const url = "https://example.com/terms";

    // Capture the page as clean markdown
    const baseline = await webclaw.scrape({ url, formats: ["markdown"] });

    // Store it as your first snapshot
    await saveSnapshot(url, baseline.markdown);
  2. Diff on a schedule

    On each run, scrape the page again and pass the new version plus the stored snapshot to /v1/diff to get added and removed content.

    // On each run, scrape the page again
    const current = await webclaw.scrape({ url, formats: ["markdown"] });

    // Compare it against the last stored snapshot
    const diff = await webclaw.diff({
      url,
      previous: await loadLastSnapshot(url),
      current: current.markdown,
    });
  3. Alert on real changes

    When diff.changed is true, fire a webhook to Slack or Discord and persist the new version as the next baseline.

    if (diff.changed) {
      // Fire a webhook to Slack or Discord
      await notify(url, diff.changes); // added / removed lines

      // Persist the new version as the next baseline
      await saveSnapshot(url, current.markdown);
    }
  4. Replay from history

    Use the dashboard or cached replay to inspect any past snapshot and confirm exactly what a page said at a given time.
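
The snippets in the steps above call saveSnapshot, loadLastSnapshot, and notify without defining them. The sketch below shows one possible shape for those helpers. Everything in it is an assumption for illustration: the snapshot directory, the file naming, and the idea that diff.changes carries `added` and `removed` string arrays are not taken from the webclaw docs. The storage helpers are synchronous for brevity, and `await` on their return values still works in the snippets above. A null result from loadLastSnapshot marks the first run, where there is nothing to diff yet.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Assumption: snapshots are kept as files in a local directory.
// A database or object store would work just as well.
const SNAPSHOT_DIR = join(tmpdir(), "webclaw-snapshots");

// Hash the URL so it is safe to use as a file name.
function snapshotPath(url: string): string {
  const key = createHash("sha256").update(url).digest("hex");
  return join(SNAPSHOT_DIR, `${key}.md`);
}

// Overwrite the stored snapshot for this URL with the latest markdown.
function saveSnapshot(url: string, markdown: string): void {
  mkdirSync(SNAPSHOT_DIR, { recursive: true });
  writeFileSync(snapshotPath(url), markdown, "utf8");
}

// Return the last stored markdown, or null if this is the first run.
function loadLastSnapshot(url: string): string | null {
  const path = snapshotPath(url);
  return existsSync(path) ? readFileSync(path, "utf8") : null;
}

// Hypothetical shape for diff.changes (not confirmed by the source).
interface DiffChanges {
  added: string[];
  removed: string[];
}

// Build a Slack incoming-webhook body: a JSON object with a "text" field.
function buildSlackPayload(url: string, changes: DiffChanges): { text: string } {
  const lines = [
    `Content changed: ${url}`,
    ...changes.added.map((line) => `+ ${line}`),
    ...changes.removed.map((line) => `- ${line}`),
  ];
  return { text: lines.join("\n") };
}
```

A notify helper could then POST the result of buildSlackPayload as JSON to a Slack incoming webhook URL. This store keeps only the latest version per URL. Keeping every version, for example by adding a timestamp to the file name, would be needed to compare any two snapshots later.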

Why webclaw

Built for content change monitoring.

Diffs clean markdown, not raw HTML, so layout and ad noise do not trigger false alerts

Automatic bot-protection handling for pages other scrapers cannot reach

Snapshot and replay model: store a version, compare any two later

118ms on static pages makes watching hundreds of pages affordable

Dashboard history records every request, response, timing, and cost

What you get

Everything this use case needs.

  • Markdown snapshots with boilerplate stripped
  • Structured diff of added and removed content
  • Scheduled re-checks with webhook alerts
  • Cached replay for fast debugging
  • Bot-protection handling on gated pages

Where it fits

Built for the messy parts.

Terms of service, privacy policies, pricing, SLAs, and supplier docs change without notice. Watching them by hand does not scale past a handful of pages, and naive HTML scrapers flag every layout tweak, ad rotation, or session token as a change, so you drown in false positives and miss the edits that matter.

webclaw scrapes each page to clean markdown with navigation, ads, and boilerplate stripped, stores it as a snapshot, then uses /v1/diff to compare the current version against the previous one. You get a structured diff of the meaningful text only, so you can alert on a clause being added to a contract or a price tier changing, not on cosmetic noise.
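
To make "a structured diff of the meaningful text" concrete, here is a small local illustration of comparing two markdown snapshots. It is not webclaw's algorithm, and the result shape (`changed`, `added`, `removed`) is an assumption. The function name diffMarkdown is hypothetical. The technique is a simple set-based line comparison, so it ignores whitespace and blank lines but does not detect reordered or duplicated lines the way a true sequence diff would.

```typescript
// Assumed result shape, for illustration only.
interface LineDiff {
  changed: boolean;
  added: string[];
  removed: string[];
}

// Split markdown into trimmed, non-empty lines.
function toLines(text: string): string[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Report lines present in only one of the two snapshots.
function diffMarkdown(previous: string, current: string): LineDiff {
  const before = toLines(previous);
  const after = toLines(current);
  const beforeSet = new Set(before);
  const afterSet = new Set(after);

  const added = after.filter((line) => !beforeSet.has(line));
  const removed = before.filter((line) => !afterSet.has(line));

  return { changed: added.length > 0 || removed.length > 0, added, removed };
}

const oldTerms = "# Terms\n\nRefunds are available within 30 days.";
const newTerms =
  "# Terms\n\nRefunds are available within 14 days.\n\nDisputes go to arbitration.";

const result = diffMarkdown(oldTerms, newTerms);
// result.changed is true
// result.added:   ["Refunds are available within 14 days.", "Disputes go to arbitration."]
// result.removed: ["Refunds are available within 30 days."]
```

Because both inputs are already clean markdown, a changed refund window or a new arbitration clause shows up as a line-level edit, while markup that never reached the markdown cannot trigger a change.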

Common questions

Frequently asked questions

How is this different from price monitoring?

Price monitoring tracks numeric fields like price and stock on product pages. Content change monitoring watches the full text of a page (terms of service, privacy policies, pricing tables, SLAs, or docs) and tells you which sentences or clauses were added or removed.

How does webclaw avoid false positives from layout changes?

webclaw diffs clean markdown, not raw HTML. Navigation, ads, scripts, and boilerplate are stripped before the comparison, so a redesign or an ad rotation does not register as a change. Only the meaningful body text is compared.

Can I run checks on a schedule and get alerted?

Yes. Run /v1/diff on a cron or job runner against your stored snapshots, and fire a webhook when diff.changed is true. webclaw signs webhook payloads with HMAC and can format them for Slack or Discord.
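
Since webhook payloads are HMAC-signed, the receiving end can check the signature before trusting an alert. The sketch below shows the general pattern with Node's crypto module. The source does not state the hash function, the signature encoding, or how the signature is delivered, so HMAC-SHA256 over the raw request body with a hex-encoded signature is an assumption here, and verifyWebhookSignature is a hypothetical helper name. Check the API docs for the actual scheme.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumptions: HMAC-SHA256 over the raw body, signature sent as hex.
function verifyWebhookSignature(
  rawBody: string,
  signatureHex: string,
  secret: string
): boolean {
  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
  const received = Buffer.from(signatureHex, "hex");

  // timingSafeEqual throws if the lengths differ, so compare lengths first.
  if (received.length !== expected.length) {
    return false;
  }
  return timingSafeEqual(received, expected);
}
```

Verify against the raw body exactly as received, before any JSON parsing, because re-serializing the payload can change its bytes and break the comparison.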

For AI agents

Or hand it to your agent.

Add the webclaw MCP server to Claude, Cursor, or any MCP client, then paste this prompt. The agent calls the webclaw tools and hands the result back to your model, with no code to write.

PROMPT FOR YOUR AGENT

Using the webclaw tools, monitor [the page URL] for meaningful content changes (for example a terms-of-service, pricing, privacy policy, SLA, or docs page). First call the scrape tool on that URL to capture a clean markdown snapshot with navigation, ads, and boilerplate stripped out, and treat that as the current version. Then call the diff tool to compare it against [paste the previous snapshot here, or say "this is the first run" if you have none], so cosmetic noise like layout tweaks or session tokens is ignored and only real text edits surface. If nothing changed, just tell me "no changes." If something changed, return a short alert that names the page, lists the exact clauses or lines that were added and removed, and gives a one-sentence plain-English summary of what it means; then print the full new markdown snapshot so I can save it for the next comparison.
