webclaw

TypeScript SDK

The TypeScript SDK provides a fully typed client for every webclaw endpoint. It has zero runtime dependencies (it uses the native fetch API) and ships both ESM and CJS builds, bundled with tsup.

Installation

npm
npm install webclaw
pnpm
pnpm add webclaw
Note
Requires Node.js 18+ (for native fetch). Also works in Bun and Deno.

Configuration

Create a client by passing a config object. Only apiKey is required.

Basic
import { Webclaw } from "webclaw";

const client = new Webclaw({ apiKey: "wc_your_api_key" });

Options

Property | Type   | Default                | Description
apiKey   | string | --                     | Your webclaw API key (starts with wc_).
baseUrl  | string | https://api.webclaw.io | Override for self-hosted instances.
timeout  | number | 30000                  | Request timeout in milliseconds.
All options
const client = new Webclaw({
  apiKey: "wc_your_api_key",
  baseUrl: "https://api.webclaw.io",  // default
  timeout: 60_000,                     // ms
});
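
In practice the API key usually comes from the environment rather than from source code. The following is a minimal sketch of that pattern, not part of the SDK: the ClientConfig interface is a local stand-in that mirrors the options table above, and the variable names WEBCLAW_API_KEY, WEBCLAW_BASE_URL and WEBCLAW_TIMEOUT_MS are hypothetical names chosen for illustration.

```typescript
// Local stand-in mirroring the documented options; the real package
// exports its own WebclawConfig type.
interface ClientConfig {
  apiKey: string;
  baseUrl: string;
  timeout: number;
}

// The WEBCLAW_* variable names are assumptions made for this sketch,
// not names the SDK reads on its own.
function configFromEnv(env: Record<string, string | undefined>): ClientConfig {
  const apiKey = env.WEBCLAW_API_KEY;
  if (!apiKey || !apiKey.startsWith("wc_")) {
    throw new Error("WEBCLAW_API_KEY is missing or does not start with wc_");
  }

  // Fall back to the documented default of 30000 ms.
  const timeout = Number(env.WEBCLAW_TIMEOUT_MS ?? 30_000);
  if (!Number.isFinite(timeout) || timeout <= 0) {
    throw new Error("WEBCLAW_TIMEOUT_MS must be a positive number");
  }

  return {
    apiKey,
    baseUrl: env.WEBCLAW_BASE_URL ?? "https://api.webclaw.io",
    timeout,
  };
}
```

On Node.js the result can be passed straight to the constructor, for example new Webclaw(configFromEnv(process.env)). Failing fast on a malformed key surfaces configuration mistakes at startup rather than as a 401 on the first request.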

Scrape

typescript
const result = await client.scrape({
  url: "https://example.com",
  formats: ["markdown", "text", "llm"],
  include_selectors: ["article", ".content"],
  exclude_selectors: ["nav", "footer"],
  only_main_content: true,
  no_cache: true,
});

result.url       // string
result.markdown  // string | undefined
result.text      // string | undefined
result.llm       // string | undefined
result.metadata  // PageMetadata
result.cache     // { status: "hit" | "miss" | "bypass" }
result.warning   // string | undefined

Crawl

client.crawl() starts a crawl and returns a CrawlJob handle that you can poll or wait on.

typescript
const job = await client.crawl({
  url: "https://example.com",
  max_depth: 3,
  max_pages: 100,
  use_sitemap: true,
});

// Poll until complete
const status = await job.waitForCompletion({
  interval: 2_000,   // ms between polls (default 2s)
  maxWait: 300_000,  // max total wait (default 5min)
});

console.log(status.status);    // "completed" | "failed"
console.log(status.total);     // pages discovered
console.log(status.completed); // pages crawled
console.log(status.errors);    // pages that errored

for (const page of status.pages) {
  console.log(page.url, page.markdown?.length);
}
Tip
You can also poll manually with job.getStatus() if you want custom retry logic.
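
As one possible shape for such custom logic, here is a sketch of a polling loop whose interval grows between polls. It is typed against a small structural interface (PollableJob, defined here for illustration) rather than the SDK's own CrawlJob type, and it assumes that "completed" and "failed" are the only terminal states, as in the example above. The helper name pollWithBackoff and its option names are hypothetical.

```typescript
// Structural stand-ins: only getStatus() and the status field are used.
interface StatusLike {
  status: string;
}
interface PollableJob<S extends StatusLike> {
  getStatus(): Promise<S>;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls until a terminal status is seen. The wait between polls is
// multiplied by `factor` each round and capped at `maxIntervalMs`.
async function pollWithBackoff<S extends StatusLike>(
  job: PollableJob<S>,
  opts: {
    initialMs?: number;
    maxIntervalMs?: number;
    factor?: number;
    maxWaitMs?: number;
  } = {},
): Promise<S> {
  const {
    initialMs = 1_000,
    maxIntervalMs = 10_000,
    factor = 1.5,
    maxWaitMs = 300_000,
  } = opts;
  const deadline = Date.now() + maxWaitMs;
  let interval = initialMs;

  for (;;) {
    const status = await job.getStatus();
    // Assumption: these two values are the only terminal states.
    if (status.status === "completed" || status.status === "failed") {
      return status;
    }
    // Give up if the next sleep would run past the overall deadline.
    if (Date.now() + interval > deadline) {
      throw new Error(`Crawl did not finish within ${maxWaitMs} ms`);
    }
    await sleep(interval);
    interval = Math.min(interval * factor, maxIntervalMs);
  }
}
```

A growing interval keeps short crawls responsive while reducing the number of status requests made for long ones.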

Map

typescript
const result = await client.map({ url: "https://example.com" });
console.log(result.count);
result.urls.forEach((url) => console.log(url));

Batch

typescript
const result = await client.batch({
  urls: ["https://a.com", "https://b.com", "https://c.com"],
  formats: ["markdown"],
  concurrency: 5,
});

for (const item of result.results) {
  if ("error" in item) {
    console.error(item.url, item.error);
  } else {
    console.log(item.url, item.markdown?.length);
  }
}
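
When the failed URLs need to be retried or reported separately, it can help to split the results once instead of branching inside every loop. The sketch below does this with a type guard. The BatchSuccess and BatchFailure interfaces are local stand-ins inferred from the loop above (in particular, treating error as a string is an assumption); the SDK's exported BatchResponse type is the authoritative definition.

```typescript
// Local stand-ins for the two item shapes the loop above distinguishes.
interface BatchSuccess {
  url: string;
  markdown?: string;
}
interface BatchFailure {
  url: string;
  error: string; // assumed to be a string for this sketch
}
type BatchItem = BatchSuccess | BatchFailure;

// Type guard: the presence of an `error` key marks a failed item.
function isFailure(item: BatchItem): item is BatchFailure {
  return "error" in item;
}

// Splits a mixed result list into successes and failures in one pass.
function partitionBatch(items: BatchItem[]): {
  ok: BatchSuccess[];
  failed: BatchFailure[];
} {
  const ok: BatchSuccess[] = [];
  const failed: BatchFailure[] = [];
  for (const item of items) {
    if (isFailure(item)) {
      failed.push(item);
    } else {
      ok.push(item);
    }
  }
  return { ok, failed };
}
```

The URLs to retry are then available as failed.map((f) => f.url), which can be passed to another batch call.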

Extract

LLM-powered structured extraction. Pass a JSON schema or a plain-text prompt.

Schema-based
const result = await client.extract({
  url: "https://example.com/pricing",
  schema: {
    type: "object",
    properties: {
      plans: {
        type: "array",
        items: {
          type: "object",
          properties: {
            name: { type: "string" },
            price: { type: "string" },
          },
        },
      },
    },
  },
});
console.log(result.data);
Prompt-based
const result = await client.extract({
  url: "https://example.com",
  prompt: "Extract all pricing tiers with names and monthly prices",
});
console.log(result.data);
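
Because the extracted data is produced by an LLM, it is prudent to check its shape at runtime before using it, even when a schema was supplied. The following sketch validates data against the pricing schema from the schema-based example. The Plan interface and the parsePlans helper are hypothetical names introduced here, and the sketch assumes nothing about the static type of result.data beyond treating it as unknown.

```typescript
interface Plan {
  name: string;
  price: string;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Validates that `data` looks like { plans: [{ name, price }, ...] }
// and returns the plans as a typed array, or throws with the first problem.
function parsePlans(data: unknown): Plan[] {
  if (!isRecord(data) || !Array.isArray(data.plans)) {
    throw new Error("Expected an object with a plans array");
  }
  return data.plans.map((entry: unknown, i: number) => {
    if (
      !isRecord(entry) ||
      typeof entry.name !== "string" ||
      typeof entry.price !== "string"
    ) {
      throw new Error(`plans[${i}] is missing a string name or price`);
    }
    return { name: entry.name, price: entry.price };
  });
}
```

Calling parsePlans(result.data) after the schema-based request yields a Plan[] that the rest of the code can rely on without casts.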

Summarize

typescript
const result = await client.summarize({
  url: "https://example.com",
  max_sentences: 3,
});
console.log(result.summary);

Brand

typescript
const result = await client.brand({ url: "https://example.com" });
console.log(result); // { colors: [...], fonts: [...], logo_url: "..." }

Error handling

All errors extend WebclawError. Use instanceof for specific cases.

Class               | HTTP status | Properties
AuthenticationError | 401         | message
NotFoundError       | 404         | message
RateLimitError      | 429         | message, retryAfter
TimeoutError        | --          | message, timeout
typescript
import {
  WebclawError,
  AuthenticationError,
  RateLimitError,
  TimeoutError,
} from "webclaw";

try {
  const result = await client.scrape({ url: "https://example.com" });
} catch (err) {
  if (err instanceof AuthenticationError) {
    console.error("Check your API key");
  } else if (err instanceof RateLimitError) {
    console.error("Rate limited, retry after:", err.retryAfter, "s");
  } else if (err instanceof TimeoutError) {
    console.error("Request timed out after", err.timeout, "ms");
  } else if (err instanceof WebclawError) {
    console.error("API error:", err.statusCode, err.message);
  } else {
    throw err; // not an SDK error: let it propagate
  }
}
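
The retryAfter value makes it straightforward to retry automatically instead of only logging. Here is a minimal sketch of such a wrapper. The helper name withRateLimitRetry is hypothetical, and the sketch duck-types on a numeric retryAfter property (in seconds, as in the example above) so that it runs without the SDK installed; in real code, err instanceof RateLimitError is the more precise check.

```typescript
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Returns the retry delay in seconds if `err` carries a numeric
// retryAfter property, otherwise undefined.
function retryAfterSeconds(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null && "retryAfter" in err) {
    const value = (err as { retryAfter: unknown }).retryAfter;
    return typeof value === "number" ? value : undefined;
  }
  return undefined;
}

// Runs `fn`, waiting and retrying when it fails with a rate-limit error.
// Any other error, or a rate limit on the final attempt, is rethrown.
async function withRateLimitRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const waitSeconds = retryAfterSeconds(err);
      if (waitSeconds === undefined || attempt >= maxAttempts) throw err;
      await sleep(waitSeconds * 1_000);
    }
  }
}
```

A call such as withRateLimitRetry(() => client.scrape({ url: "https://example.com" })) then waits for the server-suggested delay before each retry, while authentication and timeout errors still reach the caller unchanged.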

TypeScript types

All request and response interfaces are exported from the package root.

typescript
import type {
  ScrapeRequest,
  ScrapeResponse,
  CrawlRequest,
  CrawlStatusResponse,
  MapResponse,
  BatchRequest,
  BatchResponse,
  ExtractRequest,
  ExtractResponse,
  SummarizeResponse,
  BrandResponse,
  Format,
  PageMetadata,
  WebclawConfig,
} from "webclaw";

Source

github.com/0xMassi/webclaw-js