FireScraper API Documentation

Use the TypeScript SDK or the REST API to start crawls, monitor progress, and download results. Everything you need to build web scraping into your pipeline.

Quick start

Zero dependencies. Works in Node.js 18+, Bun, Deno, and Cloudflare Workers.

Terminal
npm install @firescraper/sdk

TypeScript
import { FireScraper } from '@firescraper/sdk';

const client = new FireScraper('fsk_your_api_key');

// Start a crawl
const session = await client.scrape({
  name: 'Docs crawl',
  urls: ['https://docs.example.com/'],
  maxDepth: 2,
  scraper: 'article',
});

// Wait for it to finish
const result = await client.waitForCompletion(session.id, {
  onProgress: (s) => console.log(`${s.counts.success} pages scraped`),
});

// Download clean Markdown for your RAG pipeline
const download = await client.getResults(session.id, 'markdown');

Authentication

Every request requires an API key. Create keys on the API Keys page of the dashboard. Keys start with fsk_ and are shown only once, so store each key securely when you create it.

Authorization: Bearer fsk_your_api_key

Rate limits

POST /api/v1/scrape — 30 req/min per key

GET /api/v1/sessions/:id — 120 req/min per key

GET /api/v1/sessions/:id/results — 60 req/min per key
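
One way to stay under these limits is to space requests on the client. The sketch below derives the minimum gap between calls from a per-minute limit and runs tasks no faster than that. minIntervalMs and runPaced are hypothetical helpers for illustration, not part of the SDK.

```typescript
// Hypothetical helpers (not part of @firescraper/sdk).

/** Minimum spacing in milliseconds between requests for a per-minute limit. */
function minIntervalMs(requestsPerMinute: number): number {
  return Math.ceil(60_000 / requestsPerMinute);
}

/** Run tasks one at a time, starting each no sooner than intervalMs after the previous start. */
async function runPaced<T>(
  tasks: Array<() => Promise<T>>,
  intervalMs: number,
): Promise<T[]> {
  const results: T[] = [];
  let nextStart = Date.now();
  for (const task of tasks) {
    const wait = nextStart - Date.now();
    if (wait > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, wait));
    }
    nextStart = Date.now() + intervalMs;
    results.push(await task());
  }
  return results;
}
```

With the limits listed above, minIntervalMs(30) gives 2000 ms between POST /api/v1/scrape calls and minIntervalMs(120) gives 500 ms between session status calls.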

SDK methods

client.scrape(options)

Start a new crawl. Returns immediately with a session object whose id identifies the crawl; it does not wait for the crawl to finish.

client.getSession(id)

Get session status, page counts, and queue depth.

client.waitForCompletion(id, opts?)

Poll until the crawl finishes. Supports onProgress callbacks.

client.listResults(id)

List the export files available once the crawl completes.

client.getResults(id, format)

Download results in a specific format.

client.getPartialResults(id, fmt?)

Download the pages scraped so far while the crawl is still running.
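
To make the relationship between getSession and waitForCompletion concrete, here is a minimal sketch of a polling loop built on getSession. It is not the SDK's implementation. The SessionReader interface, the pollUntilDone name, the default interval, and the rule that any status other than "in-progress" is final are all assumptions, based only on the response shapes shown in this document.

```typescript
// Shape assumed from the GET /api/v1/sessions/:id response in this document.
interface SessionStatus {
  session: { id: string; status: string };
  counts: { success: number; warning: number; error: number; total: number };
}

// Minimal slice of the client this sketch needs (hypothetical interface).
interface SessionReader {
  getSession(id: string): Promise<SessionStatus>;
}

async function pollUntilDone(
  client: SessionReader,
  id: string,
  opts: {
    intervalMs?: number;
    maxPolls?: number;
    onProgress?: (s: SessionStatus) => void;
  } = {},
): Promise<SessionStatus> {
  const { intervalMs = 5_000, maxPolls = 720, onProgress } = opts;
  for (let i = 0; i < maxPolls; i++) {
    const status = await client.getSession(id);
    onProgress?.(status);
    // Assumption: "in-progress" is the only non-final status.
    if (status.session.status !== 'in-progress') return status;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Session ${id} still in progress after ${maxPolls} polls`);
}
```

A 5-second default interval is 12 requests per minute, well under the 120 req/min limit on the session endpoint.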

Feed a RAG pipeline

TypeScript
const session = await client.scrape({
  name: 'Knowledge base',
  urls: ['https://docs.example.com/'],
  maxDepth: 4,
  scraper: 'article',
  respectRobotsTxt: true,
});

await client.waitForCompletion(session.id);
const docs = await client.getResults(session.id, 'documents');
const text = new TextDecoder().decode(docs.data);

for (const line of text.split('\n').filter(Boolean)) {
  const doc = JSON.parse(line);
  await vectorStore.upsert(doc.document_id, doc.text);
}
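
The loop above trusts every line to be valid JSON with the expected fields. A slightly more defensive variant is sketched below: it tolerates CRLF line endings and blank lines, and reports the offending line number on failure. parseDocumentsJsonl and the ScrapedDocument type are hypothetical; only the document_id and text field names come from the example above.

```typescript
// Field names document_id and text are taken from the example above;
// any other fields in the artifact are left untyped here.
interface ScrapedDocument {
  document_id: string;
  text: string;
  [key: string]: unknown;
}

function parseDocumentsJsonl(jsonl: string): ScrapedDocument[] {
  const docs: ScrapedDocument[] = [];
  jsonl.split(/\r?\n/).forEach((line, index) => {
    if (line.trim() === '') return; // skip blank lines, e.g. a trailing newline
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      throw new Error(`Invalid JSON on line ${index + 1}`);
    }
    const doc = parsed as Partial<ScrapedDocument> | null;
    if (!doc || typeof doc.document_id !== 'string' || typeof doc.text !== 'string') {
      throw new Error(`Line ${index + 1} is missing document_id or text`);
    }
    docs.push(doc as ScrapedDocument);
  });
  return docs;
}
```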

Structured extraction

TypeScript
const session = await client.scrape({
  name: 'Product catalog',
  urls: ['https://shop.example.com/products'],
  maxDepth: 2,
  extractionSchema: {
    type: 'object',
    properties: {
      product_name: { type: 'string' },
      price: { type: 'number' },
      in_stock: { type: 'boolean' },
    },
  },
});

await client.waitForCompletion(session.id);
const extracted = await client.getResults(session.id, 'extracted');
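
Before loading extracted data downstream, it can help to check each record against the schema you sent. The layout of the extracted output is not specified here, so this sketch assumes each record is a flat JSON object keyed by the schema's property names. FlatSchema and typeMismatches are hypothetical names.

```typescript
// Covers only the flat string/number/boolean schema style used in the example above.
type FlatSchema = {
  type: 'object';
  properties: Record<string, { type: 'string' | 'number' | 'boolean' }>;
};

// Returns the names of properties whose value is present but has the wrong type.
// Missing properties are allowed, because the example schema declares no `required` list.
function typeMismatches(
  record: Record<string, unknown>,
  schema: FlatSchema,
): string[] {
  return Object.entries(schema.properties)
    .filter(([key, spec]) => {
      const value = record[key];
      return value !== undefined && value !== null && typeof value !== spec.type;
    })
    .map(([key]) => key);
}
```

An empty array means every present field matches its declared type.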

REST API

Use the REST API directly with curl, Python, Go, or any HTTP client.

POST /api/v1/scrape

Field                Type                Req  Description
name                 string              Yes  Project name.
urls                 string[]            Yes  One or more seed URLs.
ignoreUrls           string[]            No   URLs to exclude.
maxDepth             number              No   Link-hop depth (0 = seed only).
minTextLength        number              No   Minimum word count per page.
scraper              "article" | "full"  No   Extraction mode. Default: article.
uniqueTextDownloads  boolean             No   Deduplicate text content.
respectRobotsTxt     boolean             No   Honour robots.txt rules.
contentSelector      string              No   CSS selector to restrict extraction. Max 500 chars.
webhookUrl           string              No   POST callback when crawl finishes.
extractionSchema     string (JSON)       No   JSON Schema for structured extraction.

curl
curl -X POST https://firescraper.com/api/v1/scrape \
  -H "Authorization: Bearer fsk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Vendor docs crawl",
    "urls": ["https://docs.vendor.com/"],
    "maxDepth": 2,
    "scraper": "article",
    "respectRobotsTxt": true,
    "contentSelector": "main article",
    "webhookUrl": "https://example.com/webhook"
  }'
Response (201)
{
  "id": "SESSION_ID",
  "status": "in-progress",
  "message": "Scrape session created successfully.",
  "webhookSecret": "whsec_abc123..."
}
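
The same request can be made from TypeScript without the SDK. This is a minimal sketch: startCrawl and the FetchLike type are hypothetical, and the fetch function is passed in so the code does not depend on one runtime's type definitions. It treats any status other than the documented 201 as a failure.

```typescript
// Minimal fetch-like signature; the global fetch in Node.js 18+ should satisfy it.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: string },
) => Promise<{ status: number; ok: boolean; json(): Promise<unknown> }>;

interface StartCrawlResponse {
  id: string;
  status: string;
  message: string;
  webhookSecret?: string; // assumption: may be absent when no webhookUrl is set
}

async function startCrawl(
  fetchFn: FetchLike,
  apiKey: string,
  body: { name: string; urls: string[]; [option: string]: unknown },
): Promise<StartCrawlResponse> {
  const res = await fetchFn('https://firescraper.com/api/v1/scrape', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(body),
  });
  if (res.status !== 201) {
    throw new Error(`Scrape request failed with HTTP ${res.status}`);
  }
  return (await res.json()) as StartCrawlResponse;
}
```

Because webhookSecret is shown only once, persist it from this response if you plan to verify webhook signatures.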

GET /api/v1/sessions/:id

curl
curl https://firescraper.com/api/v1/sessions/SESSION_ID \
  -H "Authorization: Bearer fsk_your_api_key"
Response (200)
{
  "session": {
    "id": "SESSION_ID",
    "name": "Vendor docs crawl",
    "status": "in-progress",
    "downloadFilesReady": false
  },
  "counts": {
    "success": 124, "warning": 3,
    "error": 1, "total": 128
  },
  "processing": {
    "serverInstancesCount": 3,
    "queueLength": 41
  }
}
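
For logging or dashboards, the response can be typed and reduced to a one-line summary. The interface below mirrors only the fields shown in the sample response (the real payload may carry more), and summarize is a hypothetical helper.

```typescript
// Mirrors the sample response above; additional fields may exist.
interface SessionStatusResponse {
  session: { id: string; name: string; status: string; downloadFilesReady: boolean };
  counts: { success: number; warning: number; error: number; total: number };
  processing: { serverInstancesCount: number; queueLength: number };
}

function summarize(res: SessionStatusResponse): string {
  const { session, counts, processing } = res;
  // Guard against division by zero before any page has been processed.
  const failureRate = counts.total === 0 ? 0 : counts.error / counts.total;
  return (
    `${session.name} [${session.status}]: ` +
    `${counts.success}/${counts.total} pages ok, ` +
    `warnings: ${counts.warning}, errors: ${counts.error} ` +
    `(${(failureRate * 100).toFixed(1)}% failed), ` +
    `queue: ${processing.queueLength}`
  );
}
```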

GET /api/v1/sessions/:id/results

curl
# List available files
curl https://firescraper.com/api/v1/sessions/SESSION_ID/results \
  -H "Authorization: Bearer fsk_your_api_key"

# Download a specific format
curl -L "https://firescraper.com/api/v1/sessions/SESSION_ID/results?format=markdown" \
  -H "Authorization: Bearer fsk_your_api_key" \
  -o corpus.md

# Partial export (mid-crawl)
curl -L "https://firescraper.com/api/v1/sessions/SESSION_ID/results?partial=true&format=csv" \
  -H "Authorization: Bearer fsk_your_api_key" \
  -o partial.csv
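
When calling this endpoint from code, building the query string with the standard URL API avoids hand-escaping mistakes. resultsUrl is a hypothetical helper; the format and partial parameters are the ones used in the curl examples above.

```typescript
function resultsUrl(
  sessionId: string,
  opts: { format?: string; partial?: boolean } = {},
): string {
  const url = new URL(
    `/api/v1/sessions/${encodeURIComponent(sessionId)}/results`,
    'https://firescraper.com',
  );
  // Only add parameters that were requested; no parameters lists the available files.
  if (opts.partial) url.searchParams.set('partial', 'true');
  if (opts.format) url.searchParams.set('format', opts.format);
  return url.toString();
}
```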

PATCH /api/v1/sessions/:id

Rotate the webhook signing secret for a session.

curl
curl -X PATCH https://firescraper.com/api/v1/sessions/SESSION_ID \
  -H "Authorization: Bearer fsk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{ "action": "rotate_webhook_secret" }'

Result formats

csv

Tabular export — URL, title, text, word count, status.

json

Full JSON array of all scraped pages.

jsonl

Newline-delimited JSON. One document per line.

markdown

Clean Markdown. Typically uses fewer LLM tokens than the equivalent HTML.

zip

All formats bundled in a single ZIP archive.

documents

JSONL documents artifact with metadata.

chunks

JSONL chunks artifact for vector stores.

extracted

Structured extraction output (requires schema).

manifest

Crawl manifest with summary and file index.
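
If the format name comes from user input or configuration, a small type guard can reject unsupported values before a request is made (the API answers 400 for an unsupported format). The names below are hypothetical; the list itself is the set of formats described above.

```typescript
const RESULT_FORMATS = [
  'csv', 'json', 'jsonl', 'markdown', 'zip',
  'documents', 'chunks', 'extracted', 'manifest',
] as const;

type ResultFormat = (typeof RESULT_FORMATS)[number];

// Narrows an arbitrary string to one of the documented format names.
function isResultFormat(value: string): value is ResultFormat {
  return (RESULT_FORMATS as readonly string[]).includes(value);
}
```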

Webhooks

Include a webhookUrl when starting a crawl. FireScraper sends a POST when exports are ready. Deliveries retry up to 3 times with exponential backoff.

Webhook payload
{
  "event": "session.completed",
  "occurredAt": "2026-05-18T14:30:00.000Z",
  "sessionId": "SESSION_ID",
  "session": {
    "id": "SESSION_ID",
    "name": "Vendor docs crawl",
    "status": "done"
  },
  "files": [
    { "format": "csv", "fileName": "corpus-csv.csv" },
    { "format": "zip", "fileName": "corpus-zip.zip" }
  ]
}
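
A receiver usually needs only the event name and the list of ready files. The sketch below types the payload after the sample above and extracts the ready formats. WebhookPayload and readyFormats are hypothetical names, and ignoring any event other than session.completed is an assumption, since no other events are documented here.

```typescript
// Mirrors the sample payload above; additional fields may exist.
interface WebhookPayload {
  event: string;
  occurredAt: string;
  sessionId: string;
  session: { id: string; name: string; status: string };
  files: Array<{ format: string; fileName: string }>;
}

// Parse a raw webhook body and return the formats that are ready to download.
function readyFormats(rawBody: string): string[] {
  const payload = JSON.parse(rawBody) as Partial<WebhookPayload>;
  // Assumption: only session.completed carries a usable file list.
  if (payload.event !== 'session.completed') return [];
  return (payload.files ?? []).map((file) => file.format);
}
```

Keep the raw request body around unparsed as well, because signature verification is computed over the raw bytes rather than the parsed object.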

Verifying signatures

The POST /api/v1/scrape response includes a webhookSecret (shown once). Each delivery includes an x-firescraper-signature header:

t=<unix_timestamp>,v1=<hmac_sha256_hex>

Compute HMAC-SHA256(secret, "<timestamp>.<raw_body>") and compare with the v1 value using constant-time comparison.

Lost your secret? Use the rotate_webhook_secret action via PATCH to generate a new one.
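
The verification steps above can be sketched with Node.js's built-in crypto module. verifySignature is a hypothetical helper. The 300-second tolerance is an assumption (no replay window is documented), as is treating the timestamp as Unix seconds; pass Infinity as the tolerance to skip the freshness check.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

function verifySignature(
  secret: string,
  header: string, // value of x-firescraper-signature
  rawBody: string, // request body exactly as received, before JSON parsing
  toleranceSeconds = 300, // assumption: the docs do not specify a replay window
  nowSeconds = Math.floor(Date.now() / 1000),
): boolean {
  // Header format: t=<unix_timestamp>,v1=<hmac_sha256_hex>
  const parts = new Map<string, string>();
  for (const piece of header.split(',')) {
    const idx = piece.indexOf('=');
    if (idx > 0) parts.set(piece.slice(0, idx).trim(), piece.slice(idx + 1).trim());
  }
  const timestamp = parts.get('t');
  const received = parts.get('v1');
  if (!timestamp || !received) return false;

  // Reject non-numeric or stale timestamps (assumes Unix seconds).
  const t = Number(timestamp);
  if (!Number.isFinite(t) || Math.abs(nowSeconds - t) > toleranceSeconds) return false;

  const expected = createHmac('sha256', secret)
    .update(`${timestamp}.${rawBody}`)
    .digest();
  const receivedBuf = Buffer.from(received, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return receivedBuf.length === expected.length && timingSafeEqual(receivedBuf, expected);
}
```

Verify against the raw body string. Re-serializing parsed JSON can change whitespace or key order and make a valid signature fail.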

Error codes

400  Invalid request body or unsupported format.
401  Missing or invalid API key.
404  Session or artifact not found.
429  Rate limit exceeded. Check the Retry-After header.
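
A client can honour the Retry-After header on 429 responses with a small wrapper. This is a sketch under assumptions: retryAfterMs and withRateLimitRetry are hypothetical helpers, the header is assumed to carry delta-seconds (HTTP also allows a date form, which is not handled), and the attempt cap and fallback delay are arbitrary.

```typescript
// Minimal response shape; the standard fetch Response satisfies it.
type HttpResult = { status: number; headers: { get(name: string): string | null } };

// Parse Retry-After given as delta-seconds; returns null for anything else.
function retryAfterMs(value: string | null): number | null {
  if (value === null || !/^\d+$/.test(value.trim())) return null;
  return Number(value.trim()) * 1000;
}

// Re-issue the request while it returns 429, up to maxAttempts in total.
async function withRateLimitRetry<T extends HttpResult>(
  request: () => Promise<T>,
  maxAttempts = 3,
  fallbackDelayMs = 1_000,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    const res = await request();
    if (res.status !== 429 || attempt >= maxAttempts) return res;
    const delay = retryAfterMs(res.headers.get('Retry-After')) ?? fallbackDelayMs;
    await new Promise<void>((resolve) => setTimeout(resolve, delay));
  }
}
```

The wrapper returns the last response even if it is still a 429, so the caller decides how to report the failure.
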
Need help?

Open a support ticket. Include your session ID for faster resolution.
