
Introducing FireScraper: Web Scraping Built for AI Teams

announcement

We built FireScraper because getting clean web data into AI pipelines is harder than it should be.

If you are building a RAG pipeline, training a model, or populating a knowledge base, you need to turn websites into structured text. Most scraping tools were built for a pre-LLM world: they give you raw HTML and leave the parsing to you. FireScraper instead returns clean, structured text that is ready to feed into your pipeline.

What FireScraper Does

FireScraper crawls websites and exports clean, structured text in the format your AI pipeline needs:

  • JSONL — one JSON object per page, ready to stream into embedding workflows
  • Markdown — LLM-optimized output that uses fewer tokens
  • CSV and JSON — for analysis and custom pipelines
  • Structured extraction — define a JSON schema and pull specific fields from every page
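To make the JSONL format concrete, here is a minimal sketch of reading such an export one page at a time. The sample data and the field names ("url", "title", "text") are assumptions for illustration; the actual export schema may differ.

```python
import io
import json

# Hypothetical sample of a JSONL export: one JSON object per line, one line per page.
# The field names ("url", "title", "text") are assumed for illustration only.
SAMPLE_EXPORT = (
    '{"url": "https://example.com/docs/intro", "title": "Intro", "text": "Welcome to the docs."}\n'
    '{"url": "https://example.com/docs/setup", "title": "Setup", "text": "Install the package."}\n'
)


def iter_pages(stream):
    """Yield one dict per non-empty line of a JSONL stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# In practice the stream would be an open file; StringIO keeps the sketch self-contained.
pages = list(iter_pages(io.StringIO(SAMPLE_EXPORT)))
for page in pages:
    print(page["url"], len(page["text"]))
```

Because each line is an independent JSON object, a consumer like this can process pages as they are read instead of loading the whole export into memory.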

You can start crawls visually from the dashboard, or automate everything with the REST API or the TypeScript and Python SDKs.

How It Works

  1. Give it a URL — point FireScraper at a documentation site, blog, or knowledge base
  2. Set your depth — follow links up to N hops from the seed URL
  3. Choose your scraper — the article mode strips navigation, footers, and sidebars automatically
  4. Export — download results as JSONL, Markdown, CSV, or JSON
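The depth setting in step 2 can be pictured as a breadth-first traversal that stops following links after N hops. The sketch below illustrates that idea on a hypothetical in-memory link graph; it is not FireScraper's implementation, and the `crawl_order` function and `LINKS` data are made up for this example.

```python
from collections import deque

# Hypothetical link graph standing in for real pages and their outgoing links.
LINKS = {
    "https://example.com/": ["https://example.com/docs", "https://example.com/blog"],
    "https://example.com/docs": ["https://example.com/docs/api"],
    "https://example.com/blog": ["https://example.com/"],
    "https://example.com/docs/api": ["https://example.com/docs/api/auth"],
    "https://example.com/docs/api/auth": [],
}


def crawl_order(seed, max_depth, links=LINKS):
    """Return (url, depth) pairs reachable within max_depth hops of seed, breadth-first."""
    seen = {seed}
    queue = deque([(seed, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth == max_depth:
            continue  # at the depth limit: record the page but do not follow its links
        for target in links.get(url, []):
            if target not in seen:  # skip pages already queued or visited
                seen.add(target)
                queue.append((target, depth + 1))
    return visited


result = crawl_order("https://example.com/", max_depth=2)
for url, depth in result:
    print(depth, url)
```

With a depth of 2, the page at three hops from the seed (`/docs/api/auth`) is never queued, which is the behavior a depth limit is meant to give.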

The dashboard shows real-time progress as pages are scraped. You can see which pages succeeded, which failed, and what is still in the queue.

Flat, Predictable Pricing

One page scraped equals one credit. Always. No multipliers for JavaScript rendering, structured extraction, or any other feature. Credits never expire.

The free tier gives you 1,000 credits — no credit card required. Paid plans start at $20 for 20,000 credits.
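As a quick worked example of the flat pricing, the sketch below computes the credits a crawl uses and the per-page price on the entry paid plan. The numbers come from this post; the function names are hypothetical helpers for illustration, not part of any SDK.

```python
FREE_TIER_CREDITS = 1_000      # free tier, as stated above
ENTRY_PLAN_CREDITS = 20_000    # entry paid plan, as stated above
ENTRY_PLAN_PRICE_USD = 20.00


def credits_for_crawl(pages: int) -> int:
    """One page scraped equals one credit, regardless of features used."""
    return pages


def fits_free_tier(pages: int) -> bool:
    """True if a crawl of this size can be covered by the free credits alone."""
    return credits_for_crawl(pages) <= FREE_TIER_CREDITS


# Effective price per page on the entry plan: $20 / 20,000 credits.
price_per_page = ENTRY_PLAN_PRICE_USD / ENTRY_PLAN_CREDITS

print(f"Entry plan price per page: ${price_per_page:.3f}")
print("800-page crawl fits the free tier:", fits_free_tier(800))
print("5,000-page crawl fits the free tier:", fits_free_tier(5_000))
```

On the entry plan this works out to a tenth of a cent per page, and the figure does not change when JavaScript rendering or structured extraction is turned on.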

Built-in Scheduling

Set a crawl to run daily, weekly, or monthly. When it completes, a webhook notifies your pipeline to process the fresh data. No cron jobs, no external orchestration — just set it and forget it.
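On the receiving end, the webhook needs a small HTTP endpoint in your pipeline. Here is a minimal sketch using only the Python standard library. The payload fields ("status", "crawl_id") are assumptions, not a documented schema, so check the real webhook payload before relying on them.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_event(payload):
    """Return the crawl id to process if the payload reports a completed crawl, else None."""
    # "status" and "crawl_id" are assumed field names for illustration only.
    if isinstance(payload, dict) and payload.get("status") == "completed":
        return payload.get("crawl_id")
    return None


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the sender declared.
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_response(400)  # malformed body
            self.end_headers()
            return
        crawl_id = handle_event(payload)
        if crawl_id is not None:
            # Placeholder: kick off downstream processing of the fresh data here.
            print(f"crawl {crawl_id} finished")
        self.send_response(204)  # acknowledge with no response body
        self.end_headers()


def serve(port=8000):
    """Start the receiver; blocks until interrupted."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the listener. Keeping the decision logic in `handle_event` separate from the HTTP handler makes it easy to test without opening a socket.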

SDKs and Integrations

FireScraper has official SDKs for both TypeScript and Python:

  • TypeScript: npm install @firescraper/sdk
  • Python: pip install firescraper

The Python SDK includes sync and async clients, plus a LangChain document loader (langchain-firescraper). The REST API works with any language.

Start scraping for free

1,000 free credits. No credit card required. Export to JSONL, Markdown, CSV, and more.