ariangibson/firecrawl-lite-mcp-server
Firecrawl Lite MCP Server
A privacy-first, standalone MCP server that provides web scraping and data extraction tools using local browser automation and your own LLM API key. No external scraping service or additional API key is required.
🎯 What Makes Firecrawl Lite Special
🔒 Privacy-First Architecture
- Local Processing - Browser automation and scraping run on your machine
- Your Data Stays Local - Scraped content is not routed through a third-party scraping service; only the LLM-based extraction tools pass content to the LLM provider you configure
- No External Service Lock-in - Doesn't require a cloud API
- Complete Control - You own your data and infrastructure
💰 Cost-Effective & Transparent
- Pay Only for LLM Usage - No additional subscription or API fees
- Your LLM Provider - Compatible with OpenAI, xAI, Anthropic, Ollama, etc.
- Predictable Costs - Transparent pricing based on your chosen LLM rates
⚡ Performance & Simplicity
- Lightning-Fast Startup - Lightweight design means quick initialization
- Single Container - Simple deployment with Docker support
- Minimal Resource Usage - Optimized for efficiency and low memory footprint
🛠️ Available Tools
✅ scrape_page
- Extract content from a single webpage
- Use case: Get webpage content for LLMs to read
- Parameters: url, onlyMainContent
✅ batch_scrape
- Scrape multiple URLs in a single request
- Use case: Process multiple pages efficiently
- Parameters: urls[], onlyMainContent
✅ extract_data
- Extract structured data using LLM
- Use case: Pull specific data from pages using natural language prompts
- Parameters: urls[], prompt, enableWebSearch
✅ extract_with_schema
- Extract data using JSON schema
- Use case: Extract structured data with predefined schema
- Parameters: urls[], schema, prompt, enableWebSearch
✅ screenshot
- Take a screenshot of a webpage
- Use case: Capture visual representation of pages
- Parameters: url, width, height, fullPage
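The parameter lists above can be summarized as types. The TypeScript sketch below is an illustration only: the parameter names come from the lists above, but which fields are optional and their exact types are assumptions, not the server's published schema. `toolCall` is a hypothetical helper, not part of the package.

```typescript
// Argument shapes for the five tools, inferred from the parameter lists above.
// ASSUMPTION: optionality and exact types are guesses, not the server's schema.
interface ScrapePageArgs { url: string; onlyMainContent?: boolean }
interface BatchScrapeArgs { urls: string[]; onlyMainContent?: boolean }
interface ExtractDataArgs { urls: string[]; prompt: string; enableWebSearch?: boolean }
interface ExtractWithSchemaArgs {
  urls: string[];
  schema: Record<string, unknown>;
  prompt?: string;
  enableWebSearch?: boolean;
}
interface ScreenshotArgs { url: string; width?: number; height?: number; fullPage?: boolean }

// Maps each tool name to its argument shape.
type ToolArgs = {
  scrape_page: ScrapePageArgs;
  batch_scrape: BatchScrapeArgs;
  extract_data: ExtractDataArgs;
  extract_with_schema: ExtractWithSchemaArgs;
  screenshot: ScreenshotArgs;
};

// Hypothetical helper: pairs a tool name with arguments of the matching shape,
// so a mismatched call is rejected by the compiler.
function toolCall<K extends keyof ToolArgs>(tool: K, args: ToolArgs[K]) {
  return { name: tool, arguments: args };
}

const call = toolCall("screenshot", { url: "https://example.com", fullPage: true });
```

Typing the calls this way catches mistakes such as passing `urls` to `scrape_page` before the request is ever sent.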
🚀 Quick Start (Recommended)
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "firecrawl-lite": {
      "command": "npx",
      "args": ["-y", "@ariangibson/firecrawl-lite-mcp-server"],
      "env": {
        "LLM_API_KEY": "your_llm_api_key_here",
        "LLM_PROVIDER_BASE_URL": "https://api.x.ai/v1",
        "LLM_MODEL": "grok-code-fast-1"
      }
    }
  }
}
Claude Code (CLI)
claude mcp add firecrawl-lite --env LLM_API_KEY=your_key --env LLM_PROVIDER_BASE_URL=https://api.x.ai/v1 --env LLM_MODEL=grok-code-fast-1 -- npx -y @ariangibson/firecrawl-lite-mcp-server
Cursor
Add to your Cursor MCP configuration:
{
  "mcpServers": {
    "firecrawl-lite": {
      "command": "npx",
      "args": ["-y", "@ariangibson/firecrawl-lite-mcp-server"],
      "env": {
        "LLM_API_KEY": "your_llm_api_key_here",
        "LLM_PROVIDER_BASE_URL": "https://api.x.ai/v1",
        "LLM_MODEL": "grok-code-fast-1"
      }
    }
  }
}
⚙️ Configuration
Required Environment Variables
# Your LLM API key (xAI, OpenAI, Anthropic, etc.)
LLM_API_KEY=your_api_key_here
# LLM provider base URL
LLM_PROVIDER_BASE_URL=https://api.x.ai/v1
# LLM model name
LLM_MODEL=grok-code-fast-1
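Since all three variables are required, it can help to check them before launching the server. The following is a minimal sketch of such a check; `missingRequiredVars` is a hypothetical helper written for this README, not a function exported by the package, and the environment is passed in as a plain object (for example `process.env` under Node.js).

```typescript
// The three variables listed above as required.
const REQUIRED_VARS = ["LLM_API_KEY", "LLM_PROVIDER_BASE_URL", "LLM_MODEL"] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the names of required variables that are
// missing or blank in the given environment object.
function missingRequiredVars(env: Env): string[] {
  return REQUIRED_VARS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}

// Example: the base URL was forgotten.
const missing = missingRequiredVars({
  LLM_API_KEY: "your_api_key_here",
  LLM_MODEL: "grok-code-fast-1",
});
// missing is ["LLM_PROVIDER_BASE_URL"]
```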
LLM Provider Examples
# xAI (Grok)
LLM_PROVIDER_BASE_URL=https://api.x.ai/v1
LLM_API_KEY=xai-your-key-here
LLM_MODEL=grok-code-fast-1
# OpenAI
LLM_PROVIDER_BASE_URL=https://api.openai.com/v1
LLM_API_KEY=sk-your-key-here
LLM_MODEL=gpt-4o-mini
# Anthropic
LLM_PROVIDER_BASE_URL=https://api.anthropic.com
LLM_API_KEY=sk-ant-your-key-here
LLM_MODEL=claude-3-haiku-20240307
# Local LLM (Ollama)
LLM_PROVIDER_BASE_URL=http://localhost:11434/v1
LLM_API_KEY=your-local-key
LLM_MODEL=llama2
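To show how the three settings fit together, here is a sketch that builds (but does not send) a chat request from them. It rests on an assumption this README does not confirm: that the server talks to the provider through an OpenAI-compatible `/chat/completions` route with bearer-token authentication. `buildChatRequest` and the interfaces are hypothetical names used for illustration.

```typescript
interface LlmConfig { baseUrl: string; apiKey: string; model: string }

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Builds a description of a chat-completion request without sending it.
// ASSUMPTION: the provider exposes an OpenAI-compatible /chat/completions route.
function buildChatRequest(config: LlmConfig, prompt: string): ChatRequest {
  const base = config.baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/chat/completions`,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${config.apiKey}`,
    },
    body: JSON.stringify({
      model: config.model,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

const chatRequest = buildChatRequest(
  { baseUrl: "https://api.openai.com/v1/", apiKey: "sk-your-key-here", model: "gpt-4o-mini" },
  "Extract the main article title and summary",
);
// chatRequest.url is "https://api.openai.com/v1/chat/completions"
```

If that assumption holds, switching providers only changes the three values, which is why each example above differs in nothing else.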
🌐 Remote Deployment
For remote servers or Docker deployments, enable at least one of the HTTP endpoints, depending on which transport protocol you plan to use. Neither is enabled by default:
Docker
docker run -d \
  -p 3000:3000 \
  -e ENABLE_HTTP_STREAMABLE_ENDPOINT=true \
  -e ENABLE_SSE_ENDPOINT=true \
  -e LLM_API_KEY=your_key_here \
  -e LLM_PROVIDER_BASE_URL=https://api.x.ai/v1 \
  -e LLM_MODEL=grok-code-fast-1 \
  ariangibson/firecrawl-lite-mcp-server:latest
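The sketch below shows which client URLs the two flags would expose. It assumes that `ENABLE_HTTP_STREAMABLE_ENDPOINT` serves `/mcp` and `ENABLE_SSE_ENDPOINT` serves `/sse`; those paths appear in the client examples in this README, but the flag-to-path mapping is an inference. `enabledEndpoints` is a hypothetical helper.

```typescript
type Env = Record<string, string | undefined>;

// ASSUMPTION: ENABLE_HTTP_STREAMABLE_ENDPOINT maps to /mcp and
// ENABLE_SSE_ENDPOINT maps to /sse, as the client examples suggest.
function enabledEndpoints(env: Env, baseUrl: string): string[] {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const urls: string[] = [];
  if (env["ENABLE_HTTP_STREAMABLE_ENDPOINT"] === "true") urls.push(`${base}/mcp`);
  if (env["ENABLE_SSE_ENDPOINT"] === "true") urls.push(`${base}/sse`);
  return urls;
}

// With only the streamable HTTP endpoint enabled:
const endpoints = enabledEndpoints(
  { ENABLE_HTTP_STREAMABLE_ENDPOINT: "true" },
  "http://your-server:3000/",
);
// endpoints is ["http://your-server:3000/mcp"]
```

An empty result corresponds to the default configuration, in which no remote client can connect.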
Claude Code (Remote)
claude mcp add firecrawl-lite-remote http://your-server:3000/mcp -t http
Claude Desktop (Remote)
Method 1: Connectors (Recommended - HTTPS only)
The official Claude Desktop method for remote MCP servers:
- Go to Claude Desktop → Settings → Connectors
- Add connector: https://your-server.com:3000/mcp
- Requires: HTTPS server with valid SSL certificate
Method 2: mcp-proxy (HTTP/HTTPS fallback)
For servers without SSL certificates:
pip install mcp-proxy
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "firecrawl-lite": {
      "command": "mcp-proxy",
      "args": ["http://your-server:3000/sse"]
    }
  }
}
🛠️ Advanced Configuration
Proxy Support
PROXY_SERVER_URL=http://your-proxy-server.com:1337
PROXY_SERVER_USERNAME=your-username
PROXY_SERVER_PASSWORD=your-password
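For readers curious how such settings typically reach the browser, here is a sketch under stated assumptions. Chrome accepts a `--proxy-server` launch flag, and Puppeteer can supply proxy credentials through `page.authenticate()`; whether this server wires the variables up in exactly this way is not confirmed by this README. `proxySettings` is a hypothetical helper.

```typescript
type Env = Record<string, string | undefined>;

interface ProxySettings {
  launchArgs: string[]; // extra Chrome command-line flags
  credentials?: { username: string; password: string }; // e.g. for page.authenticate()
}

// Translates the proxy variables into browser launch settings.
// ASSUMPTION: this mirrors common Puppeteer practice, not the server's own code.
function proxySettings(env: Env): ProxySettings {
  const server = env["PROXY_SERVER_URL"];
  if (!server) return { launchArgs: [] }; // no proxy configured
  const settings: ProxySettings = { launchArgs: [`--proxy-server=${server}`] };
  const username = env["PROXY_SERVER_USERNAME"];
  const password = env["PROXY_SERVER_PASSWORD"];
  if (username && password) settings.credentials = { username, password };
  return settings;
}

const proxy = proxySettings({
  PROXY_SERVER_URL: "http://your-proxy-server.com:1337",
  PROXY_SERVER_USERNAME: "your-username",
  PROXY_SERVER_PASSWORD: "your-password",
});
// proxy.launchArgs is ["--proxy-server=http://your-proxy-server.com:1337"]
```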
Anti-Detection
SCRAPE_USER_AGENT="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36"
SCRAPE_DELAY_MIN=1000
SCRAPE_DELAY_MAX=3000
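The two delay values are presumably a range in milliseconds from which a random pause is drawn before each request; this README does not spell that out, so treat the following as a sketch of the general technique and not as the server's code. `pickDelayMs` is a hypothetical helper.

```typescript
// Picks a whole number of milliseconds in [min, max], inclusive.
// `random` is injectable so the function can be tested deterministically.
function pickDelayMs(
  min: number,
  max: number,
  random: () => number = Math.random,
): number {
  if (min > max) throw new RangeError("min must not exceed max");
  return min + Math.floor(random() * (max - min + 1));
}

// With SCRAPE_DELAY_MIN=1000 and SCRAPE_DELAY_MAX=3000:
const shortest = pickDelayMs(1000, 3000, () => 0); // 1000
const longest = pickDelayMs(1000, 3000, () => 0.999999); // 3000
```

Randomizing the pause makes request timing less regular than a fixed delay, which is the usual reason for exposing a min/max pair.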
Performance Tuning
SCRAPE_VIEWPORT_WIDTH=1920
SCRAPE_VIEWPORT_HEIGHT=1080
SCRAPE_BATCH_DELAY_MIN=2000
SCRAPE_BATCH_DELAY_MAX=5000
🛠️ Troubleshooting
Chrome Issues
Chrome is automatically installed on first use. If you encounter issues:
# Manual installation
npx puppeteer browsers install chrome
# Reset if corrupted
rm -rf ~/.cache/puppeteer && npx puppeteer browsers install chrome
Connection Issues
- Verify internet connectivity
- Check LLM provider URL accessibility
- Ensure API keys are valid
- For corporate networks, configure proxy settings
📊 Usage Examples
Scrape a webpage
{
  "name": "scrape_page",
  "arguments": {
    "url": "https://example.com"
  }
}
Batch scrape multiple URLs
{
  "name": "batch_scrape",
  "arguments": {
    "urls": ["https://example.com", "https://example.org"],
    "onlyMainContent": true
  }
}
Extract data with prompt
{
  "name": "extract_data",
  "arguments": {
    "urls": ["https://example.com"],
    "prompt": "Extract the main article title and summary"
  }
}
Extract with schema
{
  "name": "extract_with_schema",
  "arguments": {
    "urls": ["https://example.com"],
    "schema": {
      "type": "object",
      "properties": {
        "title": {"type": "string"},
        "description": {"type": "string"}
      }
    }
  }
}
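MCP clients normally send these payloads for you. If you are talking to the server directly, each `name`/`arguments` object above becomes the `params` of a JSON-RPC 2.0 `tools/call` request, as defined by the MCP specification. The sketch below shows that wrapping; `toJsonRpc` is a hypothetical helper, and the request `id` is an arbitrary value chosen by the caller.

```typescript
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Wraps a tool call in the JSON-RPC 2.0 "tools/call" request used by MCP.
function toJsonRpc(id: number, call: ToolCall) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: call,
  };
}

const rpcMessage = toJsonRpc(1, {
  name: "extract_with_schema",
  arguments: {
    urls: ["https://example.com"],
    schema: {
      type: "object",
      properties: { title: { type: "string" }, description: { type: "string" } },
    },
  },
});

// Serialized form that goes over the transport.
const wire = JSON.stringify(rpcMessage);
```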
🐳 Container Registries
Pre-built images are available:
Docker Hub: ariangibson/firecrawl-lite-mcp-server:latest
GitHub Container Registry: ghcr.io/ariangibson/firecrawl-lite-mcp-server:latest
Both support multi-architecture (amd64, arm64) with automatic updates.
🙏 Credits & Acknowledgments
This project is inspired by the excellent work of the original Firecrawl projects:
🔥 Firecrawl
The original Firecrawl project by Mendable.ai - a comprehensive web scraping platform with advanced features.
🔥 Firecrawl MCP Server
The official MCP server implementation by the Firecrawl team.
We give huge thanks to the Firecrawl team for their pioneering work in web scraping and MCP integration! 🚀
💡 Looking for enterprise-grade web scraping?
Visit firecrawl.com for their cloud service with zero setup complexity.
📝 License
MIT License - see the license file in the repository for details.