cloudscraper-mcp-server by LLMTooling - MCP Server

CloudScraper MCP Server

A Model Context Protocol (MCP) server that enables AI agents to bypass Cloudflare protection and scrape web content.

Core Features

Feature	Description
Cloudflare Bypass	Automatically handles Cloudflare protection using the cloudscraper library
Multiple Transports	Supports both stdio and HTTP transport protocols
Content Cleaning	Converts HTML to clean, LLM-friendly Markdown format
Smart Chunking	Automatically splits large responses into chunks of at most 10,000 tokens
Docker Support	Production-ready containerized deployment
Multiple Methods	Supports GET and POST HTTP methods
Binary Handling	Base64 encoding for non-text content
File Export	Save scraped content directly to disk
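
The Binary Handling row can be illustrated with a short sketch. It assumes the response's content type decides whether a body is returned as text or as Base64. The helper name `encode_body`, the `encoding` key, and the list of text-like MIME prefixes are hypothetical and not taken from the server's source.

```python
import base64

# Assumed set of MIME prefixes treated as text; the real server may differ.
TEXT_PREFIXES = ("text/", "application/json", "application/xml", "application/xhtml+xml")


def encode_body(body: bytes, content_type: str) -> dict:
    """Return text bodies as str and everything else as Base64 (hypothetical helper)."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime.startswith(TEXT_PREFIXES):
        return {"content": body.decode("utf-8", errors="replace"), "encoding": "text"}
    return {"content": base64.b64encode(body).decode("ascii"), "encoding": "base64"}
```

A caller that receives Base64 content can recover the original bytes with `base64.b64decode`.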

Available MCP Tools

Tool Comparison

Tool	Return Type	Use Case	Chunking Support	File Output
scrape_url	String (content only)	Quick content retrieval for AI processing	Yes	No
scrape_url_raw	Dictionary (metadata + content)	Full response details with headers and timing	Yes	No
scrape_url_to_file	Dictionary (save confirmation)	Export content to workspace files	No	Yes

Shared Parameters

Parameter	Type	Required	Default	Description
`url`	string	Yes	-	Target URL to scrape
`method`	string	No	"GET"	HTTP method (GET or POST)
`clean_content`	boolean	No	true	Convert HTML to Markdown
`continuation_token`	string	No	null	Token for retrieving next chunk
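
MCP clients normally build tool calls for you, but it can help to see what one looks like on the wire. The sketch below serializes a `tools/call` request in JSON-RPC 2.0 form using the shared parameters above. The helper name `build_tool_call` and the `https://example.com` URL are placeholders for illustration.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# Only `url` is required; the other parameters fall back to their defaults.
payload = build_tool_call(
    "scrape_url",
    {"url": "https://example.com", "method": "GET", "clean_content": True},
)
```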

scrape_url Response Fields

Field	Type	Description
Response	string	Page content with chunk instructions if applicable

Note: When content exceeds 10,000 tokens, the response includes continuation instructions embedded in the text.

scrape_url_raw Response Fields

Field	Type	Always Present	Description
`status_code`	integer	Yes	HTTP response status code
`headers`	object	Yes	Response headers (hop-by-hop headers removed)
`content`	string	Yes	Page content or current chunk
`content_type`	string	Yes	MIME type of response
`response_time`	number	Yes	Request duration in seconds
`chunked`	boolean	When chunked	Indicates response was split
`chunk_index`	integer	When chunked	Current chunk number (1-based)
`total_chunks`	integer	When chunked	Total number of chunks
`continuation_token`	string	When more chunks	Token for next chunk retrieval
`total_tokens`	integer	When chunked	Total tokens in full response
`message`	string	When chunked	Human-readable chunk status
`error`	string	On failure	Error description
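
A client can reassemble a chunked response by following `continuation_token` until it is no longer returned. The sketch below assumes a `call_tool(name, arguments)` callable supplied by your MCP client. `fetch_all_chunks` and `fake_call_tool` are hypothetical names, and the stub stands in for a real server so the loop can be run locally.

```python
from typing import Callable


def fetch_all_chunks(call_tool: Callable[[str, dict], dict], url: str) -> str:
    """Follow `continuation_token` until the server stops returning one."""
    parts = []
    arguments = {"url": url}
    while True:
        response = call_tool("scrape_url_raw", arguments)
        if "error" in response:
            raise RuntimeError(response["error"])
        parts.append(response["content"])
        token = response.get("continuation_token")
        if not token:
            break  # last chunk, or the response was never chunked
        arguments = {"url": url, "continuation_token": token}
    return "".join(parts)


def fake_call_tool(name: str, arguments: dict) -> dict:
    """Stand-in for a real MCP client call; serves two fixed chunks."""
    if arguments.get("continuation_token") == "abc:1":
        return {"status_code": 200, "content": "world", "chunked": True,
                "chunk_index": 2, "total_chunks": 2}
    return {"status_code": 200, "content": "hello ", "chunked": True,
            "chunk_index": 1, "total_chunks": 2, "continuation_token": "abc:1"}


full_text = fetch_all_chunks(fake_call_tool, "https://example.com")
```

Because cached chunks expire after 2 minutes, a loop like this should request the remaining chunks promptly rather than pausing between calls.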

scrape_url_to_file Parameters

Parameter	Type	Required	Default	Description
`url`	string	Yes	-	Target URL to scrape
`file_path`	string	Yes	-	Path where content should be saved
`method`	string	No	"GET"	HTTP method (GET or POST)
`clean_content`	boolean	No	false	Convert HTML to Markdown before saving
`overwrite`	boolean	No	false	Replace file if it exists

scrape_url_to_file Response Fields

Field	Type	Always Present	Description
`status_code`	integer	Yes	HTTP response status code
`headers`	object	Yes	Response headers (hop-by-hop headers removed)
`content_type`	string	Yes	MIME type of saved content
`response_time`	number	Yes	Request duration in seconds
`file_path`	string	On success	Absolute path to saved file
`bytes_written`	integer	On success	Number of bytes written to disk
`message`	string	On success	Confirmation message
`error`	string	On failure	Error description
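
The interaction between `overwrite` and the success fields can be sketched as follows. This is an assumption about the behavior, not the server's actual code: `save_content` is a hypothetical helper, and creating missing parent directories is a choice made for this example.

```python
from pathlib import Path


def save_content(content: bytes, file_path: str, overwrite: bool = False) -> dict:
    """Write content to disk, refusing to replace an existing file unless asked."""
    target = Path(file_path).resolve()
    if target.exists() and not overwrite:
        return {"error": f"File already exists: {target}"}
    # Assumption: missing parent directories are created on demand.
    target.parent.mkdir(parents=True, exist_ok=True)
    bytes_written = target.write_bytes(content)
    return {
        "file_path": str(target),
        "bytes_written": bytes_written,
        "message": f"Saved {bytes_written} bytes to {target}",
    }
```

With `overwrite` left at its default of `false`, a second call against the same path returns an `error` instead of replacing the file.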

Installation

Prerequisites

Requirement	Version	Purpose
Python	3.10+	Runtime environment
uv	Latest	Dependency management
Git	Any	Repository cloning

Setup Steps

Clone the repository and install dependencies:

git clone https://github.com/yourusername/cloudscraper-mcp-server.git
cd cloudscraper-mcp-server
uv sync

Configuration

Transport Protocols

Transport	Best For	Configuration
stdio	Claude Code, VSCode, Direct AI integration	Default mode, no environment variables needed
http	n8n, Web apps, API integrations, Remote access	Requires MCP_TRANSPORT=http

Environment Variables

Variable	Default	Options	Description
`MCP_TRANSPORT`	stdio	stdio, http	Transport protocol selection
`MCP_HOST`	0.0.0.0	Any valid IP	Host binding for HTTP mode
`MCP_PORT`	8000	Any valid port	Port for HTTP mode
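
As a rough illustration of how these variables and their defaults fit together, the sketch below reads them into a small config object. `TransportConfig` and `load_transport_config` are hypothetical names; the server's own startup code may be organized differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class TransportConfig:
    transport: str
    host: str
    port: int


def load_transport_config(env: Mapping[str, str] = os.environ) -> TransportConfig:
    """Read the three documented variables, applying the documented defaults."""
    transport = env.get("MCP_TRANSPORT", "stdio").lower()
    if transport not in ("stdio", "http"):
        raise ValueError(f"Unsupported MCP_TRANSPORT: {transport!r}")
    return TransportConfig(
        transport=transport,
        host=env.get("MCP_HOST", "0.0.0.0"),
        port=int(env.get("MCP_PORT", "8000")),
    )
```

`MCP_HOST` and `MCP_PORT` only matter when the transport is `http`; in stdio mode they are read but unused.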

Usage Examples

Running with Stdio Transport (Default)

uv run server.py

Running with HTTP Transport

MCP_TRANSPORT=http MCP_HOST=0.0.0.0 MCP_PORT=8000 uv run server.py

Claude Code Integration

claude mcp add --transport stdio cloudscraper-mcp -- \
  uv --directory /path/to/cloudscraper-mcp-server run server.py

VSCode/IDE Configuration

{
  "mcpServers": {
    "cloudscraper-mcp": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "run",
        "server.py"
      ],
      "cwd": "/path/to/cloudscraper-mcp-server"
    }
  }
}

Docker Deployment

For containerized deployment instructions, see

Technical Stack

Component	Technology	Purpose
Protocol	FastMCP 3.0+	Model Context Protocol implementation
Scraping	cloudscraper 1.2.71+	Cloudflare bypass engine
Compression	brotli 1.0.9+	Response decompression
Parsing	beautifulsoup4 4.10.0+	HTML parsing
Conversion	markdownify 0.11.6+	HTML to Markdown transformation
Tokenization	tiktoken 0.5.0+	Token counting for chunking
Testing	pytest 8.0+	Integration test suite

Advanced Features

Response Chunking System

Feature	Value	Description
Max Tokens Per Chunk	10,000	Maximum tokens in a single response
Chunk Expiry	2 minutes	Cache lifetime for chunk retrieval
Token Encoding	cl100k_base	tiktoken encoding used for token counting
Continuation Pattern	chunk_id:index	Token format for sequential retrieval
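
The values in this table can be tied together in a minimal sketch. It makes several assumptions: whitespace splitting stands in for tiktoken's `cl100k_base` token counting, the index in `chunk_id:index` is taken to be the zero-based position of the requested chunk, and `split_into_chunks` and `ChunkCache` are hypothetical names rather than the server's own.

```python
import time
import uuid
from typing import Optional

MAX_TOKENS_PER_CHUNK = 10_000   # from the table above
CHUNK_EXPIRY_SECONDS = 120      # 2 minutes, from the table above


def split_into_chunks(text: str, max_tokens: int = MAX_TOKENS_PER_CHUNK) -> list:
    """Split text into chunks of at most max_tokens whitespace-delimited tokens.

    Whitespace splitting is a stand-in; the server counts tokens with tiktoken.
    """
    tokens = text.split()
    return [" ".join(tokens[i:i + max_tokens]) for i in range(0, len(tokens), max_tokens)]


class ChunkCache:
    """In-memory cache keyed by chunk_id, with time-based expiry (hypothetical)."""

    def __init__(self, ttl: float = CHUNK_EXPIRY_SECONDS):
        self.ttl = ttl
        self._store = {}  # chunk_id -> (created_at, list of chunks)

    def put(self, chunks: list, now: Optional[float] = None) -> str:
        """Store the chunks and return the chunk_id used in continuation tokens."""
        chunk_id = uuid.uuid4().hex
        created = time.time() if now is None else now
        self._store[chunk_id] = (created, chunks)
        return chunk_id

    def get(self, continuation_token: str, now: Optional[float] = None) -> str:
        """Resolve a `chunk_id:index` token to its chunk, or raise KeyError."""
        chunk_id, _, index = continuation_token.rpartition(":")
        created, chunks = self._store[chunk_id]
        current = time.time() if now is None else now
        if current - created > self.ttl:
            del self._store[chunk_id]
            raise KeyError(f"chunk {chunk_id} expired")
        return chunks[int(index)]
```

The `now` parameter exists only so the expiry logic can be exercised without waiting; production code would rely on the clock.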

Security Headers

Header	Value	Purpose
User-Agent	Chrome 120	Browser impersonation
Sec-Ch-Ua	Chrome/Chromium	Client hints
Sec-Fetch-*	cors/same-origin	Fetch metadata
Origin/Referer	Auto-generated	Request legitimacy
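
One way the auto-generated `Origin` and `Referer` values could be derived is from the target URL itself. The sketch below is illustrative only: `build_headers` is a hypothetical helper, and the exact header strings the server sends are not documented here, so the User-Agent and client-hint values are assumed examples of what a Chrome 120 browser typically reports.

```python
from urllib.parse import urlsplit

# Illustrative values only; the server's actual strings may differ.
CHROME_120_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def build_headers(url: str) -> dict:
    """Derive Origin and Referer from the target URL (hypothetical helper)."""
    parts = urlsplit(url)
    origin = f"{parts.scheme}://{parts.netloc}"
    return {
        "User-Agent": CHROME_120_UA,
        "Sec-Ch-Ua": '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
        "Sec-Fetch-Mode": "cors",
        "Sec-Fetch-Site": "same-origin",
        "Origin": origin,
        "Referer": origin + "/",
    }
```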

Made with CloudScraper and FastMCP