# Crawl4AI MCP Server
A Model Context Protocol (MCP) server that provides web crawling functionality via Crawl4AI. This server allows you to use Crawl4AI's powerful web crawling capabilities directly within MCP-enabled applications like Claude Desktop, BoltAI, and other MCP clients.
## Features
- 🌐 Web Crawling: Crawl any publicly accessible URL and extract content
- 📄 Multiple Output Formats: Get content as Markdown, HTML, plain text, or structured JSON
- 📊 Metadata Extraction: Extract page metadata including title, description, links, and more
- 🔧 MCP Integration: Seamlessly integrates with any MCP-enabled application
- 🐳 Flexible Deployment: Run locally, on a VPS, or in a Docker container
- ⚡ Async Performance: Built with Python asyncio for optimal performance
## Prerequisites
- Python 3.8+ (3.9+ recommended)
- Crawl4AI (installed automatically as a dependency)
- Playwright browsers (automatically installed during setup)
## Quick Start
### 1. Clone the repository

```bash
git clone https://github.com/CoachSteff/crawl4ai-mcp-server.git
cd crawl4ai-mcp-server
```
### 2. Install dependencies

```bash
pip install -r requirements.txt
```
### 3. Install Playwright browsers

```bash
python -m playwright install chromium
```
### 4. Configure for MCP client

Copy the configuration template and update the paths:

```bash
cp config.json.template config.json
# Edit config.json with your Python and server paths
```
### 5. Test the installation

```bash
python3 tests/test_wrapper.py
```
## MCP Client Integration
### Claude Desktop
Add this to your Claude Desktop configuration file (`claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "crawl4ai": {
      "command": "/path/to/crawl4ai-mcp-server/crawl4ai-mcp-server",
      "args": [],
      "env": {}
    }
  }
}
```
### Other MCP Clients
Use the wrapper script as your MCP server command:

- **Command:** `/path/to/crawl4ai-mcp-server/crawl4ai-mcp-server`
- **Args:** `[]`
- **Env:** `{}`
## Available Tools
### 1. `crawl_url`
Crawl a single URL and extract content in your preferred format.
**Parameters:**

- `url` (required): The URL to crawl
- `output_format` (optional): Output format - `markdown`, `html`, `text`, or `json` (default: `markdown`)
**Example usage in Claude Desktop:**

> Please crawl https://example.com and extract the content as markdown.
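The effect of the `output_format` parameter can be sketched as a small dispatcher. This is a hypothetical stdlib-only illustration, not the server's actual code; the keys of `result` follow the JSON output example shown later in this README:

```python
import json


def format_result(result: dict, output_format: str = "markdown") -> str:
    """Render a crawl result in the requested output format.

    `result` is assumed to carry the keys from the documented JSON
    output (url, title, markdown, html, metadata).
    """
    if output_format == "markdown":
        return result["markdown"]
    if output_format == "html":
        return result["html"]
    if output_format == "text":
        # Crude markdown-to-text conversion: strip heading markers.
        return "\n".join(
            line.lstrip("# ") for line in result["markdown"].splitlines()
        )
    if output_format == "json":
        return json.dumps(result, indent=2)
    raise ValueError(f"unsupported output_format: {output_format}")
```

The real server delegates the crawl itself to Crawl4AI; this sketch only shows how one result can serve all four formats.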
### 2. `get_page_metadata`
Extract metadata from a webpage without downloading the full content.
**Parameters:**

- `url` (required): The URL to get metadata from
**Example usage in Claude Desktop:**

> Get the metadata for https://example.com
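Conceptually, metadata extraction amounts to reading a page's `<head>` tags. Here is a stdlib-only sketch of that idea (the server itself uses Crawl4AI, not this code):

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collect title, meta description, and language from raw HTML."""

    def __init__(self):
        super().__init__()
        self.metadata = {"title": None, "description": None, "language": None}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.metadata["description"] = attrs.get("content")
        elif tag == "html" and "lang" in attrs:
            self.metadata["language"] = attrs["lang"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.metadata["title"] = data.strip()


def extract_metadata(html: str) -> dict:
    """Parse an HTML string and return a small metadata dict."""
    parser = MetadataParser()
    parser.feed(html)
    return parser.metadata
```

The actual tool also reports counts (links, media, words) and the HTTP status code, which Crawl4AI gathers during the crawl.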
## Example Outputs
### Markdown Output
```markdown
# Example Page Title

This is the main content of the page...

## Section 1

Content here...
```
### JSON Output
```json
{
  "url": "https://example.com",
  "title": "Example Page Title",
  "markdown": "# Example Page Title\n\nThis is the main content...",
  "html": "<html>...</html>",
  "metadata": {
    "title": "Example Page Title",
    "description": "Page description",
    "language": "en"
  }
}
```
### Metadata Output
```json
{
  "url": "https://example.com",
  "title": "Example Page Title",
  "description": "Page description",
  "keywords": "example, test, page",
  "language": "en",
  "author": "Page Author",
  "links_count": 15,
  "media_count": 3,
  "word_count": 250,
  "status_code": 200
}
```
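A downstream script can consume this metadata directly, for example to decide whether a page is worth a full crawl. A minimal sketch, with the field names taken from the example above and a hypothetical filtering policy:

```python
import json

# The metadata output shown above, as it would arrive from the tool.
metadata_json = """
{
  "url": "https://example.com",
  "title": "Example Page Title",
  "description": "Page description",
  "keywords": "example, test, page",
  "language": "en",
  "author": "Page Author",
  "links_count": 15,
  "media_count": 3,
  "word_count": 250,
  "status_code": 200
}
"""

meta = json.loads(metadata_json)

# Example policy: only keep pages that loaded successfully and have
# enough text to justify a full crawl.
if meta["status_code"] == 200 and meta["word_count"] >= 100:
    print(f"{meta['url']}: {meta['title']} ({meta['links_count']} links)")
```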
## Advanced Usage
### Running Examples
Try the included examples to see the server in action:
```bash
# Run basic usage examples
python3 examples/basic_usage.py

# Run advanced examples
python3 examples/advanced_usage.py
```
### Docker Deployment
For containerized deployment:
```bash
# Build the Docker image
docker build -t crawl4ai-mcp-server .

# Run the container
docker run -p 3000:3000 crawl4ai-mcp-server
```
## Configuration Options

The server can be configured through the `config.json` file:
```json
{
  "python_path": "/usr/bin/python3",
  "server_path": "/path/to/server.py",
  "headless": true,
  "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36"
}
```
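A loader for this file might fill in defaults for the optional keys and fail early when a required path is missing. This is a sketch under the assumption that `python_path` and `server_path` are required while `headless` and `user_agent` have defaults; `load_config` is a hypothetical helper, not part of the server's public API:

```python
import json
from pathlib import Path

# Assumed defaults for the optional keys in config.json.
DEFAULTS = {
    "headless": True,
    "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
}


def load_config(path: str = "config.json") -> dict:
    """Read config.json, apply defaults, and check required keys."""
    config = {**DEFAULTS, **json.loads(Path(path).read_text())}
    for key in ("python_path", "server_path"):
        if key not in config:
            raise KeyError(f"config.json is missing required key: {key}")
    return config
```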
## Troubleshooting
### Common Issues
- **Playwright browser not found**

  ```bash
  python -m playwright install chromium
  ```

- **Permission denied on wrapper script**

  ```bash
  chmod +x crawl4ai-mcp-server
  ```

- **Import errors**

  ```bash
  pip install -r requirements.txt
  ```
### Logs
Check the log file for debugging information:

```bash
tail -f crawl4ai-mcp.log
```
## Development
### Running Tests
```bash
# Run configuration tests
python3 tests/test_wrapper.py

# Run server tests (if available)
pytest tests/
```
### Code Quality
The project uses several tools for code quality:
```bash
# Format code
black server.py

# Lint code
flake8 server.py

# Type checking
mypy server.py
```
## Contributing
Contributions are welcome! Please see the contributing guidelines file for details on how to contribute to this project.
## Code of Conduct
Please note that this project is released with a code of conduct. By participating in this project, you agree to abide by its terms.
## License
This project is licensed under the MIT License - see the license file for details.