WebCat MCP Server
Web search and content extraction for AI models via Model Context Protocol (MCP)
Quick Start
Docker (Recommended)
# Run with Docker (no setup required)
docker run -p 8000:8000 tmfrisinger/webcat:latest
# With Serper API key for premium search
docker run -p 8000:8000 -e SERPER_API_KEY=your_key tmfrisinger/webcat:latest
# With authentication enabled
docker run -p 8000:8000 -e WEBCAT_API_KEY=your_token tmfrisinger/webcat:latest
Supports: linux/amd64, linux/arm64 (Intel/AMD, Apple Silicon, AWS Graviton)
Local Development
cd docker
python -m pip install -e ".[dev]"
# Start MCP server with auto-reload
make dev
# Or run directly
python mcp_server.py
What is WebCat?
WebCat is an MCP (Model Context Protocol) server that provides AI models with:
- 🔍 Web Search - Serper API (premium) or DuckDuckGo (free fallback)
- 📄 Content Extraction - Serper scrape API (premium) or Trafilatura (free fallback)
- 🌐 Modern HTTP Transport - Streamable HTTP with JSON-RPC 2.0
- 🐳 Multi-Platform Docker - Works on Intel, ARM, and Apple Silicon
- 🎯 Composite Tool - Single SERPER_API_KEY enables both search + scraping
Built with FastMCP, Serper.dev, and Trafilatura for seamless AI integration.
Features
- ✅ Optional Authentication - Bearer token auth when needed, or run without (v2.3.1)
- ✅ Composite Search Tool - Single Serper API key enables both search + scraping
- ✅ Automatic Fallback - Search: Serper → DuckDuckGo | Scraping: Serper → Trafilatura
- ✅ Premium Scraping - Serper's optimized infrastructure for fast, clean content extraction
- ✅ Smart Content Extraction - Returns markdown with preserved document structure
- ✅ MCP Compliant - Works with Claude Desktop, LiteLLM, and other MCP clients
- ✅ Parallel Processing - Fast concurrent scraping
- ✅ Multi-Platform Docker - Linux (amd64/arm64) support
Installation & Usage
Docker Deployment
# Quick start - no configuration needed
docker run -p 8000:8000 tmfrisinger/webcat:latest
# With environment variables
docker run -p 8000:8000 \
-e SERPER_API_KEY=your_key \
-e WEBCAT_API_KEY=your_token \
tmfrisinger/webcat:latest
# Using docker-compose
cd docker
docker-compose up
Local Development
cd docker
python -m pip install -e ".[dev]"
# Configure environment (optional)
echo "SERPER_API_KEY=your_key" > .env
# Development mode with auto-reload
make dev # Start MCP server with auto-reload
# Production mode
make mcp # Start MCP server
Available Endpoints
| Endpoint | Description |
|---|---|
| http://localhost:8000/health | 💗 Health check |
| http://localhost:8000/status | 📊 Server status |
| http://localhost:8000/mcp | 🛠️ MCP protocol endpoint (Streamable HTTP with JSON-RPC 2.0) |
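A quick liveness check from Python might look like the sketch below. The response schema is an assumption (a JSON body with a status field); adjust the parsing to whatever the server actually returns, and note that fetch_health and is_healthy are illustrative names, not part of WebCat's API.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url: str = "http://localhost:8000") -> dict:
    """Fetch the /health endpoint and return the parsed JSON body."""
    with urlopen(f"{base_url}/health") as resp:
        return json.loads(resp.read().decode("utf-8"))


def is_healthy(payload: dict) -> bool:
    """Interpret a health payload; a 'status' of 'ok'/'healthy' is assumed here."""
    return str(payload.get("status", "")).lower() in {"ok", "healthy"}


if __name__ == "__main__":
    # Requires a running server on localhost:8000
    print(is_healthy(fetch_health()))
```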
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| SERPER_API_KEY | (none) | Serper API key for premium search (optional, falls back to DuckDuckGo if not set) |
| PERPLEXITY_API_KEY | (none) | Perplexity API key for deep research tool (optional, get at https://www.perplexity.ai/settings/api) |
| WEBCAT_API_KEY | (none) | Bearer token for authentication (optional, if set all requests must include Authorization: Bearer <token>) |
| PORT | 8000 | Server port |
| LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR) |
| LOG_DIR | /tmp | Log file directory |
| MAX_CONTENT_LENGTH | 1000000 | Maximum characters to return per scraped article |
Get API Keys
Serper API (for web search + scraping):
- Visit serper.dev
- Sign up for free tier (2,500 searches/month + scraping)
- Copy your API key
- Add to .env file: SERPER_API_KEY=your_key
- Note: One API key enables both search AND content scraping!
Perplexity API (for deep research):
- Visit perplexity.ai/settings/api
- Sign up and get your API key
- Copy your API key
- Add to .env file: PERPLEXITY_API_KEY=your_key
Enable Authentication (Optional)
To require bearer token authentication for all MCP tool calls:
- Generate a secure random token: openssl rand -hex 32
- Add to .env file: WEBCAT_API_KEY=your_token
- Include in all requests: Authorization: Bearer your_token
Note: If WEBCAT_API_KEY is not set, no authentication is required.
MCP Tools
WebCat exposes these tools via MCP:
| Tool | Description | Parameters |
|---|---|---|
| search | Search web and extract content | query: str, max_results: int |
| scrape_url | Scrape specific URL | url: str |
| health_check | Check server health | (none) |
| get_server_info | Get server capabilities | (none) |
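Over the Streamable HTTP transport, each tool call travels as a JSON-RPC 2.0 request using MCP's tools/call method. A minimal sketch of building such a payload (tool_call is an illustrative helper, not part of WebCat):

```python
import json


def tool_call(tool: str, arguments: dict, request_id: int = 1) -> str:
    """Build an MCP 'tools/call' request as a JSON-RPC 2.0 envelope."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Example: invoke the search tool with its two documented parameters
payload = tool_call("search", {"query": "model context protocol", "max_results": 3})
```

A client would POST this body to http://localhost:8000/mcp (with the Authorization header if WEBCAT_API_KEY is set).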
Architecture
MCP Client (Claude, LiteLLM)
↓
FastMCP Server (Streamable HTTP with JSON-RPC 2.0)
↓
Authentication (optional bearer token)
↓
Search Decision
├─ Serper API (premium) → Serper Scrape API (premium)
└─ DuckDuckGo (free) → Trafilatura (free)
↓
Markdown Response
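The premium-then-free branch in the diagram amounts to a try/except fallback. A minimal sketch, with the two callables standing in for the real Serper and DuckDuckGo clients (search_with_fallback is an illustrative name, not WebCat's actual function):

```python
def search_with_fallback(query: str, serper_search, ddg_search) -> dict:
    """Try the premium Serper backend first; on any failure, fall back to
    the free DuckDuckGo backend, and record which one answered."""
    try:
        return {"backend": "serper", "results": serper_search(query)}
    except Exception:
        return {"backend": "duckduckgo", "results": ddg_search(query)}
```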
Tech Stack:
- FastMCP - MCP protocol implementation with modern HTTP transport
- JSON-RPC 2.0 - Standard protocol for client-server communication
- Serper API - Google-powered search + optimized web scraping
- Trafilatura - Fallback content extraction (removes navigation/ads)
- DuckDuckGo - Free search fallback
Testing
cd docker
# Run all unit tests
make test
# OR
python -m pytest tests/unit -v
# With coverage report
make test-coverage
# OR
python -m pytest tests/unit --cov=. --cov-report=term --cov-report=html
# CI-safe tests (no external dependencies)
python -m pytest -v -m "not integration"
# Run specific test file
python -m pytest tests/unit/services/test_content_scraper.py -v
Current test coverage: 70%+ across all modules (enforced in CI)
Development
# First-time setup
make setup-dev # Install all dependencies + pre-commit hooks
# Development workflow
make dev # Start server with auto-reload
make format # Auto-format code (Black + isort)
make lint # Check code quality (flake8)
make test # Run unit tests
# Before committing
make ci-fast # Quick validation (~30 seconds)
# OR
make ci # Full validation with security checks (~2-3 minutes)
# Code quality tools
make format-check # Check formatting without changes
make security # Run bandit security scanner
make audit # Check dependency vulnerabilities
Pre-commit Hooks:
Hooks run automatically on git commit to ensure code quality. Install with make setup-dev.
Project Structure
docker/
├── mcp_server.py # Main MCP server (FastMCP)
├── cli.py # CLI interface for server modes
├── health.py # Health check endpoint
├── api_tools.py # API tooling utilities
├── clients/ # External API clients
│ ├── serper_client.py # Serper API (search + scrape)
│ └── duckduckgo_client.py # DuckDuckGo fallback
├── services/ # Core business logic
│ ├── search_service.py # Search orchestration
│ └── content_scraper.py # Serper scrape → Trafilatura fallback
├── tools/ # MCP tool implementations
│ └── search_tool.py # Search tool with auth
├── models/ # Pydantic data models
│ ├── domain/ # Domain entities (SearchResult, etc.)
│ └── responses/ # API response models
├── utils/ # Shared utilities
│ └── auth.py # Bearer token authentication
├── endpoints/ # FastAPI endpoints
├── tests/ # Comprehensive test suite
│ ├── unit/ # Unit tests (mocked dependencies)
│ └── integration/ # Integration tests (external deps)
└── pyproject.toml # Project config + dependencies
Search Quality Comparison
| Feature | Serper API | DuckDuckGo |
|---|---|---|
| Cost | Paid (free tier available) | Free |
| Quality | ⭐⭐⭐⭐⭐ Excellent | ⭐⭐⭐⭐ Good |
| Coverage | Comprehensive (Google-powered) | Standard |
| Speed | Fast | Fast |
| Rate Limits | 2,500/month (free tier) | None |
Docker Multi-Platform Support
WebCat supports multiple architectures for broad deployment compatibility:
# Build locally for multiple platforms
cd docker
./build.sh # Builds for linux/amd64 and linux/arm64
# Manual multi-platform build and push
docker buildx build --platform linux/amd64,linux/arm64 \
-t tmfrisinger/webcat:2.3.2 \
-t tmfrisinger/webcat:latest \
-f Dockerfile --push .
# Verify multi-platform support
docker buildx imagetools inspect tmfrisinger/webcat:latest
Automated Releases: Push a version tag to trigger automated multi-platform builds via GitHub Actions:
git tag v2.3.2
git push origin v2.3.2
Limitations
- Text-focused: Optimized for article content, not multimedia
- No JavaScript: Cannot scrape dynamic JS-rendered content (uses static HTML)
- PDF support: Detection only, not full extraction
- Python 3.11 required: Not compatible with 3.10 or 3.12
- External API limits: Subject to Serper API rate limits (2,500/month free tier)
Contributing
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure make ci passes
- Submit a Pull Request
See the repository's development guidelines and architecture standards before submitting.
License
MIT License - see the license file for details.
Links
- GitHub: github.com/Kode-Rex/webcat
- MCP Spec: modelcontextprotocol.io
- Serper API: serper.dev
Version 2.3.2 | Built with FastMCP, FastAPI, Readability, and html2text