Python MCP WebSearch Server

A powerful Model Context Protocol (MCP) server that provides web search and deep research capabilities with support for multiple search providers.


🌟 Features

  • Multiple Search Providers: DuckDuckGo (primary) and Google search support
  • Deep Research: Iterative search with content extraction and analysis
  • Content Extraction: Automatic web page content extraction using Trafilatura
  • Async Architecture: High-performance asynchronous operations
  • MCP Compliant: Full compatibility with the Model Context Protocol specification
  • Production Ready: Tested and optimized for production use

🚀 Quick Start

Installation

git clone https://github.com/IN-PUN-COAONE-AUTOMATNSA/python-mcp-websearch-server.git
cd python-mcp-websearch-server
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Usage

# Test the server
python test_server.py

# Run production demo
python demo.py

# Start the MCP server
python -m src.server





📁 Project Structure

├── src/
│   ├── server.py              # Main MCP server
│   ├── models/                # Pydantic data models
│   │   └── search.py
│   ├── providers/             # Search provider implementations
│   │   ├── base.py
│   │   ├── duckduckgo.py
│   │   └── google.py
│   ├── tools/                 # MCP tools
│   │   ├── search.py
│   │   └── deep_research.py
│   └── utils/                 # Utilities
│       └── web_scraper.py
├── test_server.py             # Main test suite
├── demo.py                    # Production demo
├── requirements.txt           # Python dependencies
├── Dockerfile                 # Docker configuration
└── README.md                  # This file

🧪 Testing

Run the comprehensive test suite:

python test_server.py

This will test:

  • Google and DuckDuckGo search providers
  • Error handling and validation
  • Provider availability and performance
  • End-to-end functionality



Basic Usage Example

import asyncio
from src.tools.search import SearchTool

async def example():
    search_tool = SearchTool()
    
    # Perform web search
    result = await search_tool.search(
        query="Python programming tutorial",
        provider="duckduckgo",
        max_results=5
    )
    
    print(result)  # JSON formatted results

asyncio.run(example())

📚 Tools Reference

1. web_search

Perform a web search using the specified provider.

Parameters:

  • query (string, required): The search query
  • provider (string, optional): Search provider (duckduckgo, google, default: duckduckgo)
  • max_results (integer, optional): Maximum results (1-100, default: 10)

Example:

{
  "tool": "web_search",
  "arguments": {
    "query": "artificial intelligence trends 2024",
    "provider": "duckduckgo",
    "max_results": 5
  }
}

Response Format:

{
  "query": "artificial intelligence trends 2024",
  "provider": "duckduckgo",
  "total_results": 5,
  "search_time": 0.923,
  "timestamp": "2024-01-15T10:30:00Z",
  "results": [
    {
      "rank": 1,
      "title": "AI Trends 2024: What to Expect",
      "url": "https://example.com/ai-trends-2024",
      "snippet": "The latest trends in artificial intelligence for 2024...",
      "timestamp": "2024-01-15T10:30:00Z"
    }
  ]
}
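To consume this response programmatically, a client can parse the JSON and pull out the result URLs in rank order. The payload below is an abbreviated copy of the response format shown above:

```python
import json

# Abbreviated copy of the web_search response format shown above
response_text = """
{
  "query": "artificial intelligence trends 2024",
  "provider": "duckduckgo",
  "total_results": 5,
  "results": [
    {
      "rank": 1,
      "title": "AI Trends 2024: What to Expect",
      "url": "https://example.com/ai-trends-2024",
      "snippet": "The latest trends in artificial intelligence for 2024..."
    }
  ]
}
"""

data = json.loads(response_text)
# Collect result URLs sorted by rank
urls = [item["url"] for item in sorted(data["results"], key=lambda r: r["rank"])]
print(urls)  # ['https://example.com/ai-trends-2024']
```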

2. deep_research

Perform comprehensive research on a topic with content extraction.

Parameters:

  • topic (string, required): The topic to research
  • depth (integer, optional): Research depth 1-5 (default: 3)
  • max_sources (integer, optional): Maximum sources to analyze (5-50, default: 20)
  • provider (string, optional): Search provider (default: duckduckgo)

Example:

{
  "tool": "deep_research",
  "arguments": {
    "topic": "sustainable energy solutions",
    "depth": 4,
    "max_sources": 15,
    "provider": "duckduckgo"
  }
}

3. test_providers

Test all available search providers.

Parameters: None

Example:

{
  "tool": "test_providers",
  "arguments": {}
}

🤖 Integration with AI Agents

Claude Desktop Integration

Add to your Claude Desktop configuration (%APPDATA%\Claude\claude_desktop_config.json on Windows or ~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "websearch": {
      "command": "python",
      "args": ["-m", "src.server"],
      "cwd": "/absolute/path/to/python-mcp-websearch-server",
      "env": {
        "LOG_LEVEL": "INFO"
      }
    }
  }
}

Continue.dev Integration

Add to your Continue configuration (.continue/config.json):

{
  "mcpServers": [
    {
      "name": "websearch",
      "command": ["python", "-m", "src.server"],
      "cwd": "/path/to/python-mcp-websearch-server"
    }
  ]
}

Codeium Integration

{
  "mcp": {
    "servers": {
      "websearch": {
        "command": "python -m src.server",
        "working_directory": "/path/to/python-mcp-websearch-server"
      }
    }
  }
}

Custom MCP Client Integration

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

class WebSearchMCPClient:
    """Launches the server over stdio and calls its tools via an MCP session."""

    def __init__(self, server_path: str):
        # stdio_client spawns the server process itself; no manual Popen needed
        self.server_params = StdioServerParameters(
            command="python",
            args=["-m", "src.server"],
            cwd=server_path,
        )

    async def run(self):
        async with stdio_client(self.server_params) as (read, write):
            async with ClientSession(read, write) as session:
                await session.initialize()

                # Perform search
                results = await session.call_tool("web_search", {
                    "query": "Python machine learning libraries",
                    "provider": "duckduckgo",
                    "max_results": 5,
                })
                print(results)

                # Perform research
                research = await session.call_tool("deep_research", {
                    "topic": "quantum computing applications",
                    "depth": 3,
                    "max_sources": 10,
                })
                print(research)

# Usage example
asyncio.run(WebSearchMCPClient("/path/to/python-mcp-websearch-server").run())

🔧 Configuration

Environment Variables

  • LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR, default: INFO)
  • SEARCH_PROVIDER: Default search provider (duckduckgo, google, default: duckduckgo)
  • MAX_CONCURRENT_REQUESTS: Maximum concurrent search requests (default: 5)
  • REQUEST_TIMEOUT: Request timeout in seconds (default: 30.0)
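A cap like MAX_CONCURRENT_REQUESTS is typically enforced with an asyncio semaphore. The standalone sketch below shows the general pattern; it is an illustration, not the server's actual source:

```python
import asyncio
import os

# Assumption: the server enforces MAX_CONCURRENT_REQUESTS with a semaphore;
# this sketch demonstrates the pattern with a simulated request.
MAX_CONCURRENT = int(os.environ.get("MAX_CONCURRENT_REQUESTS", "5"))

async def limited_fetch(semaphore: asyncio.Semaphore, query: str) -> str:
    async with semaphore:  # at most MAX_CONCURRENT coroutines pass at once
        await asyncio.sleep(0.01)  # stand-in for a real search request
        return f"results for {query}"

async def main() -> list:
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    queries = [f"query {i}" for i in range(10)]
    return await asyncio.gather(*(limited_fetch(semaphore, q) for q in queries))

results = asyncio.run(main())
print(len(results))  # 10
```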

Provider Configuration

# Custom configuration
config = {
    "request_timeout": 45.0,
    "max_results": 50,
    "user_agent": "Custom Bot 1.0",
    "max_concurrent_extractions": 10
}

search_tool = SearchTool(config)

Search Provider Notes

  • DuckDuckGo: ✅ Primary provider - No API key required, highly reliable
  • Google: ⚠️ Secondary provider - May be blocked by anti-bot measures
  • Bing: 🚧 Future implementation - Will require API key
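Provider redundancy can be exploited by callers with a simple fallback loop: try the primary provider and move to the next on failure. The helper and stand-in providers below are hypothetical, shown only to illustrate the pattern:

```python
import asyncio

# Assumption: providers expose an async search(query, max_results) that may raise.
async def search_with_fallback(providers, query, max_results=10):
    last_error = None
    for provider in providers:
        try:
            return await provider.search(query, max_results)
        except Exception as exc:  # e.g. an anti-bot block on Google
            last_error = exc
    raise RuntimeError(f"all providers failed: {last_error}")

# Tiny stand-in providers for demonstration
class FailingProvider:
    async def search(self, query, max_results):
        raise ConnectionError("blocked")

class OkProvider:
    async def search(self, query, max_results):
        return [{"title": f"result for {query}"}]

results = asyncio.run(search_with_fallback([FailingProvider(), OkProvider()], "mcp"))
print(results)  # [{'title': 'result for mcp'}]
```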

🛠️ Development

Setup Development Environment

# Clone the repo
git clone https://github.com/abhilashjaiswal0110/python-mcp-websearch-server.git
cd python-mcp-websearch-server

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt

# Run tests
python test_server.py
python demo.py

Adding New Search Providers

  1. Create a new provider class in src/providers/:
from .base import BaseSearchProvider
from ..models.search import SearchResult, SearchResponse, SearchProvider

class NewSearchProvider(BaseSearchProvider):
    def get_provider_name(self) -> SearchProvider:
        return SearchProvider.NEW_PROVIDER
    
    async def search(self, query: str, max_results: int = 10) -> SearchResponse:
        # Implementation here
        pass
  2. Add to SearchProvider enum in src/models/search.py
  3. Register in src/tools/search.py
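The enum change in step 2 might look like the following; the existing members are inferred from the supported providers, and NEW_PROVIDER is a placeholder name:

```python
from enum import Enum

class SearchProvider(str, Enum):
    # Existing members (inferred from the supported provider list)
    DUCKDUCKGO = "duckduckgo"
    GOOGLE = "google"
    # Hypothetical new member added in step 2
    NEW_PROVIDER = "new_provider"

print(SearchProvider("new_provider").name)  # NEW_PROVIDER
```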

🚀 Production Deployment

Docker Deployment

# Build image
docker build -t mcp-websearch-server .

# Run container
docker run -d --name mcp-websearch \
  -e LOG_LEVEL=INFO \
  -e SEARCH_PROVIDER=duckduckgo \
  mcp-websearch-server

Systemd Service (Linux)

[Unit]
Description=MCP WebSearch Server
After=network.target

[Service]
Type=simple
User=your-user
WorkingDirectory=/path/to/python-mcp-websearch-server
ExecStart=/path/to/python-mcp-websearch-server/venv/bin/python -m src.server
Restart=always
Environment=LOG_LEVEL=INFO
Environment=SEARCH_PROVIDER=duckduckgo

[Install]
WantedBy=multi-user.target

📊 Performance Benchmarks

Provider     | Avg Response Time | Success Rate | Max Results | Notes
DuckDuckGo   | 1.0-3.0s          | 99%          | 100         | Primary provider
Google       | 0.5-0.9s          | 70%*         | 100         | May be blocked

*Google success rate varies due to anti-bot measures

✅ Production Readiness Checklist

  • Core Functionality: Web search and deep research working
  • Error Handling: Robust error handling and logging
  • Type Safety: Full type hints and validation
  • Async Performance: High-performance async operations
  • Content Extraction: Multiple extraction methods
  • Provider Redundancy: Multiple search providers
  • MCP Compliance: Full MCP specification compliance
  • Docker Support: Ready for containerized deployment
  • Documentation: Comprehensive documentation and examples
  • Testing: Comprehensive test suite

🤝 Contributing

We welcome contributions!

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License.

🙏 Acknowledgments

  • Built on the Model Context Protocol
  • Search functionality powered by DuckDuckGo and Google
  • Content extraction using Trafilatura and BeautifulSoup
  • Inspired by the need for reliable AI agent web search capabilities

Ready for Production Use 🚀 | MCP Compliant ✅ | High Performance