ola172/web-search-mcp-server
Search Tool
Overview
Search Tool is a modular Python framework for performing advanced web searches, scraping content from search results, and analyzing the retrieved information using AI-powered models. The project is designed for extensibility, allowing easy integration of new search engines, scrapers, and analyzers.
Features
- Custom Site Search: Search within a specified list of websites.
- Custom Domain Search: Restrict searches to specific domains (e.g., .edu, .gov).
- General Web Search: Perform open web searches.
- Content Scraping: Extracts main textual content from URLs using trafilatura.
- AI Analysis: Summarizes and analyzes scraped content using OpenAI models.
- Validation: Ensures URLs are valid before processing.
- Extensible Architecture: Easily add new searchers, scrapers, or analyzers.
Project Structure
search_tool/
├── src/
│   ├── analyzer/        # AI-powered analyzers (e.g., OpenAI)
│   ├── core/
│   │   ├── factory/     # Factories for searcher, scraper,
│   │   ├── interface/   # Abstract interfaces for extensibility
│   │   └── types.py     # Enums and constants
│   ├── mcp_servers/     # MCP server integration
│   ├── models/          # Pydantic models for data validation
│   ├── scraper/         # Web scrapers (e.g., Trafilatura)
│   ├── searcher/        # Search engine integrations
│   ├── tools/           # User-facing tool functions
│   └── utils/           # Utility functions (e.g., URL validation)
├── test.py              # Example/test script
├── requirements.txt     # Python dependencies
├── pyproject.toml       # Project metadata and dependencies
├── .env                 # Environment variables (e.g., API keys)
└── README.md            # Project documentation
Installation
- Clone the repository:
  git clone https://github.com/ola172/web-search-mcp-server.git
  cd search_tool
- Set up a virtual environment (recommended):
  python3 -m venv .venv
  source .venv/bin/activate
- Install dependencies:
  pip install -r requirements.txt
- Configure environment variables:
  - Copy .env.example to .env
  - Add your secrets (e.g., API keys) to .env
Usage
Core Tools
Each tool validates input, performs the search, scrapes the results, and analyzes the content.
- General Web Search: search_on_web
- Custom Sites Search: search_custom_sites
- Custom Domains Search: search_custom_domain
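The validate, search, scrape, analyze flow that each tool follows can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the project's actual code: `run_search_tool`, `is_valid_url`, and the three callables passed in are hypothetical stand-ins for the real searcher, scraper, and analyzer.

```python
from __future__ import annotations

from typing import Callable, Iterable
from urllib.parse import urlparse


def is_valid_url(url: str) -> bool:
    """Accept only absolute http(s) URLs that have a host part."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def run_search_tool(
    query: str,
    search: Callable[[str], Iterable[str]],
    scrape: Callable[[str], str],
    analyze: Callable[[str, list[str]], str],
) -> str:
    """Validate the input, search, scrape each valid result, then analyze."""
    if not query.strip():
        raise ValueError("query must not be empty")
    # Drop malformed URLs before any scraping happens.
    urls = [u for u in search(query) if is_valid_url(u)]
    # Skip pages for which the scraper returned no text.
    texts = [t for t in (scrape(u) for u in urls) if t]
    return analyze(query, texts)


if __name__ == "__main__":
    # Stub components (hypothetical) so the sketch runs without network access.
    def fake_search(q: str) -> list[str]:
        return ["https://example.com/a", "not-a-url"]

    def fake_scrape(url: str) -> str:
        return f"text from {url}"

    def fake_analyze(q: str, texts: list[str]) -> str:
        return f"{len(texts)} page(s) analyzed for {q!r}"

    print(run_search_tool("mcp servers", fake_search, fake_scrape, fake_analyze))
```

Passing the components in as callables keeps the sketch independent of any particular search engine or model; the real tools presumably obtain them from the factories instead.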
MCP Server Integration
The project includes an MCP server (web_search_server.py) that exposes the search tools as MCP tools.
Extending the Framework
- Add a new searcher: Implement the SearchInterface and register it in SearcherFactory.
- Add a new scraper: Implement the ScraperInterface and register it in ScraperFactory.
- Add a new analyzer: Implement the AnalyzerInterface and register it in AnalyzerFactory.
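As a rough sketch of the implement-and-register pattern described above: the class names SearchInterface and SearcherFactory come from this README, but the method names (`search`, `register`, `create`), their signatures, and the `StaticSearcher` class are assumptions made for illustration. Check the real definitions in src/core/ before relying on them.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class SearchInterface(ABC):
    """Assumed shape of the searcher interface: a single search method."""

    @abstractmethod
    def search(self, query: str, max_results: int = 5) -> list[str]:
        """Return a list of result URLs for the query."""


class SearcherFactory:
    """Assumed registry-style factory mapping names to searcher classes."""

    _registry: dict[str, type[SearchInterface]] = {}

    @classmethod
    def register(cls, name: str, searcher_cls: type[SearchInterface]) -> None:
        # Reject classes that do not implement the interface.
        if not issubclass(searcher_cls, SearchInterface):
            raise TypeError(f"{searcher_cls.__name__} must implement SearchInterface")
        cls._registry[name] = searcher_cls

    @classmethod
    def create(cls, name: str) -> SearchInterface:
        try:
            return cls._registry[name]()
        except KeyError:
            raise ValueError(f"unknown searcher: {name!r}") from None


class StaticSearcher(SearchInterface):
    """Toy searcher returning fixed URLs; stands in for a real engine."""

    def search(self, query: str, max_results: int = 5) -> list[str]:
        return ["https://example.com/1", "https://example.com/2"][:max_results]


SearcherFactory.register("static", StaticSearcher)

if __name__ == "__main__":
    searcher = SearcherFactory.create("static")
    print(searcher.search("anything", max_results=1))
```

A scraper or analyzer would presumably follow the same pattern against ScraperInterface/ScraperFactory and AnalyzerInterface/AnalyzerFactory.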
Configuration
- API Keys: Store sensitive keys (e.g., OpenAI) in the .env file.
- Search Engine IDs: For Google Custom Search, configure API_KEY and SEARCH_ENGINE_ID in the relevant modules.
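One possible way to keep these values out of the source files is to read them from the environment at startup. The sketch below is an assumption about how that could look, not how the project does it: `SearchSettings` and `load_settings` are hypothetical names, and treating API_KEY and SEARCH_ENGINE_ID as environment variables (rather than module constants) is a choice made for this example. It uses `load_dotenv` from python-dotenv when that package is installed and falls back to the plain process environment otherwise.

```python
import os
from dataclasses import dataclass
from typing import Mapping

try:
    # python-dotenv is listed in the dependencies; load .env if it is available.
    from dotenv import load_dotenv

    load_dotenv()
except ImportError:
    pass


@dataclass(frozen=True)
class SearchSettings:
    """Hypothetical container for the Google Custom Search settings."""

    api_key: str
    search_engine_id: str


def load_settings(env: Mapping[str, str] = os.environ) -> SearchSettings:
    """Read the required settings, failing early if any is missing or empty."""
    required = ("API_KEY", "SEARCH_ENGINE_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return SearchSettings(env["API_KEY"], env["SEARCH_ENGINE_ID"])
```

Failing at startup with the names of the missing variables tends to be easier to debug than an authentication error raised later from inside a search call.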
Dependencies
- openai
- trafilatura
- pydantic
- googlesearch-python
- python-dotenv
- google-api-python-client
See requirements.txt for the full list.
License
This project is for educational and research purposes. Please ensure compliance with the terms of service of any third-party APIs used.
Acknowledgements
- OpenAI
- Trafilatura
- Google Custom Search
For questions or contributions, please open an issue or pull request.