undici77/MCPWebSearch
The MCP Web Search Server is a privacy-focused search server that provides controlled access to web, social media, and archive search capabilities via the Model Context Protocol (MCP).
📂 MCP Web Search Server
A privacy‑focused web, social media, and archive search server exposing tools via the Model Context Protocol (MCP) for controlled access to external search capabilities.
Table of Contents
- Features
- Installation & Quick Start
- Command‑Line Options
- Integration with LM Studio
- MCP API Overview
- Available Tools
- Security Features
🎯 Features
- Parallel search across multiple privacy‑focused web engines.
- Social media lookup for public content on major platforms.
- Archive retrieval from Wayback Machine, archive.today, Google Cache and others.
- Dynamic listing of supported engines and archive services.
- Result caching with LRU eviction to speed up repeated queries.
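The LRU eviction mentioned in the last bullet can be sketched as follows. This is a minimal illustration built on `collections.OrderedDict`; the class name `LRUCache`, its methods and the default capacity are assumptions made for the example, not the server's actual implementation.

```python
from collections import OrderedDict
from typing import Hashable, Optional


class LRUCache:
    """Tiny least-recently-used cache (hypothetical names, illustrative capacity)."""

    def __init__(self, capacity: int = 128) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, str]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[str]:
        """Return the cached value, or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: str) -> None:
        """Store a value and evict the oldest entry if the cache is full."""
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry

    def clear(self) -> None:
        """Empty the cache, similar in spirit to the clear_cache tool."""
        self._items.clear()
```

A cache key could be a tuple such as `("web_search", query, engine, max_results)`, so that a repeated identical query is served without contacting the engines again.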
📦 Installation & Quick Start
# Clone the repository (if applicable)
git clone https://github.com/undici77/MCPWebSearch.git
cd MCPWebSearch
# Run the startup script (adjust name if different)
./run.sh -d /path/to/working/directory
The startup script will:
1️⃣ Create & activate a Python virtual environment (.venv).
2️⃣ Install all required dependencies from requirements.txt.
3️⃣ Launch the MCP Search Server (main.py), which listens on stdin/stdout for JSON‑RPC messages.
📌 Ensure the startup script is executable:
chmod +x run.sh
⚙️ Command‑Line Options
| Option | Description |
|---|---|
| -d, --directory | Path to the working directory (default: current process dir). |
The server itself does not require additional CLI flags; all configuration is performed via JSON‑RPC.
🤝 Integration with LM Studio
Add an entry to your mcp.json so LM Studio can start the server automatically:
{
  "mcpServers": {
    "web-search": {
      "command": "/absolute/path/to/run.sh",
      "args": [
        "-d",
        "/absolute/path/to/working/directory"
      ],
      "env": { "WORKING_DIR": "." }
    }
  }
}
📌 Make the script executable (`chmod +x /absolute/path/to/run.sh`) and run `./run.sh` once to install the virtual environment before launching LM Studio.
📡 MCP API Overview
All communication follows JSON‑RPC 2.0 over stdin/stdout.
initialize
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {}
}
Response: protocol version (2024-11-05), server capabilities (tool enumeration) and basic server info (name, version).
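A client may want to check the handshake result before calling any tool. The sketch below parses one response line and verifies the protocol version. The field names (`protocolVersion`, `capabilities`, `serverInfo`) follow the MCP specification; the function name `parse_initialize_response` is hypothetical, and the exact payload this server returns is an assumption.

```python
import json
from typing import Any, Dict

EXPECTED_PROTOCOL = "2024-11-05"  # version quoted in this README


def parse_initialize_response(line: str) -> Dict[str, Any]:
    """Parse one JSON-RPC response line and extract the handshake details."""
    message = json.loads(line)
    if "error" in message:
        raise RuntimeError(f"initialize failed: {message['error']}")
    result = message["result"]
    version = result.get("protocolVersion")
    if version != EXPECTED_PROTOCOL:
        raise RuntimeError(f"unexpected protocol version: {version!r}")
    info = result.get("serverInfo", {})
    return {
        "name": info.get("name"),
        "version": info.get("version"),
        "capabilities": result.get("capabilities", {}),
    }
```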
tools/list
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/list",
  "params": {}
}
Response: an array of tool definitions (name, description, input schema).
tools/call
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "<tool_name>",
    "arguments": { … }
  }
}
Note: The tool identifier key is name, not tool.
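The three request shapes above differ only in `method` and `params`, so a client can build them with a small helper. This is a sketch, not part of the project: the function names are hypothetical, and the one-JSON-document-per-line framing is an assumption based on the usual MCP stdio transport.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_ids = count(1)  # JSON-RPC ids only need to be unique per connection


def make_request(method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request such as initialize or tools/list."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": method,
        "params": params or {},
    }


def make_tool_call(name: str, arguments: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a tools/call request; the tool is identified by the 'name' key."""
    return make_request("tools/call", {"name": name, "arguments": arguments or {}})


def encode_line(message: Dict[str, Any]) -> bytes:
    """Serialise a message as a single newline-terminated line (assumed framing)."""
    return (json.dumps(message, ensure_ascii=False) + "\n").encode("utf-8")
```

The encoded bytes can be written to the stdin of a server process started with `subprocess.Popen`, after which one line read from its stdout should hold the matching response.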
🛠️ Available Tools
web_search
Search the web using multiple privacy‑focused engines in parallel.
| Name | Type | Required | Description |
|---|---|---|---|
| query | string | ✅ | Search query (max 500 characters). |
| engine | string | ❌ (default all) | Engine to use (duckduckgo, brave, startpage, ecosia, mojeek, yandex or all). |
| max_results | integer | ❌ (default 20) | Max results per engine (1‑50). |
Example
{
  "jsonrpc": "2.0",
  "id": 10,
  "method": "tools/call",
  "params": {
    "name": "web_search",
    "arguments": {
      "query": "privacy focused search engines",
      "engine": "duckduckgo",
      "max_results": 15
    }
  }
}
The server returns a formatted text block containing titles, URLs and snippets from each selected engine.
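The parameter constraints in the table can be checked on the client side before a request is sent. The sketch below applies the documented defaults and limits. The function name is hypothetical, and rejecting an over-long query is an assumption: the server may truncate it instead.

```python
from typing import Any, Dict

ENGINES = {"duckduckgo", "brave", "startpage", "ecosia", "mojeek", "yandex", "all"}
MAX_QUERY_LENGTH = 500


def validate_web_search_args(arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Return normalised web_search arguments or raise ValueError."""
    query = arguments.get("query")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query is required and must be a non-empty string")
    if len(query) > MAX_QUERY_LENGTH:
        raise ValueError(f"query exceeds {MAX_QUERY_LENGTH} characters")

    engine = arguments.get("engine", "all")  # documented default
    if not isinstance(engine, str) or engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")

    max_results = arguments.get("max_results", 20)  # documented default
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        raise ValueError("max_results must be an integer")
    if not 1 <= max_results <= 50:
        raise ValueError("max_results must be between 1 and 50")

    return {"query": query.strip(), "engine": engine, "max_results": max_results}
```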
social_search
Search public content on major social‑media platforms.
| Name | Type | Required | Description |
|---|---|---|---|
| query | string | ✅ | Search query (max 500 characters). |
| platform | string | ❌ (default all) | Platform to search (twitter, reddit, youtube, github, stackoverflow, medium, pinterest, tiktok, instagram, facebook, linkedin or all). |
Example
{
  "jsonrpc": "2.0",
  "id": 11,
  "method": "tools/call",
  "params": {
    "name": "social_search",
    "arguments": {
      "query": "AI ethics research",
      "platform": "reddit"
    }
  }
}
The response contains direct URLs that can be opened in a browser.
archives_search
Find archived versions of a URL across multiple web‑archive services.
| Name | Type | Required | Description |
|---|---|---|---|
| url | string | ✅ | Complete URL (must include http:// or https://). |
| service | string | ❌ (default all) | Archive service (wayback, archive_today, google_cache, bing_cache, yandex_cache, cachedview, ghostarchive or all). |
| check_availability | boolean | ❌ (default false) | When true, the server queries the Wayback Machine API for snapshot statistics. |
Example
{
  "jsonrpc": "2.0",
  "id": 12,
  "method": "tools/call",
  "params": {
    "name": "archives_search",
    "arguments": {
      "url": "https://example.com",
      "service": "wayback",
      "check_availability": true
    }
  }
}
The response lists archive URLs and, if requested, snapshot counts and timestamps.
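The `url` requirement and the availability check can be illustrated with the standard library alone. In this sketch the function names are hypothetical and the "host must contain a dot" rule is a simplifying assumption about what counts as a valid domain. The availability endpoint is the public Wayback Machine one; whether the server calls exactly this endpoint is not stated in this README.

```python
from urllib.parse import urlencode, urlsplit


def is_valid_http_url(url: str) -> bool:
    """Accept only http:// or https:// URLs whose host looks like a domain."""
    try:
        parts = urlsplit(url.strip())
        host = parts.hostname or ""
    except ValueError:  # raised for malformed input such as a broken IPv6 literal
        return False
    if parts.scheme not in ("http", "https"):
        return False
    # Simplified domain check: at least one dot, none at either end
    return "." in host and not host.startswith(".") and not host.endswith(".")


def wayback_availability_url(url: str) -> str:
    """Build a query against the Wayback Machine availability endpoint."""
    if not is_valid_http_url(url):
        raise ValueError(f"not a complete http(s) URL: {url!r}")
    return "https://archive.org/wayback/available?" + urlencode({"url": url})
```

Validating before building the lookup URL keeps inputs such as `javascript:` URIs or bare host names from ever reaching an external service.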
list_engines
List all available privacy‑focused search engines.
| Name | Type | Required | Description |
|---|---|---|---|
| (No parameters) | — | — | — |
Example
{
  "jsonrpc": "2.0",
  "id": 13,
  "method": "tools/call",
  "params": {
    "name": "list_engines",
    "arguments": {}
  }
}
The server returns a markdown‑formatted overview of each engine and usage notes.
list_archives_services
List all supported web‑archive services.
| Name | Type | Required | Description |
|---|---|---|---|
| (No parameters) | — | — | — |
Example
{
  "jsonrpc": "2.0",
  "id": 14,
  "method": "tools/call",
  "params": {
    "name": "list_archives_services",
    "arguments": {}
  }
}
The response includes a description of each service, its ID and key features.
clear_cache
Clear the internal search‑result cache.
| Name | Type | Required | Description |
|---|---|---|---|
| (No parameters) | — | — | — |
Example
{
  "jsonrpc": "2.0",
  "id": 15,
  "method": "tools/call",
  "params": {
    "name": "clear_cache",
    "arguments": {}
  }
}
The server replies with a confirmation message.
🔐 Security Features
- Query sanitisation – strips control characters, removes HTML tags and enforces `MAX_QUERY_LENGTH` (500).
- Strict URL validation – accepts only `http://` or `https://` schemes with a valid domain.
- Blocked patterns – regexes prevent `<script>` injection, `javascript:` URIs and event‑handler attributes.
- Input schema enforcement – each tool validates required fields via the JSON‑RPC `inputSchema`.
- Rate limiting – an asyncio semaphore caps concurrent external requests (`MAX_CONCURRENT_SEARCHES`).
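Two of the measures above, query sanitisation and semaphore-based rate limiting, can be sketched together. Everything here is illustrative: the regular expressions, the choice to reject rather than strip blocked patterns, the helper names and the value of `MAX_CONCURRENT_SEARCHES` are assumptions, since this README names the constants but does not show the code.

```python
import asyncio
import re
from typing import Awaitable, Callable, List

MAX_QUERY_LENGTH = 500
MAX_CONCURRENT_SEARCHES = 5  # assumed value; the README does not state it

_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")
_HTML_TAGS = re.compile(r"<[^>]*>")
# Illustrative blocklist: script tags, javascript: URIs, a few event handlers
_BLOCKED = re.compile(
    r"<\s*script|javascript:|\bon(?:click|error|load|mouseover)\s*=",
    re.IGNORECASE,
)


def sanitize_query(raw: str) -> str:
    """Reject blocked patterns, then strip control characters and HTML tags."""
    if _BLOCKED.search(raw):
        raise ValueError("query contains a blocked pattern")
    text = _CONTROL_CHARS.sub(" ", raw)
    text = _HTML_TAGS.sub("", text)
    text = " ".join(text.split())  # collapse runs of whitespace
    return text[:MAX_QUERY_LENGTH]


async def run_limited(
    tasks: List[Callable[[], Awaitable[str]]],
    limit: int = MAX_CONCURRENT_SEARCHES,
) -> List[str]:
    """Run all tasks, allowing at most `limit` of them in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(task: Callable[[], Awaitable[str]]) -> str:
        async with semaphore:
            return await task()

    # gather preserves the order of the input tasks in its result list
    return list(await asyncio.gather(*(guarded(task) for task in tasks)))
```

Each task is a zero-argument callable returning a coroutine, for example one per search engine, so the semaphore bounds how many external requests are open at the same time.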
© 2025 Undici77 – All rights reserved.