# MCP Research Tools Server
## Project Overview
This project implements a Model Context Protocol (MCP) server designed to provide various research-oriented tools, including news summarization and scientific paper retrieval. It acts as a standardized interface, enabling AI agents or other client applications to fetch and process information from diverse sources without needing to interact directly with various external APIs or manage complex processing logic.
The server is built using Python with the FastAPI framework. It currently offers:
- A news tool that retrieves news via the Tavily Search API and summarizes content using a configurable LLM (e.g., a local model such as Mistral 7B via Ollama, or OpenAI).
- An arXiv tool that retrieves scientific paper metadata from the arXiv API.
Future enhancements will include summarizing arXiv abstracts/papers and adding more research tools.
## Key Features (as of v0.2.0)
- Standardized MCP Interface: Adheres to a defined request/response structure (sketched after this list).
- Multi-Tool Support: A single endpoint (`/mcp/tools`) routes requests to different tools based on `tool_id`.
- News Tool (`news_tool`):
  - Fetches news article search results using the Tavily Search API.
  - Summarizes news content using a locally run LLM (e.g., Mistral 7B via Ollama) or OpenAI.
- ArXiv Tool (`arxiv_tool`):
  - Fetches scientific paper metadata (titles, authors, abstracts, URLs) from the arXiv API.
  - Parses XML from arXiv API responses.
  - Includes a placeholder for summarizing arXiv abstracts.
- Configurable LLM Backend: Supports local LLMs via Ollama and OpenAI, selectable via configuration.
- Abstraction of External Services: Shields clients from Tavily and arXiv API specifics.
- Error Handling: Provides standardized MCP error messages.
- Asynchronous Operations: Leverages FastAPI's async capabilities.
- Data Validation: Uses Pydantic for request/response validation.
- Configuration Management: Loads API keys and settings from `.env`.
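
As an illustration of that request/response structure, a minimal Pydantic sketch of the MCP envelope might look like the following. The field names are taken from the request and response examples later in this README; the actual definitions live in `app/models.py` and may differ.

```python
# Hypothetical sketch of the MCP envelope; the real models live in app/models.py.
from typing import Any, Dict, Optional

from pydantic import BaseModel


class MCPRequest(BaseModel):
    protocol_version: str = "1.0"
    tool_id: str                     # e.g., "news_tool" or "arxiv_tool"
    method: str                      # e.g., "get_news_summary"
    parameters: Dict[str, Any] = {}  # tool-specific parameters


class MCPResponse(BaseModel):
    protocol_version: str = "1.0"
    tool_id: str
    status: str                      # "success" or "error"
    data: Optional[Dict[str, Any]] = None
    error: Optional[Any] = None      # standardized MCP error details, null on success
```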
## Architecture
- The Client (AI agent or test script) sends an MCP-formatted JSON request to the `/mcp/tools` endpoint, specifying `tool_id` and `method`.
- The MCP Research Tools Server (this application) parses the request and routes it based on `tool_id`.
- The corresponding tool's service handler (e.g., `news_service`, `arxiv_service`):
  - Calls the relevant external API (Tavily, arXiv).
  - Processes the data (e.g., parses XML for arXiv, prepares content for summarization).
  - If summarization is involved, calls the configured LLM (e.g., local Mistral 7B via Ollama) through LangChain.
- The MCP server transforms the result into the standardized MCP response format.
- The MCP server sends the MCP-formatted JSON response back to the client.
```mermaid
sequenceDiagram
    participant Client as AI Agent / Test Script
    participant MCPServer as MCP Research Tools Server (FastAPI)
    participant ExternalAPI as Tavily / arXiv API
    participant LLMService as Local LLM (Ollama) / OpenAI

    Client->>+MCPServer: 1. Request (tool_id, method, params) @ /mcp/tools
    MCPServer->>ExternalAPI: 2. Fetch Data (e.g., news, papers)
    ExternalAPI-->>MCPServer: 3. Raw Data
    MCPServer->>LLMService: 4. Request Summarization (if applicable)
    LLMService-->>MCPServer: 5. Summarized Text
    MCPServer-->>-Client: 6. Processed Data (MCP Format)
```
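
As a concrete version of this flow, the sketch below posts an MCP request with the `requests` library and prints the result. It assumes the server is running locally on port 8001, as described under "Running the MCP Server"; the payload shape matches the examples in the API section.

```python
# Minimal client sketch for the /mcp/tools endpoint.
# Assumes the server is running locally on port 8001.
import requests

MCP_URL = "http://localhost:8001/mcp/tools"

payload = {
    "protocol_version": "1.0",
    "tool_id": "news_tool",
    "method": "get_news_summary",
    "parameters": {"query": "latest developments in AI ethics"},
}

response = requests.post(MCP_URL, json=payload, timeout=120)
response.raise_for_status()

body = response.json()
if body["status"] == "success":
    print(body["data"]["summary"])
else:
    print("MCP error:", body["error"])
```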
## Prerequisites
- Python 3.8+ (Python 3.10+ recommended)
- API Keys / Setups:
  - Tavily API Key (`TAVILY_API_KEY` in `.env`) for the news tool.
  - Ollama installed and a model pulled (e.g., Mistral 7B via `ollama pull mistral`) if `PREFERRED_LLM_PROVIDER="local"`. The Ollama server should be accessible at its default address (`http://localhost:11434`); a quick availability check is sketched after this list.
  - (Optional) OpenAI API Key (`OPENAI_API_KEY` in `.env`) if `PREFERRED_LLM_PROVIDER="openai"`.
- The arXiv API is public and does not require a key for basic search.
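
If you use the local provider, the following minimal sketch checks that Ollama is reachable and that a Mistral model has been pulled. It assumes the default Ollama address above and uses Ollama's `/api/tags` endpoint, which lists locally available models.

```python
# Quick check that the Ollama server is up and a Mistral model is available.
# Assumes the default address mentioned above (http://localhost:11434).
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()

models = [m["name"] for m in resp.json().get("models", [])]
print("Available Ollama models:", models)
if not any(name.startswith("mistral") for name in models):
    print("Mistral not found; run `ollama pull mistral` first.")
```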
## Setup and Installation
1. Clone the repository:

   ```bash
   git clone <your-repository-url>
   cd mcp-research-server
   ```

   (Ensure your project folder is named `mcp-research-server`, or adjust as needed.)

2. Create and activate a virtual environment:

   ```bash
   python3 -m venv venv
   source venv/bin/activate            # macOS/Linux
   # Windows (Command Prompt): venv\Scripts\activate.bat
   # Windows (PowerShell):     venv\Scripts\Activate.ps1
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Configure API keys and settings in `.env` (see the sample sketch after this list):
   - Create a `.env` file in the project root (copy `.env.example` if one is provided, or create it manually).
   - Add your `TAVILY_API_KEY`.
   - Set `PREFERRED_LLM_PROVIDER` (e.g., `"local"` or `"openai"`).
   - If using OpenAI, add your `OPENAI_API_KEY`.
   - Configure `LOCAL_LLM_MODEL_NAME` (e.g., `"mistral:latest"`) and `LOCAL_LLM_API_URL` if they differ from the defaults in `app/core/config.py`.
   - Ensure `.env` is listed in your `.gitignore` file.
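
For reference, a minimal `.env` might look like the sketch below. The variable names are the ones listed above; the placeholder values and defaults (`local` provider, `mistral:latest`, Ollama at `http://localhost:11434`) are assumptions that should be checked against `app/core/config.py`.

```env
# Sample .env sketch; replace placeholders with real values.
TAVILY_API_KEY=tvly-your-key-here
# "local" (Ollama) or "openai"
PREFERRED_LLM_PROVIDER=local
# Only needed when PREFERRED_LLM_PROVIDER="openai"
OPENAI_API_KEY=sk-your-key-here
LOCAL_LLM_MODEL_NAME=mistral:latest
LOCAL_LLM_API_URL=http://localhost:11434
```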
## Running the MCP Server
1. Ensure your virtual environment is activated.
2. If using a local LLM via Ollama, ensure your Ollama server is running and the model is available.
3. Navigate to the project root directory (`mcp-research-server`).
4. Start the server with Uvicorn:

   ```bash
   uvicorn app.main:app --reload --port 8001
   ```

The server will be accessible at `http://localhost:8001`.
## API Documentation (Auto-Generated)
FastAPI automatically generates interactive API documentation:
- Swagger UI: `http://localhost:8001/docs`
- ReDoc: `http://localhost:8001/redoc`
## API Endpoint: `/mcp/tools`
This single endpoint handles requests for all available tools. The specific tool and method are determined by the `tool_id` and `method` fields in the request body.
### News Tool (`tool_id: "news_tool"`)
- Method: `get_news_summary`
- Request Parameters (`parameters` object):
  - `query` (string, required): Search query for news.
  - `max_summary_sentences` (integer, optional, default: 3): Hint for summary length (the actual sentence count may vary with the LLM).
  - `include_sources` (boolean, optional, default: true): Whether to include sources in the response.
- Example Request:

  ```json
  {
    "protocol_version": "1.0",
    "tool_id": "news_tool",
    "method": "get_news_summary",
    "parameters": {
      "query": "latest developments in AI ethics",
      "include_sources": true
    }
  }
  ```
- Example Success Response (summarized by a local LLM):

  ```json
  {
    "protocol_version": "1.0",
    "tool_id": "news_tool",
    "status": "success",
    "data": {
      "query_processed": "latest developments in AI ethics",
      "summary": "Title: AI Ethics: Challenges, Importance, and Future\n\n The article discusses the importance of Artificial Intelligence (AI) ethics... and addressing ethical considerations in emerging AI applications. The article emphasizes the need for continued development of AI while mitigating potential risks.",
      "articles_processed_count": 5,
      "sources": [
        {"title": "AI Ethics : Challenges, Importance, and Future - GeeksforGeeks", "url": "https://www.geeksforgeeks.org/ai-ethics/"},
        {"title": "The future of ethics in AI: challenges and opportunities", "url": "https://link.springer.com/article/10.1007/s00146-023-01644-x"}
      ]
    },
    "error": null
  }
  ```
### ArXiv Tool (`tool_id: "arxiv_tool"`)
- Method: `search_papers`
- Request Parameters (`parameters` object):
  - `search_query` (string, required): arXiv search query (e.g., `au:Hinton AND cat:cs.LG`).
  - `max_results` (integer, optional, default: value from config, e.g., 10): Maximum number of papers to return.
  - `summarize_abstracts` (boolean, optional, default: false): Whether to request summarization of abstracts (the current implementation returns a placeholder message).
- Example Request:

  ```json
  {
    "protocol_version": "1.0",
    "tool_id": "arxiv_tool",
    "method": "search_papers",
    "parameters": {
      "search_query": "au:Lecun AND ti:deep learning",
      "max_results": 2,
      "summarize_abstracts": false
    }
  }
  ```
- Example Success Response (Metadata Fetch):

  ```json
  {
    "protocol_version": "1.0",
    "tool_id": "arxiv_tool",
    "status": "success",
    "data": {
      "query_processed": "au:Lecun AND ti:deep learning",
      "papers_found": 2,
      "papers": [
        {
          "arxiv_id": "2211.01340v3",
          "title": "POLICE: Provably Optimal Linear Constraint Enforcement for Deep Neural Networks",
          "authors": ["Randall Balestriero", "Yann LeCun"],
          "published_date": "2022-11-02T17:48:52Z",
          "updated_date": "2023-03-10T16:23:19Z",
          "summary_abstract": "Deep Neural Networks (DNNs) outshine alternative function approximators...",
          "paper_url": "http://arxiv.org/abs/2211.01340v3",
          "pdf_url": "http://arxiv.org/pdf/2211.01340v3",
          "primary_category": "cs.LG",
          "categories": ["cs.LG", "cs.CV", "stat.ML"],
          "generated_summary": null
        },
        {
          "arxiv_id": "2401.11188v1",
          "title": "Fast and Exact Enumeration of Deep Networks Partitions Regions",
          "authors": ["Randall Balestriero", "Yann LeCun"],
          "published_date": "2024-01-20T09:51:52Z",
          "updated_date": "2024-01-20T09:51:52Z",
          "summary_abstract": "One fruitful formulation of Deep Networks (DNs) enabling their theoretical study...",
          "paper_url": "http://arxiv.org/abs/2401.11188v1",
          "pdf_url": "http://arxiv.org/pdf/2401.11188v1",
          "primary_category": "cs.LG",
          "categories": ["cs.LG", "cs.AI"],
          "generated_summary": null
        }
      ]
    },
    "error": null
  }
  ```
## Project Structure
```text
mcp-research-server/            # Main project folder (ensure this matches your folder name)
├── .vscode/
│   └── settings.json
├── app/
│   ├── __init__.py
│   ├── main.py                 # FastAPI app, routes to /mcp/tools
│   ├── models.py               # Pydantic models for all tools
│   ├── services/               # Package for service logic
│   │   ├── __init__.py
│   │   ├── news_service.py
│   │   └── arxiv_service.py
│   └── core/
│       ├── __init__.py
│       └── config.py
├── tests/
│   ├── __init__.py
│   └── ...                     # placeholder for test files
├── .env                        # Local environment variables (NOT COMMITTED)
├── .gitignore
├── CHANGELOG.md
├── requirements.txt
├── README.md                   # This file
└── ...                         # other root files, e.g., news_agent_cli.py (placeholder), test_mcp_client_news.py (placeholder)
```
## License
(Specify your license, e.g., MIT License)