PRIDE-Archive/pride-mcp-server
The MCP PRIDE Archive Search Server is a Model Context Protocol-compliant API server designed to facilitate AI-driven interactions with proteomics datasets from the PRIDE Archive.
PRIDE MCP Server
A Model Context Protocol (MCP) server for accessing PRIDE Archive proteomics data.
Overview
This MCP server provides tools for searching and retrieving proteomics data from the PRIDE Archive database. It implements the Model Context Protocol so that AI assistants can access proteomics data programmatically. Search is facets-first: the system always retrieves the available facets before searching, uses them to choose filters, and then runs the search with those filters for more precise results. It automatically retrieves detailed project information and presents results in a clean format with direct links to EBI project pages.
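The facets-first flow can be illustrated with a minimal sketch. This is not the server's actual implementation: the `select_filters` helper, the naive substring matching, and the shape of the facets dictionary are all assumptions made for illustration. Only the `field==value` filter syntax is taken from the usage example in this README.

```python
from typing import Dict, List


def select_filters(query: str, facets: Dict[str, List[str]]) -> str:
    """Pick facet values that appear in the query and join them as filters.

    `facets` maps a facet field (e.g. "organisms") to its available values.
    This shape is an assumption; the real facets response may differ.
    """
    query_lower = query.lower()
    selected = []
    for field, values in facets.items():
        for value in values:
            # Naive matching: a facet value is chosen when it occurs verbatim
            # (case-insensitively) in the user's query.
            if value.lower() in query_lower:
                selected.append(f"{field}=={value}")
    return ",".join(selected)


# Hypothetical facet values, standing in for a get_pride_facets response.
example_facets = {
    "organisms": ["Homo sapiens (human)", "Mus musculus (mouse)"],
    "diseases": ["Breast cancer", "Alzheimer's disease"],
}

filters = select_filters(
    "breast cancer studies in Homo sapiens (human)", example_facets
)
print(filters)
```

Because filters are built only from values that the facets step returned, the search never sends a filter value the archive does not recognize.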
Features
- PRIDE Archive Integration: Direct access to PRIDE EBI proteomics database
- Intelligent Search: AI-powered natural language search with automatic project details retrieval
- Facets-Enhanced Search: Always calls facets first to determine optimal filters for more precise searches
- Clean Response Format: Professional, research-oriented responses with direct links to EBI project pages
- Advanced Filtering: Automatic filter selection based on user keywords and available facets
- Project Details: Retrieve detailed information about proteomics projects
- File Access: Get file information and download links
- MCP Protocol: Standard Model Context Protocol implementation
- Analytics & Database: SQLite database for tracking questions, response times, and usage analytics
- Slack Integration: Real-time notifications and analytics reports via Slack webhooks
- API Endpoints: RESTful API for accessing analytics data and system statistics
- Analytics Dashboard: Web-based dashboard for visualizing usage patterns and system performance
Quick Start
Prerequisites
- Python 3.8+
- uv (recommended) or pip
Installation
# Clone the repository
git clone <repository-url>
cd pride-mcp-server
# Install dependencies
uv sync
# Start both MCP server and AI conversational UI
uv run python start_services.py
# Alternative: Use the convenience script
./start.sh
The services will start on:
- MCP Server: http://127.0.0.1:9000
- AI Conversational UI: http://127.0.0.1:9090
- Analytics Dashboard: http://127.0.0.1:8080/analytics_dashboard.html
MCP Server Integration
The PRIDE Archive MCP Server can be integrated with various AI tools like Claude Desktop, ChatGPT, Cursor IDE, and more.
Quick Help
uv run python help_command.py
Integration Guides
- Claude Desktop:
uv run python help_command.py integration claude
- Cursor IDE:
uv run python help_command.py integration cursor
- ChatGPT:
uv run python help_command.py integration chatgpt
Tool Documentation
uv run python help_command.py tool <tool_name>
Configuration Files
All integration configurations are available in the help/ directory:
- help/README.md: Complete integration guide
- help/claude_desktop_config.json: Claude Desktop configuration
- help/cursor_config.json: Cursor IDE configuration
- help/chatgpt_config.json: ChatGPT configuration
- help/vscode_config.json: VS Code configuration
- help/custom_config.json: Generic configuration
Available Tools
get_pride_facets
Retrieves available filter values from PRIDE Archive.
Parameters:
- facet_page_size (optional): Number of facet values per page (default: 100)
- facet_page (optional): Page number for pagination (default: 0)
fetch_projects
Searches for proteomics projects in PRIDE Archive.
Parameters:
- keyword (required): Search keyword
- filters (optional): Comma-separated filters using exact values from facets
- page_size (optional): Results per page (default: 25)
- page (optional): Page number (default: 0)
- sort_direction (optional): ASC or DESC (default: DESC)
- sort_fields (optional): Fields to sort by (default: downloadCount)
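As a rough sketch, the arguments for fetch_projects could be assembled with a small helper like the one below. `build_fetch_projects_args` is a hypothetical name, not part of the server. The defaults mirror the parameter list above, and the `field==value` filter syntax follows the usage example in this README.

```python
from typing import Dict, Optional


def build_fetch_projects_args(
    keyword: str,
    filters: Optional[Dict[str, str]] = None,
    page_size: int = 25,
    page: int = 0,
    sort_direction: str = "DESC",
    sort_fields: str = "downloadCount",
) -> dict:
    """Build an argument dictionary for the fetch_projects tool."""
    if not keyword:
        raise ValueError("keyword is required")
    if sort_direction not in ("ASC", "DESC"):
        raise ValueError("sort_direction must be ASC or DESC")

    args = {
        "keyword": keyword,
        "page_size": page_size,
        "page": page,
        "sort_direction": sort_direction,
        "sort_fields": sort_fields,
    }
    if filters:
        # Join field/value pairs into the comma-separated form, e.g.
        # "organisms==Homo sapiens (human),diseases==Breast cancer".
        args["filters"] = ",".join(f"{k}=={v}" for k, v in filters.items())
    return args


args = build_fetch_projects_args(
    "cancer",
    filters={"organisms": "Homo sapiens (human)", "diseases": "Breast cancer"},
)
print(args["filters"])
```

Filter values should be copied exactly from a get_pride_facets response, since the parameter description asks for exact facet values.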
get_project_details
Gets detailed information about a specific PRIDE project.
Parameters:
- project_accession (required): PRIDE project accession (e.g., PXD000001)
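A client may want to check an accession before calling the tool. The sketch below assumes the pattern "PXD" followed by six digits, inferred only from the PXD000001 example above. Other accession forms may exist and would be rejected by this check, so treat it as illustrative rather than as the server's validation rule.

```python
import re

# Assumed pattern: "PXD" plus six digits, based on the PXD000001 example.
ACCESSION_PATTERN = re.compile(r"PXD\d{6}")


def is_valid_accession(accession: str) -> bool:
    """Return True if the accession matches the assumed PXD format."""
    # Surrounding whitespace and letter case are normalized before matching.
    return ACCESSION_PATTERN.fullmatch(accession.strip().upper()) is not None


print(is_valid_accession("PXD000001"))
print(is_valid_accession("PXD1"))
```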
get_project_files
Gets file information for a specific PRIDE project.
Parameters:
- project_accession (required): PRIDE project accession
- file_type (optional): Filter for specific file types
analyze_with_ai
Analyzes proteomics data using AI services.
Parameters:
- data (required): Data to analyze (JSON string or text)
- analysis_type (optional): Type of analysis (default: general)
- context (optional): Additional context
Usage Examples
Using with MCP Client
from mcp_client_tools import MCPClient
# Connect to the server
client = MCPClient("http://127.0.0.1:9000")
# Get available facets
facets = client.call_tool("get_pride_facets", {})
# Search for projects
projects = client.call_tool("fetch_projects", {
    "keyword": "cancer",
    "filters": "organisms==Homo sapiens (human),diseases==Breast cancer"
})
Using with AI Assistants
The server can be integrated with AI assistants that support the MCP protocol:
# Example with Claude Desktop
claude --mcp-server pride-mcp-server
Analytics & Database Features
Database Storage
The system automatically stores all questions and responses in a SQLite database (pride_questions.db) with the following information:
- Questions: User queries with timestamps
- Response Times: Performance metrics for each interaction
- Tool Usage: Which MCP tools were called
- Success/Failure: Whether the request completed successfully
- Metadata: Additional context about the interaction
API Endpoints
The server provides RESTful API endpoints for accessing analytics data:
# Health check
GET /api/health
# Get questions with filtering
GET /api/questions?limit=100&user_id=user123&start_date=2024-01-01
# Get analytics data
GET /api/analytics?days=30
# Get daily statistics
GET /api/analytics/daily?date=2024-01-15
# Get system statistics
GET /api/stats
# Export questions data
GET /api/export/questions?format=csv&start_date=2024-01-01
# Store a question (used by UI)
POST /api/questions
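Query strings for these endpoints can be built with the standard library rather than by hand. The sketch below is illustrative: `questions_url` is a hypothetical helper, and the base URL is an assumption, since this README does not state which host and port serve the analytics API.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumption: the API is reachable on the MCP server address.
BASE_URL = "http://127.0.0.1:9000"


def questions_url(
    base_url: str,
    limit: int = 100,
    user_id: Optional[str] = None,
    start_date: Optional[str] = None,
) -> str:
    """Build a GET /api/questions URL, skipping filters that are not set."""
    params = {"limit": limit}
    if user_id is not None:
        params["user_id"] = user_id
    if start_date is not None:
        params["start_date"] = start_date
    return f"{base_url}/api/questions?{urlencode(params)}"


url = questions_url(BASE_URL, user_id="user123", start_date="2024-01-01")
print(url)
```

Using `urlencode` also escapes values that contain spaces or reserved characters.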
Analytics Dashboard
A web-based dashboard provides real-time visualization of system usage:
# Start the analytics dashboard
python serve_analytics.py --port 8080
Features:
- Real-time statistics (questions, success rate, response times)
- Interactive charts showing daily usage patterns
- Recent questions table with status and performance metrics
- Data export functionality (CSV format)
- Auto-refresh every 30 seconds
Slack Integration
Configure Slack notifications by setting the SLACK_WEBHOOK_URL environment variable:
# Add to config.env
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/WEBHOOK/URL
Available Notifications:
- Question Notifications: Real-time alerts for new questions
- Daily Analytics Reports: Automated daily summaries
- Error Alerts: System error notifications
- System Status: Startup/shutdown notifications
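Slack incoming webhooks accept an HTTP POST with a JSON body containing a `text` field. The following minimal sketch builds such a request using only the standard library. The `build_slack_request` helper and the message text are assumptions for illustration, and the project's own slack_integration.py may format messages differently.

```python
import json
import os
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Falls back to the placeholder URL from config.env when the variable is unset.
webhook_url = os.environ.get(
    "SLACK_WEBHOOK_URL", "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
)
request = build_slack_request(webhook_url, "New question received")

# Sending is left commented out so the sketch has no side effects:
# urllib.request.urlopen(request)
```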
Slack API Endpoints:
# Test Slack integration
POST /api/slack/test
# Send analytics report to Slack
POST /api/slack/analytics?days=7
Database Schema
The SQLite database contains two main tables:
questions table:
- id: Primary key
- question: User's question text
- user_id: User identifier
- session_id: Session identifier
- timestamp: When the question was asked
- response_time_ms: Response time in milliseconds
- tools_called: JSON array of tools used
- response_length: Length of the response
- success: Whether the request succeeded
- error_message: Error details if failed
- metadata: Additional JSON metadata
analytics table:
- id: Primary key
- date: Date of the analytics
- total_questions: Total questions for the day
- successful_questions: Successful questions count
- avg_response_time_ms: Average response time
- unique_users: Number of unique users
- created_at: When the record was created
- updated_at: When the record was last updated
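The two tables could be created roughly as follows. The column names come from the lists above, but the column types, defaults, and constraints are assumptions, since this README does not specify them. The sketch uses an in-memory database so it never touches pride_questions.db.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS questions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    question TEXT NOT NULL,
    user_id TEXT,
    session_id TEXT,
    timestamp TEXT DEFAULT CURRENT_TIMESTAMP,
    response_time_ms INTEGER,
    tools_called TEXT,          -- JSON array stored as text
    response_length INTEGER,
    success INTEGER,            -- 0 or 1
    error_message TEXT,
    metadata TEXT               -- JSON stored as text
);
CREATE TABLE IF NOT EXISTS analytics (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    date TEXT UNIQUE,
    total_questions INTEGER,
    successful_questions INTEGER,
    avg_response_time_ms REAL,
    unique_users INTEGER,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Two made-up interactions, used only to exercise the schema.
sample_rows = [
    ("Find breast cancer datasets", "user123", 120,
     json.dumps(["get_pride_facets", "fetch_projects"]), 1),
    ("Show files for PXD000001", "user123", 80,
     json.dumps(["get_project_files"]), 1),
]
conn.executemany(
    "INSERT INTO questions "
    "(question, user_id, response_time_ms, tools_called, success) "
    "VALUES (?, ?, ?, ?, ?)",
    sample_rows,
)
conn.commit()

# Aggregate the kind of figures the analytics table summarizes per day.
total, avg_ms = conn.execute(
    "SELECT COUNT(*), AVG(response_time_ms) FROM questions WHERE success = 1"
).fetchone()
print(total, avg_ms)
```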
Configuration
Environment Variables
- MCP_SERVER_PORT: Port for the MCP server (default: 9000)
- PRIDE_API_BASE_URL: PRIDE Archive API base URL (default: https://www.ebi.ac.uk/pride/ws/archive/v3)
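Reading these variables with their documented defaults might look like the sketch below. `load_settings` is a hypothetical helper, and the actual logic in config/settings.py may differ.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read server settings from the environment, applying documented defaults."""
    env = os.environ if env is None else env
    return {
        # Ports arrive as strings in the environment, so convert to int.
        "mcp_server_port": int(env.get("MCP_SERVER_PORT", "9000")),
        "pride_api_base_url": env.get(
            "PRIDE_API_BASE_URL", "https://www.ebi.ac.uk/pride/ws/archive/v3"
        ),
    }


# Passing an explicit mapping makes the behaviour easy to check in isolation.
print(load_settings({}))
print(load_settings({"MCP_SERVER_PORT": "9100"}))
```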
Settings
Configuration is managed through config/settings.py. Key settings include:
- API endpoints and timeouts
- Logging configuration
- Server settings
Project Structure
pride-mcp-server/
├── config/
│   ├── __init__.py
│   └── settings.py                  # Configuration settings
├── servers/
│   ├── __init__.py
│   └── pride_mcp_server.py          # Main MCP server implementation
├── tools/
│   ├── __init__.py
│   └── pride_archive_public_api.py  # PRIDE API integration
├── utils/
│   ├── __init__.py
│   └── logging.py                   # Logging utilities
├── mcp_client_tools/                # Professional UI and client tools
├── database.py                      # SQLite database management
├── slack_integration.py             # Slack notifications
├── api_endpoints.py                 # REST API endpoints
├── analytics_dashboard.html         # Web analytics dashboard
├── serve_analytics.py               # Analytics dashboard server
├── main.py                          # Server entry point
├── server.py                        # Enhanced server with API endpoints
├── pyproject.toml                   # Project configuration
└── README.md                        # This file
Development
Setup Development Environment
# Install in development mode
uv sync --dev
# Run tests
uv run pytest
# Format code
uv run black .
uv run isort .
Adding New Tools
- Define the tool in servers/pride_mcp_server.py
- Implement the tool logic in tools/pride_archive_public_api.py
- Update the tool schema and documentation
API Reference
PRIDE Archive API
The server integrates with the PRIDE Archive REST API:
- Base URL: https://www.ebi.ac.uk/pride/ws/archive/v3
- Documentation: https://www.ebi.ac.uk/pride/ws/archive/v3/docs
MCP Protocol
The server implements the Model Context Protocol:
- Specification: https://modelcontextprotocol.io/
- Tools: JSON-RPC 2.0 over HTTP
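In the Model Context Protocol, a tool invocation is a JSON-RPC 2.0 request whose method is `tools/call`, with the tool name and its arguments in `params`. The sketch below only builds that message. The `make_tool_call` helper is a hypothetical name, and the transport details (endpoint path, headers, session setup) are not covered here because this README does not specify them.

```python
import json
from itertools import count

# Simple incrementing request ids; any unique id scheme would work.
_request_ids = count(1)


def make_tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = make_tool_call("get_project_details", {"project_accession": "PXD000001"})
print(json.dumps(request, indent=2))
```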
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
License
This project is licensed under the MIT License; see the license file in the repository for details.
Acknowledgments
- PRIDE Archive team for providing the proteomics data and API
- MCP community for the protocol specification
- Contributors and maintainers
Support
- Documentation: GitHub Wiki
- Issues: GitHub Issues
- Discussions: GitHub Discussions