gpetruzella/openalex-mcp-server
OpenAlex MCP Server
A production-ready Model Context Protocol (MCP) server that provides academic research tools using the OpenAlex API. Built with FastAPI and designed for deployment on Google Cloud Run.
Features
🔬 Academic Research Tools
- search_works - Search for papers, articles, and academic publications
- search_authors - Find researchers and their profiles with h-index, citations
- get_work_details - Get detailed metadata for specific papers
- search_concepts - Explore research topics and their relationships
- search_institutions - Find universities and research organizations
- get_citations - Analyze citation networks (citing/cited works)
- advanced_filter - Complex multi-criteria searches with OpenAlex filter syntax
🚀 MCP Protocol Features
- Streamable-HTTP transport - Modern HTTP/SSE-based communication
- JSON-RPC 2.0 - Standard message protocol
- Server-Sent Events - Real-time streaming for large responses
- Cloud-ready - Optimized for Google Cloud Run deployment
📊 OpenAlex Polite Access
- Automatic rate limiting (10 requests/second)
- Mailto parameter for polite pool access
- Proper User-Agent headers
- Exponential backoff for retries
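As an illustration of the mailto and User-Agent points above, here is a minimal sketch of how a request to OpenAlex might carry the contact address. The helper names (`polite_request_parts`, `works_search_url`) and the exact User-Agent format are assumptions for illustration, not taken from this server's code.

```python
from urllib.parse import urlencode


def polite_request_parts(email: str, app_name: str = "openalex-mcp-server"):
    """Return (query params, headers) that identify the caller to OpenAlex.

    Hypothetical helper: the real server may build these differently.
    """
    params = {"mailto": email}
    headers = {"User-Agent": f"{app_name} (mailto:{email})"}
    return params, headers


def works_search_url(query: str, email: str,
                     base_url: str = "https://api.openalex.org") -> str:
    """Build a /works search URL that includes the mailto parameter."""
    polite_params, _ = polite_request_parts(email)
    params = {"search": query, **polite_params}
    return f"{base_url}/works?{urlencode(params)}"


if __name__ == "__main__":
    print(works_search_url("graphene", "you@example.org"))
```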
Quick Start
Local Development
- Clone and set up:
cd openalex-mcp-server
cp .env.example .env
# Edit .env and set your MAILTO_EMAIL
- Install dependencies:
pip install -r requirements.txt
- Run the server:
python server.py
# Or with uvicorn:
uvicorn server:app --reload --port 8080
- Test the endpoint:
curl http://localhost:8080/health
Docker Local Testing
docker build -t openalex-mcp-server .
docker run -p 8080:8080 -e MAILTO_EMAIL=your-email@williamscollege.edu openalex-mcp-server
Google Cloud Run Deployment
Prerequisites
- Google Cloud SDK installed
- GCP project with Cloud Run API enabled
- Docker installed locally
Deployment Steps
- Authenticate with Google Cloud:
gcloud auth login
gcloud config set project YOUR_PROJECT_ID
- Build and push container:
# Configure Docker for Google Container Registry
gcloud auth configure-docker
# Build the image
docker build -t gcr.io/YOUR_PROJECT_ID/openalex-mcp-server .
# Push to Container Registry
docker push gcr.io/YOUR_PROJECT_ID/openalex-mcp-server
- Deploy to Cloud Run:
gcloud run deploy openalex-mcp-server \
  --image gcr.io/YOUR_PROJECT_ID/openalex-mcp-server \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars MAILTO_EMAIL=your-email@williamscollege.edu \
  --memory 512Mi \
  --cpu 1 \
  --max-instances 10
- Get your service URL:
gcloud run services describe openalex-mcp-server --platform managed --region us-central1 --format 'value(status.url)'
Alternative: One-Command Deployment
gcloud run deploy openalex-mcp-server \
  --source . \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars MAILTO_EMAIL=your-email@williamscollege.edu
MCP Client Configuration
Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "openalex": {
      "transport": "streamable-http",
      "url": "https://YOUR-SERVICE-URL.run.app/mcp",
      "description": "OpenAlex academic research API"
    }
  }
}
Config file locations:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
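The three locations above can be resolved programmatically. The following is a small sketch; the function name `claude_config_path` is hypothetical, and it assumes `APPDATA` is set when running on Windows.

```python
import os
import sys
from pathlib import Path

CONFIG_NAME = "claude_desktop_config.json"


def claude_config_path(platform: str = sys.platform, env=None) -> Path:
    """Return the Claude Desktop config path for the given platform string."""
    env = os.environ if env is None else env
    if platform == "darwin":  # macOS
        return Path.home() / "Library" / "Application Support" / "Claude" / CONFIG_NAME
    if platform.startswith("win"):  # Windows: relies on %APPDATA%
        return Path(env["APPDATA"]) / "Claude" / CONFIG_NAME
    # Linux and other POSIX systems
    return Path.home() / ".config" / "Claude" / CONFIG_NAME


if __name__ == "__main__":
    print(claude_config_path())
```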
Other MCP Clients
The server implements standard MCP over HTTP/SSE, so it works with any compatible client:
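import httpx

# Initialize request (the first message of an MCP session)
init_message = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",
        # The MCP spec requires a capabilities object; empty means no optional features
        "capabilities": {},
        "clientInfo": {
            "name": "my-client",
            "version": "1.0.0"
        }
    }
}
response = httpx.post(
    "https://YOUR-SERVICE-URL.run.app/mcp",
    json=init_message,
    # Request a plain JSON response rather than an SSE stream
    headers={"Content-Type": "application/json", "Accept": "application/json"},
)
response.raise_for_status()  # Fail loudly on HTTP errors
print(response.json())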
Usage Examples
Search for Papers
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "search_works",
    "arguments": {
      "query": "machine learning climate change",
      "publication_year": "2020-2024",
      "open_access": true,
      "limit": 10
    }
  }
}
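Every tool invocation uses the same JSON-RPC 2.0 `tools/call` envelope, so a client can build it with a small helper. This is a sketch only; `make_tool_call` and its auto-incrementing id counter are hypothetical names introduced for illustration.

```python
import itertools
import json

# Simple source of unique request ids for this process
_request_ids = itertools.count(1)


def make_tool_call(name: str, arguments: dict, request_id=None) -> dict:
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 tools/call message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids) if request_id is None else request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    message = make_tool_call(
        "search_works",
        {"query": "machine learning climate change", "limit": 10},
    )
    print(json.dumps(message, indent=2))
```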
Find Researchers
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "search_authors",
    "arguments": {
      "name": "Andrew Ng",
      "institution": "Stanford",
      "limit": 5
    }
  }
}
Get Citation Network
{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "tools/call",
  "params": {
    "name": "get_citations",
    "arguments": {
      "work_id": "W2741809807",
      "direction": "citing",
      "limit": 50
    }
  }
}
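For context, OpenAlex exposes both citation directions as filters on the works endpoint: `cites:<id>` selects works that cite the given work, and `cited_by:<id>` selects the works it references. The sketch below shows one way a `direction` argument could map onto those filters. The value `"cited"` and the function name are assumptions; only `"citing"` appears in the example above, and the server's actual mapping may differ.

```python
def citation_filter(work_id: str, direction: str) -> str:
    """Translate a direction into an OpenAlex works filter expression.

    "citing": works that cite work_id (incoming citations)
    "cited":  works that work_id references (assumed value name)
    """
    if direction == "citing":
        return f"cites:{work_id}"
    if direction == "cited":
        return f"cited_by:{work_id}"
    raise ValueError(f"Unsupported direction: {direction!r}")


if __name__ == "__main__":
    print(citation_filter("W2741809807", "citing"))
```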
Advanced Filtering
{
  "jsonrpc": "2.0",
  "id": 5,
  "method": "tools/call",
  "params": {
    "name": "advanced_filter",
    "arguments": {
      "entity_type": "works",
      "filters": {
        "publication_year": ">2020",
        "cited_by_count": ">100",
        "is_oa": true
      },
      "search": "artificial intelligence",
      "sort": "cited_by_count:desc",
      "limit": 25
    }
  }
}
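OpenAlex expects filters as a single comma-separated `key:value` string, so a `filters` object like the one above has to be flattened before it is sent. A minimal sketch of that conversion follows; `build_filter` and `build_query` are hypothetical helpers, not the server's actual functions.

```python
def build_filter(filters: dict) -> str:
    """Flatten a dict of filters into OpenAlex's key:value,key:value syntax."""
    parts = []
    for key, value in filters.items():
        if isinstance(value, bool):
            # OpenAlex uses lowercase true/false for boolean filters
            value = "true" if value else "false"
        parts.append(f"{key}:{value}")
    return ",".join(parts)


def build_query(filters: dict, search: str = "", sort: str = "", limit: int = 25) -> dict:
    """Assemble query parameters for an OpenAlex list endpoint."""
    params = {"filter": build_filter(filters), "per-page": limit}
    if search:
        params["search"] = search
    if sort:
        params["sort"] = sort
    return params


if __name__ == "__main__":
    print(build_query(
        {"publication_year": ">2020", "cited_by_count": ">100", "is_oa": True},
        search="artificial intelligence",
        sort="cited_by_count:desc",
    ))
```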
API Endpoints
POST /mcp
Main MCP endpoint for JSON-RPC messages. Supports both JSON and SSE responses.
Headers:
- Content-Type: application/json (required)
- Accept: application/json (JSON response) or Accept: text/event-stream (SSE)
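When the client asks for `text/event-stream`, the JSON-RPC messages arrive as SSE `data:` fields rather than as a single JSON body. The sketch below parses such a body into messages, assuming each event carries one JSON document. It handles only the `data` field and ignores `event`, `id`, and comment lines.

```python
import json


def parse_sse_messages(body: str) -> list:
    """Extract JSON payloads from a Server-Sent Events response body."""
    messages = []
    data_lines = []
    for line in body.splitlines():
        if line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, a single leading space after the colon is dropped
            if value.startswith(" "):
                value = value[1:]
            data_lines.append(value)
        elif line == "" and data_lines:
            # A blank line terminates the current event
            messages.append(json.loads("\n".join(data_lines)))
            data_lines = []
    if data_lines:  # Stream ended without a trailing blank line
        messages.append(json.loads("\n".join(data_lines)))
    return messages


if __name__ == "__main__":
    sample = 'event: message\ndata: {"jsonrpc": "2.0", "id": 1, "result": {}}\n\n'
    print(parse_sse_messages(sample))
```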
GET /mcp
Optional SSE endpoint for server-initiated messages (keepalive, notifications).
Headers:
- Accept: text/event-stream (required)
GET /health
Health check endpoint for monitoring.
Response: {"status": "healthy", "service": "openalex-mcp-server", "version": "1.0.0"}
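A deployment script or monitor could poll this endpoint with the standard library alone. The following is a sketch; `check_health` and `is_healthy` are hypothetical names, and the check relies only on the `status` field shown in the response above.

```python
import json
from urllib.request import urlopen


def is_healthy(payload: dict) -> bool:
    """Return True when the health payload reports a healthy service."""
    return payload.get("status") == "healthy"


def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """Fetch /health from the server and evaluate the response."""
    with urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return resp.status == 200 and is_healthy(json.load(resp))


if __name__ == "__main__":
    # Assumes a server is running locally on port 8080
    print(check_health("http://localhost:8080"))
```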
GET /
Service information and endpoint documentation.
Environment Variables
| Variable | Default | Description |
|---|---|---|
| PORT | 8080 | Server port (Cloud Run sets automatically) |
| MAILTO_EMAIL | researcher@williamscollege.edu | Required for polite pool access |
| OPENALEX_BASE_URL | https://api.openalex.org | OpenAlex API endpoint |
| MAX_REQUESTS_PER_SECOND | 10 | Rate limit for API calls |
| LOG_LEVEL | INFO | Logging verbosity |
| ALLOWED_ORIGINS | None | CORS allowed origins (comma-separated) |
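To make the defaults concrete, here is a sketch of how these variables could be read into a typed settings object. The `Settings` class and `load_settings` function are illustrative assumptions; the project's own `config.py` may be organized differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    port: int
    mailto_email: str
    openalex_base_url: str
    max_requests_per_second: float
    log_level: str
    allowed_origins: list  # empty list when ALLOWED_ORIGINS is unset


def load_settings(env=None) -> Settings:
    """Read configuration from the environment, using the documented defaults."""
    env = os.environ if env is None else env
    raw_origins = env.get("ALLOWED_ORIGINS", "")
    return Settings(
        port=int(env.get("PORT", "8080")),
        mailto_email=env.get("MAILTO_EMAIL", "researcher@williamscollege.edu"),
        openalex_base_url=env.get("OPENALEX_BASE_URL", "https://api.openalex.org").rstrip("/"),
        max_requests_per_second=float(env.get("MAX_REQUESTS_PER_SECOND", "10")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        # Split the comma-separated list and drop empty entries
        allowed_origins=[o.strip() for o in raw_origins.split(",") if o.strip()],
    )


if __name__ == "__main__":
    print(load_settings())
```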
Architecture
openalex-mcp-server/
├── server.py # FastAPI app with HTTP/SSE transport
├── mcp_handler.py # MCP protocol & JSON-RPC handling
├── config.py # Environment configuration
├── tools/
│ ├── __init__.py
│ ├── search.py # OpenAlex API tool implementations
│ ├── filters.py # Filter utilities
│ └── utils.py # Formatting helpers
├── Dockerfile # Cloud Run deployment
├── requirements.txt # Python dependencies
├── .env.example # Environment template
├── claude_desktop_config.json # Client config example
└── README.md
Development
Adding New Tools
- Implement the tool function in tools/search.py:
import json

async def my_new_tool(param1: str, param2: int = 10) -> str:
    """Tool description."""
    # Implementation: build any JSON-serializable result
    result = {"param1": param1, "param2": param2}
    return json.dumps(result)
- Add to MCPHandler in mcp_handler.py:
self.tools = {
    # ... existing tools
    "my_new_tool": my_new_tool,
}
- Add schema in _get_tool_schema():
schemas = {
    # ... existing schemas
    "my_new_tool": {
        "description": "Tool description",
        "inputSchema": {
            "type": "object",
            "properties": {
                "param1": {"type": "string", "description": "..."},
                "param2": {"type": "integer", "default": 10}
            },
            "required": ["param1"]
        }
    }
}
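To show how a registered tool might be reached at runtime, here is a sketch of a dispatcher that looks up the tool by name, awaits it, and wraps the output in an MCP-style tool result. `TOOLS`, `call_tool`, and `echo_tool` are stand-ins for illustration; the real `MCPHandler` is not shown in this README and may work differently.

```python
import asyncio
import json


async def echo_tool(text: str, repeat: int = 1) -> str:
    """Example tool: returns its input repeated, as a JSON string."""
    return json.dumps({"echo": text * repeat})


# Registry mapping tool names to coroutine functions (stand-in for self.tools)
TOOLS = {"echo_tool": echo_tool}


async def call_tool(name: str, arguments: dict) -> dict:
    """Dispatch a tools/call request and wrap the output as text content."""
    tool = TOOLS.get(name)
    if tool is None:
        return {
            "isError": True,
            "content": [{"type": "text", "text": f"Unknown tool: {name}"}],
        }
    text = await tool(**arguments)
    return {"content": [{"type": "text", "text": text}]}


if __name__ == "__main__":
    print(asyncio.run(call_tool("echo_tool", {"text": "ab", "repeat": 2})))
```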
Running Tests
# Install test dependencies
pip install pytest pytest-asyncio httpx
# Run tests
pytest
Monitoring Cloud Run
# View logs
gcloud run services logs read openalex-mcp-server --region us-central1
# Check service status
gcloud run services describe openalex-mcp-server --region us-central1
Security Considerations
Current Configuration (Development)
- No authentication required
- CORS allows all origins
- Suitable for testing and internal use
Production Hardening
- Enable authentication:
gcloud run deploy openalex-mcp-server \
  --no-allow-unauthenticated
- Set allowed origins in .env:
ALLOWED_ORIGINS=https://yourdomain.com,https://anotherdomain.com
- Use Cloud Run IAM for access control
- Enable Cloud Armor for DDoS protection
- Set up VPC for private networking
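As a sketch of the origin restriction described above, the check below compares a request's `Origin` header against the comma-separated `ALLOWED_ORIGINS` value. It assumes that an unset or empty value means all origins are allowed, mirroring the development configuration; the function names are hypothetical and the server's CORS middleware may behave differently.

```python
from typing import Optional


def normalize_origin(origin: str) -> str:
    """Lowercase an origin and drop surrounding whitespace and trailing slashes."""
    return origin.strip().rstrip("/").lower()


def is_origin_allowed(origin: str, allowed_origins_env: Optional[str]) -> bool:
    """Check an Origin header against a comma-separated allow-list."""
    if not allowed_origins_env or not allowed_origins_env.strip():
        return True  # Assumption: no allow-list configured means allow all
    allowed = {
        normalize_origin(item)
        for item in allowed_origins_env.split(",")
        if item.strip()
    }
    return normalize_origin(origin) in allowed


if __name__ == "__main__":
    env_value = "https://yourdomain.com,https://anotherdomain.com"
    print(is_origin_allowed("https://yourdomain.com", env_value))
```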
Rate Limiting & Polite Access
The server implements OpenAlex polite pool best practices:
- ✅ 10 requests/second rate limit (vs 6 req/s for non-polite)
- ✅ Mailto parameter in all requests
- ✅ User-Agent header with contact info
- ✅ Exponential backoff for errors
- ✅ Response caching (where appropriate)
Always set MAILTO_EMAIL to get better rate limits!
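The rate limit and backoff behaviour can be sketched as follows. This is an illustrative minimum-interval limiter and a capped exponential delay, not the server's actual implementation; the class name, the 0.5 s base, and the 30 s cap are assumptions. A production version would typically add random jitter to the delays.

```python
import time


class MinIntervalLimiter:
    """Space calls at least 1/max_per_second seconds apart."""

    def __init__(self, max_per_second: float, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed; return the time slept."""
        now = self.clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self.sleep(delay)
        # Reserve the next slot relative to when this call was allowed to run
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


if __name__ == "__main__":
    limiter = MinIntervalLimiter(10)  # 10 requests/second
    for attempt in range(3):
        limiter.wait()
        print(f"attempt {attempt}: would retry after {backoff_delay(attempt)}s")
```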
Troubleshooting
Issue: "Origin not allowed"
Solution: Set ALLOWED_ORIGINS environment variable or update CORS middleware in server.py
Issue: Rate limiting errors
Solution: Verify MAILTO_EMAIL is set correctly for polite pool access
Issue: Cloud Run timeout
Solution: Increase timeout in deployment:
gcloud run services update openalex-mcp-server --region us-central1 --timeout 300
Issue: Memory errors
Solution: Increase memory allocation:
gcloud run services update openalex-mcp-server --region us-central1 --memory 1Gi
Resources
- OpenAlex API Documentation
- Model Context Protocol Specification
- Google Cloud Run Documentation
- FastAPI Documentation
Contributing
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Submit a pull request
License
MIT License - See LICENSE file for details
Support
For issues and questions:
- OpenAlex API: support@openalex.org
- Williams College: Contact your research computing support
Built for Williams College undergraduate researchers 🎓
Happy researching! 🔬