masa061580/pubmed_mcp_openai_server_2
PubMed MCP Server
A Model Context Protocol (MCP) server that provides access to PubMed/PMC research data through six core functions:
- Search: Query PubMed with MeSH support, returning up to 100 paper titles with PMCID info
- Fetch: Retrieve abstract for a single PMID (OpenAI MCP compliant)
- Fetch Batch: Retrieve abstracts for multiple PMIDs efficiently
- Get Full Text: Retrieve full-text content for PMCIDs (JATS XML or OA URLs)
- Count: Get result count for query optimization and refinement
- Count Batch: Get counts for multiple queries efficiently
Features
- MeSH Query Support: Advanced medical subject heading searches
- PMCID Detection: Automatically identifies papers with PMC full-text availability
- Rate Limited: Respects NCBI guidelines (3-10 req/s depending on API key)
- XML Parsing: Handles PubMed XML responses with proper error handling
- Open Access Integration: Detects and provides PDF URLs for OA articles
- Comprehensive Metadata: Returns titles, abstracts, authors, journals, dates
Setup
1. Install Dependencies
pip install -r requirements.txt
2. Environment Configuration (Optional)
All configuration is optional. The server works out-of-the-box without any setup.
For customization, copy .env.example to .env:
cp .env.example .env
Available settings:
# All settings are optional
NCBI_API_KEY=your_ncbi_api_key_here # For higher rate limits (10 req/s vs 3 req/s)
NCBI_TOOL_NAME=PubMedMCPServer # Custom tool identifier
NCBI_EMAIL=your.email@example.com # Contact email (uses default if not provided)
# Server settings
HOST=0.0.0.0
PORT=8000
LOG_LEVEL=INFO
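As an illustration of how these optional settings could be read, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical names, not the server's actual code, and the fallback values simply mirror the example file above. The server's real default email is not documented here, so the sketch leaves it unset.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in .env.example."""
    ncbi_api_key: Optional[str]
    ncbi_tool_name: str
    ncbi_email: Optional[str]
    host: str
    port: int
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read each variable, falling back to the example values when unset."""
    return Settings(
        # Empty strings are treated the same as "not provided".
        ncbi_api_key=env.get("NCBI_API_KEY") or None,
        ncbi_tool_name=env.get("NCBI_TOOL_NAME", "PubMedMCPServer"),
        # Assumption: the real default email is unknown, so None stands in.
        ncbi_email=env.get("NCBI_EMAIL") or None,
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Calling `load_settings({})` yields the out-of-the-box configuration, which matches the claim that no setup is required.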
3. Start the Server
python pubmed_mcp_server.py
Server will be available at: http://localhost:8000/sse/
4. Get NCBI API Key (Optional, for Higher Rate Limits)
For higher throughput (10 req/s instead of 3 req/s):
- Visit NCBI API Key Settings
- Sign in to your NCBI account
- Create a new API key
- Add NCBI_API_KEY=your_key_here to your .env file
Benefits of API Key:
- 10 requests/second (vs 3 without key)
- Better reliability for high-volume usage
MCP Tools
1. Search
Query PubMed database with advanced search capabilities.
Parameters:
query (string): Search query with MeSH support
Example Queries:
"COVID-19 AND vaccine"
"asthma[mh] AND adult[mh]"
"diabetes[tiab] AND 2023:2024[dp]"
"cancer treatment[majr]"
Response:
{
  "results": [
    {
      "id": "12345678",
      "title": "Study Title",
      "url": "https://pubmed.ncbi.nlm.nih.gov/12345678/",
      "pmcid": "PMC1234567", // Present if full text available in PMC
      "full_text_available": true // Indicates PMC full text availability
    }
  ]
}
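A search response in this shape can be filtered down to the PMCIDs worth passing to the full-text tool. The following is a small sketch that assumes only the fields shown above; `pmcids_with_full_text` is a hypothetical helper name, not part of the server.

```python
from typing import Any, Dict, List


def pmcids_with_full_text(response: Dict[str, Any]) -> List[str]:
    """Collect PMCIDs from results flagged as having PMC full text."""
    pmcids = []
    for result in response.get("results", []):
        # "pmcid" is only present when PMC full text exists, so check both.
        if result.get("full_text_available") and result.get("pmcid"):
            pmcids.append(result["pmcid"])
    return pmcids


example = {
    "results": [
        {"id": "12345678", "title": "Study Title",
         "pmcid": "PMC1234567", "full_text_available": True},
        {"id": "23456789", "title": "No PMC copy",
         "full_text_available": False},
    ]
}
print(pmcids_with_full_text(example))
```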
2. Fetch
Retrieve abstract for a single PMID only (OpenAI MCP compliant).
Important: This tool accepts exactly one PMID per request. For multiple PMIDs, use fetch_batch.
Parameters:
id (string): Single PubMed ID (PMID) - NO ARRAYS OR COMMA-SEPARATED VALUES
Example:
{
  "id": "40930554"
}
Response:
{
"id": "40930554",
"title": "Paper Title",
"text": "Background: ... Methods: ... Results: ... Conclusions: ...",
"url": "https://pubmed.ncbi.nlm.nih.gov/40930554/",
"metadata": {
"journal": "Journal Name",
"year": "2024",
"authors": ["Author 1", "Author 2"]
}
}
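Because fetch takes exactly one PMID, a caller may want to reject arrays and comma-separated strings before sending a request. The sketch below shows one way to do that check on the client side. `validate_single_pmid` is a hypothetical helper, and the server's own validation may differ.

```python
import re

# PMIDs are plain decimal integers; ASCII digits only.
_PMID_RE = re.compile(r"[0-9]+")


def validate_single_pmid(value: object) -> str:
    """Return the PMID as a clean string, or raise ValueError."""
    if not isinstance(value, str):
        raise ValueError("id must be a string holding exactly one PMID")
    candidate = value.strip()
    if not _PMID_RE.fullmatch(candidate):
        # Catches "1,2", "1 2", empty strings, and non-numeric input.
        raise ValueError(f"not a single PMID: {value!r}")
    return candidate
```

Inputs such as `"40930554,40929575"` or `["40930554"]` raise `ValueError`, which signals that fetch_batch is the right tool instead.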
3. Fetch Batch
Retrieve abstracts for multiple PMIDs efficiently in one request.
Parameters:
pmids (array): List of PubMed IDs (PMIDs)
Example:
{
  "pmids": ["40930554", "40929575", "40929571"]
}
Response:
{
  "items": [
    {
      "pmid": "40930554",
      "title": "Paper Title 1",
      "abstract": "Background: ... Methods: ... Results: ... Conclusions: ...",
      "journal": "Journal Name",
      "year": "2024",
      "authors": ["Author 1", "Author 2"]
    },
    {
      "pmid": "40929575",
      "title": "Paper Title 2",
      "abstract": "Background: ... Methods: ... Results: ... Conclusions: ...",
      "journal": "Another Journal",
      "year": "2024",
      "authors": ["Author 3", "Author 4"]
    }
  ]
}
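When a PMID list is long, a caller might split it into several fetch_batch requests. The sketch below deduplicates the list and yields fixed-size chunks. The `chunk_pmids` name and the `batch_size` default of 100 are assumptions for illustration; this README does not state a maximum batch size.

```python
from typing import Iterable, Iterator, List


def chunk_pmids(pmids: Iterable[str], batch_size: int = 100) -> Iterator[List[str]]:
    """Yield deduplicated PMIDs in order, batch_size at a time."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    unique = list(dict.fromkeys(p.strip() for p in pmids if p.strip()))
    for start in range(0, len(unique), batch_size):
        yield unique[start:start + batch_size]


for batch in chunk_pmids(["40930554", "40929575", "40930554", "40929571"], batch_size=2):
    print({"pmids": batch})
```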
4. Get Full Text
Retrieve full-text content for PMCIDs.
Parameters:
pmcids (array): List of PMC IDs (with or without 'PMC' prefix)
Example:
{
  "pmcids": ["PMC1234567", "7654321"]
}
Response:
{
  "items": [
    {
      "pmcid": "PMC1234567",
      "jats_xml": "<article>...</article>",
      "pdf_url": "https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1234567/pdf/",
      "status": "success"
    }
  ],
  "notes": [
    "Full text availability depends on PMC Open Access status",
    "JATS XML is provided when available",
    "PDF URLs are only available for Open Access articles"
  ]
}
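Since PMC IDs are accepted with or without the 'PMC' prefix, it can help to normalize them to one form before comparing them with the `pmcid` values in a response. This is a minimal sketch of such a normalizer. `normalize_pmcid` is a hypothetical helper and is not taken from the server code.

```python
def normalize_pmcid(value: str) -> str:
    """Return the ID in canonical 'PMC<digits>' form, or raise ValueError."""
    text = value.strip().upper()
    digits = text[3:] if text.startswith("PMC") else text
    # isascii() guards against non-ASCII characters that isdigit() accepts.
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a PMC ID: {value!r}")
    return "PMC" + digits


print([normalize_pmcid(x) for x in ["PMC1234567", "7654321"]])
```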
5. Count
Get result count for a search query without retrieving papers (for query optimization).
Parameters:
query (string): Search query with MeSH support
Example:
{
  "query": "lung cancer[mh] AND 2024[dp]"
}
Response:
{
  "query": "lung cancer[mh] AND 2024[dp]",
  "count": 12543,
  "query_translation": "\"lung neoplasms\"[MeSH Terms] AND 2024[dp]",
  "warnings": []
}
6. Count Batch
Get counts for multiple queries efficiently (for comparing search strategies).
Parameters:
queries (array): List of search queries
Example:
{
  "queries": [
    "diabetes",
    "diabetes[mh]",
    "diabetes[majr]",
    "diabetes[mh] AND clinical trial[pt]"
  ]
}
Response:
[
  {"query": "diabetes", "count": 1072365, "query_translation": "..."},
  {"query": "diabetes[mh]", "count": 562256, "query_translation": "..."},
  {"query": "diabetes[majr]", "count": 287431, "query_translation": "..."},
  {"query": "diabetes[mh] AND clinical trial[pt]", "count": 45123, "query_translation": "..."}
]
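A count batch response like the one above can be used to choose a search strategy automatically. The sketch below picks the first query, in the order submitted, whose count is non-zero and within a ceiling. `first_query_within` and the ceiling values are hypothetical and chosen only for illustration.

```python
from typing import Any, Iterable, Mapping, Optional


def first_query_within(results: Iterable[Mapping[str, Any]], max_count: int) -> Optional[str]:
    """Return the first query with 0 < count <= max_count, else None."""
    for entry in results:
        if 0 < entry["count"] <= max_count:
            return entry["query"]
    return None


# Sample data copied from the response above.
counts = [
    {"query": "diabetes", "count": 1072365},
    {"query": "diabetes[mh]", "count": 562256},
    {"query": "diabetes[majr]", "count": 287431},
    {"query": "diabetes[mh] AND clinical trial[pt]", "count": 45123},
]
print(first_query_within(counts, max_count=50000))
```

Listing queries from broadest to narrowest means the function returns the broadest query that is still manageable.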
Query Refinement Workflow
The count tools enable efficient query refinement:
# Start broad, then refine
"cancer" # 5,000,000+ results (too broad)
"lung cancer" # 800,000 results
"lung cancer[mh]" # 400,000 results (MeSH term)
"lung cancer[mh] AND 2024[dp]" # 12,000 results
"lung cancer[majr] AND clinical trial[pt] AND 2024[dp]" # 450 results (optimal)
MeSH Search Examples
The server supports advanced PubMed search syntax:
# MeSH terms
"diabetes mellitus[mh]"
# MeSH major topic
"cancer[majr]"
# Title/Abstract
"machine learning[tiab]"
# Publication date
"2023:2024[dp]"
# Combined search
"COVID-19[mh] AND vaccine[tiab] AND 2023:2024[dp]"
# Author search
"Smith J[au]"
# Journal search
"nature[ta]"
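Queries like these can also be assembled programmatically rather than typed by hand. The helpers below are a hypothetical sketch (`tagged`, `date_range`, and `all_of` are not part of the server); they only concatenate strings using the field tags shown above and do no escaping or validation.

```python
def tagged(term: str, field: str) -> str:
    """Attach a PubMed field tag, e.g. tagged('asthma', 'mh') -> 'asthma[mh]'."""
    return f"{term}[{field}]"


def date_range(start_year: int, end_year: int) -> str:
    """Build a publication-date range clause using the [dp] tag."""
    return f"{start_year}:{end_year}[dp]"


def all_of(*clauses: str) -> str:
    """Join clauses with AND."""
    return " AND ".join(clauses)


query = all_of(tagged("COVID-19", "mh"), tagged("vaccine", "tiab"), date_range(2023, 2024))
print(query)
```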
Usage with ChatGPT/Claude
- Start the server locally
- Use the URL: http://localhost:8000/sse/
- Configure as MCP connector in your AI interface
- Enable tools: search, fetch, fetch_batch, get_full_text
Rate Limiting & Compliance
- Without API Key: 3 requests/second maximum
- With API Key: 10 requests/second maximum
- Automatic retry with exponential backoff
- Follows NCBI usage guidelines
- Tool and email identification in all requests
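To make the two rate tiers concrete, here is a minimal sketch of a limiter that spaces requests evenly. It is an illustration under stated assumptions, not the server's implementation: `requests_per_second` and `IntervalLimiter` are hypothetical names, and even spacing is only one of several ways to stay under a per-second cap.

```python
import asyncio
import time
from typing import Optional


def requests_per_second(api_key: Optional[str]) -> int:
    """10 req/s with an API key, 3 req/s without (the limits listed above)."""
    return 10 if api_key else 3


class IntervalLimiter:
    """Spaces calls at least 1/rate seconds apart. Create it inside a running loop."""

    def __init__(self, rate: int) -> None:
        self.min_interval = 1.0 / rate
        self._next_allowed = 0.0
        self._lock = asyncio.Lock()

    async def wait(self) -> None:
        # The lock serializes callers so each one reserves its own time slot.
        async with self._lock:
            now = time.monotonic()
            delay = self._next_allowed - now
            if delay > 0:
                await asyncio.sleep(delay)
                now = time.monotonic()
            self._next_allowed = now + self.min_interval


async def demo() -> float:
    limiter = IntervalLimiter(requests_per_second(api_key=None))
    start = time.monotonic()
    for _ in range(3):
        await limiter.wait()  # a real client would send its request here
    return time.monotonic() - start
```

Running `asyncio.run(demo())` issues three paced calls at the keyless rate; the first goes out immediately and the rest wait for their slot.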
Error Handling
The server handles various error conditions:
- Rate limit exceeded (429) → Automatic retry
- Invalid PMIDs/PMCIDs → Graceful error response
- Network timeouts → Retry with backoff
- XML parsing errors → Detailed error messages
- Missing abstracts → "No abstract available" message
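The retry behavior for 429 responses and timeouts can be pictured with a short sketch of exponential backoff. Everything here is assumed for illustration: `RetryableError`, `backoff_delay`, `with_retries`, and the base, cap, and attempt values are hypothetical, and the server's actual retry parameters are not documented in this README.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Stand-in for a 429 response or a network timeout."""


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Delay doubles with each attempt (base, 2*base, 4*base, ...) up to cap."""
    return min(cap, base * (2 ** attempt))


async def with_retries(
    call: Callable[[], Awaitable[T]],
    max_attempts: int = 4,
    base: float = 0.5,
    cap: float = 8.0,
) -> T:
    """Await call(), retrying on RetryableError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return await call()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            await asyncio.sleep(backoff_delay(attempt, base, cap))
    raise AssertionError("unreachable")
```

Non-retryable conditions, such as an invalid PMID, would not be wrapped this way, since repeating the request cannot change the outcome.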
Development
Testing Individual Functions
You can test the server functions directly:
import asyncio

from pubmed_mcp_server import PubMedClient


async def test_search():
    # Open the client as an async context manager so it is closed afterwards
    async with PubMedClient() as client:
        results = await client.search_pubmed("COVID-19 AND vaccine", retmax=5)
        print(f"Found {results['count']} papers")
        for item in results['items']:
            print(f"- {item['title']}")


asyncio.run(test_search())
Logging
Set LOG_LEVEL=DEBUG in .env for detailed request/response logging.
License
This implementation follows NCBI usage policies and guidelines. Please ensure compliance with:
- NCBI E-utilities usage guidelines
- PMC Open Access licensing terms
- Rate limiting requirements
- Proper attribution in research use