pubmed_mcp_openai_server

masa061580/pubmed_mcp_openai_server

The PubMed MCP Server provides access to PubMed/PMC research data through the Model Context Protocol (MCP), enabling advanced search and retrieval of medical literature.

PubMed MCP Server

A Model Context Protocol (MCP) server that provides access to PubMed/PMC research data through seven core tools:

  • Search: Query PubMed with MeSH support, returning a configurable number of papers (1-200, default: 50) with PMCID info and sorting options
  • Fetch: Retrieve the abstract for a single PMID (OpenAI MCP compliant)
  • Fetch Batch: Retrieve abstracts for multiple PMIDs in one request
  • Get Full Text: Retrieve full-text content for PMCIDs (JATS XML or OA URLs)
  • Count: Get result count for query optimization and refinement
  • Find Similar Articles: Discover related research using PubMed's recommendation algorithm
  • Export to RIS: Export articles to RIS format for citation managers (EndNote/Zotero/Mendeley)

Features

  • MeSH Query Support: Advanced medical subject heading searches
  • PMCID Detection: Automatically identifies papers with PMC full-text availability
  • Rate Limited: Respects NCBI guidelines (3 req/s without an API key, 10 req/s with one)
  • XML Parsing: Handles PubMed XML responses with proper error handling
  • Open Access Integration: Detects and provides PDF URLs for OA articles
  • Comprehensive Metadata: Returns titles, abstracts, authors, journals, dates

Setup

1. Install Dependencies

pip install -r requirements.txt

2. Environment Configuration (Optional)

All configuration is optional. The server works out-of-the-box without any setup.

For customization, copy .env.example to .env:

cp .env.example .env

Available settings:

# All settings are optional - server works with defaults
NCBI_API_KEY=your_ncbi_api_key_here  # For higher rate limits (10 req/s vs 3 req/s)
NCBI_TOOL_NAME=PubMedMCPServer       # Custom tool identifier (optional)
NCBI_EMAIL=your.email@example.com    # Contact email (optional - not required)

# Server settings
HOST=0.0.0.0
PORT=8000
LOG_LEVEL=INFO
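
As an illustration of how these variables could be consumed, here is a minimal sketch that reads them with fallbacks. The Settings class and load_settings function are hypothetical names, not part of the server, and the fallback values are assumed to match the example values shown above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in .env.example."""
    ncbi_api_key: Optional[str]
    ncbi_tool_name: str
    ncbi_email: Optional[str]
    host: str
    port: int
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read each variable, falling back to the example values (assumed defaults)."""
    return Settings(
        ncbi_api_key=env.get("NCBI_API_KEY") or None,   # empty string counts as unset
        ncbi_tool_name=env.get("NCBI_TOOL_NAME", "PubMedMCPServer"),
        ncbi_email=env.get("NCBI_EMAIL") or None,
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Passing a plain dict instead of os.environ makes the function easy to exercise in tests, for example load_settings({"PORT": "9000"}).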

3. Start the Server

python pubmed_mcp_server.py

Server will be available at: http://localhost:8000/sse/

4. Get NCBI API Key (Optional, for Higher Rate Limits)

For higher throughput (10 req/s instead of 3 req/s):

  1. Visit NCBI API Key Settings
  2. Sign in to your NCBI account
  3. Create a new API key
  4. Add NCBI_API_KEY=your_key_here to your .env file

Benefits of API Key:

  • 10 requests/second (vs 3 without key)
  • Better reliability for high-volume usage

MCP Tools

1. Search

Query PubMed database with advanced search capabilities.

Parameters:

  • query (string): Search query with MeSH support
  • max_results (integer, optional): Number of results to return (1-200, default: 50)
  • sort (string, optional): Sort order - "relevance" (Best Match, default) or "date" (Most Recent)
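
Because the query parameter is a plain string with bracketed PubMed field tags, queries can be assembled programmatically. The sketch below shows one way to do that on the client side; tagged and and_join are hypothetical helper names and are not provided by the server.

```python
def tagged(term: str, tag: str) -> str:
    """Attach a PubMed field tag such as mh, majr, tiab, dp, au or ta to a term."""
    return f"{term}[{tag}]"


def and_join(*clauses: str) -> str:
    """Combine non-empty clauses with the boolean AND operator."""
    return " AND ".join(clause for clause in clauses if clause)


query = and_join(
    tagged("COVID-19", "mh"),
    tagged("vaccine", "tiab"),
    tagged("2023:2024", "dp"),
)
print(query)
```

The resulting string can be passed as the query argument of the search or count tools.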

Example Queries:

"COVID-19 AND vaccine"
"asthma[mh] AND adult[mh]"
"diabetes[tiab] AND 2023:2024[dp]"
"cancer treatment[majr]"

Response (pmcid is present only when full text is available in PMC; full_text_available indicates that availability):

{
  "results": [
    {
      "id": "12345678",
      "title": "Study Title",
      "url": "https://pubmed.ncbi.nlm.nih.gov/12345678/",
      "pmcid": "PMC1234567",
      "full_text_available": true
    }
  ]
}
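
A client will often want only the hits that have PMC full text, so that they can be passed on to get_full_text. The following is a minimal sketch that assumes the response has exactly the shape shown above; pmcids_with_full_text is a hypothetical helper name.

```python
from typing import Any, Dict, List


def pmcids_with_full_text(response: Dict[str, Any]) -> List[str]:
    """Collect PMCIDs of search hits flagged as having full text in PMC."""
    return [
        item["pmcid"]
        for item in response.get("results", [])
        if item.get("full_text_available") and item.get("pmcid")
    ]


sample = {
    "results": [
        {"id": "12345678", "title": "Study Title",
         "pmcid": "PMC1234567", "full_text_available": True},
        {"id": "23456789", "title": "Abstract-only Study",
         "full_text_available": False},
    ]
}
print(pmcids_with_full_text(sample))
```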

2. Fetch

Retrieve abstract for a single PMID only (OpenAI MCP compliant).

Important: This tool accepts exactly one PMID per request. For multiple PMIDs, use fetch_batch.

Parameters:

  • id (string): Single PubMed ID (PMID) - NO ARRAYS OR COMMA-SEPARATED VALUES

Example:

{
  "id": "40930554"
}

Response:

{
  "id": "40930554",
  "title": "Paper Title",
  "text": "Background: ... Methods: ... Results: ... Conclusions: ...",
  "url": "https://pubmed.ncbi.nlm.nih.gov/40930554/",
  "metadata": {
    "journal": "Journal Name",
    "year": "2024",
    "authors": ["Author 1", "Author 2"]
  }
}
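
The one-PMID-per-request rule can be checked on the client before calling the tool. This is a sketch only: validate_single_pmid is a hypothetical helper, and it assumes a PMID is a string of ASCII digits.

```python
import re

_PMID_RE = re.compile(r"[0-9]+")


def validate_single_pmid(value: object) -> str:
    """Return the stripped PMID if value is one digit-only string, else raise."""
    if not isinstance(value, str):
        raise TypeError("id must be a string holding exactly one PMID")
    pmid = value.strip()
    if not _PMID_RE.fullmatch(pmid):
        # Rejects comma-separated lists, empty strings and non-numeric input.
        raise ValueError(f"not a single PMID: {value!r}")
    return pmid
```

Inputs such as "40930554,40929575" or a Python list raise an error here, which is the point at which a caller would switch to fetch_batch.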

3. Fetch Batch

Retrieve abstracts for multiple PMIDs efficiently in one request.

Parameters:

  • pmids (array): List of PubMed IDs (PMIDs)

Example:

{
  "pmids": ["40930554", "40929575", "40929571"]
}

Response:

{
  "items": [
    {
      "pmid": "40930554",
      "title": "Paper Title 1",
      "abstract": "Background: ... Methods: ... Results: ... Conclusions: ...",
      "journal": "Journal Name",
      "year": "2024",
      "authors": ["Author 1", "Author 2"],
      "doi": "10.1234/example.2024.001"
    },
    {
      "pmid": "40929575",
      "title": "Paper Title 2",
      "abstract": "Background: ... Methods: ... Results: ... Conclusions: ...",
      "journal": "Another Journal",
      "year": "2024",
      "authors": ["Author 3", "Author 4"],
      "doi": "10.5678/another.2024.002"
    }
  ]
}
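
When a batch comes back, it can help to index the items by PMID and to see which requested IDs returned nothing. The sketch below assumes the response shape shown above; index_batch is a hypothetical helper name.

```python
from typing import Any, Dict, List, Tuple


def index_batch(
    requested: List[str], response: Dict[str, Any]
) -> Tuple[Dict[str, Dict[str, Any]], List[str]]:
    """Index returned items by PMID and list requested PMIDs that are absent."""
    by_pmid = {item["pmid"]: item for item in response.get("items", [])}
    missing = [pmid for pmid in requested if pmid not in by_pmid]
    return by_pmid, missing


requested = ["40930554", "40929575", "40929571"]
response = {
    "items": [
        {"pmid": "40930554", "title": "Paper Title 1"},
        {"pmid": "40929575", "title": "Paper Title 2"},
    ]
}
by_pmid, missing = index_batch(requested, response)
print(sorted(by_pmid), missing)
```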

4. Get Full Text

Retrieve full-text content for PMCIDs.

Parameters:

  • pmcids (array): List of PMC IDs (with or without 'PMC' prefix)

Example:

{
  "pmcids": ["PMC1234567", "7654321"]
}

Response:

{
  "items": [
    {
      "pmcid": "PMC1234567",
      "jats_xml": "<article>...</article>",
      "pdf_url": "https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1234567/pdf/",
      "status": "success"
    }
  ],
  "notes": [
    "Full text availability depends on PMC Open Access status",
    "JATS XML is provided when available",
    "PDF URLs are only available for Open Access articles"
  ]
}
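
Since pmcids may be given with or without the "PMC" prefix, a client may want to normalize them to one form before comparing or caching results. This is a minimal sketch; normalize_pmcid is a hypothetical helper and does not describe how the server itself handles the prefix.

```python
import re

_PMCID_RE = re.compile(r"\s*(?:PMC)?([0-9]+)\s*", re.IGNORECASE)


def normalize_pmcid(value: str) -> str:
    """Return the ID in canonical 'PMC<digits>' form, accepting either spelling."""
    match = _PMCID_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a PMCID: {value!r}")
    return f"PMC{match.group(1)}"


print([normalize_pmcid(v) for v in ["PMC1234567", "7654321"]])
```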

5. Count

Get result count for a search query without retrieving papers (for query optimization).

Parameters:

  • query (string): Search query with MeSH support

Example:

{
  "query": "lung cancer[mh] AND 2024[dp]"
}

Response:

{
  "query": "lung cancer[mh] AND 2024[dp]",
  "count": 12543,
  "query_translation": "\"lung neoplasms\"[MeSH Terms] AND 2024[dp]",
  "warnings": []
}
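
The count value lends itself to a simple refinement loop: try progressively narrower queries and stop at the first one whose result set is small enough to review. The sketch below illustrates the idea under stated assumptions: count_fn is a hypothetical callable standing in for a call to the count tool, and the threshold of 500 is an arbitrary example.

```python
from typing import Callable, Iterable, Optional, Tuple


def first_manageable_query(
    queries: Iterable[str],
    count_fn: Callable[[str], int],
    max_hits: int = 500,
) -> Optional[Tuple[str, int]]:
    """Return the first (query, count) pair with count <= max_hits, or None."""
    for query in queries:
        hits = count_fn(query)
        if hits <= max_hits:
            return query, hits
    return None


# Stand-in counts for demonstration; real numbers would come from the count tool.
fake_counts = {
    "lung cancer[mh]": 400_000,
    "lung cancer[mh] AND 2024[dp]": 12_000,
    "lung cancer[majr] AND clinical trial[pt] AND 2024[dp]": 450,
}
print(first_manageable_query(fake_counts, fake_counts.__getitem__))
```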

6. Find Similar Articles

Discover related research using PubMed's computational similarity algorithm.

Parameters:

  • pmid (string): PubMed ID of the seed article
  • max_results (integer, optional): Number of similar articles to return (default: 20, max: 100)

Example:

{
  "pmid": "40930554",
  "max_results": 10
}

Response:

{
  "seed_pmid": "40930554",
  "similar_articles": [
    {
      "pmid": "40912345",
      "title": "Related Study Title",
      "url": "https://pubmed.ncbi.nlm.nih.gov/40912345/",
      "pmcid": "PMC1234567",
      "full_text_available": true
    }
  ],
  "count": 10
}

7. Export to RIS

Export PubMed articles to RIS format for citation managers.

Parameters:

  • pmids (array): List of PubMed IDs (PMIDs) to export

Example:

{
  "pmids": ["40930554", "40929575", "40929571"]
}

Response: Returns RIS-formatted text that can be copied and saved as a .ris file for import into EndNote, Zotero, or Mendeley. Citation managers will automatically fetch complete metadata from PubMed using the PMID.

RIS Format Features:

  • Minimal metadata export (PMID, title, first author, journal, year, DOI)
  • Citation managers auto-fetch complete metadata
  • Supports all major citation managers
  • Copy output and save as .ris file for import
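
To show what a minimal RIS record looks like, here is a sketch that builds one from the fields listed above. It is not the server's exporter: to_minimal_ris is a hypothetical function, the input dict follows the fetch_batch item shape, and storing the PMID under the AN (accession number) tag is an assumption about the mapping.

```python
from typing import Any, Dict


def to_minimal_ris(article: Dict[str, Any]) -> str:
    """Render one article as a RIS record ('TAG  - value' lines, closed by ER)."""
    authors = article.get("authors") or []
    fields = [
        ("TY", "JOUR"),                            # record type: journal article
        ("TI", article.get("title")),
        ("AU", authors[0] if authors else None),   # first author only
        ("JO", article.get("journal")),
        ("PY", article.get("year")),
        ("DO", article.get("doi")),
        ("AN", article.get("pmid")),               # assumption: PMID as accession number
    ]
    lines = [f"{tag}  - {value}" for tag, value in fields if value]
    lines.append("ER  - ")                         # end-of-record marker
    return "\n".join(lines) + "\n"


ris = to_minimal_ris({
    "pmid": "40930554",
    "title": "Paper Title 1",
    "journal": "Journal Name",
    "year": "2024",
    "authors": ["Author 1", "Author 2"],
    "doi": "10.1234/example.2024.001",
})
print(ris)
```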

Query Refinement Workflow

The count tool enables efficient query refinement: check how many results a query returns, then narrow it before running a full search.

# Start broad, then refine
"cancer"                                    # 5,000,000+ results (too broad)
"lung cancer"                               # 800,000 results
"lung cancer[mh]"                          # 400,000 results (MeSH term)
"lung cancer[mh] AND 2024[dp]"            # 12,000 results
"lung cancer[majr] AND clinical trial[pt] AND 2024[dp]"  # 450 results (optimal)

MeSH Search Examples

The server supports advanced PubMed search syntax:

# MeSH terms
"diabetes mellitus[mh]"

# MeSH major topic  
"cancer[majr]"

# Title/Abstract
"machine learning[tiab]"

# Publication date
"2023:2024[dp]"

# Combined search
"COVID-19[mh] AND vaccine[tiab] AND 2023:2024[dp]"

# Author search
"Smith J[au]"

# Journal search  
"nature[ta]"

Usage with ChatGPT/Claude

  1. Start the server locally
  2. Use the URL: http://localhost:8000/sse/
  3. Configure as MCP connector in your AI interface
  4. Enable tools: search, fetch, fetch_batch, get_full_text, count, find_similar_articles, export_to_ris

Rate Limiting & Compliance

  • Without API Key: 3 requests/second maximum
  • With API Key: 10 requests/second maximum
  • Automatic retry with exponential backoff
  • Follows NCBI usage guidelines
  • Tool name (and contact email, when configured) identified in requests
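
The two request rates above can be enforced with a small spacing limiter. This is a sketch of the general technique, not the server's actual implementation; requests_per_second and IntervalLimiter are hypothetical names.

```python
import asyncio
import time
from typing import Optional


def requests_per_second(api_key: Optional[str]) -> int:
    """Pick the documented rate: 10 req/s with an API key, 3 req/s without."""
    return 10 if api_key else 3


class IntervalLimiter:
    """Space calls at least 1/rate seconds apart within one event loop."""

    def __init__(self, rate: int) -> None:
        self.min_interval = 1.0 / rate
        self._next_slot = 0.0  # monotonic time at which the next call may start

    async def wait(self) -> None:
        now = time.monotonic()
        delay = max(0.0, self._next_slot - now)
        # Reserve the following slot before sleeping so concurrent callers queue up.
        self._next_slot = max(now, self._next_slot) + self.min_interval
        if delay > 0:
            await asyncio.sleep(delay)
```

A caller would create one limiter, for example IntervalLimiter(requests_per_second(api_key)), and await limiter.wait() before each outgoing request.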

Error Handling

The server handles various error conditions:

  • Rate limit exceeded (429) → Automatic retry
  • Invalid PMIDs/PMCIDs → Graceful error response
  • Network timeouts → Retry with backoff
  • XML parsing errors → Detailed error messages
  • Missing abstracts → "No abstract available" message
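
The retry behaviour for rate-limit responses and timeouts can be pictured as a loop with exponentially growing delays. The sketch below is illustrative only: RetryableError, with_backoff, and the retry count and base delay are assumptions, not values taken from the server.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical marker for transient failures such as HTTP 429 or a timeout."""


async def with_backoff(
    op: Callable[[], Awaitable[T]],
    retries: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Run op, retrying transient failures with delays of base_delay * 2**attempt."""
    for attempt in range(retries + 1):
        try:
            return await op()
        except RetryableError:
            if attempt == retries:
                raise  # out of retries: surface the last error to the caller
            await sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The sleep parameter is injectable so the delays can be inspected in tests without actually waiting.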

Development

Testing Individual Functions

You can test the server functions directly:

import asyncio
from pubmed_mcp_server import PubMedClient

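async def test_search():
    # Note: this runs a live query against the NCBI E-utilities API.
    # The async context manager ensures the client is cleaned up on exit.
    async with PubMedClient() as client:
        # retmax limits how many papers are returned
        results = await client.search_pubmed("COVID-19 AND vaccine", retmax=5)
        print(f"Found {results['count']} papers")
        for item in results['items']:
            print(f"- {item['title']}")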

asyncio.run(test_search())

Logging

Set LOG_LEVEL=DEBUG in .env for detailed request/response logging.

License

This implementation follows NCBI usage policies and guidelines. Please ensure compliance with:

  • NCBI E-utilities usage guidelines
  • PMC Open Access licensing terms
  • Rate limiting requirements
  • Proper attribution in research use