OpenAlexMCP

OpenAlex MCP Server provides access to a vast scholarly database with over 240 million works, authors, institutions, and other academic entities.

Tools
  1. search_works

    Search for scholarly works such as papers, articles, books, and datasets.

  2. search_authors

    Search for authors and researchers.

  3. search_institutions

    Search for academic institutions.

  4. search_sources

    Search for journals, conferences, and publication venues.

  5. get_work_details

    Get detailed information about a specific work.

  6. get_author_profile

    Get detailed profile information about a specific author.

  7. get_citations

    Get works that cite a specific work for citation analysis.

OpenAlex MCP Server

A Model Context Protocol (MCP) server that provides access to the OpenAlex scholarly database containing 240M+ works, authors, institutions, and other academic entities.

Status: ✅ All 127 tests passing | 🚀 Production ready | 📊 Full test coverage

Features

  • Comprehensive Search: Search across works, authors, institutions, and publication venues
  • Detailed Profiles: Get detailed information about specific works, authors, and institutions
  • Citation Analysis: Track citations and find related works
  • Rich Metadata: Access comprehensive scholarly data including affiliations, topics, and metrics
  • Open Access Focus: Filter for open access publications and venues
  • No Authentication Required: Free access to the complete OpenAlex database
  • FastMCP Architecture: Built with the latest FastMCP framework for optimal performance
  • Robust Error Handling: Comprehensive error handling and logging for production use
  • Async Support: Full async/await support for high-performance concurrent requests

Installation

Development Setup

git clone https://github.com/your-username/openalex-mcp.git
cd openalex-mcp

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -e ".[dev]"

Configuration

Environment Variables

  • OPENALEX_EMAIL: Your email address (recommended for polite pool access and higher rate limits)
  • OPENALEX_TIMEOUT: Request timeout in seconds (default: 30.0)
  • OPENALEX_MAX_CONCURRENT: Maximum concurrent requests (default: 10)
  • LOG_LEVEL: Logging level (default: INFO)
  • LOG_API_REQUESTS: Log API requests for debugging (default: false)

Example Configuration

export OPENALEX_EMAIL="your.email@example.com"
export LOG_LEVEL="DEBUG"
export LOG_API_REQUESTS="true"
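
To illustrate how these variables and their documented defaults fit together, here is a minimal sketch of a settings loader. The `Settings` class and `load_settings` function are hypothetical names used only for illustration; the project's actual configuration module may be structured differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; the real config class may differ."""

    email: Optional[str]
    timeout: float
    max_concurrent: int
    log_level: str
    log_api_requests: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    # Treat an empty or whitespace-only email as "not set".
    email = env.get("OPENALEX_EMAIL", "").strip() or None
    return Settings(
        email=email,
        timeout=float(env.get("OPENALEX_TIMEOUT", "30.0")),
        max_concurrent=int(env.get("OPENALEX_MAX_CONCURRENT", "10")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        log_api_requests=env.get("LOG_API_REQUESTS", "false").strip().lower() == "true",
    )


if __name__ == "__main__":
    print(load_settings({"OPENALEX_EMAIL": "your.email@example.com", "LOG_LEVEL": "debug"}))
```

Passing the environment as a mapping keeps the loader easy to test without touching the real process environment.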

Usage

Running the MCP Server

openalex-mcp

Available Tools

1. search_works

Search for scholarly works (papers, articles, books, datasets).

Parameters:

  • query (required): Search query for works
  • author: Filter by author name
  • year_from: Filter works from this year onwards
  • year_to: Filter works up to this year
  • venue: Filter by venue/journal name
  • topic: Filter by research topic/field
  • open_access: Filter for open access works only
  • sort: Sort order (relevance_score, cited_by_count, publication_date)
  • limit: Number of results (max 50, default 10)

Example:

{
  "query": "machine learning",
  "year_from": 2020,
  "open_access": true,
  "sort": "cited_by_count",
  "limit": 10
}
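
As a rough illustration of how a request like the one above could be translated into OpenAlex query parameters, the sketch below builds the parameter dictionary without making any network call. `build_works_params` is a hypothetical helper, and the mapping to the OpenAlex filters `from_publication_date`, `to_publication_date`, and `is_oa` is an assumption about how the server might do it, not a description of its actual code. The author, venue, and topic filters are left out for brevity.

```python
from typing import Dict, Optional


def build_works_params(
    query: str,
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
    open_access: bool = False,
    sort: str = "cited_by_count",
    limit: int = 10,
) -> Dict[str, str]:
    """Hypothetical mapping from tool arguments to OpenAlex /works parameters."""
    filters = []
    if year_from is not None:
        filters.append(f"from_publication_date:{year_from}-01-01")
    if year_to is not None:
        filters.append(f"to_publication_date:{year_to}-12-31")
    if open_access:
        filters.append("is_oa:true")

    # Clamp the limit to the documented range (max 50).
    params = {"search": query, "per-page": str(max(1, min(limit, 50)))}
    if filters:
        # OpenAlex combines comma-separated filters with AND.
        params["filter"] = ",".join(filters)
    # Assumption: for relevance_score the sort parameter is omitted so the
    # API's default ordering applies; other fields are sorted descending.
    if sort != "relevance_score":
        params["sort"] = f"{sort}:desc"
    return params


if __name__ == "__main__":
    print(build_works_params("machine learning", year_from=2020, open_access=True))
```

Omitting `sort` for `relevance_score` avoids requesting an ascending relevance sort, which the API rejects.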

2. search_authors

Search for authors and researchers.

Parameters:

  • query (required): Search query for author names
  • institution: Filter by institution name
  • topic: Filter by research area/topic
  • h_index_min: Minimum h-index
  • works_count_min: Minimum number of works
  • sort: Sort order (relevance_score, cited_by_count, works_count, h_index)
  • limit: Number of results (max 50, default 10)

3. search_institutions

Search for academic institutions.

Parameters:

  • query (required): Search query for institution names
  • country: Filter by country code (e.g., 'US', 'GB', 'CA')
  • type: Filter by institution type
  • works_count_min: Minimum number of works
  • sort: Sort order (relevance_score, cited_by_count, works_count)
  • limit: Number of results (max 50, default 10)

4. search_sources

Search for journals, conferences, and publication venues.

Parameters:

  • query (required): Search query for source names
  • type: Filter by source type (journal, conference, repository, etc.)
  • publisher: Filter by publisher name
  • open_access: Filter for open access sources only
  • works_count_min: Minimum number of works published
  • sort: Sort order (relevance_score, cited_by_count, works_count, h_index)
  • limit: Number of results (max 50, default 10)

5. get_work_details

Get detailed information about a specific work.

Parameters:

  • work_id (required): OpenAlex work ID or DOI
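
Because `work_id` accepts either an OpenAlex ID or a DOI, some normalization is needed before calling the API. The sketch below shows one way this could be done; `work_path` is a hypothetical helper, and the accepted input forms are assumptions chosen for illustration. It relies on the OpenAlex convention that a work can be addressed as `/works/W...` or `/works/doi:...`.

```python
import re

_OPENALEX_WORK = re.compile(r"^W\d+$", re.IGNORECASE)
_DOI = re.compile(r"^10\.\d{4,9}/\S+$")


def work_path(work_id: str) -> str:
    """Return the API path for a work given an OpenAlex ID or a DOI (hypothetical helper)."""
    value = work_id.strip()
    # Full OpenAlex URLs: keep only the trailing ID.
    if value.lower().startswith("https://openalex.org/"):
        value = value.rsplit("/", 1)[-1]
    if _OPENALEX_WORK.match(value):
        return f"/works/{value.upper()}"
    # DOI URLs and "doi:" prefixes reduce to the bare DOI.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if value.lower().startswith(prefix):
            value = value[len(prefix):]
            break
    if _DOI.match(value):
        return f"/works/doi:{value}"
    raise ValueError(f"Not an OpenAlex work ID or DOI: {work_id!r}")


if __name__ == "__main__":
    print(work_path("https://doi.org/10.7717/peerj.4375"))
```

Rejecting unrecognized identifiers early gives the caller a clear error instead of an opaque API failure.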

6. get_author_profile

Get detailed profile information about a specific author.

Parameters:

  • author_id (required): OpenAlex author ID or ORCID

7. get_citations

Get works that cite a specific work for citation analysis.

Parameters:

  • work_id (required): OpenAlex work ID or DOI of the work to find citations for
  • sort: Sort order (publication_date, cited_by_count, relevance_score)
  • limit: Number of citing works (max 50, default 20)
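
For orientation, citing works can be retrieved from OpenAlex with the `cites` filter on the works endpoint. The sketch below builds such a query; `build_citations_params` is a hypothetical helper, it assumes the work has already been resolved to an OpenAlex ID (a DOI would need to be looked up first), and the default sort shown here is an assumption rather than the server's documented default.

```python
from typing import Dict


def build_citations_params(
    openalex_work_id: str,
    sort: str = "publication_date",  # assumed default, for illustration only
    limit: int = 20,
) -> Dict[str, str]:
    """Hypothetical query parameters for listing works that cite a given work."""
    params = {
        "filter": f"cites:{openalex_work_id}",
        # Clamp to the documented maximum of 50 citing works.
        "per-page": str(max(1, min(limit, 50))),
    }
    # relevance_score is only meaningful for searches, so it is skipped here.
    if sort != "relevance_score":
        params["sort"] = f"{sort}:desc"
    return params


if __name__ == "__main__":
    print(build_citations_params("W2741809807", sort="cited_by_count"))
```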

Integration with MCP Clients

Claude Desktop

Option 1: Using pip installation (Recommended)
{
  "mcpServers": {
    "OpenAlexMCP": {
      "command": "/path/to/venv/bin/python",
      "args": [
        "-m",
        "src.openalex_mcp.server"
      ],
      "cwd": "/path/to/openalex-mcp",
      "env": {
        "OPENALEX_EMAIL": "your.email@example.com"
      }
    }
  }
}

Continue.dev

Add to your Continue configuration:

{
  "mcpServers": [
    {
      "name": "openalex",
      "command": ["openalex-mcp"],
      "env": {
        "OPENALEX_EMAIL": "your.email@example.com"
      }
    }
  ]
}

Development

Setup Development Environment

With pip
git clone https://github.com/your-username/openalex-mcp.git
cd openalex-mcp
pip install -e ".[dev]"

Run Tests

The project includes a test suite of 127 tests (unit and integration), all passing, with full code coverage.

Quick Test Commands
With pip/Python
# Run all tests (recommended)
python -m pytest tests/ -v

# Run unit tests only (fast, no network required)  
python -m pytest tests/ -m "not slow and not integration" -v

# Run with coverage report
python -m pytest tests/ --cov=src --cov-report=html

Using the Test Runner Script
# Install dependencies and run unit tests
python run_tests.py --install-deps --type unit

# Run with linting and formatting
python run_tests.py --lint --format --type unit

# Run coverage tests
python run_tests.py --type coverage

# Run integration tests (requires network)
python run_tests.py --type integration

Current Test Status: ✅ 127/127 Tests Passing

All tests have been recently updated and fixed to work with the current FastMCP server implementation.

Test Types
  • Unit Tests: Fast tests that don't require network access, use mocked API responses
  • Integration Tests: Tests against the real OpenAlex API (marked as slow and integration)
  • Coverage Tests: Unit tests with code coverage reporting

Test Structure
tests/
├── conftest.py          # Pytest configuration and fixtures
├── test_client.py       # OpenAlex API client tests (19 tests)
├── test_tools.py        # MCP tools functionality tests (33 tests)
├── test_server.py       # FastMCP server tests (10 tests)
├── test_config.py       # Configuration management tests (13 tests)
├── test_models.py       # Pydantic model tests (26 tests)
├── test_logging.py      # Logging functionality tests (17 tests)
└── test_integration.py  # Integration tests (9 tests, network required)

Recent Test Fixes

The test suite has been comprehensively updated to fix all issues:

  • ✅ Server Tests: Updated for FastMCP architecture with @mcp.tool() decorators
  • ✅ Logging Tests: Fixed caplog capture with temporary logger propagation
  • ✅ Integration Tests: Removed fake emails causing API 400 errors
  • ✅ Config Tests: Fixed module import consistency for dynamic config
  • ✅ API Parameters: Fixed invalid field names for OpenAlex API compatibility
  • ✅ Sort Parameters: Updated default sort from problematic relevance_score to cited_by_count
  • ✅ All Dependencies: Updated for latest FastMCP and async patterns

Running Specific Tests
# Run only client tests
pytest tests/test_client.py

# Run tests matching a pattern
pytest -k "search_works"

# Run with verbose output
pytest -v

# Skip integration tests
pytest -m "not integration and not slow"

Code Formatting

With pip/Python
black src/ tests/
ruff check src/ tests/
mypy src/

Building and Packaging

With pip/build
# Build source distribution and wheel
python -m build

# This creates:
# - dist/openalex_mcp-1.0.0.tar.gz
# - dist/openalex_mcp-1.0.0-py3-none-any.whl

Package Contents Verification
# Check source distribution contents
tar -tzf dist/openalex_mcp-1.0.0.tar.gz

# Check wheel contents  
unzip -l dist/openalex_mcp-1.0.0-py3-none-any.whl

For detailed packaging instructions, see .

Rate Limits and Best Practices

  • Daily Limit: 100,000 requests per day per user
  • Polite Pool: Add your email address to get better performance and higher rate limits
  • Concurrent Requests: Limited to 10 concurrent requests by default
  • Pagination: Use pagination for large result sets
  • Caching: Results are not cached by default - implement caching in your application if needed
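
The concurrency and caching advice above can be sketched as a small wrapper around whatever function performs the HTTP request. `ThrottledCachedFetcher` and the `fetch` callable are hypothetical names introduced for illustration; this is not part of the server, and a production cache would also need expiry and size limits.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class ThrottledCachedFetcher:
    """Limit concurrent requests and memoize results by URL (illustrative only)."""

    def __init__(
        self,
        fetch: Callable[[str], Awaitable[dict]],
        max_concurrent: int = 10,
    ) -> None:
        self._fetch = fetch
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self._cache: Dict[str, dict] = {}

    async def get(self, url: str) -> dict:
        # Serve repeated requests from memory instead of hitting the API again.
        if url in self._cache:
            return self._cache[url]
        # At most max_concurrent fetches run at the same time.
        async with self._semaphore:
            result = await self._fetch(url)
        # Note: two simultaneous first requests for one URL may both fetch.
        self._cache[url] = result
        return result


async def _demo() -> None:
    async def fake_fetch(url: str) -> dict:
        # Stand-in for a real HTTP call so the example runs offline.
        await asyncio.sleep(0)
        return {"url": url}

    fetcher = ThrottledCachedFetcher(fake_fetch, max_concurrent=2)
    pages = [f"https://api.openalex.org/works?page={n}" for n in (1, 2, 3)]
    results = await asyncio.gather(*(fetcher.get(page) for page in pages))
    print(len(results))


if __name__ == "__main__":
    asyncio.run(_demo())
```

Create the fetcher inside a running event loop, as the demo does, so the semaphore is bound to the loop that actually uses it on older Python versions.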

OpenAlex Data Coverage

  • 240M+ Works: Journal articles, books, datasets, theses
  • Global Coverage: ~2x coverage compared to traditional databases
  • Author Disambiguation: Advanced author name disambiguation
  • Institution Mapping: Comprehensive institution identification
  • Open Access: Full open access status and location information
  • Citation Networks: Complete citation relationships
  • Research Topics: AI-powered topic classification

License

MIT License - see LICENSE file for details.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

Troubleshooting

Common Issues

400 "Invalid" API Errors
  • Cause: Using fake email addresses (like test@example.com) with OpenAlex API
  • Solution: Either omit the email entirely or use a real email address
  • Note: Email is optional but provides better rate limits when valid
MCP JSON Protocol Errors
  • Cause: Logger outputting to stdout instead of stderr
  • Status: ✅ FIXED - Logger now correctly outputs to stderr
  • Context: This was causing "Unexpected non-whitespace character after JSON" errors
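
With the stdio transport, stdout carries the JSON protocol messages, so any log line written there corrupts the stream. A minimal sketch of a logger that writes only to stderr is shown here; the function name `configure_logging` and the logger name `openalex_mcp` are assumptions for illustration and may not match the project's logging module.

```python
import logging
import sys


def configure_logging(level: str = "INFO") -> logging.Logger:
    """Send log output to stderr so stdout stays reserved for protocol messages."""
    logger = logging.getLogger("openalex_mcp")  # hypothetical logger name
    logger.setLevel(level.upper())
    # Attach the stderr handler only once, even if called repeatedly.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    # Avoid duplicate output through root-logger handlers that may use stdout.
    logger.propagate = False
    return logger


if __name__ == "__main__":
    configure_logging("debug").debug("this line goes to stderr")
```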
Test Failures
  • Status: ✅ FIXED - All 127 tests now pass
  • Recent Fixes:
    • Updated server tests for FastMCP architecture
    • Fixed logging test capture with proper propagation
    • Resolved config import inconsistencies
    • Updated integration tests to avoid API errors
Import Errors
  • Cause: Module import path inconsistencies
  • Status: ✅ FIXED - All imports use consistent relative paths
  • Solution: Client now uses relative imports (.config, .logutil)
API Parameter Errors (403 "Invalid query parameters")
  • Cause: Using outdated field names like author.display_name.search
  • Status: ✅ FIXED - Updated to correct API field names
  • Solution: Now uses raw_author_name.search and sorts by cited_by_count by default
Sort Parameter Errors (403 "Sorting relevance score ascending is not allowed")
  • Cause: OpenAlex API doesn't allow ascending sort on relevance_score
  • Status: ✅ FIXED - Changed default sort to cited_by_count
  • Solution: Users can still specify relevance_score, in which case the API's default sort order is used

Development Tips

# Verify all tests pass
python -m pytest tests/ -v

# Check for any import issues
python -c "from src.openalex_mcp.server import mcp; print('✅ All imports working')"

# Test the server manually
python -m src.openalex_mcp.server

Support

Citation

If you use OpenAlex data in your research, please cite:

Priem, J., Piwowar, H., & Orr, R. (2022). OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts. arXiv. https://arxiv.org/abs/2205.01833