SnippetSquid/OpenAlexMCP
OpenAlex MCP Server provides access to a vast scholarly database with over 240 million works, authors, institutions, and other academic entities.
- search_works: Search for scholarly works such as papers, articles, books, and datasets.
- search_authors: Search for authors and researchers.
- search_institutions: Search for academic institutions.
- search_sources: Search for journals, conferences, and publication venues.
- get_work_details: Get detailed information about a specific work.
- get_author_profile: Get detailed profile information about a specific author.
- get_citations: Get works that cite a specific work for citation analysis.
OpenAlex MCP Server
A Model Context Protocol (MCP) server that provides access to the OpenAlex scholarly database containing 240M+ works, authors, institutions, and other academic entities.
Status: All 127 tests passing | Production ready | Full test coverage
Features
- Comprehensive Search: Search across works, authors, institutions, and publication venues
- Detailed Profiles: Get detailed information about specific works, authors, and institutions
- Citation Analysis: Track citations and find related works
- Rich Metadata: Access comprehensive scholarly data including affiliations, topics, and metrics
- Open Access Focus: Filter for open access publications and venues
- No Authentication Required: Free access to the complete OpenAlex database
- FastMCP Architecture: Built with the latest FastMCP framework for optimal performance
- Robust Error Handling: Comprehensive error handling and logging for production use
- Async Support: Full async/await support for high-performance concurrent requests
Installation
Development Setup
git clone https://github.com/your-username/openalex-mcp.git
cd openalex-mcp
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -e ".[dev]"
Configuration
Environment Variables
- OPENALEX_EMAIL: Your email address (recommended for polite pool access and higher rate limits)
- OPENALEX_TIMEOUT: Request timeout in seconds (default: 30.0)
- OPENALEX_MAX_CONCURRENT: Maximum concurrent requests (default: 10)
- LOG_LEVEL: Logging level (default: INFO)
- LOG_API_REQUESTS: Log API requests for debugging (default: false)
Example Configuration
export OPENALEX_EMAIL="your.email@example.com"
export LOG_LEVEL="DEBUG"
export LOG_API_REQUESTS="true"
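As an illustration of how the variables above could be read, here is a minimal sketch with the documented defaults. The `Settings` class and `load_settings` function are hypothetical names for this example, not the project's actual config module, and accepting "1" or "yes" as true values is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""
    email: Optional[str]
    timeout: float
    max_concurrent: int
    log_level: str
    log_api_requests: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, falling back to the documented defaults."""
    # Assumption: "1", "true" and "yes" (any case) all enable request logging.
    log_requests = env.get("LOG_API_REQUESTS", "false").strip().lower()
    return Settings(
        email=env.get("OPENALEX_EMAIL") or None,  # empty string counts as unset
        timeout=float(env.get("OPENALEX_TIMEOUT", "30.0")),
        max_concurrent=int(env.get("OPENALEX_MAX_CONCURRENT", "10")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        log_api_requests=log_requests in {"1", "true", "yes"},
    )
```

Passing a plain dict instead of `os.environ` makes the function easy to exercise in unit tests.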
Usage
Running the MCP Server
openalex-mcp
Available Tools
1. search_works
Search for scholarly works (papers, articles, books, datasets).
Parameters:
- query (required): Search query for works
- author: Filter by author name
- year_from: Filter works from this year onwards
- year_to: Filter works up to this year
- venue: Filter by venue/journal name
- topic: Filter by research topic/field
- open_access: Filter for open access works only
- sort: Sort order (relevance_score, cited_by_count, publication_date)
- limit: Number of results (max 50, default 10)
Example:
{
  "query": "machine learning",
  "year_from": 2020,
  "open_access": true,
  "sort": "cited_by_count",
  "limit": 10
}
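To make the example concrete, the sketch below shows one way such arguments could be translated into an OpenAlex `/works` request URL. The function name `build_works_url` and the exact mapping are assumptions for illustration; the server's real translation may differ. The `search`, `filter`, `sort`, `per-page`, and `mailto` query parameters are part of the public OpenAlex API.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"


def build_works_url(
    query: str,
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
    open_access: bool = False,
    sort: str = "cited_by_count",
    limit: int = 10,
    email: Optional[str] = None,
) -> str:
    """Hypothetical helper: map search_works arguments to an OpenAlex URL."""
    filters = []
    if year_from is not None:
        filters.append(f"from_publication_date:{year_from}-01-01")
    if year_to is not None:
        filters.append(f"to_publication_date:{year_to}-12-31")
    if open_access:
        filters.append("open_access.is_oa:true")

    params = {
        "search": query,
        "sort": f"{sort}:desc",                 # always descending in this sketch
        "per-page": min(max(limit, 1), 50),     # clamp to the documented max of 50
    }
    if filters:
        params["filter"] = ",".join(filters)    # comma means AND in OpenAlex filters
    if email:
        params["mailto"] = email                # opts in to the polite pool
    return f"{BASE_URL}?{urlencode(params)}"


print(build_works_url("machine learning", year_from=2020, open_access=True))
```

Only the URL is built here; no network request is made.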
2. search_authors
Search for authors and researchers.
Parameters:
- query (required): Search query for author names
- institution: Filter by institution name
- topic: Filter by research area/topic
- h_index_min: Minimum h-index
- works_count_min: Minimum number of works
- sort: Sort order (relevance_score, cited_by_count, works_count, h_index)
- limit: Number of results (max 50, default 10)
3. search_institutions
Search for academic institutions.
Parameters:
- query (required): Search query for institution names
- country: Filter by country code (e.g., 'US', 'GB', 'CA')
- type: Filter by institution type
- works_count_min: Minimum number of works
- sort: Sort order (relevance_score, cited_by_count, works_count)
- limit: Number of results (max 50, default 10)
4. search_sources
Search for journals, conferences, and publication venues.
Parameters:
- query (required): Search query for source names
- type: Filter by source type (journal, conference, repository, etc.)
- publisher: Filter by publisher name
- open_access: Filter for open access sources only
- works_count_min: Minimum number of works published
- sort: Sort order (relevance_score, cited_by_count, works_count, h_index)
- limit: Number of results (max 50, default 10)
5. get_work_details
Get detailed information about a specific work.
Parameters:
- work_id (required): OpenAlex work ID or DOI
6. get_author_profile
Get detailed profile information about a specific author.
Parameters:
- author_id (required): OpenAlex author ID or ORCID
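Because work_id and author_id each accept two identifier styles, a client or server typically has to normalize them before building a request path. The sketch below is one possible approach, not the project's actual code; the function names are hypothetical. It relies on the OpenAlex convention of `W…`/`A…` IDs and the `doi:` and `orcid:` namespace prefixes for external IDs.

```python
import re

_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:)", re.IGNORECASE)
_ORCID = re.compile(r"(\d{4}-\d{4}-\d{4}-\d{3}[\dXx])$")


def normalize_work_id(work_id: str) -> str:
    """Return a bare OpenAlex work ID (W123) or a 'doi:'-prefixed DOI."""
    value = work_id.strip()
    if value.lower().startswith("https://openalex.org/"):
        value = value.rsplit("/", 1)[-1]          # keep only the trailing ID
    if re.fullmatch(r"[Ww]\d+", value):
        return value.upper()
    doi = _DOI_PREFIX.sub("", value)              # strip URL or 'doi:' prefix
    if doi.startswith("10."):
        return f"doi:{doi}"
    raise ValueError(f"Not an OpenAlex work ID or DOI: {work_id!r}")


def normalize_author_id(author_id: str) -> str:
    """Return a bare OpenAlex author ID (A123) or an 'orcid:'-prefixed ORCID."""
    value = author_id.strip()
    match = _ORCID.search(value)                  # matches bare or URL-form ORCIDs
    if match:
        return f"orcid:{match.group(1).upper()}"
    bare = value.rsplit("/", 1)[-1]
    if re.fullmatch(r"[Aa]\d+", bare):
        return bare.upper()
    raise ValueError(f"Not an OpenAlex author ID or ORCID: {author_id!r}")
```

The normalized value can then be appended to `https://api.openalex.org/works/` or `https://api.openalex.org/authors/`.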
7. get_citations
Get works that cite a specific work for citation analysis.
Parameters:
- work_id (required): OpenAlex work ID or DOI of the work to find citations for
- sort: Sort order (publication_date, cited_by_count, relevance_score)
- limit: Number of citing works (max 50, default 20)
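For context, citing works can be retrieved from the OpenAlex API with the `cites:` filter on the `/works` endpoint. The following sketch builds such a URL; `build_citations_url` is a hypothetical name, and it assumes the caller already holds an OpenAlex work ID (a DOI would first need to be resolved to one).

```python
from urllib.parse import urlencode


def build_citations_url(openalex_id: str, sort: str = "publication_date", limit: int = 20) -> str:
    """Hypothetical helper: URL listing works that cite the given OpenAlex work ID."""
    params = {
        "filter": f"cites:{openalex_id}",      # works whose references include this ID
        "sort": f"{sort}:desc",                # newest or most-cited first
        "per-page": min(max(limit, 1), 50),    # clamp to the documented max of 50
    }
    return "https://api.openalex.org/works?" + urlencode(params)


print(build_citations_url("W2741809807"))
```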
Integration with MCP Clients
Claude Desktop
Option 1: Using pip installation (Recommended)
{
  "mcpServers": {
    "OpenAlexMCP": {
      "command": "/path/to/venv/bin/python",
      "args": [
        "-m",
        "src.openalex_mcp.server"
      ],
      "cwd": "/path/to/openalex-mcp",
      "env": {
        "OPENALEX_EMAIL": "your.email@example.com"
      }
    }
  }
}
Continue.dev
Add to your Continue configuration:
{
  "mcpServers": [
    {
      "name": "openalex",
      "command": ["openalex-mcp"],
      "env": {
        "OPENALEX_EMAIL": "your.email@example.com"
      }
    }
  ]
}
Development
Setup Development Environment
With pip
git clone https://github.com/your-username/openalex-mcp.git
cd openalex-mcp
pip install -e ".[dev]"
Run Tests
The project includes a test suite of 127 passing tests, made up of unit tests and integration tests, with code coverage reporting.
Quick Test Commands
With pip/Python
# Run all tests (recommended)
python -m pytest tests/ -v
# Run unit tests only (fast, no network required)
python -m pytest tests/ -m "not slow and not integration" -v
# Run with coverage report
python -m pytest tests/ --cov=src --cov-report=html
Using the Test Runner Script
# Install dependencies and run unit tests
python run_tests.py --install-deps --type unit
# Run with linting and formatting
python run_tests.py --lint --format --type unit
# Run coverage tests
python run_tests.py --type coverage
# Run integration tests (requires network)
python run_tests.py --type integration
Current Test Status: 127/127 Tests Passing
All tests have been recently updated and fixed to work with the current FastMCP server implementation.
Test Types
- Unit Tests: Fast tests that don't require network access and use mocked API responses
- Integration Tests: Tests against the real OpenAlex API (marked as slow and integration)
- Coverage Tests: Unit tests with code coverage reporting
Test Structure
tests/
├── conftest.py # Pytest configuration and fixtures
├── test_client.py # OpenAlex API client tests (19 tests)
├── test_tools.py # MCP tools functionality tests (33 tests)
├── test_server.py # FastMCP server tests (10 tests)
├── test_config.py # Configuration management tests (13 tests)
├── test_models.py # Pydantic model tests (26 tests)
├── test_logging.py # Logging functionality tests (17 tests)
└── test_integration.py # Integration tests (9 tests, network required)
Recent Test Fixes
The test suite has been comprehensively updated to fix all issues:
- Server Tests: Updated for FastMCP architecture with @mcp.tool() decorators
- Logging Tests: Fixed caplog capture with temporary logger propagation
- Integration Tests: Removed fake emails causing API 400 errors
- Config Tests: Fixed module import consistency for dynamic config
- API Parameters: Fixed invalid field names for OpenAlex API compatibility
- Sort Parameters: Updated default sort from problematic relevance_score to cited_by_count
- All Dependencies: Updated for latest FastMCP and async patterns
Running Specific Tests
# Run only client tests
pytest tests/test_client.py
# Run tests matching a pattern
pytest -k "search_works"
# Run with verbose output
pytest -v
# Skip integration tests
pytest -m "not integration and not slow"
Code Formatting
With pip/Python
black src/ tests/
ruff check src/ tests/
mypy src/
Building and Packaging
With pip/build
# Build source distribution and wheel
python -m build
# This creates:
# - dist/openalex_mcp-1.0.0.tar.gz
# - dist/openalex_mcp-1.0.0-py3-none-any.whl
Package Contents Verification
# Check source distribution contents
tar -tzf dist/openalex_mcp-1.0.0.tar.gz
# Check wheel contents
unzip -l dist/openalex_mcp-1.0.0-py3-none-any.whl
For detailed packaging instructions, see .
Rate Limits and Best Practices
- Daily Limit: 100,000 requests per day per user
- Polite Pool: Add your email address to get better performance and higher rate limits
- Concurrent Requests: Limited to 10 concurrent requests by default
- Pagination: Use pagination for large result sets
- Caching: Results are not cached by default - implement caching in your application if needed
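As a rough illustration of the concurrency and caching points above, the sketch below wraps an arbitrary async fetch function with a semaphore and an in-memory cache. `ThrottledCache` and the injected `fetch` callable are hypothetical; this is not part of the server, and a real application would likely add expiry and size limits.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class ThrottledCache:
    """Limit concurrent fetches and remember results by URL (no expiry)."""

    def __init__(self, fetch: Callable[[str], Awaitable[dict]], max_concurrent: int = 10):
        self._fetch = fetch
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self._cache: Dict[str, dict] = {}

    async def get(self, url: str) -> dict:
        if url in self._cache:
            return self._cache[url]          # cache hit: no request issued
        async with self._semaphore:          # at most max_concurrent fetches in flight
            result = await self._fetch(url)
        # Note: two simultaneous misses for the same URL will both fetch.
        self._cache[url] = result
        return result


async def main() -> None:
    async def fake_fetch(url: str) -> dict:  # stand-in for a real HTTP call
        await asyncio.sleep(0)
        return {"url": url}

    client = ThrottledCache(fake_fetch, max_concurrent=10)
    results = await asyncio.gather(*(client.get(f"page-{i}") for i in range(3)))
    print(results)


if __name__ == "__main__":
    asyncio.run(main())
```

Injecting the fetch function keeps the wrapper independent of any particular HTTP library.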
OpenAlex Data Coverage
- 240M+ Works: Journal articles, books, datasets, theses
- Global Coverage: ~2x coverage compared to traditional databases
- Author Disambiguation: Advanced author name disambiguation
- Institution Mapping: Comprehensive institution identification
- Open Access: Full open access status and location information
- Citation Networks: Complete citation relationships
- Research Topics: AI-powered topic classification
License
MIT License - see LICENSE file for details.
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
Troubleshooting
Common Issues
400 "Invalid" API Errors
- Cause: Using fake email addresses (like test@example.com) with the OpenAlex API
- Solution: Either omit the email entirely or use a real email address
- Note: Email is optional but provides better rate limits when valid
MCP JSON Protocol Errors
- Cause: Logger outputting to stdout instead of stderr
- Status: FIXED - Logger now correctly outputs to stderr
- Context: This was causing "Unexpected non-whitespace character after JSON" errors
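The reason this matters is that an MCP server using the stdio transport sends its JSON-RPC messages over stdout, so any log line written there corrupts the stream. A minimal sketch of a stderr-only logging setup follows; the logger name `openalex_mcp` and the function `configure_logging` are assumptions for illustration, not the project's actual logging module.

```python
import logging
import sys


def configure_logging(level: str = "INFO") -> logging.Logger:
    """Attach a single stderr handler so stdout stays reserved for MCP messages."""
    logger = logging.getLogger("openalex_mcp")   # hypothetical logger name
    logger.setLevel(level)
    logger.handlers.clear()                      # avoid duplicate handlers on re-configure
    handler = logging.StreamHandler(sys.stderr)  # explicit: never stdout
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False                     # keep root handlers from re-emitting
    return logger


if __name__ == "__main__":
    configure_logging("DEBUG").debug("logging goes to stderr")
```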
Test Failures
- Status: FIXED - All 127 tests now pass
- Recent Fixes:
- Updated server tests for FastMCP architecture
- Fixed logging test capture with proper propagation
- Resolved config import inconsistencies
- Updated integration tests to avoid API errors
Import Errors
- Cause: Module import path inconsistencies
- Status: FIXED - All imports use consistent relative paths
- Solution: Client now uses relative imports (.config, .logutil)
API Parameter Errors (403 "Invalid query parameters")
- Cause: Using outdated field names like author.display_name.search
- Status: FIXED - Updated to correct API field names
- Solution: Now uses raw_author_name.search and the cited_by_count sort default
Sort Parameter Errors (403 "Sorting relevance score ascending is not allowed")
- Cause: OpenAlex API doesn't allow ascending sort on relevance_score
- Status: FIXED - Changed default sort to cited_by_count
- Solution: Users can still specify relevance_score but it uses default sort order
Development Tips
# Verify all tests pass
python -m pytest tests/ -v
# Check for any import issues
python -c "from src.openalex_mcp.server import mcp; print('All imports working')"
# Test the server manually
python -m src.openalex_mcp.server
Support
- Documentation: OpenAlex API Documentation
- Issues: GitHub Issues
- OpenAlex: OpenAlex Website
Citation
If you use OpenAlex data in your research, please cite:
Priem, J., Piwowar, H., & Orr, R. (2022). OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts. ArXiv. https://arxiv.org/abs/2205.01833