arxiv-mcp-server
If you are the rightful owner of arxiv-mcp-server and would like to certify it and/or have it hosted online, please leave a comment on the right or send an email to henry@mcphub.com.
ArXivBot is a research automation tool that integrates with AI assistants using the Model Context Protocol (MCP) to streamline academic research tasks.
ArXivBot - Research Automation for ArXiv Papers
š¤ Automate your literature review - A powerful research bot with 45+ tools that finds evidence to support or contradict your hypotheses across ArXiv papers
Latest Update (v1.0.1): Fixed critical search function issue in MCP integration. Added diagnostics tool for troubleshooting.
ArXivBot is both a CLI tool for direct command-line research automation AND an MCP server for AI assistant integration. It automates the tedious parts of academic research: searching papers, extracting key information, finding supporting or contradicting evidence, and building a searchable knowledge base. Let the bot handle the grunt work while you focus on the science.
šļø Two Ways to Use ArXivBot
1. Direct CLI Usage
# Use ArXivBot directly from your terminal
arxiv-cli search "quantum computing" --max-results 10
arxiv-cli find-support "My hypothesis about X" --all
2. MCP Integration with AI Assistants
// Connect to Claude or other AI assistants
{
"mcpServers": {
"arxiv-bot": {
"command": "python",
"args": ["-m", "arxiv_mcp_server"]
}
}
}
This dual interface means you can use ArXivBot standalone for automated workflows OR let AI assistants like Claude access its capabilities for more complex research tasks.
šÆ The Power of Bolster & Contradict
The killer feature of ArXivBot is its ability to automatically find evidence across multiple papers that either supports (bolsters) or challenges (contradicts) your research hypotheses:
# Test your hypothesis against the literature
arxiv-cli find-support "Quantum computers can solve NP-complete problems in polynomial time" \
--all \
--type both
# Output:
# SUPPORTING EVIDENCE (3 findings):
# ⢠Paper 2401.12345: "Recent experiments demonstrate polynomial speedup for specific NP problems..."
# ⢠Paper 2401.67890: "Our quantum algorithm achieves O(n²) complexity for subset sum..."
#
# CONTRADICTING EVIDENCE (7 findings):
# ⢠Paper 2402.11111: "Theoretical limits show exponential lower bounds remain for general NP..."
# ⢠Paper 2402.22222: "No quantum advantage observed in comprehensive NP-complete benchmarks..."
This feature alone can save days of manual paper reading by automatically identifying relevant passages across your entire paper library.
š Quick Start
Installation Options
Option 1: Docker (Recommended for Production)
Best for users who want isolation and easy deployment. See for details.
# Clone and run with Docker
git clone https://github.com/yourusername/arxiv-mcp-server.git
cd arxiv-mcp-server
docker compose up -d arxiv-mcp
Option 2: Local Installation
For development or users comfortable with Python environments.
# Clone the repository
git clone https://github.com/yourusername/arxiv-mcp-server.git
cd arxiv-mcp-server
# Install with uv (recommended)
uv sync
# Or install with pip
pip install -e .
Quick Test (No API Keys Required!)
# Test search functionality
arxiv-cli search "quantum computing" --max-results 5
# Test with mock evidence extraction
arxiv-cli find-support "Quantum computers are faster" --all --provider mock
# Check server health and connectivity (new!)
arxiv-cli diagnostics
Basic Automation Workflow
# 1. Bot searches for relevant papers
arxiv-cli search "transformer architecture" --max-results 20
# 2. Bot downloads them all
arxiv-cli batch-download --search "transformer architecture" --max 10
# 3. Bot finds evidence for your hypothesis
arxiv-cli find-support "Attention mechanisms improve model interpretability" --all
# 4. Bot extracts all citations for your bibliography
arxiv-cli extract-citations 2401.12345 --format bibtex >> refs.bib
Enable Semantic Search (Optional)
# Index downloaded papers for natural language search
arxiv-cli index-papers
# Search with natural language
arxiv-cli semantic-search "how do transformers handle long sequences"
New: Daily Research Workflow
# Set up daily digest for your research area
arxiv-cli create-digest "My Research" \
--keywords "transformer,attention" \
--authors "Vaswani,Hinton" \
--categories "cs.LG,cs.CL"
# Get your daily digest
arxiv-cli daily-digest --format markdown > today.md
# Add interesting papers to reading list
arxiv-cli add-reading 2401.12345 --priority high --tags "important"
# Track citations of your papers
arxiv-cli citations 1706.03762 --limit 20
# Export bibliography when writing
arxiv-cli export-reading --format bibtex --tags "my-paper" > refs.bib
Enhanced Daily Workflow Features
# Check if your saved papers have been updated
arxiv-cli check-updates --all
# Follow your favorite researchers
arxiv-cli follow "Yoshua Bengio" --notes "Deep learning pioneer"
arxiv-cli check-authors --days 30
# Quick citation copying (multiple styles)
arxiv-cli copy-cite 1706.03762 --style apa # Copies to clipboard!
# Save complex searches as templates
arxiv-cli save-search "ML Security" \
--query "adversarial attacks" \
--author "Goodfellow" \
--category "cs.LG"
arxiv-cli run-search "ML Security"
# Organize papers by project
arxiv-cli create-collection "PhD Chapter 3" --desc "Attention mechanisms"
arxiv-cli add-to-collection 1706.03762 "PhD Chapter 3" --notes "Seminal paper"
šŖ Core Automation Features
1. Evidence Mining (Bolster/Contradict)
The bot's most powerful feature - automatically mine papers for supporting or contradicting evidence:
# Find supporting evidence only
arxiv-cli find-support "Thermal storage at 600°C is feasible" --all --type bolster
# Find contradicting evidence only
arxiv-cli find-support "Thermal storage at 600°C is feasible" --all --type contradict
# Find both and build a balanced view
arxiv-cli find-support "Thermal storage at 600°C is feasible" --all --type both
# Search your findings database later
arxiv-cli search-findings "thermal storage" --type contradict
The bot analyzes papers section by section, extracting relevant excerpts with confidence scores and storing them in a searchable database.
2. Automated Literature Processing
Let the bot handle the repetitive tasks:
# Bot downloads papers matching your criteria
arxiv-cli batch-download --search "quantum error correction" --max 20
# Bot summarizes each paper
for paper in $(arxiv-cli list-papers); do
arxiv-cli summarize $paper --type abstract >> summaries.txt
done
# Bot extracts all code examples
arxiv-cli analyze-code 2401.12345 --lang python --extract-functions
3. Research Validation Automation
Validate your research claims against the literature:
# Bot compares papers with your research
arxiv-cli compare 2401.12345 "My approach uses topological quantum codes"
# Bot finds similar papers to check for prior art
arxiv-cli find-similar 2401.12345 --type content --top 10
4. Knowledge Base Building
The bot automatically builds a searchable research database:
# Bot stores your insights
arxiv-cli add-note 2401.12345 "Contradicts our approach but methodology is sound" \
--tag contradiction --tag methodology
# Bot searches your knowledge base
arxiv-cli search-findings "methodology" --top 20
š ļø Complete Tool Arsenal
Research Automation Tools
- find-support - Mine papers for supporting/contradicting evidence ā
- search-findings - Query your evidence database
- batch-download - Mass download papers matching criteria
- compare - Compare papers with your research
Information Extraction
- summarize - Auto-generate paper summaries
- extract-citations - Extract bibliography in any format
- extract-sections - Pull specific sections from papers
- analyze-code - Extract and analyze code blocks
Discovery & Organization
- search - Smart ArXiv search with filters
- find-similar - Discover related papers
- add-note - Build your knowledge base
- list-notes - Search your annotations
Analysis Tools
- describe-content - AI description of figures/tables
- conversion-options - PDF processing options
- system-stats - Bot performance metrics
Semantic Search
- index-papers - Build search index with embeddings
- semantic-search - Natural language search across papers
- search-stats - View search database statistics
š¬ Research Automation Examples
Hypothesis Testing Workflow
# Define your hypothesis
HYPOTHESIS="Transformer models scale linearly with data size"
# 1. Bot finds relevant papers
arxiv-cli search "transformer scaling laws" --max-results 30
# 2. Bot downloads them all
arxiv-cli batch-download --search "transformer scaling laws" --max 30
# 3. Bot finds all supporting evidence
arxiv-cli find-support "$HYPOTHESIS" --all --type bolster > supporting.txt
# 4. Bot finds all contradicting evidence
arxiv-cli find-support "$HYPOTHESIS" --all --type contradict > contradicting.txt
# 5. Bot helps you analyze the balance
echo "Supporting: $(wc -l < supporting.txt) findings"
echo "Contradicting: $(wc -l < contradicting.txt) findings"
Prior Art Search
# Your innovation
IDEA="Using attention mechanisms for time series forecasting"
# Bot searches for prior work
arxiv-cli find-support "$IDEA" --all --type bolster
# If bot finds many supporting cases, prior art exists
# If bot finds few/none, your idea might be novel!
Literature Review Automation
# Bot builds your literature review
TOPIC="quantum machine learning"
# 1. Systematic search
arxiv-cli search "$TOPIC" --from 2020-01-01 --max-results 100
# 2. Bulk download
arxiv-cli batch-download --search "$TOPIC" --max 50
# 3. Extract key findings
arxiv-cli find-support "Quantum ML provides exponential speedup" --all
# 4. Generate bibliography
for paper in $(arxiv-cli list-papers); do
arxiv-cli extract-citations $paper --format bibtex
done > bibliography.bib
š¤ MCP Server Integration
ArXivBot implements the Model Context Protocol (MCP), making all its tools available to AI assistants like Claude. This means you can ask Claude to use ArXivBot's capabilities naturally:
Example Claude Interactions:
- "Use ArXivBot to find papers that contradict the idea that quantum computers can break RSA"
- "Search for recent transformer papers and find evidence supporting attention mechanism efficiency"
- "Download papers about nuclear fusion and extract all their citations"
MCP Configuration Options
Option 1: Docker (Recommended)
{
"mcpServers": {
"arxiv-bot": {
"command": "bash",
"args": ["/path/to/arxiv-mcp-server/scripts/docker-mcp-wrapper.sh"]
}
}
}
Option 2: Local Python
{
"mcpServers": {
"arxiv-bot": {
"command": "python",
"args": ["-m", "arxiv_mcp_server"]
}
}
}
Or with custom storage path:
{
"mcpServers": {
"arxiv-bot": {
"command": "python",
"args": ["-m", "arxiv_mcp_server", "--storage-path", "/path/to/papers"]
}
}
}
The MCP server exposes all 45+ tools to AI assistants, allowing them to automate complex research workflows on your behalf, including:
- Reading list management and paper update checking
- Author following and new paper notifications
- Daily digests with personalized filtering
- Citation tracking and quick citation copying
- Search templates for repeated queries
- Paper collections for project organization
- Export to all major reference managers
š§ Troubleshooting
Search Not Working?
If you're experiencing issues with the search function, use the diagnostics tool:
# Check server health
arxiv-cli diagnostics
# Output shows:
# - ArXiv API connectivity status
# - Storage path and statistics
# - Python environment details
# - Any configuration issues
Common fixes:
- Network Issues: Ensure you can reach arxiv.org
- Rate Limiting: The server automatically handles rate limits with retries
- Empty Results: Try broader search terms or remove date filters
MCP Integration Issues?
For Claude Desktop or other MCP clients showing "No result received":
- Run
arxiv-cli diagnostics
to check server health - Check MCP server logs for detailed error messages
- Ensure proper path configuration in your MCP settings
āļø Configuration
Storage Location
export ARXIV_STORAGE_PATH=/path/to/your/papers
LLM Providers (for advanced features)
ArXivBot works out of the box with a mock provider for testing. For production use with real AI analysis:
# For Gemini (recommended - has free tier)
export GEMINI_API_KEY=your-key
# For OpenAI
export OPENAI_API_KEY=your-key
# For Anthropic
export ANTHROPIC_API_KEY=your-key
# Default is mock provider (no API key needed)
arxiv-cli find-support "test hypothesis" --all # Works without API keys!
PDF Conversion
# Fast mode (default)
arxiv-cli download 2401.12345 --converter pymupdf4llm
# Accurate mode (slower but better for tables/equations)
arxiv-cli download 2401.12345 --converter marker-pdf
šļø Architecture
arxiv-bot/
āāā src/arxiv_mcp_server/
ā āāā tools/ # 45+ automation tools
ā ā āāā research_support.py # Bolster/contradict engine
ā ā āāā reading_list.py # Paper organization
ā ā āāā daily_digest.py # Filtered notifications
ā ā āāā citation_tracking.py # Citation networks
ā ā āāā export_references.py # BibTeX/RIS export
ā ā āāā paper_updates.py # Version tracking
ā ā āāā author_follow.py # Author monitoring
ā ā āāā quick_cite.py # Citation copying
ā ā āāā search_templates.py # Saved searches
ā ā āāā paper_collections.py # Project organization
ā ā āāā ...
ā āāā converters/ # PDF processors
ā āāā llm_providers.py # AI integrations
ā āāā cli.py # Command interface
ā āāā server.py # MCP server
āāā examples/ # Automation examples
āāā docs/ # Documentation
š Documentation
- - Complete list of all 30+ research automation tools
- - All bot commands at a glance
- - Detailed automation workflows
- - Complete automation scripts
š¤ Contributing
Help make research automation even better:
- Fork the repository
- Create a feature branch
- Follow coding standards
- Submit a pull request
Ideas for contributions:
- Improve bolster/contradict detection algorithms
- Add new paper sources beyond ArXiv
- Create research workflow templates
- Enhance the evidence ranking system
š License
MIT License - see for details.
ArXivBot - Automating Literature Review Since 2024
Let the bot read papers while you do science