paper-intelligence by jonastbrg - MCP Server

Paper Intelligence System (PIS)

A local-first database and assistant layer for organizing, analyzing, and retrieving research papers efficiently.

Features

Paper Management: Add, query, and organize research papers with rich metadata
Local SQLite Database: Fast, reliable, and fully offline-capable
YAML Metadata: Human-readable metadata files for each paper
Flexible Querying: Search by title, author, tags, year, importance, and more
Export Capabilities: Export summaries and notes to Markdown
MCP Server: Interact with your paper database through AI assistants (Claude, etc.)
Extensible: Ready for AI integration, semantic search, and automation

Directory Structure

paper-intelligence/
│
├── papers.db                    # SQLite database (created on first run)
├── README.md                    # This file
├── MCP_SETUP.md                 # MCP server setup guide
├── requirements.txt             # Python dependencies
├── pyproject.toml               # Python project configuration
├── mcp_server.py                # MCP server implementation
├── .gitignore                   # Git ignore rules
│
├── raw/                         # PDF files (gitignored)
├── metadata/                    # YAML metadata files (gitignored)
├── scripts/                     # Python scripts
│   ├── init_db.py              # Database initialization
│   ├── ingest_paper.py         # Add new papers
│   ├── query_papers.py         # Query and search
│   └── summarize_paper.py      # Summarize and export
└── embeddings/                  # (Future) Vector embeddings (gitignored)

Setup

1. Install Dependencies

pip install -r requirements.txt

Core dependencies: pyyaml and mcp (for the MCP server). Any additional dependencies are optional and reserved for future features.

2. MCP Server Setup (Optional)

If you want to use this system with AI assistants like Claude:

For Claude Code (CLI)

Add to your Claude Code MCP settings file (~/.config/claude-code/mcp_settings.json):

{
  "mcpServers": {
    "paper-intelligence": {
      "command": "python3",
      "args": [
        "/path/to/paper-intelligence/mcp_server.py"
      ]
    }
  }
}

Replace /path/to/paper-intelligence/ with the actual path to your cloned repository.

Then restart Claude Code or reload the MCP servers.
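As a convenience, the settings entry above could be generated with the absolute path already filled in. This is only a sketch: `build_mcp_entry` is a hypothetical helper that is not part of this repository, and the directory used in the example is a placeholder.

```python
import json
from pathlib import Path


def build_mcp_entry(repo_dir):
    """Return a settings dict pointing at mcp_server.py inside repo_dir (hypothetical helper)."""
    # Resolve to an absolute path so the server can be launched
    # regardless of the assistant's working directory.
    server_script = Path(repo_dir).expanduser().resolve() / "mcp_server.py"
    return {
        "mcpServers": {
            "paper-intelligence": {
                "command": "python3",
                "args": [str(server_script)],
            }
        }
    }


if __name__ == "__main__":
    # "~/code/paper-intelligence" is a placeholder, not a required location.
    print(json.dumps(build_mcp_entry("~/code/paper-intelligence"), indent=2))
```

The printed JSON can then be pasted or merged into the settings file by hand.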

For Claude Desktop

See MCP_SETUP.md for Claude Desktop configuration instructions.

3. Initialize Database

The database has already been initialized, but you can reinitialize it if needed:

python3 scripts/init_db.py

Usage

Add a New Paper

# Move PDF to database (removes original)
python3 scripts/ingest_paper.py path/to/paper.pdf

# Copy PDF to database (keeps original)
python3 scripts/ingest_paper.py path/to/paper.pdf --copy

You'll be prompted to enter:

Title
Authors
Collaborators (optional)
Publication date (YYYY-MM-DD)
Summary/Abstract
Key ideas
Tags
Importance rating (1-10)
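The two constrained fields above (the date format and the 1-10 rating) lend themselves to a quick check before anything is written to the database. The sketch below is illustrative only: `PaperInput` and `validate` are hypothetical names, and ingest_paper.py may validate its input differently or not at all.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class PaperInput:
    """Hypothetical container for the values collected at the prompts."""
    title: str
    authors: str
    date_published: str          # expected as YYYY-MM-DD
    importance: int              # expected in the range 1-10
    collaborators: str = ""      # optional
    summary: str = ""
    key_ideas: str = ""
    tags: list = field(default_factory=list)

    def validate(self):
        """Return a list of problems; an empty list means the input looks fine."""
        problems = []
        try:
            # strptime rejects other separators and impossible dates such as 2024-02-30.
            datetime.strptime(self.date_published, "%Y-%m-%d")
        except ValueError:
            problems.append("date_published must be YYYY-MM-DD")
        if not 1 <= self.importance <= 10:
            problems.append("importance must be between 1 and 10")
        return problems
```

For example, `PaperInput("T", "A", "2024/01/15", 11).validate()` reports both problems, while a well-formed entry returns an empty list.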

Query Papers

List all papers:

python3 scripts/query_papers.py list

List with filters:

# Filter by author
python3 scripts/query_papers.py list --author "Smith"

# Filter by tag
python3 scripts/query_papers.py list --tag "robotics"

# Filter by year
python3 scripts/query_papers.py list --year 2024

# Filter by minimum importance
python3 scripts/query_papers.py list --min-importance 8

# Combine filters
python3 scripts/query_papers.py list --tag "ML" --min-importance 7 --year 2024

# Show detailed view
python3 scripts/query_papers.py list --detailed

# Limit results
python3 scripts/query_papers.py list --limit 10

# Sort by importance, date, or title
python3 scripts/query_papers.py list --sort importance
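To illustrate how the filters above could combine into a single query, here is a minimal sketch against the `papers` columns. It assumes substring matching for authors and tags and descending order for importance and date; `build_list_query` is a hypothetical function, and query_papers.py may build its SQL differently. The sample rows are made up for the demonstration.

```python
import sqlite3


def build_list_query(author=None, tag=None, year=None,
                     min_importance=None, sort="importance", limit=None):
    """Return (sql, params) for a filtered listing of the papers table."""
    clauses, params = [], []
    if author:
        clauses.append("authors LIKE ?")
        params.append(f"%{author}%")
    if tag:
        clauses.append("tags LIKE ?")
        params.append(f"%{tag}%")
    if year:
        # date_published is YYYY-MM-DD text, so the year is its first four characters.
        clauses.append("substr(date_published, 1, 4) = ?")
        params.append(str(year))
    if min_importance is not None:
        clauses.append("importance >= ?")
        params.append(min_importance)

    # Column names cannot be bound as parameters, so map sort keys through a fixed table.
    order_by = {
        "importance": "importance DESC",
        "date": "date_published DESC",
        "title": "title ASC",
    }[sort]

    sql = "SELECT id, title, importance FROM papers"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += f" ORDER BY {order_by}"
    if limit is not None:
        sql += " LIMIT ?"
        params.append(limit)
    return sql, params


# Demonstration on an in-memory database with illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (id INTEGER PRIMARY KEY, title TEXT, authors TEXT, "
             "date_published TEXT, tags TEXT, importance INTEGER)")
conn.executemany(
    "INSERT INTO papers (title, authors, date_published, tags, importance) "
    "VALUES (?, ?, ?, ?, ?)",
    [("Paper A", "Smith, Lee", "2024-03-01", "ML, robotics", 9),
     ("Paper B", "Jones", "2023-06-15", "ML", 6),
     ("Paper C", "Smith", "2024-11-20", "CV", 5)],
)
sql, params = build_list_query(tag="ML", min_importance=7, year=2024)
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Every filter value is passed as a bound parameter, so user input never ends up inside the SQL string itself.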

Show specific paper:

python3 scripts/query_papers.py show <paper_id>

Search papers:

python3 scripts/query_papers.py search "adversarial attacks"
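One plausible way to implement such a search is a substring match across the text columns. The sketch below assumes that approach; which columns the real `search` command inspects is not documented here, so `SEARCH_COLUMNS` and `search_papers` are illustrative names and the sample rows are made up.

```python
import sqlite3

# Assumed set of searchable columns; the actual script may use a different set.
SEARCH_COLUMNS = ("title", "authors", "summary", "key_ideas", "tags")


def search_papers(conn, text):
    """Return (id, title) rows where any searched column contains text."""
    where = " OR ".join(f"{col} LIKE ?" for col in SEARCH_COLUMNS)
    sql = f"SELECT id, title FROM papers WHERE {where} ORDER BY id"
    pattern = f"%{text}%"
    return conn.execute(sql, [pattern] * len(SEARCH_COLUMNS)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (id INTEGER PRIMARY KEY, title TEXT, authors TEXT, "
             "summary TEXT, key_ideas TEXT, tags TEXT)")
conn.executemany(
    "INSERT INTO papers (title, authors, summary, key_ideas, tags) VALUES (?, ?, ?, ?, ?)",
    [("Robust Vision Models", "Smith", "Studies adversarial attacks on classifiers.", "", "CV"),
     ("Policy Gradients", "Lee", "A survey of RL methods.", "", "ML/RL")],
)
matches = search_papers(conn, "adversarial attacks")
print(matches)
```

SQLite's `LIKE` is case-insensitive for ASCII text by default, so "Adversarial" and "adversarial" would match the same rows.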

View statistics:

python3 scripts/query_papers.py stats

Update Paper Summaries

Interactive update:

python3 scripts/summarize_paper.py update <paper_id>

You can update:

Summary
Key ideas
Personal notes

Export to Markdown:

python3 scripts/summarize_paper.py export <paper_id>
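The layout of the exported file is not described here, so the following is only a sketch of what a Markdown export might look like. `paper_to_markdown` is a hypothetical function operating on a dict keyed by the `papers` column names; the headings and field order are assumptions.

```python
def paper_to_markdown(paper):
    """Render one paper record (a dict keyed by column name) as a Markdown document."""
    lines = [
        f"# {paper['title']}",
        "",
        f"- Authors: {paper['authors']}",
        f"- Published: {paper['date_published']}",
        f"- Tags: {paper['tags']}",
        f"- Importance: {paper['importance']}/10",
        "",
        "## Summary",
        "",
        paper["summary"],
        "",
        "## Key Ideas",
        "",
        paper["key_ideas"],
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative record, not real data.
    example = {
        "title": "Example Title",
        "authors": "Smith, Lee",
        "date_published": "2024-03-01",
        "tags": "ML/RL",
        "importance": 8,
        "summary": "One-paragraph summary.",
        "key_ideas": "First idea; second idea.",
    }
    print(paper_to_markdown(example))
```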

Database Schema

Table: `papers`

| Column | Type | Description |
|--------|------|-------------|
| `id` | INTEGER | Auto-incrementing ID |
| `title` | TEXT | Paper title |
| `authors` | TEXT | Author list (comma-separated) |
| `collaborators` | TEXT | Key collaborators |
| `date_published` | TEXT | Publication date (YYYY-MM-DD) |
| `summary` | TEXT | Abstract + personal summary |
| `key_ideas` | TEXT | Key insights |
| `tags` | TEXT | Keywords/categories |
| `importance` | INTEGER | Rating (1-10) |
| `file_path` | TEXT | Path to PDF |
| `metadata_path` | TEXT | Path to YAML metadata |
| `added_at` | TEXT | Timestamp of ingestion |
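For reference, the `papers` columns could be expressed as SQLite DDL roughly as follows. The column names and types come from the schema above; the `AUTOINCREMENT`, `CHECK`, and `DEFAULT` constraints are assumptions added for illustration, and init_db.py may declare the table differently.

```python
import sqlite3

PAPERS_DDL = """
CREATE TABLE IF NOT EXISTS papers (
    id             INTEGER PRIMARY KEY AUTOINCREMENT,
    title          TEXT,
    authors        TEXT,
    collaborators  TEXT,
    date_published TEXT,
    summary        TEXT,
    key_ideas      TEXT,
    tags           TEXT,
    importance     INTEGER CHECK (importance BETWEEN 1 AND 10),  -- assumed constraint
    file_path      TEXT,
    metadata_path  TEXT,
    added_at       TEXT DEFAULT CURRENT_TIMESTAMP                -- assumed default
);
"""

# An in-memory database keeps the sketch from touching a real papers.db.
conn = sqlite3.connect(":memory:")
conn.execute(PAPERS_DDL)
columns = [row[1] for row in conn.execute("PRAGMA table_info(papers)")]
print(columns)
```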

Table: `embeddings`

(For future semantic search capabilities)

| Column | Type | Description |
|--------|------|-------------|
| `paper_id` | INTEGER | Foreign key to papers |
| `embedding` | BLOB | Vector representation |
| `model` | TEXT | Embedding model name |
| `created_at` | TEXT | Timestamp |
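Since this table is reserved for a future feature, the storage format of the `embedding` BLOB is not yet defined. One common option is packed little-endian float32 values, sketched below with the standard `struct` module. `pack_vector`, `unpack_vector`, and the model name are hypothetical.

```python
import sqlite3
import struct


def pack_vector(values):
    """Encode floats as little-endian float32 bytes suitable for a BLOB column."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_vector(blob):
    """Decode bytes written by pack_vector back into a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE papers (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("""
    CREATE TABLE embeddings (
        paper_id   INTEGER REFERENCES papers(id),
        embedding  BLOB,
        model      TEXT,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")
conn.execute("INSERT INTO papers (id, title) VALUES (1, 'Example paper')")
conn.execute(
    "INSERT INTO embeddings (paper_id, embedding, model) VALUES (?, ?, ?)",
    (1, pack_vector([0.5, -1.0, 2.0]), "example-model"),
)
blob = conn.execute("SELECT embedding FROM embeddings WHERE paper_id = 1").fetchone()[0]
vector = unpack_vector(blob)
print(vector)
```

float32 halves the storage of Python's native doubles at the cost of some precision, which is usually acceptable for similarity search.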

Examples

Example Workflow

# 1. Add a new paper
python3 scripts/ingest_paper.py ~/Downloads/new_paper.pdf

# 2. List all papers
python3 scripts/query_papers.py list

# 3. View a specific paper
python3 scripts/query_papers.py show 1

# 4. Update summary and notes
python3 scripts/summarize_paper.py update 1

# 5. Search for papers on a topic
python3 scripts/query_papers.py search "reinforcement learning"

# 6. Export paper to markdown
python3 scripts/summarize_paper.py export 1

# 7. View statistics
python3 scripts/query_papers.py stats

Future Enhancements

Phase 2: Automation

Folder watcher for automatic ingestion
PDF metadata extraction (PyPDF2, pdfplumber)
API integration (CrossRef, Semantic Scholar)
Embedding generation for semantic search

Phase 3: AI Integration

Automatic summarization using LLMs
Semantic search with vector embeddings
Related paper recommendations
REST API for LLM agents

Phase 4: Sync & Collaboration

Google Drive sync
Multi-user support
Citation network visualization
Obsidian/Notion integration

Tips

Tags: Use consistent, hierarchical tags (e.g., ML/RL, CV/detection)
Importance: Rate based on relevance to your research
Metadata Files: You can manually edit the YAML files in metadata/
Backup: Regularly back up papers.db and the raw/ folder
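For the database part of a backup, Python's `sqlite3` module provides `Connection.backup`, which copies a consistent snapshot even while the file is in use. The sketch below assumes a date-stamped file name; `backup_database` is a hypothetical helper, and the demonstration runs entirely inside a temporary directory.

```python
import sqlite3
import tempfile
from pathlib import Path


def backup_database(db_path, backup_dir, stamp):
    """Copy the SQLite file at db_path into backup_dir as papers-<stamp>.db."""
    target = Path(backup_dir) / f"papers-{stamp}.db"
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(target)
    try:
        src.backup(dst)  # online backup: copies a consistent snapshot of the source
    finally:
        dst.close()
        src.close()
    return target


# Demonstration in a temporary directory with an illustrative row.
with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "papers.db"
    conn = sqlite3.connect(db)
    conn.execute("CREATE TABLE papers (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO papers (title) VALUES ('Example paper')")
    conn.commit()
    conn.close()

    copy = backup_database(db, tmp, "2025-10-25")
    check = sqlite3.connect(copy)
    restored = check.execute("SELECT title FROM papers").fetchall()
    check.close()
    print(restored)
```

The PDFs in raw/ are ordinary files, so they can be copied alongside with any file-copy tool, for example `shutil.copytree`.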

Troubleshooting

Database locked error:

Close any SQLite browser tools
Only one script should write to the database at a time
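If you write your own scripts against papers.db, two habits tend to reduce lock errors: giving the connection a longer lock timeout and keeping write transactions short. The sketch below assumes nothing about how the bundled scripts open the database; `open_for_writing` is a hypothetical helper.

```python
import sqlite3


def open_for_writing(db_path, wait_seconds=10.0):
    """Open a connection that waits up to wait_seconds for a competing writer."""
    # sqlite3.connect waits 5 seconds by default before raising
    # "database is locked"; a larger timeout tolerates slower writers.
    return sqlite3.connect(db_path, timeout=wait_seconds)


# ":memory:" keeps the demonstration away from a real papers.db.
conn = open_for_writing(":memory:")
conn.execute("CREATE TABLE papers (id INTEGER PRIMARY KEY, title TEXT)")
with conn:  # commits on success, rolls back on error, so the lock is released promptly
    conn.execute("INSERT INTO papers (title) VALUES ('Example paper')")
count = conn.execute("SELECT COUNT(*) FROM papers").fetchone()[0]
print(count)
```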

Import error for yaml:

pip install pyyaml

Permission denied:

chmod +x scripts/*.py

License

Personal research tool. Use freely for academic and research purposes.

Contributing

This is a personal system, but feel free to fork and extend for your needs.

Version: 1.0.0
Last Updated: 2025-10-25