vlad-ds/pdf-agent-mcp
PDF Agent MCP is a Model Context Protocol server designed for efficient PDF processing and content extraction, tailored for AI systems to handle large documents without overwhelming context windows.
- `get_pdf_metadata`: Extracts metadata from the PDF to understand document properties.
- `get_pdf_outline`: Retrieves the table of contents to help navigate the document.
- `get_pdf_images`: Converts PDF pages to images for visual content analysis.
PDF Agent MCP
A Model Context Protocol server designed for agentic reading and selective PDF processing. Enables AI systems to efficiently navigate and extract content from PDFs without overwhelming context windows.
Features
- Metadata Extraction: Get PDF properties, page count, and file information
- Text Extraction: Native text extraction with hybrid processing for better results
- Image Conversion: Convert PDF pages to optimized images for visual analysis
- Content Search: Pattern/regex search with context snippets
- Table of Contents: Extract bookmarks and document outline
- Flexible Path Support: Use absolute paths or relative paths from `~/pdf-agent/`
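The path rule in the last bullet can be sketched roughly as follows. This is only an illustration, not the server's actual code: `resolvePdfPath` and its `homeDir` parameter are hypothetical names, and the sketch assumes POSIX-style paths.

```typescript
// Hypothetical helper: the real server's resolution logic may differ.
// Assumes POSIX-style paths; Windows drive letters are not handled.
function resolvePdfPath(input: string, homeDir: string): string {
  // Absolute paths are used as given.
  if (input.startsWith("/")) {
    return input;
  }
  // Relative paths are interpreted relative to <home>/pdf-agent/.
  const base = `${homeDir.replace(/\/+$/, "")}/pdf-agent`;
  return `${base}/${input.replace(/^\.\//, "")}`;
}

console.log(resolvePdfPath("/data/report.pdf", "/home/alice"));
console.log(resolvePdfPath("papers/report.pdf", "/home/alice"));
```

Under these assumptions the first call returns the path unchanged and the second returns a path inside `/home/alice/pdf-agent/`.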
Usage Guide
PDF Agent MCP solves the common problem of context window overflow when working with PDFs in AI tools.
Important: Do not drag PDFs into the chat. Doing so loads the entire PDF into the conversation and bypasses the selective processing. Instead, provide file paths or URLs so the assistant uses the PDF Agent tools.
How to Use
For Local PDFs:
- Provide the absolute file path to your PDF
- Quick tip: Right-click your PDF → "Open with Chrome" → copy the URL from the address bar, which contains the absolute path
For Online PDFs:
- Simply provide the PDF URL - the agent will download and process it locally
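A path copied from a browser address bar is a `file://` URL with percent-escapes rather than a plain path. If you want the plain absolute path, a conversion along these lines should work. `fileUrlToPath` is a hypothetical helper written for this example, not part of PDF Agent MCP, and it assumes a POSIX-style URL with an empty host.

```typescript
// Illustrative helper (not part of PDF Agent MCP): turn a file:// URL copied
// from a browser address bar into a plain absolute path.
// Assumes a POSIX-style URL such as file:///Users/...; Windows drive-letter
// URLs (file:///C:/...) and URLs with a host would need extra handling.
function fileUrlToPath(url: string): string {
  const prefix = "file://";
  if (!url.startsWith(prefix)) {
    throw new Error(`Not a file URL: ${url}`);
  }
  // Drop the scheme, then decode percent-escapes such as %20 (space).
  return decodeURIComponent(url.slice(prefix.length));
}

console.log(fileUrlToPath("file:///Users/alice/Documents/My%20Paper.pdf"));
```

In a Node.js script, the built-in `fileURLToPath` from `node:url` covers the same job, including the platform-specific cases this sketch leaves out.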
Key Benefits
- Selective Reading: The AI first examines metadata and outline, then opens only relevant pages
- Token Efficiency: Avoids images when possible, uses them only when necessary for visual analysis
- Scalable: Works with large documents (1000+ page textbooks) and multiple PDFs simultaneously
- Search Capability: Built-in pattern/regex search across PDF content
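To make the search benefit concrete, here is a minimal sketch of regex search that returns context snippets and stops early. It illustrates the general technique only and is not the server's implementation; `searchWithContext`, `Snippet`, `contextChars`, and `maxResults` are names chosen for this example.

```typescript
interface Snippet {
  index: number;   // character offset of the match
  match: string;   // the matched text
  context: string; // the match plus surrounding characters
}

// Hypothetical sketch: scan already-extracted text for a pattern.
function searchWithContext(
  text: string,
  pattern: RegExp,
  contextChars: number = 150,
  maxResults: number = 10,
): Snippet[] {
  const results: Snippet[] = [];
  if (maxResults <= 0) return results;
  // exec() only advances through the text when the global flag is set.
  const flags = pattern.flags.indexOf("g") >= 0 ? pattern.flags : pattern.flags + "g";
  const re = new RegExp(pattern.source, flags);
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    const start = m.index;
    const end = start + m[0].length;
    results.push({
      index: start,
      match: m[0],
      context: text.slice(Math.max(0, start - contextChars), end + contextChars),
    });
    if (results.length >= maxResults) break; // early stopping
    if (m[0].length === 0) re.lastIndex++;   // avoid looping forever on empty matches
  }
  return results;
}

const page = "The budget was approved. Total cost rose 4%. No other expense was logged.";
const hits = searchWithContext(page, /budget|cost|expense/i, 10, 2);
console.log(hits.map((h) => h.match));
```

With `maxResults` set to 2, the scan stops after the second match, so the third term in the sample text is never reached.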
Approach
This MCP uses agentic search with simple tools rather than complex alternatives:
- No embedding creation, chunking, or vector storage required
- No multi-agent coordination or handoff complexity
- Just clean, effective tools that modern AI systems can use intelligently
Perfect for researchers, students, and professionals working with extensive PDF libraries.
AI Assistant Prompt for Optimal Usage
Copy this prompt into your AI assistant's custom instructions or context for best results:
When working with PDFs using the PDF Agent MCP tools, follow this strategic approach:
### 1. Query Analysis & PDF Identification
- **Think carefully** about the user's search query and information needs
- **Identify which PDF(s)** are most likely to contain the answer
- Consider the document type, domain, and likely structure based on the query
### 2. Exploratory Phase (Always Start Here)
- **Get metadata** first using `get_pdf_metadata` to understand document size, creation date, and properties
- **Extract table of contents** with `get_pdf_outline` to understand document structure and navigation
- **Analyze the outline** to identify which sections are most relevant to the query
### 3. Strategic Content Extraction
Based on the outline and metadata:
- **Use page ranges** (`"5:10"`, `"20:"`) to focus on specific sections rather than entire documents
- **Extract images** with `get_pdf_images` when visual content is critical (charts, diagrams, tables, equations)
- **Choose text extraction strategy**: `hybrid` (default) for most cases, `native` for clean PDFs, `ocr` for scanned documents
### 4. Advanced Search Strategies
- **Use multiple search queries** with different keywords and synonyms
- **Apply regex patterns** for flexible matching: `/budget|cost|expense/gi` instead of single terms
- **Combine searches**: Start broad, then narrow down with specific terms
- **Use context characters** (150+ chars) to understand search result context
- **Implement early stopping** with `max_results` for large documents
### 5. Iterative Refinement
- **Start with targeted searches** based on outline analysis
- **Follow up with broader searches** if initial queries don't yield results
- **Extract specific page ranges** identified through search results
- **Use visual analysis** (images) when text extraction seems incomplete or when layout matters
### 6. Performance Optimization
- **Avoid processing entire large PDFs** - always use page ranges when possible
- **Use search with early stopping** before extracting large sections
- **Prefer search over full text extraction** for finding specific information
- **Extract images selectively** only when visual analysis is needed
### 7. Multi-Document Workflows
- **Process documents in parallel** when comparing multiple PDFs
- **Use consistent search terms** across documents for comparison
- **Combine results strategically** rather than processing everything at once
### Key Principles:
- **Strategic before comprehensive**: Understand document structure before diving deep
- **Search before extract**: Use pattern matching to locate relevant content first
- **Visual when necessary**: Extract images only when text extraction is insufficient
- **Iterative refinement**: Start targeted, expand scope as needed
- **Context preservation**: Always maintain enough context around search results
This approach maximizes efficiency, minimizes token usage, and provides more accurate, focused results than traditional "dump entire PDF" methods.
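The page-range strings used in the prompt (`"5:10"`, `"20:"`) can be read as in the sketch below. The exact semantics are not stated here, so this assumes 1-based page numbers, an inclusive end, and an open end meaning "through the last page"; `parsePageRange` and `PageRange` are hypothetical names, and the server's own parser may behave differently.

```typescript
interface PageRange {
  start: number; // first page, 1-based (assumption)
  end: number;   // last page, inclusive (assumption)
}

// Hypothetical parser for specs such as "5:10", "20:", ":10", or "7".
function parsePageRange(spec: string, pageCount: number): PageRange {
  const match = /^(\d+)?(:)?(\d+)?$/.exec(spec.trim());
  if (match === null || (match[1] === undefined && match[3] === undefined)) {
    throw new Error(`Invalid page range: "${spec}"`);
  }
  const hasColon = match[2] !== undefined;
  const start = match[1] !== undefined ? Number(match[1]) : 1;
  // Without a colon the spec names a single page; an open end runs to the last page.
  const end = match[3] !== undefined ? Number(match[3]) : hasColon ? pageCount : start;
  if (start < 1 || start > pageCount || end < start) {
    throw new Error(`Page range out of bounds: "${spec}"`);
  }
  // Clamp an overly large end to the document length.
  return { start, end: Math.min(end, pageCount) };
}

console.log(parsePageRange("5:10", 300));
console.log(parsePageRange("20:", 300));
```

Under these assumptions, `"5:10"` selects pages 5 through 10 and `"20:"` selects page 20 through the end of a 300-page document.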
Requirements
Node.js: This extension requires Node.js LTS (Long Term Support version).
- Install Node.js LTS: Visit nodejs.org and download the LTS version
- Alternative: Use a Node.js version manager like nvm or fnm
Installation
Option 1: DXT Package (Recommended)
- Download the latest `pdf-agent-mcp.dxt` file from the releases
- Double-click the `.dxt` file to install it in Claude Desktop
This will create a configuration file at:
Option 2: Manual Installation
- Clone this repository
- Build the project: `npm install && npm run build`
- Find your Claude Desktop config file:
  - macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
  - Windows: `%APPDATA%\Claude\claude_desktop_config.json`
Add the following:
{
  "mcpServers": {
    "pdf-agent": {
      "command": "node",
      "args": [
        "PATH_TO_REPO/server/index.js"
      ]
    }
  }
}
Replace `PATH_TO_REPO` with the actual path to your cloned repository.
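If your config file already lists other MCP servers, the `pdf-agent` entry should be merged in rather than replacing the whole `mcpServers` object. The sketch below shows one way to do that merge on an in-memory object. `addPdfAgent`, the two interfaces, and the example paths are hypothetical and exist only for illustration; reading and writing the file is left out.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // any other settings are passed through untouched
}

// Hypothetical helper: returns a copy of the config with the pdf-agent
// entry added, keeping any servers that were already configured.
function addPdfAgent(config: ClaudeDesktopConfig, repoPath: string): ClaudeDesktopConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "pdf-agent": { command: "node", args: [`${repoPath}/server/index.js`] },
    },
  };
}

// Example paths are placeholders, not real locations.
const existing: ClaudeDesktopConfig = {
  mcpServers: { other: { command: "node", args: ["/opt/other/index.js"] } },
};
const updated = addPdfAgent(existing, "/home/alice/pdf-agent-mcp");
console.log(JSON.stringify(updated, null, 2));
```

The function returns a new object instead of mutating its input, so the original config can be kept as a backup before writing the file.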
Troubleshooting
If you experience issues loading the extension in Claude Desktop:
- Go to Claude > Settings > Extensions > Advanced Settings
- Disable "Use Built-in Node.js for MCP"
- Restart Claude Desktop
This ensures the extension uses your system's Node.js installation instead of Claude's built-in version.
Development
# Install dependencies
npm install
# Build the project
npm run build
# Create DXT package
npm run build:dxt
# Pack the final .dxt file for distribution
dxt pack
Viewing Logs
To debug issues, you can view the MCP server logs:
# View logs (macOS)
open "$HOME/Library/Logs/Claude/mcp-server-PDF Agent MCP.log"
# Stream logs in real-time (macOS)
tail -f "$HOME/Library/Logs/Claude/mcp-server-PDF Agent MCP.log"
# Clear/delete logs (macOS)
rm "$HOME/Library/Logs/Claude/mcp-server-PDF Agent MCP.log"
# View logs (Windows)
notepad "%LOCALAPPDATA%\Claude\Logs\mcp-server-PDF Agent MCP.log"
# Clear/delete logs (Windows)
del "%LOCALAPPDATA%\Claude\Logs\mcp-server-PDF Agent MCP.log"
License
MIT