MCP PDF Reader

A robust open source Model Context Protocol (MCP) server for reading and analyzing PDF documents. This server enables AI assistants and tools to seamlessly interact with PDF files through a standardized protocol.

🌟 Open Source & Community Driven - Built with ❤️ by the community, for the community.

🚀 Features

  • 🧠 Smart Content Analysis: Intelligent PDF content type detection (text, scanned images, mixed, or no content)
  • 📋 Server Intelligence: New pdf_server_info tool provides comprehensive setup guidance and directory insights
  • 📄 Enhanced PDF Processing: Read, validate, and extract text with automatic recommendations for next steps
  • 🎯 Workflow Guidance: Context-aware suggestions on when to use asset extraction based on content analysis
  • 🖼️ Visual Asset Extraction: Detect and extract images from PDFs with format identification
  • 🔍 Smart Search: Find PDF files with fuzzy search capabilities
  • 📊 Statistics: Get comprehensive directory and file statistics
  • 🏗️ Structured Data Extraction: Extract content with positioning coordinates, formatting, and semantic relationships
  • 📊 Table Detection: Intelligent table structure recognition and data extraction
  • 🔍 Content Querying: Search and filter extracted content using flexible criteria
  • 📋 Comprehensive Metadata: Extract document properties, page information, and custom metadata
  • 🔄 Dual Mode Support:
    • Stdio Mode: Standard MCP protocol for AI assistants (Zed, Claude Desktop, etc.)
    • Server Mode: HTTP REST API with SSE transport for web integration
  • ⚡ Production Ready: Comprehensive error handling, logging, and graceful shutdown
  • 🧪 Well Tested: 65-76% test coverage with unit and integration tests
  • 🛠️ Easy Integration: Simple installation and configuration

🎯 Use Cases

  • AI Code Editors: Integrate with Zed editor for PDF document analysis
  • Documentation Tools: Extract and analyze technical documentation with structure preservation
  • Research Assistants: Process academic papers and research documents with semantic understanding
  • Data Extraction: Extract structured data from forms, tables, and formatted documents
  • Content Management: Organize and search large PDF collections with intelligent querying
  • Web Applications: HTTP API for web-based PDF processing and analysis

📦 Installation

Direct Install (Fastest)

If you have Go installed, you can install directly:

# Install directly from GitHub
go install github.com/a3tai/mcp-pdf-reader/cmd/mcp-pdf-reader@latest

# Verify installation
mcp-pdf-reader --help

Quick Install (Recommended)

# Clone the repository
git clone https://github.com/a3tai/mcp-pdf-reader.git
cd mcp-pdf-reader

# Build and install using Go's standard install method
make install

# Ensure Go's bin directory is in your PATH (usually already is)
export PATH="$(go env GOPATH)/bin:$PATH"

# Verify installation
mcp-pdf-reader --help

Manual Build

# Build from source (creates local binary)
make build

# Or install Go dependencies and build locally
go mod tidy
go build -o mcp-pdf-reader ./cmd/mcp-pdf-reader

# Or install directly with Go (installs to GOPATH/bin)
go install github.com/a3tai/mcp-pdf-reader/cmd/mcp-pdf-reader@latest

System Requirements

  • Go 1.21+ for building from source
  • Linux, macOS, or Windows (tested on all platforms)

🖥️ Usage

MCP Protocol Mode (Default)

Perfect for AI assistants and editors like Zed:

# Use current directory for PDFs (default)
mcp-pdf-reader

# Specify PDF directory
mcp-pdf-reader --dir=/path/to/documents

# Debug mode
mcp-pdf-reader --dir=/path/to/documents --log-level=debug

HTTP Server Mode

For web applications and REST API access:

# Start HTTP server
mcp-pdf-reader --mode=server --dir=/path/to/documents

# Custom host and port
mcp-pdf-reader --mode=server --host=0.0.0.0 --port=9090 --dir=/docs

# Health check
curl http://localhost:8080/health

🔧 Configuration Options

| Flag | Default | Description |
|------|---------|-------------|
| --mode | stdio | Server mode: stdio or server |
| --dir | current directory | Directory containing PDF files |
| --host | 127.0.0.1 | Server host (server mode only) |
| --port | 8080 | Server port (server mode only) |
| --log-level | info | Log level: debug, info, warn, error |
| --max-file-size | 104857600 | Maximum PDF file size in bytes (100MB) |
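
The --max-file-size value is given in bytes, and the default of 104857600 corresponds to 100 MB counted in binary units (1 MB = 1,048,576 bytes). The following is a minimal sketch of that arithmetic; the helper name max_file_size_bytes is hypothetical and not part of the project.

```python
BYTES_PER_MB = 1024 * 1024  # binary megabyte, so that 100 MB == 104857600 bytes


def max_file_size_bytes(megabytes: int) -> int:
    """Return the byte count to pass to --max-file-size for a limit given in MB."""
    if megabytes <= 0:
        raise ValueError("limit must be a positive number of megabytes")
    return megabytes * BYTES_PER_MB


if __name__ == "__main__":
    # Hypothetical usage: print the flag for the default and a doubled limit.
    print(f"--max-file-size={max_file_size_bytes(100)}")
    print(f"--max-file-size={max_file_size_bytes(200)}")
```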

⚡ Quick Reference

Common Commands

# Basic usage (stdio mode for MCP clients) - uses current directory
mcp-pdf-reader

# Specify custom directory
mcp-pdf-reader --dir=/path/to/pdfs

# Server mode for testing/debugging
mcp-pdf-reader --mode=server --dir=./docs

# Custom port and host
mcp-pdf-reader --mode=server --host=0.0.0.0 --port=9090

# Debug mode
mcp-pdf-reader --mode=server --log-level=debug --dir=./docs

# Larger file size limit (200MB)
mcp-pdf-reader --max-file-size=209715200 --dir=./docs

# Environment variables (alternative to flags)
MCP_PDF_DIR=/path/to/pdfs mcp-pdf-reader
MCP_PDF_MODE=server MCP_PDF_PORT=9090 mcp-pdf-reader

Quick Setup for Popular Editors

| Editor | Config File | Configuration |
|--------|-------------|---------------|
| Zed | ~/.config/zed/settings.json | "mcp-pdf-reader": {"command": {"path": "mcp-pdf-reader", "args": []}} |
| Cursor | ~/.cursor/settings.json | "mcp-pdf-reader": {"command": "mcp-pdf-reader", "args": ["--dir=${workspaceFolder}"]} |
| Claude Desktop | ~/Library/Application Support/Claude/claude_desktop_config.json | "mcp-pdf-reader": {"command": "mcp-pdf-reader", "args": ["--dir=/path/to/docs"]} |
| VS Code | .vscode/settings.json | "claude.mcpServers": {"mcp-pdf-reader": {"command": "mcp-pdf-reader", "args": ["--dir=${workspaceFolder}"]}} |

Testing Your Setup

# 1. Verify installation
mcp-pdf-reader --help

# 2. Test with sample directory
mkdir -p ~/test-pdfs
mcp-pdf-reader --mode=server --dir="$HOME/test-pdfs"

# 3. Check health endpoint (server mode)
curl http://localhost:8080/health

# 4. Test MCP tools
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | mcp-pdf-reader

📡 MCP Tools

The server exposes its PDF analysis tools over MCP, covering both basic extraction and advanced structured analysis:

pdf_read_file

Extract text content from a PDF file.

Parameters:

  • path (string): Full path to the PDF file

Example:

{
  "path": "/home/user/documents/research.pdf"
}
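
The example objects in this section are the tool arguments. On the wire, an MCP client wraps them in a JSON-RPC 2.0 tools/call request, with the tool name and arguments under params. The sketch below builds such a request; the helper name tools_call_request is hypothetical, and a real client also performs the MCP initialize handshake before calling tools.

```python
import json


def tools_call_request(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Wrap tool arguments in a JSON-RPC 2.0 'tools/call' request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    request = tools_call_request(
        2, "pdf_read_file", {"path": "/home/user/documents/research.pdf"}
    )
    # One JSON object per line, as in the stdio test commands in this README.
    print(json.dumps(request))
```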

pdf_assets_file

Extract visual assets like images from a PDF file.

Parameters:

  • path (string): Full path to the PDF file

Example:

{
  "path": "/home/user/documents/presentation.pdf"
}

pdf_validate_file

Validate if a file is a readable PDF.

Parameters:

  • path (string): Full path to the PDF file

Example:

{
  "path": "/home/user/documents/document.pdf"
}

pdf_stats_file

Get detailed statistics about a PDF file including metadata.

Parameters:

  • path (string): Full path to the PDF file

Example:

{
  "path": "/home/user/documents/report.pdf"
}

pdf_search_directory

List and search PDF files in a directory with optional fuzzy search.

Parameters:

  • directory (string): Directory path to search
  • query (string): Optional fuzzy search query

Example:

{
  "directory": "/home/user/documents",
  "query": "machine learning"
}
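
To give a feel for what fuzzy matching on file names means, here is a small sketch that ranks names by similarity to a query. It uses Python's difflib as a stand-in; the server's actual matching algorithm is not described in this README and may behave differently. The function names and the 0.5 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from pathlib import PurePosixPath


def normalize(name: str) -> str:
    """Lowercase a file stem and turn common separators into spaces."""
    stem = PurePosixPath(name).stem.lower()
    return stem.replace("_", " ").replace("-", " ")


def fuzzy_rank(filenames, query, threshold=0.5):
    """Return filenames whose similarity to the query meets the threshold, best first."""
    scored = []
    for name in filenames:
        score = SequenceMatcher(None, query.lower(), normalize(name)).ratio()
        if score >= threshold:
            scored.append((score, name))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


if __name__ == "__main__":
    files = ["tax_return_2023.pdf", "machine_learning_intro.pdf", "deep-learning-notes.pdf"]
    print(fuzzy_rank(files, "machine learning"))
```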

pdf_stats_directory

Get statistics about PDF files in a directory.

Parameters:

  • directory (string): Directory path to analyze

Example:

{
  "directory": "/home/user/documents"
}

pdf_extract_structured

Extract structured content with positioning coordinates and formatting information.

Parameters:

  • path (string): Full path to the PDF file
  • mode (string): Extraction mode - "raw", "structured", "semantic", "table", or "complete" (default: "structured")
  • config (object): Configuration options
    • extract_text (bool): Extract text content
    • extract_images (bool): Extract images
    • extract_tables (bool): Extract tables
    • extract_forms (bool): Extract form fields
    • extract_annotations (bool): Extract annotations
    • include_coordinates (bool): Include positioning coordinates
    • include_formatting (bool): Include formatting information
    • pages (array): Specific pages to extract (default: all)
    • min_confidence (number): Minimum confidence threshold

Example:

{
  "path": "/home/user/documents/form.pdf",
  "mode": "structured",
  "config": {
    "extract_text": true,
    "include_coordinates": true,
    "include_formatting": true,
    "pages": [1, 2, 3]
  }
}
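
When these arguments are assembled programmatically, it can help to validate the mode against the documented values before sending the request. A minimal sketch, assuming the parameter names listed above; the helper structured_args is hypothetical and not part of the server.

```python
VALID_MODES = ("raw", "structured", "semantic", "table", "complete")


def structured_args(path, mode="structured", pages=None, **options):
    """Build the arguments object for pdf_extract_structured.

    'options' carries the boolean config switches (extract_text,
    include_coordinates, ...) exactly as named in the parameter list.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {VALID_MODES}")
    config = dict(options)
    if pages is not None:
        config["pages"] = list(pages)
    return {"path": path, "mode": mode, "config": config}


if __name__ == "__main__":
    print(structured_args(
        "/home/user/documents/form.pdf",
        pages=[1, 2, 3],
        extract_text=True,
        include_coordinates=True,
        include_formatting=True,
    ))
```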

pdf_extract_tables

Extract tabular data from PDF with structure preservation and cell-level analysis.

Parameters:

  • path (string): Full path to the PDF file
  • config (object): Configuration options
    • include_coordinates (bool): Include positioning coordinates
    • pages (array): Specific pages to extract (default: all)
    • min_confidence (number): Minimum confidence threshold

Example:

{
  "path": "/home/user/documents/spreadsheet.pdf",
  "config": {
    "include_coordinates": true,
    "min_confidence": 0.7
  }
}

pdf_extract_semantic

Extract content with semantic grouping and relationship detection.

Parameters:

  • path (string): Full path to the PDF file
  • config (object): Configuration options
    • include_coordinates (bool): Include positioning coordinates
    • include_formatting (bool): Include formatting information
    • pages (array): Specific pages to extract (default: all)
    • min_confidence (number): Minimum confidence threshold

Example:

{
  "path": "/home/user/documents/document.pdf",
  "config": {
    "include_coordinates": true,
    "include_formatting": true
  }
}

pdf_extract_complete

Comprehensive extraction of all content types (text, images, tables, forms, annotations).

Parameters:

  • path (string): Full path to the PDF file
  • config (object): Configuration options
    • pages (array): Specific pages to extract (default: all)
    • min_confidence (number): Minimum confidence threshold

Example:

{
  "path": "/home/user/documents/complex.pdf",
  "config": {
    "pages": [1, 2, 3],
    "min_confidence": 0.8
  }
}

pdf_query_content

Query and filter extracted PDF content using flexible search criteria.

Parameters:

  • path (string): Full path to the PDF file
  • query (object): Query criteria for filtering content
    • content_types (array): Content types to filter ("text", "image", "table", "form", "annotation")
    • pages (array): Pages to search
    • text_query (string): Text search query
    • min_confidence (number): Minimum confidence threshold
    • bounding_box (object): Spatial filter area
      • x (number): X coordinate
      • y (number): Y coordinate
      • width (number): Width
      • height (number): Height

Example:

{
  "path": "/home/user/documents/report.pdf",
  "query": {
    "content_types": ["text", "table"],
    "text_query": "revenue",
    "pages": [1, 2, 3],
    "min_confidence": 0.7
  }
}
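
The bounding_box filter is described by x, y, width, and height. This README does not say whether the server matches content that merely overlaps the box or only content fully inside it, so the sketch below shows both tests for axis-aligned rectangles. The Box type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x: float
    y: float
    width: float
    height: float


def intersects(a: Box, b: Box) -> bool:
    """True if the two rectangles share any interior area."""
    return (
        a.x < b.x + b.width
        and b.x < a.x + a.width
        and a.y < b.y + b.height
        and b.y < a.y + a.height
    )


def contains(outer: Box, inner: Box) -> bool:
    """True if 'inner' lies completely within 'outer'."""
    return (
        outer.x <= inner.x
        and outer.y <= inner.y
        and inner.x + inner.width <= outer.x + outer.width
        and inner.y + inner.height <= outer.y + outer.height
    )


if __name__ == "__main__":
    query_area = Box(x=0, y=0, width=300, height=200)
    text_block = Box(x=250, y=150, width=100, height=40)
    print(intersects(query_area, text_block), contains(query_area, text_block))
```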

pdf_get_page_info

Get detailed information about PDF pages including dimensions, layout, and properties.

Parameters:

  • path (string): Full path to the PDF file

Example:

{
  "path": "/home/user/documents/document.pdf"
}

pdf_get_metadata

Extract comprehensive document metadata and properties.

Parameters:

  • path (string): Full path to the PDF file

Example:

{
  "path": "/home/user/documents/document.pdf"
}

🔥 Enhanced Features

Smart Content Analysis

The PDF reader now provides intelligent content type detection and recommendations:

pdf_server_info - Get Started Faster

A new tool that provides comprehensive server information and usage guidance.

What it provides:

  • 📋 Server capabilities and configuration
  • 📁 Current directory contents (PDF files found)
  • 🛠️ Complete list of available tools with usage guidance
  • 📖 Step-by-step workflow recommendations
  • 🖼️ Supported image formats for asset extraction

Usage:

{
  "name": "pdf_server_info",
  "arguments": {}
}

Why use it: Start here to understand what PDFs are available and how to best analyze them.

Enhanced PDF Reading with Content Intelligence

The pdf_read_file tool now provides smart content analysis:

Content Type Detection:

  • 📝 text - PDF contains readable text content
  • 🖼️ scanned_images - PDF contains scanned images with minimal text
  • 🔀 mixed - PDF contains both text and images
  • ❌ no_content - PDF appears empty or unreadable

Smart Recommendations:

  • ✅ Automatic guidance on whether to use pdf_assets_file
  • 📊 Image count detection - know if images are present before extraction
  • 🎯 Next step suggestions based on content type

Enhanced Response Format:

Successfully read PDF: /path/to/document.pdf
Pages: 15
Size: 2458392 bytes
Content Type: mixed
Has Images: true
Image Count: 8

💡 INFO: This PDF contains both text and images. You may want to use 'pdf_assets_file' to extract the images as well.

Content:
[extracted text content...]
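
A client that wants the header fields as data can split this response at the Content: line and read the "Key: value" pairs above it. The sketch below assumes the layout shown in the example response; the function name parse_read_response is hypothetical, and the real output may contain additional lines.

```python
SAMPLE = (
    "Successfully read PDF: /path/to/document.pdf\n"
    "Pages: 15\n"
    "Size: 2458392 bytes\n"
    "Content Type: mixed\n"
    "Has Images: true\n"
    "Image Count: 8\n"
    "\n"
    "INFO: This PDF contains both text and images.\n"
    "\n"
    "Content:\n"
    "[extracted text content...]"
)


def parse_read_response(text: str):
    """Split a pdf_read_file response into (header fields, extracted content)."""
    header, _, body = text.partition("\nContent:\n")
    fields = {}
    for line in header.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip blank lines and anything that is not 'Key: value'
            fields[key.strip()] = value.strip()
    return fields, body


if __name__ == "__main__":
    fields, body = parse_read_response(SAMPLE)
    print(fields["Content Type"], fields["Image Count"])
    print(body)
```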

Intelligent Workflow Guidance

The system now provides contextual recommendations:

  1. For text-based PDFs: Content is ready to use, no further action needed
  2. For scanned documents: Recommends using pdf_assets_file to extract images
  3. For mixed content: Suggests optional image extraction based on your needs
  4. For problematic files: Provides specific troubleshooting guidance
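
The four cases above can be sketched as a lookup from the detected content type to a suggested next step. This is an illustration of the workflow, not the server's code; in particular, pointing no_content at pdf_validate_file is an assumption, since the README only promises "specific troubleshooting guidance" for that case.

```python
NEXT_STEP = {
    "text": "Content is ready to use; no further action needed.",
    "scanned_images": "Call pdf_assets_file to extract the page images.",
    "mixed": "Optionally call pdf_assets_file if the images matter to you.",
    # Assumption: validation is a sensible first troubleshooting step.
    "no_content": "Call pdf_validate_file and review the troubleshooting notes.",
}


def next_step(content_type: str) -> str:
    """Return the suggested follow-up for a content type reported by pdf_read_file."""
    try:
        return NEXT_STEP[content_type]
    except KeyError:
        raise ValueError(f"unexpected content type: {content_type!r}") from None


if __name__ == "__main__":
    for kind in NEXT_STEP:
        print(f"{kind}: {next_step(kind)}")
```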

Better Error Handling and User Experience

  • 🔍 Proactive validation - tools suggest when files might not be readable
  • 📋 Rich context - understand your PDF directory contents upfront
  • 🎯 Targeted recommendations - know which tools to use when
  • 📖 Comprehensive guidance - built-in usage instructions and examples

🎨 Integration Examples

🎯 Zed Editor

Add to your Zed settings (~/.config/zed/settings.json):

{
  "context_servers": {
    "mcp-pdf-reader": {
      "command": {
        "path": "mcp-pdf-reader",
        "args": ["--dir=${workspaceFolder}"],
        "env": null
      },
      "settings": {}
    }
  }
}

Project-specific Zed configuration (.zed/settings.json in your project):

{
  "context_servers": {
    "mcp-pdf-reader": {
      "command": {
        "path": "mcp-pdf-reader",
        "args": ["--dir=./docs"],
        "env": null
      },
      "settings": {}
    }
  }
}

🎯 Cursor IDE

Add to your Cursor settings (~/.cursor/settings.json):

{
  "mcpServers": {
    "mcp-pdf-reader": {
      "command": "mcp-pdf-reader",
      "args": ["--dir", "${workspaceFolder}"],
      "env": {}
    }
  }
}

For specific PDF directories:

{
  "mcpServers": {
    "mcp-pdf-reader": {
      "command": "mcp-pdf-reader",
      "args": ["--dir", "/path/to/your/documents"],
      "env": {}
    }
  }
}

🎯 Windsurf

Add to your Windsurf configuration (~/.windsurf/settings.json):

{
  "mcp": {
    "servers": {
      "mcp-pdf-reader": {
        "command": "mcp-pdf-reader",
        "args": ["--dir", "${workspaceRoot}"],
        "env": {}
      }
    }
  }
}

Project-specific Windsurf config (.windsurf/settings.json):

{
  "mcp": {
    "servers": {
      "mcp-pdf-reader": {
        "command": "mcp-pdf-reader",
        "args": ["--dir", "./documentation"],
        "env": {}
      }
    }
  }
}

🎯 Claude Desktop

Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS, %APPDATA%\Claude\claude_desktop_config.json on Windows):

{
  "mcpServers": {
    "mcp-pdf-reader": {
      "command": "mcp-pdf-reader",
      "args": ["--dir", "/path/to/your/documents"]
    }
  }
}

For multiple document directories:

{
  "mcpServers": {
    "mcp-pdf-reader-docs": {
      "command": "mcp-pdf-reader",
      "args": ["--dir", "/Users/yourname/Documents"]
    },
    "mcp-pdf-reader-research": {
      "command": "mcp-pdf-reader",
      "args": ["--dir", "/Users/yourname/Research/papers"]
    }
  }
}
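
If you maintain several document directories, a multi-server config like the one above can be generated rather than written by hand. A minimal sketch, assuming the --dir flag listed under Configuration Options and the mcp-pdf-reader-<label> naming used in the example; the helper name is hypothetical.

```python
import json


def claude_desktop_config(directories: dict) -> dict:
    """Build an mcpServers mapping with one mcp-pdf-reader entry per directory.

    'directories' maps a short label (e.g. "docs") to an absolute path.
    """
    servers = {}
    for label, path in directories.items():
        servers[f"mcp-pdf-reader-{label}"] = {
            "command": "mcp-pdf-reader",
            "args": ["--dir", path],
        }
    return {"mcpServers": servers}


if __name__ == "__main__":
    config = claude_desktop_config({
        "docs": "/Users/yourname/Documents",
        "research": "/Users/yourname/Research/papers",
    })
    # Paste the output into claude_desktop_config.json.
    print(json.dumps(config, indent=2))
```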

🎯 Claude Code (VS Code Extension)

Add to your VS Code settings (settings.json):

{
  "claude.mcpServers": {
    "mcp-pdf-reader": {
      "command": "mcp-pdf-reader",
      "args": ["--dir", "${workspaceFolder}"],
      "env": {}
    }
  }
}

Workspace-specific settings (.vscode/settings.json):

{
  "claude.mcpServers": {
    "mcp-pdf-reader": {
      "command": "mcp-pdf-reader",
      "args": ["--dir", "./docs"],
      "env": {}
    }
  }
}

🎯 Roo Code

Add to your Roo configuration (~/.roo/config.json):

{
  "mcpServers": {
    "mcp-pdf-reader": {
      "command": "mcp-pdf-reader",
      "args": ["--dir", "{{workspace}}"],
      "cwd": "{{workspace}}"
    }
  }
}

For specific directories:

{
  "mcpServers": {
    "mcp-pdf-reader": {
      "command": "mcp-pdf-reader",
      "args": ["--dir", "/path/to/pdfs"],
      "cwd": "/path/to/pdfs"
    }
  }
}

🎯 Cline (VS Code Extension)

Add to your Cline settings in VS Code (settings.json):

{
  "cline.mcpServers": {
    "mcp-pdf-reader": {
      "command": "mcp-pdf-reader",
      "args": ["--dir", "${workspaceFolder}/docs"],
      "env": {}
    }
  }
}

Global Cline configuration:

{
  "cline.mcpServers": {
    "mcp-pdf-reader": {
      "command": "mcp-pdf-reader",
      "args": ["--dir", "${env:HOME}/Documents"],
      "env": {}
    }
  }
}

📝 Common Configuration Patterns

Use Current Project Directory

# Most editors support workspace variables
--dir=${workspaceFolder}      # Zed, VS Code-based
--dir=${workspaceRoot}        # Windsurf
--dir={{workspace}}           # Roo

Use Specific Subdirectory

# For documentation in your project
--dir=./docs
--dir=./documentation
--dir=./papers

Use Home Directory

# For personal document collections
--dir=${env:HOME}/Documents
--dir=/Users/yourname/Documents      # macOS
--dir=/home/yourname/Documents       # Linux
--dir=C:\Users\yourname\Documents    # Windows

Multiple Instances

You can run multiple instances for different directories:

{
  "context_servers": {
    "mcp-pdf-reader-docs": {
      "command": {
        "path": "mcp-pdf-reader",
        "args": ["--dir=./docs", "--port=8080"]
      }
    },
    "mcp-pdf-reader-research": {
      "command": {
        "path": "mcp-pdf-reader",
        "args": ["--dir=/path/to/research", "--port=8081"]
      }
    }
  }
}

🚀 Quick Setup Tips

  1. After Installation: The mcp-pdf-reader binary will be globally available if $(go env GOPATH)/bin is in your PATH (default with Go installations).

  2. Verify Installation: Run mcp-pdf-reader --help to ensure it's working.

  3. Test Configuration: Start with stdio mode (default) for MCP clients, use server mode for debugging.

  4. Path Variables: Most editors support workspace variables - use them for portable configurations.

  5. Multiple Directories: Create separate MCP server instances for different PDF collections.

🔧 Troubleshooting

Installation Issues

❌ Command not found: mcp-pdf-reader

Problem: After installation, the binary is not found in PATH.

Solutions:

# Check if Go's bin directory is in your PATH
echo $PATH | grep $(go env GOPATH)/bin

# If not found, add to your shell profile
echo 'export PATH="$(go env GOPATH)/bin:$PATH"' >> ~/.bashrc  # Linux/WSL
echo 'export PATH="$(go env GOPATH)/bin:$PATH"' >> ~/.zshrc   # macOS (if using zsh)

# Reload your shell
source ~/.bashrc  # or ~/.zshrc

❌ Permission denied during installation

Problem: Installation fails with permission errors.

Solutions:

# Don't use sudo with go install - it should install to your user directory
go install github.com/a3tai/mcp-pdf-reader/cmd/mcp-pdf-reader@latest

# If still having issues, check your GOPATH
go env GOPATH
go env GOBIN

❌ Module not found or build errors

Problem: Build fails with module or dependency errors.

Solutions:

# Clean module cache and retry
go clean -modcache
go install github.com/a3tai/mcp-pdf-reader/cmd/mcp-pdf-reader@latest

# Or build from source
git clone https://github.com/a3tai/mcp-pdf-reader.git
cd mcp-pdf-reader
go mod tidy
make install

Configuration Issues

❌ MCP server not connecting in editors

Problem: Editor can't connect to the MCP server.

Solutions:

  1. Verify binary is accessible:

    which mcp-pdf-reader
    mcp-pdf-reader --help
    
  2. Test in stdio mode:

    echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}' | mcp-pdf-reader
    
  3. Check editor-specific config location:

    • Zed: ~/.config/zed/settings.json
    • Cursor: ~/.cursor/settings.json
    • Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
    • VS Code: .vscode/settings.json (workspace) or user settings

❌ "Directory does not exist" errors

Problem: PDF directory path is invalid.

Solutions:

# Use absolute paths
"args": ["--dir=/home/user/Documents"]

# Or verify workspace variables work in your editor
"args": ["--dir=${workspaceFolder}/docs"]

# Create the directory if it doesn't exist
mkdir -p ~/Documents/pdfs

❌ "No PDF files found" but files exist

Problem: Server can't find PDFs in the specified directory.

Solutions:

  1. Check file extensions (must be .pdf):

    ls -la /path/to/pdfs/*.pdf
    
  2. Test directory access:

    mcp-pdf-reader --mode=server --dir=/path/to/pdfs
    # Then visit http://localhost:8080/health
    
  3. Check permissions:

    ls -la /path/to/pdfs/
    # Ensure read permissions on directory and files
    

Runtime Issues

❌ Server crashes or exits immediately

Problem: MCP server terminates unexpectedly.

Solutions:

  1. Run in server mode for debugging:

    mcp-pdf-reader --mode=server --dir=./docs --log-level=debug
    
  2. Check for port conflicts (server mode):

    lsof -i :8080  # Check if port 8080 is in use
    mcp-pdf-reader --mode=server --port=8081  # Try different port
    
  3. Verify PDF directory permissions:

    # Test with a simple directory
    mkdir -p ~/test-pdfs
    mcp-pdf-reader --mode=server --dir="$HOME/test-pdfs"

❌ Large PDF files cause errors

Problem: "File too large" or memory errors.

Solutions:

# Increase file size limit (default: 100MB)
mcp-pdf-reader --max-file-size=209715200  # 200MB

# Check file sizes
ls -lh /path/to/pdfs/*.pdf

❌ PDF text extraction fails

Problem: PDF content appears empty or garbled.

Solutions:

  1. Test with different PDFs (some PDFs may be image-only or encrypted)
  2. Use validation tool:
    mcp-pdf-reader --mode=server --dir=./docs
    # Then test with the pdf_validate_file tool
    

Editor-Specific Issues

🎯 Zed Editor

  • Restart Zed after config changes
  • Check Zed's output panel for MCP errors
  • Use absolute paths if workspace variables don't work

🎯 Cursor IDE

  • Restart Cursor after configuration changes
  • Check the "Output" tab for MCP-related logs
  • Ensure the MCP extension is enabled

🎯 Claude Desktop

  • Restart Claude Desktop after config changes
  • Check ~/Library/Logs/Claude/ for error logs (macOS)
  • Verify JSON syntax in config file

🎯 VS Code Extensions

  • Check extension logs in the "Output" panel
  • Verify the extension supports MCP servers
  • Try disabling/re-enabling the extension

Getting Help

If you're still having issues:

  1. Check the server health (server mode):

    curl http://localhost:8080/health
    
  2. Enable debug logging:

    mcp-pdf-reader --mode=server --log-level=debug --dir=./docs
    
  3. Create a minimal test case:

    mkdir test-mcp
    cd test-mcp
    echo "Test content" > test.pdf  # Not a real PDF, but tests basic functionality
    mcp-pdf-reader --mode=server --dir=.
    
  4. Open an issue on GitHub with:

    • Your operating system
    • Go version (go version)
    • Editor/tool being used
    • Complete error messages
    • Configuration file contents

🧪 Development

Building and Testing

# Install dependencies
make deps

# Run tests
make test

# Run tests with coverage
make test-coverage

# Build for development
make build

# Run development server
make run

# Run in server mode
make run-server

Code Quality

# Format code
make fmt

# Run linter (requires golangci-lint)
make lint

# Cross-compile for all platforms
make build-all

Project Structure

mcp-pdf-reader/
├── cmd/mcp-pdf-reader/     # Main application entry point
├── internal/
│   ├── config/             # Configuration management
│   ├── mcp/                # MCP server implementation
│   └── pdf/                # PDF processing logic
├── Makefile                # Build and development commands
├── go.mod                  # Go module definition
└── README.md               # This file

🌐 API Reference (Server Mode)

Health Check

GET /health

Returns server health status and version information.

MCP Endpoints

GET /sse                   # Server-Sent Events endpoint
POST /message              # MCP message endpoint
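
A script can poll the health endpoint before sending MCP traffic. The sketch below assumes the default host and port from the configuration table and that a healthy server answers GET /health with HTTP 200; the response body format is not documented here, so only the status code is checked. The function names are hypothetical.

```python
import urllib.request


def health_url(host: str = "127.0.0.1", port: int = 8080) -> str:
    """Build the URL of the health endpoint for a server-mode instance."""
    return f"http://{host}:{port}/health"


def check_health(host: str = "127.0.0.1", port: int = 8080, timeout: float = 2.0) -> bool:
    """Return True if GET /health answers with HTTP 200, False on any connection error."""
    try:
        with urllib.request.urlopen(health_url(host, port), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, refused connections and timeouts are all OSError subclasses.
        return False


if __name__ == "__main__":
    print("healthy" if check_health() else "not reachable")
```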

🤝 Contributing

We love contributions! This is an open source project and we welcome contributions from everyone. Whether you're fixing bugs, adding features, improving documentation, or helping with tests - every contribution matters.

How to Contribute

  1. 🍴 Fork the repository on GitHub
  2. 🌿 Create a feature branch: git checkout -b feature/amazing-feature
  3. ✨ Make your changes and add comprehensive tests
  4. 🧪 Run the test suite: make test (ensure all tests pass)
  5. 🎨 Format your code: make fmt
  6. 📝 Update documentation if needed
  7. 🚀 Submit a pull request with a clear description

Ways to Contribute

  • 🐛 Bug Reports: Found a bug? Open an issue with reproduction steps
  • 💡 Feature Requests: Have an idea? We'd love to hear it!
  • 📖 Documentation: Help improve our docs and examples
  • 🧪 Testing: Add tests or improve existing ones
  • 🔧 Code: Fix bugs or implement new features
  • 🌍 Translation: Help make this accessible to more people

Development Guidelines

  • Write clear, documented code
  • Add tests for new functionality
  • Follow Go best practices and idioms
  • Keep pull requests focused and atomic
  • Be respectful and constructive in discussions

📊 Performance

  • Memory Efficient: Streaming PDF processing with configurable limits
  • Fast Search: Optimized file system traversal and indexing
  • Concurrent Safe: Handle multiple requests simultaneously
  • Resource Limits: Configurable file size limits and timeouts

🔒 Security

  • Input Validation: Comprehensive validation of all inputs
  • Path Sanitization: Prevents directory traversal attacks
  • File Size Limits: Configurable limits to prevent resource exhaustion
  • Secure Defaults: Safe configuration out of the box
  • Automated Security Scanning: Continuous security analysis with gosec
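
To illustrate what path sanitization guards against, the sketch below resolves a requested path against a base directory and rejects anything that escapes it, such as ../ sequences or absolute paths. This shows the general technique in Python; it is not the server's actual Go implementation, and the function name is hypothetical.

```python
from pathlib import Path


def resolve_inside(base_dir: str, requested: str) -> Path:
    """Resolve 'requested' relative to 'base_dir', refusing paths outside it."""
    base = Path(base_dir).resolve()
    # Joining with an absolute 'requested' discards 'base', which the check below catches.
    candidate = (base / requested).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError(f"path escapes the PDF directory: {requested!r}")
    return candidate


if __name__ == "__main__":
    print(resolve_inside("/srv/pdfs", "reports/q1.pdf"))
    try:
        resolve_inside("/srv/pdfs", "../secrets.pdf")
    except ValueError as err:
        print(f"rejected: {err}")
```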

Security Scanning

This project uses gosec for automated security scanning of Go code. Security scans are automatically run on every pull request and release.

Running Security Scans Locally

# Install gosec
go install github.com/securego/gosec/v2/cmd/gosec@latest

# Run security scan
make gosec

# Or run directly with gosec
gosec -conf .gosec.json ./...

Security Configuration

Security scanning is configured via .gosec.json with:

  • Customized rules for Go security best practices
  • Exclusions for test files and false positives
  • Integration with GitHub Security tab via SARIF reports

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🌟 Open Source Community

This project is proudly open source and maintained by contributors from around the world. We believe in the power of community-driven development to create better tools for everyone.

Project Values

  • 🔓 Open: Transparent development and decision-making
  • 🤝 Inclusive: Welcoming to all contributors regardless of experience level
  • 🚀 Quality: Maintaining high standards through testing and code review
  • 📖 Documentation: Keeping documentation up-to-date and comprehensive

🏢 About Rude Company LLC

Rude Company LLC is building innovative AI-powered development tools and open source solutions. We create intelligent systems that enhance developer productivity and enable seamless human-AI collaboration.

A3T is brought to you by Rude Company LLC and focuses on AI development tools and automation.

Built with ❤️ by Rude Company LLC.