pdf-to-text-mcp

A Model Context Protocol (MCP) server for converting PDF files to text, designed for seamless integration with Cursor IDE and other MCP-compatible applications.

📄 PDF to Text MCP Server

License: MIT Node.js TypeScript MCP

🚀 Quick Start

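# Clone the repository into pdf-to-text-mcp-server (the directory name used throughout this README)
git clone https://github.com/xxx87/pdf-to-text-mcp.git pdf-to-text-mcp-server
cd pdf-to-text-mcp-server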

# Install dependencies
yarn install

# Build the project
yarn build

# Test the server
yarn test

✨ Features

  • 📑 Multi-file Support - Convert one or more PDF files in a single call
  • 🔍 Text Extraction - Extract text while preserving document structure
  • ⚡ Fast Processing - Efficient PDF parsing with the pdf-parse library
  • 🔧 MCP Protocol - Full Model Context Protocol compliance
  • 🎯 Cursor Integration - Designed specifically for Cursor IDE
  • 🛡️ TypeScript - Fully typed for a better development experience
  • ✅ Testing - Comprehensive test suite included

🛠️ Installation

Prerequisites

  • Node.js 18+
  • Yarn package manager
  • Cursor IDE (for MCP integration)

Local Installation

  1. Clone the repository

    git clone https://github.com/xxx87/pdf-to-text-mcp.git pdf-to-text-mcp-server
    cd pdf-to-text-mcp-server
    
  2. Install dependencies

    yarn install
    
  3. Build the project

    yarn build
    
  4. Verify installation

    yarn test
    

🎯 Usage

Running as Standalone Server

yarn start

Integration with Cursor IDE

  1. Add to Cursor Configuration

    Add the following to your Cursor MCP settings:

    {
      "mcpServers": {
        "pdf-to-text": {
          "command": "node",
          "args": ["/absolute/path/to/pdf-to-text-mcp-server/dist/index.js"],
          "cwd": "/absolute/path/to/pdf-to-text-mcp-server"
        }
      }
    }
    

    ⚠️ Important: Replace /absolute/path/to/pdf-to-text-mcp-server with your actual project path.

  2. Using in Cursor

    • Add PDFs: Drag and drop PDF files into Cursor
    • Convert: Use the pdf_to_text tool for automatic conversion
    • Analyze: The extracted text becomes available for AI analysis
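
The configuration from step 1 can also be generated rather than edited by hand. The following is a minimal sketch, not part of this project: buildCursorConfig is a hypothetical helper, and the example path is a placeholder. It assumes the compiled entry point is dist/index.js under the project root, as shown in the configuration above.

```typescript
// Shape of the Cursor MCP settings shown in step 1.
interface McpServerEntry {
  command: string;
  args: string[];
  cwd: string;
}

interface CursorMcpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper: builds the "pdf-to-text" entry for a given project root.
function buildCursorConfig(projectRoot: string): CursorMcpConfig {
  // Accept POSIX absolute paths ("/...") or Windows drive paths ("C:\..." or "C:/...").
  const isAbsolute = projectRoot.startsWith("/") || /^[A-Za-z]:[\\/]/.test(projectRoot);
  if (!isAbsolute) {
    throw new Error("projectRoot must be an absolute path");
  }
  // Drop trailing slashes so the joined path has exactly one separator.
  const root = projectRoot.replace(/[\\/]+$/, "");
  return {
    mcpServers: {
      "pdf-to-text": {
        command: "node",
        args: [`${root}/dist/index.js`],
        cwd: root,
      },
    },
  };
}

// Placeholder path: substitute the directory where you cloned the project.
console.log(JSON.stringify(buildCursorConfig("/home/me/pdf-to-text-mcp-server"), null, 2));
```

The printed JSON can be pasted into the Cursor MCP settings. Rejecting relative paths up front mirrors the warning above about using the actual absolute project path.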

Manual MCP Usage

Example MCP JSON-RPC request:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "pdf_to_text",
    "arguments": {
      "file_paths": ["document1.pdf", "document2.pdf"]
    }
  }
}
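
For clients that build this request in code, a typed helper keeps the payload consistent with the example above. This is a sketch only: buildPdfToTextRequest is a hypothetical name, and rejecting an empty path list is a client-side choice made here, not documented server behavior.

```typescript
// Shape of a JSON-RPC 2.0 "tools/call" request for the pdf_to_text tool.
interface PdfToTextRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: "pdf_to_text";
    arguments: { file_paths: string[] };
  };
}

// Hypothetical helper (not part of this project): builds the request object.
function buildPdfToTextRequest(id: number, filePaths: string[]): PdfToTextRequest {
  // Assumption: an empty list is never useful, so fail early on the client side.
  if (filePaths.length === 0) {
    throw new Error("file_paths must contain at least one path");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "pdf_to_text",
      arguments: { file_paths: [...filePaths] }, // copy so later mutation cannot leak in
    },
  };
}

// Serialize to a single line of JSON, one message per line.
const line = JSON.stringify(buildPdfToTextRequest(1, ["document1.pdf", "document2.pdf"]));
console.log(line);
```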

⚙️ Configuration

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| NODE_ENV | Environment mode | production |
| LOG_LEVEL | Logging level | info |

Custom Options

The server parses PDFs with preset pdf-parse settings. To customize them, modify the pdf-parse options in src/index.ts.

📚 API Reference

Tools

pdf_to_text

Converts PDF files to readable text format.

Parameters:

  • file_paths (string[]): Array of PDF file paths to convert
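
A caller can check its arguments against this parameter shape before sending them. The guard below is a dependency-free sketch: isPdfToTextArgs is a hypothetical name, and the requirement that the array and each path be non-empty is an assumption, not documented behavior. The project lists zod for runtime validation; this sketch uses plain TypeScript instead and is not the server's own validation code.

```typescript
// Arguments accepted by the pdf_to_text tool, per the parameter list above.
interface PdfToTextArgs {
  file_paths: string[];
}

// Hypothetical type guard: true only for objects carrying a non-empty
// array of non-empty strings under "file_paths".
function isPdfToTextArgs(value: unknown): value is PdfToTextArgs {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const paths = (value as { file_paths?: unknown }).file_paths;
  return (
    Array.isArray(paths) &&
    paths.length > 0 && // assumption: an empty list is treated as invalid
    paths.every((p) => typeof p === "string" && p.length > 0)
  );
}

console.log(isPdfToTextArgs({ file_paths: ["document1.pdf"] }));
console.log(isPdfToTextArgs({ file_paths: "document1.pdf" }));
```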

Returns:

{
  content: [
    {
      type: "text",
      text: string // Extracted text with file separators
    }
  ];
}

Example Response:

{
  "content": [
    {
      "type": "text",
      "text": "Successfully converted 2 PDF file(s) to text:\n\n=== document1.pdf ===\nExtracted content here...\n\n=== document2.pdf ===\nMore content here..."
    }
  ]
}
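
Because all files come back in one text block, a client that needs per-file text has to split on the separators. The sketch below assumes the separator format is exactly the "=== name ===" line shown in the example response. splitSections is a hypothetical helper, and a PDF whose own text contains a line of that form would be split incorrectly.

```typescript
// One extracted file: its name from the separator line and the text that follows.
interface PdfSection {
  fileName: string;
  text: string;
}

// Hypothetical helper: splits the combined response text into per-file sections.
function splitSections(combined: string): PdfSection[] {
  // A separator is a whole line of the form "=== <name> ===".
  const header = /^=== (.+) ===$/gm;
  const marks: { fileName: string; start: number; end: number }[] = [];
  let m: RegExpExecArray | null;
  while ((m = header.exec(combined)) !== null) {
    marks.push({ fileName: m[1], start: m.index, end: m.index + m[0].length });
  }
  // Each section runs from the end of its header to the start of the next one.
  return marks.map((mark, i) => {
    const bodyEnd = i + 1 < marks.length ? marks[i + 1].start : combined.length;
    return { fileName: mark.fileName, text: combined.slice(mark.end, bodyEnd).trim() };
  });
}

const example =
  "Successfully converted 2 PDF file(s) to text:\n\n" +
  "=== document1.pdf ===\nExtracted content here...\n\n" +
  "=== document2.pdf ===\nMore content here...";

for (const section of splitSections(example)) {
  console.log(`${section.fileName}: ${section.text}`);
}
```

Text before the first separator, such as the summary line, is ignored by this sketch.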

🏗️ Development

Project Structure

pdf-to-text-mcp-server/
├── src/
│   ├── index.ts              # Main MCP server implementation
│   └── types/
│       └── pdf-parse.d.ts    # Type definitions
├── dist/                     # Compiled JavaScript output
├── test-server.js            # Test utilities
├── package.json              # Project configuration
├── tsconfig.json             # TypeScript configuration
├── cursor-config.json        # Example Cursor configuration
└── README.md                 # This file

Available Scripts

| Script | Description |
| --- | --- |
| yarn build | Compile TypeScript to JavaScript |
| yarn start | Run the compiled server |
| yarn dev | Run in development mode with hot reload |
| yarn test | Execute test suite |
| yarn lint | Run code linting |

Building from Source

# Development mode with file watching
yarn dev

# Production build
yarn build

# Run tests
yarn test

Dependencies

| Package | Purpose | Version |
| --- | --- | --- |
| @modelcontextprotocol/sdk | MCP protocol implementation | ^0.5.0 |
| pdf-parse | PDF text extraction | ^1.1.1 |
| zod | Runtime type validation | ^3.22.4 |
| typescript | TypeScript compiler | ^5.0.0 |

🐛 Troubleshooting

Common Issues

| Issue | Cause | Solution |
| --- | --- | --- |
| ENOENT: no such file or directory | Invalid file path | Verify PDF file exists and path is correct |
| File is not a PDF | Wrong file format | Ensure file has .pdf extension and is valid |
| Empty text output | Image-based PDF | This tool only extracts text-based content |
| Build errors | Missing dependencies | Run yarn install to install all dependencies |
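
When a file has a .pdf extension but is still rejected, checking its first bytes can help. PDF files begin with the ASCII signature "%PDF-". The sketch below tests for that signature. looksLikePdf is a hypothetical helper and not necessarily how this server detects invalid files. The check is strict, whereas some PDF readers tolerate a few bytes before the signature.

```typescript
// ASCII bytes of the PDF signature "%PDF-".
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d];

// Hypothetical helper: true when the data starts with the PDF signature.
function looksLikePdf(bytes: Uint8Array): boolean {
  if (bytes.length < PDF_MAGIC.length) {
    return false;
  }
  return PDF_MAGIC.every((expected, i) => bytes[i] === expected);
}

// "%PDF-1.7" as bytes, followed by a ZIP signature ("PK\x03\x04") for contrast.
const pdfHeader = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x37]);
const zipHeader = new Uint8Array([0x50, 0x4b, 0x03, 0x04]);

console.log(looksLikePdf(pdfHeader));
console.log(looksLikePdf(zipHeader));
```

In Node.js, the Buffer returned by fs.readFileSync is a Uint8Array, so it can be passed to this function directly.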

Debug Mode

Enable verbose logging:

NODE_ENV=development yarn start

Testing

Run the comprehensive test suite:

# Run all tests
yarn test

# Test with specific PDF
echo '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "pdf_to_text", "arguments": {"file_paths": ["your-file.pdf"]}}}' | node dist/index.js

🤝 Contributing

We welcome contributions! Please see our contributing guidelines for details.

Development Setup

  1. Fork the repository
  2. Clone your fork
  3. Create a feature branch: git checkout -b feature/amazing-feature
  4. Make your changes
  5. Test thoroughly: yarn test
  6. Commit changes: git commit -m 'Add amazing feature'
  7. Push to branch: git push origin feature/amazing-feature
  8. Open a Pull Request

Code Style

  • Follow existing TypeScript conventions
  • Add tests for new features
  • Update documentation as needed
  • Ensure all tests pass

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Made with ❤️ for the MCP community
