mcp_pdf_reader

labeveryday/mcp_pdf_reader

The MCP PDF Reader Server is built with FastMCP and handles PDF processing tasks: text extraction, image extraction, and OCR for reading text within images.

Built on the FastMCP framework, the server extracts text and images from PDF documents and uses optical character recognition (OCR) to read text embedded within images. It handles multiple languages and can process specific page ranges, so extraction can be limited to the pages of interest. It also reports the PDF's structure and metadata. The server is straightforward to install and set up, and it supports Windows, macOS, and Linux.

Features

  • Text Extraction: Extracts text content from PDF pages.
  • Image Extraction: Extracts all images from PDF files.
  • OCR Capabilities: Reads text from images using Tesseract OCR.
  • Comprehensive Analysis: Provides detailed PDF structure and metadata.
  • Page Range Support: Processes specific page ranges.
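
To illustrate what page range support involves, here is a minimal sketch of turning a range specification into page indices. The listing does not document the format the server accepts, so the "1-3,7" syntax (1-based, inclusive) and the function name `parse_page_range` are assumptions made for illustration, not the server's actual implementation.

```python
def parse_page_range(spec: str, page_count: int) -> list[int]:
    """Turn a 1-based spec such as "1-3,7" into sorted zero-based page indices.

    Hypothetical helper: the spec syntax is an assumption, not the server's API.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end > page_count or start > end:
            raise ValueError(
                f"invalid page range {part!r} for a {page_count}-page document"
            )
        # Convert the inclusive 1-based range to zero-based indices.
        pages.update(range(start - 1, end))
    return sorted(pages)
```

Validating against the page count up front means a bad range fails before any extraction work starts.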

Tools

  1. read_pdf_text

    Extract text content from PDF pages.

  2. extract_pdf_images

    Extract all images from a PDF file.

  3. read_pdf_with_ocr

    Extract both the regular text of a PDF and, using OCR, the text inside its images.

  4. get_pdf_info

    Get comprehensive metadata and statistics about a PDF.

  5. analyze_pdf_structure

    Analyze the structure and content distribution of a PDF.
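
MCP clients invoke tools like the ones above through a JSON-RPC `tools/call` request. The sketch below builds such a request for `read_pdf_text`. The argument names `file_path` and `page_range` are assumptions, since the listing does not give the tool's parameters; the helper `build_tool_call` is likewise hypothetical. Check the schema the server advertises via `tools/list` before relying on them.

```python
import json
from itertools import count

# Simple incrementing request ids for this sketch.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP JSON-RPC 2.0 request that invokes one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "file_path" and "page_range" are assumed argument names, not confirmed ones.
request = build_tool_call(
    "read_pdf_text",
    {"file_path": "/path/to/report.pdf", "page_range": "1-3"},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library usually builds and sends this message for you; the sketch only shows the shape of what goes over the wire.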