DeepSeekMine/mcp-pdf-reader



📄 MCP PDF Server

A PDF file reading server based on FastMCP.

Supports PDF text extraction, OCR recognition, and image extraction via the MCP protocol, with a built-in web debugger for easy testing.

🚀 Features

`read_pdf_text`: Extracts plain text from a PDF, page by page.
`read_by_ocr`: Uses OCR to recognize text in scanned or image-based PDFs.
`read_pdf_images`: Extracts all images from a specified PDF page and returns them Base64 encoded.
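Because `read_pdf_images` returns Base64-encoded output, a client has to decode it before the images can be written to disk. The following is a minimal sketch, assuming the tool returns a list of plain Base64 strings and that the images are PNG; neither is confirmed here, and `save_images` is a hypothetical helper name.

```python
import base64
from pathlib import Path


def save_images(encoded_images, out_dir, prefix="page_image", ext="png"):
    """Decode Base64 strings and write each one to a file in out_dir.

    Assumptions (not confirmed by this README): the tool returns a list of
    plain Base64 strings, and the image data is PNG.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for index, encoded in enumerate(encoded_images, start=1):
        data = base64.b64decode(encoded)  # raw image bytes
        target = out / f"{prefix}_{index}.{ext}"
        target.write_bytes(data)
        written.append(target)
    return written
```

If the server reports the image format alongside the data, pass it as `ext` instead of relying on the PNG default.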

📂 Project Structure

mcp-pdf-server/
├── pdf_resources/        # Directory for uploaded and processed PDF files
├── txt_server.py         # Main server entry point
└── README.md             # Project documentation

⚙️ Installation

Recommended Python version: 3.9+

pip install pymupdf mcp

Note: To use OCR features, you may need a MuPDF build with OCR support or external OCR libraries.
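One way to check the OCR prerequisite before starting the server is sketched below. It assumes the server's OCR goes through PyMuPDF's Tesseract integration, which can locate language data through the `TESSDATA_PREFIX` environment variable; this README does not confirm which OCR backend is used, and both helper names are hypothetical.

```python
import os
from pathlib import Path


def find_tessdata(env=None):
    """Return the directory named by TESSDATA_PREFIX, or None if unusable.

    Assumption: OCR is backed by Tesseract language data located through
    the TESSDATA_PREFIX environment variable.
    """
    env = os.environ if env is None else env
    prefix = env.get("TESSDATA_PREFIX")
    if not prefix:
        return None
    path = Path(prefix)
    return path if path.is_dir() else None


def has_language(tessdata_dir, language="eng"):
    """Check that <language>.traineddata exists in the tessdata directory."""
    return (Path(tessdata_dir) / f"{language}.traineddata").is_file()
```

A `None` result from `find_tessdata()` does not prove OCR is unavailable, since other setups are possible; it only signals that this particular configuration is missing.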

🔦 Start the Server

Run the following command:

python txt_server.py

You should see logs like:

Serving on http://127.0.0.1:6231

🌐 Web Debugging Interface

Open your browser and visit:

http://127.0.0.1:6231

1. Select a tool from the left panel.
2. Fill in parameters on the right panel.
3. Click "Run" to test the tool.

No coding is required: tools can be debugged and tested directly in the web UI.

🛠️ API Tool List

| Tool | Description | Input Parameters | Returns |
| --- | --- | --- | --- |
| `read_pdf_text` | Extracts normal text from PDF pages | `file_path`, `start_page`, `end_page` | List of page texts |
| `read_by_ocr` | Recognizes text via OCR | `file_path`, `start_page`, `end_page`, `language`, `dpi` | OCR extracted text |
| `read_pdf_images` | Extracts images from a PDF page | `file_path`, `page_number` | List of images (Base64 encoded) |
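The `start_page` and `end_page` parameters have to be mapped onto actual pages of the document. The sketch below shows one plausible mapping, assuming page numbers are 1-based and inclusive (the usage examples speak of "pages 1 to 5") and that an `end_page` past the end of the document is clamped. `resolve_page_range` is a hypothetical helper, not the server's own code.

```python
def resolve_page_range(start_page, end_page, page_count):
    """Convert a 1-based inclusive page range to 0-based page indices.

    Assumption: start_page and end_page are 1-based and inclusive.
    end_page is clamped to the length of the document.
    """
    if page_count < 1:
        raise ValueError("document has no pages")
    if start_page < 1 or start_page > page_count:
        raise ValueError(f"start_page must be between 1 and {page_count}")
    if end_page < start_page:
        raise ValueError("end_page must not be smaller than start_page")
    last = min(end_page, page_count)
    return range(start_page - 1, last)


# Pages 1 to 5 of a 10-page document map to indices 0 through 4.
print(list(resolve_page_range(1, 5, 10)))
```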

📝 Example Usage

Extract text from pages 1 to 5:

mcp run read_pdf_text --args '{"file_path": "pdf_resources/example.pdf", "start_page": 1, "end_page": 5}'

Perform OCR recognition on page 1:

mcp run read_by_ocr --args '{"file_path": "pdf_resources/example.pdf", "start_page": 1, "end_page": 1, "language": "eng"}'

Extract all images from page 3:

mcp run read_pdf_images --args '{"file_path": "pdf_resources/example.pdf", "page_number": 3}'
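Whichever client issues these calls, at the protocol level each one becomes an MCP `tools/call` request carried as JSON-RPC 2.0. The sketch below builds that message for the first example. `build_tool_call` is a hypothetical helper; in practice an MCP client library also handles the session handshake and the transport, both of which are omitted here.

```python
import json


def build_tool_call(tool_name, arguments, request_id=1):
    """Build the JSON-RPC 2.0 request that MCP uses to invoke a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = build_tool_call(
    "read_pdf_text",
    {"file_path": "pdf_resources/example.pdf", "start_page": 1, "end_page": 5},
)
print(json.dumps(request, indent=2))
```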

📢 Notes

Files must be placed inside the pdf_resources/ directory, or an absolute path must be provided.
OCR functionality requires appropriate OCR support in the environment.
When processing large files, adjust memory and timeout settings as needed.
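The path rule in the first note can be sketched as follows. This is a hypothetical helper, not the server's actual lookup logic: it accepts absolute paths unchanged and rejects relative paths that resolve outside `pdf_resources/`. The `cwd` parameter is an assumption added so the check can be exercised without changing the working directory; `Path.is_relative_to` requires Python 3.9+.

```python
from pathlib import Path


def resolve_pdf_path(file_path, base_dir="pdf_resources", cwd=None):
    """Return a usable path: absolute as given, or relative inside base_dir.

    Hypothetical helper; the server's real resolution rules may differ.
    """
    root = Path(cwd) if cwd is not None else Path.cwd()
    path = Path(file_path)
    if path.is_absolute():
        return path
    base = (root / base_dir).resolve()
    candidate = (root / path).resolve()
    # Reject relative paths that escape base_dir, e.g. via "..".
    if not candidate.is_relative_to(base):
        raise ValueError(
            f"{file_path!r} is neither absolute nor inside {base_dir}/"
        )
    return candidate
```

Resolving before comparing means a path such as `pdf_resources/../secret.pdf` is rejected rather than silently accepted.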

📜 License

This project is licensed under the MIT License.
For commercial use, please credit the original source.
