MCP PDF Modesty Server
An MCP (Model Context Protocol) server that provides PDF text extraction capabilities by wrapping the excellent pdf2json library.
Attribution
This project uses the pdf2json library created by Modesty Zhang. The original library is maintained in the separate pdf2json repository.
All PDF parsing functionality is provided by pdf2json. This project simply wraps it in an MCP server interface.
Features
- Extract text content from PDF files
- Extract form fields from PDF files
- Multiple output formats: plain text, JSON, or detailed metadata
- Zero-dependency PDF parsing (inherited from pdf2json v3.1.6+)
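To illustrate what the plain-text format involves, here is a minimal sketch that flattens a pdf2json-style result into text. The `Pages` / `Texts` / `R` / `T` layout follows pdf2json's documented output, but the interfaces, `decodeRun`, and `toPlainText` are hypothetical names for illustration; this is not necessarily how this server produces its output.

```typescript
// Hypothetical types mirroring the page/text layout of pdf2json's parsed
// output; only the fields used below are declared.
interface TextRun { T: string }
interface TextBlock { R: TextRun[] }
interface Page { Texts: TextBlock[] }
interface ParsedPdf { Pages: Page[] }

// Assumption: some pdf2json releases URI-encode `T`. decodeURIComponent
// throws on a stray "%", so fall back to the raw string in that case.
function decodeRun(raw: string): string {
  try {
    return decodeURIComponent(raw);
  } catch {
    return raw;
  }
}

// Join the runs of each text block, one block per line,
// with a blank line between pages.
function toPlainText(pdf: ParsedPdf): string {
  return pdf.Pages
    .map((page) =>
      page.Texts
        .map((block) => block.R.map((run) => decodeRun(run.T)).join(""))
        .join("\n")
    )
    .join("\n\n");
}

// Made-up sample data, not taken from a real PDF.
const sample: ParsedPdf = {
  Pages: [
    { Texts: [{ R: [{ T: "Hello%20world" }] }, { R: [{ T: "100%" }] }] },
    { Texts: [{ R: [{ T: "Page%202" }] }] },
  ],
};
console.log(toPlainText(sample));
```

Real PDFs often split a single visual line across several text blocks, so a production implementation would likely also need to group blocks by position.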
Installation
From npm (when published)
npm install mcp-pdf-modesty
From source
- Clone the repository:
git clone https://github.com/lh/mcp-pdf-modesty.git
cd mcp-pdf-modesty
- Install dependencies and build:
npm install
npm run build
npm link
Usage
In Claude Code
After building and linking from source, add the server to Claude Code:
claude mcp add mcp-pdf-modesty mcp-pdf-modesty
Then restart Claude Code for the server to be available.
In Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "pdf": {
      "command": "node",
      "args": ["/path/to/mcp-pdf-modesty/dist/index.js"]
    }
  }
}
Or if installed from npm:
{
  "mcpServers": {
    "pdf": {
      "command": "npx",
      "args": ["mcp-pdf-modesty"]
    }
  }
}
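If your claude_desktop_config.json already lists other servers, the "pdf" entry should be merged in rather than pasted over the whole file. The sketch below shows one way to do that merge; `ServerEntry`, `DesktopConfig`, `withServer`, and the "other" server path are hypothetical and only mirror the shape of the snippets above.

```typescript
// Shape of one entry under "mcpServers", as in the snippets above.
interface ServerEntry { command: string; args: string[] }
interface DesktopConfig {
  mcpServers?: Record<string, ServerEntry>;
  [key: string]: unknown;
}

// Return a copy of `config` with `entry` registered under `serverName`,
// leaving other servers and unrelated settings untouched.
function withServer(
  config: DesktopConfig,
  serverName: string,
  entry: ServerEntry
): DesktopConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [serverName]: entry },
  };
}

// Hypothetical existing config with one unrelated server.
const existingConfig: DesktopConfig = {
  mcpServers: { other: { command: "node", args: ["/path/to/other/server.js"] } },
};
const updatedConfig = withServer(existingConfig, "pdf", {
  command: "npx",
  args: ["mcp-pdf-modesty"],
});
console.log(JSON.stringify(updatedConfig, null, 2));
```

The function returns a new object instead of mutating its input, so the original config can still be written back unchanged if something goes wrong.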
Available Tools
extract_text
Extract text content from a PDF file.
Parameters:
- path (required): Path to the PDF file
- format (optional): Output format
  - "text" (default): Plain text output
  - "json": Structured data with text and metadata
  - "detailed": Full PDF data structure
Example:
extract_text({ path: "/path/to/document.pdf", format: "text" })
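The parameter rules above can be expressed as a small validation step. This is a sketch under stated assumptions: `OutputFormat`, `ExtractTextArgs`, and `parseExtractTextArgs` are hypothetical names, and the server's own validation may differ.

```typescript
// Hypothetical argument types derived from the documented parameters.
type OutputFormat = "text" | "json" | "detailed";
interface ExtractTextArgs { path: string; format: OutputFormat }

const FORMATS: readonly OutputFormat[] = ["text", "json", "detailed"];

// Validate untyped input and apply the documented default format ("text").
function parseExtractTextArgs(input: unknown): ExtractTextArgs {
  if (typeof input !== "object" || input === null) {
    throw new TypeError("arguments must be an object");
  }
  const { path, format = "text" } = input as { path?: unknown; format?: unknown };
  if (typeof path !== "string" || path.length === 0) {
    throw new TypeError("path is required and must be a non-empty string");
  }
  if (typeof format !== "string" || !FORMATS.includes(format as OutputFormat)) {
    throw new TypeError(`format must be one of: ${FORMATS.join(", ")}`);
  }
  return { path, format: format as OutputFormat };
}

// Omitting `format` falls back to "text".
console.log(parseExtractTextArgs({ path: "/path/to/document.pdf" }));
```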
extract_form_fields
Extract form fields from a PDF file.
Parameters:
- path (required): Path to the PDF file
Example:
extract_form_fields({ path: "/path/to/form.pdf" })
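The call shown above is shorthand. An MCP client sends it to the server as a JSON-RPC 2.0 request with the method "tools/call". The sketch below builds such a request; `ToolCallRequest` and `buildToolCall` are hypothetical helper names, and in practice an MCP client library would normally construct this message for you.

```typescript
// JSON-RPC 2.0 request shape for MCP's "tools/call" method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Wrap a tool name and its arguments in a "tools/call" request.
function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const sampleRequest = buildToolCall(1, "extract_form_fields", {
  path: "/path/to/form.pdf",
});
console.log(JSON.stringify(sampleRequest));
```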
Development
# Install dependencies
npm install
# Build
npm run build
# Run in development mode
npm run dev
License
This MCP wrapper is licensed under the MIT License. See the license file in the repository for details.
The underlying pdf2json library has its own license. Please refer to the pdf2json repository for its licensing terms.
Acknowledgments
Special thanks to Modesty Zhang for creating and maintaining the pdf2json library that makes this MCP server possible.