sergiudanstan/mcp-pdf-to-csv
PDF to CSV MCP Server
FastMCP server for extracting tables from PDF files and converting them to CSV format.
Features
- Batch PDF processing: Extract tables from single or multiple PDFs
- Multiple extraction strategies: Auto, lattice (grid-based), and stream (whitespace-based)
- Flexible output: Save individual tables or merge all tables per PDF
- Configurable folders: Set custom input/output directories
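The three strategy names line up naturally with pdfplumber's table settings. The sketch below is an assumption about how such a mapping could look, not the server's actual code: `settings_for` is a hypothetical helper, while `vertical_strategy` and `horizontal_strategy` are real pdfplumber table-setting keys that accept `"lines"` or `"text"`.

```python
# Hypothetical mapping from the strategy names to pdfplumber table settings.
# The server's real mapping may differ.
LATTICE = {"vertical_strategy": "lines", "horizontal_strategy": "lines"}
STREAM = {"vertical_strategy": "text", "horizontal_strategy": "text"}


def settings_for(strategy: str) -> list[dict]:
    """Return the table settings to try, in order, for a strategy name."""
    if strategy == "lattice":
        return [LATTICE]
    if strategy == "stream":
        return [STREAM]
    if strategy == "auto":
        # Assumption: "auto" tries ruled lines first, then text alignment.
        return [LATTICE, STREAM]
    raise ValueError(f"unknown strategy: {strategy!r}")
```

With pdfplumber, each candidate could be passed as `page.extract_tables(table_settings=candidate)`, stopping at the first one that returns any tables.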
Installation
# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install fastmcp pdfplumber pandas
Configuration
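Register the server in the Claude Desktop configuration file. On macOS it is located at:
~/Library/Application Support/Claude/claude_desktop_config.json
Replace the paths below with the absolute paths to your own virtual environment and server script.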
{
  "mcpServers": {
    "pdf-to-csv": {
      "command": "/Users/sara/MCP_PDF_CSV/.venv/bin/python3",
      "args": ["/Users/sara/MCP_PDF_CSV/pdf_to_csv_server.py"]
    }
  }
}
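If the configuration file already lists other servers, editing it by hand risks dropping them. The following is a minimal sketch of adding the entry programmatically while preserving existing entries. `add_server` and `update_config_file` are hypothetical helpers written for this README, not part of the server or of Claude Desktop.

```python
import json
from pathlib import Path


def add_server(config: dict, name: str, python_path: str, script_path: str) -> dict:
    """Return a copy of config with one entry added under "mcpServers"."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {"command": python_path, "args": [script_path]}
    return {**config, "mcpServers": servers}


def update_config_file(config_path: Path, name: str, python_path: str, script_path: str) -> None:
    """Read the config file (if present), add the entry, and write it back."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    updated = add_server(config, name, python_path, script_path)
    config_path.write_text(json.dumps(updated, indent=2) + "\n")
```

Restart Claude Desktop after the file changes so the new entry is picked up.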
Usage
Default Folders
- Input: /Users/sara/MCP_PDF_CSV/pdfs/
- Output: /Users/sara/MCP_PDF_CSV/output_csv/
Available MCP Tools
- set_folder(folder: str)
  - Set the input folder containing PDF files
  - Example: set_folder("/path/to/pdfs")
- set_output(folder: str)
  - Set the output folder for CSV files
  - Example: set_output("/path/to/output")
- list_pdfs()
  - List all PDF files in the current input folder
- extract_tables(filename: str, pages: str = "all", strategy: str = "auto", merge: bool = False, strip_ws: bool = True)
  - Extract tables from a single PDF
  - Parameters:
    - filename: PDF filename in the input folder
    - pages: "all" or specific pages like "1,3-5"
    - strategy: "auto", "lattice", or "stream"
    - merge: Create merged CSV with all tables
    - strip_ws: Strip whitespace from cells
- extract_all(pages: str = "all", strategy: str = "auto", merge: bool = False, strip_ws: bool = True)
  - Extract tables from all PDFs in the folder
  - Same parameters as extract_tables (except filename)
- show_config()
  - Display current input/output folder configuration
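The pages parameter accepts "all" or a comma-separated list of pages and ranges such as "1,3-5". As an illustration of that format, here is one way such a specification could be expanded into page numbers. `parse_pages` is a hypothetical helper, and dropping out-of-range pages is an assumption; the server may handle them differently.

```python
def parse_pages(spec: str, page_count: int) -> list[int]:
    """Expand "all" or a spec like "1,3-5" into sorted 1-based page numbers."""
    if spec.strip().lower() == "all":
        return list(range(1, page_count + 1))
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        # Assumption: pages beyond the end of the document are silently dropped.
        pages.update(range(start, min(end, page_count) + 1))
    return sorted(pages)


print(parse_pages("1,3-5", page_count=10))  # [1, 3, 4, 5]
```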
Testing
Run the test script to verify installation:
source .venv/bin/activate
python test_server.py
Troubleshooting
Check Claude Desktop Logs
tail -50 ~/Library/Logs/Claude/mcp-server-pdf-to-csv.log
Common Issues
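- "Read-only file system" error: Fixed in the server by building folder paths as absolute paths from the script's own directory (SCRIPT_DIR). If the error persists, pass absolute paths to set_folder and set_output.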
- Connection errors: Restart Claude Desktop after configuration changes
- Missing dependencies: Reinstall with pip install fastmcp pdfplumber pandas
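The absolute-path fix mentioned above can be sketched as follows. A relative folder name is resolved against whatever working directory the host application starts the server in, which may not be writable; anchoring the defaults at the script's own location avoids that. `default_folders` is a hypothetical helper, and the folder names are taken from the defaults listed in this README.

```python
from pathlib import Path


def default_folders(script_file: str) -> tuple[Path, Path]:
    """Return absolute (input, output) folders next to the given script file."""
    # resolve() makes the path absolute, independent of the working directory.
    script_dir = Path(script_file).resolve().parent
    return script_dir / "pdfs", script_dir / "output_csv"
```

Inside the server script this would be called as `default_folders(__file__)`.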
Technical Details
- Framework: FastMCP 2.13.0.2
- PDF Library: pdfplumber (with pdfminer.six backend)
- Data Processing: pandas
- Transport: STDIO (Standard Input/Output)
- Protocol: MCP (Model Context Protocol) 2025-06-18
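To illustrate the data-processing step: pdfplumber returns each table as a list of rows, with `None` for empty cells. The sketch below turns such a table into CSV text and applies the strip_ws behavior. It deliberately uses the standard-library `csv` module instead of pandas so that it runs without extra dependencies; `table_to_csv` is a hypothetical helper, not the server's implementation.

```python
import csv
import io


def table_to_csv(rows: list[list], strip_ws: bool = True) -> str:
    """Serialize one extracted table (a list of rows of cells) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        # Empty cells arrive as None; write them as empty strings.
        cells = ["" if cell is None else str(cell) for cell in row]
        if strip_ws:
            cells = [cell.strip() for cell in cells]
        writer.writerow(cells)
    return buffer.getvalue()
```

With pandas, the equivalent would be to build a `DataFrame` from the rows and call `to_csv`.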
Files
- pdf_to_csv_server.py - Main MCP server
- test_server.py - Validation script
- README.md - This file
- .venv/ - Python virtual environment
- pdfs/ - Default input folder
- output_csv/ - Default output folder