NimbleBrainInc/mcp-pdfco
PDF.co MCP Server provides a comprehensive suite of tools for PDF manipulation, conversion, and automation, supporting a wide range of document processing needs.
MCP Server PDF.co
About
MCP server for PDF.co API. Comprehensive PDF manipulation, conversion, OCR, text extraction, and document automation with support for barcodes, watermarks, and security features.
Features
- Full API Coverage: Complete implementation of PDF.co API endpoints
- Strongly Typed: All responses use Pydantic models for type safety
- S-Tier Architecture: Production-ready with separated concerns (API client, models, server)
- HTTP Transport: Supports streamable-http with health endpoint
- Async/Await: Built on aiohttp for high performance
- Type Safe: Full mypy strict mode compliance
- Comprehensive Testing: Unit tests with pytest and AsyncMock
- Docker Ready: Production Dockerfile included
Available Tools
PDF Conversion Tools
- pdf_to_text: Extract text content from PDF documents
- pdf_to_json: Extract structured data from PDFs
- pdf_to_html: Convert PDF to HTML format
- pdf_to_csv: Extract tables from PDF to CSV
PDF Manipulation Tools
- pdf_merge: Combine multiple PDFs into one
- pdf_split: Split PDF into separate pages or ranges
- pdf_rotate: Rotate pages in a PDF document
- pdf_compress: Reduce PDF file size with configurable compression
- pdf_add_watermark: Add text watermarks to PDFs
PDF Security Tools
- pdf_protect: Add password protection to PDFs
- pdf_unlock: Remove password protection from PDFs
PDF Information
- pdf_info: Get PDF metadata (pages, size, dimensions, etc.)
Document Creation Tools
- html_to_pdf: Convert HTML content to PDF
- url_to_pdf: Convert web pages to PDF
- image_to_pdf: Convert images to PDF documents
Barcode Tools
- barcode_generate: Generate QR codes and barcodes
- barcode_read: Read and decode barcodes from images
OCR Tools
- ocr_pdf: OCR scanned PDFs to make them searchable
Installation
Using uv (recommended)
# Clone the repository
git clone <repository-url>
cd mcp-pdfco
# Install with uv
uv pip install -e .
# Install with development dependencies
uv pip install -e ".[dev]"
Using pip
pip install -e .
Configuration
API Key
Get your free API key from the PDF.co Dashboard (https://app.pdf.co/dashboard) and set it as an environment variable:
export PDFCO_API_KEY=your_api_key_here
Or create a .env file:
PDFCO_API_KEY=your_api_key_here
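A process can pick the key up from its environment roughly as follows. This is a minimal sketch, not the project's actual startup code: `load_api_key` is a hypothetical helper, and the warning text simply mirrors the message listed under Troubleshooting. It reads the process environment only; loading a .env file would need a separate step.

```python
import os
import warnings


def load_api_key(env=None):
    """Return the PDF.co API key from the environment, or None if it is unset.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("PDFCO_API_KEY", "").strip()
    if not key:
        # Warn instead of raising so the caller decides whether a missing key is fatal.
        warnings.warn("PDFCO_API_KEY is not set", stacklevel=2)
        return None
    return key
```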
Claude Desktop Configuration
Add to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "pdfco": {
      "command": "uvx",
      "args": ["mcp-pdfco"],
      "env": {
        "PDFCO_API_KEY": "your_api_key_here"
      }
    }
  }
}
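If the configuration file already lists other servers, the "pdfco" entry has to be merged in rather than pasted over the whole file. The sketch below shows one way to do that with the standard library; `add_pdfco_server` is a hypothetical helper, not part of this project, and it assumes the file, if present, already contains valid JSON.

```python
import json
from pathlib import Path


def add_pdfco_server(config_path, api_key):
    """Add or update the "pdfco" entry in a Claude Desktop config file.

    Existing entries under "mcpServers" are preserved.
    """
    path = Path(config_path)
    # Start from the existing config when there is one, else from an empty object.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["pdfco"] = {
        "command": "uvx",
        "args": ["mcp-pdfco"],
        "env": {"PDFCO_API_KEY": api_key},
    }
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Restart Claude Desktop after editing the file so the new server entry is picked up.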
Running the Server
Development Mode
# Using Python module
uv run python -m mcp_pdfco.server
# Using the Makefile
make run
Production Mode (Docker)
# Build the Docker image
docker build -t mcp-pdfco .
# Run with Docker
docker run -e PDFCO_API_KEY=your_key -p 8000:8000 mcp-pdfco
# Run with Docker Compose
docker-compose up
HTTP Transport
The server supports HTTP transport with a health check endpoint:
# Start with uvicorn
uvicorn mcp_pdfco.server:app --host 0.0.0.0 --port 8000
# Check health
curl http://localhost:8000/health
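For scripted deployments, the same check can be made from Python. The sketch below uses only the standard library; `wait_for_health`, its defaults, and the retry policy are illustrative assumptions, and it treats any HTTP 200 from /health as healthy without inspecting the response body.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(base_url="http://localhost:8000", attempts=5, delay=1.0):
    """Poll GET /health until it returns HTTP 200 or the attempts run out."""
    url = base_url.rstrip("/") + "/health"
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, or a non-2xx status: treat as "not ready yet".
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A container entrypoint or CI job could call `wait_for_health()` after starting the server and fail fast when it returns False.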
Usage Examples
Extract Text from PDF
result = await pdf_to_text(
    url="https://example.com/document.pdf",
    pages="1-5"
)
print(result.text)
Merge Multiple PDFs
result = await pdf_merge(
    urls=[
        "https://example.com/doc1.pdf",
        "https://example.com/doc2.pdf"
    ],
    name="merged_document.pdf"
)
print(f"Merged PDF: {result.url}")
Convert HTML to PDF
result = await html_to_pdf(
    html="<h1>Hello World</h1><p>This is a PDF</p>",
    name="hello.pdf",
    page_size="A4",
    orientation="Portrait"
)
print(f"Generated PDF: {result.url}")
Add Watermark
result = await pdf_add_watermark(
    url="https://example.com/document.pdf",
    text="CONFIDENTIAL",
    x=200,
    y=400,
    font_size=48,
    color="FF0000",
    opacity=0.3,
    pages="0-",  # Apply to all pages
    name="watermarked_document.pdf"
)
print(f"Watermarked PDF: {result.url}")
Generate QR Code
result = await barcode_generate(
    value="https://example.com",
    barcode_type="QRCode",
    format="png"
)
print(f"QR Code: {result.url}")
OCR a Scanned PDF
result = await ocr_pdf(
    url="https://example.com/scanned.pdf",
    pages="1-10",
    lang="eng"
)
print(f"OCR'd PDF: {result.url}")
print(f"Extracted text: {result.text}")
Development
Quick Start
make help # Show all available commands
make install # Install dependencies
make dev-install # Install with dev dependencies
make format # Format code with ruff
make lint # Lint code with ruff
make typecheck # Type check with mypy
make test # Run tests with pytest
make test-cov # Run tests with coverage
make check # Run all checks (lint + typecheck + test)
make clean # Clean up artifacts
Project Structure
.
├── src/
│   └── mcp_pdfco/
│       ├── __init__.py
│       ├── server.py          # FastMCP server with tool definitions
│       ├── api_client.py      # Async PDF.co API client
│       └── api_models.py      # Pydantic models for type safety
├── tests/
│   ├── __init__.py
│   ├── test_server.py         # Server tool tests
│   └── test_api_client.py     # API client tests
├── pyproject.toml             # Project configuration
├── Makefile                   # Development commands
├── Dockerfile                 # Container deployment
└── README.md                  # This file
Running Tests
# Run all tests
pytest
# Run with coverage
pytest --cov=src/mcp_pdfco --cov-report=term-missing
# Run specific test file
pytest tests/test_server.py -v
Code Quality
This project uses:
- ruff: Fast Python linter and formatter
- mypy: Static type checker (strict mode)
- pytest: Testing framework with async support
All code must pass:
make check # Runs lint + typecheck + test
Architecture
This server follows S-Tier MCP architecture principles:
- Separation of Concerns
  - api_client.py: HTTP communication layer
  - api_models.py: Data models and type definitions
  - server.py: MCP tool definitions and routing
- Type Safety
  - Full type hints on all functions
  - Pydantic models for API responses
  - Mypy strict mode compliance
- Async All the Way
  - aiohttp for HTTP requests
  - Async/await throughout
  - Context managers for resource cleanup
- Error Handling
  - Custom PDFcoAPIError exception
  - Context logging via ctx.error() and ctx.warning()
  - Graceful error messages
- Production Ready
  - Docker support
  - Health check endpoint
  - Environment-based configuration
  - Comprehensive logging
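The error-handling pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the `status` attribute on `PDFcoAPIError`, the `FakeContext` stand-in, and the `run_tool` wrapper are all hypothetical, and the sketch assumes `ctx.error()` is awaitable.

```python
import asyncio


class PDFcoAPIError(Exception):
    """Hypothetical shape; the real exception may carry different fields."""

    def __init__(self, message, status=None):
        super().__init__(message)
        self.status = status


class FakeContext:
    """Stand-in for the MCP context object: records messages instead of sending them."""

    def __init__(self):
        self.errors = []

    async def error(self, message):
        self.errors.append(message)


async def run_tool(ctx, call):
    """Await an API call; on failure, log through ctx and raise a readable error."""
    try:
        return await call()
    except PDFcoAPIError as exc:
        await ctx.error(f"PDF.co API error ({exc.status}): {exc}")
        # Chain the original exception so the low-level cause is still available.
        raise RuntimeError(f"PDF.co request failed: {exc}") from exc
```

Keeping the translation from API errors to user-facing messages in one wrapper means the HTTP layer can raise precise exceptions while every tool reports failures the same way.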
Requirements
- Python 3.13+
- aiohttp >= 3.12.15
- fastapi >= 0.117.1
- fastmcp >= 2.12.4
- pydantic >= 2.0.0
- uvicorn >= 0.34.0
API Documentation
For detailed API documentation, visit PDF.co API Documentation.
Supported Input Formats
- PDF: URL or base64 encoded
- Images: PNG, JPG, GIF, BMP, TIFF
- HTML: Raw HTML string or URL
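To prepare a local PDF as base64 input, the standard library is enough. `pdf_to_base64` is a hypothetical helper; which tool parameter accepts the encoded string is not specified here, so check the tool's signature before relying on it.

```python
import base64
from pathlib import Path


def pdf_to_base64(path):
    """Read a local file and return its contents as a base64-encoded ASCII string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")
```

Base64 inflates the payload by roughly a third, so a URL is usually the better choice for large documents.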
Supported Output Formats
- PDF: High-quality PDF generation
- Text: Plain text extraction
- JSON: Structured data extraction
- HTML: Formatted HTML output
- CSV: Table data extraction
- Images: PNG, JPG, SVG for barcodes
Rate Limits
PDF.co has rate limits based on your subscription plan. Free plans include:
- 100 API calls per month
- 10 API calls per minute
Check your dashboard for current usage.
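A client that issues many requests can stay under a per-minute limit by throttling itself. The sketch below is a generic sliding-window limiter, not something this server provides; the class name and defaults are assumptions, with `max_calls=10` and `period=60.0` chosen to match the free-plan figure quoted above.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls within any `period`-second window."""

    def __init__(self, max_calls=10, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._calls = deque()

    def wait_time(self):
        """Return seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def record(self):
        """Record that a call was just made."""
        self._calls.append(self.clock())
```

Before each request, sleep for `limiter.wait_time()` seconds and then call `limiter.record()`. The monthly quota still has to be tracked separately, for example through the dashboard.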
Troubleshooting
Common Issues
Issue: PDFCO_API_KEY is not set warning
Solution: Set the environment variable:
export PDFCO_API_KEY=your_key_here
Issue: Network error or timeout
Solution: Check your internet connection and increase timeout:
client = PDFcoClient(timeout=180.0) # 3 minutes
Issue: API Error 401: Unauthorized
Solution: Verify your API key is valid at https://app.pdf.co/dashboard
Issue: Docker container won't start
Solution: Ensure the API key is passed correctly:
docker run -e PDFCO_API_KEY=your_key_here -p 8000:8000 mcp-pdfco
Contributing
Contributions are welcome! Please see the repository's contribution guidelines.
- Fork the repository
- Create a feature branch
- Make your changes
- Run tests: make check
- Submit a pull request
Issue Tracker: GitHub Issues
License
MIT
Links
Part of the NimbleTools Registry - an open source collection of production-ready MCP servers. For enterprise deployment, check out NimbleBrain.