cordlesssteve/file-converter-mcp
The Document Organizer MCP Server is a comprehensive tool designed for converting PDFs to Markdown, organizing documents, and standardizing project documentation.
File Converter MCP
A Model Context Protocol (MCP) server that aggregates various file conversion tools for quick formatting and file type transformations.
Features
Supported Conversions
- PDF to Markdown - Convert PDF documents to markdown format
- Image Format Conversion - Transform between common image formats (PNG, JPG, WebP, etc.)
- Document Conversion - Convert between document formats (DOCX, TXT, HTML, etc.)
- Spreadsheet Conversion - Transform spreadsheet formats (CSV, XLSX, JSON, etc.)
- Code Format Conversion - Convert between code formats and syntax highlighting
- Archive Operations - Extract and create archive files (ZIP, TAR, etc.)
Conversion Engines
- PDF Engine: marker (recommended) and pymupdf4llm support
- Image Engine: Sharp and ImageMagick integration
- Document Engine: Pandoc integration for broad format support
- Archive Engine: Built-in Node.js compression libraries
Installation
npm install -g file-converter-mcp
Dependencies
Install conversion engines based on your needs:
# PDF conversion engines
pip install marker-pdf pymupdf4llm
# Image processing (choose one)
npm install sharp
# OR
brew install imagemagick # macOS
apt-get install imagemagick # Ubuntu
# Document conversion
brew install pandoc # macOS
apt-get install pandoc # Ubuntu
# Archive tools (usually pre-installed)
# zip, unzip, tar, gzip
Usage
MCP Configuration
Add to your MCP client configuration:
{
  "mcpServers": {
    "file-converter": {
      "command": "file-converter-mcp",
      "args": []
    }
  }
}
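If the client configuration already lists other servers, the entry has to be merged into the existing `mcpServers` object rather than replacing it. A minimal sketch, assuming the configuration is plain JSON with a top-level `mcpServers` object; the `addServer` helper and the type names are hypothetical and not part of this package:

```typescript
// Shape of one server entry, mirroring the JSON configuration.
interface ServerEntry {
  command: string;
  args: string[];
}

// Shape of the client configuration (only the part relevant here).
interface McpClientConfig {
  mcpServers: Record<string, ServerEntry>;
}

// Hypothetical helper: returns a new config with the server added,
// leaving the input object and its existing entries untouched.
function addServer(
  config: McpClientConfig,
  serverName: string,
  entry: ServerEntry
): McpClientConfig {
  return { mcpServers: { ...config.mcpServers, [serverName]: entry } };
}

const existingConfig: McpClientConfig = {
  mcpServers: { "other-server": { command: "other-server", args: [] } },
};
const updatedConfig = addServer(existingConfig, "file-converter", {
  command: "file-converter-mcp",
  args: [],
});
console.log(JSON.stringify(updatedConfig, null, 2));
```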
Available Tools
PDF Conversion
- convert_pdf_to_markdown: Convert PDF files to Markdown
- extract_pdf_text: Extract plain text from PDF files
- extract_pdf_images: Extract images from PDF files
Image Conversion
- convert_image_format: Convert between image formats
- resize_image: Resize images with quality options
- compress_image: Reduce image file size
Document Conversion
- convert_document: Convert between document formats using Pandoc
- extract_document_text: Extract text from various document formats
- convert_markdown_to_html: Convert Markdown to HTML with styling
Spreadsheet Conversion
- convert_csv_to_json: Convert CSV data to JSON format
- convert_json_to_csv: Convert JSON data to CSV format
- convert_xlsx_to_csv: Extract CSV data from Excel files
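To make the CSV-to-JSON direction concrete, here is a minimal sketch of the kind of transformation `convert_csv_to_json` describes. It is not the server's implementation: the `parseCsvLine` and `csvToJson` names are hypothetical, and the sketch assumes a header row, comma delimiters, double-quoted fields, and no line breaks inside quoted fields.

```typescript
// Split one CSV line into fields, honoring double-quoted fields
// and the "" escape for a literal quote inside them.
function parseCsvLine(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inQuotes) {
      if (ch === '"' && line.charAt(i + 1) === '"') {
        current += '"'; // escaped quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ",") {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Convert CSV text into an array of row objects keyed by header name.
function csvToJson(csv: string): Array<Record<string, string>> {
  const lines = csv.split(/\r?\n/).filter((l) => l.length > 0);
  const headerLine = lines.shift();
  if (headerLine === undefined) return [];
  const headers = parseCsvLine(headerLine);
  return lines.map((line) => {
    const values = parseCsvLine(line);
    const row: Record<string, string> = {};
    headers.forEach((header, i) => {
      row[header] = values[i] ?? ""; // missing trailing cells become ""
    });
    return row;
  });
}

const csvRows = csvToJson('name,note\nAda,"likes ""CSV"", a lot"\nBob,');
console.log(JSON.stringify(csvRows, null, 2));
```

All values stay strings in this sketch; whether the real tool infers numbers or booleans is not stated in the source.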
Archive Operations
- create_archive: Create ZIP or TAR archives from files/folders
- extract_archive: Extract contents from archive files
- list_archive_contents: List files in archive without extracting
Utility Tools
- detect_file_type: Identify file format and encoding
- validate_conversion: Check if conversion is supported
- batch_convert: Convert multiple files in one operation
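Format detection of the kind `detect_file_type` describes is commonly done by inspecting a file's leading "magic" bytes rather than trusting its extension. A minimal sketch under that assumption; the `sniffFileType` function and its return labels are hypothetical, and the source does not say how the server performs detection:

```typescript
// Well-known leading byte signatures for a few of the supported formats.
const SIGNATURES: Array<{ type: string; bytes: number[] }> = [
  { type: "pdf", bytes: [0x25, 0x50, 0x44, 0x46] }, // "%PDF"
  { type: "png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { type: "jpg", bytes: [0xff, 0xd8, 0xff] },
  { type: "gif", bytes: [0x47, 0x49, 0x46, 0x38] }, // "GIF8"
  { type: "zip", bytes: [0x50, 0x4b, 0x03, 0x04] }, // also the container for DOCX and XLSX
  { type: "gz", bytes: [0x1f, 0x8b] },
];

// Return a format label for the given leading bytes, or undefined if
// none of the known signatures match.
function sniffFileType(header: Uint8Array): string | undefined {
  for (const sig of SIGNATURES) {
    if (header.length < sig.bytes.length) continue;
    const matches = sig.bytes.every((b, i) => header[i] === b);
    if (matches) return sig.type;
  }
  return undefined;
}

// The first bytes of a PDF file spell "%PDF-1.7".
const pdfHeader = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x37]);
console.log(sniffFileType(pdfHeader));
```

Plain-text formats such as CSV or Markdown have no signature, so a real detector would still need an extension or content heuristic for those.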
Examples
Basic PDF Conversion
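// Convert PDF to Markdown.
// Assumes `client` is a connected Client from the MCP TypeScript SDK,
// whose callTool takes a single { name, arguments } object.
await client.callTool({
  name: "convert_pdf_to_markdown",
  arguments: {
    input_path: "/path/to/document.pdf",
    output_path: "/path/to/output.md",
    options: {
      engine: "marker",
      preserve_formatting: true
    }
  }
});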
Image Format Conversion
// Convert PNG to WebP with compression.
// Assumes `client` is a connected Client from the MCP TypeScript SDK.
await client.callTool({
  name: "convert_image_format",
  arguments: {
    input_path: "/path/to/image.png",
    output_path: "/path/to/image.webp",
    options: {
      quality: 80,
      format: "webp"
    }
  }
});
Document Conversion
// Convert DOCX to Markdown using Pandoc.
// Assumes `client` is a connected Client from the MCP TypeScript SDK.
await client.callTool({
  name: "convert_document",
  arguments: {
    input_path: "/path/to/document.docx",
    output_path: "/path/to/document.md",
    options: {
      format: "markdown",
      preserve_styles: false
    }
  }
});
Batch Operations
// Convert multiple files at once.
// Assumes `client` is a connected Client from the MCP TypeScript SDK.
await client.callTool({
  name: "batch_convert",
  arguments: {
    input_directory: "/path/to/input/",
    output_directory: "/path/to/output/",
    conversions: [
      { from: "pdf", to: "markdown" },
      { from: "png", to: "webp" },
      { from: "docx", to: "txt" }
    ]
  }
});
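One way to picture how the `conversions` rules apply to a directory: each file is matched by extension against the `from` field and, if a rule matches, mapped to an output path with the `to` extension. The sketch below is an assumption about that matching, not the server's documented behavior; `planBatch` and `ConversionRule` are hypothetical names, and treating `markdown` as the `.md` extension is likewise assumed.

```typescript
interface ConversionRule {
  from: string;
  to: string;
}

interface PlannedConversion {
  input: string;
  output: string;
}

// Assumed mapping from format names to file extensions.
const EXTENSION_FOR: Record<string, string> = { markdown: "md" };

// Pair each file that has a matching rule with its output path.
function planBatch(
  fileNames: string[],
  outputDirectory: string,
  rules: ConversionRule[]
): PlannedConversion[] {
  const planned: PlannedConversion[] = [];
  for (const fileName of fileNames) {
    const dot = fileName.lastIndexOf(".");
    if (dot <= 0) continue; // no extension, or a dotfile
    const ext = fileName.slice(dot + 1).toLowerCase();
    const rule = rules.find((r) => r.from === ext);
    if (rule === undefined) continue; // no rule for this file type
    const outExt = EXTENSION_FOR[rule.to] ?? rule.to;
    planned.push({
      input: fileName,
      output: outputDirectory + fileName.slice(0, dot) + "." + outExt,
    });
  }
  return planned;
}

const batchPlan = planBatch(
  ["report.pdf", "logo.png", "notes.docx", "README"],
  "/path/to/output/",
  [
    { from: "pdf", to: "markdown" },
    { from: "png", to: "webp" },
    { from: "docx", to: "txt" },
  ]
);
console.log(batchPlan);
```

Files without a matching rule (here `README`) are skipped rather than treated as errors; the real tool may report them differently.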
Configuration Options
Conversion Settings
interface ConversionOptions {
  engine?: string;                      // Conversion engine to use
  quality?: number;                     // Output quality (1-100)
  preserve_formatting?: boolean;        // Maintain original formatting
  output_format?: string;               // Specific output format
  compression_level?: number;           // Compression level (0-9)
  custom_options?: Record<string, any>; // Engine-specific options
}
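The numeric ranges in the comments are not enforced by the type system, so a caller may want to check them before sending a request. A minimal sketch, assuming the documented ranges are inclusive; `validateOptions` is a hypothetical helper, and the interface is repeated (trimmed to the numeric fields) only so the block is self-contained:

```typescript
// Trimmed copy of the numeric fields of ConversionOptions.
interface NumericConversionOptions {
  quality?: number; // 1-100
  compression_level?: number; // 0-9
}

// Returns a list of human-readable problems; an empty list means valid.
function validateOptions(options: NumericConversionOptions): string[] {
  const problems: string[] = [];
  const { quality, compression_level } = options;
  // The negated range checks also reject NaN.
  if (quality !== undefined && !(quality >= 1 && quality <= 100)) {
    problems.push("quality must be between 1 and 100");
  }
  if (
    compression_level !== undefined &&
    !(compression_level >= 0 && compression_level <= 9)
  ) {
    problems.push("compression_level must be between 0 and 9");
  }
  return problems;
}

console.log(validateOptions({ quality: 80, compression_level: 6 }));
console.log(validateOptions({ quality: 101, compression_level: 10 }));
```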
Supported File Types
Input Formats
- Documents: PDF, DOCX, DOC, RTF, TXT, HTML, XML
- Images: PNG, JPG, JPEG, WebP, GIF, BMP, TIFF, SVG
- Spreadsheets: CSV, XLSX, XLS, JSON, TSV
- Archives: ZIP, TAR, GZ, 7Z, RAR (extract only)
- Code: Various programming language files
Output Formats
- Text: Markdown, HTML, TXT, RTF
- Images: PNG, JPG, WebP, GIF, BMP
- Data: JSON, CSV, XML, YAML
- Archives: ZIP, TAR, GZ
Performance Considerations
- Memory Usage: Large files are processed in chunks to prevent memory issues
- Processing Speed: Different engines have different speed/quality tradeoffs
- Batch Processing: Converting many files in a single batch_convert call is more efficient than issuing one call per file
- Caching: Converted files can be cached to avoid re-processing
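The caching point can be illustrated with a small in-memory cache keyed on everything that affects the output. This is a sketch under assumptions, not the server's mechanism: the `ConversionCache` class is hypothetical, and it assumes that the input path, the file's modification time, and the serialized options are enough to identify a result.

```typescript
// Hypothetical in-memory cache for conversion results.
class ConversionCache {
  private entries = new Map<string, string>();

  // The key covers everything assumed to affect the output. JSON.stringify
  // is sensitive to property order, so options should be built consistently.
  private static keyFor(path: string, mtimeMs: number, options: object): string {
    return JSON.stringify([path, mtimeMs, options]);
  }

  // Return the cached output, or run `convert` once and remember its result.
  getOrConvert(
    path: string,
    mtimeMs: number,
    options: object,
    convert: () => string
  ): string {
    const key = ConversionCache.keyFor(path, mtimeMs, options);
    const cached = this.entries.get(key);
    if (cached !== undefined) return cached;
    const output = convert();
    this.entries.set(key, output);
    return output;
  }
}

// Stand-in for a real conversion; counts how often it actually runs.
let conversionCalls = 0;
const fakeConvert = (): string => {
  conversionCalls++;
  return "# Converted";
};

const conversionCache = new ConversionCache();
const pdfOptions = { engine: "marker" };
conversionCache.getOrConvert("/path/to/document.pdf", 1000, pdfOptions, fakeConvert);
conversionCache.getOrConvert("/path/to/document.pdf", 1000, pdfOptions, fakeConvert); // served from cache
conversionCache.getOrConvert("/path/to/document.pdf", 2000, pdfOptions, fakeConvert); // file changed, converts again
console.log(conversionCalls); // 2
```

Including the modification time in the key means an edited input is converted again instead of returning a stale result.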
Error Handling
The server provides comprehensive error handling:
- Input file validation and format detection
- Graceful fallback between conversion engines
- Detailed error messages with suggested solutions
- Progress tracking for long-running conversions
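The fallback behavior can be sketched as trying each engine in order and moving on when one throws. The `convertWithFallback` function and the `Engine` type are hypothetical, and the stand-in engines below only simulate success and failure; the source does not document the order in which the server tries engines.

```typescript
interface Engine {
  name: string;
  convert: (inputPath: string) => string;
}

interface FallbackResult {
  engine: string; // engine that produced the output
  output: string;
  failures: string[]; // messages from engines that failed first
}

// Try each engine in order; return the first success along with the
// errors collected from earlier engines. Throws if every engine fails.
function convertWithFallback(inputPath: string, engines: Engine[]): FallbackResult {
  const failures: string[] = [];
  for (const engine of engines) {
    try {
      const output = engine.convert(inputPath);
      return { engine: engine.name, output, failures };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      failures.push(engine.name + ": " + message);
    }
  }
  throw new Error("All engines failed: " + failures.join("; "));
}

// Simulated engines: the first always fails, the second succeeds.
const fallbackResult = convertWithFallback("/path/to/document.pdf", [
  {
    name: "marker",
    convert: () => {
      throw new Error("marker is not installed");
    },
  },
  { name: "pymupdf4llm", convert: (p) => "# Converted from " + p },
]);
console.log(fallbackResult.engine, fallbackResult.failures);
```

Keeping the earlier failures in the result lets the caller surface a detailed message even when a later engine succeeds.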
Development
# Clone repository
git clone https://github.com/cordlesssteve/file-converter-mcp.git
cd file-converter-mcp
# Install dependencies
npm install
# Build project
npm run build
# Run development mode
npm run dev
# Run tests
npm test
Contributing
- Fork the repository
- Create a feature branch
- Add support for new file formats or conversion engines
- Add tests for new functionality
- Submit a pull request
License
MIT License - see file for details.