digitalcyphrpnk/markitdown-mcp
MarkItDown MCP Server
An MCP (Model Context Protocol) server that provides document conversion to Markdown using MarkItDown.
Features
- Multi-Format Support: Convert PDF, Word, Excel, PowerPoint, images, audio, HTML, and more
- Batch Processing: Convert multiple files simultaneously
- Azure AI Integration: Enhanced document processing with Azure Document Intelligence
- OCR Capabilities: Extract text from images
- Audio Transcription: Convert speech to text
- Web Content: Convert live web pages to Markdown
- LLM Enhancement: Use LLMs for image descriptions and content analysis
Tools Available
convert_file_to_markdown
Convert various file formats to Markdown.
Parameters:
- file_path (string): Path to the file to convert
- output_path (string, optional): Output path for the converted file
- use_azure_ai (boolean, optional): Use Azure Document Intelligence
- azure_endpoint (string, optional): Azure Document Intelligence endpoint
- llm_client (string, optional): LLM client for image descriptions
- llm_model (string, optional): LLM model for image descriptions
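As a rough illustration of how a client might invoke this tool, the sketch below builds an MCP `tools/call` request in JSON-RPC 2.0 form. The helper name `build_tool_call`, the request id, and the example paths are assumptions made for illustration; only the tool name and parameter names come from this README.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical example values; file_path is the only parameter not marked optional.
request = build_tool_call(
    "convert_file_to_markdown",
    {
        "file_path": "/path/to/document.pdf",
        "output_path": "/path/to/document.md",
        "use_azure_ai": False,
    },
)

print(json.dumps(request, indent=2))
```

In practice an MCP client such as OpenCode or Claude Desktop assembles this request itself; the sketch is only meant to show how the parameters map onto the call.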
convert_url_to_markdown
Convert web content to Markdown.
Parameters:
- url (string): URL to convert to Markdown
- save_to_file (string, optional): Path to save the converted content
batch_convert_files
Convert multiple files to Markdown in batch.
Parameters:
- file_paths (array): List of file paths to convert
- output_dir (string, optional): Directory to save converted files
- file_formats (array, optional): Filter by specific file formats
- use_azure_ai (boolean, optional): Use Azure Document Intelligence
- azure_endpoint (string, optional): Azure Document Intelligence endpoint
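The `file_formats` filter can be pictured as an extension check applied before conversion. The following is a minimal sketch under that assumption, not the server's actual implementation: `filter_by_format` is a hypothetical helper, and it assumes formats are given as extensions with or without a leading dot.

```python
from pathlib import Path
from typing import List, Optional


def filter_by_format(file_paths: List[str], file_formats: Optional[List[str]] = None) -> List[str]:
    """Keep only paths whose extension is in file_formats (case-insensitive)."""
    if not file_formats:
        # No filter given: every path is kept.
        return list(file_paths)
    # Normalise "pdf" and ".PDF" to the same form, ".pdf".
    wanted = {"." + fmt.lower().lstrip(".") for fmt in file_formats}
    return [p for p in file_paths if Path(p).suffix.lower() in wanted]


paths = ["/documents/a.pdf", "/documents/b.DOCX", "/documents/c.png"]
print(filter_by_format(paths, ["pdf", ".docx"]))
```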
extract_document_metadata
Extract metadata from document without full conversion.
Parameters:
file_path(string): Path to document file
convert_clipboard_content
Convert clipboard or pasted content to Markdown.
Parameters:
- content_type (string): Type of content ("url", "html", "text")
- content (string): The actual content to convert
- save_to (string, optional): Path to save the converted content
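Because `content_type` accepts one of three string values, a caller may want to validate arguments before sending them. The sketch below shows one way to do that; `build_clipboard_arguments` is a hypothetical client-side helper and is not part of the server.

```python
from typing import Optional

# The three values listed for content_type in this README.
ALLOWED_CONTENT_TYPES = ("url", "html", "text")


def build_clipboard_arguments(content_type: str, content: str, save_to: Optional[str] = None) -> dict:
    """Validate and assemble arguments for convert_clipboard_content."""
    if content_type not in ALLOWED_CONTENT_TYPES:
        raise ValueError(
            f"content_type must be one of {ALLOWED_CONTENT_TYPES}, got {content_type!r}"
        )
    arguments = {"content_type": content_type, "content": content}
    if save_to is not None:
        # Optional parameter: left out of the payload when unset.
        arguments["save_to"] = save_to
    return arguments


print(build_clipboard_arguments("html", "<h1>Title</h1><p>Body text</p>"))
```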
get_supported_formats
Get list of all supported file formats for conversion.
Supported Formats
Documents
- PDF (.pdf) - Portable Document Format
- Word (.docx) - Microsoft Word documents
- PowerPoint (.pptx) - Microsoft PowerPoint presentations
- Excel (.xlsx, .xls) - Microsoft Excel spreadsheets
Web Content
- HTML (.html, .htm) - Web pages
- URLs - Live web content
Data Formats
- CSV (.csv) - Comma-separated values
- JSON (.json) - JavaScript Object Notation
- XML (.xml) - Extensible Markup Language
Images (with OCR)
- JPEG (.jpg, .jpeg)
- PNG (.png)
- GIF (.gif)
- BMP (.bmp)
- TIFF (.tiff)
Audio (with transcription)
- WAV (.wav)
- MP3 (.mp3)
Archives
- ZIP (.zip) - Extracts and converts contents
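For scripts that pre-sort files before handing them to the server, the extensions listed above can be encoded as a lookup table. This is only a client-side sketch built from this README; the authoritative list is whatever `get_supported_formats` returns, and `category_for` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the Supported Formats list in this README.
SUPPORTED_FORMATS = {
    "Documents": [".pdf", ".docx", ".pptx", ".xlsx", ".xls"],
    "Web Content": [".html", ".htm"],
    "Data Formats": [".csv", ".json", ".xml"],
    "Images": [".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tiff"],
    "Audio": [".wav", ".mp3"],
    "Archives": [".zip"],
}

# Invert the table so a single extension maps to its category.
EXTENSION_TO_CATEGORY = {
    ext: category for category, exts in SUPPORTED_FORMATS.items() for ext in exts
}


def category_for(path: str) -> Optional[str]:
    """Return the format category for a path, or None if the extension is not listed."""
    return EXTENSION_TO_CATEGORY.get(Path(path).suffix.lower())


for name in ("report.PDF", "talk.pptx", "notes.txt"):
    print(name, "->", category_for(name))
```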
Installation
Using uv (recommended)
# Clone the repository
git clone https://github.com/digitalcyphrpnk/markitdown-mcp.git
cd markitdown-mcp
# Set up a virtual environment with uv
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
uv pip install -e .
Using pip
pip install markitdown-mcp
Configuration
OpenCode Configuration
Add to your opencode.json:
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "markitdown": {
      "type": "local",
      "command": ["markitdown-mcp"],
      "enabled": true,
      "environment": {
        "AZURE_DOC_INTEL_ENDPOINT": "${AZURE_DOC_INTEL_ENDPOINT}",
        "OPENAI_API_KEY": "${OPENAI_API_KEY}"
      }
    }
  }
}
Claude Desktop Configuration
Add to your Claude Desktop configuration:
{
  "mcpServers": {
    "markitdown": {
      "command": "markitdown-mcp",
      "env": {
        "AZURE_DOC_INTEL_ENDPOINT": "your_azure_endpoint_here",
        "OPENAI_API_KEY": "your_openai_key_here"
      }
    }
  }
}
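To avoid pasting placeholder values by hand, the `mcpServers` entry can be generated from the current environment. The sketch below is one hypothetical way to do this; `build_claude_config` is an illustrative helper, and its output still has to be merged into the Claude Desktop configuration file manually.

```python
import json
import os
from typing import Mapping


def build_claude_config(env: Mapping[str, str]) -> dict:
    """Build the mcpServers entry, passing through only the variables that are set."""
    names = ("AZURE_DOC_INTEL_ENDPOINT", "OPENAI_API_KEY")
    server_env = {name: env[name] for name in names if env.get(name)}
    return {
        "mcpServers": {
            "markitdown": {"command": "markitdown-mcp", "env": server_env}
        }
    }


# Note: this prints real key values if they are set, so avoid running it in shared logs.
print(json.dumps(build_claude_config(os.environ), indent=2))
```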
Usage Examples
Basic Document Conversion
Convert this PDF file to Markdown: /path/to/document.pdf
Batch Processing
Convert all PDF files in the /documents folder to Markdown
Web Content Conversion
Convert the content from https://example.com to Markdown
Enhanced Processing
Convert this document using Azure Document Intelligence for better accuracy
Audio Transcription
Convert this audio file to text: /path/to/audio.wav
Image OCR
Extract text from this image: /path/to/image.png
Environment Variables
- AZURE_DOC_INTEL_ENDPOINT: Azure Document Intelligence endpoint for enhanced processing
- OPENAI_API_KEY: OpenAI API key for LLM-powered image descriptions
- ANTHROPIC_API_KEY: Anthropic API key for Claude-powered features
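Before starting the server, it can help to check which of these variables are present. The following is a small sketch for that purpose; `configured_features` is a hypothetical helper, and the assumption that an unset variable simply leaves the related feature unavailable is not confirmed by this README.

```python
import os
from typing import Mapping

# Variable names and descriptions taken from the Environment Variables list.
OPTIONAL_VARIABLES = {
    "AZURE_DOC_INTEL_ENDPOINT": "Azure Document Intelligence processing",
    "OPENAI_API_KEY": "LLM-powered image descriptions",
    "ANTHROPIC_API_KEY": "Claude-powered features",
}


def configured_features(env: Mapping[str, str]) -> dict:
    """Map each variable name to True if it is set to a non-empty value."""
    return {name: bool(env.get(name)) for name in OPTIONAL_VARIABLES}


for name, is_set in configured_features(os.environ).items():
    status = "set" if is_set else "not set"
    print(f"{name}: {status} ({OPTIONAL_VARIABLES[name]})")
```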
Development
Setup Development Environment
# Clone and set up
git clone https://github.com/digitalcyphrpnk/markitdown-mcp.git
cd markitdown-mcp
# Install with development dependencies
uv pip install -e ".[dev]"
# Run tests
pytest
# Format and lint code
black src tests
ruff check src tests
# Type checking
mypy src
Testing
# Run all tests
pytest
# Test specific functionality
pytest tests/test_conversion.py
# Test with different file formats
pytest tests/test_formats.py
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Run the test suite
- Submit a pull request
License
MIT License - see the repository's license file for details.
Related Projects
- MarkItDown - The underlying document conversion tool
- Model Context Protocol - The protocol specification
- OpenCode - AI coding agent that supports MCP servers