svngoku/mcp-server-mistral-ocr
A very simple Python template for building MCP servers using Streamable HTTP transport.
MarkThat OCR MCP Server
A powerful MCP server for converting images and PDFs to Markdown using state-of-the-art multimodal LLMs. Now with support for both local files and URLs!
Overview
This MCP server provides OCR capabilities through multimodal language models, converting images and PDFs to well-formatted Markdown. It supports multiple providers including OpenAI, Anthropic, Google Gemini, Mistral, and OpenRouter.
Prerequisites
Installation
- Clone the repository:
git clone git@github.com:alpic-ai/mcp-server-template-python.git
cd mcp-server-template-python
- Install the Python version and dependencies:
uv python install
uv sync --locked
Features
- ✅ URL Support: Process files directly from URLs without manual downloading
- ✅ Multi-Provider Support: OpenAI, Anthropic, Gemini, Mistral, OpenRouter
- ✅ Advanced Figure Extraction: Extract and process figures from PDFs
- ✅ Image Description Generation: Generate detailed descriptions for accessibility
- ✅ Async Processing: Fast, non-blocking file processing
- ✅ Automatic Cleanup: Temporary files are cleaned up automatically
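The async processing and automatic cleanup features can be illustrated with a minimal sketch. This is not the server's actual implementation: `process_bytes` is a hypothetical helper, and the "processing" step is a stand-in that only reads the file size. It assumes input bytes are written to a temporary directory that is removed as soon as processing finishes.

```python
import asyncio
import tempfile
from pathlib import Path


async def process_bytes(data: bytes, suffix: str = ".pdf") -> tuple[int, Path]:
    """Write data to a temporary file, 'process' it, and return (size, path)."""
    # TemporaryDirectory deletes the directory and its contents on exit,
    # even if the processing step raises.
    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp_path = Path(tmp_dir) / f"input{suffix}"
        # Run the blocking write in a worker thread so the event loop stays free.
        await asyncio.to_thread(tmp_path.write_bytes, data)
        size = tmp_path.stat().st_size  # stand-in for the real OCR step
    return size, tmp_path


size, path = asyncio.run(process_bytes(b"%PDF-1.4 example"))
print(size, path.exists())
```

Once the `with` block exits, the temporary file no longer exists on disk, so callers do not need to clean up after themselves.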
Usage
Start the server on port 3000:
uv run main.py
Running the Inspector
Requirements
- Node.js: ^22.7.5
Quick Start (UI mode)
To get up and running right away with the UI, just execute the following:
npx @modelcontextprotocol/inspector
The inspector server will start up and the UI will be accessible at http://localhost:6274.
You can test your server locally by selecting:
- Transport Type: Streamable HTTP
- URL: http://127.0.0.1:3000/mcp
Available Tools
1. Convert Image/PDF to Markdown
Converts images or PDFs to Markdown. Supports both local files and URLs.
Example with URL:
{
  "file_path": "https://example.com/document.pdf",
  "model": "gemini-2.5-flash"
}
Example with local file:
{
  "file_path": "/path/to/local/document.pdf",
  "model": "gpt-4o"
}
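Since `file_path` accepts either a URL or a local path, something has to tell the two apart. The sketch below shows one plausible way to do that with the standard library. `classify_source` is a hypothetical name, and the rule that only `http` and `https` count as URLs is an assumption, not the server's documented behavior.

```python
from urllib.parse import urlparse


def classify_source(file_path: str) -> str:
    """Return 'url' for http(s) inputs and 'local' for everything else."""
    parsed = urlparse(file_path)
    # Require both a web scheme and a host, so Windows drive letters
    # such as "C:\\docs\\a.pdf" are not mistaken for URLs.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    return "local"


print(classify_source("https://example.com/document.pdf"))
print(classify_source("/path/to/local/document.pdf"))
```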
2. Advanced OCR with Figure Extraction
Extracts and processes figures from PDF documents.
{
  "file_path": "https://arxiv.org/pdf/2301.00001.pdf",
  "model": "claude-3-5-sonnet-20241022",
  "figure_detector_model": "gemini-2.5-flash",
  "coordinate_model": "gemini-2.5-flash",
  "parsing_model": "gemini-2.5-flash-lite"
}
3. Generate Image Description
Generates detailed descriptions of images for accessibility.
{
  "file_path": "https://example.com/image.jpg",
  "model": "gemini-2.5-flash",
  "additional_instructions": "Focus on colors and composition"
}
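The JSON objects above are the tool arguments. On the wire, an MCP client wraps them in a JSON-RPC 2.0 `tools/call` request. The sketch below shows that envelope only; `build_tool_call` is a hypothetical helper, and the tool name `generate_image_description` is an assumed placeholder, since the registered tool names are not listed here. In practice an MCP client library or the Inspector builds this request and handles session initialization.

```python
import json


def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> dict:
    """Wrap tool arguments in a JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = build_tool_call(
    "generate_image_description",  # assumed tool name, check the server's tool list
    {
        "file_path": "https://example.com/image.jpg",
        "model": "gemini-2.5-flash",
        "additional_instructions": "Focus on colors and composition",
    },
)
print(json.dumps(request, indent=2))
```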
Environment Variables
Set the appropriate API keys based on the models you plan to use:
export GEMINI_API_KEY="your_gemini_api_key"
export OPENAI_API_KEY="your_openai_api_key"
export ANTHROPIC_API_KEY="your_anthropic_api_key"
export MISTRAL_API_KEY="your_mistral_api_key"
export OPENROUTER_API_KEY="your_openrouter_api_key"
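A missing key usually surfaces only when a request fails, so it can help to check up front. The sketch below is an illustration, not part of the server: `required_env_var` and `missing_key` are hypothetical helpers, and the mapping from model-name prefix to variable is an assumption based on the model names used in the examples above. OpenRouter is left out because no OpenRouter model name appears in this document.

```python
import os
from typing import Mapping, Optional

# Assumed mapping from model-name prefix to the variables listed above.
PREFIX_TO_ENV = {
    "gemini": "GEMINI_API_KEY",
    "gpt": "OPENAI_API_KEY",
    "claude": "ANTHROPIC_API_KEY",
    "mistral": "MISTRAL_API_KEY",
}


def required_env_var(model: str) -> Optional[str]:
    """Return the variable a model is assumed to need, or None if unknown."""
    for prefix, env_var in PREFIX_TO_ENV.items():
        if model.lower().startswith(prefix):
            return env_var
    return None


def missing_key(model: str, environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the unset variable for this model, or None."""
    env = os.environ if environ is None else environ
    env_var = required_env_var(model)
    if env_var is not None and not env.get(env_var):
        return env_var
    return None


print(missing_key("gemini-2.5-flash", {}))
```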
Development
Adding New Tools
To add a new tool, modify main.py:
@mcp.tool(
    title="Your Tool Name",
    description="Tool Description for the LLM",
)
async def new_tool(
    tool_param1: str = Field(description="The description of the param1 for the LLM"),
    tool_param2: float = Field(description="The description of the param2 for the LLM"),
) -> str:
    """The new tool underlying method"""
    result = await some_api_call(tool_param1, tool_param2)
    return result
Adding New Resources
To add a new resource, modify main.py:
@mcp.resource(
    uri="your-scheme://{param1}/{param2}",
    description="Description of what this resource provides",
    name="Your Resource Name",
)
def your_resource(param1: str, param2: str) -> str:
    """The resource template implementation"""
    # Your resource logic here
    return f"Resource content for {param1} and {param2}"
The URI template uses {param_name} syntax to define parameters that will be extracted from the resource URI and passed to your function.
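To make the extraction step concrete, here is a minimal sketch of how a `{param_name}` template can be matched against a concrete URI. It is an illustration only and not necessarily how the MCP SDK implements it. `match_uri_template` is a hypothetical helper, and it assumes each parameter matches a single path segment with no `/` in it.

```python
import re
from typing import Dict, Optional


def match_uri_template(template: str, uri: str) -> Optional[Dict[str, str]]:
    """Extract {param} values from uri, or return None if it does not match."""
    # re.split with a capture group alternates: literal, name, literal, name, ...
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)  # literal text between parameters
        else:
            pattern += f"(?P<{part}>[^/]+)"  # one named group per parameter
    match = re.fullmatch(pattern, uri)
    return match.groupdict() if match else None


params = match_uri_template("your-scheme://{param1}/{param2}", "your-scheme://reports/2024")
print(params)
```

The resulting dictionary keys line up with the function's parameter names, which is why the names in the template and in the function signature need to agree.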
Adding New Prompts
To add a new prompt, modify main.py:
@mcp.prompt("")
async def your_prompt(
    prompt_param: str = Field(description="The description of the param for the user"),
) -> str:
    """Generate a helpful prompt"""
    return f"You are a friendly assistant, help the user and don't forget to {prompt_param}."