raahulrawat/docsynthai
DocSynthAI is an open-source Intelligent Document Processing (IDP) engine powered by the Model Context Protocol (MCP).
DocSynthAI – Intelligent Document Processing MCP Server
A modular, extensible, next-generation document understanding engine powered by MCP (Model Context Protocol) and Gemini Vision.
Quick Overview
DocSynthAI is a modular document understanding platform built on MCP. It provides:
- AI-powered document classification (Gemini Vision integration)
- Rule-based & general LLM classification modes
- STDIO MCP server and async STDIO client for local/dev integration
- Roadmap: Extraction → Validation → Knowledge Graph creation → HTTP/SSE transport
Install & Setup
1. Clone
git clone https://github.com/raahulrawat/docsynthai.git
2. Python packages
Install dependencies (recommended to use a virtualenv):
python -m venv .venv
Run — MCP Server (STDIO) — Current
This starts the MCP server in STDIO mode (default/current). Clients connect over stdio pipes.
Start server (local)
python server.py
Start server (explicit stdio mode)
DOCSYNTH_TRY_HTTP=0 python server.py
By default the server will load rules from classifier_rules.json if present
and persist rules to that file. The server exposes the following MCP tools:
- setup_classifier
- create_rule
- get_all_rules
- delete_rule
- classify_document
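The load-if-present, persist-on-change behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names `load_rules` and `save_rules` and the rule schema in the usage note are hypothetical; only the file name `classifier_rules.json` comes from this README.

```python
import json
from pathlib import Path

RULES_FILE = Path("classifier_rules.json")  # file name taken from the README


def load_rules(path: Path = RULES_FILE) -> dict:
    """Return the saved rules, or an empty dict when the file is absent."""
    if not path.exists():
        return {}
    with path.open("r", encoding="utf-8") as fh:
        return json.load(fh)


def save_rules(rules: dict, path: Path = RULES_FILE) -> None:
    """Persist rules by writing a temp file and renaming it over the target."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    with tmp.open("w", encoding="utf-8") as fh:
        json.dump(rules, fh, indent=2, ensure_ascii=False)
    tmp.replace(path)  # a crash mid-write leaves the old file intact
```

For example, `save_rules({"invoice": {"keywords": ["invoice"]}})` followed by `load_rules()` would return the same dictionary after a restart. The rule shape shown here is invented for illustration.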
Running the STDIO client demo
python mcp_stdio_client.py
The demo launches the server as a subprocess, performs MCP initialization, prompts for an API key, and lets you classify a local image file.
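To make the demo's steps concrete, here is a sketch of the newline-delimited JSON-RPC 2.0 messages an MCP client sends over stdio: an `initialize` request, the `notifications/initialized` notification, then a `tools/call` request. The protocol version string, the client name, and the `image_path` argument are assumptions for illustration. The real argument names of `classify_document` are not documented in this README.

```python
import json
from itertools import count
from typing import List, Optional


def build_messages(image_path: str) -> List[str]:
    """Return the stdio lines for handshake plus one classify_document call."""
    ids = count(1)  # JSON-RPC request ids, unique per session

    def request(method: str, params: Optional[dict] = None) -> str:
        msg = {"jsonrpc": "2.0", "id": next(ids), "method": method}
        if params is not None:
            msg["params"] = params
        return json.dumps(msg) + "\n"  # one message per line

    def notification(method: str) -> str:
        # Notifications carry no "id" and expect no reply.
        return json.dumps({"jsonrpc": "2.0", "method": method}) + "\n"

    return [
        request("initialize", {
            "protocolVersion": "2024-11-05",  # assumed protocol revision
            "capabilities": {},
            "clientInfo": {"name": "docsynth-demo-client", "version": "0.1.0"},
        }),
        notification("notifications/initialized"),
        request("tools/call", {
            "name": "classify_document",
            "arguments": {"image_path": image_path},  # hypothetical argument
        }),
    ]
```

In practice an MCP client library would build and send these messages for you; the sketch only shows what travels over the pipe.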
Run — HTTP & SSE (Next Release)
Planned in the next release:
- HTTP Transport: mcp.run(transport="http", host="0.0.0.0", port=8000) for REST-like access to tools
- SSE Transport: streaming support for long-running/extraction tasks
When HTTP is enabled you will be able to run:
DOCSYNTH_TRY_HTTP=1 python server.py # will detect DOCSYNTH_TRY_HTTP=1 and bind to host/port
Client libraries will be updated to support HTTP tool discovery and SSE streaming.
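Since HTTP mode is not released yet, the following is only a guess at how the environment-driven switch might work. The function name `resolve_transport` is hypothetical, and the defaults (`0.0.0.0`, `8000`) are borrowed from the planned `mcp.run(...)` call above rather than from any shipped code.

```python
import os
from typing import Mapping


def resolve_transport(env: Mapping[str, str] = os.environ) -> dict:
    """Pick stdio unless DOCSYNTH_TRY_HTTP=1 is set in the environment."""
    if env.get("DOCSYNTH_TRY_HTTP", "0") == "1":
        return {
            "transport": "http",
            "host": env.get("DOCSYNTH_HOST", "0.0.0.0"),  # assumed default
            "port": int(env.get("DOCSYNTH_PORT", "8000")),  # assumed default
        }
    return {"transport": "stdio"}
```

The resulting dictionary could then be passed as keyword arguments to the server's run call, assuming that call accepts `transport`, `host`, and `port` as shown in the planned snippet.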
Running MCP Server JSON (STDIO config)
Use this sample JSON for external orchestrators or MCP host configs (e.g., Cursor / IDE tool integrations):
{
  "mcpServers": {
    "docsynth": {
      "command": "python",
      "args": [
        "server.py"
      ],
      "transport": {
        "type": "stdio"
      },
      "env": {}
    }
  }
}
Save as .mcp/docsynth-mcp.json or include in your MCP host configuration. This tells an MCP host to spawn server.py and connect via stdio.
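If the MCP host spawns the server from a different working directory, a relative `server.py` may not resolve. As a sketch under that assumption, the config can be generated with absolute paths. The helper names below are hypothetical; the JSON shape mirrors the sample above.

```python
import json
import sys
from pathlib import Path


def build_host_config(server_script: Path, python: str = sys.executable) -> dict:
    """Return the sample host config with an absolute path to the server script."""
    return {
        "mcpServers": {
            "docsynth": {
                "command": python,
                "args": [str(server_script.resolve())],
                "transport": {"type": "stdio"},
                "env": {},
            }
        }
    }


def write_host_config(config: dict, target: Path = Path(".mcp/docsynth-mcp.json")) -> Path:
    """Write the config as indented JSON, creating the parent directory."""
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return target
```

Calling `write_host_config(build_host_config(Path("server.py")))` from the repository root would produce the file named above.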
Classification Roadmap (current support)
Core pipeline stages we implement or plan to implement — each becomes an MCP tool.
Stage 1 — Classification (current)
- Rule-based classification (user-defined rules)
- General LLM classification (Gemini Vision)
- Single-image & batch classification
- Strict JSON response format for downstream parsing
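A strict JSON response format pays off when the consumer rejects anything that deviates from it. The sketch below shows one way a downstream parser might enforce that. The field names `document_type` and `confidence` are assumptions; this README does not document the actual response schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    document_type: str
    confidence: float


def parse_classification(raw: str) -> Classification:
    """Parse a model reply, raising ValueError on anything but the exact schema."""
    data = json.loads(raw)  # JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {"document_type", "confidence"}  # hypothetical field names
    if set(data) != expected:
        raise ValueError(f"unexpected or missing keys: {sorted(set(data) ^ expected)}")
    doc_type, conf = data["document_type"], data["confidence"]
    if not isinstance(doc_type, str) or not doc_type:
        raise ValueError("document_type must be a non-empty string")
    if isinstance(conf, bool) or not isinstance(conf, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0 <= conf <= 1:
        raise ValueError("confidence must be between 0 and 1")
    return Classification(doc_type, float(conf))
```

Replies wrapped in prose or markdown fail at `json.loads`, which is the point: the caller can retry or fall back instead of guessing.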
Stage 2 — Extraction (next)
- Key–Value pair extraction (KV)
- Table extraction → CSV/JSON
- Multi-page PDF → page images conversion (optional helper)
- Tool: extract_document
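Extraction is still on the roadmap, so the following only illustrates the "Table extraction → CSV/JSON" step, assuming a table has already been recovered as a header plus rows of strings. The function names are hypothetical and not part of `extract_document`.

```python
import csv
import io
import json
from typing import List, Sequence


def _check_rows(header: Sequence[str], rows: Sequence[Sequence[str]]) -> None:
    """Reject ragged tables so columns never shift silently."""
    for i, row in enumerate(rows):
        if len(row) != len(header):
            raise ValueError(f"row {i} has {len(row)} cells, expected {len(header)}")


def table_to_csv(header: Sequence[str], rows: Sequence[Sequence[str]]) -> str:
    """Serialize a table to CSV text, quoting cells only where needed."""
    _check_rows(header, rows)
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def table_to_records(header: Sequence[str], rows: Sequence[Sequence[str]]) -> List[dict]:
    """Convert a table to a list of row dicts, ready for json.dumps."""
    _check_rows(header, rows)
    return [dict(zip(header, row)) for row in rows]
```

`json.dumps(table_to_records(header, rows))` then yields the JSON form of the same table.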
Stage 3 — Validation
- Field-level validation (PAN/Aadhaar format, dates, totals)
- Cross-document validation (e.g., PAN ↔ Bank Statement)
- Rule-based & model-assisted validation
- Tool: validate_document
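As an illustration of field-level validation, the sketch below checks the PAN and Aadhaar formats and compares line items against a total. It is a format check only: it does not compute the Aadhaar Verhoeff checksum, and the function names and the 0.01 tolerance are assumptions rather than part of the planned `validate_document` tool.

```python
import re
from decimal import Decimal
from typing import Iterable

PAN_RE = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")  # 5 letters, 4 digits, 1 letter
AADHAAR_RE = re.compile(r"[2-9][0-9]{11}")     # 12 digits, not starting with 0 or 1


def is_pan_format(value: str) -> bool:
    """True if value looks like a PAN after trimming and upper-casing."""
    return PAN_RE.fullmatch(value.strip().upper()) is not None


def is_aadhaar_format(value: str) -> bool:
    """True if value is 12 digits (spaces or hyphens allowed as separators)."""
    digits = re.sub(r"[\s-]", "", value)
    return AADHAAR_RE.fullmatch(digits) is not None


def totals_match(line_items: Iterable[str], total: str,
                 tolerance: Decimal = Decimal("0.01")) -> bool:
    """True if the line items sum to the stated total within the tolerance."""
    summed = sum((Decimal(x) for x in line_items), Decimal("0"))
    return abs(summed - Decimal(total)) <= tolerance
```

`Decimal` is used instead of `float` so that amounts such as `0.10` compare exactly.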
Stage 4 — Knowledge Graph Creation
- Triplet extraction (subject, predicate, object)
- Ontology mapping & transformation
- Neo4j / Memgraph integrations
- Tools: kg_insert, kg_generate_triplets
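To show what a (subject, predicate, object) triplet might become on its way into a graph database, here is a sketch that turns one triplet into a parameterized Cypher `MERGE` statement. The `Entity` label, the `name` property, and the function name are invented for illustration. Nothing here reflects how `kg_insert` will actually be implemented.

```python
import re
from typing import NamedTuple, Tuple


class Triplet(NamedTuple):
    subject: str
    predicate: str
    obj: str


def to_cypher(t: Triplet) -> Tuple[str, dict]:
    """Return (query, params) that upserts both nodes and the relationship."""
    # Relationship types cannot be query parameters in Cypher, so the
    # predicate is normalized to a safe identifier instead of interpolated raw.
    rel = re.sub(r"[^A-Za-z0-9]+", "_", t.predicate).strip("_").upper()
    if not rel or rel[0].isdigit():
        raise ValueError(f"cannot derive a relationship type from {t.predicate!r}")
    query = (
        "MERGE (s:Entity {name: $subject}) "
        "MERGE (o:Entity {name: $object}) "
        f"MERGE (s)-[:{rel}]->(o)"
    )
    return query, {"subject": t.subject, "object": t.obj}
```

Node values travel as parameters, which avoids injection through document text; only the sanitized relationship type is embedded in the query string.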
Testing & Development Tips
- Use DocumentClassifier(mock_mode=True) for fast local tests without Gemini API calls.
- Persist rules in classifier_rules.json to re-use definitions across restarts.
- To test HTTP mode when implemented, set DOCSYNTH_TRY_HTTP=1 and pass DOCSYNTH_HOST/DOCSYNTH_PORT.
- Create pytest tests that launch the server subprocess via stdio and call the client (mock_mode recommended).
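The last tip, launching the server over stdio from a test, can be sketched with the standard library alone. To keep the example runnable without the repository, it talks to a tiny stand-in process (`FAKE_SERVER`) that echoes the request id. In a real test you would swap the command for `[sys.executable, "server.py"]`. The helper name and the stand-in are hypothetical, and whether the real server answers before exiting on end-of-input is an assumption to verify.

```python
import json
import subprocess
import sys

# Stand-in for server.py: reads one JSON-RPC line and replies with the same id.
FAKE_SERVER = (
    "import sys, json; "
    "req = json.loads(sys.stdin.readline()); "
    "print(json.dumps({'jsonrpc': '2.0', 'id': req['id'], 'result': {}}))"
)


def stdio_roundtrip(cmd, message: dict, timeout: float = 10.0) -> dict:
    """Spawn cmd, send one newline-delimited JSON message, parse the first reply line."""
    proc = subprocess.Popen(
        cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    try:
        out, _ = proc.communicate(json.dumps(message) + "\n", timeout=timeout)
    finally:
        if proc.poll() is None:  # still running, e.g. after a timeout
            proc.kill()
    return json.loads(out.splitlines()[0])


def test_initialize_roundtrip():
    reply = stdio_roundtrip(
        [sys.executable, "-c", FAKE_SERVER],
        {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}},
    )
    assert reply["jsonrpc"] == "2.0"
    assert reply["id"] == 1
```

pytest collects `test_initialize_roundtrip` automatically; the timeout keeps a hung server from stalling the suite.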
Contributing
PRs welcome. Suggested first issues:
- HTTP transport adapter & docs
- SSE streaming for long extraction jobs
- PDF→image helper & multi-page handling
- KG connector for Neo4j
Please follow the code style, add tests, and include changelog entries for breaking changes.