docsynthai

DocSynthAI is an open-source Intelligent Document Processing (IDP) engine powered by the Model Context Protocol (MCP).

DocSynthAI – Intelligent Document Processing MCP Server

A modular, extensible, next-generation document understanding engine powered by MCP (Model Context Protocol) and Gemini Vision.

Quick Overview

DocSynthAI is a modular document understanding platform built on MCP. It provides:

  • AI-powered document classification (Gemini Vision integration)
  • Rule-based & general LLM classification modes
  • STDIO MCP server and async STDIO client for local/dev integration
  • Roadmap: Extraction → Validation → Knowledge Graph creation → HTTP/SSE transport

Install & Setup

1. Clone

git clone https://github.com/raahulrawat/docsynthai.git 

2. Python packages

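Create and activate a virtual environment (recommended), then install the project's dependencies inside it:

python -m venv .venv
source .venv/bin/activate   # on Windows: .venv\Scripts\activate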

Run — MCP Server (STDIO) — Current

This starts the MCP server in STDIO mode (default/current). Clients connect over stdio pipes.

Start server (local)

python server.py

Start server (explicit stdio mode)

DOCSYNTH_TRY_HTTP=0 python server.py

By default, the server loads rules from classifier_rules.json if the file exists and persists rule changes back to it. The server exposes the following MCP tools:

  • setup_classifier
  • create_rule
  • get_all_rules
  • delete_rule
  • classify_document
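MCP clients invoke tools like these by sending JSON-RPC 2.0 requests with the method tools/call. The sketch below builds such a request for the tool names listed above. It is a minimal illustration, not the project's client code: the argument name image_path is hypothetical, since the tools' input schemas are not documented here (a real client would discover them through tools/list).

```python
import json
from itertools import count

# Tool names taken from the list above.
DOCSYNTH_TOOLS = {
    "setup_classifier",
    "create_rule",
    "get_all_rules",
    "delete_rule",
    "classify_document",
}

_request_ids = count(1)  # JSON-RPC ids only need to be unique per session


def tool_call_request(tool_name, arguments=None):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    if tool_name not in DOCSYNTH_TOOLS:
        raise ValueError(f"unknown tool: {tool_name}")
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    }


# image_path is a hypothetical argument name used only for illustration.
request = tool_call_request("classify_document", {"image_path": "invoice.png"})
print(json.dumps(request))
```

Over the STDIO transport, each such message is written to the server's stdin as a single line of JSON.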

Running the STDIO client demo

python mcp_stdio_client.py

The demo launches the server as a subprocess, performs the MCP initialization handshake, prompts for an API key, and then lets you classify a local image file.
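The first two steps of the demo can be sketched with the standard library alone. This is an illustration of the MCP stdio handshake under stated assumptions, not the code in mcp_stdio_client.py: the client name and version are hypothetical, and the protocol version string is one published revision of the MCP specification that the server may or may not negotiate.

```python
import json
import subprocess
import sys

# One published MCP protocol revision; the server may answer with another.
PROTOCOL_VERSION = "2024-11-05"

# Command a caller would pass to spawn the DocSynthAI server from the repo root.
SERVER_COMMAND = [sys.executable, "server.py"]


def initialize_handshake(command):
    """Spawn an MCP stdio server and perform the initialize handshake.

    Messages are newline-delimited JSON-RPC 2.0 objects on stdin/stdout.
    Returns the server's parsed response to the initialize request.
    """
    proc = subprocess.Popen(
        command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    try:
        request = {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "initialize",
            "params": {
                "protocolVersion": PROTOCOL_VERSION,
                "capabilities": {},
                # Hypothetical client name and version.
                "clientInfo": {"name": "docsynth-demo", "version": "0.0.1"},
            },
        }
        proc.stdin.write(json.dumps(request) + "\n")
        proc.stdin.flush()
        response = json.loads(proc.stdout.readline())

        # Tell the server the client is ready; notifications carry no id.
        ready = {"jsonrpc": "2.0", "method": "notifications/initialized"}
        proc.stdin.write(json.dumps(ready) + "\n")
        proc.stdin.flush()
        return response
    finally:
        proc.stdin.close()  # closing stdin signals the server to shut down
        try:
            proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
        proc.stdout.close()
```

Calling initialize_handshake(SERVER_COMMAND) from the repository root should return the server's capabilities, assuming the server writes only protocol messages to stdout.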

Run — HTTP & SSE (Next Release)

Planned in the next release:

  • HTTP Transport: mcp.run(transport="http", host="0.0.0.0", port=8000) — REST-like access to tools
  • SSE Transport: streaming support for long-running/extraction tasks

Once HTTP support ships, you will be able to run:

DOCSYNTH_TRY_HTTP=1 python server.py   # the server detects the flag and binds to the configured host/port

Client libraries will be updated to support HTTP tool discovery and SSE streaming.
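One way the planned transport switch could read its environment variables is sketched below. This is an assumption about the eventual behavior, not the released implementation: the fallback host and port simply mirror the planned mcp.run(...) call, and transport_settings is a hypothetical helper name.

```python
import os


def transport_settings(env=None):
    """Choose transport settings from DOCSYNTH_* environment variables."""
    env = os.environ if env is None else env
    if env.get("DOCSYNTH_TRY_HTTP", "0") != "1":
        return {"transport": "stdio"}
    return {
        "transport": "http",
        # Fallbacks mirror the planned mcp.run(...) call; they are assumptions.
        "host": env.get("DOCSYNTH_HOST", "0.0.0.0"),
        "port": int(env.get("DOCSYNTH_PORT", "8000")),
    }


print(transport_settings({"DOCSYNTH_TRY_HTTP": "1", "DOCSYNTH_PORT": "9000"}))
```

Accepting the environment as a parameter keeps the function easy to test without touching the real process environment.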

Running MCP Server JSON (STDIO config)

Use this sample JSON for external orchestrators or MCP host configs (e.g., Cursor / IDE tool integrations):


{
  "mcpServers": {
    "docsynth": {
      "command": "python",
      "args": [
        "server.py"
      ],
      "transport": {
        "type": "stdio"
      },
      "env": {}
    }
  }
}
    
  

Save as .mcp/docsynth-mcp.json or include in your MCP host configuration. This tells an MCP host to spawn server.py and connect via stdio.
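Some MCP hosts spawn servers from a working directory other than the repository, in which case the relative server.py path above may not resolve. The sketch below generates the same configuration with absolute paths. The helper name write_host_config is hypothetical, and whether a given host accepts every key in this file depends on that host.

```python
import json
import sys
from pathlib import Path


def write_host_config(repo_dir, config_path):
    """Write an MCP host config that spawns server.py over stdio."""
    repo_dir = Path(repo_dir).resolve()
    config = {
        "mcpServers": {
            "docsynth": {
                # Absolute paths avoid depending on the host's working directory.
                "command": sys.executable,
                "args": [str(repo_dir / "server.py")],
                "transport": {"type": "stdio"},
                "env": {},
            }
        }
    }
    config_path = Path(config_path)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, write_host_config(".", ".mcp/docsynth-mcp.json") run from the repository root writes the file to the location suggested above, using the interpreter of the active virtual environment.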

Classification Roadmap (current support)

The core pipeline stages, implemented or planned, are listed below. Each stage is exposed as one or more MCP tools.

Stage 1 — Classification (current)

  • Rule-based classification (user-defined rules)
  • General LLM classification (Gemini Vision)
  • Single-image & batch classification
  • Strict JSON response format for downstream parsing
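A strict response format pays off when the consumer rejects anything that deviates from it. The sketch below shows one way downstream code might validate a classification response. The field names document_type and confidence are assumptions for illustration only; the actual response schema is not specified in this README.

```python
import json

# Hypothetical schema: these field names are assumptions, not the real format.
REQUIRED_FIELDS = {"document_type": str, "confidence": (int, float)}


def parse_classification(raw):
    """Parse a classification response and reject anything off-schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError on non-JSON text
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        value = data[field]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"wrong type for field: {field}")
    if not 0.0 <= data["confidence"] <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return data


print(parse_classification('{"document_type": "invoice", "confidence": 0.92}'))
```

Because json.JSONDecodeError is a subclass of ValueError, callers can handle malformed and off-schema responses with a single except clause.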

Stage 2 — Extraction (next)

  • Key–Value pair extraction (KV)
  • Table extraction → CSV/JSON
  • Multi-page PDF → page images conversion (optional helper)
  • Tool: extract_document
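As a rough picture of the planned table output, the sketch below serializes one extracted table to both CSV and JSON using only the standard library. It assumes a table arrives as a header row plus rows of cell strings; the function names are hypothetical and the eventual extract_document output format may differ.

```python
import csv
import io
import json


def table_to_csv(header, rows):
    """Serialize a table (header plus rows of cells) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def table_to_json(header, rows):
    """Serialize the same table as a JSON array of row objects."""
    return json.dumps([dict(zip(header, row)) for row in rows])


header = ["item", "amount"]
rows = [["Widget, large", "12.50"], ["Gadget", "3.00"]]
print(table_to_csv(header, rows))
print(table_to_json(header, rows))
```

Using the csv module rather than joining on commas means cells that contain commas or quotes are escaped correctly.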

Stage 3 — Validation

  • Field-level validation (PAN/Aadhaar format, dates, totals)
  • Cross-document validation (e.g., PAN ↔ Bank Statement)
  • Rule-based & model-assisted validation
  • Tool: validate_document
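Field-level format checks of this kind are typically simple pattern matches. The sketch below shows format-only checks for PAN and Aadhaar numbers. It is not the project's validate_document implementation: the function names are hypothetical, and the Aadhaar check deliberately omits checksum verification, so it confirms shape only.

```python
import re

# PAN: five letters, four digits, one letter (the shape AAAAA9999A).
PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")
# Aadhaar: twelve digits, the first of which is never 0 or 1.
AADHAAR_PATTERN = re.compile(r"[2-9][0-9]{11}")


def is_valid_pan(value):
    """Check PAN shape; surrounding spaces and lowercase input are tolerated."""
    return PAN_PATTERN.fullmatch(value.strip().upper()) is not None


def is_valid_aadhaar(value):
    """Check Aadhaar shape only (no checksum verification)."""
    # Aadhaar numbers are often printed in groups of four digits.
    digits = value.replace(" ", "").replace("-", "")
    return AADHAAR_PATTERN.fullmatch(digits) is not None


print(is_valid_pan("ABCDE1234F"), is_valid_aadhaar("2345 6789 0123"))
```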

Stage 4 — Knowledge Graph Creation

  • Triplet extraction (subject, predicate, object)
  • Ontology mapping & transformation
  • Neo4j / Memgraph integrations
  • Tools: kg_insert, kg_generate_triplets
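To make the (subject, predicate, object) shape concrete, the sketch below turns a document's extracted key-value fields into triples, using the document id as the subject. This is a deliberately naive mapping for illustration; the Triple type and fields_to_triples helper are hypothetical and say nothing about how kg_generate_triplets will work.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


def fields_to_triples(document_id, fields):
    """Turn extracted key-value fields into (subject, predicate, object) triples."""
    return [
        Triple(document_id, key, str(value))
        for key, value in sorted(fields.items())
        if value not in (None, "")  # skip fields the extractor left empty
    ]


for triple in fields_to_triples("doc-1", {"pan": "ABCDE1234F", "name": "A. Kumar"}):
    print(triple)
```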

Testing & Development Tips

  • Use DocumentClassifier(mock_mode=True) for fast local tests without Gemini API calls.
  • Persist rules in classifier_rules.json to re-use definitions across restarts.
  • To test HTTP mode when implemented, set DOCSYNTH_TRY_HTTP=1 and pass DOCSYNTH_HOST/DOCSYNTH_PORT.
  • Create pytest tests that launch the server subprocess via stdio and call the client (mock_mode recommended).
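For tests that exercise rule persistence, it can help to load and save rule files directly. The sketch below is a minimal stand-in under stated assumptions: the helper names are hypothetical, and the rule layout in the usage note is invented for illustration, since the actual schema of classifier_rules.json is not documented here.

```python
import json
from pathlib import Path


def load_rules(path="classifier_rules.json"):
    """Return saved rules, or an empty list when the file does not exist yet."""
    rules_file = Path(path)
    if not rules_file.exists():
        return []
    return json.loads(rules_file.read_text(encoding="utf-8"))


def save_rules(rules, path="classifier_rules.json"):
    """Write to a temporary file, then swap it into place in one step."""
    rules_file = Path(path)
    temp_file = rules_file.with_name(rules_file.name + ".tmp")
    temp_file.write_text(json.dumps(rules, indent=2), encoding="utf-8")
    temp_file.replace(rules_file)
```

A test can point both helpers at a temporary directory, save a rule such as {"name": "invoice", "keywords": ["invoice", "total due"]}, and assert that load_rules returns it unchanged.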

Contributing

PRs welcome. Suggested first issues:

  • HTTP transport adapter & docs
  • SSE streaming for long extraction jobs
  • PDF→image helper & multi-page handling
  • KG connector for Neo4j

Please follow the code style, add tests, and include changelog entries for breaking changes.

© DocSynthAI — Built for MCP experimentation and production prototyping