LandingAI ADE MCP Server

A Model Context Protocol (MCP) server providing direct integration with LandingAI's Agentic Document Extraction (ADE) API. Extract text, tables, and structured data from PDFs, images, and office documents.

Features

  • 📄 Document Parsing - Parse entire documents and return markdown output
  • 🔍 Data Extraction - Extract structured data using JSON schemas
  • Parse Jobs - Handle large documents with background processing
  • 🛡️ Zero Data Retention - Privacy-focused processing support

Installation

Prerequisites

  • Python 3.9 or higher
  • An API key from LandingAI

Option 1: Using uv (Recommended - Simplest)

uv is a fast Python package manager that handles virtual environments automatically.

Install uv (if not already installed)
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Or with Homebrew
brew install uv
Set up the project
# Clone the repository
git clone https://github.com/avaxia8/landingai-ade-mcp.git
cd landingai-ade-mcp

# Install dependencies with uv
uv sync

# Or if starting fresh:
uv init
uv add fastmcp httpx pydantic python-multipart aiofiles

Option 2: Using pip with Virtual Environment

# Clone the repository
git clone https://github.com/avaxia8/landingai-ade-mcp.git
cd landingai-ade-mcp

# Create virtual environment
python3 -m venv venv

# Activate virtual environment
# On macOS/Linux:
source venv/bin/activate
# On Windows:
venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Configuration

Set Your API Key

export LANDINGAI_API_KEY="your-api-key-here"

Claude Desktop Configuration

Configuration File Location
  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Linux: ~/.config/claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
Configuration Examples
Using uv (Recommended)
{
  "mcpServers": {
    "landingai-ade-mcp": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/landingai-ade-mcp",
        "run",
        "python",
        "-m",
        "server"
      ],
      "env": {
        "LANDINGAI_API_KEY": "your-api-key-here"
      }
    }
  }
}
Using Virtual Environment
{
  "mcpServers": {
    "landingai-ade-mcp": {
      "command": "/path/to/landingai-ade-mcp/venv/bin/python",
      "args": [
        "/path/to/landingai-ade-mcp/server.py"
      ],
      "env": {
        "LANDINGAI_API_KEY": "your-api-key-here"
      }
    }
  }
}

After Configuration

  1. Save the configuration file
  2. Restart Claude Desktop completely (quit and reopen)
  3. The server should appear as "landingai-ade-mcp" in your list of MCP servers

Available Tools

parse_document

Parse entire documents and return markdown output.

# Parse a local file
result = await parse_document(
    document_path="/path/to/document.pdf",
    model="dpt-2-latest",  # optional
    split="page"  # optional, for page-level splits
)

# Parse from URL
result = await parse_document(
    document_url="https://example.com/document.pdf"
)

extract_data

Extract structured data from markdown using a JSON schema.

schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"}
    }
}

# Extract from markdown content string
result = await extract_data(
    schema=schema,
    markdown="# Invoice\nInvoice #123\nTotal: $100.00"
)

# Or extract from a markdown file
result = await extract_data(
    schema=schema,
    markdown="/path/to/document.md"  # Will detect if it's a file path
)

# Or extract from URL
result = await extract_data(
    schema=schema,
    markdown_url="https://example.com/document.md"
)

create_parse_job

Create a parse job for large documents (recommended for files over 50MB).

job = await create_parse_job(
    document_path="/path/to/large_document.pdf",
    split="page"  # optional
)
job_id = job["job_id"]

get_parse_job_status

Check status and retrieve results of a parse job.

status = await get_parse_job_status(job_id)

# Check status
if status["status"] == "completed":
    # For small files, data is included directly
    # For large files (>1MB), data is auto-fetched from output_url
    results = status["data"]
elif status["status"] == "processing":
    print(f"Progress: {status['progress'] * 100:.1f}%")

list_parse_jobs

List all parse jobs with filtering and pagination.

jobs = await list_parse_jobs(
    page=0,  # optional, default 0
    pageSize=10,  # optional, 1-100, default 10
    status="completed"  # optional filter
)

process_folder

Process all supported files in a folder - parse documents or extract structured data.

Supported formats:

  • Images: APNG, BMP, DCX, DDS, DIB, GD, GIF, ICNS, JP2, JPEG, JPG, PCX, PNG, PPM, PSD, TGA, TIFF, WEBP
  • Documents: PDF, DOC, DOCX, PPT, PPTX, ODP, ODT
# Parse all PDFs in a folder
result = await process_folder(
    folder_path="/path/to/documents",
    operation="parse",  # or "extract" for structured data
    file_types="pdf",   # optional filter
    model="dpt-2-latest"
)

# Extract structured data from all documents
schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
        "date": {"type": "string"}
    }
}

result = await process_folder(
    folder_path="/path/to/invoices",
    operation="extract",
    schema=schema,
    file_types="pdf,jpg"  # Process PDFs and images
)

# Process everything with defaults
result = await process_folder(
    folder_path="/path/to/mixed_documents"
)

Features:

  • Automatic file size detection (uses direct parsing for <50MB, jobs for larger)
  • Concurrent processing with rate limiting
  • Progress tracking for long-running operations
  • Organized output in ade_results folder
  • Aggregated data for extraction operations
  • Continues processing even if individual files fail

health_check

Check server status and API connectivity.

health = await health_check()
# Returns server status, API connectivity, available tools

File Size Guidelines

  • < 50MB: Use parse_document directly
  • > 50MB: Always use create_parse_job
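
A small helper can apply this guideline automatically. A minimal sketch; the parse_any_size name and exact threshold check are illustrative, and it reuses the wait_for_parse_job sketch shown under get_parse_job_status above:

import os

SIZE_LIMIT = 50 * 1024 * 1024  # 50MB, per the guideline above

async def parse_any_size(path):
    # Illustrative helper: parse small files directly, fall back to a parse job otherwise
    if os.path.getsize(path) < SIZE_LIMIT:
        return await parse_document(document_path=path)
    job = await create_parse_job(document_path=path)
    return await wait_for_parse_job(job["job_id"])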

Error Handling

result = await parse_document(document_path="/path/to/file.pdf")

if result.get("status") == "error":
    print(f"Error: {result['error']}")
    print(f"Status Code: {result.get('status_code')}")
else:
    # Process successful result
    markdown = result["markdown"]

Common Error Codes

  • 401: Invalid API key
  • 413: File too large (use parse jobs)
  • 422: Validation error
  • 429: Rate limit exceeded
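
Transient failures such as 429 can be retried with a short backoff. A minimal sketch, assuming the error shape shown above (status, status_code); the parse_with_retry name, attempt count, and backoff schedule are illustrative:

import asyncio

async def parse_with_retry(path, attempts=3):
    # Illustrative helper: retry only on rate-limit errors, backing off 1s, 2s, 4s
    for attempt in range(attempts):
        result = await parse_document(document_path=path)
        if result.get("status") != "error":
            return result
        if result.get("status_code") == 429:
            await asyncio.sleep(2 ** attempt)
            continue
        return result  # non-retryable error (401, 413, 422, ...)
    return result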

Troubleshooting

Common Issues and Solutions

"Could not connect to MCP server"
  1. Python not found: Make sure the Python path in your config is correct

    # Find your Python path
    which python3
    
  2. Module not found errors: Dependencies aren't installed in the Python environment being used

    • If using uv: Run uv sync in the project directory
    • If using venv: Activate it and run pip install -r requirements.txt
    • Check that the Python path in config matches your environment
  3. spawn python ENOENT: The system can't find Python

    • Use the full path to Python (e.g., /usr/bin/python3 instead of just python)
    • For virtual environments, use the full path to the venv's Python
"Server disconnected"
  1. Check the server can run manually:

    cd /path/to/landingai-ade-mcp
    python server.py
    # Should see: "Starting LandingAI ADE MCP Server"
    
  2. Check API key is set:

    echo $LANDINGAI_API_KEY
    
  3. Check dependencies are installed:

    python -c "import fastmcp, httpx, pydantic"
    # Should complete without errors
    
"ModuleNotFoundError: No module named 'fastmcp'"

This means fastmcp isn't installed in the Python environment being used:

  • If using virtual environment: The config is pointing to the wrong Python
  • Solution: Use uv or ensure the Python path matches your environment
Platform-Specific Issues

macOS: If you installed Python with Homebrew, the path might be /opt/homebrew/bin/python3 (Apple Silicon) or /usr/local/bin/python3 (Intel)

Windows: Use forward slashes in paths or escape backslashes: C:/path/to/python.exe or C:\\path\\to\\python.exe

Linux: Some distributions provide only python3 and no python alias. Use python3 to avoid ambiguity.

Debug Steps

  1. Test the server standalone:

    python server.py
    
  2. Check MCP communication:

    echo '{"jsonrpc": "2.0", "method": "initialize", "id": 1}' | python server.py
    
  3. Verify configuration:

    • Open Claude Desktop developer settings
    • Check the logs for specific error messages
    • Ensure all paths are absolute, not relative
  4. Validate API key:

    python -c "import os; print('API Key set:', bool(os.environ.get('LANDINGAI_API_KEY')))"
    

API Documentation