document-understanding-mcp-server
MCP server providing tools to extract text, metadata, and layout from documents (primarily PDFs) and to search their contents.
The Document Understanding MCP Server provides tools for extracting information from PDF documents: text content, metadata, layout information, tables, images, and the document outline (bookmarks), along with text search and language detection. AI models can use these tools to analyze PDF documents as part of larger document processing workflows. The server implements the Model Context Protocol (MCP) specification, so models interact with PDF documents through a standardized interface. Support for document types beyond PDF is planned.
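To illustrate what the standardized interface looks like on the wire, here is a minimal sketch of the JSON-RPC 2.0 message an MCP client sends to invoke a tool. The `tools/call` method and the `name`/`arguments` parameters come from the MCP specification; the tool name `extract_text` and the `file_path` argument are hypothetical placeholders, since the tool names this server actually exposes are not listed here.

```python
import json
from itertools import count

# Simple incrementing request IDs, as required by JSON-RPC 2.0.
_request_ids = count(1)


def build_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument; the server's real names may differ.
request = build_tool_call("extract_text", {"file_path": "report.pdf"})
print(json.dumps(request, indent=2))
```

In practice an MCP client library builds and sends this message for you; a client can discover the real tool names and their argument schemas by issuing a `tools/list` request first.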
Features
- Text content extraction with OCR fallback for scanned documents
- Metadata extraction including author, title, and creation date
- Layout information extraction such as text blocks, images, and drawings
- Table and image extraction
- Text search functionality and language detection
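The OCR fallback in the first feature can be pictured with a small sketch. This is not the server's actual implementation: the function names, the injected callables, and the `min_chars` threshold are all assumptions made for illustration. The idea is to try the embedded text layer first and run OCR only when a page yields little or no text, as is typical for scanned pages.

```python
from typing import Callable


def extract_page_text(
    page_number: int,
    read_text_layer: Callable[[int], str],
    run_ocr: Callable[[int], str],
    min_chars: int = 20,  # assumed threshold, not taken from the server
) -> tuple[str, str]:
    """Return (text, source), where source is "text-layer" or "ocr"."""
    text = read_text_layer(page_number).strip()
    if len(text) >= min_chars:
        return text, "text-layer"
    # Too little embedded text: treat the page as scanned and fall back to OCR.
    return run_ocr(page_number).strip(), "ocr"


# Stand-ins for a real PDF reader and OCR engine (hypothetical).
text_layers = {
    1: "This page has an embedded text layer with plenty of characters.",
    2: "",  # a scanned page with no text layer
}


def fake_text_layer(page_number: int) -> str:
    return text_layers[page_number]


def fake_ocr(page_number: int) -> str:
    return f"OCR text for page {page_number}"


results = [extract_page_text(n, fake_text_layer, fake_ocr) for n in (1, 2)]
for page_number, (text, source) in zip((1, 2), results):
    print(page_number, source, text)
```

Passing the reader and OCR engine in as callables keeps the decision logic independent of any particular PDF or OCR library, so the same check works whichever backend is used.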