danielitus/mcp-document-server
MCP Document Server
A Model Context Protocol (MCP) server that enables Claude to interact with and analyze a collection of documents. This server provides tools for searching, reading, and extracting information from various document formats.
Table of Contents
- Features
- Prerequisites
- Installation
- Configuration
- Usage
- Available Tools
- Supported File Formats
- Examples
- Environment Variables
- Troubleshooting
- Development
Features
- Multi-format Support: Read and process various document formats including plain text, Markdown, PDF, and JSON
- Full-text Search: Search across all documents with configurable case sensitivity
- Document Management: List all documents with metadata (size, modification date, type)
- Content Extraction: Extract specific sections from documents using regex patterns
- Resource Access: Direct access to document contents through MCP resource URIs
- Real-time Updates: New documents added to the monitored directory are detected automatically the next time documents are listed, searched, or accessed
Prerequisites
- Node.js 18.0 or higher
- npm or yarn package manager
- Claude Desktop application
Installation
- Clone or download this repository:
git clone <repository-url>
cd mcp-document-server
- Install dependencies:
npm install
- Create a documents directory (or prepare your existing directory):
mkdir documents
Configuration
Claude Desktop Configuration
- Open Claude Desktop settings
- Navigate to the Developer settings
- Add the following to your MCP servers configuration:
{
"mcpServers": {
"document-server": {
"command": "node",
"args": ["/absolute/path/to/mcp-document-server/src/index.js"],
"env": {
"DOCUMENTS_PATH": "/absolute/path/to/your/documents"
}
}
}
}
Note: Replace /absolute/path/to/ with the actual paths on your system.
Configuration Options
- DOCUMENTS_PATH: Environment variable to specify the directory containing your documents (defaults to ./documents)
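The fallback to ./documents can be sketched as follows. This is a minimal illustration rather than the server's actual code: the helper name resolveDocumentsPath is hypothetical, and treating an empty value as "not set" is an assumption.

```javascript
const path = require("node:path");

// Hypothetical helper (not taken from src/index.js): choose the documents
// directory from DOCUMENTS_PATH, falling back to ./documents.
function resolveDocumentsPath(env = process.env, cwd = process.cwd()) {
  const configured = (env.DOCUMENTS_PATH || "").trim();
  // Assumption: an empty or whitespace-only value counts as "not set".
  return path.resolve(cwd, configured === "" ? "./documents" : configured);
}

console.log(resolveDocumentsPath());
```

Resolving against the working directory is one reason the Claude Desktop configuration above uses absolute paths: a relative default depends on where the process was started.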
Usage
Starting the Server
The server starts automatically when Claude Desktop launches if configured correctly. To test it manually:
# Using default documents directory
node src/index.js
# Using custom documents directory
DOCUMENTS_PATH=/path/to/docs node src/index.js
Adding Documents
Simply place your documents in the configured documents directory. The server will automatically detect them when:
- Listing documents
- Performing searches
- Accessing resources
Interacting Through Claude
Once configured, you can ask Claude to:
- List all documents:
  - "Show me all available documents"
  - "What documents do you have access to?"
- Search across documents:
  - "Search for 'machine learning' in all documents"
  - "Find all mentions of 'API' (case sensitive)"
- Read specific documents:
  - "Read the contents of example.txt"
  - "Show me what's in the README file"
- Extract sections:
  - "Extract all headings from the markdown file"
  - "Find all sections starting with '##' in documentation.md"
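Behind a prompt like these, the MCP client sends a JSON-RPC tools/call request that names one of the server's tools. The sketch below builds such a request for the case-sensitive search prompt. The helper name buildToolCall is hypothetical, and which tool Claude actually picks for a given prompt is decided by the model, not by this code.

```javascript
// Hypothetical helper: build the JSON-RPC 2.0 message an MCP client sends
// to invoke a tool. Tool and argument names are those listed under
// "Available Tools".
function buildToolCall(id, name, args = {}) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Corresponds to: "Find all mentions of 'API' (case sensitive)"
const request = buildToolCall(1, "search_documents", {
  query: "API",
  case_sensitive: true,
});
console.log(JSON.stringify(request, null, 2));
```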
Available Tools
1. list_documents
Lists all available documents with metadata.
Parameters: None
Returns:
{
"documents": [
{
"name": "example.txt",
"path": "/full/path/to/example.txt",
"size": 1234,
"modified": "2024-01-15T10:30:00.000Z",
"type": "txt"
}
]
}
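A response of this shape could be produced along the following lines. This is a sketch under assumptions, not the code in src/index.js: the function name listDocuments is illustrative, only files in the root of the directory are considered (matching the troubleshooting note about subdirectories), and "type" is taken to be the lowercased file extension.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Illustrative sketch: collect metadata for every regular file in the
// root of the documents directory (subdirectories are skipped).
function listDocuments(documentsPath) {
  const entries = fs.readdirSync(documentsPath, { withFileTypes: true });
  const documents = entries
    .filter((entry) => entry.isFile())
    .map((entry) => {
      const fullPath = path.join(documentsPath, entry.name);
      const stats = fs.statSync(fullPath);
      return {
        name: entry.name,
        path: fullPath,
        size: stats.size, // size in bytes
        modified: stats.mtime.toISOString(),
        // Assumption: "type" is the extension without the leading dot.
        type: path.extname(entry.name).slice(1).toLowerCase(),
      };
    });
  return { documents };
}
```

Because the directory is read on every call, a file dropped into the folder shows up on the next listing without restarting the server.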
2. search_documents
Searches for text across all documents.
Parameters:
- query (required): Text to search for
- case_sensitive (optional): Boolean for case-sensitive search (default: false)
Returns:
{
"results": [
{
"file": "example.txt",
"path": "/full/path/to/example.txt",
"matches": [
{
"line": 5,
"text": "This line contains the search term"
}
],
"total_matches": 3
}
]
}
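One way to produce results of this shape is a line-by-line substring search, sketched below on in-memory documents. The function name and the input format ({ name, path, content }) are assumptions for illustration, as is the choice to count total_matches as the number of matching lines; the actual server may count or truncate matches differently.

```javascript
// Illustrative sketch: substring search over already-loaded documents.
// Each input document is assumed to look like { name, path, content }.
function searchDocuments(docs, query, caseSensitive = false) {
  const needle = caseSensitive ? query : query.toLowerCase();
  const results = [];
  for (const doc of docs) {
    const matches = [];
    doc.content.split(/\r?\n/).forEach((text, index) => {
      const haystack = caseSensitive ? text : text.toLowerCase();
      if (haystack.includes(needle)) {
        matches.push({ line: index + 1, text }); // line numbers are 1-based
      }
    });
    if (matches.length > 0) {
      results.push({
        file: doc.name,
        path: doc.path,
        matches,
        // Assumption: total_matches is the number of matching lines.
        total_matches: matches.length,
      });
    }
  }
  return { results };
}
```

Documents without any match are left out of the results entirely, so an empty results array means the query was not found anywhere.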
3. extract_sections
Extracts sections from a document based on regex patterns.
Parameters:
- file_path (required): Path to the document
- pattern (required): Regex pattern to match sections
Returns:
{
"sections": [
{
"heading": "## Section Title",
"line_start": 10,
"line_end": 25,
"content": "Section content..."
}
]
}
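The README does not spell out where a section ends, so the sketch below makes that explicit as an assumption: every line matching the pattern starts a section, and the section runs until the line before the next match (or the end of the document). The function works on content that has already been read, instead of a file_path, and the flags parameter with its "i" default is hypothetical.

```javascript
// Illustrative sketch: split text into sections that start at each line
// matching the pattern. Line numbers are 1-based and inclusive.
function extractSections(content, pattern, flags = "i") {
  // No "g" flag, so regex.test() carries no state between lines.
  const regex = new RegExp(pattern, flags);
  const lines = content.split(/\r?\n/);
  const starts = [];
  lines.forEach((line, index) => {
    if (regex.test(line)) starts.push(index);
  });
  const sections = starts.map((start, k) => {
    // Assumption: a section ends just before the next matching line.
    const end = k + 1 < starts.length ? starts[k + 1] - 1 : lines.length - 1;
    return {
      heading: lines[start],
      line_start: start + 1,
      line_end: end + 1,
      content: lines.slice(start + 1, end + 1).join("\n").trim(),
    };
  });
  return { sections };
}
```

For example, calling extractSections(text, "^## ") on a Markdown document yields one entry per second-level heading, with the body text up to the next such heading.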
Supported File Formats
- Plain Text (.txt): Direct text reading
- Markdown (.md): Treated as plain text with full markdown syntax preserved
- PDF (.pdf): Text extraction from PDF documents
- JSON (.json): Pretty-printed JSON content
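The per-format handling listed above amounts to a dispatch on the file extension. The sketch below shows the text-based formats only; the function name formatDocumentContent is hypothetical, the two-space JSON indentation is an assumption, and PDF extraction is deliberately left out because it depends on a parsing library this README does not name.

```javascript
const path = require("node:path");

// Hypothetical sketch: turn raw file contents into the text that is
// returned for a document, based on its extension.
function formatDocumentContent(fileName, raw) {
  const ext = path.extname(fileName).toLowerCase();
  switch (ext) {
    case ".txt":
    case ".md":
      return raw; // returned unchanged, Markdown syntax preserved
    case ".json":
      // Assumption: pretty-printing means re-serializing with 2-space indent.
      return JSON.stringify(JSON.parse(raw), null, 2);
    case ".pdf":
      // Needs a PDF parsing library; intentionally not shown here.
      throw new Error("PDF extraction is not part of this sketch");
    default:
      throw new Error(`Unsupported file format: ${ext || fileName}`);
  }
}

console.log(formatDocumentContent("data.json", '{"a":1,"b":[2]}'));
```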
Examples
Example 1: Document Research Workflow
User: "I need to find all mentions of authentication in my documentation"
Claude will:
1. Use search_documents tool with query "authentication"
2. Present all matches with file names and line numbers
3. Offer to read specific documents for more context
Example 2: Extracting API Documentation
User: "Extract all API endpoints from my API.md file"
Claude will:
1. Use extract_sections with pattern "^###.*endpoint|^###.*api"
2. Return all matching sections with their content
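To see what this pattern does and does not catch, it can be tried against a few headings. The headings below are made-up sample data, and whether the server applies a case-insensitive flag is not stated in this README, so both variants are shown.

```javascript
const pattern = "^###.*endpoint|^###.*api";

// Made-up sample headings, not taken from a real API.md.
const headings = [
  "### Users endpoint",
  "### api keys",
  "### API Reference",
  "## Endpoints",
];

const caseSensitive = new RegExp(pattern);
const caseInsensitive = new RegExp(pattern, "i");

const matchedSensitive = headings.filter((h) => caseSensitive.test(h));
const matchedInsensitive = headings.filter((h) => caseInsensitive.test(h));

console.log(matchedSensitive);   // the two lowercase third-level headings
console.log(matchedInsensitive); // additionally "### API Reference"
```

"## Endpoints" is never matched because the pattern requires three leading # characters, and "### API Reference" is matched only when case is ignored.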
Example 3: Document Overview
User: "Give me an overview of all technical documents"
Claude will:
1. Use list_documents to get all files
2. Filter for technical documentation
3. Provide summary with file sizes and last modified dates
Environment Variables
| Variable | Description | Default |
|---|---|---|
| DOCUMENTS_PATH | Path to the documents directory | ./documents |
Troubleshooting
Common Issues
- Server not appearing in Claude:
  - Verify the configuration path is absolute, not relative
  - Check that Node.js is in your system PATH
  - Restart Claude Desktop after configuration changes
- Documents not found:
  - Ensure DOCUMENTS_PATH is set correctly
  - Check file permissions on the documents directory
  - Verify documents are in the root of the directory (not subdirectories)
- PDF reading errors:
  - Some PDFs may have text extraction issues
  - Try converting to text format if problems persist
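The "Documents not found" checks can be run as a small standalone script. The sketch below is not part of the server; the function name diagnoseDocumentsPath and the wording of its messages are illustrative.

```javascript
const fs = require("node:fs");

// Illustrative diagnostic: returns a list of problems found with the
// documents directory (an empty list means the basic checks passed).
function diagnoseDocumentsPath(dir) {
  const problems = [];
  if (!fs.existsSync(dir)) {
    problems.push(`Directory does not exist: ${dir}`);
    return problems;
  }
  if (!fs.statSync(dir).isDirectory()) {
    problems.push(`Not a directory: ${dir}`);
    return problems;
  }
  try {
    fs.accessSync(dir, fs.constants.R_OK); // throws if not readable
  } catch {
    problems.push(`Directory is not readable: ${dir}`);
    return problems;
  }
  const entries = fs.readdirSync(dir, { withFileTypes: true });
  if (!entries.some((entry) => entry.isFile())) {
    problems.push("No files in the directory root (subdirectories are not scanned)");
  }
  return problems;
}

console.log(diagnoseDocumentsPath(process.env.DOCUMENTS_PATH || "./documents"));
```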
Debug Mode
To capture the server's log output, redirect stderr to a file:
node src/index.js 2> server.log
Development
Project Structure
mcp-document-server/
├── src/
│   └── index.js          # Main server implementation
├── documents/            # Default documents directory
├── package.json          # Project dependencies
└── README.md             # This file
Adding New Features
To add support for new file formats:
- Add the file extension to the getMimeType() method
- Add parsing logic in the loadDocument() method
- Install any necessary parsing libraries
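As a rough picture of the first step, the sketch below shows an extension-to-MIME-type lookup and how a new entry might be registered. It is written as a standalone function that borrows the getMimeType name; the real method in src/index.js may be structured differently, the mapping table and fallback type are assumptions, and CSV is only a hypothetical example of a new format.

```javascript
const path = require("node:path");

// Assumed mapping for the four supported formats; the actual table in
// src/index.js may differ.
const MIME_TYPES = {
  ".txt": "text/plain",
  ".md": "text/markdown",
  ".pdf": "application/pdf",
  ".json": "application/json",
};

// Standalone stand-in for the getMimeType() method mentioned above.
function getMimeType(fileName) {
  const ext = path.extname(fileName).toLowerCase();
  // Assumption: unknown extensions fall back to a generic binary type.
  return MIME_TYPES[ext] || "application/octet-stream";
}

// Registering a hypothetical new format (CSV):
MIME_TYPES[".csv"] = "text/csv";

console.log(getMimeType("report.csv"));
```

The matching parsing logic for the new extension would then go wherever loadDocument() handles the existing formats.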
Contributing
Feel free to submit issues or pull requests for:
- New file format support
- Additional search capabilities
- Performance improvements
- Bug fixes
License
[Your chosen license]
Acknowledgments
Built using the Model Context Protocol SDK by Anthropic.