AskMyDoc – Knowledge Management Add-on
The Knowledge Management MCP Server is a robust solution for document ingestion, vector search, and knowledge management, leveraging TypeScript/Node.js and ChromaDB.
AskMyDoc makes it simple to upload, organise, and search your documents. Just add your files, and AskMyDoc instantly turns them into a searchable knowledge base so you can ask questions and get clear answers in seconds. It’s fast, easy to set up, and helps you find the information you need without digging through folders.
Features
• 📄 Works with Many File Types: Upload PDFs, Word docs, text files, spreadsheets, and more
• 🔍 Smart Search: Quickly find answers based on meaning, not just keywords
• 🧠 Flexible Options: Choose between built-in, cloud, or local AI for powering your search
• 📊 Breaks Down Large Documents: Splits files into easy-to-understand sections for better results
• 🏷️ Organise with Tags: Add labels or notes to keep documents easy to find
• 🔄 Upload in Bulk: Bring in entire folders of files at once
• 🚀 Quick to Start: Run instantly without complicated setup
🚀 Quick Start (For Non-Technical Users)
Don't worry if you're not technical! This guide will walk you through everything step-by-step using your computer's normal file manager and text editor.
What You'll Need:
- ✅ A computer (Mac or Windows)
- ✅ Claude Desktop installed
- ✅ About 10 minutes to set up
What We'll Do:
- 📁 Create two special folders on your computer
- ⚙️ Tell Claude Desktop where to find these folders
- 🔄 Restart Claude Desktop
- 🎉 Start asking questions about your documents!
Ready? Let's start! 👇
Support
- Help with installation: email hello@biznezstack.com
1. Create Required Folders
You MUST create these folders before starting Claude Desktop. Think of them as special folders where AskMyDoc will store your documents and search data.
🖱️ Easy Way: Using Your Computer's File Manager
On Mac:
- Open Finder (the folder icon in your dock)
- Click on your username in the sidebar (it's usually at the top)
- Right-click in an empty space and select "New Folder"
- Name it exactly: knowledge-storage
- Create another folder and name it exactly: knowledge-chroma
On Windows:
- Open File Explorer (the folder icon on your taskbar)
- Click on "This PC" in the sidebar
- Double-click on "Users" then your username folder
- Right-click in an empty space and select "New" → "Folder"
- Name it exactly: knowledge-storage
- Create another folder and name it exactly: knowledge-chroma
📍 How to Find Your Folder Paths
On Mac:
- Open Finder
- Click on your username folder
- Right-click on the knowledge-storage folder
- Hold the Option key and select "Copy [folder name] as Pathname"
- Paste it somewhere to see the full path (it will look like /Users/YourName/knowledge-storage)
On Windows:
- Open File Explorer
- Navigate to your username folder (usually C:\Users\YourName\)
- Right-click on the knowledge-storage folder
- Select "Copy as path"
- Paste it somewhere to see the full path (it will look like C:\Users\YourName\knowledge-storage)
💻 Alternative: Using Terminal/Command Prompt (For Advanced Users)
If you're comfortable with command line tools:
Mac/Linux Terminal:
# Create the folders
mkdir -p ~/knowledge-storage
mkdir -p ~/knowledge-chroma
# Get the full paths
echo "$HOME/knowledge-storage"
echo "$HOME/knowledge-chroma"
Windows Command Prompt:
# Create the folders
mkdir %USERPROFILE%\knowledge-storage
mkdir %USERPROFILE%\knowledge-chroma
# Get the full paths
echo %USERPROFILE%\knowledge-storage
echo %USERPROFILE%\knowledge-chroma
2. Configure Claude Desktop
This step tells Claude Desktop where to find AskMyDoc and where to store your documents.
📁 Step 1: Find Your Configuration File
On Mac:
- Open Finder
- Press Cmd + Shift + G (Go to Folder)
- Type: ~/Library/Application Support/Claude/
- Press Enter
- Look for a file called claude_desktop_config.json
On Windows:
- Open File Explorer
- In the address bar, type: %APPDATA%\Claude\
- Press Enter
- Look for a file called claude_desktop_config.json
✏️ Step 2: Edit the Configuration File
- Right-click on claude_desktop_config.json
- Select "Open with" → "TextEdit" (Mac) or "Notepad" (Windows)
- Replace everything in the file with this code:
{
"mcpServers": {
"knowledge_mgmt": {
"command": "npx",
"args": ["-y", "knowledge-mgmt-mcp"],
"env": {
"STORAGE_DIR": "REPLACE_WITH_YOUR_STORAGE_PATH",
"CHROMA_DB_DIR": "REPLACE_WITH_YOUR_CHROMA_PATH",
"EMBEDDING_PROVIDER": "transformers",
"EMBEDDING_MODEL": "Xenova/all-MiniLM-L6-v2",
"CHUNK_SIZE": "1000",
"CHUNK_OVERLAP": "200",
"CHUNKING_STRATEGY": "sentence"
}
}
}
}
🔄 Step 3: Replace the Paths
You need to replace the placeholder text with your actual folder paths:
- Find REPLACE_WITH_YOUR_STORAGE_PATH in the file
- Replace it with the path you copied earlier (like /Users/YourName/knowledge-storage)
- Find REPLACE_WITH_YOUR_CHROMA_PATH in the file
- Replace it with the second path you copied (like /Users/YourName/knowledge-chroma)
📝 Real Examples
If your name is "Sarah" on Mac:
- Replace REPLACE_WITH_YOUR_STORAGE_PATH with: /Users/Sarah/knowledge-storage
- Replace REPLACE_WITH_YOUR_CHROMA_PATH with: /Users/Sarah/knowledge-chroma
If your name is "Mike" on Windows:
- Replace REPLACE_WITH_YOUR_STORAGE_PATH with: C:\Users\Mike\knowledge-storage
- Replace REPLACE_WITH_YOUR_CHROMA_PATH with: C:\Users\Mike\knowledge-chroma
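One extra note for Windows: JSON treats the backslash as an escape character, so each backslash in a Windows path must be doubled inside the config file. A sketch of what Mike's finished config would look like (same settings as the template above; the paths are illustrative):
{
  "mcpServers": {
    "knowledge_mgmt": {
      "command": "npx",
      "args": ["-y", "knowledge-mgmt-mcp"],
      "env": {
        "STORAGE_DIR": "C:\\Users\\Mike\\knowledge-storage",
        "CHROMA_DB_DIR": "C:\\Users\\Mike\\knowledge-chroma",
        "EMBEDDING_PROVIDER": "transformers",
        "EMBEDDING_MODEL": "Xenova/all-MiniLM-L6-v2",
        "CHUNK_SIZE": "1000",
        "CHUNK_OVERLAP": "200",
        "CHUNKING_STRATEGY": "sentence"
      }
    }
  }
}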
💾 Step 4: Save the File
- Press Cmd + S (Mac) or Ctrl + S (Windows)
- Close the text editor
📋 What These Folders Do
- STORAGE_DIR: This is where AskMyDoc keeps copies of your documents and their text
- CHROMA_DB_DIR: This is where AskMyDoc stores the search index (like a library catalog)
3. Restart Claude Desktop
Quit Claude Desktop completely and open it again. The server will automatically download and start the first time Claude needs it.
4. Start Using It
Ask Claude to help you:
- "Ingest this PDF document for me"
- "Search my documents for information about X"
- "List all my ingested documents"
- "What are the statistics of my knowledge base?"
Configuration Options
Environment Variables
| Variable | Description | Default | Options |
|---|---|---|---|
| STORAGE_DIR | Directory for document storage | ~/.knowledge-mgmt-mcp/storage | Any valid path |
| CHROMA_DB_DIR | ChromaDB database directory | ~/.knowledge-mgmt-mcp/chroma_db | Any valid path |
| EMBEDDING_PROVIDER | Embedding provider to use | transformers | transformers, openai, cohere |
| EMBEDDING_MODEL | Model to use for embeddings | Xenova/all-MiniLM-L6-v2 | See below |
| OPENAI_API_KEY | OpenAI API key (if using OpenAI) | - | Your API key |
| COHERE_API_KEY | Cohere API key (if using Cohere) | - | Your API key |
| CHUNK_SIZE | Maximum chunk size in characters | 1000 | 100-10000 |
| CHUNK_OVERLAP | Overlap between chunks | 200 | 0-500 |
| CHUNKING_STRATEGY | Chunking method | sentence | sentence, paragraph, fixed |
| MAX_FILE_SIZE | Max file size in bytes | 104857600 (100 MB) | Any number |
| ALLOWED_FILE_TYPES | Comma-separated file types | pdf,docx,txt,md,csv,json,html | Any subset |
| LOG_LEVEL | Logging verbosity | INFO | DEBUG, INFO, WARN, ERROR |
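For reference, a fuller env block that sets most of these variables at once might look like the sketch below (values are illustrative; variables you omit fall back to the defaults in the table):
{
  "env": {
    "STORAGE_DIR": "/Users/YourName/knowledge-storage",
    "CHROMA_DB_DIR": "/Users/YourName/knowledge-chroma",
    "EMBEDDING_PROVIDER": "transformers",
    "EMBEDDING_MODEL": "Xenova/all-MiniLM-L6-v2",
    "CHUNK_SIZE": "800",
    "CHUNK_OVERLAP": "150",
    "CHUNKING_STRATEGY": "sentence",
    "MAX_FILE_SIZE": "52428800",
    "ALLOWED_FILE_TYPES": "pdf,docx,txt,md",
    "LOG_LEVEL": "INFO"
  }
}
Here MAX_FILE_SIZE is set to 52428800 bytes (50 MB) as an example of tightening the default limit.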
Embedding Provider Options
Transformers.js (Local - No API Key Required)
{
"EMBEDDING_PROVIDER": "transformers",
"EMBEDDING_MODEL": "Xenova/all-MiniLM-L6-v2"
}
Available Models:
- Xenova/all-MiniLM-L6-v2 (384 dimensions, fast)
- Xenova/all-mpnet-base-v2 (768 dimensions, more accurate)
OpenAI
{
"EMBEDDING_PROVIDER": "openai",
"EMBEDDING_MODEL": "text-embedding-3-small",
"OPENAI_API_KEY": "sk-..."
}
Available Models:
- text-embedding-3-small (1536 dimensions)
- text-embedding-3-large (3072 dimensions)
- text-embedding-ada-002 (1536 dimensions)
Cohere
{
"EMBEDDING_PROVIDER": "cohere",
"EMBEDDING_MODEL": "embed-english-v3.0",
"COHERE_API_KEY": "..."
}
Available Models:
- embed-english-v3.0 (1024 dimensions)
- embed-multilingual-v3.0 (1024 dimensions)
Available Tools
1. ingest_document
Ingest a document from file path or raw text.
Parameters:
- file_path (string, optional): Path to file (mutually exclusive with text_content)
- text_content (string, optional): Raw text content (mutually exclusive with file_path)
- metadata (object, optional): Custom metadata key-value pairs
- tags (array, optional): Tags for categorization
Example:
{
"file_path": "/path/to/document.pdf",
"tags": ["research", "2024"],
"metadata": {
"author": "John Doe",
"project": "AI Research"
}
}
Returns:
{
"documentId": "doc_1234567890_abc123",
"chunksCreated": 15,
"status": "success",
"message": "Document ingested successfully with 15 chunks"
}
2. search_knowledge
Search the knowledge base semantically.
Parameters:
- query (string, required): Search query
- max_results (number, optional): Maximum results (default: 10)
- similarity_threshold (number, optional): Minimum similarity 0-1 (default: 0.0)
- filter_metadata (object, optional): Filter by metadata
- filter_tags (array, optional): Filter by tags
Example:
{
"query": "What are the key findings about machine learning?",
"max_results": 5,
"similarity_threshold": 0.7,
"filter_tags": ["research"]
}
Returns:
{
"results": [
{
"content": "Machine learning models showed 95% accuracy...",
"source": "research_paper.pdf",
"score": 0.89,
"metadata": {
"document_id": "doc_1234567890_abc123",
"file_type": "pdf",
"tags": ["research", "2024"]
},
"chunk_index": 3
}
],
"total_results": 5
}
3. list_documents
List all ingested documents.
Parameters:
- tags (array, optional): Filter by tags
- limit (number, optional): Max documents (default: 50)
- offset (number, optional): Skip documents (default: 0)
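Example (a minimal sketch based on the parameters above; the tag and paging values are illustrative):
{
  "tags": ["research"],
  "limit": 20,
  "offset": 0
}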
Returns:
{
"documents": [
{
"id": "doc_1234567890_abc123",
"filename": "research_paper.pdf",
"file_type": "pdf",
"tags": ["research", "2024"],
"chunks_count": 15,
"created_at": "2024-01-15T10:30:00Z",
"updated_at": "2024-01-15T10:30:00Z"
}
],
"total": 1
}
4. get_document
Retrieve full document content and chunks.
Parameters:
- document_id (string, required): Document ID
Returns:
{
"id": "doc_1234567890_abc123",
"content": "Full document text...",
"filename": "research_paper.pdf",
"file_type": "pdf",
"tags": ["research"],
"chunks": [
{
"index": 0,
"content": "First chunk content...",
"start_char": 0,
"end_char": 1000
}
],
"created_at": "2024-01-15T10:30:00Z",
"updated_at": "2024-01-15T10:30:00Z"
}
5. delete_document
Delete a document and its chunks.
Parameters:
- document_id (string, required): Document ID
Returns:
{
"success": true,
"message": "Document doc_1234567890_abc123 deleted successfully"
}
6. update_document_metadata
Update document metadata and tags.
Parameters:
- document_id (string, required): Document ID
- metadata (object, optional): Custom metadata to update
- tags (array, optional): Replace existing tags
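Example (a sketch following the parameter shapes above; the document ID, metadata keys, and tags are illustrative):
{
  "document_id": "doc_1234567890_abc123",
  "metadata": {
    "status": "reviewed",
    "project": "AI Research"
  },
  "tags": ["research", "archived"]
}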
Returns:
{
"success": true,
"message": "Document metadata updated successfully"
}
7. get_collection_stats
Get knowledge base statistics.
Returns:
{
"total_documents": 10,
"total_chunks": 150,
"collection_size": 150,
"average_chunks_per_document": 15,
"file_types": {
"pdf": 5,
"docx": 3,
"txt": 2
}
}
8. batch_ingest
Ingest multiple documents from a directory.
Parameters:
- directory_path (string, required): Directory path
- file_patterns (array, optional): Patterns like ["*.pdf", "*.txt"] (default: ["*"])
- recursive (boolean, optional): Search subdirectories (default: true)
Example:
{
"directory_path": "/path/to/documents",
"file_patterns": ["*.pdf", "*.docx"],
"recursive": true
}
Returns:
{
"totalFiles": 10,
"successCount": 9,
"failedCount": 1,
"documents": [...],
"errors": [
{
"file": "/path/to/corrupt.pdf",
"error": "PDF processing failed"
}
]
}
Chunking Strategies
Sentence-Aware (Default)
Splits text at sentence boundaries, respecting natural language structure.
Best for: General documents, articles, research papers
Pros: Maintains semantic coherence
Cons: Variable chunk sizes
Paragraph-Aware
Splits text at paragraph boundaries (double newlines).
Best for: Documents with clear paragraph structure
Pros: Preserves document structure
Cons: May create very large or small chunks
Fixed-Size
Splits text into fixed-size chunks with overlap.
Best for: Uniform processing, technical documents
Pros: Predictable chunk sizes
Cons: May split mid-sentence
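To switch strategies, adjust the chunking-related environment variables in your Claude Desktop config. A sketch for fixed-size chunking with a modest overlap (the specific values are illustrative):
{
  "CHUNKING_STRATEGY": "fixed",
  "CHUNK_SIZE": "800",
  "CHUNK_OVERLAP": "100"
}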
Supported File Formats
| Format | Extension | Description |
|---|---|---|
| PDF | .pdf | Portable Document Format |
| Word | .docx | Microsoft Word documents |
| Text | .txt | Plain text files |
| Markdown | .md | Markdown formatted text |
| CSV | .csv | Comma-separated values |
| JSON | .json | JSON data files |
| HTML | .html, .htm | Web pages |
Troubleshooting
📁 Folder Not Found Errors
Problem: You see an error like "no such file or directory" or "folder not found"
Solution: The folders don't exist yet. Go back to Step 1 and create them using your computer's file manager:
On Mac:
- Open Finder
- Click on your username in the sidebar
- Right-click and create a new folder called knowledge-storage
- Create another folder called knowledge-chroma
On Windows:
- Open File Explorer
- Go to This PC → Users → your username
- Right-click and create a new folder called knowledge-storage
- Create another folder called knowledge-chroma
🔒 Permission Denied Errors
Problem: You see "Permission denied" or "Access denied" errors
Solution: The folders might be locked. Try this:
On Mac:
- Right-click on the knowledge-storage folder
- Select "Get Info"
- Make sure "Read & Write" is selected for your user
- Do the same for knowledge-chroma
On Windows:
- Right-click on the knowledge-storage folder
- Select "Properties"
- Go to the "Security" tab
- Make sure your user has "Full control"
- Do the same for knowledge-chroma
Server Not Starting
- Check logs: Set LOG_LEVEL=DEBUG to see detailed logs
- Verify paths: Ensure STORAGE_DIR and CHROMA_DB_DIR are writable
- Check Node version: Requires Node.js 18+
node --version # Should be 18.0.0 or higher
Embedding Generation Slow
Using Transformers.js? First run downloads the model (~100MB). Subsequent runs are fast.
Solution: Use a smaller model like Xenova/all-MiniLM-L6-v2 or switch to OpenAI/Cohere for faster processing.
Out of Memory
Large documents? Reduce CHUNK_SIZE or process files individually instead of batch ingestion.
{
"CHUNK_SIZE": "500",
"MAX_FILE_SIZE": "10485760"
}
ChromaDB Connection Error
Solution: Ensure CHROMA_DB_DIR exists and is writable:
mkdir -p ~/knowledge-chroma
chmod 755 ~/knowledge-chroma
File Processing Failures
PDF extraction issues? Some PDFs are scanned images. Use OCR preprocessing before ingestion.
DOCX errors? Ensure file isn't corrupted. Try opening in Word first.
Development
Local Setup
# Clone repository
git clone https://github.com/yourusername/knowledge-mgmt-mcp.git
cd knowledge-mgmt-mcp
# Install dependencies
npm install
# Build
npm run build
# Test locally
npm link
Testing with Claude Desktop
{
"mcpServers": {
"knowledge_mgmt_dev": {
"command": "node",
"args": ["/absolute/path/to/knowledge-mgmt-mcp/dist/index.js"],
"env": {
"LOG_LEVEL": "DEBUG",
...
}
}
}
}
Performance Tips
- Choose the right embedding provider:
  - Local (Transformers.js): Free, private, slower first run
  - OpenAI: Fast, costs $0.0001/1K tokens
  - Cohere: Fast, costs $0.0001/1K tokens
- Optimize chunk size:
  - Smaller chunks (500-800): Better for specific searches
  - Larger chunks (1000-1500): Better for context
- Use the appropriate chunking strategy:
  - Sentence: Most documents
  - Paragraph: Long-form content
  - Fixed: Technical/structured data
- Batch operations:
  - Use batch_ingest for multiple files
  - Process directories rather than individual files
Security Considerations
- File path validation: Prevents directory traversal attacks
- Input sanitization: All user inputs are validated
- API key security: Never log or expose API keys
- File size limits: Configurable max file size prevents abuse
- Rate limiting: Consider implementing rate limiting for production
License
MIT License - see LICENSE file for details
Contributing
Contributions welcome! Please open an issue or submit a pull request.
Support
- Help with configuration: email hello@biznezstack.com
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Changelog
v1.0.0 (Initial Release)
- Multi-format document ingestion
- Semantic search with ChromaDB
- Multiple embedding providers
- Intelligent chunking strategies
- Complete MCP tool implementation
- Comprehensive error handling
- Full documentation