patrickkarle/loads-mcp-server
LOADS MCP Server is an advanced document search solution designed to optimize the efficiency of large language models (LLMs) by intelligently searching documents and reducing token usage.
LOADS MCP Server
LLM-Optimized Adaptive Document Search - The intelligent document search solution that makes LLMs 26x more efficient.
The Problem
When LLMs need to search a 100-page document, they typically consume the entire file - wasting tokens, context window, and money:
📄 100-page PDF (48,207 tokens)
↓ Traditional approach: Read everything
💰 Cost: 48,207 tokens = $$$
⏱️ Time: Multiple seconds
🧠 Context: Window filled with mostly irrelevant content
The Solution
LOADS uses smart search with Bloom filters and TF-IDF scoring to find exactly what you need:
📄 100-page PDF (48,207 tokens)
↓ LOADS: Smart search for "security vulnerabilities"
🎯 Result: 3 relevant sections (1,851 tokens)
💰 Cost: 1,851 tokens = 26x cheaper
⏱️ Time: < 100ms
🧠 Context: Laser-focused on relevant content
26x more efficient. Get the exact sections you need, every time.
Why LOADS?
🚀 Efficiency
- 26x token reduction on average documents
- Up to 100x on very large files
- Mathematically proven optimal threshold algorithm
🎯 Accuracy
- Bloom filter pre-filtering - O(1) elimination of irrelevant sections
- TF-IDF relevance scoring - Smart ranking by importance
- Token budget awareness - Fits perfectly within context limits
🔧 Easy to Use
- Natural language queries - Just ask Claude to search a file
- Drag-and-drop installation - MCPB format for Claude Desktop
- Works everywhere - Claude Desktop, Claude Code, any MCP client
📚 Multi-Format Support
- ✅ Markdown (.md)
- ✅ PDF (.pdf)
- ✅ Word Documents (.docx)
- ✅ HTML (.html, .htm)
- ✅ Plain Text (.txt)
Quick Start (5 Minutes)
Claude Desktop (Easiest)
- Download loads-mcp-server.mcpb from Releases
- Drag-and-drop onto Claude Desktop window
- Confirm installation when prompted
- Done! Try it:
In C:/Users/me/Documents/report.pdf find everything about quarterly revenue
That's it! No configuration, no command line, just drag-and-drop.
Claude Code (5 minutes)
- Install dependencies: npm install
- Add to Claude Code settings:
{
  "mcpServers": {
    "loads": {
      "command": "node",
      "args": ["/absolute/path/to/loads-mcp-server/server/index.js"]
    }
  }
}
- Restart Claude Code
See the repository docs for the complete guide and advanced usage.
CLI / Custom Clients
# Install dependencies
npm install
# Test the server
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | node server/index.js
See the repository docs for advanced usage, automation, and API documentation.
Usage Examples
Natural Language (Recommended)
Just ask Claude naturally:
Search C:/docs/api-spec.md for authentication methods
What does ~/reports/annual-2024.pdf say about revenue growth?
Find security vulnerabilities in /path/to/codebase-docs/
Claude automatically uses LOADS tools based on your query.
Directory Operations
Search entire folders recursively:
Search all PDFs in C:/research/papers for "machine learning"
Scan ~/Documents/contracts and search for "termination clause"
Explicit Tool Usage
For precise control:
Use loads_search on "/path/to/document.pdf" with query "error handling" and maxSections 5
List sections in ~/thesis.md then read section 3.2
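For clients that speak JSON-RPC directly, the explicit loads_search call above might be encoded roughly as follows. The tools/call method and the query and maxSections names appear in this README; the filePath argument name is an assumption, so check the schema returned by tools/list before relying on it.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request for loads_search.
// NOTE: "filePath" is a hypothetical argument name; confirm it against
// the input schema that tools/list reports for this server.
function buildSearchRequest(id, filePath, query, maxSections) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "loads_search",
      arguments: { filePath, query, maxSections },
    },
  };
}

// Serialize as a single line so it can be piped to the server over stdio.
const request = buildSearchRequest(2, "/path/to/document.pdf", "error handling", 5);
console.log(JSON.stringify(request));
```

The printed line can be piped to node server/index.js in the same way as the tools/list examples elsewhere in this README.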
Available Tools
| Tool | Purpose | Example |
|---|---|---|
| loads_search | Smart search with relevance scoring | Find "authentication" in api-spec.md |
| list_document_sections | Get document structure (TOC) | Show headings in thesis.pdf |
| read_section | Read specific section by ID | Read section-42 from document |
| read_lines | Read specific line ranges | Show lines 100-200 |
| search_content | Full-text regex search | Find all TODO comments |
| scan_directory | List all documents in folder | Show all PDFs in ~/reports |
| search_directory | Search across all files | Find "revenue" in all docs |
How It Works
Two-Barrier Defense Pattern
LOADS uses a two-stage efficiency strategy:
┌─────────────────────────────────────────┐
│     Document arrives (unknown size)     │
└────────────────┬────────────────────────┘
                 │
         ┌───────▼───────┐
         │  BARRIER 1:   │
         │  Pre-Parse    │  ← Check file size BEFORE parsing
         │  Passthrough  │    If < 1.5KB → Return whole doc (zero overhead!)
         └───────┬───────┘
                 │ >= 1.5KB
         ┌───────▼───────┐
         │  Parse Doc    │  ← Parse into sections
         │  to Sections  │
         └───────┬───────┘
                 │
         ┌───────▼───────┐
         │  BARRIER 2:   │
         │  Smart        │  ← Compare costs: full doc vs filtered
         │  Threshold    │    Return whichever is cheaper
         └───────┬───────┘
                 │
         ┌───────▼───────┐
         │  Bloom        │  ← If filtering, use O(1) elimination
         │  Filters      │
         └───────┬───────┘
                 │
         ┌───────▼───────┐
         │  TF-IDF       │  ← Score & rank by relevance
         │  Scoring      │
         └───────┬───────┘
                 │
         ┌───────▼───────┐
         │  Budget       │  ← Fit within token limit
         │  Manager      │
         └───────┬───────┘
                 │
         ┌───────▼───────┐
         │  Return Best  │  ← Optimal result (always!)
         │  Result       │
         └───────────────┘
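The two barriers in the diagram can be sketched as a small decision function. This is an illustration only, not the server's code: the constant values are copied from THRESHOLD_CONSTANTS in the Technical Details section, while the chooseStrategy helper, its cost model, and the use of TYPICAL_CROSSOVER_BYTES as the 1.5KB passthrough limit are assumptions.

```javascript
// Values copied from THRESHOLD_CONSTANTS in this README.
const C = {
  RESPONSE_BASE_OVERHEAD: 130,
  PER_HIT_OVERHEAD: 25,
  FULL_DOC_WRAPPER: 10,
  BYTES_PER_TOKEN: 4,
  TYPICAL_CROSSOVER_BYTES: 1520,
};

// Hypothetical helper: decide what to return for one query.
// matchedSectionTokens holds the token counts of sections that survived filtering.
function chooseStrategy(fileBytes, matchedSectionTokens) {
  // Barrier 1: tiny files are returned whole before any parsing happens.
  if (fileBytes < C.TYPICAL_CROSSOVER_BYTES) return "passthrough";

  // Barrier 2: estimate the cost of both responses (assumed cost model)
  // and return whichever is cheaper.
  const fullCost = Math.ceil(fileBytes / C.BYTES_PER_TOKEN) + C.FULL_DOC_WRAPPER;
  const contentTokens = matchedSectionTokens.reduce((sum, t) => sum + t, 0);
  const filteredCost =
    C.RESPONSE_BASE_OVERHEAD +
    matchedSectionTokens.length * C.PER_HIT_OVERHEAD +
    contentTokens;
  return filteredCost < fullCost ? "filtered" : "full";
}

console.log(chooseStrategy(1000, []));                // passthrough
console.log(chooseStrategy(192828, [600, 650, 601])); // filtered
console.log(chooseStrategy(2000, [300, 250]));        // full
```

The last call shows why the second barrier matters: for a 2,000-byte file, the response overhead makes two matched sections more expensive than the whole document.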
Why This is Optimal
Theoretical Proof: The only way to improve would be to require zero tokens to access documents. Since that's impossible, LOADS represents the theoretical optimum given current system constraints.
Mathematical Validation: The test suite includes 37 proof tests that validate every numerical claim.
Performance
Expected Performance
| Document Size | Search Time | Efficiency Gain |
|---|---|---|
| < 100KB | < 100ms | 5-10x |
| 100KB - 1MB | 100-500ms | 10-26x |
| 1MB - 10MB | 500ms - 2s | 26-50x |
| 10MB - 50MB | 2-5s | 50-100x |
Real-World Example
Document: 100-page technical manual (48,207 tokens)
Query: "error handling procedures"
Traditional approach:
- Read entire document: 48,207 tokens
- Cost: 100% of document
- Time: 3-5 seconds
LOADS approach:
- Pre-parse check: < 1ms (file too large for passthrough)
- Bloom filter: 15ms (eliminate 92% of sections)
- TF-IDF scoring: 8ms (rank remaining sections)
- Budget fitting: 2ms (optimize for context)
- Result: 3 sections (1,851 tokens)
- Cost: 3.8% of document (26x reduction!)
- Time: 89ms total
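The headline figures in this example can be rechecked with a few lines of arithmetic. The token counts and timings come from the lists above; treating "< 1ms" as 1 ms is an assumption made only to get an upper bound.

```javascript
// Token counts from the example above.
const fullTokens = 48207;
const resultTokens = 1851;

const reduction = fullTokens / resultTokens;     // how many times fewer tokens
const share = (resultTokens / fullTokens) * 100; // percent of the document returned

// Listed stage timings in ms ("< 1ms" counted as 1, an upper bound).
const listedStagesMs = 1 + 15 + 8 + 2;
const unlistedMs = 89 - listedStagesMs; // time not itemized in the list

console.log(reduction.toFixed(1) + "x"); // 26.0x
console.log(share.toFixed(1) + "%");     // 3.8%
console.log(unlistedMs + " ms not itemized");
```

The itemized stages account for about 26 ms of the 89 ms total; the README does not say where the rest goes, though parsing and file I/O are the likely candidates.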
Installation
Option 1: Claude Desktop (MCPB - Recommended)
Drag-and-drop installation in 30 seconds:
- Download loads-mcp-server.mcpb from Releases
- Drag file onto Claude Desktop window
- Confirm installation
- Done!
See the repository docs for complete tool documentation.
Option 2: Claude Code (VS Code Extension)
Manual JSON configuration:
Edit your VS Code settings (.claude/mcp.json):
{
  "mcpServers": {
    "loads": {
      "command": "node",
      "args": ["/absolute/path/to/loads-mcp-server/server/index.js"]
    }
  }
}
See the repository docs for the complete setup guide and usage examples.
Option 3: CLI / Custom Clients
For developers and automation:
# Clone repository
git clone https://github.com/patrickkarle/loads-mcp-server.git
cd loads-mcp-server
# Install dependencies
npm install
# Test installation
npm test
# Use via JSON-RPC 2.0 over stdio
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | node server/index.js
See the repository docs for complete API documentation and advanced usage patterns.
Documentation
Core Documentation
- USER_GUIDE.md - Complete tool reference, usage patterns, and examples
- TESTING.md - Test suite documentation (94 assertions, 100% pass rate)
- OODA_LOOP_LESSONS_LEARNED.md - Development insights and lessons learned
Troubleshooting
Quick Health Check
# Test server installation
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | node server/index.js
# Should output JSON with 7 tools listed
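The same health check can be scripted from Node. The sketch below assumes the server answers tools/list with one line of JSON on stdout without any prior handshake, as the echo command above suggests; checkServer and toolNames are illustrative names, not part of LOADS.

```javascript
// Parse one line of server output and return the advertised tool names.
function toolNames(responseLine) {
  const message = JSON.parse(responseLine);
  if (message.error) throw new Error(message.error.message);
  return message.result.tools.map((tool) => tool.name);
}

// Spawn the server, send tools/list, and resolve with the tool names.
// serverPath is the same path you would put in your client config.
async function checkServer(serverPath) {
  const { spawn } = await import("node:child_process");
  const child = spawn("node", [serverPath], { stdio: ["pipe", "pipe", "inherit"] });
  child.stdout.setEncoding("utf8");

  const request = { jsonrpc: "2.0", id: 1, method: "tools/list", params: {} };
  child.stdin.write(JSON.stringify(request) + "\n");

  return new Promise((resolve, reject) => {
    let buffer = "";
    child.on("error", reject);
    child.stdout.on("data", (chunk) => {
      buffer += chunk;
      const newline = buffer.indexOf("\n");
      if (newline === -1) return; // wait until a complete line has arrived
      child.kill();
      try {
        resolve(toolNames(buffer.slice(0, newline)));
      } catch (err) {
        reject(err);
      }
    });
  });
}
```

Calling checkServer("server/index.js").then((names) => console.log(names.length)) from the repository root should print 7 if the installation matches this README.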
Common Issues
"Server not showing up"
- Check absolute paths in config (not relative)
- Restart Claude Desktop/Code completely
- Verify Node.js installed: node --version
"File not found"
- Use absolute paths: /Users/patrick/docs/file.pdf ✅
- Not relative: ./docs/file.pdf ❌
- Windows: Use forward slashes C:/path/file.pdf or escaped backslashes C:\\path\\file.pdf
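If paths are built programmatically, a small pre-check along these lines can catch the cases listed above before a request is sent. normalizeDocPath is a hypothetical helper, not something LOADS provides, and it does not expand ~ or resolve relative paths.

```javascript
// Convert Windows backslashes to forward slashes and report whether the
// result is absolute (POSIX "/..." or Windows drive-letter "C:/...").
function normalizeDocPath(input) {
  const normalized = input.replace(/\\/g, "/");
  const isAbsolute =
    normalized.startsWith("/") || /^[A-Za-z]:\//.test(normalized);
  return { normalized, isAbsolute };
}

console.log(normalizeDocPath("C:\\path\\file.pdf")); // normalized: C:/path/file.pdf, isAbsolute: true
console.log(normalizeDocPath("./docs/file.pdf"));    // isAbsolute: false
```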
"Search returns no results"
- Check if PDF is scanned (image-based) - LOADS can't OCR
- Try broader query terms
- Use search_content with regex for unstructured docs
See USER_GUIDE.md for the complete troubleshooting guide and platform-specific solutions.
MCP Compatibility
LOADS implements the Model Context Protocol specification, making it compatible with any MCP client:
- ✅ Claude Desktop (native MCPB support)
- ✅ Claude Code (VS Code extension)
- ✅ Custom Clients (via JSON-RPC 2.0 over stdio)
- ✅ Future MCP Clients (protocol-compliant)
Protocol Details
- Transport: stdio (standard input/output)
- Format: JSON-RPC 2.0
- Methods: tools/list, tools/call
- Version: MCP v1.0
Development
Prerequisites
- Node.js >= 18.0.0
- npm >= 8.0.0
Setup
# Clone repository
git clone https://github.com/patrickkarle/loads-mcp-server.git
cd loads-mcp-server
# Install dependencies
npm install
# Run tests
npm test
# Expected output: 94 passing tests across 5 test suites
Test Suite
LOADS has comprehensive test coverage:
- Unit Tests: Core algorithms (Bloom filters, TF-IDF, token estimation)
- Integration Tests: End-to-end tool execution
- Proof Tests: Mathematical validation of efficiency claims
- Pre-Parse Tests: Two-barrier defense verification
- Performance Tests: Benchmark validation
100% pass rate - 94 assertions across 5 test suites.
See TESTING.md for detailed test documentation.
Project Structure
loads-mcp-server/
├── server/
│   ├── index.js # Main MCP server entry point
│   ├── tools/ # Tool implementations
│   │   ├── loads_search.js # Smart search
│   │   ├── list_document_sections.js
│   │   ├── read_section.js
│   │   ├── read_lines.js
│   │   ├── search_content.js
│   │   ├── scan_directory.js
│   │   └── search_directory.js
│   ├── parsers/ # Document parsers
│   │   ├── markdown_parser.js
│   │   ├── pdf_parser.js
│   │   ├── docx_parser.js
│   │   ├── html_parser.js
│   │   └── txt_parser.js
│   ├── search/ # Search algorithms
│   │   ├── bloom_filter.js
│   │   ├── tfidf_scorer.js
│   │   └── budget_manager.js
│   └── utils/
│       ├── token_estimator.js # THRESHOLD_CONSTANTS
│       └── response_guard.js
├── tests/ # Test suites
│   ├── loads_search.test.js
│   ├── pre_parse_passthrough.test.js
│   ├── proof_tests.test.js
│   ├── threshold_constants.test.js
│   └── unified_tools.test.js
├── README.md # This file
├── TESTING.md # Test documentation
├── OODA_LOOP_LESSONS_LEARNED.md # Development insights
└── package.json
Contributing
Contributions welcome! Here's how:
Reporting Issues
- Check existing issues
- Use the bug report template below
- Include:
  - Platform (Windows/macOS/Linux)
  - Node.js version
  - Error logs with DEBUG=mcp:*
  - Steps to reproduce
Suggesting Features
- Open a GitHub Discussion
- Describe your use case
- Explain desired behavior
- Consider implementation approach
Pull Requests
- Fork the repository
- Create feature branch: git checkout -b feature/amazing-feature
- Make changes with tests
- Ensure all tests pass: npm test
- Commit with clear message: git commit -m "Add amazing feature"
- Push to branch: git push origin feature/amazing-feature
- Open Pull Request with description
Development Guidelines
- Add tests for new features (maintain 100% pass rate)
- Update documentation (README, tool guides, troubleshooting)
- Follow existing code style (2-space indent, ES6+)
- Validate mathematical claims (add proof tests if changing algorithms)
- Test on multiple platforms (Windows, macOS, Linux if possible)
Roadmap
Planned Features
- NPM Package - npx loads-mcp-server for zero-config installation
- Caching Layer - Remember previously parsed documents for instant re-queries
- OCR Support - Extract text from scanned PDFs using Tesseract
- More File Formats - RTF, ODT, EPUB support
- Streaming Results - Progressive result delivery for very large documents
- Multi-Query Optimization - Batch multiple searches for same document
- Custom Tokenizers - Support for non-GPT tokenization schemes
Community Requests
Have an idea? Open a Discussion or vote on existing feature requests!
FAQ
Q: How does LOADS achieve 26x efficiency?
A: Through the two-barrier defense pattern: (1) a pre-parse passthrough returns tiny documents whole with zero overhead, and (2) a smart threshold compares the cost of the full document against the filtered result and always returns the cheaper option. This is mathematically optimal given system constraints.
Q: Does LOADS work offline?
A: Yes! After npm install, all processing is local. No internet required (except for initial dependency installation).
Q: Can LOADS search password-protected PDFs?
A: No. Decrypt PDFs first using qpdf or similar tools.
Q: Does LOADS support OCR for scanned PDFs?
A: Not yet. LOADS needs extractable text, which scanned (image-based) PDFs do not contain. OCR support is planned for a future release.
Q: What's the largest document LOADS can handle?
A: PDFs up to 50MB have been tested successfully. The practical limit is system memory.
Q: Can I use LOADS with other LLMs besides Claude?
A: Yes! Any MCP-compatible client works. LOADS is Claude-agnostic.
Q: Is LOADS faster than just reading the whole document?
A: For documents > 1.5KB, yes. Up to 26x more efficient for typical documents, and up to 100x for very large files.
Q: Can I customize token budgets?
A: Yes, via the contextBudget parameter in loads_search. See the repository docs for examples.
Q: Does LOADS modify my documents?
A: No. LOADS only reads files. Zero write permissions required.
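To make the contextBudget answer above more concrete, here is one way a token budget could be applied to ranked sections. This is a greedy sketch under stated assumptions: the fitToBudget name, the section shape ({ id, score, tokens }), and the greedy strategy are all hypothetical and may differ from what budget_manager.js actually does.

```javascript
// Pick the highest-scoring sections that fit within contextBudget tokens.
// sections: array of { id, score, tokens } (hypothetical shape).
function fitToBudget(sections, contextBudget) {
  const ranked = [...sections].sort((a, b) => b.score - a.score);
  const chosen = [];
  let used = 0;
  for (const section of ranked) {
    if (used + section.tokens > contextBudget) continue; // would overflow, skip it
    chosen.push(section.id);
    used += section.tokens;
  }
  return { chosen, used };
}

const picked = fitToBudget(
  [
    { id: "section-1", score: 0.9, tokens: 700 },
    { id: "section-2", score: 0.7, tokens: 900 },
    { id: "section-3", score: 0.4, tokens: 300 },
  ],
  1200
);
console.log(picked); // chosen: section-1 and section-3, used: 1000
```

Skipping an oversized section rather than stopping lets a smaller, lower-ranked section still use the remaining budget, as section-3 does here.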
Technical Details
Token Estimation
LOADS uses GPT-4 tokenization approximation: 4 bytes per token (empirically validated).
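A minimal sketch of this approximation, assuming the byte count is taken from the UTF-8 encoding and rounded up (the README states only the 4-bytes-per-token ratio; estimateTokens is an illustrative name):

```javascript
const BYTES_PER_TOKEN = 4; // value stated in this README

// Estimate tokens from the UTF-8 byte length, rounding up (assumed rounding).
function estimateTokens(text) {
  return Math.ceil(Buffer.byteLength(text, "utf8") / BYTES_PER_TOKEN);
}

console.log(estimateTokens("a".repeat(1520))); // 380
console.log(estimateTokens("héllo"));          // 2 (6 bytes, since "é" takes two)
```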
Threshold Constants
All numerical values are defined in THRESHOLD_CONSTANTS (single source of truth):
export const THRESHOLD_CONSTANTS = {
  RESPONSE_BASE_OVERHEAD: 130,   // JSON wrapper overhead
  PER_HIT_OVERHEAD: 25,          // Per-section metadata
  FULL_DOC_WRAPPER: 10,          // Full doc wrapper
  MINIMUM_SEARCH: 180,           // Min search cost
  AVG_SECTION_OVERHEAD: 20,      // Avg section overhead
  BYTES_PER_TOKEN: 4,            // GPT-4 approximation
  MINIMUM_CROSSOVER_BYTES: 800,  // (180+20)*4
  TYPICAL_CROSSOVER_BYTES: 1520, // (180+200)*4
  MINIMUM_CROSSOVER_TOKENS: 200, // 800/4
  TYPICAL_CROSSOVER_TOKENS: 380  // 1520/4
};
See the repository docs for mathematical derivations and proof tests.
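The derived crossover values can be recomputed from the base constants as a quick consistency check. The comments in THRESHOLD_CONSTANTS give the formulas; the name TYPICAL_CONTENT_TOKENS below is an assumption, since the 200 in "(180+200)*4" is not a named constant in the listing.

```javascript
// Base values copied from THRESHOLD_CONSTANTS.
const T = { MINIMUM_SEARCH: 180, AVG_SECTION_OVERHEAD: 20, BYTES_PER_TOKEN: 4 };

// Assumed name for the unnamed 200 in the "(180+200)*4" comment.
const TYPICAL_CONTENT_TOKENS = 200;

const minimumCrossoverBytes =
  (T.MINIMUM_SEARCH + T.AVG_SECTION_OVERHEAD) * T.BYTES_PER_TOKEN;
const typicalCrossoverBytes =
  (T.MINIMUM_SEARCH + TYPICAL_CONTENT_TOKENS) * T.BYTES_PER_TOKEN;

console.log(minimumCrossoverBytes, minimumCrossoverBytes / T.BYTES_PER_TOKEN); // 800 200
console.log(typicalCrossoverBytes, typicalCrossoverBytes / T.BYTES_PER_TOKEN); // 1520 380
```

The typical crossover of 1,520 bytes is where the "1.5KB" passthrough figure used throughout this README comes from.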
Bloom Filter Parameters
- Hash functions: 3 (optimal for typical document sections)
- Bit array size: 1024 bits (balance between false positives and memory)
- False positive rate: ~1% (acceptable for pre-filtering)
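For readers unfamiliar with the technique, here is a minimal Bloom filter using the parameters above (1024 bits, 3 hash functions). It is a sketch, not the contents of bloom_filter.js: the FNV-1a-style hash, the seeding scheme, and the class shape are assumptions. The falsePositiveRate function uses the standard estimate (1 - e^(-kn/m))^k for n inserted terms.

```javascript
const BITS = 1024; // bit array size from this README
const HASHES = 3;  // number of hash functions from this README

// FNV-1a-style 32-bit hash; the seed yields several different hash functions.
function fnv1a(text, seed) {
  let hash = (0x811c9dc5 ^ seed) >>> 0;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

class BloomFilter {
  constructor() {
    this.bits = new Uint8Array(BITS / 8);
  }
  // Bit positions for a term, one per hash function.
  positions(term) {
    return Array.from({ length: HASHES }, (_, seed) => {
      const h = fnv1a(term, seed);
      return (((h >>> 16) ^ h) >>> 0) % BITS; // fold high bits into low bits
    });
  }
  add(term) {
    for (const p of this.positions(term)) this.bits[p >> 3] |= 1 << (p & 7);
  }
  // false means definitely absent; true means possibly present.
  mightContain(term) {
    return this.positions(term).every(
      (p) => (this.bits[p >> 3] & (1 << (p & 7))) !== 0
    );
  }
}

// Standard false positive estimate for n inserted terms.
function falsePositiveRate(n, m = BITS, k = HASHES) {
  return Math.pow(1 - Math.exp((-k * n) / m), k);
}

const filter = new BloomFilter();
["security", "vulnerabilities", "audit"].forEach((t) => filter.add(t));
console.log(filter.mightContain("security"));   // true (no false negatives)
console.log(falsePositiveRate(83).toFixed(3));  // 0.010
console.log(falsePositiveRate(200).toFixed(3)); // 0.087
```

Under this estimate, the ~1% figure corresponds to roughly 80 distinct terms per filter, and the rate climbs to about 9% at 200 terms, so how many terms end up in each filter matters.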
TF-IDF Implementation
- Term Frequency: Log-normalized with smoothing
- Inverse Document Frequency: Standard IDF formula
- Normalization: L2 norm for score comparability
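The three ingredients above can be combined as in the sketch below. It is one plausible reading, not the code in tfidf_scorer.js: the tokenizer, the smoothed IDF variant, and applying the L2 norm across the per-section scores are all assumptions.

```javascript
// Lowercase and split into alphanumeric tokens (assumed tokenizer).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Score each section against the query; returns L2-normalized scores.
function scoreSections(sections, query) {
  const docs = sections.map(tokenize);
  const terms = [...new Set(tokenize(query))];

  const raw = docs.map((tokens) => {
    let score = 0;
    for (const term of terms) {
      const tf = tokens.filter((t) => t === term).length;
      if (tf === 0) continue;
      const df = docs.filter((d) => d.includes(term)).length;
      const idf = Math.log((1 + docs.length) / (1 + df)) + 1; // smoothed IDF (assumed variant)
      score += (1 + Math.log(tf)) * idf; // log-normalized term frequency
    }
    return score;
  });

  // L2-normalize so scores are comparable across queries.
  const norm = Math.sqrt(raw.reduce((sum, s) => sum + s * s, 0)) || 1;
  return raw.map((s) => s / norm);
}

const scores = scoreSections(
  [
    "Error handling uses retries and error codes.",
    "Installation requires Node.js.",
    "Logging errors to disk.",
  ],
  "error handling"
);
console.log(scores); // only the first section scores above zero
```

Note that "errors" in the third section does not match "error" because this sketch does no stemming.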
License
MIT License - See the LICENSE file for details.
In short: Use freely, modify freely, distribute freely. Attribution appreciated but not required.
Author
Patrick Karle - @patrickkarle
Acknowledgments
- Anthropic - For Claude and the Model Context Protocol specification
- MCP Community - For feedback and testing
- Open Source Contributors - For dependency libraries (unpdf, mammoth, turndown)
Support
Get Help
- 📖 Documentation: Start with the Core Documentation list above
- 🐛 Bug Reports: GitHub Issues
- 💬 Questions: GitHub Discussions
- 🔍 Troubleshooting: See USER_GUIDE.md
Stay Updated
- ⭐ Star this repo to bookmark it and show support
- 📬 Watch releases to know when new features ship
- 💡 Join discussions to shape the roadmap
Built with ❤️ for the Claude community. Making LLMs smarter, one search at a time.
Version: 2.4.2 | Last Updated: 2025-12-03 | Status: Production-Ready