devdotbo/CAG-MCP
The CAG MCP Server is a high-performance server implementing the Model Context Protocol, designed to enhance response times and context coherence by preloading documents into the model's context window.
CAG MCP Server
A high-performance Cache-Augmented Generation (CAG) server implementing the Model Context Protocol (MCP) standard.
What is CAG?
Cache-Augmented Generation (CAG) is an alternative to RAG (Retrieval-Augmented Generation) that preloads documents into the model's context window or cache instead of searching a vector database at runtime. This approach offers:
- Faster response times - No vector search overhead
- Better context coherence - All relevant documents are already in memory
- Simpler architecture - No need for embeddings or vector databases
- Lower latency - Direct access to cached content
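To make the contrast with RAG concrete, here is a minimal sketch of the preloading idea in Python. It is not taken from this project's source: the function names `preload_documents` and `build_context` are hypothetical, and the prompt layout is an assumption chosen only for illustration.

```python
from pathlib import Path


def preload_documents(directory: str) -> dict[str, str]:
    """Read every regular file in `directory` into memory once, at startup."""
    docs = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            docs[path.name] = path.read_text(encoding="utf-8")
    return docs


def build_context(docs: dict[str, str], question: str) -> str:
    """Place all preloaded documents ahead of the question (no retrieval step)."""
    parts = [f"## {name}\n{text}" for name, text in docs.items()]
    return "\n\n".join(parts) + f"\n\nQuestion: {question}"
```

The point of the sketch is that nothing is searched or embedded at query time: the documents are read once and every request reuses the same in-memory content.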
Features
- High Performance - In-memory caching with sub-millisecond access times
- MCP Compatible - Full implementation of Model Context Protocol
- Smart Caching - LRU eviction, size limits, and automatic content management
- Advanced Search - Full-text search across cached documents
- Dual Implementation - Both Python (FastMCP) and TypeScript versions
- Thoroughly Tested - TDD approach with real integration tests
- Production Ready - Error handling, logging, and monitoring
Quick Start
Python Version
# Install dependencies
uv sync
# Run the server
uv run python -m cag_mcp_server
# Run tests
uv run pytest
TypeScript Version
# Install dependencies
pnpm install
# Build and run
pnpm build
pnpm start
# Run tests
pnpm test
Integration with Claude Desktop
- Add to your Claude Desktop config:
{
  "mcpServers": {
    "cag-server": {
      "command": "uv",
      "args": ["run", "python", "-m", "cag_mcp_server"],
      "cwd": "/path/to/cag-mcp-server"
    }
  }
}
- Restart Claude Desktop
- The CAG server will appear in the MCP menu
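If you already have other MCP servers configured, the entry has to be merged into the existing `mcpServers` object, not written over it. The following Python sketch shows one way to do that. The helper names `add_cag_server` and `update_config_file` are hypothetical, and the config file location is passed in as a parameter because it differs between platforms.

```python
import json
from pathlib import Path


def add_cag_server(config: dict, project_dir: str) -> dict:
    """Return a copy of `config` with the cag-server entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["cag-server"] = {
        "command": "uv",
        "args": ["run", "python", "-m", "cag_mcp_server"],
        "cwd": project_dir,
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(config_path: str, project_dir: str) -> None:
    """Load the config file if it exists, merge the entry, and write it back."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = add_cag_server(config, project_dir)
    path.write_text(json.dumps(merged, indent=2), encoding="utf-8")
```

Other entries under `mcpServers` are left untouched, and the input dictionary is not mutated.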
Architecture
The CAG MCP server consists of:
- Cache Manager - Handles document storage with size limits and LRU eviction
- Document Loader - Loads and validates documents at startup
- MCP Server - Implements the Model Context Protocol with resources, tools, and prompts
- Search Engine - Provides fast full-text search across cached content
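As an illustration of what the Cache Manager's "size limits and LRU eviction" could look like, here is a minimal Python sketch built on `collections.OrderedDict`. The class name `LRUDocumentCache` and its methods are hypothetical and do not reflect the project's actual implementation. Sizes are measured as UTF-8 byte length, which is also an assumption.

```python
from collections import OrderedDict
from typing import Optional


class LRUDocumentCache:
    """Size-bounded document cache that evicts the least recently used entry."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.current_bytes = 0
        self._docs = OrderedDict()  # name -> text, least recently used first

    @staticmethod
    def _size(text: str) -> int:
        return len(text.encode("utf-8"))

    def put(self, name: str, text: str) -> None:
        size = self._size(text)
        if size > self.max_bytes:
            raise ValueError(f"{name} is larger than the whole cache")
        # Replacing an existing document frees its old size first.
        if name in self._docs:
            self.current_bytes -= self._size(self._docs.pop(name))
        # Evict from the least recently used end until the new document fits.
        while self.current_bytes + size > self.max_bytes:
            _, evicted = self._docs.popitem(last=False)
            self.current_bytes -= self._size(evicted)
        self._docs[name] = text
        self.current_bytes += size

    def get(self, name: str) -> Optional[str]:
        if name not in self._docs:
            return None
        self._docs.move_to_end(name)  # mark as most recently used
        return self._docs[name]
```

Reading a document counts as a use, so frequently requested documents stay cached while rarely read ones are evicted first.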
Configuration
Create a config.json file:
{
  "cache": {
    "maxSizeMB": 100,
    "evictionPolicy": "lru",
    "preloadDirectory": "./documents"
  },
  "server": {
    "logLevel": "info",
    "enableMetrics": true
  }
}
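For readers who want to see how such a file might be consumed, here is a small Python sketch that parses it into typed settings. The dataclasses and `parse_config` are hypothetical, and the fallback defaults simply mirror the example values above. The server's real defaults and validation rules may differ.

```python
import json
from dataclasses import dataclass


@dataclass
class CacheConfig:
    max_size_mb: int = 100
    eviction_policy: str = "lru"
    preload_directory: str = "./documents"


@dataclass
class ServerConfig:
    log_level: str = "info"
    enable_metrics: bool = True


def parse_config(raw: str) -> tuple[CacheConfig, ServerConfig]:
    """Parse config.json text, falling back to the assumed defaults above."""
    data = json.loads(raw)
    cache = data.get("cache", {})
    server = data.get("server", {})
    cache_cfg = CacheConfig(
        max_size_mb=cache.get("maxSizeMB", 100),
        eviction_policy=cache.get("evictionPolicy", "lru"),
        preload_directory=cache.get("preloadDirectory", "./documents"),
    )
    server_cfg = ServerConfig(
        log_level=server.get("logLevel", "info"),
        enable_metrics=server.get("enableMetrics", True),
    )
    # Assumed sanity check: a cache with no capacity cannot hold any document.
    if cache_cfg.max_size_mb <= 0:
        raise ValueError("cache.maxSizeMB must be positive")
    return cache_cfg, server_cfg
```

A byte limit for the cache can then be derived as `cache_cfg.max_size_mb * 1024 * 1024`.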
MCP Capabilities
Resources
Each cached document is exposed as an MCP resource with the URI pattern cache://filename.
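The mapping between file names and resource URIs can be sketched in a few lines of Python. The helper names are hypothetical, and percent-encoding of special characters is an assumption. The source only states the `cache://filename` pattern.

```python
from urllib.parse import quote, unquote

PREFIX = "cache://"


def to_resource_uri(filename: str) -> str:
    """Build a cache:// URI, percent-encoding characters such as spaces."""
    return PREFIX + quote(filename, safe="")


def from_resource_uri(uri: str) -> str:
    """Recover the file name from a cache:// URI."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not a cache resource URI: {uri}")
    return unquote(uri[len(PREFIX):])
```

For example, `to_resource_uri("notes.md")` yields `cache://notes.md`, and `from_resource_uri` reverses the mapping.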
Tools
- search_cache - Search across all cached documents
- get_document - Retrieve a specific document
- cache_stats - Get cache statistics and performance metrics
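To give a feel for what these three tools do, here is a plain-Python sketch of their logic over an in-memory dictionary of documents. Only the tool names come from this README. The parameters, the return shapes, and the case-insensitive substring matching are assumptions, and the sketch leaves out the MCP registration layer entirely.

```python
from typing import Optional


def search_cache(docs: dict[str, str], query: str, limit: int = 5) -> list[dict]:
    """Case-insensitive substring search, ranked by number of matches."""
    needle = query.lower()
    if not needle:
        return []
    hits = []
    for name, text in docs.items():
        count = text.lower().count(needle)
        if count:
            hits.append({"document": name, "matches": count})
    # Most matches first; ties are broken alphabetically for stable output.
    hits.sort(key=lambda hit: (-hit["matches"], hit["document"]))
    return hits[:limit]


def get_document(docs: dict[str, str], name: str) -> Optional[str]:
    """Return the full text of one cached document, or None if absent."""
    return docs.get(name)


def cache_stats(docs: dict[str, str]) -> dict:
    """Report how many documents are cached and their total UTF-8 size."""
    total = sum(len(text.encode("utf-8")) for text in docs.values())
    return {"documents": len(docs), "total_bytes": total}
```

A real full-text search would likely tokenize and index the documents. Substring counting is used here only because it keeps the example short.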
Prompts
Pre-built prompt templates for common query patterns.
Development
# Setup development environment
./scripts/setup-dev.sh
# Run all tests
./scripts/integration-test.sh
# Verify MCP compatibility
./scripts/verify-mcp.sh
Testing
This project uses Test-Driven Development (TDD) without mocks. All tests use real implementations:
# Python tests
uv run pytest -xvs
# TypeScript tests
pnpm test
# Integration tests
./scripts/test-server.sh
Performance
- Document access: < 1ms
- Search latency: < 10ms for 1000 documents
- Memory footprint: ~1.2x document size
- Startup time: < 5s for 100MB cache
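If you want to check figures like these on your own hardware, a rough timing harness can be sketched as follows. The function `median_access_ms` is hypothetical and measures a plain dictionary lookup, not a call through the MCP server, so it only approximates the "document access" number.

```python
import time
from statistics import median


def median_access_ms(docs: dict[str, str], name: str, repeats: int = 1000) -> float:
    """Median time, in milliseconds, to fetch one document from memory."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        _ = docs[name]
        samples.append((time.perf_counter() - start) * 1000.0)
    return median(samples)
```

The median is used instead of the mean so that a few slow outliers (for example, from garbage collection) do not skew the result.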
Contributing
- Read CLAUDE.md for development guidelines
- Follow TDD approach - write tests first
- Ensure all tests pass before submitting PR
- Verify with real MCP client integration
License
MIT License - see LICENSE file for details