CAG-MCP

devdotbo/CAG-MCP


The CAG MCP Server is a high-performance server implementing the Model Context Protocol. It aims to improve response times and context coherence by preloading documents into the model's context window.

CAG MCP Server

A high-performance Cache-Augmented Generation (CAG) server implementing the Model Context Protocol (MCP) standard.

What is CAG?

Cache-Augmented Generation (CAG) is an alternative to RAG (Retrieval-Augmented Generation) that preloads documents into the model's context window or cache instead of searching a vector database at runtime. This approach offers:

  • Faster response times - No vector search overhead
  • Better context coherence - All relevant documents are already in memory
  • Simpler architecture - No need for embeddings or vector databases
  • Lower latency - Direct access to cached content
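
The preloading idea above can be illustrated with a minimal sketch. It assumes documents are plain strings held in a dict; `build_context` and the `###` delimiter format are hypothetical names chosen for illustration and are not taken from this project.

```python
def build_context(documents: dict[str, str], question: str) -> str:
    """Concatenate every preloaded document into one prompt, with no retrieval step."""
    parts = []
    # A stable order gives an identical prompt prefix across queries.
    for name in sorted(documents):
        parts.append(f"### {name}\n{documents[name]}")
    parts.append(f"### Question\n{question}")
    return "\n\n".join(parts)


# Documents are loaded once, up front, rather than fetched per query.
preloaded = {
    "faq.md": "Refunds are processed within 5 days.",
    "terms.md": "Accounts may be closed after 12 months of inactivity.",
}
prompt = build_context(preloaded, "How long do refunds take?")
```

Compared with RAG, there is no embedding or similarity search here: every query sees the same full document set, which is only practical while that set fits in the context window.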

Features

  • 🚀 High Performance - In-memory caching with sub-millisecond access times
  • 🔌 MCP Compatible - Full implementation of Model Context Protocol
  • 📚 Smart Caching - LRU eviction, size limits, and automatic content management
  • 🔍 Advanced Search - Full-text search across cached documents
  • 🛠️ Dual Implementation - Both Python (FastMCP) and TypeScript versions
  • 🧪 Thoroughly Tested - TDD approach with real integration tests
  • 🔒 Production Ready - Error handling, logging, and monitoring

Quick Start

Python Version

# Install dependencies
uv sync

# Run the server
uv run python -m cag_mcp_server

# Run tests
uv run pytest

TypeScript Version

# Install dependencies
pnpm install

# Build and run
pnpm build
pnpm start

# Run tests
pnpm test

Integration with Claude Desktop

  1. Add the server to your Claude Desktop config (claude_desktop_config.json):
{
  "mcpServers": {
    "cag-server": {
      "command": "uv",
      "args": ["run", "python", "-m", "cag_mcp_server"],
      "cwd": "/path/to/cag-mcp-server"
    }
  }
}
  2. Restart Claude Desktop
  3. The CAG server will appear in the MCP menu

Architecture

The CAG MCP server consists of:

  • Cache Manager - Handles document storage with size limits and LRU eviction
  • Document Loader - Loads and validates documents at startup
  • MCP Server - Implements the Model Context Protocol with resources, tools, and prompts
  • Search Engine - Provides fast full-text search across cached content
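
As an illustration of what a size-limited cache with LRU eviction might look like, here is a minimal sketch built on `collections.OrderedDict`. The class name `LRUDocumentCache` and its methods are assumptions made for this example; the project's actual Cache Manager may be structured differently.

```python
from collections import OrderedDict
from typing import Optional


class LRUDocumentCache:
    """Size-bounded document cache that evicts the least recently used entry first."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._docs = OrderedDict()  # name -> text, oldest access first

    def put(self, name: str, text: str) -> None:
        size = len(text.encode("utf-8"))
        if size > self.max_bytes:
            raise ValueError(f"{name} is larger than the whole cache")
        if name in self._docs:
            # Replacing an entry: release the space held by the old version.
            self.used_bytes -= len(self._docs.pop(name).encode("utf-8"))
        # Evict from the front (oldest access) until the new document fits.
        while self.used_bytes + size > self.max_bytes:
            _, evicted = self._docs.popitem(last=False)
            self.used_bytes -= len(evicted.encode("utf-8"))
        self._docs[name] = text
        self.used_bytes += size

    def get(self, name: str) -> Optional[str]:
        if name not in self._docs:
            return None
        self._docs.move_to_end(name)  # mark as most recently used
        return self._docs[name]

    def names(self) -> list:
        return list(self._docs)


cache = LRUDocumentCache(max_bytes=10)
cache.put("a.txt", "aaaa")
cache.put("b.txt", "bbbb")
cache.get("a.txt")          # a.txt is now the most recently used
cache.put("c.txt", "cccc")  # 12 bytes would exceed the limit, so b.txt is evicted
```

Reading a document counts as a use, so `a.txt` survives the final insert even though it was added first.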

Configuration

Create a config.json file:

{
  "cache": {
    "maxSizeMB": 100,
    "evictionPolicy": "lru",
    "preloadDirectory": "./documents"
  },
  "server": {
    "logLevel": "info",
    "enableMetrics": true
  }
}
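
A minimal sketch of how the `cache` section of this file could be parsed and validated follows. The `CacheConfig` dataclass, the `parse_cache_config` helper, and the fallback defaults are hypothetical; only the key names come from the example above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheConfig:
    max_size_bytes: int
    eviction_policy: str
    preload_directory: str


def parse_cache_config(raw: str) -> CacheConfig:
    """Parse the "cache" section of a config.json document and validate it."""
    cache = json.loads(raw)["cache"]
    policy = cache.get("evictionPolicy", "lru")  # default is an assumption
    if policy != "lru":
        # Only "lru" appears in the example config; other values are not assumed to exist.
        raise ValueError(f"unsupported eviction policy: {policy!r}")
    max_mb = cache["maxSizeMB"]
    if max_mb <= 0:
        raise ValueError("maxSizeMB must be positive")
    return CacheConfig(
        max_size_bytes=int(max_mb * 1024 * 1024),
        eviction_policy=policy,
        preload_directory=cache.get("preloadDirectory", "./documents"),
    )


example = """
{
  "cache": {"maxSizeMB": 100, "evictionPolicy": "lru", "preloadDirectory": "./documents"},
  "server": {"logLevel": "info", "enableMetrics": true}
}
"""
config = parse_cache_config(example)
```

Converting megabytes to bytes once at load time keeps later size checks in a single unit.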

MCP Capabilities

Resources

Each cached document is exposed as an MCP resource with a URI of the form cache://filename.
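
The mapping between filenames and `cache://` URIs could be sketched as below. The helper names are hypothetical, and percent-encoding of special characters is an assumption; the source does not say how filenames with spaces or slashes are handled.

```python
from urllib.parse import quote, unquote

SCHEME = "cache://"


def to_resource_uri(filename: str) -> str:
    """Build a cache:// URI, percent-encoding characters that are unsafe in URIs."""
    return SCHEME + quote(filename, safe="")


def from_resource_uri(uri: str) -> str:
    """Recover the filename from a cache:// URI."""
    if not uri.startswith(SCHEME):
        raise ValueError(f"not a cache URI: {uri}")
    return unquote(uri[len(SCHEME):])


uri = to_resource_uri("release notes.md")
name = from_resource_uri(uri)
```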

Tools

  • search_cache - Search across all cached documents
  • get_document - Retrieve a specific document
  • cache_stats - Get cache statistics and performance metrics
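
To make the three tools concrete, here is a minimal sketch of their behavior as plain Python functions over an in-memory dict. The signatures, return shapes, and sample documents are assumptions for illustration; the real tools are registered through MCP and may accept different arguments.

```python
from typing import Optional

DOCUMENTS = {
    "install.md": "Run uv sync to install dependencies.",
    "usage.md": "Start the server, then query the cache.",
}


def search_cache(query: str, documents: dict = DOCUMENTS) -> list:
    """Case-insensitive substring search; returns one hit per matching document."""
    needle = query.lower()
    if not needle:
        return []  # an empty query would otherwise match every document
    hits = []
    for name, text in documents.items():
        pos = text.lower().find(needle)
        if pos != -1:
            # Include a short snippet around the first match for context.
            start = max(0, pos - 20)
            hits.append({"document": name, "snippet": text[start:pos + len(needle) + 20]})
    return hits


def get_document(name: str, documents: dict = DOCUMENTS) -> Optional[str]:
    """Return the full text of one document, or None if it is not cached."""
    return documents.get(name)


def cache_stats(documents: dict = DOCUMENTS) -> dict:
    """Report the document count and total UTF-8 size of the cache."""
    total = sum(len(text.encode("utf-8")) for text in documents.values())
    return {"documents": len(documents), "total_bytes": total}


hits = search_cache("SERVER")
stats = cache_stats()
```

A linear scan like this is a simplification; a production search engine would more likely use an index.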

Prompts

Pre-built prompt templates for common query patterns.
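
A prompt template of this kind might be sketched with the standard library's `string.Template`. The `SUMMARIZE` template text and the `render_prompt` helper are hypothetical; the source does not list the actual templates.

```python
from string import Template

# Hypothetical template; placeholders are filled in per request.
SUMMARIZE = Template(
    "Using only the cached document $uri, summarize it in $sentences sentences."
)


def render_prompt(template: Template, **values: str) -> str:
    """Fill in a template; substitute() raises KeyError if a placeholder is missing."""
    return template.substitute(**values)


text = render_prompt(SUMMARIZE, uri="cache://guide.md", sentences="3")
```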

Development

# Setup development environment
./scripts/setup-dev.sh

# Run all tests
./scripts/integration-test.sh

# Verify MCP compatibility
./scripts/verify-mcp.sh

Testing

This project uses Test-Driven Development (TDD) without mocks. All tests use real implementations:

# Python tests
uv run pytest -xvs

# TypeScript tests
pnpm test

# Integration tests
./scripts/test-server.sh

Performance

  • Document access: < 1 ms
  • Search latency: < 10 ms for 1000 documents
  • Memory efficiency: ~1.2x document size
  • Startup time: < 5 s for a 100 MB cache
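
Figures like these depend on hardware and workload. One way to check the document-access number on your own machine is sketched below. It times lookups in a plain dict as a stand-in for the cache, so the helper `median_access_ms` and the sample data are assumptions, not part of the project.

```python
import statistics
import time


def median_access_ms(cache: dict, key: str, repeats: int = 1000) -> float:
    """Median wall-clock time of a single cache lookup, in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        cache[key]  # the operation being measured
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


sample_cache = {f"doc{i}.md": "x" * 1024 for i in range(1000)}
latency_ms = median_access_ms(sample_cache, "doc500.md")
```

The median is used rather than the mean so that occasional scheduler pauses do not dominate the result.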

Contributing

  1. Read CLAUDE.md for development guidelines
  2. Follow the TDD approach: write tests first
  3. Ensure all tests pass before submitting a PR
  4. Verify your changes against a real MCP client

License

MIT License - see the LICENSE file for details