metamcp-rag-server by cordlesssteve - MCP Server

MetaMCP RAG Server

Attribution

This repository is based on metatool-ai/metamcp - "A specification for Model Context Protocol (MCP) servers that can dynamically discover and connect to other MCP servers."

Important: This repository is intentionally disconnected from the upstream to prevent accidental pull requests. Please submit contributions to the original project.

Original Project

Upstream: https://github.com/metatool-ai/metamcp
License: MIT License
Documentation: See links in the original repository

Overview

The MetaMCP RAG Server solves the context bloat problem that occurs when Claude Code starts up with many MCP servers. Instead of loading all MCP tools into context immediately, this server provides lazy loading and on-demand tool discovery to keep your context window clean and efficient.

The Problem

Multiple MCP servers consume significant context tokens on startup
Hundreds of tool definitions loaded whether you need them or not
Context window filled with unused tool schemas
Slower startup times and reduced available context for actual work

The Solution

The MetaMCP RAG Server acts as a smart proxy that:

Prevents context bloat by not auto-loading all MCP tools on startup
Discovers tools on-demand only when you need them
Uses RAG (Retrieval-Augmented Generation) to find relevant tools for your queries
Lazy loads MCP servers and their tools as needed
Maintains clean context by exposing only essential tools by default

Key Benefits

Startup Optimization: Dramatically faster Claude Code startup with minimal context usage
Context Management: Keep your 200k token context window available for actual work
Smart Discovery: RAG-powered tool selection finds relevant tools without loading everything
Lazy Loading: MCP servers start only when their tools are needed
Query-Aware Filtering: Semantic tool selection based on what you're actually trying to do

Features

Context Bloat Prevention: Minimal tool exposure on startup to preserve context
On-Demand Tool Discovery: Tools are discovered and loaded only when needed
RAG-Powered Selection: Semantic search to filter relevant tools for each query
Multi-Server Aggregation: Manages multiple MCP servers behind a single interface
Lazy Server Initialization: MCP servers start on first tool request, not at startup
Smart Tool Routing: Routes tool calls to appropriate underlying servers
Graceful Degradation: Falls back to all tools if RAG service is unavailable

Installation

npm install
npm run build

Configuration

Add to your Claude Code MCP configuration:

{
  "mcpServers": {
    "metamcp-rag": {
      "command": "node",
      "args": ["./dist/index.js"],
      "env": {
        "RAG_MAX_DISCOVERY_TOOLS": "30",
        "RAG_MAX_ESSENTIAL_TOOLS": "10"
      }
    }
  }
}

Environment Variables

RAG_MAX_DISCOVERY_TOOLS (default: 30): Maximum tools returned by discover_tools. Higher values provide more options but increase context usage.
RAG_MAX_ESSENTIAL_TOOLS (default: 10): Maximum essential tools exposed by default to minimize startup context bloat.
RAG_SERVICE_HOST (default: 127.0.0.1): RAG service host address.
RAG_SERVICE_PORT (default: 8002): RAG service port.
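
The defaults above can be resolved in one place at startup. The following is a minimal sketch, not the server's actual code: `RagConfig`, `loadConfig`, `intFromEnv`, and `ragServiceUrl` are hypothetical names, and treating invalid numeric values as "use the default" is an assumption.

```typescript
// Hypothetical config loader; type and function names are illustrative only.
interface RagConfig {
  maxDiscoveryTools: number;
  maxEssentialTools: number;
  serviceHost: string;
  servicePort: number;
}

type Env = Record<string, string | undefined>;

// Parse a positive integer, falling back to the documented default
// when the variable is unset or does not parse as a number.
function intFromEnv(env: Env, key: string, fallback: number): number {
  const parsed = parseInt(env[key] || "", 10);
  return !isNaN(parsed) && parsed > 0 ? parsed : fallback;
}

function loadConfig(env: Env): RagConfig {
  return {
    maxDiscoveryTools: intFromEnv(env, "RAG_MAX_DISCOVERY_TOOLS", 30),
    maxEssentialTools: intFromEnv(env, "RAG_MAX_ESSENTIAL_TOOLS", 10),
    serviceHost: env["RAG_SERVICE_HOST"] || "127.0.0.1",
    servicePort: intFromEnv(env, "RAG_SERVICE_PORT", 8002),
  };
}

// Base URL of the RAG service for the given configuration.
function ragServiceUrl(config: RagConfig): string {
  return `http://${config.serviceHost}:${config.servicePort}`;
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`; passing the environment in as an argument keeps the function easy to test.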

Context Management Strategy

Startup Behavior

Minimal Tool Exposure: Only essential tools (at most RAG_MAX_ESSENTIAL_TOOLS, 10 by default) are loaded into context on startup
Lazy Server Discovery: MCP servers are discovered but not immediately connected
Clean Context Window: Maximum context preserved for your actual work

On-Demand Discovery

Query Analysis: When you use discover_tools, your query is analyzed
RAG Filtering: Semantic search finds relevant tools from all available servers
Lazy Loading: Only relevant servers are started and connected
Context Efficiency: Tools loaded only when they match your needs

Tool Routing

Smart Routing: Tool calls automatically routed to appropriate MCP server
Connection Management: Maintains persistent connections to active servers
Error Handling: Graceful degradation if servers become unavailable
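
Routing can be pictured as a lookup from tool name to owning server. The sketch below is an illustration under assumptions: `ToolRouter` is a hypothetical class, and the first-registration-wins rule for name clashes is a design choice made for this example, not documented behavior.

```typescript
// Hypothetical routing table: maps each tool name to the server that owns it.
class ToolRouter {
  private readonly owners = new Map<string, string>();

  // Record the tools a server advertises; the first registration wins on a name clash.
  register(serverName: string, toolNames: string[]): void {
    for (const tool of toolNames) {
      if (!this.owners.has(tool)) {
        this.owners.set(tool, serverName);
      }
    }
  }

  // Resolve the owning server, or throw so the caller can report an unknown tool.
  resolve(toolName: string): string {
    const server = this.owners.get(toolName);
    if (server === undefined) {
      throw new Error(`No MCP server registered for tool "${toolName}"`);
    }
    return server;
  }
}
```

For example, after `router.register("github", ["create_issue"])` (the tool name is made up here), `router.resolve("create_issue")` returns `"github"`, and the call can be forwarded to that server's connection.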

Supported MCP Servers

The server automatically manages connections to:

Core Knowledge & Analytics

memory: Knowledge graph and entity relationship management
document-organizer: Document processing and organization
claude-telemetry: Usage analytics and telemetry tracking

Development Workflow

mitosis: Session handoff and context transfer
github: GitHub repository and issue management
security-scanner: Package and repository security analysis
git: Git repository operations and version control

RAG Integration

The server integrates with an external RAG service for intelligent tool selection:

RAG Service URL: http://localhost:8002 by default (configurable via RAG_SERVICE_HOST and RAG_SERVICE_PORT)
Tool Selection: Uses semantic similarity to filter relevant tools
Query Extraction: Automatically extracts query context from tool arguments
Fallback Behavior: Returns all tools if RAG service is unavailable

RAG Service Endpoints

GET /health - Health check
POST /select-tools - Semantic tool selection
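
The fallback behavior described above can be sketched as a wrapper around the POST /select-tools call. Everything here is an assumption made for illustration: the `PostJson` transport type, the `selectTools` and `parseSelection` helpers, and in particular the `selected_tools` response field, since the response schema is not documented in this README.

```typescript
// Hypothetical transport: posts JSON to a path on the RAG service and resolves
// with the parsed response body. In practice it could wrap axios or fetch.
type PostJson = (path: string, body: unknown) => Promise<unknown>;

// Assumed response shape: { selected_tools: string[] }. The real service may differ.
function parseSelection(body: unknown): string[] | null {
  if (typeof body !== "object" || body === null) return null;
  const selected = (body as { selected_tools?: unknown }).selected_tools;
  if (!Array.isArray(selected)) return null;
  return selected.filter((t): t is string => typeof t === "string");
}

async function selectTools(
  post: PostJson,
  query: string,
  allTools: string[],
  limit = 30,
): Promise<string[]> {
  const request = {
    query,
    available_tools: allTools,
    limit,
    similarity_threshold: 0.1,
  };
  try {
    const selected = parseSelection(await post("/select-tools", request));
    // A malformed body is treated like an outage: fall back to every tool.
    return selected === null ? allTools : selected;
  } catch {
    // RAG service unreachable or request failed: return all tools.
    return allTools;
  }
}
```

Injecting the transport keeps the fallback logic independent of the HTTP client, so the same function works with a stub in tests and a real client in production.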

Query Extraction Patterns

The server extracts queries from various argument patterns:

Direct query arguments: query, question, content, text
Action-based queries: description, expression, filename
Composite queries: Joins all string values from arguments
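
The three patterns above can be sketched as a single helper. This is a hypothetical illustration: `extractQuery` and `firstNonEmptyString` are made-up names, and the precedence (direct keys first, then action-based keys, then the composite join) is an assumption, since the README lists the patterns without stating their order.

```typescript
type ToolArguments = Record<string, unknown>;

// Keys checked in order, following the patterns listed above.
const DIRECT_KEYS = ["query", "question", "content", "text"];
const ACTION_KEYS = ["description", "expression", "filename"];

// Return the first non-blank string found under any of the given keys.
function firstNonEmptyString(args: ToolArguments, keys: string[]): string | undefined {
  for (const key of keys) {
    const value = args[key];
    if (typeof value === "string" && value.trim() !== "") {
      return value.trim();
    }
  }
  return undefined;
}

// Hypothetical helper; the precedence between patterns is an assumption.
function extractQuery(args: ToolArguments): string {
  const direct = firstNonEmptyString(args, DIRECT_KEYS);
  if (direct !== undefined) return direct;

  const action = firstNonEmptyString(args, ACTION_KEYS);
  if (action !== undefined) return action;

  // Composite: join every top-level string value with a space.
  return Object.keys(args)
    .map((key) => args[key])
    .filter((v): v is string => typeof v === "string" && v.trim() !== "")
    .map((v) => v.trim())
    .join(" ");
}
```

With this sketch, arguments such as `{ owner: "acme", repo: "widgets", page: 2 }` have no direct or action-based key, so the composite rule joins the two string values and skips the number.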

How It Works

1. Startup Optimization

// Minimal context consumption on startup
const essentialTools = [
  'discover_tools',  // For finding tools when needed
  'health_check',    // For service monitoring
  // ... maximum 10 essential tools
];

2. Lazy Tool Discovery

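// Tools discovered on-demand via RAG
import axios from 'axios';

// Base URL built from RAG_SERVICE_HOST and RAG_SERVICE_PORT (defaults shown)
const ragUrl = 'http://127.0.0.1:8002';
const response = await axios.post(`${ragUrl}/select-tools`, {
  query: userQuery,
  available_tools: allAvailableToolNames,
  limit: 30,  // matches the RAG_MAX_DISCOVERY_TOOLS default
  similarity_threshold: 0.1
});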

3. Smart Server Management

// Servers started only when their tools are needed
if (!serverConnections[serverName]) {
  await startMCPServer(serverName);  // registers the connection in serverConnections
}
const connection = serverConnections[serverName];
const result = await sendMCPRequest(connection, 'tools/call', args);
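
One detail the snippet above leaves open is what happens when two tool calls for the same server arrive before it has finished starting. A possible approach, sketched here with hypothetical names (`Connection`, `StartServer`, `LazyConnections`), is to cache the start promise rather than the finished connection, so concurrent callers share one start. This is not taken from the server's source.

```typescript
// Hypothetical connection handle; a real one would wrap a child process or socket.
interface Connection {
  serverName: string;
}

type StartServer = (serverName: string) => Promise<Connection>;

class LazyConnections {
  // Stores the in-flight or settled start promise for each server.
  private readonly pending = new Map<string, Promise<Connection>>();
  private readonly start: StartServer;

  constructor(start: StartServer) {
    this.start = start;
  }

  // Deliberately not async: callers receive the cached promise itself,
  // so concurrent requests for one server share a single start.
  get(serverName: string): Promise<Connection> {
    let connection = this.pending.get(serverName);
    if (connection === undefined) {
      connection = this.start(serverName).catch((error: unknown) => {
        // Forget failed starts so a later call can retry.
        this.pending.delete(serverName);
        throw error;
      });
      this.pending.set(serverName, connection);
    }
    return connection;
  }
}
```

A caller would then write `const connection = await connections.get(serverName)` before sending the request; a server that is never requested is never started.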

Performance Benefits

Startup Speed: 5-10x faster Claude Code startup with clean context
Context Preservation: 95%+ of context window available for actual work
Memory Efficiency: Servers loaded only when needed
Response Time: RAG service provides sub-second tool discovery

Development

# Install dependencies
npm install

# Build project
npm run build

# Start in development mode
npm run dev

# Test server directly
echo '{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}' | node dist/index.js

Troubleshooting

RAG Service Issues

# Check if RAG service is running
curl http://localhost:8002/health

# Start RAG service manually
cd /path/to/metaMCP-RAG/rag-tool-retriever
python rag_service.py

Context Bloat Detection

# Monitor context usage before/after
token-analyzer-mcp analyze

# Check startup time improvement
time claude-code --startup-benchmark

License

MIT