claude-praetorian-mcp

Vvkmnn/claude-praetorian-mcp


Claude Praetorian MCP is a server designed for aggressive context compaction, enabling significant token savings by structuring and compacting data efficiently.

A Model Context Protocol (MCP) server for aggressive context compaction in Claude Code. Save 90%+ tokens by compacting web research, task outputs, and conversations into structured snapshots.


License: MIT


Inspired by a talk by Dexter Horthy of HumanLayer, his team's work on ACE: Advanced Context Engineering for Coding Agents and 12-Factor Agents, and the TOON (Token-Oriented Object Notation) format.

install

Requirements:

Claude Code

From shell:

claude mcp add claude-praetorian-mcp -- npx claude-praetorian-mcp

From inside Claude (restart required):

Add this to our global mcp config: npx claude-praetorian-mcp

Install this mcp: https://github.com/Vvkmnn/claude-praetorian-mcp

From any manually configurable mcp.json (Cursor, Windsurf, etc.):

{
  "mcpServers": {
    "claude-praetorian-mcp": {
      "command": "npx",
      "args": ["claude-praetorian-mcp"],
      "env": {}
    }
  }
}

There is no npm install required -- no external databases or services, only flat files.

However, if npx resolves the wrong package, you can force resolution with:

npm install -g claude-praetorian-mcp

Optionally, install the skill to teach Claude when to proactively use praetorian:

npx skills add Vvkmnn/claude-praetorian-mcp --skill claude-praetorian --global
# Optional: add --yes to skip interactive prompt and install to all agents

This makes Claude automatically compact after research, subagent tasks, and before context resets. The MCP works without the skill, but the skill improves discoverability.

plugin

For full automation with hooks, install from the claude-emporium marketplace:

/plugin marketplace add Vvkmnn/claude-emporium
/plugin install claude-praetorian@claude-emporium

The claude-praetorian plugin provides:

Hooks (targeted, fire only at high-value moments):

  • Before EnterPlanMode → Restore prior compactions for this project
  • Before context compaction → Save decisions/insights before context resets
  • After WebFetch/WebSearch → Compact web research findings
  • After SubagentStop → Compact subagent task results

Commands: /praetorian-compact, /praetorian-restore, /praetorian-search

Requires the MCP server installed first. See the emporium for other Claude Code plugins and MCPs.

features

MCP server for aggressive context compaction. Generates structured incremental snapshots to yield 90%+ token savings and easily refresh context with "frequent intentional compaction".

Runs per project and saves artifacts to {$project}/.claude/praetorian (with a royal guard ⚜️):

praetorian_compact

(Incrementally) compact context using the TOON format to get the most valuable tokens from an activity.

⚜️ praetorian_compact type=<type> title=<title>
  > "ACE Framework research - save 1,450 tokens"
  > "Icon rendering bug investigation - compact the findings"
  > "Database architecture decisions - preserve the rationale"
  > "WebFetch results from authentication docs"
  > "Task output from explore subagent - code structure analysis"
⚜️ compact | Created

┌─ ⚜️  ────────────────────────────────────────────────── Created ─┐
│ Compacted: "ACE Framework Research" • 1,450 tokens saved
│ Type: web_research • ID: cpt_1765245902396_nxetoc
└───────────────────────────────────────────────────────────────────┘
{
  "type": "web_research",
  "title": "ACE Framework Research",
  "source": "https://github.com/humanlayer/ace-fca",
  "key_insights": [
    "Frequent intentional compaction saves 90%+ tokens",
    "TOON format is 30-60% smaller than JSON/YAML",
    "Compaction should happen after every expensive operation"
  ],
  "refs": ["ace-fca.md:42 - compaction strategy", "toon-spec.md:1 - format definition"],
  "recommendations": ["Compact after every WebFetch", "Use type='decisions' for architecture choices"]
}

⚜️ compact | Merged

┌─ ⚜️  ───────────────────────────────────────────────────── Merged ─┐
│ Compacted: "Authentication Patterns" • 890 tokens saved
│ Type: decisions • ID: cpt_1765245903512_xk9mp1
│ Merged with: cpt_1765245903512_xk9mp1
└────────────────────────────────────────────────────────────────────┘
{
  "type": "decisions",
  "title": "Authentication Patterns",
  "decisions": [
    { "chose": "JWT with refresh tokens", "over": ["sessions", "API keys"], "reason": "Stateless, works across microservices" },
    { "chose": "httpOnly cookies", "over": ["localStorage"], "reason": "XSS protection" }
  ],
  "refs": ["src/middleware/auth.ts:45 - token validation", "src/routes/login.ts:23 - refresh flow"],
  "anti_patterns": ["Never store tokens in localStorage", "Never skip CSRF on cookie-based auth"]
}
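
The "decisions" payload above can be described as a TypeScript shape with a runtime check. This is a minimal sketch, not the server's actual schema: the methodology diagram indicates the real validation uses Zod, while this example uses hand-written type guards to stay dependency-free. The names `DecisionsCompaction`, `isDecision`, and `isDecisionsCompaction` are hypothetical, and treating every field as required is an assumption based only on the example.

```typescript
// Shape inferred from the "Authentication Patterns" example; not an official schema.
interface Decision {
  chose: string;
  over: string[];
  reason: string;
}

interface DecisionsCompaction {
  type: "decisions";
  title: string;
  decisions: Decision[];
  refs: string[];
  anti_patterns: string[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((v) => typeof v === "string");
}

// Hypothetical guard for one entry of the "decisions" list.
function isDecision(value: unknown): value is Decision {
  return (
    isRecord(value) &&
    typeof value["chose"] === "string" &&
    isStringArray(value["over"]) &&
    typeof value["reason"] === "string"
  );
}

// Hypothetical guard for the whole payload (assumes all fields are required).
function isDecisionsCompaction(value: unknown): value is DecisionsCompaction {
  if (!isRecord(value)) return false;
  const decisions = value["decisions"];
  return (
    value["type"] === "decisions" &&
    typeof value["title"] === "string" &&
    Array.isArray(decisions) &&
    decisions.every(isDecision) &&
    isStringArray(value["refs"]) &&
    isStringArray(value["anti_patterns"])
  );
}
```

A guard like this rejects a payload that omits the decisions list, which is the kind of malformed input the validation step is there to catch before anything is written to disk.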

praetorian_restore

Search saved compactions and restore context by injecting their TOON tokens back into the current context as needed.

⚜️ praetorian_restore query=<query>
  > "What did we learn about authentication?"
  > "Find the Docker container debugging session"
  > "Show recent architecture decisions"
  > "Search for MCP server implementation patterns"
  > "" (empty = recent compactions)
⚜️ restore | Search

┌─ ⚜️  ───────────────────────────────────────────────────── Search ─┐
│ Found 2 compactions
│ Query: "authentication"
└────────────────────────────────────────────────────────────────────┘
{
  "compactions": [
    {
      "id": "cpt_1765245903512_xk9mp1",
      "type": "decisions",
      "title": "Authentication Patterns",
      "relevance": 0.85,
      "key_insights": ["JWT with refresh tokens", "httpOnly cookies for XSS protection"],
      "refs": ["src/middleware/auth.ts:45", "src/routes/login.ts:23"]
    },
    {
      "id": "cpt_1765245902396_nxetoc",
      "type": "web_research",
      "title": "OAuth2 Best Practices",
      "relevance": 0.72,
      "key_insights": ["PKCE flow for public clients", "Token rotation every 15min"]
    }
  ],
  "total": 2
}

⚜️ restore | Recent

┌─ ⚜️  ───────────────────────────────────────────────────── Recent ─┐
│ Found 3 compactions
└────────────────────────────────────────────────────────────────────┘

Status indicators:

  • Created - New compaction saved
  • Merged - Updated existing compaction (>70% title similarity via Jaccard index)
  • Search - Search results returned (keyword matching)
  • Recent - Recent compactions listed (by updated time)
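
The "Merged" rule above (>70% title similarity via Jaccard index) can be sketched as follows. This is an illustration under stated assumptions, not the server's code: the tokenization (lowercase, split on non-alphanumeric characters) and the helper names `titleTokens`, `jaccard`, and `shouldMerge` are hypothetical. Only the Jaccard index and the 70% threshold come from this README.

```typescript
// Hypothetical tokenizer: lowercase words, split on anything that is not a-z or 0-9.
function titleTokens(title: string): Set<string> {
  return new Set(
    title
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .filter((word) => word.length > 0),
  );
}

// Jaccard index: |A ∩ B| / |A ∪ B|. Defined here as 0 when both titles are empty.
function jaccard(a: string, b: string): number {
  const setA = titleTokens(a);
  const setB = titleTokens(b);
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  setA.forEach((word) => {
    if (setB.has(word)) shared += 1;
  });
  const union = setA.size + setB.size - shared;
  return shared / union;
}

// Threshold from the ">70%" rule; the strictly-greater comparison is an assumption.
const MERGE_THRESHOLD = 0.7;

function shouldMerge(existingTitle: string, incomingTitle: string): boolean {
  return jaccard(existingTitle, incomingTitle) > MERGE_THRESHOLD;
}
```

Under these assumptions, "ACE Framework Research" and "ACE Framework Research Notes" share 3 of 4 distinct words (0.75) and would merge, while "Authentication Patterns" and "OAuth2 Best Practices" share none and would produce a new compaction.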

Praetorian is designed for heavy, frequent use. The more you compact, the more you save.

When to compact:

  • After every WebFetch
  • After every Task/subagent completes
  • After reading multiple files
  • After making decisions
  • During long conversations (proactive compaction)
  • Before context gets >60% full

Real-world example session:

Compaction               Before (tokens)   After (tokens)   Saved (tokens)
Web research (3 URLs)              4,500              300            4,200
Subagent outputs (2)               3,500              300            3,200
Architecture debates               5,000              300            4,700
Hook research                      1,500              150            1,350
Total                             14,500            1,050     13,450 (93%)

Next session: praetorian_restore loads ~1,000 tokens, so you can resume immediately without repeating the research.
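
The totals in the example session can be reproduced with a few lines of arithmetic. The figures below are copied from the table; the `SessionRow` type and `summarize` function are hypothetical names used only for this illustration.

```typescript
interface SessionRow {
  label: string;
  before: number; // tokens before compaction
  after: number; // tokens after compaction
}

// Figures from the example session table.
const session: SessionRow[] = [
  { label: "Web research (3 URLs)", before: 4500, after: 300 },
  { label: "Subagent outputs (2)", before: 3500, after: 300 },
  { label: "Architecture debates", before: 5000, after: 300 },
  { label: "Hook research", before: 1500, after: 150 },
];

function summarize(rows: SessionRow[]): { before: number; after: number; saved: number; percent: number } {
  const before = rows.reduce((sum, row) => sum + row.before, 0);
  const after = rows.reduce((sum, row) => sum + row.after, 0);
  const saved = before - after;
  // Percentage of the original tokens that no longer need to be carried in context.
  const percent = before === 0 ? 0 : Math.round((saved / before) * 100);
  return { before, after, saved, percent };
}

const totals = summarize(session);
// totals: { before: 14500, after: 1050, saved: 13450, percent: 93 }
```

The 93% figure is 13,450 / 14,500 rounded to the nearest whole percent.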

methodology

How claude-praetorian-mcp works:

              ⚜️  claude-praetorian-mcp
              ════════════════════════


        compact (save)          restore (load)
        ──────────────          ──────────────

            INPUT                   QUERY
              │                       │
              ▼                       ▼
          ┌─────────┐           ┌─────────┐
          │   Zod   │           │ Inverted│
          │Validate │           │  Index  │
          └────┬────┘           └────┬────┘
               │                     │
               ▼                     ▼
          ┌─────────┐           ┌─────────┐
          │ Jaccard │           │  TOON   │
          │ >70% ?  │           │ Decode  │
          └──┬───┬──┘           └────┬────┘
             │   │                   │
          new│   │match              ▼
             │   │                OUTPUT
             ▼   ▼
          ┌─────────┐
          │  TOON   │
          │ Encode  │
          └────┬────┘
               │
               ▼
          ┌─────────┐
          │  Index  │
          │ Update  │
          └────┬────┘
               │
               ▼
            OUTPUT


        storage: .claude/praetorian/
        ────────────────────────────
        index.json            word index + metadata
        compactions/*.toon    encoded compaction files
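
The restore path in the diagram starts from an inverted (word) index. The sketch below shows the general technique only: it indexes titles alone and ranks by the number of matching query words, both of which are assumptions. The actual layout of index.json and the real relevance scoring are not documented here, and `IndexedCompaction`, `buildIndex`, and `search` are hypothetical names.

```typescript
interface IndexedCompaction {
  id: string;
  title: string;
}

// word -> set of compaction IDs whose title contains that word
type InvertedIndex = Map<string, Set<string>>;

function splitWords(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 0);
}

function buildIndex(items: IndexedCompaction[]): InvertedIndex {
  const index: InvertedIndex = new Map();
  items.forEach((item) => {
    splitWords(item.title).forEach((word) => {
      const ids = index.get(word) ?? new Set<string>();
      ids.add(item.id);
      index.set(word, ids);
    });
  });
  return index;
}

// Returns IDs ordered by how many query words they match (most matches first).
function search(index: InvertedIndex, query: string): string[] {
  const hits = new Map<string, number>();
  splitWords(query).forEach((word) => {
    (index.get(word) ?? new Set<string>()).forEach((id) => {
      hits.set(id, (hits.get(id) ?? 0) + 1);
    });
  });
  return Array.from(hits.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

Because a lookup touches only the index entries for the query words, the encoded .toon files need to be decoded only for the compactions that actually match.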

Design principles:

  • Incremental -- merge similar compactions via Jaccard similarity, don't replace
  • Token-minimal -- TOON format for 30-60% fewer tokens than YAML/JSON
  • Project-scoped -- each project stores in its own .claude/praetorian/
  • Flat files only -- no databases, no external services
  • Offline -- never leaves your machine, no network calls
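
The "Incremental" principle (merge, don't replace) can be illustrated with a small merge function. This is a sketch of one plausible behaviour, assuming that list fields are unioned with duplicates dropped and that the existing title is kept. The README does not specify the real merge rules, and `ResearchCompaction`, `mergeUnique`, and `mergeCompaction` are hypothetical names.

```typescript
// Subset of the web_research fields shown in the examples above.
interface ResearchCompaction {
  title: string;
  key_insights: string[];
  refs: string[];
}

// Append incoming items that are not already present, preserving order.
function mergeUnique(existing: string[], incoming: string[]): string[] {
  const seen = new Set(existing);
  const merged = existing.slice();
  incoming.forEach((item) => {
    if (!seen.has(item)) {
      seen.add(item);
      merged.push(item);
    }
  });
  return merged;
}

// Assumed behaviour: keep the existing title, union the list fields.
function mergeCompaction(existing: ResearchCompaction, incoming: ResearchCompaction): ResearchCompaction {
  return {
    title: existing.title,
    key_insights: mergeUnique(existing.key_insights, incoming.key_insights),
    refs: mergeUnique(existing.refs, incoming.refs),
  };
}
```

Merging this way means a second compaction on the same topic adds only what is new, so earlier insights survive and the snapshot does not grow with repeated findings.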

File access:

  • Stores in: <project>/.claude/praetorian/
  • Format: .toon files (encoded compactions) + index.json (word index + metadata)

development

git clone https://github.com/Vvkmnn/claude-praetorian-mcp && cd claude-praetorian-mcp
npm install && npm run build
npm test

Package requirements:

  • Node.js: >=20.0.0 (ES modules)
  • Runtime: @modelcontextprotocol/sdk, @toon-format/toon, zod
  • Zero external databases -- works with npx

Development workflow:

npm run build          # TypeScript compilation with executable permissions
npm run dev            # Watch mode with tsc --watch
npm run start          # Run the MCP server directly
npm run lint           # ESLint code quality checks
npm run lint:fix       # Auto-fix linting issues
npm run format         # Prettier formatting (src/)
npm run format:check   # Check formatting without changes
npm run typecheck      # TypeScript validation without emit
npm run test           # Lint + type check
npm run prepublishOnly # Pre-publish validation (build + lint + format:check)

Git hooks (via Husky):

  • pre-commit: Auto-formats staged .ts files with Prettier and ESLint

Contributing:

  • Fork the repository and create feature branches
  • Test with multiple compaction types before submitting PRs
  • Follow TypeScript strict mode and MCP protocol standards

license

MIT

A Roman Emperor AD 41 by Lawrence Alma-Tadema (1871). Gratus of the Praetorian Guard discovers Claudius hiding behind a curtain and declares him emperor. Walters Art Museum, public domain.