Vvkmnn/claude-praetorian-mcp
Claude Praetorian MCP is a server designed for aggressive context compaction, enabling significant token savings by structuring and compacting data efficiently.
claude-praetorian-mcp
A Model Context Protocol (MCP) server for aggressive context compaction in Claude Code. Save 90%+ tokens by compacting web research, task outputs, and conversations into structured snapshots.
Inspired by a talk by Dexter Horthy of HumanLayer, his team's work on ACE: Advanced Context Engineering for Coding Agents and 12-Factor Agents, and the TOON (Token-Oriented Object Notation) format.
install
Requirements:
From shell:
claude mcp add claude-praetorian-mcp -- npx claude-praetorian-mcp
From inside Claude (restart required):
Add this to our global mcp config: npx claude-praetorian-mcp
Install this mcp: https://github.com/Vvkmnn/claude-praetorian-mcp
From any manually configurable mcp.json: (Cursor, Windsurf, etc.)
{
"mcpServers": {
"claude-praetorian-mcp": {
"command": "npx",
"args": ["claude-praetorian-mcp"],
"env": {}
}
}
}
There is no npm install required -- no external databases or services, only flat files.
However, if npx resolves the wrong package, you can force resolution with:
npm install -g claude-praetorian-mcp
Optionally, install the skill to teach Claude when to proactively use praetorian:
npx skills add Vvkmnn/claude-praetorian-mcp --skill claude-praetorian --global
# Optional: add --yes to skip interactive prompt and install to all agents
This makes Claude automatically compact after research, subagent tasks, and before context resets. The MCP works without the skill, but the skill improves discoverability.
plugin
For full automation with hooks, install from the claude-emporium marketplace:
/plugin marketplace add Vvkmnn/claude-emporium
/plugin install claude-praetorian@claude-emporium
The claude-praetorian plugin provides:
Hooks (targeted, fires only at high-value moments):
- Before EnterPlanMode → Restore prior compactions for this project
- Before context compaction → Save decisions/insights before context resets
- After WebFetch/WebSearch → Compact web research findings
- After SubagentStop → Compact subagent task results
Commands: /praetorian-compact, /praetorian-restore, /praetorian-search
Requires the MCP server installed first. See the emporium for other Claude Code plugins and MCPs.
features
MCP server for aggressive context compaction. Generates structured incremental snapshots to yield 90%+ token savings and easily refresh context with "frequent intentional compaction".
Runs project by project, saves artifacts to {$project}/.claude/praetorian (with a royal guard ⚜️):
praetorian_compact
(Incrementally) compact context using the TOON format to get the most valuable tokens from an activity.
⚜️ praetorian_compact type=<type> title=<title>
> "ACE Framework research - save 1,450 tokens"
> "Icon rendering bug investigation - compact the findings"
> "Database architecture decisions - preserve the rationale"
> "WebFetch results from authentication docs"
> "Task output from explore subagent - code structure analysis"
⚜️ compact | Created
┌─ ⚜️ ────────────────────────────────────────────────── Created ─┐
│ Compacted: "ACE Framework Research" • 1,450 tokens saved
│ Type: web_research • ID: cpt_1765245902396_nxetoc
└───────────────────────────────────────────────────────────────────┘
{
"type": "web_research",
"title": "ACE Framework Research",
"source": "https://github.com/humanlayer/ace-fca",
"key_insights": [
"Frequent intentional compaction saves 90%+ tokens",
"TOON format is 30-60% smaller than JSON/YAML",
"Compaction should happen after every expensive operation"
],
"refs": ["ace-fca.md:42 - compaction strategy", "toon-spec.md:1 - format definition"],
"recommendations": ["Compact after every WebFetch", "Use type='decisions' for architecture choices"]
}
⚜️ compact | Merged
┌─ ⚜️ ───────────────────────────────────────────────────── Merged ─┐
│ Compacted: "Authentication Patterns" • 890 tokens saved
│ Type: decisions • ID: cpt_1765245903512_xk9mp1
│ Merged with: cpt_1765245903512_xk9mp1
└────────────────────────────────────────────────────────────────────┘
{
"type": "decisions",
"title": "Authentication Patterns",
"decisions": [
{ "chose": "JWT with refresh tokens", "over": ["sessions", "API keys"], "reason": "Stateless, works across microservices" },
{ "chose": "httpOnly cookies", "over": ["localStorage"], "reason": "XSS protection" }
],
"refs": ["src/middleware/auth.ts:45 - token validation", "src/routes/login.ts:23 - refresh flow"],
"anti_patterns": ["Never store tokens in localStorage", "Never skip CSRF on cookie-based auth"]
}
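The two payloads above suggest a small set of typed shapes. The following is a minimal sketch of how they could be modelled and checked at runtime. The interface names, which fields are optional, and the `isCompaction` guard are all assumptions inferred from the examples; the server itself validates with Zod, and its real schemas may differ or include other compaction types.

```typescript
// Hypothetical shapes inferred from the example payloads; not the server's actual schema.
interface Decision {
  chose: string;
  over: string[];
  reason: string;
}

interface WebResearchCompaction {
  type: "web_research";
  title: string;
  source?: string;
  key_insights: string[];
  refs?: string[];
  recommendations?: string[];
}

interface DecisionsCompaction {
  type: "decisions";
  title: string;
  decisions: Decision[];
  refs?: string[];
  anti_patterns?: string[];
}

type Compaction = WebResearchCompaction | DecisionsCompaction;

const isStringArray = (v: unknown): v is string[] =>
  Array.isArray(v) && v.every((item) => typeof item === "string");

const isDecision = (d: unknown): boolean => {
  if (typeof d !== "object" || d === null) return false;
  const rec = d as Record<string, unknown>;
  return typeof rec.chose === "string" && isStringArray(rec.over) && typeof rec.reason === "string";
};

// Plain type guard standing in for Zod-style runtime validation.
// It checks only the required fields of each assumed shape.
function isCompaction(value: unknown): value is Compaction {
  if (typeof value !== "object" || value === null) return false;
  const rec = value as Record<string, unknown>;
  const title = rec.title;
  if (typeof title !== "string" || title.length === 0) return false;
  if (rec.type === "web_research") return isStringArray(rec.key_insights);
  if (rec.type === "decisions") {
    const decisions = rec.decisions;
    return Array.isArray(decisions) && decisions.every(isDecision);
  }
  return false;
}
```

Rejecting malformed input before it is encoded keeps bad snapshots out of the project's store, which matters because later compactions are merged into earlier ones.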
praetorian_restore
Search and restore context by injecting TOON tokens back into current context as needed.
⚜️ praetorian_restore query=<query>
> "What did we learn about authentication?"
> "Find the Docker container debugging session"
> "Show recent architecture decisions"
> "Search for MCP server implementation patterns"
> "" (empty = recent compactions)
⚜️ restore | Search
┌─ ⚜️ ───────────────────────────────────────────────────── Search ─┐
│ Found 2 compactions
│ Query: "authentication"
└────────────────────────────────────────────────────────────────────┘
{
"compactions": [
{
"id": "cpt_1765245903512_xk9mp1",
"type": "decisions",
"title": "Authentication Patterns",
"relevance": 0.85,
"key_insights": ["JWT with refresh tokens", "httpOnly cookies for XSS protection"],
"refs": ["src/middleware/auth.ts:45", "src/routes/login.ts:23"]
},
{
"id": "cpt_1765245902396_nxetoc",
"type": "web_research",
"title": "OAuth2 Best Practices",
"relevance": 0.72,
"key_insights": ["PKCE flow for public clients", "Token rotation every 15min"]
}
],
"total": 2
}
⚜️ restore | Recent
┌─ ⚜️ ───────────────────────────────────────────────────── Recent ─┐
│ Found 3 compactions
└────────────────────────────────────────────────────────────────────┘
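Claude Code issues these tool calls for you, but other MCP clients send them as JSON-RPC `tools/call` requests. Here is a minimal sketch of what those requests could look like. The `toolCall` helper is hypothetical; the argument names `type`, `title`, and `query` come from the usage lines above, and any further arguments the tools accept are not shown here.

```typescript
// JSON-RPC 2.0 envelope for an MCP tool call.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request envelope.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Save a compaction (payload fields such as key_insights are omitted here).
const compactRequest = toolCall("praetorian_compact", {
  type: "web_research",
  title: "ACE Framework Research",
});

// Keyword search over saved compactions.
const restoreRequest = toolCall("praetorian_restore", { query: "authentication" });

// An empty query lists recent compactions.
const recentRequest = toolCall("praetorian_restore", { query: "" });
```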
Status indicators:
- Created - New compaction saved
- Merged - Updated existing compaction (>70% title similarity via Jaccard index)
- Search - Search results returned (keyword matching)
- Recent - Recent compactions listed (by updated time)
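The Merged status depends on a Jaccard index over titles: the size of the intersection of two word sets divided by the size of their union. Below is a minimal sketch of that check. The tokenization rule (lowercase, split on non-alphanumeric characters) and the function names are assumptions; only the >70% threshold comes from this README.

```typescript
// Assumption: titles are lowercased and split on non-alphanumeric characters.
function tokenize(title: string): Set<string> {
  return new Set(title.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0));
}

// Jaccard index: |A ∩ B| / |A ∪ B|, a value between 0 and 1.
function jaccard(a: string, b: string): number {
  const setA = tokenize(a);
  const setB = tokenize(b);
  if (setA.size === 0 && setB.size === 0) return 1; // two empty titles count as identical
  let shared = 0;
  setA.forEach((token) => {
    if (setB.has(token)) shared++;
  });
  const union = setA.size + setB.size - shared;
  return shared / union;
}

const MERGE_THRESHOLD = 0.7; // ">70% title similarity"

function shouldMerge(existingTitle: string, incomingTitle: string): boolean {
  return jaccard(existingTitle, incomingTitle) > MERGE_THRESHOLD;
}
```

Under these assumptions, "ACE Framework Research" and "ACE Framework Research Notes" share 3 of 4 distinct words (0.75), so the second would merge into the first, while "Authentication Patterns" and "OAuth2 Best Practices" share none and would stay separate.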
Praetorian is designed for heavy, frequent use. The more you compact, the more you save.
When to compact:
- After every WebFetch
- After every Task/subagent completes
- After reading multiple files
- After making decisions
- During long conversations (proactive compaction)
- Before context gets >60% full
Real-world example session:
| Compaction | Before | After | Saved |
|---|---|---|---|
| Web research (3 URLs) | 4,500 | 300 | 4,200 |
| Subagent outputs (2) | 3,500 | 300 | 3,200 |
| Architecture debates | 5,000 | 300 | 4,700 |
| Hook research | 1,500 | 150 | 1,350 |
| Total | 14,500 | 1,050 | 13,450 (93%) |
Next session: restore() loads ~1,000 tokens. Instant resume, no re-research.
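The totals in the table can be reproduced with a few lines of arithmetic. The sketch below uses the four rows exactly as listed; the `Row` type and `summarize` function are illustrative names, not part of the server.

```typescript
interface Row {
  label: string;
  before: number; // tokens before compaction
  after: number; // tokens after compaction
}

const session: Row[] = [
  { label: "Web research (3 URLs)", before: 4500, after: 300 },
  { label: "Subagent outputs (2)", before: 3500, after: 300 },
  { label: "Architecture debates", before: 5000, after: 300 },
  { label: "Hook research", before: 1500, after: 150 },
];

// Sum both columns, then express the savings as a rounded percentage.
function summarize(rows: Row[]): { before: number; after: number; saved: number; percent: number } {
  const before = rows.reduce((sum, r) => sum + r.before, 0);
  const after = rows.reduce((sum, r) => sum + r.after, 0);
  const saved = before - after;
  return { before, after, saved, percent: Math.round((saved / before) * 100) };
}

const totals = summarize(session);
// totals is { before: 14500, after: 1050, saved: 13450, percent: 93 }
```

The unrounded ratio is 13,450 / 14,500 ≈ 92.8%, which the table reports as 93%.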
methodology
How claude-praetorian-mcp works:
⚜️ claude-praetorian-mcp
════════════════════════
compact (save) restore (load)
────────────── ──────────────
INPUT QUERY
│ │
▼ ▼
┌─────────┐ ┌─────────┐
│ Zod │ │ Inverted│
│Validate │ │ Index │
└────┬────┘ └────┬────┘
│ │
▼ ▼
┌─────────┐ ┌─────────┐
│ Jaccard │ │ TOON │
│ >70% ? │ │ Decode │
└──┬───┬──┘ └────┬────┘
│ │ │
new│ │match ▼
│ │ OUTPUT
▼ ▼
┌─────────┐
│ TOON │
│ Encode │
└────┬────┘
│
▼
┌─────────┐
│ Index │
│ Update │
└────┬────┘
│
▼
OUTPUT
storage: .claude/praetorian/
────────────────────────────
index.json word index + metadata
compactions/*.toon encoded compaction files
Core optimizations:
- TOON format: 30-60% fewer tokens than YAML/JSON
- Zod validation: Production-grade runtime type safety
- Jaccard similarity: Smart deduplication via title matching (>70% threshold)
- Inverted index: Fast keyword search without vector embeddings
- Smart merging: Combine similar compactions without duplication
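To illustrate how keyword search can work without embeddings, here is a minimal in-memory inverted index. Everything in it is an assumption for illustration: the `InvertedIndex` shape, the tokenizer, and the relevance rule (the fraction of query words a compaction contains). The actual layout of index.json and the server's scoring are not documented in this README.

```typescript
// Hypothetical shape: each word maps to the IDs of compactions containing it.
type InvertedIndex = Map<string, Set<string>>;

function words(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);
}

// Register every word of a compaction's text under its ID.
function addToIndex(index: InvertedIndex, id: string, text: string): void {
  for (const word of words(text)) {
    let ids = index.get(word);
    if (!ids) {
      ids = new Set<string>();
      index.set(word, ids);
    }
    ids.add(id);
  }
}

// Assumed scoring: relevance = matched query words / total distinct query words.
function search(index: InvertedIndex, query: string): { id: string; relevance: number }[] {
  const queryWords = Array.from(new Set(words(query)));
  if (queryWords.length === 0) return [];
  const hits = new Map<string, number>();
  for (const word of queryWords) {
    const ids = index.get(word);
    if (ids) ids.forEach((id) => hits.set(id, (hits.get(id) ?? 0) + 1));
  }
  const results: { id: string; relevance: number }[] = [];
  hits.forEach((count, id) => results.push({ id, relevance: count / queryWords.length }));
  return results.sort((a, b) => b.relevance - a.relevance);
}
```

Each lookup touches only the sets for the query's own words, so cost grows with the number of matching compactions, not with the total number stored.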
Design principles:
- Incremental -- merge similar compactions via Jaccard similarity, don't replace
- Token-minimal -- TOON format for 30-60% fewer tokens than YAML/JSON
- Project-scoped -- each project stores in its own .claude/praetorian/
- Flat files only -- no databases, no external services
- Offline -- never leaves your machine, no network calls
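The incremental principle means that a similar compaction extends the stored one instead of overwriting it. The sketch below shows one way this could work for list fields. The `StoredCompaction` shape, the rule that the existing ID and title are kept, and the exact-string deduplication are assumptions; the server's actual merge rules are not described here.

```typescript
// Hypothetical stored shape, reduced to two list fields for illustration.
interface StoredCompaction {
  id: string;
  title: string;
  key_insights: string[];
  refs: string[];
}

// Append items not already present, preserving first-seen order.
function union(existing: string[], incoming: string[]): string[] {
  const seen = new Set(existing);
  const merged = existing.slice();
  for (const item of incoming) {
    if (!seen.has(item)) {
      seen.add(item);
      merged.push(item);
    }
  }
  return merged;
}

// Assumption: the existing ID and title win; list fields are combined without duplicates.
function mergeCompaction(
  existing: StoredCompaction,
  incoming: Omit<StoredCompaction, "id">
): StoredCompaction {
  return {
    id: existing.id,
    title: existing.title,
    key_insights: union(existing.key_insights, incoming.key_insights),
    refs: union(existing.refs, incoming.refs),
  };
}
```

Neither input is mutated, so a failed write can be retried against the original record.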
File access:
- Stores in: <project>/.claude/praetorian/
- Format: .toon files (encoded compactions) + index.json (word index + metadata)
development
git clone https://github.com/Vvkmnn/claude-praetorian-mcp && cd claude-praetorian-mcp
npm install && npm run build
npm test
Package requirements:
- Node.js: >=20.0.0 (ES modules)
- Runtime: @modelcontextprotocol/sdk, @toon-format/toon, zod
- Zero external databases -- works with npx
Development workflow:
npm run build # TypeScript compilation with executable permissions
npm run dev # Watch mode with tsc --watch
npm run start # Run the MCP server directly
npm run lint # ESLint code quality checks
npm run lint:fix # Auto-fix linting issues
npm run format # Prettier formatting (src/)
npm run format:check # Check formatting without changes
npm run typecheck # TypeScript validation without emit
npm run test # Lint + type check
npm run prepublishOnly # Pre-publish validation (build + lint + format:check)
Git hooks (via Husky):
- pre-commit: Auto-formats staged .ts files with Prettier and ESLint
Contributing:
- Fork the repository and create feature branches
- Test with multiple compaction types before submitting PRs
- Follow TypeScript strict mode and MCP protocol standards
Learn from examples:
- Official MCP servers for reference implementations
- TypeScript SDK for best practices
- Creating Node.js modules for npm package development
license
A Roman Emperor AD 41 by Lawrence Alma-Tadema (1871). Gratus of the Praetorian Guard discovers Claudius hiding behind a curtain and declares him emperor. Walters Art Museum, public domain.