bgregorutti/mcp-local-agent
If you are the rightful owner of mcp-local-agent and would like to certify it and/or have it hosted online, please leave a comment on the right or send an email to dayong@mcphub.com.
The Model Context Protocol (MCP) Server facilitates efficient task delegation to local models, optimizing cost and maintaining quality through iterative feedback.
Local Agent MCP Server
Delegate code generation tasks from Claude Code to local models (LM Studio/Ollama) while maintaining quality through iterative feedback. Save 60-90% on API costs.
Core Features
- 🎯 Intelligent Delegation - Claude Code automatically delegates tasks to local models
- 🔄 Feedback Loop - Iterative review and improvement (max 3 iterations)
- 📊 Real-Time Statistics - Track costs and savings on every delegation
- 💾 Persistent Tracking - Statistics saved across sessions
- ⚙️ Configurable Pricing - Accurate cost calculations for any model
- 🌍 Global or Per-Project - Works everywhere or specific projects
How It Works
User Request → Claude Code (Sonnet) → Delegates Simple/Medium Tasks
↓
MCP Server
↓
Local Model (Free)
↓
Review & Iterate
↓
Final Output ✅
Result: Claude handles architecture and complex decisions. Local model handles code generation. You save money.
Quick Start
1. Prerequisites
# Install MCP SDK
pip install mcp
# Start a local model backend (choose one):
# Option A: LM Studio (recommended)
# 1. Download from lmstudio.ai
# 2. Load Qwen 2.5 Coder 7B
# 3. Start server (http://localhost:1234)
# Option B: Ollama
curl -fsSL https://ollama.com/install.sh | sh
ollama pull deepseek-coder:6.7b
ollama serve
2. Configure MCP Server
# Global (all projects)
python configure/configure_mcp_global.py
# Or project-specific
python configure/configure_mcp.py
This adds the MCP server to ~/.claude.json.
3. Enable Automatic Delegation (Recommended)
# Copy template to your project
cp CLAUDE-template.md CLAUDE.md
# Or create global default
mkdir -p ~/.claude
cp CLAUDE-template.md ~/.claude/CLAUDE.md
Important:
- MCP config = Makes tools available
- CLAUDE.md = Tells Claude when to use them
Both are needed for automatic delegation.
4. Restart Claude Code
Close and reopen VS Code/Claude Code completely.
Available Tools
1. execute_with_feedback_loop (Primary)
Best for most coding tasks. Local model generates → Claude reviews → iterates until approved.
{
"task_id": "add-auth-001",
"task_type": "backend",
"prompt": "Create user authentication endpoints",
"quality_criteria": ["Include error handling", "Add type hints"]
}
2. execute_with_local_model (Quick)
For simple tasks that don't need review (boilerplate, formatting, docs).
{
"prompt": "Generate a Python dataclass for User with name, email, age"
}
3. provide_feedback (Review)
Claude uses this to provide specific feedback for iteration.
4. compare_iterations (Analysis)
View improvement across iterations with statistics.
5. get_statistics_summary (Reporting)
Get cumulative statistics across all delegation tasks.
{
"total_tasks": 47,
"cost_analysis": {
"total_savings_usd": 3.71,
"savings_percent": 86.9
}
}
Configuration
Environment Variables
Backend:
export BACKEND_TYPE=lmstudio # or ollama, openai-compatible
export BACKEND_URL=http://localhost:1234/v1
Pricing (for accurate cost tracking):
export REMOTE_INPUT_COST_PER_1K=0.003 # Claude Sonnet input
export REMOTE_OUTPUT_COST_PER_1K=0.015 # Claude Sonnet output
export LOCAL_COST_PER_1K=0.0 # Local models are free
Statistics & Cost Tracking
Every delegation returns comprehensive statistics:
{
"statistics": {
"local_model_usage": {
"tokens": {"sent": 1500, "received": 800},
"time_seconds": 2.34
},
"cost_analysis": {
"local_model_cost_usd": 0.00,
"estimated_cost_if_fully_remote_usd": 0.0165,
"actual_cost_usd": 0.0024,
"savings_usd": 0.0141,
"savings_percent": 85.5
}
}
}
Statistics are automatically saved to .mcp_stats.json and persist across sessions.
Delegation Strategy
Claude Code should delegate:
✅ To Local Model:
- Boilerplate and simple code generation
- Refactoring and formatting
- Test generation
- API endpoints and CRUD operations
- Documentation
🎯 Handle Yourself (Remote):
- Architecture and design decisions
- Security-critical code
- Complex algorithms
- Novel/ambiguous requirements
- Final integration and review
Target: 60-80% of tasks delegated for 60-90% cost savings.
Tips
- Use Feedback Loops - Better quality than quick delegation
- Be Specific in Feedback - "Add email validation" beats "improve validation"
- Check Statistics - Use
get_statistics_summaryto track savings - Iterate Up to 3 Times - If not fixed, handle it yourself
- Customize CLAUDE.md - Tailor delegation rules per project
Troubleshooting
"Connection error"
- Ensure LM Studio/Ollama is running
- Check model is loaded
- Verify server URL:
curl http://localhost:1234/v1/models
"Tools not available"
- Run config script:
python configure/configure_mcp_global.py - Restart Claude Code completely
- Check
~/.claude.jsoncontains "local-agents"
"Not delegating automatically"
- Create CLAUDE.md from template
- Restart Claude Code after adding CLAUDE.md
File Structure
mcp-local-agent/
├── local_agent_mcp_server.py # Main MCP server
├── configure/
│ ├── configure_mcp_global.py # Global setup
│ └── configure_mcp.py # Project setup
├── CLAUDE-template.md # Template for project guidelines
├── README.md # This file
└── .mcp_stats.json # Statistics (auto-generated)
Advanced
Custom System Prompts
{
"system_prompt": "You are a senior Python developer. Follow PEP 8 strictly. Always include comprehensive error handling and type hints."
}
Global vs Project-Specific
- Global:
configure_mcp_global.py→ Works everywhere - Project:
configure_mcp.py→ Only current project - Both supported: Can override global with project settings
Recommended Models
LM Studio:
- Qwen 2.5 Coder 7B (best balance)
- DeepSeek Coder V2 16B (highest quality)
- CodeLlama 13B (alternative)
Ollama:
deepseek-coder:6.7bcodellama:13b
Documentation
CLAUDE-template.md- Project guidelines templatedocs/DELEGATION_STRATEGY.md- Decision frameworktest_hybrid_workflow.py- Example workflows
Support
- Issues: GitHub Issues
- MCP Docs: MCP Documentation
Status: 🟢 Production Ready | Cost Savings: 60-90% | Quality: Maintained