nomad3/techopsmind
OpsMind
Company-agnostic SRE operations platform with AI-powered monitoring, plugin architecture for multi-company support, and comprehensive infrastructure observability via Model Context Protocol (MCP).
Architecture: Plugin-based system supporting multiple companies with isolated configurations
Current Plugins: Integral FX trading (33 tools, 1,545 servers, 6 datacenters)
🚀 Quick Start
# Clone repository
git clone https://github.com/opsmind/opsmind.git
cd opsmind
# Install dependencies
poetry install
# Configure for your company
cp .env.example .env
# Edit .env with your credentials
# Start OpsMind server
poetry run opsmind-server
Connect to Claude Code
Add to ~/.config/claude-code/mcp_config.json:
{
  "mcpServers": {
    "opsmind": {
      "command": "poetry",
      "args": ["run", "opsmind-server"],
      "cwd": "/path/to/opsmind"
    }
  }
}
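If the config file already lists other MCP servers, the entry can be merged in rather than pasted over the whole file. The following is a minimal sketch, not part of OpsMind: `add_opsmind_server` is a hypothetical helper, and it only assumes the JSON layout shown above.

```python
import json
from pathlib import Path


def add_opsmind_server(config_path: Path, opsmind_dir: str) -> dict:
    """Add (or overwrite) the "opsmind" entry, keeping any other servers intact."""
    # Start from the existing config if there is one, otherwise from an empty dict.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["opsmind"] = {
        "command": "poetry",
        "args": ["run", "opsmind-server"],
        "cwd": opsmind_dir,
    }
    # Create the parent directory on first use, then write the merged config back.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `add_opsmind_server(Path.home() / ".config/claude-code/mcp_config.json", "/path/to/opsmind")` would write the same entry as the manual edit.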
Usage Example:
You: "Check health of ppfxidb1"
Assistant: [Uses check_server_health from Integral plugin]
You: "Analyze alerts in Singapore region"
Assistant: [Uses analyze_alerts with AI pattern detection]
🏗️ Architecture
Plugin-Based Design
OpsMind uses a plugin architecture to support multiple companies:
OpsMind Platform
├── Core Framework (company-agnostic)
│   ├── MCP server
│   ├── Plugin system
│   ├── Configuration management
│   └── Base classes
│
└── Company Plugins (company-specific)
    ├── Integral Plugin (33 tools)
    ├── Your Company Plugin
    └── Another Company Plugin
Each company gets:
- Custom hostname parsing
- Custom alert patterns
- Custom monitoring thresholds
- Custom MCP tools
- Isolated configuration
📦 Current Plugins
Integral Plugin (Built-in)
Infrastructure: 1,545 servers | 6 datacenters (LDN, NYC, SG, TYO, UAT, DR)
Tools: 33 MCP tools organized in 9 categories
| Category | Tools | Status |
|---|---|---|
| Infrastructure Monitoring | 5 | ✅ Production |
| SSH Operations | 4 | ⚠️ WIP |
| Alert Intelligence | 4 | ✅ Production |
| Vector Store | 10 | ✅ Production |
| Infrastructure Knowledge | 3 | ✅ Production |
| Operations Scripts | 2 | ✅ Production |
| Prometheus | 2 | ✅ Production |
| Ops Intelligence | 2 | ✅ Production |
| Auto-Ingestion | 1 | ✅ Production |
Features:
- AI-powered alert analysis (85% noise detection)
- Vector store (10,874 entries across 6 collections)
- Auto-ingestion (hourly Slack + daily Confluence)
- FX trading infrastructure monitoring
- Hostname parsing (ppfxidb1, spfxiclob5 conventions)
🔧 Creating a Plugin for Your Company
Quick Start (4-8 hours)
1. Copy Template:
cp -r plugins/template plugins/your_company
cd plugins/your_company
2. Edit plugin.yaml:
name: "your_company"
version: "1.0.0"
author: "Your Team"
description: "Your company infrastructure monitoring"
class_name: "YourCompanyPlugin"
3. Customize Hostname Parsing:
Edit hostname_parser.py to match your naming conventions:
def parse(self, hostname: str) -> Dict[str, Any]:
    # Your logic: web-prod-us-01 → {environment: "prod", location: "us-east", ...}
    pass
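As an illustration of what that logic might look like, here is a minimal sketch for a made-up convention of the form `<role>-<environment>-<region>-<index>`. The regular expression, the `REGION_TO_LOCATION` mapping, and the returned field names are assumptions for this example only, and `parse` is written as a plain function rather than a plugin method so it can be run on its own.

```python
import re
from typing import Any, Dict

# Hypothetical convention: <role>-<environment>-<region>-<index>, e.g. web-prod-us-01.
HOSTNAME_RE = re.compile(
    r"^(?P<role>[a-z]+)-(?P<environment>[a-z]+)-(?P<region>[a-z]+)-(?P<index>\d+)$"
)

# Assumed mapping from the short region code to a location label.
REGION_TO_LOCATION = {"us": "us-east", "eu": "eu-west"}


def parse(hostname: str) -> Dict[str, Any]:
    """Split a hostname into its parts; raise ValueError if it does not match."""
    match = HOSTNAME_RE.match(hostname.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized hostname: {hostname!r}")
    region = match["region"]
    return {
        "role": match["role"],
        "environment": match["environment"],
        # Fall back to the raw region code when no mapping is known.
        "location": REGION_TO_LOCATION.get(region, region),
        "index": int(match["index"]),
    }
```

Raising on unmatched hostnames, instead of returning a partial result, keeps malformed names from silently flowing into monitoring queries; a real plugin may prefer a softer fallback.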
4. Configure Thresholds:
Edit monitoring_config.py with your thresholds:
def get_thresholds(self) -> Dict[str, float]:
    return {
        "cpu_warning": 80,  # Your thresholds
        "memory_critical": 95,
    }
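To show how such a threshold table could be consumed, the sketch below classifies a single metric reading. It assumes keys follow a `<metric>_<severity>` pattern (inferred from `cpu_warning` and `memory_critical`) and that a reading at or above the limit triggers that severity; `classify` is a hypothetical helper, not an OpsMind function.

```python
from typing import Dict


def get_thresholds() -> Dict[str, float]:
    # Same example values as in the plugin template.
    return {"cpu_warning": 80, "memory_critical": 95}


def classify(metric: str, value: float, thresholds: Dict[str, float]) -> str:
    """Return "critical", "warning", or "ok" for one metric reading."""
    # Check the most severe level first so it wins when both limits are exceeded.
    for severity in ("critical", "warning"):
        limit = thresholds.get(f"{metric}_{severity}")
        if limit is not None and value >= limit:
            return severity
    return "ok"
```

With the example values, a CPU reading of 85 is classified as a warning, while a memory reading of 90 is still "ok" because no `memory_warning` key is defined.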
5. Enable Plugin:
# In config/your_company.yaml
plugins:
  enabled:
    - your_company

# Start server
poetry run opsmind-server
See: PLUGIN_DEVELOPMENT_GUIDE.md (coming in Batch 5) for full details
🛠️ Components
Core Framework (Company-Agnostic)
Plugin System (core/plugins/):
- Plugin interface (SREPlugin)
- Plugin discovery and loading
- Configuration validation
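The plugin system described above can be pictured roughly as follows. This is a sketch under stated assumptions: only the name `SREPlugin` comes from this README, while the method names, the `PluginRegistry` class, and the decorator-based registration are hypothetical and may differ from the code in core/plugins/.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Type


class SREPlugin(ABC):
    """Assumed shape of the plugin interface; the method names are illustrative."""

    name: str = ""

    @abstractmethod
    def parse_hostname(self, hostname: str) -> Dict[str, Any]:
        """Turn a hostname into structured fields."""

    @abstractmethod
    def get_thresholds(self) -> Dict[str, float]:
        """Return monitoring thresholds for this company."""


class PluginRegistry:
    """Hypothetical registry mapping plugin names to plugin classes."""

    def __init__(self) -> None:
        self._classes: Dict[str, Type[SREPlugin]] = {}

    def register(self, plugin_cls: Type[SREPlugin]) -> Type[SREPlugin]:
        if not plugin_cls.name:
            raise ValueError("plugin class must define a non-empty name")
        self._classes[plugin_cls.name] = plugin_cls
        return plugin_cls

    def load(self, enabled: List[str]) -> List[SREPlugin]:
        # Validate the enabled list before instantiating anything.
        missing = [n for n in enabled if n not in self._classes]
        if missing:
            raise KeyError(f"unknown plugins: {missing}")
        return [self._classes[n]() for n in enabled]


registry = PluginRegistry()


@registry.register
class YourCompanyPlugin(SREPlugin):
    name = "your_company"

    def parse_hostname(self, hostname: str) -> Dict[str, Any]:
        return {"hostname": hostname}

    def get_thresholds(self) -> Dict[str, float]:
        return {"cpu_warning": 80, "memory_critical": 95}


plugins = registry.load(["your_company"])
```

Keeping the interface abstract means a plugin that forgets to implement a method fails at load time rather than at the first tool call.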
Configuration (core/config/):
- Pydantic models with validation
- Environment variable substitution
- Multi-environment support
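Environment variable substitution can be sketched with the standard library alone. The `${VAR}` placeholder syntax, the `substitute_env` helper, and the variable names in the usage note are assumptions for illustration; the actual code in core/config/ uses Pydantic models and may use a different syntax.

```python
import re
from typing import Any, Mapping

# Assumed placeholder syntax: ${VARIABLE_NAME}
_VAR_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(value: Any, env: Mapping[str, str]) -> Any:
    """Recursively replace ${VAR} placeholders in strings, dicts, and lists."""
    if isinstance(value, str):
        def replace(match: "re.Match[str]") -> str:
            name = match.group(1)
            if name not in env:
                # Fail loudly so a missing secret is caught at startup.
                raise KeyError(f"environment variable {name} is not set")
            return env[name]

        return _VAR_RE.sub(replace, value)
    if isinstance(value, dict):
        return {key: substitute_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [substitute_env(item, env) for item in value]
    # Numbers, booleans, and None pass through unchanged.
    return value
```

In practice the mapping would be `os.environ`, for example `substitute_env(raw_config, os.environ)` applied to the parsed configuration before validation.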
Security (Planned):
- Input validation
- API key authentication
- Audit logging
OpsMind Package
Current State: Contains all implementation (30 modules)
- MCP server (4,346 lines, all 33 tools)
- Monitoring modules (5 types)
- Vector store (ChromaDB, 10,874 entries)
- SSH multi-hop
- Alert intelligence
- Resource monitoring
Future State: Will be split into:
- Generic base classes in core/
- Integral-specific tools in plugins/integral/tools/
📚 Documentation
For Users
- README.md - This file (getting started)
- OPSMIND_REFACTOR_STATUS.md - Current refactor status
- docs/RUNBOOK.md - Operations procedures
For Developers
- REFACTORING_PLAN.md - Complete refactoring plan
- REFACTOR_IMPLEMENTATION_GUIDE.md - Step-by-step implementation
- REFACTOR_FAST_TRACK.md - 1-week fast-track plan
- CLAUDE.md - AI assistant guide (needs updating)
For Plugin Developers
- plugins/integral/ - Reference implementation
- plugins/test_plugin/ - Minimal example
- PLUGIN_DEVELOPMENT_GUIDE.md - Coming soon
🔒 Security
Current (Pragmatic):
- ⚠️ Input validation (in progress)
- ✅ Secrets via environment variables
- ✅ Proprietary license
Planned:
- API key authentication (optional)
- Enhanced input sanitization
- Audit logging
Not Planned (internal use):
- ❌ RBAC (not needed for internal)
- ❌ Vault integration (env vars sufficient)
- ❌ Encryption at rest (optional)
📝 Version & Status
Current: 1.0.0 (OpsMind)
Previous: 0.7.1 (Integral SRE Assistant)
Status: ✅ Phase 1 Complete (Plugin Architecture Proven)
Branch: refactor/company-agnostic
Commits: 4 checkpoint commits
Time Invested: ~12-16 hours
Remaining (if continuing): ~10-14 hours
What Works:
- ✅ OpsMind server starts successfully
- ✅ Plugin system loads Integral plugin
- ✅ All 33 tools available
- ✅ Hostname parsing, alerts, thresholds extracted
- ✅ Memory leaks fixed
- ✅ Resource monitoring active
What's Next:
- Test current state thoroughly
- Optionally continue with Batches 4-5
- Or create second company plugin to validate
🤝 Built With
- Model Context Protocol - AI tool integration
- Claude Code - AI-powered development
- Google Gemini - AI intelligence
- Poetry - Dependency management
- Pydantic - Configuration validation
- ChromaDB - Vector store
- Prometheus - Metrics collection
📄 License
Proprietary - Internal use only
For detailed documentation, see the files listed under Documentation, starting with OPSMIND_REFACTOR_STATUS.md 📖