Kubernetes MCP Server
An interactive Kubernetes monitoring system built with Flask and the Model Context Protocol (MCP). This project provides an agentic interface for diagnosing cluster issues using natural language queries.
Features
- Flask MCP Server: Exposes Kubernetes cluster data via JSON-RPC endpoints
- Interactive Client: Ask questions like "What is the status of the checkout service?"
- OpenAI Integration: Uses GPT models to intelligently investigate cluster problems
- Kubernetes Integration: Real-time pod monitoring, events, and logs
- Colorized Output: Beautiful terminal interface with ANSI colors
Architecture
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Interactive   │───▶│    Flask MCP     │───▶│   Kubernetes    │
│     Client      │    │      Server      │    │     Cluster     │
│   (client.py)   │    │   (server.py)    │    │   (KIND/etc)    │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │
         ▼                       ▼
┌─────────────────┐    ┌──────────────────┐
│   OpenAI GPT    │    │  Static Fixtures │
│   (optional)    │    │  (metrics, etc)  │
└─────────────────┘    └──────────────────┘
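As a rough illustration of this flow (not the actual server.py, whose dispatch logic may differ), the server side can be pictured as a Flask /rpc endpoint that routes a JSON-RPC call to the Kubernetes API via the official Python client. The /rpc path matches the documented RPC_URL default; the JSON-RPC envelope, the "namespace" parameter, and the problem-pod heuristic are illustrative assumptions:

# Minimal sketch of the server-side flow; payload shape and parameter
# names are assumptions, not the real server.py schema.
from flask import Flask, request, jsonify
from kubernetes import client, config

app = Flask(__name__)
config.load_kube_config()          # honours KUBECONFIG if it is set
v1 = client.CoreV1Api()

def list_problem_pods(params):
    """Flag pods with waiting containers (e.g. CrashLoopBackOff) or a bad phase."""
    namespace = params.get("namespace", "default")
    problems = []
    for pod in v1.list_namespaced_pod(namespace).items:
        waiting = [cs.state.waiting.reason
                   for cs in (pod.status.container_statuses or [])
                   if cs.state.waiting]
        if waiting or pod.status.phase not in ("Running", "Succeeded"):
            problems.append({"name": pod.metadata.name,
                             "phase": pod.status.phase,
                             "reasons": waiting})
    return problems

@app.route("/rpc", methods=["POST"])
def rpc():
    req = request.get_json()
    result = list_problem_pods(req.get("params", {}))
    return jsonify({"jsonrpc": "2.0", "id": req.get("id"), "result": result})

if __name__ == "__main__":
    app.run(port=5050)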
Setup
Prerequisites
- Python 3.9+
- Kubernetes cluster (KIND recommended for local development)
- OpenAI API key (optional, fallback mode available)
Installation
- Clone the repository:
git clone https://github.com/YOUR_USERNAME/YOUR_REPO_NAME.git
cd YOUR_REPO_NAME
- Install dependencies:
pip install flask kubernetes openai requests
- Set up environment variables:
export OPENAI_API_KEY="your-api-key-here" # Optional
export KUBECONFIG="path/to/your/kubeconfig" # If not using default
Running the Server
cd mcp
python3 server.py
The server will start on http://localhost:5050
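Once it is up, you can sanity-check the /rpc endpoint with a small Python snippet. The JSON-RPC method name comes from the tool list further down; the exact payload shape expected by server.py is an assumption:

import requests

# Assumed JSON-RPC payload shape; adjust to match server.py if it differs.
payload = {"jsonrpc": "2.0", "id": 1,
           "method": "k8s.listProblemPods",
           "params": {"namespace": "default"}}
print(requests.post("http://localhost:5050/rpc", json=payload, timeout=5).json())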
Running the Interactive Client
cd mcp
python3 client.py
Usage
Interactive Mode
Start the client and ask natural language questions:
> what is the status of my checkout service?
> show failing pods in namespace staging
> summarize errors for service payments in the last 45 minutes
One-shot Mode
python3 client.py --ask "what pods are failing in default namespace?"
Available Tools
- k8s.listProblemPods - Find problematic pods
- k8s.getPodDetails - Get detailed pod information
- deployments.listRecentChanges - Recent deployment history
- metrics.getErrors - Error rate analysis
- traces.sampleErrors - Sample failing traces
- config.getDiff - Configuration changes
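For example, a client can invoke one of these tools directly over JSON-RPC. This is only a sketch: the parameter names mirror the SERVICE / NAMESPACE / SINCE_MINS configuration defaults, while the authoritative schema lives in tools_catalog.json.

import requests

RPC_URL = "http://127.0.0.1:5050/rpc"

# Illustrative call to metrics.getErrors; parameter names are assumed
# and may differ from the real schema defined in tools_catalog.json.
payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "metrics.getErrors",
    "params": {"service": "checkout", "namespace": "default", "since_mins": 120},
}
print(requests.post(RPC_URL, json=payload, timeout=10).json())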
Example Output
=== 🧩 FINAL ANSWER ===
📋 Summary:
The pod 'demo-fail-5df44cbf79-tqg6l' is experiencing CrashLoopBackOff
🔍 Evidence:
• Pod: demo-fail-5df44cbf79-tqg6l
  Status: Running
  Restarts: 115
  Reason: CrashLoopBackOff
⚠️ Probable Cause:
Application failing to start successfully due to exit code 1
🛠️ Safe Next Step:
Investigate application logs and configuration
✅ Confidence: High
Configuration
Environment variables:
- RPC_URL - MCP server URL (default: http://127.0.0.1:5050/rpc)
- OPENAI_API_KEY - OpenAI API key for LLM features
- OPENAI_MODEL - Model to use (default: gpt-4o-mini)
- SERVICE - Default service name (default: checkout)
- NAMESPACE - Default K8s namespace (default: default)
- SINCE_MINS - Time window for queries (default: 120)
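For reference, this is roughly how the client could resolve these settings; the defaults match the values listed above, though the actual variable handling in client.py may differ:

import os

# Defaults mirror the documented configuration values.
RPC_URL = os.getenv("RPC_URL", "http://127.0.0.1:5050/rpc")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")          # optional; enables LLM features
OPENAI_MODEL = os.getenv("OPENAI_MODEL", "gpt-4o-mini")
SERVICE = os.getenv("SERVICE", "checkout")
NAMESPACE = os.getenv("NAMESPACE", "default")
SINCE_MINS = int(os.getenv("SINCE_MINS", "120"))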
Development
Project Structure
mcp-demo/
├── mcp/
│   ├── server.py            # Flask MCP server
│   ├── client.py            # Interactive client
│   ├── tools_catalog.json   # Tool definitions
│   └── fixtures/            # Static test data
├── k8s/
│   └── deployment.yaml      # Sample K8s resources
└── README.md
Adding New Tools
- Add the tool definition to tools_catalog.json
- Implement the handler in server.py
- Test with the client
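As a rough sketch of what that involves (the handler below and the TOOL_HANDLERS registry are hypothetical; server.py's real dispatch code may be organised differently):

# Hypothetical new tool "k8s.getNodePressure": after describing it in
# tools_catalog.json, register a handler the /rpc dispatcher can route to.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

def get_node_pressure(params):
    """Return nodes currently reporting memory or disk pressure."""
    pressured = []
    for node in v1.list_node().items:
        for cond in node.status.conditions or []:
            if cond.type in ("MemoryPressure", "DiskPressure") and cond.status == "True":
                pressured.append({"node": node.metadata.name, "condition": cond.type})
    return pressured

# Assumed registration pattern: map the JSON-RPC method name to its handler.
TOOL_HANDLERS = {
    "k8s.getNodePressure": get_node_pressure,
}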
Demo
https://github.com/user-attachments/assets/e30a7a69-ff7a-46f1-a2ff-e75eff79334b
License
MIT License