infra-mcp

kishanrao92/infra-mcp

Kubernetes MCP Server

An interactive Kubernetes monitoring system built with Flask and the Model Context Protocol (MCP), with optional OpenAI GPT integration. This project provides an agentic interface for diagnosing cluster issues using natural language queries.

Features

  • Flask MCP Server: Exposes Kubernetes cluster data via JSON-RPC endpoints
  • Interactive Client: Ask questions like "What is the status of the checkout service?"
  • OpenAI Integration: Uses GPT models to intelligently investigate cluster problems
  • Kubernetes Integration: Real-time pod monitoring, events, and logs
  • Colorized Output: Terminal output highlighted with ANSI colors

Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Interactive   │───▶│   Flask MCP      │───▶│   Kubernetes    │
│     Client      │    │     Server       │    │    Cluster      │
│   (client.py)   │    │   (server.py)    │    │   (KIND/etc)    │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │
         ▼                       ▼
┌─────────────────┐    ┌──────────────────┐
│   OpenAI GPT    │    │  Static Fixtures │
│   (optional)    │    │  (metrics, etc)  │
└─────────────────┘    └──────────────────┘

Setup

Prerequisites

  • Python 3.9+
  • Kubernetes cluster (KIND recommended for local development)
  • OpenAI API key (optional, fallback mode available)

Installation

  1. Clone the repository:
git clone https://github.com/YOUR_USERNAME/YOUR_REPO_NAME.git
cd YOUR_REPO_NAME
  2. Install dependencies:
pip install flask kubernetes openai requests
  3. Set up environment variables:
export OPENAI_API_KEY="your-api-key-here"  # Optional
export KUBECONFIG="path/to/your/kubeconfig"  # If not using default

Running the Server

cd mcp
python3 server.py

The server will start on http://localhost:5050
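
Once the server is up, it can be queried without the interactive client. The following is a minimal sketch, not the project's own client code: it builds a JSON-RPC 2.0 request and posts it to the `/rpc` endpoint listed under Configuration. It assumes the server accepts a tool name directly as the JSON-RPC `method`; the `namespace` parameter used in the note below is also an assumption, so check `server.py` and `tools_catalog.json` for the real request shape.

```python
import json
import os
import urllib.request

# Default URL taken from the Configuration section; override with RPC_URL.
RPC_URL = os.environ.get("RPC_URL", "http://127.0.0.1:5050/rpc")


def build_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request envelope as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,  # assumption: the tool name is used as the method
        "params": params or {},
    }


def call_rpc(method, params=None, url=RPC_URL, timeout=10):
    """POST a JSON-RPC request to the server and return the decoded reply."""
    body = json.dumps(build_request(method, params)).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, a call such as `call_rpc("k8s.listProblemPods", {"namespace": "default"})` would return the decoded response, provided the assumptions above match the server.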

Running the Interactive Client

cd mcp
python3 client.py

Usage

Interactive Mode

Start the client and ask natural language questions:

> what is the status of my checkout service?
> show failing pods in namespace staging  
> summarize errors for service payments in the last 45 minutes

One-shot Mode

python3 client.py --ask "what pods are failing in default namespace?"
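
The two modes could be wired together roughly as follows. This is an illustrative sketch rather than the actual `client.py`: only the `--ask` flag comes from this README, while the `answer` callback, the `exit`/`quit` commands, and the returned list are assumptions made to keep the example self-contained.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; --ask switches to one-shot mode."""
    parser = argparse.ArgumentParser(description="Kubernetes MCP client (sketch)")
    parser.add_argument("--ask", metavar="QUESTION", default=None,
                        help="answer a single question and exit")
    return parser.parse_args(argv)


def run(argv=None, answer=lambda q: f"(answer for: {q})", read=input):
    """Return the answers produced in one-shot or interactive mode."""
    args = parse_args(argv)
    if args.ask is not None:
        # One-shot mode: answer once, then stop.
        return [answer(args.ask)]

    answers = []
    while True:
        try:
            question = read("> ").strip()
        except EOFError:
            break  # Ctrl-D ends the session
        if question in {"exit", "quit"}:  # hypothetical exit commands
            break
        if question:  # skip empty lines
            answers.append(answer(question))
    return answers
```

Passing `answer` and `read` as parameters keeps the loop testable without a terminal or a running server.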

Available Tools

  • k8s.listProblemPods - Find problematic pods
  • k8s.getPodDetails - Get detailed pod information
  • deployments.listRecentChanges - Recent deployment history
  • metrics.getErrors - Error rate analysis
  • traces.sampleErrors - Sample failing traces
  • config.getDiff - Configuration changes

Example Output

=== 🧩 FINAL ANSWER ===

📋 Summary:
  The pod 'demo-fail-5df44cbf79-tqg6l' is experiencing CrashLoopBackOff

🔍 Evidence:
  • Pod: demo-fail-5df44cbf79-tqg6l
    Status: Running
    Restarts: 115
    Reason: CrashLoopBackOff

⚠️  Probable Cause:
  Application failing to start successfully due to exit code 1

🛠️  Safe Next Step:
  Investigate application logs and configuration

✅ Confidence: High
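
A report with these sections can be produced by a small formatter. The sketch below is only an illustration of the idea: the dict keys (`summary`, `evidence`, `probable_cause`, `next_step`, `confidence`) and the choice of colors are assumptions, not the client's actual data model, and the emoji are omitted for brevity.

```python
# Standard ANSI escape sequences.
BOLD = "\033[1m"
CYAN = "\033[36m"
RESET = "\033[0m"


def render_answer(answer, color=True):
    """Format a diagnosis dict as a sectioned, optionally colorized report."""

    def heading(text):
        return f"{BOLD}{CYAN}{text}{RESET}" if color else text

    lines = [heading("=== FINAL ANSWER ==="), ""]
    lines += [heading("Summary:"), f"  {answer['summary']}", ""]
    lines.append(heading("Evidence:"))
    lines += [f"  • {item}" for item in answer["evidence"]]
    lines += ["", heading("Probable Cause:"), f"  {answer['probable_cause']}", ""]
    lines += [heading("Safe Next Step:"), f"  {answer['next_step']}", ""]
    lines.append(f"{heading('Confidence:')} {answer['confidence']}")
    return "\n".join(lines)
```

Calling `render_answer(answer, color=False)` yields plain text, which is convenient when output is piped to a file.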

Configuration

Environment variables:

  • RPC_URL - MCP server URL (default: http://127.0.0.1:5050/rpc)
  • OPENAI_API_KEY - OpenAI API key for LLM features
  • OPENAI_MODEL - Model to use (default: gpt-4o-mini)
  • SERVICE - Default service name (default: checkout)
  • NAMESPACE - Default K8s namespace (default: default)
  • SINCE_MINS - Time window for queries (default: 120)
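
As a minimal sketch of how these variables might be read, the loader below uses the names and defaults from the list above. The `ClientConfig` class and `load_config` function are hypothetical names introduced for illustration; the real client may read the variables differently.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClientConfig:
    rpc_url: str
    openai_api_key: Optional[str]  # None when no key is set
    openai_model: str
    service: str
    namespace: str
    since_mins: int


def load_config(env=None):
    """Read settings from a mapping (os.environ by default), applying defaults."""
    env = os.environ if env is None else env
    return ClientConfig(
        rpc_url=env.get("RPC_URL", "http://127.0.0.1:5050/rpc"),
        openai_api_key=env.get("OPENAI_API_KEY"),
        openai_model=env.get("OPENAI_MODEL", "gpt-4o-mini"),
        service=env.get("SERVICE", "checkout"),
        namespace=env.get("NAMESPACE", "default"),
        since_mins=int(env.get("SINCE_MINS", "120")),  # minutes
    )
```

For example, `load_config({"NAMESPACE": "staging", "SINCE_MINS": "45"})` overrides two settings and keeps the remaining defaults.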

Development

Project Structure

mcp-demo/
├── mcp/
│   ├── server.py           # Flask MCP server
│   ├── client.py           # Interactive client
│   ├── tools_catalog.json  # Tool definitions
│   └── fixtures/           # Static test data
├── k8s/
│   └── deployment.yaml     # Sample K8s resources
└── README.md

Adding New Tools

  1. Add tool definition to tools_catalog.json
  2. Implement handler in server.py
  3. Test with client
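
The steps above can be sketched as follows. Everything in this example is hypothetical: the `demo.echo` tool, the shape of the catalog entry, and the `register`/`dispatch` helpers are assumptions for illustration, since the actual schema of `tools_catalog.json` and the handler layout in `server.py` are not shown here. The error code -32601 is the standard JSON-RPC 2.0 "Method not found" code.

```python
# Step 1 (assumed shape): an entry as it might appear in tools_catalog.json.
CATALOG_ENTRY = {
    "name": "demo.echo",  # hypothetical tool name
    "description": "Return the message it was given (illustration only)",
    "parameters": {
        "type": "object",
        "properties": {"message": {"type": "string"}},
    },
}

# Step 2: a handler registry keyed by tool name.
HANDLERS = {}


def register(name):
    """Decorator that records a handler under the given tool name."""
    def decorator(func):
        HANDLERS[name] = func
        return func
    return decorator


@register("demo.echo")
def echo(params):
    return {"echo": params.get("message", "")}


def dispatch(request):
    """Route a JSON-RPC 2.0 request dict to its handler and wrap the result."""
    handler = HANDLERS.get(request.get("method"))
    if handler is None:
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {"code": -32601, "message": "Method not found"},
        }
    return {
        "jsonrpc": "2.0",
        "id": request.get("id"),
        "result": handler(request.get("params") or {}),
    }
```

Step 3 then amounts to sending a request for the new tool name and checking that the response carries a `result` rather than an `error`.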

Demo

https://github.com/user-attachments/assets/e30a7a69-ff7a-46f1-a2ff-e75eff79334b

License

MIT License