Chaos Mesh MCP Server

Model Context Protocol (MCP) server for Chaos Mesh chaos engineering operations.

This MCP server provides AI assistants like Claude with direct access to Chaos Mesh for automated chaos engineering and resilience testing. Create, manage, and validate chaos experiments through natural language conversations.

Features

Chaos Types (25 tools)

  • NetworkChaos (4 tools): Simulate network delays, packet loss, partitions, and corruption
  • StressChaos (3 tools): Apply CPU and memory stress to containers
  • PodChaos (3 tools): Kill pods, fail pods, or kill specific containers
  • IOChaos (4 tools): Inject I/O latency, faults, attribute changes, and data corruption
  • HTTPChaos (4 tools): Abort connections, inject delays, replace/patch HTTP content
  • DNSChaos (2 tools): Return DNS errors or random IPs for specified domains
  • PhysicalMachineChaos (5 tools): Inject chaos on physical/virtual machines (requires Chaosd)

Management & Validation (9 tools)

  • Environment Validation (3 tools): Check prerequisites, verify component status, get chaos-specific requirements
  • Experiment Management (6 tools): Query status, list experiments, delete, pause, resume, get events

Tested Environment

This package has been tested and verified with:

  • Chaos Mesh: v2.8.0
  • Kubernetes: v1.27+ (tested with v1.27.6)
  • kubectl: v1.27.6+
  • Python: 3.10, 3.11, 3.12

Prerequisites

Before using this MCP server, ensure you have:

  1. kubectl installed and configured
  2. Kubernetes cluster accessible via kubectl (v1.15+)
  3. Chaos Mesh installed in the cluster (v2.6+ recommended)

Component-Specific Requirements

  • DNSChaos: Requires chaos-dns-server pod running in chaos-mesh namespace
  • PhysicalMachineChaos: Requires Chaosd agent on target physical/virtual machines
  • Other chaos types: Only require standard Chaos Mesh components (controller-manager, daemon)

Tip: Use the validate_environment tool to check your setup!
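
For a quick check outside the MCP server, the prerequisites above can be approximated in a few lines of Python. This is a minimal sketch, not the server's own validation code: the function names are hypothetical, and it assumes the default `kubectl get pods` column layout (NAME, READY, STATUS, ...).

```python
import shutil
import subprocess

# Components the README lists as required for most chaos types
REQUIRED = ("chaos-controller-manager", "chaos-daemon")


def missing_components(pods_output: str, required=REQUIRED) -> list[str]:
    """Return required component prefixes that have no Running pod.

    `pods_output` is the plain-text output of `kubectl get pods -n chaos-mesh`.
    """
    running = []
    for line in pods_output.splitlines()[1:]:  # skip the header row
        cols = line.split()
        if len(cols) >= 3 and cols[2] == "Running":
            running.append(cols[0])
    return [c for c in required if not any(name.startswith(c) for name in running)]


def check_prerequisites(namespace: str = "chaos-mesh") -> list[str]:
    """Return a list of problems; an empty list means the basics look fine."""
    if shutil.which("kubectl") is None:
        return ["kubectl"]
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return ["cluster access"]
    return missing_components(result.stdout)
```

Calling `check_prerequisites()` on a machine with a working setup should return an empty list; anything else names what to fix first.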

Installation

Step 1: Install Chaos Mesh

If Chaos Mesh is not already installed on your cluster:

# Using Helm (Recommended)
helm repo add chaos-mesh https://charts.chaos-mesh.org
helm install chaos-mesh chaos-mesh/chaos-mesh \
  --namespace chaos-mesh \
  --create-namespace \
  --version 2.8.0

# Verify installation
kubectl get pods -n chaos-mesh
# Should see: chaos-controller-manager, chaos-daemon, chaos-dashboard

Step 2: Install MCP Server

Option 1: Install from GitHub (Recommended)

pip install git+https://github.com/ernestolee13/chaos-mesh-mcp.git

Option 2: Install from Source

git clone https://github.com/ernestolee13/chaos-mesh-mcp.git
cd chaos-mesh-mcp
pip install -e .

Quick Start

1. Configure Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS), ~/.config/claude/claude_desktop_config.json (Linux), or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "chaos-mesh": {
      "command": "python",
      "args": ["-m", "chaos_mesh_mcp.server"]
    }
  }
}
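
If the config file already lists other MCP servers, the entry has to be merged rather than pasted over the file. The following is a small sketch of that merge; the helper names are hypothetical, and it assumes the file is plain JSON with a top-level `mcpServers` object as shown above.

```python
import json
from pathlib import Path

# The server entry from the configuration snippet above
SERVER_ENTRY = {"command": "python", "args": ["-m", "chaos_mesh_mcp.server"]}


def add_chaos_mesh_server(config: dict) -> dict:
    """Return a copy of `config` with the chaos-mesh entry under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep existing servers
    servers["chaos-mesh"] = SERVER_ENTRY
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the config (or start empty), merge the entry, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_chaos_mesh_server(config), indent=2))
```

Pass `update_config_file` the path for your platform. Other keys and servers in the file are left untouched.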

2. Restart Claude Desktop

Restart Claude Desktop to load the MCP server.

3. Validate Your Environment

In Claude Desktop, ask:

You: "Validate my Chaos Mesh environment"
Claude: [Checks kubectl, cluster connection, Chaos Mesh installation, and components]

4. Start Using Chaos Engineering

Example conversations:

You: "Create a network delay of 500ms for pods with label app=api for 2 minutes"
Claude: [Creates NetworkChaos experiment]

You: "Check if DNSChaos is available"
Claude: [Runs check_chaos_type_requirements for dns]

You: "List all active chaos experiments in default namespace"
Claude: [Lists current experiments with their status]
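
To make the first conversation concrete, here is roughly what a NetworkChaos resource for that request could look like, built as a Python dictionary. The field names follow the Chaos Mesh `v1alpha1` NetworkChaos schema. The function name, the experiment name `api-delay`, and the choice of `mode: all` are illustrative assumptions; the manifest this server actually renders may differ.

```python
import json


def network_delay_manifest(name, namespace, labels, latency, duration):
    """Build a NetworkChaos delay experiment as a plain dict."""
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "NetworkChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "action": "delay",
            "mode": "all",  # assumption: target every matching pod
            "selector": {
                "namespaces": [namespace],
                "labelSelectors": labels,
            },
            "delay": {"latency": latency},
            "duration": duration,
        },
    }


# "500ms delay for pods with label app=api for 2 minutes"
manifest = network_delay_manifest(
    "api-delay", "default", {"app": "api"}, "500ms", "2m"
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output can be saved to a file and applied with `kubectl apply -f`.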

Architecture

chaos-mesh-mcp/
├── chaos_mesh_mcp/
│   ├── server.py           # MCP server main entry point
│   ├── kubectl.py          # Kubectl command runner
│   ├── templates.py        # YAML template rendering
│   ├── validators.py       # Parameter validation
│   └── tools/
│       ├── network.py      # NetworkChaos (4 tools)
│       ├── stress.py       # StressChaos (3 tools)
│       ├── pod.py          # PodChaos (3 tools)
│       ├── io.py           # IOChaos (4 tools)
│       ├── http.py         # HTTPChaos (4 tools)
│       ├── dns.py          # DNSChaos (2 tools)
│       ├── physical.py     # PhysicalMachineChaos (5 tools)
│       ├── validation.py   # Environment validation (3 tools)
│       └── management.py   # Experiment management (6 tools)
├── pyproject.toml          # Package metadata
├── LICENSE                 # MIT License
└── README.md               # This file
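
The layout suggests a three-stage flow for each tool call: validate parameters, render a manifest, and hand it to kubectl. The sketch below illustrates that flow under stated assumptions. It is not the code in `validators.py` or `kubectl.py`; the function names are hypothetical, and the manifest is sent as JSON on stdin instead of being rendered from a YAML template.

```python
import json
import re
import subprocess

# Go-style duration such as "30s", "2m", or "1h30m"
_DURATION = re.compile(r"(\d+(\.\d+)?(ns|us|ms|s|m|h))+")


def is_valid_duration(value: str) -> bool:
    """Validation stage: accept only Go-style duration strings."""
    return _DURATION.fullmatch(value) is not None


def apply_manifest(manifest: dict, runner=subprocess.run):
    """Validate the duration, then pipe the manifest to `kubectl apply -f -`.

    `runner` is injectable so the function can be exercised without a cluster.
    """
    duration = manifest["spec"].get("duration")
    if duration is not None and not is_valid_duration(duration):
        raise ValueError(f"invalid duration: {duration!r}")
    namespace = manifest["metadata"].get("namespace", "default")
    command = ["kubectl", "apply", "-n", namespace, "-f", "-"]
    return runner(
        command, input=json.dumps(manifest),
        capture_output=True, text=True, check=False,
    )
```

Rejecting bad parameters before calling kubectl keeps error messages close to the user's request instead of surfacing as an API server rejection.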

Environment Validation

This MCP includes comprehensive environment validation:

validate_environment
├─ Check kubectl availability & version
├─ Check cluster connectivity
├─ Check Chaos Mesh installation
├─ Check CRDs (7 chaos types)
└─ Check components
   ├─ chaos-controller-manager (required)
   ├─ chaos-daemon (required)
   ├─ chaos-dns-server (optional, for DNSChaos)
   └─ chaos-dashboard (optional)

check_chaos_type_requirements(chaos_type)
├─ Verify specific CRD installed
├─ Check required components running
└─ Show external requirements (e.g., Chaosd for Physical)
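
The CRD step of this tree can be sketched as a comparison between the CRDs the cluster reports and the seven the server needs. The short type keys (`network`, `dns`, ...) and function names are assumptions for illustration. The CRD names follow the usual `<plural>.chaos-mesh.org` pattern and are worth confirming against your cluster with `kubectl get crds`.

```python
# Assumed mapping from short chaos type to CRD name
CHAOS_CRDS = {
    "network": "networkchaos.chaos-mesh.org",
    "stress": "stresschaos.chaos-mesh.org",
    "pod": "podchaos.chaos-mesh.org",
    "io": "iochaos.chaos-mesh.org",
    "http": "httpchaos.chaos-mesh.org",
    "dns": "dnschaos.chaos-mesh.org",
    "physical": "physicalmachinechaos.chaos-mesh.org",
}


def installed_crds(kubectl_output: str) -> set[str]:
    """Parse `kubectl get crds -o name` output into bare CRD names."""
    names = set()
    for line in kubectl_output.splitlines():
        line = line.strip()
        if line:
            # Lines look like "customresourcedefinition.../<crd-name>"
            names.add(line.split("/")[-1])
    return names


def missing_chaos_types(kubectl_output: str) -> list[str]:
    """Return the chaos types whose CRD is not installed."""
    present = installed_crds(kubectl_output)
    return [kind for kind, crd in CHAOS_CRDS.items() if crd not in present]
```

An empty result from `missing_chaos_types` corresponds to the "Check CRDs (7 chaos types)" step passing.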

Troubleshooting

DNSChaos Not Working

# Check if chaos-dns-server is running
kubectl get pods -n chaos-mesh | grep dns

# If not present, install with DNS support:
helm upgrade chaos-mesh chaos-mesh/chaos-mesh --namespace=chaos-mesh --set dnsServer.create=true

PhysicalMachineChaos Not Working

PhysicalMachineChaos requires Chaosd agent on target machines:

Setup Instructions
  1. Install Chaosd on Target Machine (physical or virtual):

    wget https://mirrors.chaos-mesh.org/chaosd-latest-linux-amd64.tar.gz
    tar -xzf chaosd-latest-linux-amd64.tar.gz
    sudo cp chaosd-latest-linux-amd64/chaosd /usr/local/bin/
    sudo apt-get install -y stress-ng  # Required for CPU/Memory stress
    
  2. Start Chaosd Server:

    sudo chaosd server --port 31767
    
  3. Use the Correct Address Format (⚠️ Important):

    # ✓ CORRECT - No protocol prefix
    await create_physical_stress_cpu(
        namespace="default",
        address=["192.168.1.100:31767"],  # IP:PORT only
        duration="60s",
        workers=2,
        load=80
    )
    
    # ✗ WRONG - Including http:// causes HTTPS error
    await create_physical_stress_cpu(
        namespace="default",
        address=["http://192.168.1.100:31767"],  # Don't do this!
        duration="60s"
    )
    
Important Notes
  • Address format: Use IP:PORT or hostname:PORT without http:// or https:// prefix
  • Chaosd can run on localhost: The machine can inject chaos on itself
  • TLS: For production, configure Chaosd with HTTPS (optional for testing)
  • Clock action: Requires pid parameter to target specific process
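
Since a stray protocol prefix is the most common mistake here, it can help to check addresses before passing them to a PhysicalMachineChaos tool. The helper below is a hypothetical pre-flight check, not part of the package. It handles IPv4 and hostname addresses only.

```python
def validate_chaosd_address(address: str) -> str:
    """Return `address` unchanged if it is HOST:PORT, else raise ValueError."""
    if "://" in address:
        raise ValueError(
            f"{address!r}: remove the protocol prefix, use IP:PORT or hostname:PORT"
        )
    host, sep, port = address.rpartition(":")
    if not sep or not host:
        raise ValueError(f"{address!r}: expected HOST:PORT")
    if not port.isdigit() or not 0 < int(port) < 65536:
        raise ValueError(f"{address!r}: port must be a number from 1 to 65535")
    return address


# Accepted: bare IP and port, as in the correct example above
print(validate_chaosd_address("192.168.1.100:31767"))
```

Passing `"http://192.168.1.100:31767"` raises a `ValueError` with a message that points at the prefix.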

See: Chaosd Documentation

Permission Errors

Ensure your Kubernetes user has appropriate RBAC permissions:

# Check if you can create chaos experiments
kubectl auth can-i create networkchaos --all-namespaces
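
The same check can be repeated for every chaos type in one pass. This sketch loops `kubectl auth can-i` over the seven resource names. The resource names and the function are assumptions for illustration, and the `runner` parameter exists only so the logic can be tested without a cluster.

```python
import subprocess

# Assumed resource names for the seven chaos types
CHAOS_RESOURCES = [
    "networkchaos", "stresschaos", "podchaos", "iochaos",
    "httpchaos", "dnschaos", "physicalmachinechaos",
]


def denied_resources(namespace: str, verb: str = "create",
                     runner=subprocess.run) -> list[str]:
    """Return the chaos resources the current user may NOT `verb` in `namespace`."""
    denied = []
    for resource in CHAOS_RESOURCES:
        result = runner(
            ["kubectl", "auth", "can-i", verb, resource, "-n", namespace],
            capture_output=True, text=True, check=False,
        )
        # kubectl prints "yes" or "no" on stdout
        if result.stdout.strip() != "yes":
            denied.append(resource)
    return denied
```

An empty list from `denied_resources("default")` means the current user can create every chaos type in that namespace.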

Documentation

Based on the official Chaos Mesh documentation.

Available Tools (34 total)

NetworkChaos (4 tools)

  • create_network_delay - Inject network latency to simulate slow connections
  • create_network_loss - Simulate packet loss for unreliable networks
  • create_network_partition - Create network splits to test split-brain scenarios
  • create_network_corrupt - Corrupt network packets to test data integrity

StressChaos (3 tools)

  • create_stress_cpu - Apply CPU load to test performance under stress
  • create_stress_memory - Apply memory pressure to test OOM scenarios
  • create_stress_combined - Apply both CPU and memory stress simultaneously

PodChaos (3 tools)

  • create_pod_kill - Kill pods to test recovery mechanisms
  • create_pod_failure - Make pods temporarily unavailable without killing
  • create_container_kill - Kill specific containers within pods

IOChaos (4 tools)

  • create_io_latency - Inject I/O delays to simulate slow disks
  • create_io_fault - Return error codes for file operations (ENOSPC, EIO, etc.)
  • create_io_attr_override - Modify file attributes (permissions, size)
  • create_io_mistake - Inject data corruption into read/write operations

HTTPChaos (4 tools)

  • create_http_abort - Abort HTTP connections to simulate network failures
  • create_http_delay - Inject latency into HTTP requests/responses
  • create_http_replace - Replace HTTP message content (headers, body)
  • create_http_patch - Add content to HTTP messages

DNSChaos (2 tools)

  • create_dns_error - Return DNS errors for specified domain patterns
  • create_dns_random - Return random IP addresses for DNS queries

PhysicalMachineChaos (5 tools)

  • create_physical_stress_cpu - Inject CPU stress on physical/virtual machines
  • create_physical_stress_memory - Inject memory stress on physical/virtual machines
  • create_physical_disk_fill - Fill disk space on physical/virtual machines
  • create_physical_process_kill - Kill processes on physical/virtual machines
  • create_physical_clock_skew - Skew system clock on physical/virtual machines

Environment Validation (3 tools)

  • validate_environment - Comprehensive environment validation (kubectl, cluster, Chaos Mesh, CRDs, components)
  • check_chaos_type_requirements - Check requirements for specific chaos type
  • get_chaos_requirements - Get detailed requirements for a chaos type

Experiment Management (6 tools)

  • get_experiment_status - Get detailed status of a chaos experiment
  • list_active_experiments - List all active experiments in cluster
  • delete_experiment - Delete a chaos experiment
  • pause_experiment - Pause a running experiment
  • resume_experiment - Resume a paused experiment
  • get_experiment_events - Get Kubernetes events for debugging
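
For reference, Chaos Mesh itself pauses and resumes experiments through the `experiment.chaos-mesh.org/pause` annotation. The sketch below builds the equivalent kubectl commands. Whether `pause_experiment` and `resume_experiment` use exactly this mechanism is an assumption, and the function names are hypothetical.

```python
PAUSE_ANNOTATION = "experiment.chaos-mesh.org/pause"


def pause_command(kind: str, name: str, namespace: str) -> list[str]:
    """kubectl command that sets the pause annotation on an experiment."""
    return [
        "kubectl", "annotate", kind, name, "-n", namespace,
        f"{PAUSE_ANNOTATION}=true", "--overwrite",
    ]


def resume_command(kind: str, name: str, namespace: str) -> list[str]:
    """kubectl command that removes the pause annotation (trailing '-')."""
    return [
        "kubectl", "annotate", kind, name, "-n", namespace,
        f"{PAUSE_ANNOTATION}-",
    ]


# Example: pause a NetworkChaos experiment named "api-delay"
print(" ".join(pause_command("networkchaos", "api-delay", "default")))
```

Running the printed command by hand is a useful fallback when debugging an experiment outside the assistant.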

License

MIT License - see the LICENSE file for details.

Contributing

This is part of the chaos-agents project. Issues and pull requests are welcome!

Credits

Built on top of: