Proms MCP Server
A lean MCP (Model Context Protocol) server that provides LLM agents with transparent access to multiple Prometheus instances for metrics analysis and SRE operations.
Overview
This server implements the MCP protocol using the modern FastMCP library, allowing LLM agents to query multiple Prometheus instances through a unified interface. It supports discovery, querying, and analysis of Prometheus metrics with built-in security validation and comprehensive observability.
Features
- Multiple Prometheus Support: Query multiple Prometheus instances through a single interface
- Bearer Token Authentication: Secure authentication using OpenShift bearer tokens
- Security Hardening: Basic PromQL query validation for safety
- Comprehensive Toolset: 8 MCP tools covering discovery, querying, and analysis using modern `@tool` decorators
- Observability: Structured logging for debugging and monitoring
- Production Ready: Designed for OpenShift/Kubernetes deployment
- Lean Architecture: Stateless, fail-fast design with just 5 core dependencies
MCP Tools
Discovery Tools
- `list_datasources`: List all available Prometheus datasources
- `list_metrics`: Get all available metric names from a datasource
- `get_metric_metadata`: Get metadata for a specific metric
Query Tools
- `query_instant`: Execute an instant PromQL query
- `query_range`: Execute a range PromQL query
Analysis Tools
- `get_metric_labels`: Get all label names for a specific metric
- `get_label_values`: Get all values for a specific label
- `find_metrics_by_pattern`: Find metrics matching a regex pattern
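For orientation, the JSON-RPC message an MCP client ultimately sends to invoke one of these tools looks roughly like the sketch below. The argument names are assumptions (real clients discover the actual schema via `tools/list`), and a live session must complete the MCP initialize handshake first; see the curl example under "Other MCP Clients".

```bash
# Shape of a tools/call message for query_instant (illustrative only; argument
# names are assumptions, and a real session needs the initialize handshake first)
cat << 'EOF'
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "query_instant",
    "arguments": { "datasource": "my-prometheus", "query": "up" }
  }
}
EOF
```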
Quick Start
Prerequisites
- Python 3.11+
- uv for dependency management
- Docker/Podman for container development
Local Development
```bash
# Clone and setup
git clone <repository-url>
cd proms-mcp
make install

# 1. Log in to your target cluster and get your token
oc login https://api.your-cluster.example.com:6443
export OPENSHIFT_TOKEN=$(oc whoami -t)

# 2. Create the datasource config with your token
# (assumes your OpenShift token is valid for authenticating to Prometheus)
cat > local_config/datasources.yaml << EOF
datasources:
  - name: "my-prometheus"
    type: "prometheus"
    url: "https://prometheus.your-cluster.example.com"
    jsonData:
      httpHeaderName1: "Authorization"
    secureJsonData:
      httpHeaderValue1: "Bearer ${OPENSHIFT_TOKEN}"
EOF

# 3. Run the server with authentication enabled
make run-auth OPENSHIFT_API_URL=$(oc whoami --show-server)

# Or run the server without authentication; the connection to Prometheus is still authenticated
make run

# 4. Configure your MCP client (e.g. Cursor with the auth-enabled proms-mcp)
cat > .cursor/mcp.json << EOF
{
  "mcpServers": {
    "proms-mcp-local-auth": {
      "url": "http://localhost:8000/mcp/",
      "headers": {
        "Authorization": "Bearer ${OPENSHIFT_TOKEN}"
      }
    }
  }
}
EOF
```
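To verify the server came up before pointing a client at it, hit the health endpoint on the separate health/metrics port (8080 by default; see the Configuration section):

```bash
# Smoke test: health and metrics are served on HEALTH_METRICS_PORT (default 8080)
curl -s http://localhost:8080/health
```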
Container Development
```bash
# Build container
podman build -t proms-mcp .

# Create datasource config (see the Configuration section)

# Run with config
podman run -p 8000:8000 \
  -v ./datasources.yaml:/etc/grafana/provisioning/datasources/datasources.yaml:ro \
  proms-mcp
```
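To enable bearer token authentication in the container as well, the same environment variables documented under Configuration should apply; a sketch, assuming the image reads them at startup:

```bash
# Sketch: containerized run with auth enabled (assumes the image honors
# AUTH_MODE and OPENSHIFT_API_URL exactly as in the local setup)
podman run -p 8000:8000 -p 8080:8080 \
  -e AUTH_MODE=active \
  -e OPENSHIFT_API_URL=https://api.cluster.example.com:6443 \
  -v ./datasources.yaml:/etc/grafana/provisioning/datasources/datasources.yaml:ro \
  proms-mcp
```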
MCP Client Setup
Cursor Integration
The server supports two authentication modes:
Development (No Authentication):
```json
{
  "mcpServers": {
    "proms-mcp-dev": {
      "url": "http://localhost:8000/mcp/",
      "description": "Development server - no authentication"
    }
  }
}
```
Production (Bearer Token Authentication):
```json
{
  "mcpServers": {
    "proms-mcp": {
      "url": "https://proms-mcp.apps.cluster.example.com/mcp/",
      "headers": {
        "Authorization": "Bearer your-openshift-token-here"
      },
      "description": "Production server with OpenShift bearer token auth"
    }
  }
}
```
⚠️ Security Note: Never commit `.cursor/mcp.json` with real tokens to git. It's already in `.gitignore`.

See `.cursor/mcp-examples.json` for complete configuration examples, including:
- Development and production setups
- Service account token configuration
- Multi-environment configurations
- SSL verification scenarios
Other MCP Clients
The server exposes MCP over HTTP:
- Endpoint: `POST http://localhost:8000/mcp/` (or your deployed URL)
- Protocol: JSON-RPC 2.0 over HTTP
- Content-Type: `application/json`
- Accept: `application/json, text/event-stream`
- Authentication: Bearer token in the `Authorization` header (when `AUTH_MODE=active`)

📝 Path Behavior: The server uses `/mcp/` (with a trailing slash) to avoid HTTP 307 redirects that can cause authentication issues in some MCP clients. Always use the trailing slash in your client configurations.
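As a concrete starting point, here is what opening an MCP session by hand can look like (a sketch: the `protocolVersion` and `clientInfo` values are illustrative, and the `Authorization` header is only needed when `AUTH_MODE=active`). The server's reply includes a session ID that subsequent `tools/list` and `tools/call` requests must echo back.

```bash
# Open an MCP session manually. Follow-up requests must carry the session ID
# the server returns (Mcp-Session-Id response header in the streamable HTTP transport).
curl -s http://localhost:8000/mcp/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "Authorization: Bearer ${OPENSHIFT_TOKEN}" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
      "protocolVersion": "2024-11-05",
      "capabilities": {},
      "clientInfo": {"name": "curl-test", "version": "0.1"}
    }
  }'
```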
Configuration
Environment Variables
- `PORT`: MCP server port (default: 8000)
- `HEALTH_METRICS_PORT`: Health and metrics server port (default: 8080)
- `LOG_LEVEL`: Logging level (default: INFO)
- `GRAFANA_DATASOURCES_PATH`: Path to the datasource config file (default: /etc/grafana/provisioning/datasources/datasources.yaml)
- `QUERY_TIMEOUT`: Query timeout in seconds (default: 30)
Authentication Configuration
The server supports two authentication modes:
- `AUTH_MODE`: Authentication mode (`none` or `active`, default: `active`)
- `OPENSHIFT_API_URL`: OpenShift API server URL (required for bearer token auth)
- `OPENSHIFT_CA_CERT_PATH`: Path to a CA certificate file for SSL verification (optional, only needed for custom certificates)
No Authentication Mode (Development Only)
```bash
# Explicitly disable authentication for development
AUTH_MODE=none uv run python -m proms_mcp
```
Bearer Token Authentication Mode (Default)
```bash
# Run with bearer token authentication
AUTH_MODE=active \
OPENSHIFT_API_URL=https://api.cluster.example.com:6443 \
uv run python -m proms_mcp

# If you are already authenticated against OpenShift:
AUTH_MODE=active \
OPENSHIFT_API_URL=$(oc whoami --show-server) \
uv run python -m proms_mcp

# For self-signed certificates, point the server at the CA certificate:
OPENSHIFT_CA_CERT_PATH=/path/to/ca.crt uv run python -m proms_mcp
```
Authentication Implementation:
The server uses the Kubernetes TokenReview API with self-validation to authenticate OpenShift bearer tokens. Each user's token validates itself, so no special RBAC permissions are needed. Authentication is handled by a custom `TokenReviewVerifier` that integrates with FastMCP's authentication system.
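Conceptually, self-validation means the bearer token presented by the client is used both to authenticate the TokenReview request and as the token under review. A sketch of that call against the Kubernetes API (URL and token are placeholders; this illustrates the mechanism, not the server's internals):

```bash
# Self-validating TokenReview (placeholders throughout): the same token
# authenticates the request and is the token being reviewed, so no extra
# RBAC bindings are needed.
TOKEN="sha256~your-openshift-token"
curl -s "https://api.cluster.example.com:6443/apis/authentication.k8s.io/v1/tokenreviews" \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  -d "{
    \"apiVersion\": \"authentication.k8s.io/v1\",
    \"kind\": \"TokenReview\",
    \"spec\": {\"token\": \"${TOKEN}\"}
  }"
# A valid token returns "status": {"authenticated": true, "user": {...}}
```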
Datasource Configuration
Create a Grafana datasource provisioning YAML file. Only `type: "prometheus"` datasources are processed.
Example `datasources.yaml`:
```yaml
apiVersion: 1
prune: true
datasources:
  - name: "prod-prometheus"
    type: "prometheus"
    url: "https://prometheus-prod.example.com"
    access: "proxy"
    editable: false
    jsonData:
      httpHeaderName1: "Authorization"
    secureJsonData:
      httpHeaderValue1: "Bearer prod-token"
  - name: "demo-prometheus"
    type: "prometheus"
    url: "https://demo.robustperception.io:9090"
    access: "proxy"
    editable: false
```
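With this config, each query the server forwards is, in effect, a standard Prometheus HTTP API call carrying the configured header. Conceptually (a sketch of the equivalent request, not the server's internals):

```bash
# Roughly what an instant query against "prod-prometheus" amounts to:
# the configured httpHeaderName1/httpHeaderValue1 pair becomes a request header.
curl -s "https://prometheus-prod.example.com/api/v1/query" \
  -H "Authorization: Bearer prod-token" \
  --data-urlencode "query=up"
```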
Security
PromQL Query Validation
The server implements basic security checks:
- Query Length: Limits to 10,000 characters
- Empty Query: Prevents empty or whitespace-only queries
- Input Sanitization: Basic parameter encoding via httpx
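The rules are simple enough to express in a few lines; a rough sketch of the same checks (illustrative, not the server's actual code):

```bash
# Sketch of the documented validation rules (not the server's implementation)
validate_query() {
  local q="$1"
  # Reject empty or whitespace-only queries
  if [ -z "$(printf '%s' "$q" | tr -d '[:space:]')" ]; then
    echo "rejected: empty query"; return 1
  fi
  # Reject queries longer than 10,000 characters
  if [ "${#q}" -gt 10000 ]; then
    echo "rejected: query too long"; return 1
  fi
  echo "accepted"
}
validate_query 'rate(http_requests_total[5m])'   # accepted
```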
API Endpoints
- POST /mcp/: MCP JSON-RPC 2.0 endpoint (port 8000)
- GET /health: Health check (port 8080)
- GET /metrics: Prometheus metrics (port 8080)
Deployment
OpenShift Deployment
Deploy using the provided OpenShift template:
```bash
# Development deployment (no authentication)
oc process -f openshift/deploy.yaml \
  -p IMAGE=quay.io/app-sre/proms-mcp \
  -p IMAGE_TAG=latest \
  -p AUTH_MODE=none \
  | oc apply -f -

# Production deployment (bearer token authentication)
oc process -f openshift/deploy.yaml \
  -p IMAGE=quay.io/app-sre/proms-mcp \
  -p IMAGE_TAG=v1.0.0 \
  -p AUTH_MODE=active \
  -p OPENSHIFT_API_URL=https://api.cluster.example.com:6443 \
  | oc apply -f -

# No additional RBAC setup is needed - the server uses self-validation
```
Template Parameters:
- `AUTH_MODE`: `none` (development) or `active` (production, default)
- `OPENSHIFT_API_URL`: OpenShift API server URL (default: `https://kubernetes.default.svc` for in-cluster use)
- `OPENSHIFT_CA_CERT_PATH`: CA certificate path (default: the in-cluster service account CA)
- `NAMESPACE`: Target namespace (required)
- `HOSTNAME`: Route hostname (required)
MCP Client Configuration
Development Mode (No Authentication)
```json
{
  "mcpServers": {
    "proms-mcp-dev": {
      "url": "http://localhost:8000/mcp/"
    }
  }
}
```
Production Mode (Bearer Token)
```json
{
  "mcpServers": {
    "proms-mcp": {
      "url": "https://proms-mcp.apps.cluster.example.com/mcp/",
      "headers": {
        "Authorization": "Bearer ${OPENSHIFT_TOKEN}"
      }
    }
  }
}
```
Get your OpenShift token:
```bash
export OPENSHIFT_TOKEN=$(oc whoami -t)
```
RBAC Requirements
For production (bearer token authentication) deployments:
- ServiceAccount: `proms-mcp-server` (created by the template) - used only for pod identity
- No special RBAC permissions needed: the server uses self-validation, where each user's token validates itself
- User Tokens: users need valid OpenShift tokens (`oc whoami -t`)
The template creates the ServiceAccount for pod identity. No ClusterRoleBindings or special permissions are required because the authentication uses self-validation.
Development
Code Quality
```bash
make format  # Format code and fix imports
make lint    # Lint and type check code
make test    # Run tests with coverage
```
Project Structure
```
proms-mcp/
  proms_mcp/              # Main package
    auth.py               # TokenReview-based authentication with FastMCP integration
    server.py             # FastMCP server with 8 MCP tools
    client.py             # Prometheus API wrapper
    config.py             # Config parser with auth support
    monitoring.py         # Health/metrics endpoints
    logging.py            # Structured logging configuration
  tests/                  # Test suite (mirrors package structure)
  openshift/deploy.yaml   # OpenShift template with RBAC support
  local_config/           # Local development configuration
```
Troubleshooting
Common Issues
- No datasources loaded:
  - Check that `GRAFANA_DATASOURCES_PATH` points to your datasources file
  - Verify the YAML syntax is valid (JSON format is also supported)
  - Ensure the file contains a `datasources` array with `type: "prometheus"` entries
  - Use `make run`, which automatically sets the path to `local_config/datasources.yaml`
  - Try the parse check sketched below this list
- Authentication failures: Verify the bearer tokens in `secureJsonData`
- Query timeouts: Adjust the `QUERY_TIMEOUT` environment variable
- Query validation errors: Check the query length and ensure queries are non-empty
- Client connection issues:
  - 400 Bad Request: The server restarted; the client will reconnect automatically
  - 406 Not Acceptable: The client must accept `application/json, text/event-stream`
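For the datasource-loading case, the parse check referenced above can be as simple as the following (a sketch; assumes python3 with PyYAML installed, and adjust the path to match `GRAFANA_DATASOURCES_PATH`):

```bash
# Does the file parse, and how many prometheus datasources does it contain?
python3 - << 'EOF'
import yaml
with open("local_config/datasources.yaml") as f:
    cfg = yaml.safe_load(f)
ds = [d for d in cfg.get("datasources", []) if d.get("type") == "prometheus"]
print(f"{len(ds)} prometheus datasource(s): {[d['name'] for d in ds]}")
EOF
```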
Debug Mode
```bash
LOG_LEVEL=DEBUG make run
```
Health Checks
```bash
curl http://localhost:8080/health
curl http://localhost:8080/metrics | grep mcp_
```
Documentation
- Technical specification and architecture
- Development guide for AI assistants
- Local testing guide with bearer token examples
Contributing
- Fork the repository
- Create a feature branch
- Make changes with tests
- Run quality checks: `make format lint test`
- Submit a pull request
License
Apache License 2.0