verlyn13/devops-mcp
The DevOps MCP Server is a safety-first Model Context Protocol server designed to manage and expose golden-path workflows with strict policy and audit controls.
DevOps MCP Server (Personal)
Overview
- A small, safety-first MCP server that exposes your golden-path workflows (chezmoi, mise, brew, git) to agent clients with strict policy and audit. Targets Node 24, stdio transport.
Quick Dev
- Ensure Node 24 is active (mise): `mise use -g node@24`
- Install deps: `pnpm i` (or `npm i`)
- Run dev (no build): `pnpm dev` (or `npm run dev`)
- Build: `pnpm build` (or `npm run build`)
- Start: `pnpm start` (or `npm start`)
Codex CLI Wiring (local, stdio)
- Add an MCP entry pointing to your dev server (disabled by default). Enable it after `pnpm dev` or build/start is confirmed.
Capabilities (initial)
- Tools: `mcp_health`, `patch_apply_check`, `pkg_sync_plan` (plan-only), `pkg_sync_apply` (gated), `dotfiles_apply` (gated), `secrets_read_ref`, `converge_host` (routine)
- Resources: `dotfiles_state`, `policy_manifest`, `pkg_inventory`, `repo_status`, `telemetry_info`
Policy & Safety (initial)
- Hardened exec wrapper: `execFile` only, sanitized PATH, no inherited env by default.
- JSONL audit at `~/Library/Application Support/devops.mcp/audit.jsonl` with per-call entries.
- Allowlists are stubbed in code; TOML config hook is present for future expansion.
- Rate limiting per tool/resource via `[limits]` and `[capabilities]` in config.
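As a rough illustration of the hardened exec wrapper described above, the sketch below runs a binary through `execFile` with an environment built from scratch. The names `safeExec`, `buildChildEnv`, and `SAFE_PATH`, the PATH value, and the default timeout are assumptions made for this example; the server's actual wrapper may differ.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical fixed PATH; the server's real sanitized PATH is not documented here.
const SAFE_PATH = "/usr/bin:/bin:/usr/sbin:/sbin";

interface SafeExecOptions {
  env?: Record<string, string>; // explicit additions only; nothing is inherited
  timeoutMs?: number;
}

// Build the child environment from scratch instead of copying process.env.
// PATH is applied last so callers cannot override it.
function buildChildEnv(extra: Record<string, string> = {}): Record<string, string> {
  return { ...extra, PATH: SAFE_PATH };
}

// Run a binary with an argument array. execFile does not spawn a shell,
// so shell metacharacters in args are passed through literally.
async function safeExec(
  file: string,
  args: readonly string[],
  opts: SafeExecOptions = {},
): Promise<{ stdout: string; stderr: string }> {
  const { stdout, stderr } = await execFileAsync(file, [...args], {
    env: buildChildEnv(opts.env),
    timeout: opts.timeoutMs ?? 30_000, // assumed default, not from the README
    encoding: "utf8",
  });
  return { stdout, stderr };
}
```

A call such as `await safeExec("/bin/echo", ["hello; ls"])` would pass `hello; ls` as a single literal argument, since no shell is involved.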
Security Model
- SecretRef allowlist: configure `[secrets] gopass_roots = ["personal/devops/*", "org/*"]`; any path outside is denied.
- Traversal guarded: rejects `..`, `//`, leading `/`, and dot-files.
- Hashed audits only: secret accesses recorded as `refHash` (sha256) with no secret bytes; values never serialized.
- Env injection: secretRefs are resolved and injected into the child process env only; values are not logged or echoed.
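The allowlist, traversal, and hashing rules above could be sketched as follows. This is only an illustration: the function names are hypothetical, the glob semantics of `"prefix/*"` are assumed to mean "anything under that prefix", and `refHash` is assumed to be the SHA-256 of the reference path rather than of any secret value.

```typescript
import { createHash } from "node:crypto";

// Allowlist mirroring the gopass_roots example above.
const GOPASS_ROOTS = ["personal/devops/*", "org/*"];

// Reject traversal-prone paths: "..", "//", a leading "/", and dot-files.
// Any segment that is empty or starts with "." fails the check.
function isSafeRefPath(path: string): boolean {
  if (path.length === 0 || path.startsWith("/") || path.includes("//")) return false;
  return path.split("/").every((seg) => seg.length > 0 && !seg.startsWith("."));
}

// Assumed semantics: "prefix/*" allows any path strictly under "prefix/".
function isAllowedRef(path: string, roots: readonly string[] = GOPASS_ROOTS): boolean {
  if (!isSafeRefPath(path)) return false;
  return roots.some((root) => {
    const prefix = root.endsWith("/*") ? root.slice(0, -1) : `${root}/`;
    return path.startsWith(prefix) && path.length > prefix.length;
  });
}

// Audit entries would carry only this digest of the reference, never the value.
function refHash(path: string): string {
  return createHash("sha256").update(path, "utf8").digest("hex");
}
```

Under these assumptions, `personal/devops/github/token` is accepted, while `org/../etc`, `/org/x`, `org//x`, and `org/.hidden` are all denied before the allowlist is even consulted.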
Apply Verification
- `pkg_sync_apply` executes per-op brew/mise changes under `confirm=true` + global lock.
- Post-apply, the server re-reads inventory and computes `residual` against the plan. If any residual remains, `ok=false`.
- INERT mode (`DEVOPS_MCP_INERT=1`): no system changes, returns `inert=true`, writes an inert state file, and subsequent plan should be a no-op for the same desired inputs.
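One way to picture the post-apply check is as a filter over the plan: an operation is residual if the re-read inventory still disagrees with it. The `Op` and `Inventory` shapes and both function names below are hypothetical, since the real plan format is not shown here.

```typescript
// Hypothetical shapes; the server's real plan and inventory types may differ.
type Op = { manager: "brew" | "mise"; name: string; action: "install" | "remove" };
type Inventory = { brew: Set<string>; mise: Set<string> };

// An op is residual when the fresh inventory does not reflect it:
// an install whose package is absent, or a remove whose package is still present.
function computeResidual(plan: readonly Op[], inventory: Inventory): Op[] {
  return plan.filter((op) => {
    const present = inventory[op.manager].has(op.name);
    return op.action === "install" ? !present : present;
  });
}

// ok is true only when nothing residual remains.
function verifyApply(plan: readonly Op[], inventory: Inventory): { ok: boolean; residual: Op[] } {
  const residual = computeResidual(plan, inventory);
  return { ok: residual.length === 0, residual };
}
```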
Troubleshooting
- Framing: integration and clients use newline-delimited JSON on stdout; logs go to stderr. Avoid `Content-Length` framing.
- Ready banner: the server writes `READY <epoch>` to stderr after handlers are registered; integration waits on it.
- Logs: local dev writes pretty logs to TTY and JSON to `~/Library/Application Support/devops.mcp/logs/server.ndjson`. In prod/CI, logs are JSON on stderr for your collector/supervisor to capture.
- INERT: export `DEVOPS_MCP_INERT=1` during tests to avoid system mutations.
- WAL files: live alongside the DB as `audit.sqlite3-wal`; CI runs a truncation checkpoint to keep it small.
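When debugging a client, it can help to see the framing and banner handling in isolation. The helpers below are a minimal sketch with hypothetical names; they assume `<epoch>` in the banner is a plain integer (the unit is not stated above).

```typescript
// Split a stdout buffer into complete JSON messages plus an unparsed remainder.
// Messages are newline-delimited; there is no Content-Length header to read.
function splitNdjson(buffer: string): { messages: unknown[]; rest: string } {
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? ""; // text after the last newline is a partial message (or "")
  const messages = lines
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as unknown);
  return { messages, rest };
}

// Parse the stderr ready banner; returns null for any other log line.
function parseReadyBanner(line: string): number | null {
  const match = /^READY (\d+)\s*$/.exec(line);
  return match ? Number(match[1]) : null;
}
```

A client would typically keep `rest` and prepend it to the next stdout chunk, and would only start sending requests once `parseReadyBanner` returns a number for some stderr line.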
launchd service (macOS)
- See `examples/devops.mcp.plist` and load with:
  - `launchctl bootstrap gui/$UID examples/devops.mcp.plist`
  - `launchctl kickstart -k gui/$UID/local.devops.mcp`
- Logs at `~/Library/Application Support/devops.mcp/server.log`
Next (per plan)
- Routines and secret handles (gopass), capability tiers enforcement for mutating tools.
Example config
- See `examples/config.example.toml` for a ready-to-tweak TOML.
- Failure semantics
  - Circuit-break rules: `converge_host` aborts after `pkg_sync_apply` if `ok=false` and never attempts `dotfiles_apply`.
  - Lock order: package (`pkg`) first, then `dotfiles` (and future `repo`). Tools acquire locks in this order and release promptly.
  - Timeouts & retries: per-step timeouts from `[timeouts]`; `pkg_sync_apply` retries once on transient failure; `dotfiles_apply` does not retry.
  - Audit IDs: all mutating steps emit `audit_id`, which you can search in the audit store (SQLite, SQLite WASM, or JSONL). Example search:
    - SQLite: `SELECT * FROM calls WHERE id = '<audit_id>'` in `audit.sqlite3`
    - SQLite WASM: set `[audit] kind = "sqlite_wasm"` when native bindings are unavailable on Node 24.
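The failure semantics above (lock order, one retry for the package step, and the circuit break before dotfiles) can be sketched roughly as below. Everything here is illustrative: the `Mutex` class is a stand-in for whatever lock the server uses, a thrown error is assumed to be what "transient failure" means, and timeouts are omitted.

```typescript
type StepResult = { ok: boolean; audit_id: string };
type Step = () => Promise<StepResult>;

// Minimal in-process mutex; a stand-in for the server's real lock.
class Mutex {
  private tail: Promise<void> = Promise.resolve();
  async run<T>(fn: () => Promise<T>): Promise<T> {
    const prev = this.tail;
    let release!: () => void;
    this.tail = new Promise<void>((resolve) => (release = resolve));
    await prev; // wait for the previous holder
    try {
      return await fn();
    } finally {
      release(); // release promptly, even on failure
    }
  }
}

const pkgLock = new Mutex();
const dotfilesLock = new Mutex();

// Retry exactly once when the step throws (assumed to signal a transient failure).
async function withOneRetry(step: Step): Promise<StepResult> {
  try {
    return await step();
  } catch {
    return await step();
  }
}

// Package step first, then dotfiles; stop early if the package step is not ok.
async function convergeHost(pkgSyncApply: Step, dotfilesApply: Step) {
  const pkg = await pkgLock.run(() => withOneRetry(pkgSyncApply));
  if (!pkg.ok) {
    // Circuit break: never attempt dotfiles_apply after a failed package step.
    return { ok: false, steps: [pkg] };
  }
  const dot = await dotfilesLock.run(dotfilesApply); // no retry for dotfiles
  return { ok: dot.ok, steps: [pkg, dot] };
}
```

Each entry in `steps` keeps its `audit_id`, so a failed run can be traced with the search examples above.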
Integration & Dashboard
- See docs/guides/dashboard-integration.md for endpoints and examples.
- Bridge defaults to disabled; enable via `[dashboard_bridge] enabled=true, port=7171`.
- Observer scripts live under `[observers].dir` and should output NDJSON to stdout.
Generated clients (typed)
- Bridge client: `./scripts/generate-openapi-client.sh [BRIDGE_URL] [OUT_DIR]`
- DS client: `DS_BASE_URL=... ./scripts/generate-openapi-client-ds.sh [OUT_DIR]`
- MCP client: `MCP_BASE_URL=... ./scripts/generate-openapi-client-mcp.sh [OUT_DIR]`
- All scripts prefer openapi-typescript-codegen (axios), fall back to OpenAPI Generator (npx), then docker.
- CI tip: run generation during build and check in or package `src/generated/**` artifacts as needed by the dashboard.
Reliability & Caps
- Logs rotate daily or when exceeding `telemetry.logs.max_file_mb` (min 8MB).
- Audit JSONL rotates when exceeding `audit.jsonlMaxMB` and prunes older rotated files beyond `audit.retainDays`.
- Self-status history is in-memory and bounded by `diagnostics.self_history_max`.
- Audit ID search (JSONL): `rg '<audit_id>' "$HOME/Library/Application Support/devops.mcp/audit.jsonl"` (the path contains a space, so it must be quoted).
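The log rotation rule above can be read as a small predicate. The sketch below is one plausible interpretation, not the server's code: it assumes "(min 8MB)" means configured values below 8 are raised to 8, and that "daily" means a change of UTC calendar day. Function names are hypothetical.

```typescript
// Assumed reading of "(min 8MB)": smaller configured caps are raised to 8 MB.
const MIN_MAX_FILE_MB = 8;

function effectiveMaxBytes(configuredMb: number): number {
  return Math.max(configuredMb, MIN_MAX_FILE_MB) * 1024 * 1024;
}

// Rotate when the UTC calendar day has changed since the file was opened,
// or when the file has grown past the effective size cap.
function shouldRotate(sizeBytes: number, openedAt: Date, now: Date, configuredMb: number): boolean {
  const dayChanged = openedAt.toISOString().slice(0, 10) !== now.toISOString().slice(0, 10);
  return dayChanged || sizeBytes > effectiveMaxBytes(configuredMb);
}
```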
Telemetry
- See `docs/telemetry.md` for endpoints, config, and event vocabulary.
- See `docs/observability.md` for collector, compose, and stack runbook.
- Programmatic access for other repos:
  - Import `getTelemetryInfo()` from `src/lib/telemetry/info.ts` to read normalized endpoints and log sinks at runtime.
  - Import types and constants from `src/lib/telemetry/contract.ts` for dashboards.
Operations
- Startup/health
  - Start with `pnpm start` or via launchd (see `examples/devops.mcp.plist`).
  - On startup, the server logs a structured `ServiceStart` line and an OTLP reachability banner to stderr.
  - Fetch `devops://telemetry_info` to inspect telemetry endpoints, env, reachability, log sinks, redaction, and SLOs.
- Telemetry setup
  - Traces/Metrics via OTLP: set `[telemetry] enabled=true`, `export='otlp'`, `endpoint`, `protocol=('http'|'grpc')`.
  - Logs ingestion:
    - JSON: prod/CI logs to stderr; local logs pretty (TTY) + JSON file at `${audit.dir}/logs/server.ndjson` (daily rotation).
    - OTLP Logs (optional): when `export='otlp'`, logs are forwarded via a Pino→OTLP transport. Attribute filtering is strict by default; extend via `[telemetry.logs] attributes_allowlist`.
- SLOs and alerts
  - Configure `[slos]`: `maxResidualPctAfterApply`, `maxConvergeDurationMs`, `maxDroppedPer5m`, and per-kind drop thresholds.
  - Breaches emit `SLOBreach` events; dashboards should alert on them.
- Repo safety
  - Configure `system_repo` with SSH allowlist; `allow_https=false` by default.
  - Repo cache is traversal-safe, validated after clone, and pruned daily.
- Secrets
  - Use `secrets_read_ref` to obtain opaque references; pass via `secretRefs` to tools. Values are never logged or persisted.
- Policy and limits
  - Enforce capability tiers in `[capabilities]`; tune per-resource limits in `[limits]`.
Ports & Env Conventions (Required)
- Canonical MCP port: `4319`
- Use `MCP_URL` and `MCP_BASE_URL` (default `http://127.0.0.1:4319`).
- Only use `OBS_BRIDGE_URL=7171` when you are explicitly talking to the Bridge.
- Stage 2 scripts/docs: update defaults to `MCP_URL`/`MCP_BASE_URL=4319`; do not default to 7171.
See policy: docs/policies/ports-and-env.md
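A script that follows these conventions might resolve its target along the lines below. This is a sketch only: the helper names are hypothetical, and the precedence of `MCP_BASE_URL` over `MCP_URL` is an assumption, since the policy above lists both without ordering them.

```typescript
const DEFAULT_MCP_URL = "http://127.0.0.1:4319";
const BRIDGE_PORT = "7171";

type Env = Record<string, string | undefined>;

// Resolve the MCP base URL: MCP_BASE_URL, then MCP_URL, then the canonical default.
// Empty strings are treated as unset. The precedence order is an assumption.
function resolveMcpBaseUrl(env: Env): string {
  return env.MCP_BASE_URL || env.MCP_URL || DEFAULT_MCP_URL;
}

// Flag an MCP URL that points at the Bridge port, which the policy says not to default to.
function pointsAtBridgePort(url: string): boolean {
  return new URL(url).port === BRIDGE_PORT;
}
```

In a Node script this would typically be called as `resolveMcpBaseUrl(process.env)`, with a warning logged when `pointsAtBridgePort` returns true for the result.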