mcp-windows

sbroenne/mcp-windows

A Model Context Protocol (MCP) server providing Windows automation capabilities for LLM agents, built on .NET 8 with native Windows API integration.

🪟 Windows MCP Server

Windows automation that actually works. Uses the Windows UI Automation API to find buttons by name, not pixels. Tested with real AI models before every release.

Why This Exists

Screenshot-based automation doesn't work reliably. Vision models guess wrong, coordinates break when windows move or DPI changes, and you burn through thousands of tokens on retry loops. We tried it (check the commit history) — it failed too often to be useful.

Windows MCP Server asks Windows directly: "What buttons exist in this window?" Windows knows. It's deterministic.
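To see why hard-coded coordinates are fragile, consider what DPI scaling does to a control's on-screen position. The sketch below is illustrative Python, not code from the server. It assumes the usual Windows convention that 96 DPI corresponds to 100% scaling, and the button position is made up.

```python
BASE_DPI = 96  # Windows treats 96 DPI as 100% scaling


def to_physical(logical_x: int, logical_y: int, dpi: int) -> tuple[int, int]:
    """Convert logical (96-DPI) coordinates to physical pixels at the given DPI."""
    scale = dpi / BASE_DPI
    return round(logical_x * scale), round(logical_y * scale)


# Hypothetical button at logical position (300, 200).
button = (300, 200)
at_100_percent = to_physical(*button, dpi=96)
at_150_percent = to_physical(*button, dpi=144)

# The same button lands on different pixels depending on the display scale,
# so a coordinate recorded on one machine misses on another.
print(at_100_percent, at_150_percent)
```

A lookup by element name avoids this class of failure because the name does not change with the scale factor.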

How It Works

# 1. Find the window
window_management(action='find', title='Notepad') → handle='123456'

# 2. Click elements by name
ui_click(windowHandle='123456', nameContains='Save')

# 3. Type into fields
ui_type(windowHandle='123456', controlType='Edit', text='Hello World')

# 4. Fallback for games/canvas — screenshot + mouse
screenshot_control(windowHandle='123456') → element coordinates
mouse_control(action='click', x=450, y=300)

Same command works every time. Any machine. Any DPI. Any theme.
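On the wire, an MCP client sends each of these steps as a JSON-RPC 2.0 `tools/call` request. The following is a minimal sketch of how such requests could be built in Python. The tool and argument names are copied from the example above; the server's real input schemas may differ, and the transport (for example stdio) is not shown.

```python
import itertools
import json

_request_ids = itertools.count(1)


def tool_call(name: str, **arguments) -> str:
    """Serialize one MCP tools/call request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


handle = "123456"  # placeholder handle, as in the example above
messages = [
    tool_call("window_management", action="find", title="Notepad"),
    tool_call("ui_click", windowHandle=handle, nameContains="Save"),
    tool_call("ui_type", windowHandle=handle, controlType="Edit", text="Hello World"),
]

for message in messages:
    print(message)
```

In practice the handle would come from the response to the first call rather than being hard-coded.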

Key Features

  • 🧠 Semantic UI — Find elements by name, not coordinates. Works regardless of DPI, theme, or window position.
  • 🖥️ Multi-Monitor — Full support for multiple displays with per-monitor DPI scaling.
  • 🧪 LLM-Tested — 54 tests with real AI models (GPT-4.1, GPT-5.2). 100% pass rate required for release.
  • 💻 Broad App Support — Tested against classic Windows apps, modern Windows 11 apps, and Electron apps (VS Code, Teams, Slack).
  • 🔄 Full Fallback — Screenshot + mouse + keyboard for games and custom controls.
  • 🪙 Token Optimized — Short property names, JPEG screenshots, auto-scaling. ~60% fewer tokens than standard JSON.
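As a rough illustration of why short property names reduce payload size, the sketch below compares two JSON encodings of the same element. Both sets of field names are invented for this example and are not the server's actual output format. Byte length is used only as a crude stand-in for token count.

```python
import json

# Hypothetical verbose record for one UI element (field names invented).
verbose = {
    "elementName": "Save",
    "controlType": "Button",
    "boundingRectangle": {"left": 410, "top": 280, "width": 80, "height": 40},
}

# The same data with invented short keys and a flat rectangle.
compact = {"n": "Save", "t": "Button", "r": [410, 280, 80, 40]}


def size(obj) -> int:
    """Byte length of whitespace-free JSON, a rough proxy for token count."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


saving = 1 - size(compact) / size(verbose)
print(f"{size(verbose)} -> {size(compact)} bytes ({saving:.0%} smaller)")
```

The saving printed for this toy record says nothing about the ~60% figure above, which depends on the server's real response format.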

Installation

VS Code Extension: Install from the Marketplace. Works with GitHub Copilot automatically.

Standalone: Download from Releases. Add to your MCP config:

{ "servers": { "windows": { "command": "path/to/Sbroenne.WindowsMcp.exe" } } }
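Windows paths contain backslashes, which must be escaped in JSON. One way to avoid escaping them by hand is to generate the config, as in this sketch. The install location is hypothetical, and `build_config` is a helper name introduced here, not part of the project.

```python
import json
from pathlib import PureWindowsPath


def build_config(exe_path: str, server_name: str = "windows") -> str:
    """Return MCP config JSON pointing at a standalone server executable."""
    command = str(PureWindowsPath(exe_path))  # normalizes separators to backslashes
    config = {"servers": {server_name: {"command": command}}}
    return json.dumps(config, indent=2)  # json.dumps escapes the backslashes


# Hypothetical install location; substitute wherever you unpacked the release.
config_text = build_config(r"C:\Tools\mcp-windows\Sbroenne.WindowsMcp.exe")
print(config_text)
```

Paste the printed JSON into your MCP config file, or merge the `windows` entry into an existing `servers` object.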

Tools

| Tool | Purpose |
|------|---------|
| ui_click | Click buttons, checkboxes, menu items by name |
| ui_type | Type into text fields |
| ui_find | Discover elements in a window (with timeout/retry) |
| ui_read | Read text from elements (with OCR fallback) |
| ui_file | Handle Save dialogs |
| screenshot_control | Get element metadata (image optional) |
| window_management | Find, activate, move, resize windows |
| mouse_control | Coordinate-based clicks (fallback for games) |
| keyboard_control | Hotkeys and key sequences |
| app | Launch applications |
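A client can combine these tools into a "semantic first, coordinates second" strategy. The sketch below assumes a hypothetical `call_tool` callable that sends one tool call and returns the decoded result. The response fields (`ok`, `elements`, `name`, `x`, `y`) are invented for illustration and will not match the server's real responses.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def click_with_fallback(call_tool: ToolCaller, handle: str, name: str) -> str:
    """Try a click by name first; fall back to coordinates if it fails."""
    result = call_tool("ui_click", {"windowHandle": handle, "nameContains": name})
    if result.get("ok"):  # "ok" is an invented success flag
        return "semantic"

    # Invented response shape: a list of elements with a name and a click point.
    shot = call_tool("screenshot_control", {"windowHandle": handle})
    for element in shot.get("elements", []):
        if name.lower() in element["name"].lower():
            call_tool("mouse_control", {"action": "click", "x": element["x"], "y": element["y"]})
            return "coordinates"
    return "not found"


def fake_client(tool: str, args: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for a real MCP client: the semantic click always fails."""
    if tool == "ui_click":
        return {"ok": False}
    if tool == "screenshot_control":
        return {"elements": [{"name": "Save", "x": 450, "y": 300}]}
    return {"ok": True}


print(click_with_fallback(fake_client, "123456", "save"))
```

With the stand-in client the semantic click fails, so the function takes the coordinate path. Against a real server, the coordinate path should only be needed for games and custom-drawn controls.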

Full reference: see the complete tool reference listed under Documentation.

⚠️ Caution

This MCP server controls your Windows desktop. Use responsibly.

Testing

dotnet test                                      # All tests
dotnet test --filter "FullyQualifiedName~Unit"   # Unit only

Framework coverage: Tests run against WinForms, WinUI 3, and Electron apps.

LLM tests: 54 tests with real AI models (GPT-4.1, GPT-5.2). 100% pass rate required for release.

cd tests/Sbroenne.WindowsMcp.LLM.Tests
.\Run-LLMTests.ps1 

Requires Azure OpenAI access. See the guide to running LLM integration tests listed under Documentation.

Documentation

  • Complete tool reference: all actions, parameters, examples
  • Build instructions, coding guidelines, PR process
  • How to run LLM integration tests
  • Azure OIDC and GitHub Actions configuration

License

MIT

Contributing

See the build instructions, coding guidelines, and PR process listed under Documentation.