mcp-server-win-ui

scanzy/mcp-server-win-ui


The Windows UI MCP server is a tool designed to integrate Model Context Protocol with Windows UI applications, enabling AI models to interact with these applications through guided user interaction.

Windows UI MCP server

A Windows integration for Model Context Protocol that enables AI models to discover, map and interact with Windows UI applications through guided user interaction.

Workflow

  1. Initial Discovery

    • AI asks user about target application context, purpose and workflow
    • User provides context about main UI components and interactions
    • AI guides user through application exploration
  2. UI Exploration and Mapping

    • AI and user collaborate to identify key windows and controls
    • System captures reliable identifiers (handle, text, class, position)
    • Validation performed at each identification step
    • Validated control information saved to YAML files with multiple identifiers for reliability
  3. UI Automation

    • Load stored YAML context files
    • Use validated identifiers to locate UI elements
    • Perform automated interactions with the application
    • Handle validation and error recovery
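The idea of storing multiple identifiers per control and using them to locate it again can be sketched as follows. This is a minimal illustration, not the project's implementation: `ControlInfo`, `match_score`, `locate` and the `min_score` threshold are hypothetical names, and the live controls are plain objects rather than results from the Win32 API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class ControlInfo:
    """Identifiers captured for one control (field names are hypothetical)."""
    handle: Optional[int] = None
    text: Optional[str] = None
    class_name: Optional[str] = None
    position: Optional[Tuple[int, int]] = None


def match_score(stored: ControlInfo, live: ControlInfo) -> int:
    """Count how many stored identifiers agree with a live control."""
    score = 0
    # Window handles change between application runs, so a handle match
    # is treated as one signal among several, not as a requirement.
    if stored.handle is not None and stored.handle == live.handle:
        score += 1
    if stored.text is not None and stored.text == live.text:
        score += 1
    if stored.class_name is not None and stored.class_name == live.class_name:
        score += 1
    if stored.position is not None and stored.position == live.position:
        score += 1
    return score


def locate(
    stored: ControlInfo,
    candidates: Sequence[ControlInfo],
    min_score: int = 2,  # assumed threshold: at least two identifiers must agree
) -> Optional[ControlInfo]:
    """Return the best-matching live control, or None if nothing is reliable enough."""
    best, best_score = None, 0
    for candidate in candidates:
        score = match_score(stored, candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= min_score else None


# Stored during mapping; the handle is stale after the application restarts.
saved = ControlInfo(handle=100, text="Save", class_name="Button", position=(10, 20))
live_controls = [
    ControlInfo(handle=555, text="Save", class_name="Button", position=(10, 20)),
    ControlInfo(handle=556, text="Cancel", class_name="Button", position=(90, 20)),
]
found = locate(saved, live_controls)
```

Requiring agreement on more than one identifier lets the lookup survive a stale handle while still rejecting a control that only shares a class name.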

Information to collect

  1. Application name, context, purpose and general workflow
  2. Main concepts and terms
  3. Window types:

    • normal: for main operations (e.g. write, select)
    • dialogs: for simple actions (e.g. import, export, save, open, settings)
    • popup: for confirmations and errors (e.g. "Do you want to save?")
  4. Control zones:

    • fixed: controls that are always present (e.g. menu, toolbar, status bar)
    • dynamic: controls that are present only in certain contexts (e.g. search, filter)
  5. Flows: steps to follow to complete a task (e.g. login at startup, wizards)
  6. Actions and commands: buttons, menus, and controls to click
  7. Recommendations and gotchas
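One possible way to hold the information listed above is a small set of dataclasses. All class and field names here (`AppContext`, `WindowInfo`, `ControlGroup`, `Flow`, and so on) are assumptions made for illustration; the project does not define a schema yet.

```python
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import List


class WindowType(str, Enum):
    NORMAL = "normal"    # main operations
    DIALOG = "dialog"    # simple actions such as import, export, save
    POPUP = "popup"      # confirmations and errors


class ControlZone(str, Enum):
    FIXED = "fixed"      # always present (menu, toolbar, status bar)
    DYNAMIC = "dynamic"  # present only in certain contexts


@dataclass
class ControlGroup:
    name: str
    zone: ControlZone
    controls: List[str] = field(default_factory=list)


@dataclass
class WindowInfo:
    title: str
    window_type: WindowType
    control_groups: List[ControlGroup] = field(default_factory=list)


@dataclass
class Flow:
    name: str
    steps: List[str] = field(default_factory=list)


@dataclass
class AppContext:
    name: str
    purpose: str
    terms: List[str] = field(default_factory=list)
    windows: List[WindowInfo] = field(default_factory=list)
    flows: List[Flow] = field(default_factory=list)
    gotchas: List[str] = field(default_factory=list)


# Example values are invented for illustration only.
context = AppContext(
    name="Notepad",
    purpose="Plain text editing",
    windows=[
        WindowInfo(
            title="Untitled - Notepad",
            window_type=WindowType.NORMAL,
            control_groups=[ControlGroup("menu", ControlZone.FIXED, ["File", "Edit"])],
        )
    ],
    flows=[Flow("save file", ["File > Save", "enter file name", "click Save"])],
)

# asdict() converts the nested dataclasses into nested dictionaries and lists.
plain = asdict(context)
```

The nested dictionary produced by `asdict` is a convenient starting point for writing the context to a file.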

Installation

[JSON MCP config file here]

Project Structure

This structure was generated by AI and may be overkill.

win32-mcp/
├── src/
│   ├── core/                   # Core functionality
│   │   ├── window_manager.py   # Windows enumeration, filtering, and state monitoring
│   │   ├── control_manager.py  # UI control discovery and interaction
│   │   ├── context_store.py    # YAML context serialization and validation
│   │   └── validator.py        # Validation rules and utilities
│   ├── prompts/                # AI conversation prompts in YAML format
│   │   ├── discovery.yaml      # App purpose and workflow discovery
│   │   ├── mapping.yaml        # UI element identification and mapping
│   │   └── validation.yaml     # Control validation and verification
│   ├── utils/                  # Helper utilities
│   │   ├── win32_utils.py      # Win32 API wrapper functions
│   │   ├── yaml_utils.py       # YAML processing helpers
│   │   └── logging.py          # Logging configuration
│   └── mcp/                    # MCP protocol implementation
│       ├── server.py           # Async server with JSON-RPC
│       ├── handlers.py         # Request/response and event handlers
│       └── tools.py            # Tool definitions for UI operations
├── mcp_config.json             # Server configuration
└── README.md                   # Project documentation
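To give an idea of what the `mcp/` layer might contain, here is a rough sketch of a tool definition and a JSON-RPC dispatcher. It assumes the general MCP shape (tools described by `name`, `description` and `inputSchema`, invoked through `tools/call`); the `find_control` tool, its arguments and the `handle_request` function are hypothetical, and the handler returns a placeholder string instead of touching the Windows UI.

```python
import json

# Hypothetical tool definition: a name, a description, and a JSON Schema
# describing the accepted arguments.
FIND_CONTROL_TOOL = {
    "name": "find_control",
    "description": "Locate a control in a window by its text and class.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "window_title": {"type": "string"},
            "text": {"type": "string"},
            "class_name": {"type": "string"},
        },
        "required": ["window_title"],
    },
}


def find_control(arguments: dict) -> str:
    # Placeholder: a real handler would query the Windows UI here.
    return f"searched window {arguments['window_title']!r}"


# Maps each tool name to its definition and its handler function.
TOOLS = {"find_control": (FIND_CONTROL_TOOL, find_control)}


def error_response(request_id, code: int, message: str) -> str:
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}
    )


def handle_request(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 request string and return the response string."""
    request = json.loads(raw)
    request_id = request.get("id")
    method = request.get("method")

    if method == "tools/list":
        result = {"tools": [definition for definition, _ in TOOLS.values()]}
    elif method == "tools/call":
        params = request.get("params", {})
        entry = TOOLS.get(params.get("name"))
        if entry is None:
            return error_response(request_id, -32602, "Unknown tool")
        _, handler = entry
        text = handler(params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return error_response(request_id, -32601, "Method not found")

    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})
```

A call such as `handle_request('{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}')` returns the list of registered tool definitions as a JSON string.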

Development Status

🚧 Under Development

TODO:

  • [o] Scouting with Win32 API and FlaUI
  • Think about tools to develop
  • Think about storage of context
  • Think about project structure
  • Prompts for discovery, mapping, validation