mcp-windows-automation

mukul975/mcp-windows-automation

The MCP Windows Automation Server is an AI-powered solution for controlling and automating Windows systems using natural language commands.

🚀 MCP Windows Automation Server - AI-Powered Windows Control & Automation

Transform your Windows PC into an AI-controlled automation powerhouse! 🤖

A comprehensive Model Context Protocol (MCP) server that enables AI assistants like Claude, ChatGPT, and other AI models to seamlessly control Windows applications, automate tasks, and manage system operations through natural language commands.

🔗 What is Model Context Protocol (MCP)?

Model Context Protocol (MCP) is an open standard developed by Anthropic that enables AI assistants to securely access external tools, data sources, and system resources. This project implements a comprehensive MCP server specifically designed for Windows automation, allowing AI models to:

  • 🛠️ Execute System Commands: Run Windows commands and scripts safely
  • 📁 Access File Systems: Read, write, and manage files and directories
  • 🖥️ Control Applications: Automate Windows applications and software
  • 🌐 Browse the Web: Perform web automation and data extraction
  • 🎵 Media Control: Manage multimedia applications and content
  • 📊 System Monitoring: Track system performance and resource usage

🏗️ MCP Architecture Benefits

  • 🔒 Security: Sandboxed execution with permission controls
  • 🔌 Standardized: Uses industry-standard MCP protocol
  • 🤖 AI-Optimized: Designed specifically for AI assistant integration
  • 📡 Real-time: Bi-directional communication between AI and system
  • 🔄 Extensible: Easy to add new tools and capabilities

🎯 Supported AI Platforms

  • Claude Desktop (Primary integration)
  • ChatGPT (via API)
  • Custom AI Models (via MCP protocol)
  • Local AI Assistants (Ollama, LocalAI, etc.)
  • Enterprise AI Solutions

🌟 Why Choose MCP Windows Automation?

  • 🎯 AI-Native: Built specifically for AI assistant integration
  • 🔧 Comprehensive: 80+ automation tools in one package
  • 🛡️ Safe: Built-in security checks and user permission controls
  • 📱 Multi-Platform: Works with Claude Desktop, ChatGPT, and custom AI implementations
  • 🚀 Production-Ready: Thoroughly tested and documented
  • 💡 Intuitive: Natural language commands - no coding required!

⚡ Key Features & Automation Capabilities

🖥️ Windows System Control

  • System Information: Get detailed Windows system information, installed programs, running processes
  • Window Management: Focus, minimize, maximize windows, get window lists
  • Process Management: List, monitor, and control running processes
  • Registry Access: Safe Windows registry operations
  • Service Management: Control Windows services
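
As an illustration of what a process-listing tool can look like underneath, here is a minimal standard-library sketch that shells out to the Windows `tasklist` command and parses its CSV output. It is not taken from this project, which may rely on a different mechanism; the function names `parse_tasklist_csv` and `list_processes` are hypothetical, and the memory parsing assumes English-locale output such as "12,345 K".

```python
import csv
import io
import subprocess


def parse_tasklist_csv(text):
    """Parse `tasklist /FO CSV /NH` output into a list of process dicts."""
    processes = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 5:
            continue  # skip blank or malformed lines
        name, pid, _session, _session_num, mem = row[:5]
        # Memory usage looks like "12,345 K"; keep only the digits.
        digits = "".join(ch for ch in mem if ch.isdigit())
        processes.append({"name": name, "pid": int(pid), "mem_kb": int(digits or 0)})
    return processes


def list_processes():
    """Run tasklist (Windows only) and return the parsed rows."""
    out = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_tasklist_csv(out)
```

Keeping the parser separate from the subprocess call means the parsing logic can be tested with a sample string on any operating system.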

🖱️ Input Automation

  • Mouse Control: Click, drag, move cursor, scroll automation
  • Keyboard Control: Type text, send keyboard shortcuts, hotkeys
  • Screen Interaction: Find and click UI elements, image recognition
  • Drag & Drop: Automated file and UI element manipulation

🎵 Multimedia & Entertainment

  • Spotify Automation: Complete music control, playlist management
  • YouTube Integration: Search and play videos automatically
  • Music Playlist Management: Create, edit, and manage playlists
  • Media Player Control: Universal media player automation

🌐 Web Browser Automation

  • Chrome Automation: Full browser control with Selenium WebDriver
  • Web Scraping: Extract data from websites
  • Form Filling: Automate web form submissions
  • Navigation: Automated browsing and page interaction

📱 Application Control

  • Notepad Automation: Text editing and file operations
  • Calculator Control: Mathematical calculations
  • File Explorer: Navigate and manage files/folders
  • Custom App Integration: Extend to control any Windows application

🔍 Computer Vision & Screen Analysis

  • Screenshot Capture: Take and save screen captures
  • Image Recognition: Find UI elements using computer vision
  • Screen Monitoring: Track screen changes and activity
  • OCR Integration: Text extraction from images

⚙️ Configuration & Preferences

  • User Preferences: Store and retrieve user settings
  • Configuration Management: JSON-based configuration system
  • Profile Management: Multiple user profile support
  • Customization: Extensible plugin architecture
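
A JSON-based preference store of the kind described above can be sketched in a few lines. This is an illustrative sketch only, not the project's implementation: the helper names `load_preferences` and `save_preference` and the default keys are hypothetical.

```python
import json
from pathlib import Path

# Hypothetical default settings, used when the file has no value for a key.
DEFAULTS = {"theme": "light", "default_browser": "chrome"}


def load_preferences(path, defaults=DEFAULTS):
    """Return saved preferences merged over defaults; tolerate a missing file."""
    path = Path(path)
    prefs = dict(defaults)
    if path.exists():
        with path.open("r", encoding="utf-8") as fh:
            prefs.update(json.load(fh))
    return prefs


def save_preference(path, key, value):
    """Set one key and write the whole file back as pretty-printed JSON."""
    path = Path(path)
    prefs = load_preferences(path, defaults={})
    prefs[key] = value
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as fh:
        json.dump(prefs, fh, indent=2, sort_keys=True)
    return prefs
```

For example, `save_preference("config/user_preferences.json", "theme", "dark")` would persist a single setting, and a later `load_preferences` call on the same path would return it merged with the defaults.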

Installation

  1. Clone this repository:
git clone https://github.com/mukul975/mcp-windows-automation.git
cd mcp-windows-automation
  2. Install required dependencies:
pip install -r requirements.txt
  3. For web automation, install ChromeDriver (a version that matches your installed Chrome) and make sure it is on your PATH.

Usage

Running the MCP Server

python src/unified_server.py

Configuration

Edit the configuration files in the config/ directory:

  • claude_desktop_config.json - Claude Desktop integration
  • user_preferences.json - User preferences storage

Example Usage

# Example of using the automation server
from src.unified_server import AutomationServer

server = AutomationServer()
# The server will be available via MCP protocol

🎯 Real-World Use Cases

🏢 Business Automation

  • "Take a screenshot of my desktop and save it as 'daily_report.png'"
  • "Open Excel, create a new spreadsheet, and type the sales data"
  • "Check system performance and email the report to my manager"
  • "Backup all files from Desktop to external drive"

🎵 Entertainment & Media

  • "Play my favorite song on Spotify"
  • "Create a new playlist called 'Work Music' and add upbeat songs"
  • "Search for 'Python tutorials' on YouTube and play the first video"
  • "Take a screenshot when my favorite song plays"

💻 Development Workflow

  • "Open VS Code, create a new Python file, and type the boilerplate code"
  • "Run the test suite and capture the output"
  • "Open Chrome, navigate to GitHub, and check for new issues"
  • "Monitor CPU usage while running the build process"

🔧 System Administration

  • "List all running processes and their memory usage"
  • "Check which programs start with Windows"
  • "Find and close any unresponsive applications"
  • "Get detailed system information and save to a file"

🚀 Quick Start Examples

Natural Language Commands (via AI Assistant):

🤖 "Can you play some music on Spotify?"
🤖 "Take a screenshot of my screen"
🤖 "Open calculator and compute 15% of 250"
🤖 "Close all browser windows"
🤖 "What programs are currently running?"

Direct MCP Tool Calls:

{
  "tool": "spotify_play_favorite_song",
  "parameters": {}
}

{
  "tool": "take_screenshot",
  "parameters": {
    "filename": "my_desktop.png"
  }
}

{
  "tool": "automate_calculator",
  "parameters": {
    "expression": "15% of 250"
  }
}
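
The tool calls above use a shorthand. On the wire, MCP clients send tool invocations as JSON-RPC 2.0 requests with the method `tools/call`, carrying the tool name and its arguments in `params`. The sketch below converts the shorthand into that request shape; the helper name `to_tools_call` is hypothetical, and the tool names are simply the ones shown above.

```python
import itertools
import json

# Simple incrementing request ids, as JSON-RPC requires a unique id per request.
_ids = itertools.count(1)


def to_tools_call(shorthand):
    """Convert {"tool": ..., "parameters": ...} into an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {
            "name": shorthand["tool"],
            "arguments": shorthand.get("parameters", {}),
        },
    }


request = to_tools_call(
    {"tool": "take_screenshot", "parameters": {"filename": "my_desktop.png"}}
)
print(json.dumps(request, indent=2))
```

In normal use the AI client builds these requests itself, so this is mainly useful for debugging or for writing a custom client.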

Project Structure

├── src/                    # Source code
│   ├── unified_server.py              # Main MCP server
│   ├── advanced_automation_server.py  # Advanced automation features
│   ├── mcp_gui.py                     # GUI interface
│   └── ...
├── tests/                  # Test files
├── docs/                   # Documentation
├── examples/               # Example scripts and configurations
├── config/                 # Configuration files
└── README.md

Documentation

Testing

Run the test suite:

python -m pytest tests/

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Run the test suite
  6. Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Requirements

  • Python 3.7+
  • Windows 10/11
  • Required Python packages (see requirements.txt)
  • ChromeDriver for web automation features

Support

For issues and questions, please open an issue on GitHub.