amafjarkasi/windows-screenshot-mcp-server
A powerful, production-ready MCP (Model Context Protocol) server for Windows screenshot automation with real-time WebSocket streaming capabilities.
Windows Screenshot MCP Server
Professional Windows screenshot capture server with Model Context Protocol (MCP) integration, real-time WebSocket streaming, Chrome tab capture, and advanced window targeting.
Overview
A production-ready, Go-based screenshot server that exposes both a REST API and an MCP endpoint for capturing screenshots of Windows applications. It is designed for automation, testing, and AI agent integration, with advanced features such as real-time streaming and hidden-window capture.
Features
Core Screenshot Capabilities
- Window targeting: Capture by title, class name, process ID, or window handle
- Multiple image formats: PNG, JPEG, BMP, WebP with configurable quality
- Region capture: Specify rectangular areas for precise screenshots
- Advanced window handling: Support for hidden, minimized, and system tray applications
Chrome Browser Integration
- Tab discovery: Automatically find Chrome instances and enumerate tabs
- Direct tab capture: Screenshot specific browser tabs via Chrome DevTools
- Multiple Chrome support: Handle multiple Chrome processes simultaneously
Real-Time WebSocket Streaming
- Live streaming: Real-time window feeds via WebSocket connections
- Configurable quality: Adjust FPS (1-60), quality, and format dynamically
- Multiple sessions: Support concurrent streaming sessions
- Session management: Start, stop, and monitor active streaming sessions
Dual Protocol Support
- REST API: Traditional HTTP endpoints for easy integration
- Model Context Protocol (MCP): JSON-RPC 2.0 for AI agent integration
- Health monitoring: Built-in health checks and status reporting
- CORS support: Cross-origin requests enabled for web applications
Quick Start
Installation
# Download latest release
curl -L https://github.com/your-org/screenshot-mcp-server/releases/latest/download/screenshot-server.exe -o screenshot-server.exe
# Or build from source
git clone https://github.com/your-org/screenshot-mcp-server.git
cd screenshot-mcp-server
go build -o screenshot-server.exe ./cmd/server
Basic Usage
# Start the server
./screenshot-server.exe --port 8080
# Health check
curl http://localhost:8080/health
# Basic window capture
curl "http://localhost:8080/api/screenshot?method=title&target=Notepad" -o notepad.png
# Full desktop capture
curl "http://localhost:8080/api/screenshot?method=desktop&monitor=0" -o desktop.png
API Reference
REST Endpoints
Health Check
GET /health
Returns server status and version information.
Screenshot Capture
GET /api/screenshot
GET /v1/screenshot
Parameters:
- method (required): title, pid, handle, class
- target (required): Window identifier (title, PID, handle, class name)
- format: png, jpeg, bmp, webp (default: png)
- quality: 1-100 for lossy formats (default: 95)
- cursor: true/false to include the mouse cursor
Examples:
# Window by title
curl "http://localhost:8080/api/screenshot?method=title&target=Calculator" -o calc.png
# Window by PID
curl "http://localhost:8080/api/screenshot?method=pid&target=1234&format=jpeg&quality=80" -o app.jpg
# Window by class name
curl "http://localhost:8080/api/screenshot?method=class&target=Notepad&cursor=true" -o notepad.png
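The same requests can be assembled programmatically. The following is a minimal Go sketch, assuming only the query parameters documented above; the `ScreenshotRequest` type and `BuildScreenshotURL` function are illustrative names, not part of the server's code.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// ScreenshotRequest mirrors the documented query parameters.
// The type and field names are illustrative (hypothetical).
type ScreenshotRequest struct {
	Method  string // "title", "pid", "handle", or "class"
	Target  string // window identifier matching Method
	Format  string // optional: "png", "jpeg", "bmp", "webp"
	Quality int    // optional: 1-100; 0 leaves the parameter unset
	Cursor  bool   // include the mouse cursor
}

// BuildScreenshotURL returns the GET URL for /api/screenshot on the given base.
func BuildScreenshotURL(base string, r ScreenshotRequest) (string, error) {
	if r.Method == "" || r.Target == "" {
		return "", fmt.Errorf("method and target are required")
	}
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/screenshot"
	q := url.Values{}
	q.Set("method", r.Method)
	q.Set("target", r.Target)
	if r.Format != "" {
		q.Set("format", r.Format)
	}
	if r.Quality > 0 {
		q.Set("quality", strconv.Itoa(r.Quality))
	}
	if r.Cursor {
		q.Set("cursor", "true")
	}
	u.RawQuery = q.Encode() // Encode sorts keys and escapes values
	return u.String(), nil
}

func main() {
	u, err := BuildScreenshotURL("http://localhost:8080", ScreenshotRequest{
		Method: "pid", Target: "1234", Format: "jpeg", Quality: 80,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

Building the query with `url.Values` rather than string concatenation keeps window titles containing spaces or ampersands correctly escaped.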
Chrome Integration
GET /v1/chrome/instances # List Chrome instances
GET /v1/chrome/tabs # List all Chrome tabs
POST /v1/chrome/tabs/:id/screenshot # Capture specific tab
WebSocket Streaming
Connect to ws://localhost:8080/stream/{windowId} for real-time streaming.
Query Parameters:
- fps: Frames per second (1-60, default: 10)
- quality: Compression quality (10-100, default: 75)
- format: jpeg or png (default: jpeg)
Client Example:
<!DOCTYPE html>
<html>
<body>
<img id="stream" style="max-width: 100%;">
<script>
const ws = new WebSocket('ws://localhost:8080/stream/0?fps=15&quality=75&format=jpeg');
ws.onmessage = function(event) {
const data = JSON.parse(event.data);
if (data.type === 'frame') {
document.getElementById('stream').src = 'data:image/jpeg;base64,' + data.image;
}
};
</script>
</body>
</html>
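A non-browser client has to do the same two steps as the page above: parse the JSON message, then decode the image payload. The sketch below shows only that decoding step in Go. It assumes the message shape implied by the browser example (a `type` field and a base64 `image` field, in standard base64 as used by data URLs); the `frameMessage` type and `DecodeFrame` function are hypothetical names, and the WebSocket connection itself is left to whichever client library is in use.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// frameMessage models only the two fields the browser client reads.
// Any other fields the server may send are ignored here.
type frameMessage struct {
	Type  string `json:"type"`
	Image string `json:"image"` // base64-encoded image bytes (assumed standard encoding)
}

// DecodeFrame extracts raw image bytes from one WebSocket text message.
// ok is false for well-formed messages that are not frames.
func DecodeFrame(msg []byte) (img []byte, ok bool, err error) {
	var m frameMessage
	if err = json.Unmarshal(msg, &m); err != nil {
		return nil, false, fmt.Errorf("decode message: %w", err)
	}
	if m.Type != "frame" {
		return nil, false, nil
	}
	img, err = base64.StdEncoding.DecodeString(m.Image)
	if err != nil {
		return nil, false, fmt.Errorf("decode image: %w", err)
	}
	return img, true, nil
}

func main() {
	// "/9j/" is the base64 form of the bytes FF D8 FF that begin a JPEG file.
	img, ok, err := DecodeFrame([]byte(`{"type":"frame","image":"/9j/"}`))
	fmt.Println(len(img), ok, err)
}
```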
Model Context Protocol (MCP)
The server supports MCP JSON-RPC 2.0 requests via POST /rpc.
Available Methods:
- screenshot.capture - Capture screenshots
- window.list - List windows (placeholder)
- chrome.instances - List Chrome instances
- chrome.tabs - List Chrome tabs
- chrome.tabCapture - Capture Chrome tab
- stream.status - Get streaming status
Example MCP Request:
{
"jsonrpc": "2.0",
"method": "screenshot.capture",
"params": {
"method": "title",
"target": "Calculator",
"format": "png"
},
"id": 1
}
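For Go callers, the request body above might be produced as follows. This is a sketch: the `rpcRequest` and `captureParams` types are illustrative and cover only the fields shown in the example request, and the response format is not modeled because it is not described here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a generic JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string      `json:"jsonrpc"`
	Method  string      `json:"method"`
	Params  interface{} `json:"params,omitempty"`
	ID      int         `json:"id"`
}

// captureParams holds the screenshot.capture parameters from the example
// request (hypothetical type name).
type captureParams struct {
	Method string `json:"method"`
	Target string `json:"target"`
	Format string `json:"format,omitempty"`
}

// NewCaptureRequest serializes a screenshot.capture call.
func NewCaptureRequest(id int, p captureParams) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		Method:  "screenshot.capture",
		Params:  p,
		ID:      id,
	})
}

func main() {
	body, err := NewCaptureRequest(1, captureParams{
		Method: "title", Target: "Calculator", Format: "png",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
	// To send it, POST body to http://localhost:8080/rpc with
	// Content-Type: application/json (for example via http.Post).
}
```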
Server Configuration
The server can be configured via environment variables or command-line flags:
# Start with custom port
./server.exe --port 9090
# Start with custom host
./server.exe --host 0.0.0.0 --port 8080
# Environment variables
export SCREENSHOT_PORT=8080
export SCREENSHOT_HOST=localhost
./server.exe
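One common way to combine the two sources is to use environment variables as defaults that flags can override. The sketch below illustrates that pattern with Go's standard `flag` package. Note that the precedence order is an assumption; this README does not state how the server resolves a conflict between a flag and an environment variable, and `ResolveAddr` is a hypothetical helper.

```go
package main

import (
	"flag"
	"fmt"
	"net"
	"os"
	"strconv"
)

// ResolveAddr returns "host:port" using, in order of precedence (assumed):
// command-line flag, environment variable, built-in default.
// getenv is injected so the lookup can be replaced in tests.
func ResolveAddr(args []string, getenv func(string) string) (string, error) {
	envOr := func(key, def string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return def
	}
	defPort, err := strconv.Atoi(envOr("SCREENSHOT_PORT", "8080"))
	if err != nil {
		return "", fmt.Errorf("SCREENSHOT_PORT: %w", err)
	}
	fs := flag.NewFlagSet("server", flag.ContinueOnError)
	host := fs.String("host", envOr("SCREENSHOT_HOST", "localhost"), "listen host")
	port := fs.Int("port", defPort, "listen port")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return net.JoinHostPort(*host, strconv.Itoa(*port)), nil
}

func main() {
	addr, err := ResolveAddr(os.Args[1:], os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Println("would listen on", addr)
}
```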
Examples & Use Cases
- Simple window screenshots with REST API, CLI, and programming examples
- Real-time window feeds with JavaScript, Python, and Node.js clients
- Advanced techniques for minimized and system tray applications
- Direct browser tab screenshots with Chrome DevTools integration
- Automated UI change detection with Python framework
Advanced Configuration
Server Configuration
The server uses a default configuration that can be customized:
// Default settings
type Config struct {
Port int // Default: 8080
Host string // Default: "localhost"
DefaultFormat string // Default: "png"
Quality int // Default: 95
IncludeCursor bool // Default: false
LogLevel string // Default: "info"
ChromeTimeout string // Default: "30s"
StreamMaxSessions int // Default: 10
StreamDefaultFPS int // Default: 10
}
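A constructor and a validation step make these defaults easier to work with. The following is a sketch rather than the project's actual code: `DefaultConfig` and `Validate` are hypothetical names, the default values come from the comments in the struct above, and the quality and FPS ranges come from the ranges documented for the REST and streaming parameters.

```go
package main

import (
	"fmt"
	"time"
)

// Config repeats the struct above so this example compiles on its own.
type Config struct {
	Port              int
	Host              string
	DefaultFormat     string
	Quality           int
	IncludeCursor     bool
	LogLevel          string
	ChromeTimeout     string
	StreamMaxSessions int
	StreamDefaultFPS  int
}

// DefaultConfig returns the documented defaults.
func DefaultConfig() Config {
	return Config{
		Port:              8080,
		Host:              "localhost",
		DefaultFormat:     "png",
		Quality:           95,
		IncludeCursor:     false,
		LogLevel:          "info",
		ChromeTimeout:     "30s",
		StreamMaxSessions: 10,
		StreamDefaultFPS:  10,
	}
}

// Validate checks the fields whose valid ranges are documented elsewhere
// in this README, plus basic sanity checks on the port and timeout.
func (c Config) Validate() error {
	if c.Port < 1 || c.Port > 65535 {
		return fmt.Errorf("port %d out of range 1-65535", c.Port)
	}
	if c.Quality < 1 || c.Quality > 100 {
		return fmt.Errorf("quality %d out of range 1-100", c.Quality)
	}
	if c.StreamDefaultFPS < 1 || c.StreamDefaultFPS > 60 {
		return fmt.Errorf("fps %d out of range 1-60", c.StreamDefaultFPS)
	}
	// ChromeTimeout is a string such as "30s"; check that it parses as a duration.
	if _, err := time.ParseDuration(c.ChromeTimeout); err != nil {
		return fmt.Errorf("chrome timeout: %w", err)
	}
	return nil
}

func main() {
	cfg := DefaultConfig()
	cfg.Port = 9090 // override a single field
	fmt.Println(cfg.Validate() == nil)
}
```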
Chrome DevTools Setup
For Chrome tab capture, launch Chrome with debugging enabled:
# Windows
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
# Launch with temporary profile
chrome.exe --remote-debugging-port=9222 --user-data-dir=temp-profile
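To confirm that Chrome is actually listening before asking the server to capture a tab, the debugging port can be queried directly. Chrome serves a JSON list of targets at `/json/list` on that port. The sketch below is an independent check, not a description of how the server discovers tabs internally; `chromeTarget` and `ParseTargets` are illustrative names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// chromeTarget holds a subset of the fields Chrome returns for each target.
type chromeTarget struct {
	ID                   string `json:"id"`
	Type                 string `json:"type"`
	Title                string `json:"title"`
	URL                  string `json:"url"`
	WebSocketDebuggerURL string `json:"webSocketDebuggerUrl"`
}

// ParseTargets decodes a /json/list response body and keeps only
// targets of type "page", which correspond to browser tabs.
func ParseTargets(body []byte) ([]chromeTarget, error) {
	var all []chromeTarget
	if err := json.Unmarshal(body, &all); err != nil {
		return nil, err
	}
	var pages []chromeTarget
	for _, t := range all {
		if t.Type == "page" {
			pages = append(pages, t)
		}
	}
	return pages, nil
}

func main() {
	resp, err := http.Get("http://localhost:9222/json/list")
	if err != nil {
		fmt.Println("Chrome debugging endpoint not reachable:", err)
		return
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	tabs, err := ParseTargets(body)
	if err != nil {
		fmt.Println("unexpected response:", err)
		return
	}
	for _, t := range tabs {
		fmt.Printf("%s  %s  %s\n", t.ID, t.Title, t.URL)
	}
}
```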
Building from Source
Prerequisites
- Go 1.21 or later
- Windows OS (for Windows API support)
- Git
Build Instructions
# Clone the repository
git clone https://github.com/your-org/screenshot-mcp-server.git
cd screenshot-mcp-server
# Install dependencies
go mod download
# Build the server
go build -o server.exe ./cmd/server
# Run tests
go test ./...
# Start the server
./server.exe
Project Structure
├── cmd/
│ ├── server/ # Main server application
│ └── mcpctl/ # MCP control utility
├── internal/
│ ├── screenshot/ # Screenshot capture engines
│ ├── chrome/ # Chrome DevTools integration
│ ├── window/ # Window management
│ └── ws/ # WebSocket streaming
├── pkg/
│ └── types/ # Shared data structures
└── examples/ # Usage examples and documentation
Architecture
The server follows a modular architecture:
- HTTP Server (Gin framework) - REST API endpoints
- WebSocket Manager - Real-time streaming support
- Screenshot Engine - Core capture functionality with multiple methods
- Chrome Manager - Browser integration via DevTools protocol
- MCP Handler - JSON-RPC 2.0 support for AI agents
License
This project is licensed under the MIT License - see the license file in the repository for details.
Support
- Issues: GitHub Issues
- Documentation: See the /examples directory for usage examples
A powerful Windows screenshot server built for modern automation and AI integration.