jasich/mac-vision-mcp
A Model Context Protocol (MCP) server that enables AI coding agents to capture screenshots of macOS windows and displays on demand.
mac-vision-mcp
Why
LLMs are amazing at using images for context. You can feed image files to an LLM and it can do things like analyze a design or read text. I find myself constantly wanting to "show" LLMs what I'm looking at, but I found it cumbersome to take a screenshot, find the file, and give the path to the LLM. On top of that, I ended up with thousands of screenshots over time that I needed to manage. So I thought, why can't the LLM just do this itself? That's what led to this project.
Features
- Window Discovery - List all open windows with metadata (title, app, bounds, display)
- Window Capture - Capture screenshots of specific windows by ID
- Display Capture - Capture entire displays (single or all)
- Smart Filtering - Automatically filters out system overlays and utility windows
- Natural Integration - Works seamlessly with any MCP-compatible AI agent
- Privacy First - Runs entirely locally on your Mac
- Professional Logging - Structured logging with timestamps for debugging
System Requirements
- macOS: 12.0+ (Monterey or later)
- Architecture: Intel (x64) or Apple Silicon (arm64)
- Node.js: 16.0.0 or higher
- Permissions: Screen Recording permission required
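The requirements above can be checked up front with a small preflight function. This is a minimal sketch, not part of mac-vision-mcp: `HostInfo` and `checkHost` are hypothetical names, and the values are passed in as plain strings (in Node.js they would typically come from `process.platform`, `process.arch`, and `process.versions.node`). It does not check the macOS version or the Screen Recording permission.

```typescript
// Hypothetical preflight check; not part of mac-vision-mcp itself.
interface HostInfo {
  platform: string;    // e.g. process.platform
  arch: string;        // e.g. process.arch
  nodeVersion: string; // e.g. process.versions.node, such as "18.17.1"
}

// Returns a list of human-readable problems; an empty list means the host looks OK.
function checkHost(host: HostInfo): string[] {
  const problems: string[] = [];
  if (host.platform !== "darwin") {
    problems.push(`macOS is required, found platform "${host.platform}"`);
  }
  if (host.arch !== "x64" && host.arch !== "arm64") {
    problems.push(`unsupported architecture "${host.arch}"`);
  }
  // Tolerate a leading "v" (as in process.version) and compare the major version only.
  const majorText = host.nodeVersion.replace(/^v/, "").split(".")[0] ?? "";
  const major = parseInt(majorText, 10);
  if (isNaN(major) || major < 16) {
    problems.push(`Node.js 16.0.0 or higher is required, found "${host.nodeVersion}"`);
  }
  return problems;
}
```

An empty result means the platform, architecture, and Node.js checks passed; any other result lists what to fix before installing.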
Installation
Global Installation (Recommended)
npm install -g mac-vision-mcp
Using with npx (No Installation)
npx -y mac-vision-mcp
Quick Start
1. Grant Screen Recording Permission
On first run, macOS will prompt you to grant Screen Recording permission:
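- Open System Settings (called System Preferences on macOS 12)
- Go to Privacy & Security > Screen Recording (on macOS 12: Security & Privacy > Privacy > Screen Recording)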
- Enable permission for the application running the MCP server
- Restart the MCP server
2. Configure Your MCP Client
For Claude Code
Add to .mcp.json in your project:
{
"mcpServers": {
"mac-vision": {
"command": "npx",
"args": ["-y", "mac-vision-mcp"]
}
}
}
For Cursor
Add to ~/.cursor/mcp.json:
{
"mcpServers": {
"mac-vision": {
"command": "npx",
"args": ["-y", "mac-vision-mcp"]
}
}
}
For Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"mac-vision": {
"command": "npx",
"args": ["-y", "mac-vision-mcp"]
}
}
}
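All three clients take the same `mcpServers` entry, so if a config file already lists other servers, the entry only needs to be merged in. The following is a rough sketch of that merge on an already-parsed config object; `McpConfig` and `addMacVisionServer` are hypothetical names used for illustration, and reading or writing the file itself is left out.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
}

// Assumed shape of a client config file; other top-level keys are preserved as-is.
interface McpConfig {
  mcpServers?: { [name: string]: McpServerEntry };
  [key: string]: unknown;
}

// Returns a new config with the "mac-vision" entry added, leaving other servers untouched.
function addMacVisionServer(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...config.mcpServers,
      "mac-vision": { command: "npx", args: ["-y", "mac-vision-mcp"] },
    },
  };
}
```

The function does not mutate its input, so the original config can be kept as a backup before the merged version is written out.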
3. Use with Your AI Agent
Once configured, your AI agent can use natural language to capture screenshots:
User: "Show me my Chrome window with the error"
Agent: [calls list_windows]
Agent: [calls capture_window with the Chrome window ID]
Agent: "I can see the 404 error in your browser..."
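The exchange above boils down to two tool calls: list the windows, then capture the one that matches. Here is a minimal sketch of that flow. `ToolCaller` is a hypothetical stand-in for whatever your MCP client uses to invoke tools, and the sketch assumes it resolves to the already-parsed JSON shown in the tool reference; a real client may wrap results differently.

```typescript
// Hypothetical transport: a real client would wrap its own MCP tool-call API here.
type ToolCaller = (name: string, args: Record<string, unknown>) => Promise<unknown>;

interface ListedWindow {
  id: string;
  title: string;
  app: string;
}

// Lists windows, picks the first one owned by `app`, and captures it.
// Resolves to the screenshot path, or null if no window matched or the capture failed.
async function captureFirstWindowOfApp(
  callTool: ToolCaller,
  app: string
): Promise<string | null> {
  const listed = (await callTool("list_windows", {})) as { windows: ListedWindow[] };
  const target = listed.windows.filter((w) => w.app === app)[0];
  if (!target) {
    return null;
  }
  const captured = (await callTool("capture_window", { window_id: target.id })) as {
    success: boolean;
    file_path: string;
  };
  return captured.success ? captured.file_path : null;
}
```

Passing the tool caller in as a parameter keeps the flow independent of any particular client library and makes it easy to exercise with a fake.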
MCP Tools
list_windows
Get all open windows with metadata.
Parameters: None
Returns:
{
"windows": [
{
"id": "12345",
"title": "Chrome - Documentation",
"app": "Google Chrome",
"bounds": {
"x": 0,
"y": 23,
"width": 1920,
"height": 1057
},
"display": 0
}
]
}
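For client code that consumes this response, it can help to give the shape a type. The sketch below mirrors the fields in the example above and adds a small lookup helper; the interface names and `findWindows` are illustrative and not exported by mac-vision-mcp.

```typescript
interface Bounds {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface WindowInfo {
  id: string;
  title: string;
  app: string;
  bounds: Bounds;
  display: number;
}

interface ListWindowsResult {
  windows: WindowInfo[];
}

// Case-insensitive match against the app name or the window title.
function findWindows(result: ListWindowsResult, query: string): WindowInfo[] {
  const needle = query.toLowerCase();
  return result.windows.filter(
    (w) =>
      w.app.toLowerCase().indexOf(needle) !== -1 ||
      w.title.toLowerCase().indexOf(needle) !== -1
  );
}
```

For the example response, `findWindows(result, "chrome")` would return the single Chrome window, whose `id` can then be passed to `capture_window`.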
capture_window
Capture a screenshot of a specific window.
Parameters:
- window_id (required, string) - Window ID from list_windows
- mode (optional, string) - Capture mode: "full" or "content" (default: "full")
- output_path (optional, string) - Custom output path (must end with .png)
Returns:
{
"success": true,
"file_path": "/tmp/screenshot_12345.png",
"window": {
"id": "12345",
"title": "Chrome - Documentation",
"app": "Google Chrome"
}
}
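A caller can catch the most common argument mistake, an output path without a `.png` extension, before the request reaches the server. This is a minimal client-side sketch; `buildCaptureWindowArgs` is a hypothetical helper, and the case-sensitive extension check is an assumption about how the server validates paths.

```typescript
type CaptureMode = "full" | "content";

interface CaptureWindowArgs {
  window_id: string;
  mode?: CaptureMode;
  output_path?: string;
}

// Builds the argument object for capture_window, rejecting obviously invalid input early.
function buildCaptureWindowArgs(
  windowId: string,
  mode: CaptureMode = "full",
  outputPath?: string
): CaptureWindowArgs {
  if (windowId.length === 0) {
    throw new Error("window_id is required");
  }
  // Assumption: the server's check is a case-sensitive ".png" suffix match.
  if (outputPath !== undefined && !/\.png$/.test(outputPath)) {
    throw new Error("Output path must end with .png");
  }
  const args: CaptureWindowArgs = { window_id: windowId, mode };
  if (outputPath !== undefined) {
    args.output_path = outputPath;
  }
  return args;
}
```

When no output path is given, the key is left out entirely so the server can choose its own location.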
capture_windows
Capture screenshots of multiple windows at once. Useful when you need to see several windows simultaneously.
Parameters:
- window_ids (required, string[]) - Array of window IDs from list_windows
- mode (optional, string) - Capture mode: "full" or "content" (default: "full")
- output_dir (optional, string) - Custom output directory (default: temp directory)
Returns:
{
"success": true,
"captures": [
{
"window_id": "12345",
"success": true,
"file_path": "/tmp/screenshot_12345.png",
"window": {
"id": "12345",
"title": "Chrome - Documentation",
"app": "Google Chrome"
}
},
{
"window_id": "67890",
"success": true,
"file_path": "/tmp/screenshot_67890.png",
"window": {
"id": "67890",
"title": "VS Code",
"app": "Code"
}
}
]
}
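Because each entry in `captures` carries its own `success` flag, a batch may partly succeed. The sketch below separates usable file paths from failed window IDs. The shape of a failed entry is not shown in this README, so the optional `file_path` and the `error` field are assumptions, and `summarizeCaptures` is a hypothetical helper.

```typescript
interface WindowCapture {
  window_id: string;
  success: boolean;
  file_path?: string; // assumed to be absent when success is false
  error?: string;     // hypothetical field; the real failure shape may differ
}

interface CaptureWindowsResult {
  success: boolean;
  captures: WindowCapture[];
}

// Splits a batch result into screenshot paths and the IDs of windows that failed.
function summarizeCaptures(
  result: CaptureWindowsResult
): { paths: string[]; failedIds: string[] } {
  const paths: string[] = [];
  const failedIds: string[] = [];
  for (const capture of result.captures) {
    if (capture.success && capture.file_path !== undefined) {
      paths.push(capture.file_path);
    } else {
      failedIds.push(capture.window_id);
    }
  }
  return { paths, failedIds };
}
```

The failed IDs are a natural input for a fresh `list_windows` call followed by a retry.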
capture_display
Capture entire display(s).
Parameters:
- display_id (optional, number) - Specific display number (0-indexed); omit to capture all displays
Single Display Returns:
{
"success": true,
"file_path": "/tmp/display_0.png",
"display": 0
}
All Displays Returns:
{
"success": true,
"captures": [
{
"display": 0,
"file_path": "/tmp/display_0.png"
},
{
"display": 1,
"file_path": "/tmp/display_1.png"
}
]
}
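Since the response shape depends on whether `display_id` was passed, client code has to handle both forms. One way to do that, sketched here with hypothetical type names, is a union type that is normalized to a flat list of file paths.

```typescript
interface SingleDisplayResult {
  success: boolean;
  file_path: string;
  display: number;
}

interface AllDisplaysResult {
  success: boolean;
  captures: { display: number; file_path: string }[];
}

type CaptureDisplayResult = SingleDisplayResult | AllDisplaysResult;

// Returns every screenshot path in the result, whichever form it takes.
function displayFilePaths(result: CaptureDisplayResult): string[] {
  if ("captures" in result) {
    return result.captures.map((capture) => capture.file_path);
  }
  return [result.file_path];
}
```

The presence of the `captures` key is what distinguishes the two forms, so no extra flag is needed.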
Troubleshooting
Permission Denied Errors
Error: Screen Recording permission required
Solution:
- Open System Settings > Privacy & Security > Screen Recording (on macOS 12: System Preferences > Security & Privacy > Privacy > Screen Recording)
- Enable permission for your terminal or application
- Restart the MCP server
Window Not Found
Error: Window {id} not found. It may have been closed.
Cause: The window was closed between listing and capturing.
Solution: Call list_windows again to get current window IDs.
Invalid Output Path
Error: Output path must end with .png
Solution: Ensure custom output paths have a .png extension.
Native Module Issues
Error: Native module compilation errors
Solution:
- Ensure you're on macOS 12.0+
- Verify Node.js version is 16.0.0+
- Try reinstalling:
npm install -g mac-vision-mcp --force
No Windows Listed
Issue: list_windows returns an empty array or omits windows you expect to see
Cause: Screen Recording permission has not been granted, or the windows were filtered out
Solution:
- Verify Screen Recording permission is enabled
- Note: System windows and gesture overlays are automatically filtered
- Windows smaller than 50x50 pixels are excluded
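To reason about why a particular window is missing, the size rule can be approximated as follows. This is only a sketch of one plausible reading, in which a window is dropped when either side is under 50 pixels; the server's actual filter may differ, and `isLargeEnough` is a hypothetical name.

```typescript
interface Size {
  width: number;
  height: number;
}

// Assumed threshold, taken from the "50x50 pixels" note above.
const MIN_WINDOW_SIDE = 50;

// True when both sides reach the threshold (assumed interpretation of the rule).
function isLargeEnough(bounds: Size): boolean {
  return bounds.width >= MIN_WINDOW_SIDE && bounds.height >= MIN_WINDOW_SIDE;
}
```

Under this reading, a thin 1920x24 toolbar window would be excluded even though it is wide.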
Architecture
- Language: TypeScript/Node.js with ESM modules
- MCP SDK: @modelcontextprotocol/sdk (v1.22.0)
- Screenshot Library: node-screenshots (v0.2.4) with native N-API bindings
- Window Metadata: get-windows (v9.2.3)
- Permissions: mac-screen-capture-permissions (v2.1.0)
- Validation: Zod (v3.25.0)
Development
Local Setup
# Clone repository
git clone https://github.com/jasich/mac-vision-mcp.git
cd mac-vision-mcp
# Install dependencies
npm install
# Build
npm run build
# Run locally
node dist/index.js
Using Local Build in Another Project
To test your local development build with Claude Code or another MCP client:
- Build the project (if not already done):
cd /path/to/mac-vision-mcp
npm run build
- Configure your other project's .claude.json with the absolute path:
{
"mcpServers": {
"mac-vision": {
"command": "node",
"args": ["/path/to/mac-vision-mcp/dist/index.js"]
}
}
}
- Restart Claude Code to load the local build
- Make changes and rebuild as needed:
npm run build # Rebuild after code changes
Note: Replace /path/to/mac-vision-mcp with your actual absolute path to the project.
Testing with MCP Inspector
# Run with MCP Inspector for debugging
npx @modelcontextprotocol/inspector node ./dist/index.js
Contributing
Contributions are welcome! Please feel free to submit issues or pull requests.
License
MIT License - see LICENSE file for details.
Acknowledgments
- Built on the Model Context Protocol
- Uses node-screenshots for native screenshot capture
- Uses get-windows by Sindre Sorhus for window metadata
Support
- Issues: Report bugs or request features via GitHub Issues
- Documentation: Model Context Protocol Docs
- MCP Inspector: Use for testing and debugging MCP tools