mac-vision-mcp by jasich - MCP Server

mac-vision-mcp

A Model Context Protocol (MCP) server that enables AI coding agents to capture screenshots of macOS windows and displays on demand.

Why

LLMs are very good at using images as context: give one an image file and it can analyze a design or read the text in it. I constantly want to "show" an LLM what I'm looking at, but taking a screenshot, finding the file, and passing its path to the LLM is cumbersome. I also ended up with thousands of screenshots to manage. So I thought, why can't the LLM just do this itself? That question led to this project.

Features

Window Discovery - List all open windows with metadata (title, app, bounds, display)
Window Capture - Capture screenshots of specific windows by ID
Display Capture - Capture entire displays (single or all)
Smart Filtering - Automatically filters out system overlays and utility windows
Natural Integration - Works seamlessly with any MCP-compatible AI agent
Privacy First - Runs entirely locally on your Mac
Professional Logging - Structured logging with timestamps for debugging

System Requirements

macOS: 12.0+ (Monterey or later)
Architecture: Intel (x64) or Apple Silicon (arm64)
Node.js: 16.0.0 or higher
Permissions: Screen Recording permission required
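
The version requirements above can be checked before installing. The following is a minimal sketch, not part of mac-vision-mcp: `meetsMinimum` is a hypothetical helper that compares dotted version strings numerically, and it assumes plain numeric versions without pre-release tags.

```typescript
// Hypothetical helper, not part of mac-vision-mcp.
// Returns true when `actual` is greater than or equal to `minimum`.
function meetsMinimum(actual: string, minimum: string): boolean {
  // Accept an optional leading "v" (as in "v18.17.0") and treat
  // missing or non-numeric components as 0.
  const parse = (version: string): number[] =>
    version
      .replace(/^v/, "")
      .split(".")
      .map((part) => Number.parseInt(part, 10) || 0);

  const a = parse(actual);
  const m = parse(minimum);
  const length = Math.max(a.length, m.length);

  for (let i = 0; i < length; i++) {
    const left = a[i] ?? 0;
    const right = m[i] ?? 0;
    if (left !== right) return left > right;
  }
  return true; // all components are equal
}
```

In Node.js the first argument would typically be `process.versions.node`, compared against "16.0.0". The macOS version printed by `sw_vers -productVersion` can be compared against "12.0" the same way.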

Installation

Global Installation (Recommended)

npm install -g mac-vision-mcp

Using with npx (No Installation)

npx -y mac-vision-mcp

Quick Start

1. Grant Screen Recording Permission

On first run, macOS will prompt you to grant Screen Recording permission:

Open System Settings (System Preferences on macOS 12)
Go to Privacy & Security > Screen Recording (Security & Privacy > Privacy > Screen Recording on macOS 12)
Enable permission for the application running the MCP server
Restart the MCP server

2. Configure Your MCP Client

For Claude Code

Add to .mcp.json in your project:

{
  "mcpServers": {
    "mac-vision": {
      "command": "npx",
      "args": ["-y", "mac-vision-mcp"]
    }
  }
}

For Cursor

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "mac-vision": {
      "command": "npx",
      "args": ["-y", "mac-vision-mcp"]
    }
  }
}

For Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "mac-vision": {
      "command": "npx",
      "args": ["-y", "mac-vision-mcp"]
    }
  }
}

3. Use with Your AI Agent

Once configured, your AI agent can use natural language to capture screenshots:

User: "Show me my Chrome window with the error"

Agent: [calls list_windows]
Agent: [calls capture_window with the Chrome window ID]
Agent: "I can see the 404 error in your browser..."
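
Under the hood, each bracketed step in the exchange above is an MCP `tools/call` request sent as a JSON-RPC 2.0 message. MCP clients normally build these for you, so the sketch below is only meant to show the shape of the two calls. `toolCall` is a hypothetical helper, and the window ID is the sample value used elsewhere in this README.

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0, method "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and its arguments in a request.
function toolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {}
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The two calls from the conversation, in order: list first, then capture
// using an ID taken from the list_windows response.
const listRequest = toolCall(1, "list_windows");
const captureRequest = toolCall(2, "capture_window", { window_id: "12345" });
```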

MCP Tools

`list_windows`

Get all open windows with metadata.

Parameters: None

Returns:

{
  "windows": [
    {
      "id": "12345",
      "title": "Chrome - Documentation",
      "app": "Google Chrome",
      "bounds": {
        "x": 0,
        "y": 23,
        "width": 1920,
        "height": 1057
      },
      "display": 0
    }
  ]
}
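
For client code that consumes this response, the shape above can be written down as TypeScript types. This is a sketch derived from the sample response only. The interface names and the `findWindowsByApp` helper are illustrative and are not exported by mac-vision-mcp.

```typescript
// Types mirroring the sample list_windows response.
interface Bounds {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface WindowInfo {
  id: string;
  title: string;
  app: string;
  bounds: Bounds;
  display: number;
}

interface ListWindowsResult {
  windows: WindowInfo[];
}

// Hypothetical helper: case-insensitive match on the app name.
// An empty query matches nothing rather than everything.
function findWindowsByApp(result: ListWindowsResult, query: string): WindowInfo[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") return [];
  return result.windows.filter((w) => w.app.toLowerCase().includes(needle));
}

const sample: ListWindowsResult = {
  windows: [
    {
      id: "12345",
      title: "Chrome - Documentation",
      app: "Google Chrome",
      bounds: { x: 0, y: 23, width: 1920, height: 1057 },
      display: 0,
    },
  ],
};

const chromeWindows = findWindowsByApp(sample, "chrome");
```

The `id` of a matching entry is the value to pass as `window_id` to the capture tools.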

`capture_window`

Capture a screenshot of a specific window.

Parameters:

window_id (required, string) - Window ID from list_windows
mode (optional, string) - Capture mode: "full" or "content" (default: "full")
output_path (optional, string) - Custom output path (must end with .png)

Returns:

{
  "success": true,
  "file_path": "/tmp/screenshot_12345.png",
  "window": {
    "id": "12345",
    "title": "Chrome - Documentation",
    "app": "Google Chrome"
  }
}
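
A caller can catch bad arguments before they reach the server. The sketch below builds the `capture_window` arguments from the parameter list above. `buildCaptureWindowArgs` is a hypothetical client-side helper, and the server's own validation remains authoritative.

```typescript
type CaptureMode = "full" | "content";

interface CaptureWindowArgs {
  window_id: string;
  mode?: CaptureMode;
  output_path?: string;
}

function isCaptureMode(value: string): value is CaptureMode {
  return value === "full" || value === "content";
}

// Hypothetical helper: assembles the arguments object and rejects
// values the documented parameter list does not allow.
function buildCaptureWindowArgs(
  windowId: string,
  mode?: string,
  outputPath?: string
): CaptureWindowArgs {
  if (windowId.trim() === "") {
    throw new Error("window_id is required");
  }
  const args: CaptureWindowArgs = { window_id: windowId };
  if (mode !== undefined) {
    if (!isCaptureMode(mode)) {
      throw new Error(`mode must be "full" or "content", got "${mode}"`);
    }
    args.mode = mode;
  }
  // Passed through unchanged; the server checks the .png rule.
  if (outputPath !== undefined) {
    args.output_path = outputPath;
  }
  return args;
}
```

Optional parameters are left out of the object entirely when not given, so the server's defaults apply.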

`capture_windows`

Capture screenshots of multiple windows at once. Useful when you need to see several windows simultaneously.

Parameters:

window_ids (required, string[]) - Array of Window IDs from list_windows
mode (optional, string) - Capture mode: "full" or "content" (default: "full")
output_dir (optional, string) - Custom output directory (default: temp directory)

Returns:

{
  "success": true,
  "captures": [
    {
      "window_id": "12345",
      "success": true,
      "file_path": "/tmp/screenshot_12345.png",
      "window": {
        "id": "12345",
        "title": "Chrome - Documentation",
        "app": "Google Chrome"
      }
    },
    {
      "window_id": "67890",
      "success": true,
      "file_path": "/tmp/screenshot_67890.png",
      "window": {
        "id": "67890",
        "title": "VS Code",
        "app": "Code"
      }
    }
  ]
}
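
Because every entry in `captures` carries its own `success` flag, a client will probably want to separate usable files from failures. The sketch below assumes that a failed entry has `success: false` and may lack `file_path` and `window`. The README only shows successful entries, so that shape is a guess, and `partitionCaptures` is a hypothetical helper.

```typescript
interface WindowSummary {
  id: string;
  title: string;
  app: string;
}

// file_path and window are optional here because the shape of a
// failed entry is not documented (assumption).
interface WindowCapture {
  window_id: string;
  success: boolean;
  file_path?: string;
  window?: WindowSummary;
}

interface CaptureWindowsResult {
  success: boolean;
  captures: WindowCapture[];
}

// Hypothetical helper: collects screenshot paths and the IDs that failed.
function partitionCaptures(
  result: CaptureWindowsResult
): { files: string[]; failedIds: string[] } {
  const files: string[] = [];
  const failedIds: string[] = [];
  for (const capture of result.captures) {
    if (capture.success && capture.file_path !== undefined) {
      files.push(capture.file_path);
    } else {
      failedIds.push(capture.window_id);
    }
  }
  return { files, failedIds };
}
```

IDs that end up in `failedIds` are candidates for a fresh `list_windows` call followed by a retry.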

`capture_display`

Capture entire display(s).

Parameters:

display_id (optional, number) - Specific display number (0-indexed), or omit to capture all

Single Display Returns:

{
  "success": true,
  "file_path": "/tmp/display_0.png",
  "display": 0
}

All Displays Returns:

{
  "success": true,
  "captures": [
    {
      "display": 0,
      "file_path": "/tmp/display_0.png"
    },
    {
      "display": 1,
      "file_path": "/tmp/display_1.png"
    }
  ]
}
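
Since `capture_display` returns two different shapes, client code may find it easier to normalize both into one list. This is a minimal sketch based on the two samples above. The type names and `toDisplayCaptures` are illustrative, not part of the server.

```typescript
interface DisplayCapture {
  display: number;
  file_path: string;
}

// Returned when display_id is given.
interface SingleDisplayResult {
  success: boolean;
  file_path: string;
  display: number;
}

// Returned when display_id is omitted.
interface AllDisplaysResult {
  success: boolean;
  captures: DisplayCapture[];
}

type CaptureDisplayResult = SingleDisplayResult | AllDisplaysResult;

// Hypothetical helper: always returns an array, whichever shape came back.
function toDisplayCaptures(result: CaptureDisplayResult): DisplayCapture[] {
  if ("captures" in result) {
    return result.captures;
  }
  return [{ display: result.display, file_path: result.file_path }];
}
```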

Troubleshooting

Permission Denied Errors

Error: Screen Recording permission required

Solution:

Open System Settings > Privacy & Security > Screen Recording (on macOS 12: System Preferences > Security & Privacy > Privacy > Screen Recording)
Enable permission for your terminal or application
Restart the MCP server

Window Not Found

Error: Window {id} not found. It may have been closed.

Cause: The window was closed between listing and capturing.

Solution: Call list_windows again to get current window IDs.
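
After re-listing, the client still has to decide which entry replaces the stale one. One possible approach is sketched below: keep the old ID if it is still listed, otherwise look for a window with the same app and title. `relocateWindow` is a hypothetical helper, and matching on app and title is an assumption about what counts as "the same window".

```typescript
interface ListedWindow {
  id: string;
  title: string;
  app: string;
}

// Hypothetical helper: finds the entry in a fresh list_windows result
// that corresponds to a previously listed window.
function relocateWindow(
  previous: ListedWindow,
  fresh: ListedWindow[]
): ListedWindow | undefined {
  // Preferred: the ID is still present.
  const sameId = fresh.find((w) => w.id === previous.id);
  if (sameId !== undefined) return sameId;

  // Fallback: the window was reopened, so match on app and title.
  return fresh.find((w) => w.app === previous.app && w.title === previous.title);
}
```

A result of `undefined` means the window is really gone, and the agent should tell the user instead of retrying.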

Invalid Output Path

Error: Output path must end with .png

Solution: Ensure custom output paths have a .png extension.
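
If output paths come from user input, a client can normalize them before calling the tool. The sketch below is one way to do that. `withPngExtension` is a hypothetical helper, and it emits a lowercase `.png` because the README does not say whether the server's check is case-sensitive.

```typescript
// Hypothetical helper: returns a path that ends with ".png".
// Replaces an existing extension on the file name, or appends one.
function withPngExtension(path: string): string {
  if (path.endsWith(".png")) return path;

  const lastSlash = path.lastIndexOf("/");
  const lastDot = path.lastIndexOf(".");
  // A dot counts as an extension separator only when it is inside the
  // file name and not its first character (so ".hidden" keeps its name).
  const hasExtension = lastDot > lastSlash + 1;
  const stem = hasExtension ? path.slice(0, lastDot) : path;
  return `${stem}.png`;
}
```

Note that this only fixes the name. The capture is written as PNG regardless of what extension the user originally asked for.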

Native Module Issues

Error: Native module compilation errors

Solution:

Ensure you're on macOS 12.0+
Verify Node.js version is 16.0.0+
Try reinstalling: npm install -g mac-vision-mcp --force

No Windows Listed

Issue: list_windows returns an empty array or is missing windows you expect

Cause: Screen Recording permission not granted or windows filtered out

Solution:

Verify Screen Recording permission is enabled
Note: System windows and gesture overlays are automatically filtered
Windows smaller than 50x50 pixels are excluded
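
To reason about why a particular window is missing, the size rule can be written out as a predicate. This is a sketch of the rule as described, not the server's actual code: it assumes a window is excluded when either its width or its height is under 50 pixels, which is one reading of "smaller than 50x50".

```typescript
interface Size {
  width: number;
  height: number;
}

// Threshold taken from the note above.
const MIN_WINDOW_SIZE = 50;

// Assumption: both dimensions must reach the minimum for a window to be listed.
function isLargeEnough(bounds: Size, min: number = MIN_WINDOW_SIZE): boolean {
  return bounds.width >= min && bounds.height >= min;
}
```

For example, a 1920x1057 browser window passes, while a 300x24 menu bar item or a 32x32 status icon would be filtered out under this reading.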

Architecture

Language: TypeScript/Node.js with ESM modules
MCP SDK: @modelcontextprotocol/sdk (v1.22.0)
Screenshot Library: node-screenshots (v0.2.4) with native N-API bindings
Window Metadata: get-windows (v9.2.3)
Permissions: mac-screen-capture-permissions (v2.1.0)
Validation: Zod (v3.25.0)

Development

Local Setup

# Clone repository
git clone https://github.com/jasich/mac-vision-mcp.git
cd mac-vision-mcp

# Install dependencies
npm install

# Build
npm run build

# Run locally
node dist/index.js

Using Local Build in Another Project

To test your local development build with Claude Code or another MCP client:

Build the project (if not already done):

cd /path/to/mac-vision-mcp
npm run build

Configure your other project's .mcp.json (the same file used in Quick Start) with the absolute path:

{
  "mcpServers": {
    "mac-vision": {
      "command": "node",
      "args": ["/path/to/mac-vision-mcp/dist/index.js"]
    }
  }
}

Restart Claude Code to load the local build

Make changes and rebuild as needed:

npm run build  # Rebuild after code changes

Note: Replace /path/to/mac-vision-mcp with your actual absolute path to the project.

Testing with MCP Inspector

# Run with MCP Inspector for debugging
npx @modelcontextprotocol/inspector node ./dist/index.js

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

License

MIT License - see LICENSE file for details.

Acknowledgments

Built on the Model Context Protocol
Uses node-screenshots for native screenshot capture
Uses get-windows by Sindre Sorhus for window metadata

Support

Issues: Report bugs or request features via GitHub Issues
Documentation: Model Context Protocol Docs
MCP Inspector: Use for testing and debugging MCP tools