Peekaboo

steipete/Peekaboo

Peekaboo MCP is a macOS-only server that allows AI agents to capture and analyze screenshots using local or remote AI models.

Peekaboo 🫣 - Mac automation that sees the screen and does the clicks.

npm package | License: MIT | macOS 15.0+ (Sequoia) | Swift 6.2 | Node >= 22 | Homebrew

Peekaboo brings high-fidelity screen capture, AI analysis, and complete GUI automation to macOS. Version 3 adds native agent flows and multi-screen automation across the CLI and MCP server.

Note: v3 is currently in beta (3.0.0-beta3) and has a few known issues; see the changelog for details.

What you get

  • Pixel-accurate captures (windows, screens, menu bar) with optional Retina 2x scaling.
  • Natural-language agent that chains Peekaboo tools (see, click, type, scroll, hotkey, menu, window, app, dock, space).
  • Menu and menubar discovery with structured JSON; no clicks required.
  • Multi-provider AI: GPT-5.1 family, Claude 4.x, Grok 4-fast (vision), Gemini 2.5, and local Ollama models.
  • MCP server for Claude Desktop and Cursor plus a native CLI; the same tools in both.
  • Configurable, testable workflows with reproducible sessions and strict typing.
  • Requires macOS Screen Recording + Accessibility permissions.

Install

  • macOS app + CLI (Homebrew):
    brew install steipete/tap/peekaboo
    
  • MCP server (Node 22+, no global install needed):
    npx -y @steipete/peekaboo
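
Because the npx route needs Node 22 or newer, a launcher script can check the installed version before starting the server. The following is only a sketch: node_major and node_is_supported are hypothetical helper names (not part of Peekaboo), and the parsing assumes the usual vMAJOR.MINOR.PATCH output of node --version.

```shell
# Hypothetical helpers (not part of Peekaboo): check for Node 22+ before using npx.

# Print the major version from a string like "v22.3.0".
node_major() {
  version="${1#v}"                 # drop the leading "v", if any
  printf '%s\n' "${version%%.*}"   # keep everything before the first dot
}

# Succeed when the version string meets the required major version (default 22).
node_is_supported() {
  major="$(node_major "$1")"
  case "$major" in
    ''|*[!0-9]*) return 1 ;;       # empty or non-numeric: treat as unsupported
  esac
  [ "$major" -ge "${2:-22}" ]
}

# Usage (commented out so that sourcing this file has no side effects):
# if node_is_supported "$(node --version)"; then
#   npx -y @steipete/peekaboo
# else
#   echo "Node 22+ is required for the MCP server" >&2
# fi
```

The check is kept separate from the launch so it can be reused in CI or setup scripts without starting the server.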
    

Quick start

# Capture full screen at Retina scale and save to Desktop
peekaboo image --mode screen --retina --path ~/Desktop/screen.png

# Click a button by label (captures, resolves, and clicks in one go)
SNAPSHOT=$(peekaboo see --app Safari --json-output | jq -r '.data.snapshot_id')
peekaboo click --on "Reload this page" --snapshot "$SNAPSHOT"

# Run a natural-language automation
peekaboo "Open Notes and create a TODO list with three items"

# Run as an MCP server (Claude/Cursor)
npx -y @steipete/peekaboo

# Minimal Claude Desktop config snippet (Developer → Edit Config):
# {
#   "mcpServers": {
#     "peekaboo": {
#       "command": "npx",
#       "args": ["-y", "@steipete/peekaboo"],
#       "env": {
#         "PEEKABOO_AI_PROVIDERS": "openai/gpt-5.1,anthropic/claude-opus-4"
#       }
#     }
#   }
# }
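
To avoid retyping the entry by hand, the config snippet above can be printed from a small shell function. This is a sketch under assumptions: peekaboo_mcp_entry is a hypothetical helper (not shipped with Peekaboo), it only reproduces the JSON shown above with a provider list substituted in, and it assumes the provider string contains no quotes or backslashes. Merging the result into an existing Claude Desktop config is left to you.

```shell
# Hypothetical helper: print the Claude Desktop entry from the snippet above.
# $1 is the provider list; it defaults to the value used in the quick start.
# Assumes the value contains no double quotes or backslashes.
peekaboo_mcp_entry() {
  providers="${1:-openai/gpt-5.1,anthropic/claude-opus-4}"
  cat <<EOF
{
  "mcpServers": {
    "peekaboo": {
      "command": "npx",
      "args": ["-y", "@steipete/peekaboo"],
      "env": {
        "PEEKABOO_AI_PROVIDERS": "$providers"
      }
    }
  }
}
EOF
}
```

For example, peekaboo_mcp_entry "openai/gpt-5.1" > peekaboo-mcp.json writes the entry to a scratch file for review before you paste it into the real config.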

Command     | Key flags / subcommands                              | What it does
------------|------------------------------------------------------|-------------------------------------------------------
see         | --app, --mode screen/window, --retina, --json-output | Capture and annotate UI, return snapshot + element IDs
click       | --on <id/query>, --snapshot, --wait, coords          | Click by element ID, label, or coordinates
type        | --text, --clear, --delay-ms                          | Enter text with pacing options
press       | key names, --repeat                                  | Special keys and sequences
hotkey      | combos like cmd,shift,t                              | Modifier combos (cmd/ctrl/alt/shift)
scroll      | --on <id>, --direction up/down, --ticks              | Scroll views or elements
swipe       | --from/--to, --duration, --steps                     | Smooth gesture-style drags
drag        | --from/--to, modifiers, Dock/Trash targets           | Drag-and-drop between elements/coords
move        | --to <id/coords>, --screen-index                     | Position the cursor without clicking
window      | list, move, resize, focus, set-bounds                | Move/resize/focus windows and Spaces
app         | launch, quit, relaunch, switch, list                 | Launch, quit, relaunch, switch apps
space       | list, switch, move-window                            | List or switch macOS Spaces
menu        | list, list-all, click, click-extra                   | List/click app menus and extras
menubar     | list, click                                          | Target status-bar items by name/index
dock        | launch, right-click, hide, show, list                | Interact with Dock items
dialog      | list, click, input, file, dismiss                    | Drive system dialogs (open/save/etc.)
image       | --mode screen/window/menu, --retina, --analyze       | Screenshot screen/window/menu bar (+analyze)
list        | apps, windows, screens, menubar, permissions         | Enumerate apps, windows, screens, permissions
tools       | --verbose, --json-output, --no-sort                  | Inspect native Peekaboo tools
config      | init, show, add, login, models                       | Manage credentials/providers/settings
permissions | status, grant                                        | Check/grant required macOS permissions
run         | .peekaboo.json, --output, --no-fail-fast             | Execute .peekaboo.json automation scripts
sleep       | --duration (ms)                                      | Millisecond delays between steps
clean       | --all-snapshots, --older-than, --snapshot            | Prune snapshots and caches
agent       | --model, --dry-run, --resume, --max-steps, audio     | Natural-language multi-step automation
mcp         | serve (default)                                      | Run Peekaboo as an MCP server
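
The --json-output and --snapshot flags listed above are what make the see-then-click handoff from the quick start work. One thing worth guarding against: jq -r prints the literal string null when a key is missing, so a failed capture could hand click a bogus ID. The sketch below adds that check. snapshot_is_usable and click_in_snapshot are hypothetical helper names, and the .data.snapshot_id path is taken from the quick start rather than verified against every Peekaboo version.

```shell
# Hypothetical helpers (not part of Peekaboo) for the see -> click handoff.

# Succeed only when the value looks like a real snapshot ID:
# non-empty and not the literal "null" that `jq -r` prints for a missing key.
snapshot_is_usable() {
  [ -n "$1" ] && [ "$1" != "null" ]
}

# Capture an app, then click an element by label within that snapshot.
# Usage: click_in_snapshot Safari "Reload this page"
click_in_snapshot() {
  snapshot="$(peekaboo see --app "$1" --json-output | jq -r '.data.snapshot_id')"
  if ! snapshot_is_usable "$snapshot"; then
    echo "no snapshot ID returned for $1" >&2
    return 1
  fi
  peekaboo click --on "$2" --snapshot "$snapshot"
}
```

Failing early keeps a script from clicking against a stale or missing snapshot when the capture step did not succeed.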

Models and providers

  • OpenAI: GPT-5.1 (default) and GPT-4.1/4o vision
  • Anthropic: Claude 4.x
  • xAI: Grok 4-fast reasoning + vision
  • Google: Gemini 2.5 (pro/flash)
  • Local: Ollama (llama3.3, llava, etc.)

Set providers via PEEKABOO_AI_PROVIDERS or peekaboo config add.
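
The value used in the Claude Desktop config snippet (openai/gpt-5.1,anthropic/claude-opus-4) suggests a comma-separated list of provider/model pairs. Assuming that format, a quick way to inspect what you have set is to split it in the shell. list_ai_providers is a hypothetical helper for illustration; it is not how Peekaboo itself parses the variable.

```shell
# Hypothetical helper: print one "provider<TAB>model" line per entry in a
# comma-separated list such as "openai/gpt-5.1,anthropic/claude-opus-4".
list_ai_providers() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
    [ -n "$entry" ] || continue            # skip empty entries (e.g. trailing comma)
    case "$entry" in
      */*)
        provider="${entry%%/*}"            # text before the first slash
        model="${entry#*/}"                # text after the first slash
        ;;
      *)
        provider="$entry"                  # no slash: provider only, model left empty
        model=""
        ;;
    esac
    printf '%s\t%s\n' "$provider" "$model"
  done
}
```

Running list_ai_providers "$PEEKABOO_AI_PROVIDERS" prints the entries in order, which makes a typo in a long provider list easier to spot.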

Learn more

  • Command reference:
  • Architecture:
  • Building from source:
  • Testing guide:
  • MCP setup:
  • Permissions:
  • Ollama/local models:
  • Agent chat loop:
  • Service API reference:

Development basics

  • Requirements: macOS 15+, Xcode 16+/Swift 6.2. Node 22+ only if you run the pnpm docs/build helper scripts (core CLI/app/MCP are Swift-only).
  • Install deps with pnpm install, then run pnpm run build:cli (build) or pnpm run test:safe (tests).
  • Lint/format: pnpm run lint && pnpm run format.

License

MIT