Randroids-Dojo/mcp-desktop-dvr
MCP Desktop DVR is a production-ready server for desktop video capture and analysis on macOS, enabling AI-powered visual analysis through a 30-minute rolling buffer.
MCP Desktop DVR
✅ PRODUCTION READY
MCP server for desktop video capture and analysis on macOS. Enables Claude to analyze screen activity through a 30-minute rolling buffer.
Features
- 30-minute rolling buffer of desktop activity
- AI-powered visual analysis with OpenAI GPT-4o (cloud) or Tarsier2-7B (local)
- Intelligent fallback from OpenAI → Tarsier → OCR
- Window-specific capture by application
- Precise extraction of time segments
- Production-ready with comprehensive testing
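The rolling buffer can be pictured as a list of recorded segments from which anything older than 30 minutes is evicted. The sketch below is illustrative only and is not taken from the project's source: the `Segment` shape, the `RollingBuffer` class, and its method names are all hypothetical.

```typescript
// Hypothetical segment record; the real project may track different metadata.
interface Segment {
  path: string;    // where the recorded chunk lives on disk
  startMs: number; // capture start, epoch milliseconds
  endMs: number;   // capture end, epoch milliseconds
}

class RollingBuffer {
  private segments: Segment[] = [];
  private readonly windowMs: number;

  // Default window: 30 minutes, matching the buffer length described above.
  constructor(windowMs: number = 30 * 60 * 1000) {
    this.windowMs = windowMs;
  }

  // Append a segment and return the segments that fell out of the window,
  // so the caller can delete their files.
  add(segment: Segment, nowMs: number = Date.now()): Segment[] {
    this.segments.push(segment);
    const cutoff = nowMs - this.windowMs;
    const evicted = this.segments.filter((s) => s.endMs < cutoff);
    this.segments = this.segments.filter((s) => s.endMs >= cutoff);
    return evicted;
  }

  // Segments that overlap the last `seconds` of activity.
  recent(seconds: number, nowMs: number = Date.now()): Segment[] {
    const from = nowMs - seconds * 1000;
    return this.segments.filter((s) => s.endMs > from && s.startMs <= nowMs);
  }
}
```

A request such as "Analyze the last 30 seconds" would then map to something like `buffer.recent(30)`, under the same assumptions.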
Requirements
- macOS 10.15+
- Node.js 18+
- Screen Recording permission
Quick Setup
npm install && npm run build
Add to Claude Desktop config ("auto" selects the best available analyzer; JSON does not allow inline comments, so none are included):
{
  "mcpServers": {
    "desktop-dvr": {
      "command": "node",
      "args": ["/path/to/mcp-desktop-dvr/dist/index.js"],
      "env": {
        "ANALYZER_PREFERENCE": "auto"
      }
    }
  }
}
Tools
- start-continuous-capture: Begin recording
- analyze-desktop-now: Extract and analyze recent activity
- get-capture-status: Check recording status
- stop-capture: Stop recording
- configure-capture: Update settings
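Claude Desktop issues tool calls for you, but it can help to see what one looks like on the wire. MCP invokes tools through a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for the tool names listed above; the `buildToolCall` helper is hypothetical, and the arguments are left empty because this README does not document each tool's parameters.

```typescript
// Tool names as listed in this README.
const TOOL_NAMES = [
  "start-continuous-capture",
  "analyze-desktop-now",
  "get-capture-status",
  "stop-capture",
  "configure-capture",
] as const;

type ToolName = (typeof TOOL_NAMES)[number];

// JSON-RPC 2.0 request shape that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: ToolName; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds the request body for one tool call.
// Argument names are not documented here, so the default is an empty object.
function buildToolCall(
  id: number,
  name: ToolName,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

console.log(JSON.stringify(buildToolCall(1, "get-capture-status"), null, 2));
```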
Usage
Ask Claude:
- "Start desktop recording"
- "Analyze the last 30 seconds"
- "What errors are on my screen?"
Video Analysis Options
The analyze-desktop-now tool supports multiple analyzers with different strengths:
☁️ OpenAI GPT-4o Vision (Primary - Cloud)
- Most accurate analysis using GPT-4o via Responses API
- Automatic GIF segmentation - creates 10-second segments from videos
- Smart file management - preserves all GIF files locally alongside MP4
- Efficient uploads - sends only the first segment to OpenAI for analysis
- Comprehensive understanding of complex workflows
- Requires OpenAI API key and internet connection
- Fast processing (~3-5 seconds)
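The 10-second segmentation mentioned above comes down to simple interval arithmetic. The following sketch shows one way to compute the segment boundaries for a clip. It is an assumption about the approach, not the project's code, and `splitIntoSegments` and `TimeRange` are hypothetical names.

```typescript
interface TimeRange {
  startSec: number;
  endSec: number;
}

// Split a clip of `durationSec` seconds into consecutive ranges of at most
// `segmentSec` seconds. The last range may be shorter than the others.
function splitIntoSegments(durationSec: number, segmentSec: number = 10): TimeRange[] {
  if (durationSec <= 0 || segmentSec <= 0) return [];
  const ranges: TimeRange[] = [];
  for (let start = 0; start < durationSec; start += segmentSec) {
    ranges.push({ startSec: start, endSec: Math.min(start + segmentSec, durationSec) });
  }
  return ranges;
}

// A 25-second clip yields three ranges; per the list above, only the first
// segment would be uploaded while the rest stay on disk.
const segments = splitIntoSegments(25);
console.log(segments.length, segments[0]);
```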
🤖 Tarsier2-7B (Fallback - Local AI)
- Runs locally on your Mac using Metal Performance Shaders
- No API keys required - completely private
- Excellent visual understanding of UIs, games, and graphics
- Moderate processing time (~5-15 seconds for 30-second clips)
📝 OCR Text Extraction (Final Fallback)
- Basic text extraction from screenshots
- Limited effectiveness with modern UIs
- Always available as final fallback
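The OpenAI → Tarsier → OCR fallback can be sketched as a loop that tries each analyzer in order and moves on when one fails. This is a minimal illustration that assumes fallback is triggered by a thrown error. The `Analyzer` type and `analyzeWithFallback` function are hypothetical, and the real analyzers are replaced by caller-supplied functions.

```typescript
type AnalyzerName = "openai" | "tarsier" | "ocr";

// Hypothetical analyzer signature: takes a video path, returns a description.
type Analyzer = (videoPath: string) => Promise<string>;

// Try each analyzer in `chain` order and return the first successful result.
// If every analyzer throws, report all collected failures in one error.
async function analyzeWithFallback(
  videoPath: string,
  chain: AnalyzerName[],
  analyzers: Record<AnalyzerName, Analyzer>,
): Promise<{ analyzer: AnalyzerName; result: string }> {
  const errors: string[] = [];
  for (const name of chain) {
    try {
      const result = await analyzers[name](videoPath);
      return { analyzer: name, result };
    } catch (err) {
      errors.push(`${name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All analyzers failed: ${errors.join("; ")}`);
}
```

Returning the name of the analyzer that succeeded lets the caller report which backend produced the answer.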
Configuration
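Set environment variables in the "env" block of your MCP config. JSON does not allow comments, so the options are listed here rather than inline:
- ANALYZER_PREFERENCE: "auto", "openai", "tarsier", or "ocr"
- OPENAI_API_KEY: your OpenAI API key (recommended; required for the OpenAI analyzer)
- OPENAI_MODEL: OpenAI model to use (default: gpt-4o)
"env": {
  "ANALYZER_PREFERENCE": "auto",
  "OPENAI_API_KEY": "sk-...",
  "OPENAI_MODEL": "gpt-4o"
}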
Analyzer Selection Logic
- "auto" (default): Uses OpenAI if an API key is set, otherwise Tarsier, then OCR
- "openai": Forces OpenAI GPT-4o Vision (requires API key)
- "tarsier": Forces Tarsier2-7B local AI
- "ocr": Forces basic OCR text extraction
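The selection rules above can be expressed as a small function that maps the environment to an ordered list of analyzers to try. This is a sketch of the documented behavior rather than the project's implementation. `resolveAnalyzerChain` is a hypothetical name, and it assumes that a forced preference disables fallback entirely.

```typescript
type AnalyzerName = "openai" | "tarsier" | "ocr";

// Map environment variables to the ordered list of analyzers to try.
// Assumption: "Forces" means a single analyzer with no fallback.
function resolveAnalyzerChain(env: Record<string, string | undefined>): AnalyzerName[] {
  const preference = env["ANALYZER_PREFERENCE"] ?? "auto";
  const hasKey = Boolean(env["OPENAI_API_KEY"]);

  switch (preference) {
    case "openai":
      if (!hasKey) {
        throw new Error('ANALYZER_PREFERENCE="openai" requires OPENAI_API_KEY');
      }
      return ["openai"];
    case "tarsier":
      return ["tarsier"];
    case "ocr":
      return ["ocr"];
    case "auto":
      // OpenAI first only when a key is present, then local AI, then OCR.
      return hasKey ? ["openai", "tarsier", "ocr"] : ["tarsier", "ocr"];
    default:
      throw new Error(`Unknown ANALYZER_PREFERENCE: ${preference}`);
  }
}
```

In a Node.js process this would typically be called as `resolveAnalyzerChain(process.env)`.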
Documentation
- Setup and usage instructions
- Development info
License
ISC