MCP UI Screenshot Analyzer

An MCP (Model Context Protocol) server that integrates with GitHub Copilot to provide AI-powered UI analysis using local models only (zero API costs, privacy-first design).

Features

  • UI Screenshot Analysis: Semantic understanding of UI layouts and structure
  • Color Palette Extraction: Extract dominant colors using OpenCV k-means
  • Text Extraction: OCR capabilities via Gemma 3 vision
  • Bug Detection: Identify layout issues and accessibility problems
  • Depth Levels: Configurable analysis depth (quick/standard/deep)
  • Smart Caching: Performance optimization for repeated analyses

Current Status

Week 1 MVP - COMPLETE

  • ✅ Gemma 3 12B vision integration via Ollama
  • ✅ OpenCV color extraction
  • ✅ Result caching (1-hour TTL)
  • ✅ All 6 MCP tools registered (4 fully functional; 2 are placeholders for Weeks 2-3)
  • ✅ Error handling and validation
  • ⏳ YOLOv8 component detection (Week 2)
  • ⏳ Code generation (Week 3)

Quick Start

Prerequisites

  • macOS or Linux
  • Python 3.10+
  • 8GB+ RAM (16GB recommended)
  • Ollama installed

Installation

# 1. Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# 2. Pull Gemma 3 12B model
ollama pull gemma3:12b

# 3. Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate

# 4. Install dependencies
pip install -r requirements.txt

# 5. Run the MCP server
python server.py

Verify Installation

# Check Ollama is running
pgrep -x "ollama"

# Check models are available
ollama list

# Test server initialization
python -c "from server import mcp, gemma_analyzer; print('Server ready!')"

MCP Tools

1. analyze_ui_screenshot

Main analysis tool with configurable depth levels.

Parameters:

  • image_path (str): Absolute path to screenshot
  • depth (str): "quick" | "standard" | "deep"

Depth Levels:

  • quick: Gemma 3 only (2-4s) - Basic description
  • standard: Gemma 3 + colors (5-8s) - Detailed analysis
  • deep: Full pipeline (12-18s) - Comprehensive analysis

Example:

# Via GitHub Copilot Chat:
Analyze the UI screenshot at /Users/me/screenshot.png

# Programmatic:
result = analyze_ui_screenshot("/path/to/screenshot.png", depth="standard")

2. extract_color_palette

Extract dominant colors using OpenCV k-means clustering.

Parameters:

  • image_path (str): Absolute path to image
  • n_colors (int): Number of colors (2-10, default: 5)

Returns: Color palette with hex codes, RGB values, and percentages
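
For reference, a minimal self-contained sketch of this kind of k-means palette extraction with OpenCV; the extract_palette function name is illustrative, and the actual implementation lives in analyzers/color_extractor.py:

import cv2
import numpy as np

def extract_palette(image_path: str, n_colors: int = 5) -> list[dict]:
    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(f"Cannot read image: {image_path}")
    # Flatten to a list of BGR pixels; cv2.kmeans requires float32 input
    pixels = img.reshape(-1, 3).astype(np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, centers = cv2.kmeans(
        pixels, n_colors, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS
    )
    # Count how many pixels fall into each cluster
    counts = np.bincount(labels.flatten(), minlength=n_colors)
    palette = []
    for center, count in zip(centers, counts):
        b, g, r = (int(c) for c in center)
        palette.append({
            "hex": f"#{r:02x}{g:02x}{b:02x}",
            "rgb": (r, g, b),
            "percentage": round(100 * count / len(pixels), 1),
        })
    return sorted(palette, key=lambda c: c["percentage"], reverse=True)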

3. extract_ui_text

Extract text from UI using Gemma 3 OCR capabilities.

Parameters:

  • image_path (str): Absolute path to screenshot

Returns: List of extracted text elements
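
A rough sketch of how OCR through a local vision model can be driven from Python, assuming the ollama client library; the prompt wording and function name are illustrative, not the server's exact code:

import ollama

def extract_text(image_path: str) -> str:
    # Send the screenshot to the local Gemma 3 model and ask for visible text
    response = ollama.chat(
        model="gemma3:12b",
        messages=[{
            "role": "user",
            "content": "List every piece of visible text in this UI screenshot, one item per line.",
            "images": [image_path],
        }],
    )
    return response["message"]["content"]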

4. detect_ui_bugs

Detect layout issues and accessibility problems.

Parameters:

  • image_path (str): Absolute path to screenshot

Returns: List of issues with severity and suggestions
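
The field names below are an assumption for illustration, not the server's exact schema; a returned issue might look like:

issues = [
    {
        "issue": "Low contrast between button label and background",
        "severity": "high",  # e.g. "low" | "medium" | "high"
        "suggestion": "Raise the text/background contrast ratio to at least 4.5:1",
    },
]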

5. detect_ui_components

Coming in Week 2 - YOLOv8 integration

6. generate_component_code

Coming in Week 3 - Code generation

GitHub Copilot Integration

Configuration

Create .vscode/mcp-settings.json:

{
  "mcpServers": {
    "ui-analyzer": {
      "command": "python",
      "args": ["/Users/manhhaycode/Developer/image-analysis/server.py"],
      "env": {}
    }
  }
}

Usage in Copilot Chat

Analyze the UI screenshot at /path/to/screenshot.png

Extract colors from /path/to/design.png

Detect bugs in ~/Desktop/app-screenshot.png

Performance

Operation            Time (CPU)   Time (GPU)   Cached
Quick analysis       2-4s         1-2s         <1s
Standard analysis    5-8s         2-3s         <1s
Deep analysis        12-18s       4-6s         <1s
Color extraction     <1s          <0.5s        <0.1s

Cache: Results cached for 1 hour, automatic invalidation

Project Structure

image-analysis/
├── server.py                  # MCP server entry point
├── config.yaml               # Configuration
├── analyzers/
│   ├── gemma_analyzer.py     # Ollama Gemma 3 integration
│   ├── color_extractor.py    # OpenCV color extraction
│   └── __init__.py
├── orchestrator/
│   ├── cache.py              # Result caching
│   └── __init__.py
├── tests/
│   ├── fixtures/             # Sample screenshots
│   └── __init__.py
├── utils/
│   └── __init__.py
└── venv/                     # Virtual environment

Configuration

Edit config.yaml to customize:

vision:
  model: "gemma3:12b"           # Primary model
  fallback: "gemma3:4b"         # Low-RAM fallback

performance:
  enable_caching: true
  cache_ttl_seconds: 3600       # 1 hour

color_extraction:
  default_n_colors: 5
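
For illustration, the server presumably reads these values with something like the following (assumes PyYAML; the variable names are illustrative):

import yaml

with open("config.yaml") as f:
    config = yaml.safe_load(f)

model = config["vision"]["model"]                       # "gemma3:12b"
cache_ttl = config["performance"]["cache_ttl_seconds"]  # 3600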

Troubleshooting

Server fails to start:

# Verify Ollama is running
pgrep -x "ollama" || ollama serve &

# Check Gemma 3 model
ollama list | grep gemma3:12b

# Reinstall dependencies
pip install -r requirements.txt

Out of memory:

# Use quantized model (6.6GB vs 9GB)
ollama pull gemma3:12b-q4

# Edit config.yaml:
vision:
  model: "gemma3:12b-q4"

Image not found errors:

  • Always use absolute paths
  • Verify file exists: ls -la /path/to/image.png
  • Check file permissions

Development

Running Tests

# (Tests to be implemented)
python -m pytest tests/
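
Once tests land, a minimal pytest case might look like the sketch below; the extract_palette import is a hypothetical API, and the test builds its own fixture image rather than relying on tests/fixtures/:

# tests/test_color_extractor.py (hypothetical example)
import cv2
import numpy as np

def test_palette_on_solid_red(tmp_path):
    # Write a solid-red 32x32 fixture image on the fly
    img_path = tmp_path / "red.png"
    red = np.zeros((32, 32, 3), dtype=np.uint8)
    red[:, :] = (0, 0, 255)  # BGR for pure red
    cv2.imwrite(str(img_path), red)

    from analyzers.color_extractor import extract_palette  # hypothetical API
    palette = extract_palette(str(img_path), n_colors=2)
    assert palette[0]["hex"].lower() == "#ff0000"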

Adding New Tools

  1. Implement the analyzer in analyzers/
  2. Register it with the MCP tool decorator in server.py (see the sketch after this list)
  3. Integrate caching
  4. Update documentation
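
A minimal sketch of what registering a tool looks like, assuming server.py builds its mcp instance with FastMCP from the official MCP Python SDK; the toy tool below is purely illustrative:

import cv2
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("ui-analyzer")

@mcp.tool()
def image_dimensions(image_path: str) -> dict:
    """Return the pixel width and height of an image (toy example of a new tool)."""
    img = cv2.imread(image_path)
    if img is None:
        return {"error": f"Cannot read image: {image_path}"}
    height, width = img.shape[:2]
    return {"width": width, "height": height}

if __name__ == "__main__":
    mcp.run()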

Roadmap

  • Week 1 (COMPLETE): MVP with Gemma 3 + color extraction + caching ✓
  • Week 2: YOLOv8 component detection
  • Week 3-4: Code generation, comprehensive testing, documentation

Performance Optimization

The system includes several optimizations:

  1. Smart Caching: MD5-based image hashing with a 1-hour TTL (sketched after this list)
  2. Depth Levels: User-controlled trade-off between speed and detail
  3. Lazy Loading: Components loaded only when needed
  4. Error Recovery: Graceful degradation if optional features fail
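
A minimal sketch of the caching idea from item 1, assuming keys combine an MD5 digest of the image bytes with the analysis depth (the real implementation lives in orchestrator/cache.py):

import hashlib
import time

class ResultCache:
    def __init__(self, ttl_seconds: int = 3600):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def _key(self, image_path: str, depth: str) -> str:
        # Hash the file contents, so the same image at a new path still hits
        with open(image_path, "rb") as f:
            digest = hashlib.md5(f.read()).hexdigest()
        return f"{digest}:{depth}"

    def get(self, image_path: str, depth: str):
        key = self._key(image_path, depth)
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, result = entry
        if time.time() - stored_at > self.ttl:  # expired: invalidate
            del self._store[key]
            return None
        return result

    def put(self, image_path: str, depth: str, result) -> None:
        self._store[self._key(image_path, depth)] = (time.time(), result)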

Hardware Requirements

  • Minimum: 8GB RAM, CPU only (using quantized model)
  • Recommended: 16GB RAM, any GPU
  • Optimal: 32GB RAM, GPU with 8GB+ VRAM

License

MIT License

Contributing

Contributions welcome! Please open issues or PRs on GitHub.

Support

For issues or questions:

  • Check this README for detailed documentation
  • Review troubleshooting section above
  • Open a GitHub issue