
MCP Video Parser

A powerful video analysis system that uses the Model Context Protocol (MCP) to process, analyze, and query video content using AI vision models.

🎬 Features

AI-Powered Video Analysis: Automatically extracts frames and analyzes them with a vision LLM (LLaVA)
Natural Language Queries: Search videos using conversational queries
Time-Based Search: Query videos by relative time ("last week") or specific dates
Location-Based Organization: Organize videos by location (shed, garage, etc.)
Audio Transcription: Extract and search through video transcripts
Chat Integration: Natural conversations with Mistral/Llama while maintaining video context
Scene Detection: Intelligent frame extraction based on visual changes
MCP Support: Standards-based integration with Claude and other MCP clients
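
Time-based search implies that a phrase such as "last week" is resolved to a concrete date range before videos are filtered. The following is a minimal sketch of that step, not the project's parser: the helper name `resolve_relative_range` and the choice of a Monday-to-Sunday week are assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Tuple


def resolve_relative_range(phrase: str, today: date) -> Tuple[date, date]:
    """Map a relative phrase to an inclusive (start, end) date range.

    Hypothetical helper for illustration; the project's own parsing may differ.
    """
    phrase = phrase.strip().lower()
    if phrase == "today":
        return today, today
    if phrase == "yesterday":
        day = today - timedelta(days=1)
        return day, day
    if phrase == "last week":
        # Assumption: "last week" means the previous Monday-to-Sunday week.
        this_monday = today - timedelta(days=today.weekday())
        start = this_monday - timedelta(days=7)
        return start, start + timedelta(days=6)
    raise ValueError(f"unsupported phrase: {phrase!r}")
```

Passing `today` explicitly, rather than calling `date.today()` inside the function, keeps the resolution deterministic and easy to test.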

🚀 Quick Start

Prerequisites

Python 3.10+
Ollama installed and running
ffmpeg (for video processing)

Installation

Clone the repository:

git clone https://github.com/michaelbaker-dev/mcpVideoParser.git
cd mcpVideoParser

Install dependencies:

pip install -r requirements.txt

Pull required Ollama models:

ollama pull llava:latest    # For vision analysis
ollama pull mistral:latest  # For chat interactions

Start the MCP server:

python mcp_video_server.py --http --host localhost --port 8000

Basic Usage

Process a video:

python process_new_video.py /path/to/video.mp4 --location garage

Start the chat client:

python standalone_client/mcp_http_client.py --chat-llm mistral:latest

Example queries:

"Show me the latest videos"
"What happened at the garage yesterday?"
"Find videos with cars"
"Give me a summary of all videos from last week"

🏗️ Architecture

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Video Files   │────▶│ Video Processor │────▶│ Frame Analysis  │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                │                         │
                                ▼                         ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   MCP Server    │◀────│ Storage Manager │◀────│   Ollama LLM    │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │
         ▼
┌─────────────────┐
│   HTTP Client   │
└─────────────────┘

🛠️ Configuration

Edit config/default_config.json to customize:

Frame extraction rate: How many frames to analyze
Scene detection sensitivity: When to capture scene changes
Storage settings: Where to store videos and data
LLM models: Which models to use for vision and chat

See the configuration documentation for details.
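
To illustrate what a scene detection sensitivity setting controls, here is a minimal sketch of threshold-based frame selection. It is an assumption about the general technique, not the project's implementation: frames are modeled as flat sequences of grayscale pixel values, and `select_scene_frames` and `threshold` are hypothetical names.

```python
from typing import List, Sequence


def mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
    """Average absolute per-pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_scene_frames(frames: Sequence[Sequence[int]], threshold: float) -> List[int]:
    """Return indices of frames that differ enough from the last kept frame.

    A lower threshold is more sensitive and keeps more frames.
    Hypothetical sketch; the real processor may use a different metric.
    """
    if not frames:
        return []
    kept = [0]  # Always keep the first frame as the initial reference.
    for index in range(1, len(frames)):
        if mean_abs_diff(frames[index], frames[kept[-1]]) > threshold:
            kept.append(index)
    return kept
```

Comparing against the last kept frame, rather than the immediately preceding one, means slow gradual changes eventually trigger a capture as well.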

🔧 MCP Tools

The server exposes these MCP tools:

process_video - Process and analyze a video file
query_location_time - Query videos by location and time
search_videos - Search video content and transcripts
get_video_summary - Get AI-generated summary of a video
ask_video - Ask questions about specific videos
analyze_moment - Analyze specific timestamp in a video
get_video_stats - Get system statistics
get_video_guide - Get usage instructions
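
MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such a message for `search_videos`; the `query` argument name is an assumption (check the tool's actual schema), and a real client library would also handle session initialization and transport.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# The argument name "query" is hypothetical; the tool's schema is authoritative.
payload = build_tool_call(1, "search_videos", {"query": "cars"})
```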

🛠️ Utility Scripts

Video Cleanup

Clean all videos from the system and reset to a fresh state:

# Dry run to see what would be deleted
python clean_videos.py --dry-run

# Clean processed files and database (keeps originals)
python clean_videos.py

# Clean everything including original video files
python clean_videos.py --clean-originals

# Skip confirmation and backup
python clean_videos.py --yes --no-backup

This script will:

Remove all video entries from the database
Delete all processed frames and transcripts
Delete all videos from the location-based structure
Optionally delete original video files
Create a backup of the database before cleaning (unless --no-backup)

Video Processing

Process individual videos:

# Process a video with automatic location detection
python process_new_video.py /path/to/video.mp4

# Process with specific location
python process_new_video.py /path/to/video.mp4 --location garage

📖 Documentation

Detailed MCP tool documentation
Customization options
How video processing works
Contributing and testing
Production setup

🚦 Development

Running Tests

# All tests
python -m pytest tests/ -v

# Unit tests only
python -m pytest tests/unit/ -v

# Integration tests (requires Ollama)
python -m pytest tests/integration/ -v

Project Structure

mcp-video-server/
├── src/
│   ├── llm/            # LLM client implementations
│   ├── processors/     # Video processing logic
│   ├── storage/        # Database and file management
│   ├── tools/          # MCP tool definitions
│   └── utils/          # Utilities and helpers
├── standalone_client/  # HTTP client implementation
├── config/            # Configuration files
├── tests/             # Test suite
└── video_data/        # Video storage (git-ignored)

🤝 Contributing

We welcome contributions! Please see the project's contributing guidelines.

📝 Roadmap

✅ Basic video processing and analysis
✅ MCP server implementation
✅ Natural language queries
✅ Chat integration with context
🚧 Enhanced time parsing
🚧 Multi-camera support
🚧 Real-time processing
🚧 Web interface

🐛 Troubleshooting

Common Issues

Ollama not running:

ollama serve  # Start Ollama

Missing models:

ollama pull llava:latest
ollama pull mistral:latest
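
Both checks above can also be scripted. This is a rough sketch, assuming Ollama's default address `http://localhost:11434` and its `/api/tags` endpoint for listing local models; the function names are hypothetical and not part of this project.

```python
import json
import urllib.request
from typing import Iterable, List, Optional

REQUIRED_MODELS = ("llava:latest", "mistral:latest")


def list_ollama_models(base_url: str = "http://localhost:11434",
                       timeout: float = 3.0) -> Optional[List[str]]:
    """Return installed model names, or None if Ollama cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
            payload = json.load(response)
    except (OSError, ValueError):
        # Covers connection errors, timeouts, HTTP errors, and invalid JSON.
        return None
    return [model["name"] for model in payload.get("models", [])]


def missing_models(installed: Iterable[str],
                   required: Iterable[str] = REQUIRED_MODELS) -> List[str]:
    """Return the required models that are not installed, in order."""
    present = set(installed)
    return [name for name in required if name not in present]
```

If `list_ollama_models()` returns `None`, start Ollama first; otherwise pass its result to `missing_models()` and pull whatever it reports.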

Port already in use:

# Change port in command
python mcp_video_server.py --http --port 8001
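
To find a usable port before starting the server, one option is to probe by binding a socket. This is a generic sketch with hypothetical helper names, not a feature of `mcp_video_server.py`.

```python
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(host: str, start: int, attempts: int = 20) -> int:
    """Return the first free port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if port_is_free(host, port):
            return port
    raise RuntimeError(f"no free port found in {start}..{start + attempts - 1}")
```

The result of `first_free_port("localhost", 8000)` can then be passed to `--port`. Another process may still claim the port between the probe and the server start, so treat the check as a hint rather than a guarantee.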

📄 License

MIT License - see the license file in the repository for details.

🙏 Acknowledgments

Built on FastMCP framework
Uses Ollama for local LLM inference
Inspired by the Model Context Protocol specification

💬 Support

Version: 0.1.1
Author: Michael Baker
Status: Beta - Breaking changes possible