Youtube-Vision-MCP



The YouTube Vision MCP Server uses the Google Gemini Vision API to analyze YouTube videos, providing descriptions, summaries, and key moment extraction.

The YouTube Vision MCP Server is a Model Context Protocol (MCP) server that analyzes YouTube videos through the Google Gemini Vision API. It exposes separate tools for general descriptions and Q&A, summarization, and key moment extraction. The server is configured through environment variables and communicates over standard input/output. It is compatible with various platforms and can be installed via Smithery or run directly with npx for quick use. It requires Node.js and a Google Gemini API key.
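
As a rough illustration of the setup described above, an MCP client usually launches a stdio server as a child process and passes configuration through the environment. The sketch below builds such a launch entry. The package name `youtube-vision`, the variable names `GEMINI_API_KEY` and `GEMINI_MODEL_NAME`, and the `mcpServers` layout are assumptions that this page does not confirm; check the project's README for the actual values.

```typescript
// Hypothetical launch entry for an MCP client. The package name and the
// environment variable names are assumptions, not confirmed by this page.
interface ServerLaunch {
  command: string;
  args: string[];
  env: Record<string, string>;
}

function buildLaunchEntry(apiKey: string, modelName?: string): ServerLaunch {
  if (apiKey.trim() === "") {
    throw new Error("A Gemini API key is required");
  }
  const env: Record<string, string> = { GEMINI_API_KEY: apiKey };
  if (modelName !== undefined) {
    // Optional override; when omitted, the server picks its own default.
    env["GEMINI_MODEL_NAME"] = modelName;
  }
  // "npx -y <package>" runs the package without a separate install step.
  return { command: "npx", args: ["-y", "youtube-vision"], env };
}

// Assumed client configuration layout; replace the placeholder key before use.
const clientConfig = {
  mcpServers: { "youtube-vision": buildLaunchEntry("YOUR_GEMINI_API_KEY") },
};
console.log(JSON.stringify(clientConfig, null, 2));
```

Keeping the API key in the environment rather than in command-line arguments avoids exposing it in process listings.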

Features

  • Analyzes YouTube videos using the Gemini Vision API.
  • Provides tools for general description, Q&A, summarization, and key moment extraction.
  • Lists available Gemini models supporting content generation.
  • Lets you select the Gemini model via an environment variable.
  • Communicates via standard input/output.

Tools

  1. ask_about_youtube_video

    Answer questions about videos or provide general descriptions

  2. summarize_youtube_video

    Generate a video summary

  3. extract_key_moments

    Extract key moments from a video

  4. list_supported_models

    List supported Gemini models
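
To make the tool list concrete: MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request, and the stdio transport carries one JSON message per line. The sketch below builds and frames such a request for the four tools named above. The argument name `youtube_url` is an assumption, as this page does not document each tool's input schema; `buildToolCall` and `frame` are hypothetical helpers for illustration.

```typescript
// JSON-RPC 2.0 request shape that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Tool names as listed on this page.
const TOOL_NAMES = [
  "ask_about_youtube_video",
  "summarize_youtube_video",
  "extract_key_moments",
  "list_supported_models",
] as const;
type ToolName = (typeof TOOL_NAMES)[number];

let nextId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request.
function buildToolCall(
  name: ToolName,
  args: Record<string, unknown> = {}
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical helper: the stdio transport sends one JSON message per line.
function frame(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}

const summaryRequest = buildToolCall("summarize_youtube_video", {
  // Assumed argument name; consult the tool's input schema for the real one.
  youtube_url: "https://www.youtube.com/watch?v=VIDEO_ID",
});
console.log(frame(summaryRequest).trimEnd());
```

A real session would first complete the MCP `initialize` handshake and could call `tools/list` to discover each tool's actual input schema before sending a request like this.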