# JackHP/MCP-YouTube-Transcribe
An MCP server that provides a tool to fetch transcripts from YouTube videos. It first attempts to retrieve a pre-existing, official transcript. If one is not available, it downloads the video's audio and uses OpenAI's Whisper model for local AI-powered transcription.
This project is designed to be a simple, self-contained tool that can be easily integrated into any system capable of communicating with an MCP server.
## Features

- YouTube Video Search: Finds the most relevant YouTube video based on a text query.
- Official Transcript Priority: Intelligently fetches manually created or auto-generated YouTube transcripts first for speed and accuracy.
- Fast AI-Powered Transcription: Uses whisper.cpp (if available) for blazing fast transcription, falling back to OpenAI's Python Whisper `tiny` model if whisper.cpp is not installed.
- MCP Server Interface: Exposes the transcription functionality as a simple tool (`get_youtube_transcript`) via the lightweight Model Context Protocol.
## Requirements

- Python 3.12+
- uv: A fast Python package installer and resolver. You will need to install `uv` on your system first.
- FFmpeg: Must be installed and available on your system's PATH. Required for audio processing.
- whisper.cpp (highly recommended): MCP-YouTube-Transcribe will first try to use whisper.cpp for lightning-fast local transcription and only fall back to Python Whisper if the executable is not found.
  - macOS: `brew install whisper-cpp`
  - Linux: Build from source following the whisper.cpp installation guide.
  - Windows: Build from source or grab a pre-built binary from the releases page.

  After installation, make sure the `whisper-cli` (or `whisper-cpp` on older versions) command is on your PATH.

  Finally, download a Whisper model. The tiny model offers the best speed-to-quality ratio for most use cases:

  ```shell
  mkdir -p models
  curl -L -o models/ggml-tiny.bin \
    https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-tiny.bin
  ```

  Place additional models in the same `models/` folder if you wish to experiment.
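If you are unsure whether a whisper.cpp binary is actually visible on your PATH, a quick check from Python is shown below. This is only a sketch of the lookup; the server's own detection logic may differ:

```python
import shutil

# Look for a whisper.cpp binary on PATH, trying the newer name first.
binary = shutil.which("whisper-cli") or shutil.which("whisper-cpp")
if binary:
    print(f"whisper.cpp found at {binary}")
else:
    print("no whisper.cpp binary found; the Python Whisper tiny model will be used instead")
```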
## Installation with uv

Using `uv` is recommended, as it is extremely fast and handles both environment creation and package installation seamlessly.

1. Clone the repository:

   ```shell
   git clone https://github.com/<your-username>/YouTubeTranscriber.git
   cd YouTubeTranscriber
   ```

2. Create a virtual environment: This command creates a `.venv` folder in your project directory. `uv` will automatically use this environment for all subsequent commands.

   ```shell
   uv venv
   ```

3. Install the project and its dependencies: This command reads the `pyproject.toml` file and installs all required libraries into the virtual environment.

   ```shell
   uv sync
   ```
## Usage

### Running the MCP Server

Once installed, you can start the server by running the `mcp_server.py` script. The server listens for JSON-RPC requests on stdin and sends responses to stdout.

```shell
python mcp_server.py
```

The server logs its activity to a file named `mcp_server.log` in the project's root directory.
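For a quick health check while the server is running, you can print the tail of the log file. A minimal sketch, assuming the default `mcp_server.log` location in the project root:

```python
from pathlib import Path

# Default log location, written by mcp_server.py in the project root.
log_path = Path("mcp_server.log")

if log_path.exists():
    # Print the last few log lines for a quick health check.
    lines = log_path.read_text().splitlines()
    print("\n".join(lines[-10:]))
else:
    print("mcp_server.log not found; has the server been started?")
```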
## Connecting to Gemini CLI on Windows

You can connect this MCP server to the Google Gemini CLI to use the function as a native tool directly from your terminal. These instructions are for a Windows environment.

### Step 1: Create a Startup Script (`run_server.bat`)

The Gemini CLI needs a single, reliable command to start your server. A batch script is the perfect way to handle this on Windows, as it ensures the correct Python interpreter is used.

1. Create a new file named `run_server.bat` in the root of your project directory.
2. Copy and paste the following content into the file:

```batch
@echo off
REM This ensures the script's directory is the current directory
cd /d "%~dp0"

REM --- IMPORTANT ---
REM Replace the path below with the ABSOLUTE path to your project's venv python.exe
set PYTHON_EXE="C:\Users\jackp\.pyenv\pyenv-win\versions\3.12.10\python3.12.exe"

echo --- Starting MCP Server using %PYTHON_EXE% ---
%PYTHON_EXE% mcp_server.py
pause
```

This script changes into your project directory and runs the server with the interpreter from your `.venv`, ensuring all the correct dependencies are available.
### Step 2: Configure the Gemini CLI

Now you need to tell the Gemini CLI how to find and run your new server.

1. Locate your Gemini CLI `config.json` file. On Windows, this is typically found at `C:\Users\<Your-Username>\.gemini\config.json`.
2. Open the `config.json` file in a text editor and add the following entry to the `mcpServers` object. If `mcpServers` doesn't exist, create it as shown below.

```json
{
  "mcpServers": {
    "MCP-YouTube-Transcribe": {
      "command": "C:\\Windows\\System32\\cmd.exe",
      "args": [
        "/c",
        "<path-to-your-project>\\run_server.bat"
      ],
      "cwd": "<path-to-your-project>"
    }
  }
}
```

Crucially, you must replace both instances of `<path-to-your-project>` with the full, absolute path to where you cloned the `YouTubeTranscriber` repository.

Example: If your project is located at `C:\dev\YouTubeTranscriber`, the entry would look like this:

```json
{
  "mcpServers": {
    "MCP-YouTube-Transcribe": {
      "command": "C:\\Windows\\System32\\cmd.exe",
      "args": [
        "/c",
        "C:\\dev\\YouTubeTranscriber\\run_server.bat"
      ],
      "cwd": "C:\\dev\\YouTubeTranscriber"
    }
  }
}
```

Note: JSON requires backslashes to be escaped, so you must use double backslashes (`\\`) in your paths.
### Step 3: Verify the Connection

After saving the `config.json` file, you can verify that the Gemini CLI recognizes and can use your new tool: run the Gemini CLI and press `ctrl+t`. You should see `MCP-YouTube-Transcribe` listed as an available tool.
## Connecting to Gemini CLI on Mac/Unix

You can also connect this MCP server to the Google Gemini CLI on Mac or other Unix-like systems. The process is similar to Windows but uses a shell script instead of a batch file.

### Step 1: Prepare the Startup Script

The repository already includes a `run_server.sh` script. Just make it executable:

```shell
chmod +x run_server.sh
```
### Step 2: Configure the Gemini CLI

1. Locate your Gemini CLI `config.json` file. On Mac/Unix systems, this is typically found at `~/.gemini/config.json`.
2. Open the `config.json` file in a text editor and add the following entry to the `mcpServers` object. If `mcpServers` doesn't exist, create it as shown below:

```json
{
  "mcpServers": {
    "MCP-YouTube-Transcribe": {
      "command": "/path/to/your/project/run_server.sh",
      "cwd": "/path/to/your/project"
    }
  }
}
```
Replace both instances of `/path/to/your/project` with the absolute path to where you cloned the repository.

Example: If your project is located at `/Users/username/MCP-YouTube-Transcribe`, the entry would look like this:

```json
{
  "mcpServers": {
    "MCP-YouTube-Transcribe": {
      "command": "/Users/username/MCP-YouTube-Transcribe/run_server.sh",
      "cwd": "/Users/username/MCP-YouTube-Transcribe"
    }
  }
}
```
### Step 3: Verify the Connection

After saving the `config.json` file, run the Gemini CLI and press `ctrl+t`. You should see `MCP-YouTube-Transcribe` listed as an available tool.
## MCP Client Example

You can interact with the server using any client that supports the MCP protocol over stdio. The server exposes one primary tool: `get_youtube_transcript`.

Here is an example of a `call_tool` request to get a transcript for the query "What is an API? by MuleSoft".

Request:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "call_tool",
  "params": {
    "name": "get_youtube_transcript",
    "arguments": {
      "query": "What is an API? by MuleSoft",
      "force_whisper": false
    }
  }
}
```

- `query`: The search term for the YouTube video.
- `force_whisper`: (Optional) A boolean that, if `true`, skips the check for an official transcript and generates one directly with Whisper. Defaults to `false`.
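When scripting a client, the request above can be built programmatically before writing it to the server's stdin. The sketch below only constructs and serializes the payload; the helper name `make_transcript_request` is our own, not part of the project:

```python
import json

def make_transcript_request(query, force_whisper=False, request_id=1):
    """Build a JSON-RPC call_tool request for the get_youtube_transcript tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "call_tool",
        "params": {
            "name": "get_youtube_transcript",
            "arguments": {"query": query, "force_whisper": force_whisper},
        },
    }

# Serialize to a single line, ready to write to the server's stdin.
line = json.dumps(make_transcript_request("What is an API? by MuleSoft"))
print(line)
```

Writing this line to the server's stdin and reading a line from its stdout would complete one request/response round trip.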
## Testing

This project includes a test suite to verify its functionality.

- Core Function Test (`simple.py`): This script tests the server's handler functions directly, without running a separate server process. It's the quickest way to check whether the core logic is working.

  ```shell
  python simple.py
  ```

- Full Server Test (`test_mcp.py`): This script starts the MCP server as a subprocess and sends it live JSON-RPC requests, providing an end-to-end test of the server's functionality.

  ```shell
  python test_mcp.py
  ```
## Configuration

- Logging: Server activity is logged to `mcp_server.log`.
- Audio Cache: When Whisper is used, downloaded audio files are temporarily stored in `testing/audio_cache/`. You may wish to change this path in `youtube_tool.py` for a production environment.
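Because the audio cache can grow over time, you may want to clear it between runs. A minimal sketch, assuming the default `testing/audio_cache/` location:

```python
import shutil
from pathlib import Path

# Default cache location used when Whisper transcribes downloaded audio.
cache_dir = Path("testing") / "audio_cache"

if cache_dir.exists():
    shutil.rmtree(cache_dir)  # remove all cached audio files
    print(f"removed {cache_dir}")
else:
    print("audio cache not present; nothing to remove")
```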
## Contributing

Contributions are welcome! If you'd like to improve the YouTube Transcriber, please feel free to fork the repository and submit a pull request.

Please read our `CONTRIBUTING.md` for details on our code of conduct and the process for submitting pull requests.
## License

This project is licensed under the MIT License; see the `LICENSE` file for details.