local-transcription-mcp--parakeet-tdt-0.6b-v2--

Parakeet Transcription MCP Server is designed to transcribe audio and video files into text using NVIDIA's Parakeet TDT 0.6B V2 model.

The server is built with FastMCP. It uses `pydub` for audio conversion and `nemo_toolkit[asr]` for transcription, and it requires FFmpeg for audio processing. It is optimized for NVIDIA GPUs but can also run on a CPU. It accepts a range of audio and video formats and returns transcription output that can include timestamps and formatted text. MCP-compatible clients can connect to it directly, and it exposes both MCP and REST API interfaces.
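The conversion step can be illustrated with a minimal sketch. The server itself does this through `pydub`; the example below instead builds the equivalent FFmpeg command line directly, so it only needs the Python standard library. The function name `build_ffmpeg_command` and the `_16k_mono` output suffix are hypothetical and not taken from the project.

```python
from pathlib import Path


def build_ffmpeg_command(src: str, dst_dir: str = ".", fmt: str = "wav") -> list[str]:
    """Return an FFmpeg argument list that resamples `src` to 16 kHz mono.

    Hypothetical helper: the real server performs this step via pydub.
    """
    if fmt not in ("wav", "flac"):
        raise ValueError(f"unsupported target format: {fmt}")
    # Assumed naming scheme for the converted file.
    dst = Path(dst_dir) / f"{Path(src).stem}_16k_mono.{fmt}"
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it already exists
        "-i", src,       # input audio or video file
        "-vn",           # ignore any video stream
        "-ac", "1",      # downmix to a single channel (mono)
        "-ar", "16000",  # resample to 16 kHz
        str(dst),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where FFmpeg is installed. Building the command separately from running it keeps the conversion settings easy to inspect.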

Features

  • Transcribe various audio/video formats.
  • Automatic conversion to 16 kHz mono WAV or FLAC.
  • Option to include detailed word and segment timestamps.
  • Formatted transcription output with customizable line breaks.
  • Retrieve information about the loaded ASR model and system hardware specifications.
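As a rough illustration of the timestamp and line-break features, the sketch below formats a list of transcribed segments into text. The segment layout (dicts with `start`, `end`, and `text` keys) and the function name `format_transcript` are assumptions made for this example; the server's actual output structure may differ.

```python
def format_transcript(segments, include_timestamps=True, line_sep="\n"):
    """Join transcribed segments into one string.

    Assumed input: dicts with "start" and "end" (seconds) and "text".
    `line_sep` is the customizable break placed between segments.
    """
    lines = []
    for seg in segments:
        text = seg["text"].strip()
        if not text:
            continue  # skip empty segments
        if include_timestamps:
            lines.append(f"[{seg['start']:.2f}s - {seg['end']:.2f}s] {text}")
        else:
            lines.append(text)
    return line_sep.join(lines)


segments = [
    {"start": 0.0, "end": 1.5, "text": "Hello there."},
    {"start": 1.5, "end": 3.25, "text": " How are you? "},
]
print(format_transcript(segments))
# [0.00s - 1.50s] Hello there.
# [1.50s - 3.25s] How are you?
```

Passing `include_timestamps=False` with `line_sep=" "` yields a single line of plain text instead.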

Tools

  1. transcribe_audio

    Transcribes an audio/video file to text using the Parakeet TDT 0.6B V2 model.

  2. system_hardware_specifications

    Retrieves system hardware specifications relevant for performance estimation.
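To give a sense of what a hardware report for performance estimation might contain, here is a minimal sketch using only the Python standard library. The function name `collect_hardware_specs` and the returned fields are hypothetical; the fields reported by the real `system_hardware_specifications` tool are not documented here. Checking for `nvidia-smi` on the `PATH` is only a hint that an NVIDIA driver is installed, not proof that a usable GPU is present.

```python
import os
import platform
import shutil


def collect_hardware_specs() -> dict:
    """Gather basic host details (hypothetical field set)."""
    return {
        "os": platform.system(),                # e.g. "Linux" or "Windows"
        "machine": platform.machine(),          # CPU architecture string
        "python_version": platform.python_version(),
        "cpu_count": os.cpu_count(),            # logical CPUs, may be None
        # True when the NVIDIA driver utility is found on PATH.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for key, value in collect_hardware_specs().items():
        print(f"{key}: {value}")
```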