# MCP Whisper Transcription Server

An MCP (Model Context Protocol) server for audio/video transcription using MLX-optimized Whisper models. Optimized for Apple Silicon devices for ultra-fast performance.
## Features
- 🚀 MLX-Optimized: Leverages Apple Silicon for blazing-fast transcription
- 🎯 Multiple Formats: Supports txt, md, srt, and json output formats
- 🎬 Video Support: Automatically extracts audio from video files
- 📦 Batch Processing: Process multiple files in parallel
- 🔧 MCP Integration: Full MCP protocol support with tools and resources
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/galacoder/mcp-whisper-transcription.git
   cd mcp-whisper-transcription
   ```

2. Install Poetry (if not already installed):

   ```bash
   curl -sSL https://install.python-poetry.org | python3 -
   ```

3. Install dependencies:

   ```bash
   poetry install
   ```

4. Copy the environment configuration:

   ```bash
   cp .env.example .env
   ```
## Configuration

Edit `.env` to configure:

- `DEFAULT_MODEL`: Choose from `tiny`, `base`, `small`, `medium`, `large-v3`, or `large-v3-turbo`
- `OUTPUT_FORMATS`: Comma-separated list of output formats
- `MAX_WORKERS`: Number of parallel workers for batch processing
- `TEMP_DIR`: Directory for temporary files
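A filled-in `.env` might look like the following; the values shown are illustrative, not defaults:

```env
DEFAULT_MODEL=large-v3-turbo
OUTPUT_FORMATS=txt,srt
MAX_WORKERS=4
TEMP_DIR=/tmp/whisper-transcription
```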
## Usage

### As MCP Server

Add to your Claude Code configuration:

```json
{
  "mcpServers": {
    "whisper-transcription": {
      "command": "poetry",
      "args": ["run", "python", "-m", "src.whisper_mcp_server"],
      "cwd": "/path/to/mcp-whisper-transcription"
    }
  }
}
```
## Available MCP Tools

- `transcribe_file`: Transcribe a single audio/video file
- `batch_transcribe`: Process multiple files in a directory
- `list_models`: Show available Whisper models
- `get_model_info`: Get details about a specific model
- `clear_cache`: Clear model cache
- `estimate_processing_time`: Estimate transcription time
- `validate_media_file`: Check file compatibility
- `get_supported_formats`: List supported input/output formats
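To give a sense of what a tool like `estimate_processing_time` computes, here is a minimal sketch. The real-time speed factors and the function signature are illustrative assumptions for this README, not the server's actual implementation or measured numbers:

```python
# Illustrative sketch of an estimate_processing_time-style calculation.
# Speed factors are assumed multiples of real time (higher = faster)
# and are NOT benchmarks from this project.
ILLUSTRATIVE_SPEED = {
    "tiny": 30.0,
    "base": 20.0,
    "small": 12.0,
    "medium": 6.0,
    "large-v3": 3.0,
    "large-v3-turbo": 8.0,
}

def estimate_processing_time(duration_s: float, model: str = "large-v3-turbo") -> float:
    """Return a rough transcription-time estimate in seconds."""
    if model not in ILLUSTRATIVE_SPEED:
        raise ValueError(f"unknown model: {model}")
    return duration_s / ILLUSTRATIVE_SPEED[model]

# A 10-minute file with large-v3-turbo: 600 / 8 = 75 seconds.
print(estimate_processing_time(600, "large-v3-turbo"))  # 75.0
```

The point of such an estimate is to let clients decide between a small, fast model and a large, accurate one before committing to a long batch run.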
## Available MCP Resources

- `transcription://history` - Recent transcriptions
- `transcription://history/{id}` - Specific transcription details
- `transcription://models` - Available models
- `transcription://config` - Current configuration
- `transcription://formats` - Supported formats
- `transcription://performance` - Performance statistics
## Development

### Running Tests

```bash
poetry run pytest
```

### Code Formatting

```bash
poetry run black .
poetry run isort .
```

### Type Checking

```bash
poetry run mypy src/
```
## Requirements
- Python 3.9+
- Apple Silicon Mac (for MLX optimization)
- ffmpeg (for video file support)
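The requirements above can be checked up front before launching the server. A minimal sketch (an illustrative helper, not shipped with this project):

```python
import platform
import shutil
import sys

def check_requirements() -> list[str]:
    """Return a list of human-readable problems; empty means all checks pass."""
    problems: list[str] = []
    if sys.version_info < (3, 9):
        problems.append("Python 3.9+ is required")
    if not (sys.platform == "darwin" and platform.machine() == "arm64"):
        problems.append("MLX acceleration needs an Apple Silicon Mac")
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found (needed for video file support)")
    return problems

for problem in check_requirements():
    print(f"warning: {problem}")
```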
## License
MIT License - see LICENSE file for details
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## Acknowledgments
- Built with FastMCP
- Powered by MLX Whisper
- Original Whisper by OpenAI