elevenlabs-podcast-mcp

adamanz/elevenlabs-podcast-mcp

ElevenLabs Podcast MCP Server

A Model Context Protocol (MCP) server for generating professional podcasts using ElevenLabs v3 Text-to-Speech API with Audio Tags support.

🎯 Key Features

  • 🎙️ Multi-speaker dialogue with natural interruptions and overlapping speech
  • 🏷️ Audio Tags for emotional control [excited], delivery [whispers], and effects [laughs]
  • ⏱️ Smart duration control - content-aware with 10-minute maximum
  • 🎨 Multiple podcast styles - interview, narrative, discussion, educational, comedy
  • 🎭 Tone presets - professional, casual, excited, calm, dramatic
  • 🌍 70+ language support with consistent voice quality
  • 📝 AI script generation with Audio Tags
  • 🔄 Batch processing for long-form content
  • 🔊 High-quality audio output (up to 192kbps MP3)
  • 🚀 Built with FastMCP for easy integration

📦 Installation

  1. Clone the repository:
git clone <repository-url>
cd elevenlabs-podcast-mcp
  2. Install dependencies:
pip install -r requirements.txt
  3. Set up environment variables:
cp .env.example .env
# Edit .env and add your ElevenLabs API key

🚀 Quick Start

Running the Server

Development mode:

fastmcp dev server.py

Production mode:

fastmcp run server.py --transport sse

🛠️ Available Tools

Core Tools

generate_podcast

Generate a complete podcast with Audio Tags, configurable style and tone.

{
    "script": "Host: [excitedly] Welcome! Guest: [thoughtfully] Great to be here!",
    "style": "interview",  # interview, narrative, discussion, educational, comedy
    "tone": "professional", # professional, casual, excited, calm, dramatic
    "duration_minutes": null,  # Auto-calculates based on content (max 10 min)
    "auto_duration": true,
    "voice_mapping": {"Host": "voice_id_1", "Guest": "voice_id_2"},
    "output_path": "output/episode.mp3"
}
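
The content-aware duration behind auto_duration can be pictured with a rough word-count estimate. The sketch below is not the server's actual algorithm. It assumes that Audio Tags and speaker labels are not spoken, that DEFAULT_SPEAKING_RATE is measured in words per minute, and that the result is clamped to the documented 10-minute maximum. The helper name estimate_duration_minutes is hypothetical.

```python
import re

MAX_DURATION_MINUTES = 10   # documented cap
WORDS_PER_MINUTE = 150      # assumption: DEFAULT_SPEAKING_RATE is words per minute

TAG_RE = re.compile(r"\[[^\]]*\]")  # Audio Tags such as [excitedly]
# Speaker labels such as "Host: " at the start of a line
SPEAKER_RE = re.compile(r"^[ \t]*[A-Za-z][\w ]*:[ \t]*", re.MULTILINE)


def estimate_duration_minutes(script: str) -> float:
    """Estimate spoken length from the word count, capped at the maximum."""
    spoken = TAG_RE.sub(" ", script)      # tags steer delivery and are not read aloud
    spoken = SPEAKER_RE.sub("", spoken)   # speaker labels are not read aloud either
    words = len(spoken.split())
    return min(words / WORDS_PER_MINUTE, MAX_DURATION_MINUTES)
```

With this estimate, a script of 750 spoken words comes out at 5 minutes, and anything above 1,500 words is held at the 10-minute ceiling.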

generate_script

AI-powered script generation with Audio Tags.

{
    "topic": "Artificial Intelligence",
    "style": "interview",
    "duration_minutes": 5,
    "include_tags": true  # Includes Audio Tags for emotions
}

Example output:

Host: [excitedly] Welcome to Tech Talks! Today we're exploring AI.
Guest: [thoughtfully] This technology is transforming everything.
Host: [interrupting] —That's exactly what our listeners want to know!

generate_long_podcast

Handles long-form content (scripts over 3000 characters) with automatic batching.

{
    "script": "Very long podcast script...",
    "style": "narrative",
    "tone": "dramatic",
    "output_path": "output/long_episode.mp3"
}
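
One way to picture the automatic batching is to group whole script lines into chunks that stay under the 3000-character request limit. This is a minimal sketch under that assumption, not the server's actual implementation. The function name split_script is hypothetical, and breaking only at line boundaries is a design choice made here so that no speaker turn is cut in half.

```python
MAX_CHARS = 3000  # documented per-request character limit


def split_script(script: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Group whole script lines into batches of at most max_chars characters."""
    batches: list[str] = []
    current = ""
    for line in script.splitlines():
        if len(line) > max_chars:
            raise ValueError("a single line exceeds the per-request limit")
        candidate = f"{current}\n{line}" if current else line
        if len(candidate) > max_chars:
            batches.append(current)   # close the full batch
            current = line            # start the next batch with this line
        else:
            current = candidate
    if current:
        batches.append(current)
    return batches
```

Each batch could then be synthesized separately and the audio segments joined afterwards, for example with pydub.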

preview_podcast

Quick preview generation for testing voices and tones.

{
    "text": "[whispers] Testing the preview feature",
    "voice_id": "21m00Tcm4TlvDq8ikWAM",
    "tone": "dramatic"
}

Voice Management

list_voices

List all available voices from your ElevenLabs account.

Utility Tools

create_podcast_project

Create a structured project directory.

{
    "project_name": "MyPodcast",
    "description": "Weekly tech discussions"
}

🏷️ Audio Tags Reference

Audio Tags are wrapped in square brackets and control voice performance:

Emotions

  • [excited], [happy], [sad], [angry], [thoughtfully], [nervously]

Delivery

  • [whispers], [shouts], [quietly], [loudly]
  • [pause], [stammers], [rushed]

Reactions

  • [laughs], [sighs], [gasps], [clears throat], [chuckles]

Dialogue Dynamics

  • [interrupting], [overlapping], [jumping in]

Accents

  • [British accent], [French accent], [Australian accent]

Example Script with Audio Tags

Host: [excitedly] Welcome to our show! [pause] Today's topic is fascinating.
Guest: [thoughtfully] Indeed. [sighs] Let me explain why...
Host: [interrupting] —Actually, that reminds me of something!
Guest: [laughs] You always do that! [continuing] As I was saying...
Host: [whispers] Sorry, go ahead.
Guest: [normal voice] The key point is... [dramatically] Everything changes now!
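
To make the script format concrete, the following sketch splits a script like the one above into speaker turns and pulls out the Audio Tags. It assumes one "Speaker: text" turn per line and tags in square brackets. The names Turn and parse_script are hypothetical and are not part of the server.

```python
import re
from dataclasses import dataclass

TAG_RE = re.compile(r"\[([^\]]+)\]")                # captures the tag name
LINE_RE = re.compile(r"^\s*([^:\[\]]+):\s*(.*)$")   # "Speaker: rest of line"


@dataclass
class Turn:
    speaker: str
    tags: list[str]
    text: str  # spoken text with the tags removed


def parse_script(script: str) -> list[Turn]:
    """Split a tagged script into turns, one per labelled line."""
    turns = []
    for raw in script.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue  # skip blank or unlabelled lines
        speaker, body = match.group(1).strip(), match.group(2)
        tags = TAG_RE.findall(body)
        text = " ".join(TAG_RE.sub(" ", body).split())  # drop tags, tidy spaces
        turns.append(Turn(speaker, tags, text))
    return turns
```

A parsed turn exposes the speaker for voice selection and the tag list for checking which delivery cues a script relies on.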

🎨 Podcast Styles

Interview

Professional Q&A format with host and guest dynamics.

Narrative

Storytelling format with dramatic elements.

Discussion

Multi-participant roundtable with natural interruptions.

Educational

Clear, structured learning content.

Comedy

Humorous delivery with timing and sarcasm.

🎭 Tone Presets

Each tone adjusts voice parameters:

  • Professional: Balanced, clear delivery (stability: 0.7)
  • Casual: Relaxed, conversational (stability: 0.4)
  • Excited: High energy, enthusiastic (stability: 0.3)
  • Calm: Soothing, measured pace (stability: 0.8)
  • Dramatic: Theatrical, expressive (stability: 0.5)
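
The presets above amount to a lookup from tone name to a stability value. A minimal sketch follows. Only the stability numbers come from the list above. The names TONE_STABILITY and stability_for are hypothetical, and the server may adjust other voice parameters that are not documented here.

```python
# Stability values as listed in the tone presets
TONE_STABILITY = {
    "professional": 0.7,
    "casual": 0.4,
    "excited": 0.3,
    "calm": 0.8,
    "dramatic": 0.5,
}


def stability_for(tone: str) -> float:
    """Return the stability value for a tone preset, ignoring case."""
    key = tone.strip().lower()
    if key not in TONE_STABILITY:
        raise ValueError(f"unknown tone {tone!r}; expected one of {sorted(TONE_STABILITY)}")
    return TONE_STABILITY[key]
```

Lower stability leaves more room for expressive variation, which is consistent with excited having the lowest value and calm the highest.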

📚 Available Resources

  • voices://presets - Preset voice configurations
  • config://settings - Server configuration
  • templates://podcast-scripts - Script templates with Audio Tags

💡 Usage Examples

Simple Podcast with Emotion

client.call_tool("generate_podcast", {
    "script": "Host: [excitedly] Breaking news everyone!",
    "style": "interview",
    "tone": "excited"
})

Multi-Speaker with Interruptions

script = """
Host: [starting] So the main issue is—
Guest: [interrupting] —Actually, I disagree!
Host: [surprised] Oh? Tell me more.
Guest: [explaining] Well, when you consider...
"""

client.call_tool("generate_podcast", {
    "script": script,
    "style": "discussion"
})

Auto-Generated Script

# First generate the script
script = client.call_tool("generate_script", {
    "topic": "Space Exploration",
    "style": "narrative",
    "include_tags": True
})

# Then create the podcast
client.call_tool("generate_podcast", {
    "script": script,
    "auto_duration": True
})
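
With an asynchronous MCP client, the two calls need to be awaited, and the result of the first call may need unwrapping before it can be passed on as a script. The sketch below shows that flow under stated assumptions: client is any object with an awaitable call_tool(name, arguments) method, and to_text is a hypothetical hook for turning the tool result into plain text, since the exact return shape depends on the client library and is not documented here.

```python
import asyncio
from typing import Any, Callable


async def script_to_podcast(
    client: Any,
    topic: str,
    to_text: Callable[[Any], str] = str,  # hypothetical unwrapping hook
) -> Any:
    """Generate a script for a topic, then turn that script into a podcast."""
    script_result = await client.call_tool("generate_script", {
        "topic": topic,
        "style": "narrative",
        "include_tags": True,
    })
    script = to_text(script_result)  # unwrap the tool result into plain text
    return await client.call_tool("generate_podcast", {
        "script": script,
        "auto_duration": True,
    })
```

From synchronous code, this could be driven with asyncio.run(script_to_podcast(client, "Space Exploration")) once a connected client is available.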

⚙️ Configuration

Environment Variables

ELEVENLABS_API_KEY=your_api_key_here
ELEVENLABS_MODEL=eleven_v3  # ALWAYS use v3 for Audio Tags
MAX_DURATION_MINUTES=10
DEFAULT_SPEAKING_RATE=150
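
A server could read these variables at startup along the following lines. This is a sketch, not the project's actual loader: the function name load_settings and the returned key names are hypothetical, and falling back to the values shown above when a variable is missing is an assumption.

```python
import os
from collections.abc import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read the podcast server settings from environment variables."""
    api_key = env.get("ELEVENLABS_API_KEY", "").strip()
    if not api_key or api_key == "your_api_key_here":
        raise RuntimeError("ELEVENLABS_API_KEY is not set")
    return {
        "api_key": api_key,
        # Assumed fallbacks, mirroring the example values above
        "model": env.get("ELEVENLABS_MODEL", "eleven_v3"),
        "max_duration_minutes": int(env.get("MAX_DURATION_MINUTES", "10")),
        "speaking_rate": int(env.get("DEFAULT_SPEAKING_RATE", "150")),
    }
```

Failing early on a missing or placeholder key gives a clearer error than a rejected API request later on.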

Voice Defaults

  • Host: Rachel (21m00Tcm4TlvDq8ikWAM)
  • Guest: Drew (29vD33N1CtxCmqQRPOHJ)
  • Narrator: Bella (EXAVITQu4vr4xnSDxMaL)

🔧 Development

Project Structure

elevenlabs-podcast-mcp/
├── server.py              # Main MCP server with all tools
├── requirements.txt       # Python dependencies
├── .env.example          # Environment template
├── CLAUDE.md             # AI context documentation
├── README.md             # This file
└── ai-docs/              # Additional documentation

Adding Custom Tools

from typing import Dict

# `mcp` is assumed to be the FastMCP server instance created in server.py
@mcp.tool
async def your_custom_tool(param: str) -> Dict:
    """Your tool description."""
    # Implementation
    return {"result": "success"}

Testing

# Inspect available tools
fastmcp inspect server.py

# Test a specific tool interactively
fastmcp dev server.py

📋 Requirements

  • Python 3.11+
  • ElevenLabs API key (v3 access required)
  • FastMCP framework
  • pydub (for audio processing)

⚠️ Important Notes

  1. Always use eleven_v3 model for Audio Tags support
  2. Character limit: 3000 per request (auto-batching for longer content)
  3. Professional Voice Clones (PVCs) are not yet fully optimized for v3
  4. Recommended: Use Instant Voice Clones (IVC) or designed voices

🐛 Troubleshooting

Rate Limiting

The server includes automatic retry with exponential backoff.
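
For readers unfamiliar with the pattern, retry with exponential backoff doubles the wait after each failed attempt. The sketch below illustrates the general technique only. The retry count, base delay, and jitter are assumptions, and with_backoff is a hypothetical helper rather than the server's own code.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_backoff(
    call: Callable[[], T],
    retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying failures with waits of about 1s, 2s, 4s, and so on."""
    for attempt in range(retries):
        try:
            return call()
        except Exception:  # a real server would retry only on rate-limit errors
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** attempt
            sleep(delay + random.uniform(0, delay / 10))  # small jitter
    raise ValueError("retries must be at least 1")
```

The sleep parameter is injectable so the behavior can be exercised in tests without actually waiting.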

Long Content

Use generate_long_podcast for content >3000 characters.

Audio Tags Not Working

Ensure you're using eleven_v3 model, not eleven_turbo_v2_5.

📄 License

MIT

💬 Support

For issues or questions, please open a GitHub issue.