CodingButter/speak2me-mcp
Speak2Me-MCP is a Model Context Protocol server that adds voice capabilities to Claude Code and other MCP clients through text-to-speech (TTS) and speech-to-text (STT).
speak2me-mcp
Voice MCP Server with STT/TTS capabilities - Elysia backend + React PWA frontend
A Model Context Protocol (MCP) server that adds voice capabilities to Claude Code and other MCP clients. Speak text using high-quality TTS (ElevenLabs) with SSML enrichment (OpenAI), and listen to voice input with STT (Google Gemini).
Features
- Voice Input: Capture and transcribe voice using Google Gemini STT with VAD and chunking
- Voice Output: Convert text to speech using ElevenLabs with OpenAI-powered SSML enrichment
- MCP Integration: Two tools (speak and listen) accessible from Claude Code and other MCP clients
- PWA Interface: React-based operator console with conversation history and audio replay
- Multi-Session: Support multiple concurrent MCP connections with separate conversation histories
- Tested: 81 tests covering schemas, tools, storage, and session management
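To illustrate the multi-session idea, here is a minimal sketch of keeping one history per conversation ID. The names SessionStore and Message are hypothetical and are not taken from the speak2me-mcp source, which persists conversations through its Prisma storage layer rather than in memory.

```typescript
// Hypothetical in-memory sketch; the real project stores history via Prisma.
interface Message {
  role: "user" | "assistant";
  text: string;
}

class SessionStore {
  private histories = new Map<string, Message[]>();

  /** Append a message to the history of one conversation. */
  append(conversationId: string, message: Message): void {
    const history = this.histories.get(conversationId) ?? [];
    history.push(message);
    this.histories.set(conversationId, history);
  }

  /** Return a copy so callers cannot mutate the stored history. */
  history(conversationId: string): Message[] {
    return [...(this.histories.get(conversationId) ?? [])];
  }
}

const store = new SessionStore();
store.append("project-a", { role: "assistant", text: "Build finished." });
store.append("project-b", { role: "user", text: "Run the tests." });
console.log(store.history("project-a").length); // 1
console.log(store.history("project-c").length); // 0 (unknown conversation)
```

Keying everything by conversation ID means two MCP clients connected at the same time never see each other's messages, which is the behavior the feature list describes.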
Architecture
This is a Bun monorepo with:
- Backend (apps/backend): Elysia server with MCP SSE endpoints
- Frontend (apps/frontend): React PWA with audio controls and conversation UI
- Packages:
  - core: MCP tools, AI services (TTS/STT/SSML), session management
  - database: Prisma storage layer for conversations and messages
  - shared: Zod schemas and TypeScript types
  - platform: Web/Electron adapters
  - ui: Shared React components
Quick Start
Prerequisites
- Bun (v1.0+)
- API Keys:
  - OpenAI (for SSML enrichment)
  - ElevenLabs (for TTS)
  - Google Gemini (for STT)
Installation
# Clone the repo
git clone https://github.com/CodingButter/speak2me-mcp.git
cd speak2me-mcp
# Install dependencies
bun install
# Set up database
cd packages/database
bun run db:generate
bun run db:push
cd ../..
# Configure API keys (backend)
cp apps/backend/.env.example apps/backend/.env
# Edit apps/backend/.env with your API keys
Development
# Start both backend and frontend
bun run dev
# Or start individually
bun run dev:backend # Backend on http://localhost:3000
bun run dev:frontend # Frontend on http://localhost:5173
Testing
# Run all tests
bun test
# Watch mode
bun test:watch
# Coverage
bun test:coverage
MCP Integration
Connect Claude Code (or other MCP clients) to the voice server:
1. Start the backend
bun run dev:backend
2. Add to your project's .mcp.json
{
  "mcpServers": {
    "voice": {
      "url": "http://localhost:3000/sse/my-project-id"
    }
  }
}
Each project can have its own conversationId (the last path segment) to maintain separate histories.
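The URL convention above can be sketched as a pair of small helpers: one builds the SSE URL and the .mcp.json content for a project, the other recovers the conversation ID from a request path. The function names are hypothetical, and the assumption that an ID is a single URL-encoded path segment is mine, not something the project documents.

```typescript
// Hypothetical helpers; these names are illustrative, not part of speak2me-mcp.
const SSE_PREFIX = "/sse/";

/** Build the SSE endpoint URL for a given conversation ID. */
function buildSseUrl(baseUrl: string, conversationId: string): string {
  const trimmed = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${trimmed}${SSE_PREFIX}${encodeURIComponent(conversationId)}`;
}

/** Build the .mcp.json content for one project. */
function buildMcpConfig(baseUrl: string, conversationId: string) {
  return { mcpServers: { voice: { url: buildSseUrl(baseUrl, conversationId) } } };
}

/** Extract the conversation ID from a path such as "/sse/my-project-id". */
function conversationIdFromPath(pathname: string): string | null {
  if (!pathname.startsWith(SSE_PREFIX)) return null;
  const segment = pathname.slice(SSE_PREFIX.length);
  // Assumption: the ID is exactly one non-empty path segment.
  if (segment === "" || segment.includes("/")) return null;
  return decodeURIComponent(segment);
}

console.log(JSON.stringify(buildMcpConfig("http://localhost:3000", "my-project-id"), null, 2));
console.log(conversationIdFromPath("/sse/my-project-id")); // "my-project-id"
```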
3. Use the tools
Claude Code will auto-discover two tools:
speak - Convert text to speech
{
  text: string,      // Required: text to speak
  ssml?: string,     // Optional: provide your own SSML
  voiceId?: string,  // Optional: ElevenLabs voice ID
  model?: string,    // Optional: ElevenLabs model
  stream?: boolean   // Optional: stream audio (default: true)
}
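As a rough illustration of how a caller might check a speak payload before sending it, the sketch below mirrors the documented parameters and applies the documented default for stream. The function name parseSpeakInput and the rule that empty text is rejected are assumptions; the project itself validates with Zod schemas in the shared package, which this hand-rolled check only approximates.

```typescript
// Mirrors the documented `speak` parameters; the validation rules are assumptions.
interface SpeakInput {
  text: string;
  ssml?: string;
  voiceId?: string;
  model?: string;
  stream?: boolean;
}

/** Check an untyped payload and fill in the documented default for `stream`. */
function parseSpeakInput(raw: Record<string, unknown>): SpeakInput {
  const text = raw["text"];
  if (typeof text !== "string" || text.trim() === "") {
    throw new Error("speak: 'text' is required and must be a non-empty string");
  }
  const input: SpeakInput = { text, stream: true }; // stream defaults to true
  for (const key of ["ssml", "voiceId", "model"] as const) {
    const value = raw[key];
    if (value === undefined) continue;
    if (typeof value !== "string") throw new Error(`speak: '${key}' must be a string`);
    input[key] = value;
  }
  const stream = raw["stream"];
  if (stream !== undefined) {
    if (typeof stream !== "boolean") throw new Error("speak: 'stream' must be a boolean");
    input.stream = stream;
  }
  return input;
}

console.log(parseSpeakInput({ text: "Tests passed." }));
```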
listen - Capture and transcribe voice
{
  mode: "auto" | "manual" | "ptt",  // Required: listening mode
  vadThreshold?: number,            // Optional: VAD threshold (0-1)
  minSilenceMs?: number,            // Optional: silence duration
  maxUtteranceMs?: number,          // Optional: max utterance length
  locale?: string                   // Optional: e.g., "en-US"
}
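How the three VAD parameters interact is not spelled out here. One plausible reading, sketched below, is that a frame counts as speech when its probability reaches vadThreshold, an utterance closes after minSilenceMs of silence, and maxUtteranceMs caps its length. The function segmentUtterances and the per-frame probability input are hypothetical; the frontend relies on @ricky0123/vad-web for detection, and this is not its algorithm.

```typescript
// Hypothetical segmentation sketch; not the algorithm used by @ricky0123/vad-web.
interface VadOptions {
  vadThreshold: number;   // speech probability cutoff (0-1)
  minSilenceMs: number;   // silence needed to close an utterance
  maxUtteranceMs: number; // hard cap on utterance length
}

interface Utterance {
  startMs: number;
  endMs: number;
}

/** Split a stream of per-frame speech probabilities into utterances. */
function segmentUtterances(probs: number[], frameMs: number, opts: VadOptions): Utterance[] {
  const out: Utterance[] = [];
  let start = -1;        // start of the open utterance, -1 when none is open
  let lastSpeechEnd = 0; // end time of the most recent speech frame
  probs.forEach((p, i) => {
    const frameStart = i * frameMs;
    const frameEnd = frameStart + frameMs;
    if (p >= opts.vadThreshold) {
      if (start < 0) start = frameStart;
      lastSpeechEnd = frameEnd;
    }
    if (start < 0) return;
    const silentFor = frameEnd - lastSpeechEnd;
    const tooLong = frameEnd - start >= opts.maxUtteranceMs;
    if (silentFor >= opts.minSilenceMs || tooLong) {
      out.push({ startMs: start, endMs: lastSpeechEnd });
      start = -1;
    }
  });
  // Flush an utterance that was still open when the input ended.
  if (start >= 0) out.push({ startMs: start, endMs: lastSpeechEnd });
  return out;
}

const opts = { vadThreshold: 0.5, minSilenceMs: 200, maxUtteranceMs: 10000 };
// Two utterances: 100-300 ms and 500-600 ms.
console.log(segmentUtterances([0.1, 0.9, 0.9, 0.2, 0.2, 0.9], 100, opts));
```

Under this reading, raising minSilenceMs merges speech separated by short pauses into one utterance, while lowering maxUtteranceMs forces long monologues to be split into several chunks.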
Project Structure
speak2me-mcp/
├── apps/
│   ├── backend/              # Elysia MCP server
│   │   ├── src/
│   │   │   ├── index.ts      # Main server
│   │   │   ├── mcp/          # SSE transport, tool handlers
│   │   │   └── api/          # REST endpoints
│   │   └── package.json
│   └── frontend/             # React PWA
│       ├── src/
│       │   ├── components/   # UI components
│       │   ├── hooks/        # Audio capture hooks
│       │   └── services/     # Audio encoding
│       └── package.json
├── packages/
│   ├── core/                 # MCP tools & services
│   │   └── src/
│   │       ├── mcp/          # handleSpeak, handleListen
│   │       ├── services/     # TTS, STT, SSML enhancer
│   │       ├── session/      # SessionManager
│   │       └── operations/   # CoreOperations
│   ├── database/             # Prisma storage
│   │   ├── prisma/
│   │   │   └── schema.prisma
│   │   └── src/storage.ts
│   ├── shared/               # Schemas & types
│   │   └── src/
│   │       ├── schemas.ts    # Zod schemas
│   │       └── types.ts      # TypeScript types
│   ├── platform/             # Web/Electron adapters
│   ├── ui/                   # Shared components
│   └── config/               # Shared config
└── package.json              # Root workspace
Scripts
Root Level
- bun run dev - Start both apps in dev mode
- bun run dev:backend - Start backend only
- bun run dev:frontend - Start frontend only
- bun run build - Build all apps
- bun test - Run all tests
- bun run typecheck - Type check all packages
- bun run lint - Lint all packages
- bun run format - Format code with Prettier
Backend
- bun run dev - Dev with hot reload
- bun run build - Build for production
- bun run start - Start production build
- bun test - Run backend tests
Frontend
- bun run dev - Dev server
- bun run build - Build for production
- bun run preview - Preview production build
- bun test - Run frontend tests
Database
- bun run db:generate - Generate Prisma client
- bun run db:push - Push schema to database
- bun run db:migrate - Create migration
- bun run db:studio - Open Prisma Studio
Git Hooks
This project uses pre-push hooks to ensure code quality:
- Pre-push: Runs all tests before allowing push to remote
- Tests must pass before code can be pushed
- Located in .git/hooks/pre-push
API Keys Configuration
Keys can be stored two ways:
Server-side (Recommended for self-hosted)
Create apps/backend/.env:
OPENAI_API_KEY=sk-...
ELEVENLABS_API_KEY=...
GEMINI_API_KEY=...
Client-side (PWA UI)
Users can enter keys in the PWA Settings panel. Keys are stored per conversation.
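With keys available from two places, something has to decide which one is used for a request. The sketch below assumes that per-conversation keys entered in the PWA take precedence over server-side values; that precedence is my assumption, and the names resolveKeys and missingProviders are hypothetical rather than part of the project. Only the three environment variable names come from the configuration above.

```typescript
// Hypothetical precedence: per-conversation keys override server-side values.
type Provider = "openai" | "elevenlabs" | "gemini";
type ApiKeys = Partial<Record<Provider, string>>;

// Environment variable names as listed for apps/backend/.env.
const ENV_NAMES: Record<Provider, string> = {
  openai: "OPENAI_API_KEY",
  elevenlabs: "ELEVENLABS_API_KEY",
  gemini: "GEMINI_API_KEY",
};

const PROVIDERS = Object.keys(ENV_NAMES) as Provider[];

/** Merge server-side and per-conversation keys into one set. */
function resolveKeys(
  env: Record<string, string | undefined>,
  conversationKeys: ApiKeys = {},
): ApiKeys {
  const resolved: ApiKeys = {};
  for (const provider of PROVIDERS) {
    const key = conversationKeys[provider] || env[ENV_NAMES[provider]];
    if (key) resolved[provider] = key;
  }
  return resolved;
}

/** List providers that still have no key after merging. */
function missingProviders(keys: ApiKeys): Provider[] {
  return PROVIDERS.filter((provider) => !keys[provider]);
}

const keys = resolveKeys({ OPENAI_API_KEY: "sk-server" }, { openai: "sk-conv", gemini: "g-conv" });
console.log(keys.openai);            // "sk-conv" (conversation key wins)
console.log(missingProviders(keys)); // ["elevenlabs"]
```

In a Bun or Node process, process.env could be passed as the first argument. Reporting the missing providers up front lets a tool call fail with a clear message instead of an opaque upstream error.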
Documentation
- CLAUDE.md - Instructions for Claude Code when working in this repo
- Project Scope Document.md - Full product requirements and architecture
Tech Stack
- Runtime: Bun
- Backend: Elysia, @modelcontextprotocol/sdk, Prisma
- Frontend: React 18, Zustand, TailwindCSS, @ricky0123/vad-web
- AI Services: OpenAI, ElevenLabs, Google Gemini
- Validation: Zod
- Testing: Bun Test
Contributing
- Fork the repo
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Run tests (bun test)
- Commit (git commit -m 'Add amazing feature')
- Push to your fork (git push origin feature/amazing-feature)
- Open a Pull Request
License
MIT
Credits
Built with Claude Code