SaharshPamecha/livekit-gemini-mcp-prototype
The Model Context Protocol (MCP) server is a FastAPI-based server designed to facilitate MongoDB operations through a structured JSON-RPC style API, enabling efficient user data management and interaction logging.
Voice AI Microservices with LiveKit, Gemini, and MongoDB MCP
A production-ready microservices architecture featuring a real-time voice AI agent powered by LiveKit and Google Gemini, integrated with a Model Context Protocol (MCP) server for MongoDB user data operations.
Architecture Overview
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│                 │     │                 │     │                 │
│  LiveKit Room   │◄───►│   Voice Agent   │◄───►│   MCP Server    │
│    (WebRTC)     │     │  (Gemini + LK)  │     │    (FastAPI)    │
│                 │     │                 │     │                 │
└─────────────────┘     └─────────────────┘     └────────┬────────┘
                                                         │
                                                         ▼
                                                ┌─────────────────┐
                                                │                 │
                                                │     MongoDB     │
                                                │   (Users DB)    │
                                                │                 │
                                                └─────────────────┘
Components
1. MCP Server (mcp-server/)
A FastAPI-based server implementing the Model Context Protocol pattern for MongoDB operations.
Features:
- JSON-RPC style API for function calls
- User CRUD operations
- Preference management
- Interaction history tracking
- Async MongoDB operations with Motor
Exposed Functions:
- get_user_by_id: Fetch user by unique ID
- get_user_by_email: Fetch user by email
- get_user_preferences: Get user settings
- update_user_preferences: Update user settings
- get_user_history: Get interaction history
- log_interaction: Log user interactions
- create_user: Create new user
- list_users: List all users with pagination
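To make the "JSON-RPC style" pattern concrete, here is a minimal sketch of how a server could route a request's method name to one of the functions listed above. It is not the repository's actual server code: the in-memory FAKE_USERS store, the HANDLERS table, and the dispatch function are hypothetical names introduced for illustration, and the error codes are the ones defined by JSON-RPC 2.0, which this server may or may not use.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical in-memory stand-in for the MongoDB users collection.
FAKE_USERS = {"user_001": {"user_id": "user_001", "name": "Example User"}}


async def get_user_by_id(user_id: str) -> dict[str, Any]:
    """Illustrative handler; a real one would query MongoDB through Motor."""
    user = FAKE_USERS.get(user_id)
    if user is None:
        raise KeyError(f"user {user_id!r} not found")
    return user


# Dispatch table: JSON-RPC method name -> async handler (assumed design).
HANDLERS: dict[str, Callable[..., Awaitable[Any]]] = {
    "get_user_by_id": get_user_by_id,
}


async def dispatch(request: dict[str, Any]) -> dict[str, Any]:
    """Route one JSON-RPC 2.0 request to its handler and wrap the outcome."""
    request_id = request.get("id")
    handler = HANDLERS.get(request.get("method"))
    if handler is None:
        # -32601 is the JSON-RPC 2.0 "Method not found" code.
        return {"jsonrpc": "2.0", "id": request_id,
                "error": {"code": -32601, "message": "Method not found"}}
    try:
        result = await handler(**(request.get("params") or {}))
    except Exception as exc:
        # -32000 lies in the range JSON-RPC reserves for server-defined errors.
        return {"jsonrpc": "2.0", "id": request_id,
                "error": {"code": -32000, "message": str(exc)}}
    return {"jsonrpc": "2.0", "id": request_id, "result": result}
```

Calling asyncio.run(dispatch({"jsonrpc": "2.0", "method": "get_user_by_id", "params": {"user_id": "user_001"}, "id": "1"})) returns a response whose "result" holds the user document and whose "id" echoes the request.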
2. Voice Agent (voice-agent/)
A LiveKit agent using Google Gemini's real-time multimodal API for voice conversations.
Features:
- Real-time voice conversations with Gemini 2.0
- Automatic user identification from LiveKit participant identity
- MCP function calling for user-specific data
- Personalized responses based on user context
- Interaction logging
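The identification and function-calling features above can be sketched as follows. This is an assumption about how the pieces might fit together, not the agent's real code: build_mcp_call and USER_SCOPED are hypothetical names, and the sketch assumes the LiveKit participant identity is used directly as the MCP user_id. Only functions whose user_id parameter appears in this README's curl examples are listed as user-scoped.

```python
from typing import Any

# Functions shown in this README taking a "user_id" parameter.
USER_SCOPED = {"get_user_by_id", "get_user_preferences", "log_interaction"}


def build_mcp_call(identity: str, function_name: str,
                   arguments: dict[str, Any], request_id: str) -> dict[str, Any]:
    """Build a JSON-RPC payload for a tool call made on behalf of one participant."""
    params = dict(arguments)  # copy so the caller's dict is not mutated
    if function_name in USER_SCOPED:
        # Always use the authenticated participant identity, even if the
        # model supplied a different user_id in its tool-call arguments.
        params["user_id"] = identity
    return {"jsonrpc": "2.0", "method": function_name,
            "params": params, "id": request_id}
```

Overriding any model-supplied user_id with the participant identity is a deliberate choice in this sketch: it keeps a conversation from reading or writing another user's data.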
Prerequisites
- Docker and Docker Compose
- Google Cloud API key with Gemini API access
- (Optional) LiveKit Cloud account for production
Quick Start
1. Clone and Configure
cd "CAI with MongoDB MCP"
# Copy environment template
cp .env.example .env
# Edit .env with your credentials
nano .env
2. Set Required Environment Variables
# Required: Google Gemini API Key
GOOGLE_API_KEY=your_google_api_key_here
# Optional: Change MongoDB credentials
MONGO_ROOT_USERNAME=admin
MONGO_ROOT_PASSWORD=your_secure_password
3. Start Services
# Start all services
docker-compose up -d
# View logs
docker-compose logs -f
# Check service health
docker-compose ps
4. Verify Services
# Check MCP Server health
curl http://localhost:8080/health
# List available MCP functions
curl http://localhost:8080/functions
# Test user lookup
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc": "2.0", "method": "get_user_by_id", "params": {"user_id": "user_001"}, "id": "1"}'
Development
Running Services Individually
MCP Server:
cd mcp-server
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Set environment variables
export MONGODB_URI="mongodb://localhost:27017"
export MONGODB_DATABASE="voice_ai_db"
# Run server
uvicorn src.server:app --reload --port 8080
Voice Agent:
cd voice-agent
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Set environment variables
export LIVEKIT_URL="ws://localhost:7880"
export LIVEKIT_API_KEY="devkey"
export LIVEKIT_API_SECRET="secret"
export GOOGLE_API_KEY="your_key"
export MCP_SERVER_URL="http://localhost:8080"
# Run agent
python -m src.main start
Testing MCP Functions
# Create a user
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "create_user",
    "params": {
      "user_id": "test_user",
      "name": "Test User",
      "email": "test@example.com"
    },
    "id": "1"
  }'
# Get user preferences
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "get_user_preferences",
    "params": {"user_id": "user_001"},
    "id": "2"
  }'
# Log an interaction
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "log_interaction",
    "params": {
      "user_id": "user_001",
      "interaction_type": "voice_query",
      "content": "What is my account balance?"
    },
    "id": "3"
  }'
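The same calls can be made from Python. The sketch below uses only the standard library and mirrors the curl commands above; build_request and call_mcp are hypothetical helper names, and the shape of the server's response is not documented here, so the code simply returns the decoded JSON.

```python
import json
import urllib.request
from typing import Any

MCP_URL = "http://localhost:8080/mcp"  # endpoint used by the curl examples


def build_request(method: str, params: dict[str, Any],
                  request_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for one MCP function call."""
    body = json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": request_id,
    }).encode("utf-8")
    return urllib.request.Request(
        MCP_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_mcp(method: str, params: dict[str, Any],
             request_id: str = "1", timeout: float = 5.0) -> dict[str, Any]:
    """Send the request to a running MCP server and return the decoded JSON reply."""
    request = build_request(method, params, request_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the services running, call_mcp("get_user_preferences", {"user_id": "user_001"}, "2") corresponds to the second curl command.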
Connecting to the Voice Agent
Using LiveKit Meet (Quick Test)
- Go to LiveKit Meet
- Enter your LiveKit server URL: ws://localhost:7880
- Use API key "devkey" and secret "secret"
- Join a room; the agent will automatically connect
Programmatic Connection
from livekit import api

# Generate a token for a user.
# The chained calls are wrapped in parentheses so they can span several lines.
token = (
    api.AccessToken(api_key="devkey", api_secret="secret")
    .with_identity("user_001")  # This ID is used to identify the user
    .with_grants(api.VideoGrants(room_join=True, room="my-room"))
    .to_jwt()
)
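When debugging user identification, it can help to inspect what a generated token actually contains. The sketch below is a generic JWT payload decoder built on the standard library; decode_jwt_claims is a hypothetical helper, it does not verify the signature, and which claim carries the identity (commonly the standard "sub" claim) is an assumption to check against the LiveKit documentation.

```python
import base64
import json
from typing import Any


def decode_jwt_claims(token: str) -> dict[str, Any]:
    """Return the payload claims of a JWT WITHOUT verifying its signature.

    For debugging only: never trust claims from an unverified token.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    # JWT segments are base64url-encoded without padding; restore it first.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

Passing the token string produced above to decode_jwt_claims returns its claims as a dictionary, which can then be printed and compared with the identity passed to with_identity.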
Project Structure
CAI with MongoDB MCP/
├── docker-compose.yml          # Docker orchestration
├── .env.example                # Environment template
├── .gitignore
├── README.md
├── scripts/
│   └── mongo-init.js           # MongoDB initialization
├── mcp-server/
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── .env.example
│   └── src/
│       ├── __init__.py
│       ├── config.py           # Configuration management
│       ├── database.py         # MongoDB connection & repos
│       ├── mcp_functions.py    # MCP function implementations
│       └── server.py           # FastAPI server
└── voice-agent/
    ├── Dockerfile
    ├── requirements.txt
    ├── .env.example
    └── src/
        ├── __init__.py
        ├── config.py           # Configuration management
        ├── mcp_client.py       # MCP server client
        ├── function_handler.py # Function call routing
        ├── tools.py            # LiveKit tool definitions
        ├── agent.py            # Agent implementation
        └── main.py             # Entry point
Configuration Reference
MCP Server Environment Variables
| Variable | Description | Default |
|---|---|---|
| MONGODB_URI | MongoDB connection string | mongodb://localhost:27017 |
| MONGODB_DATABASE | Database name | voice_ai_db |
| MONGODB_USERS_COLLECTION | Users collection name | users |
| MCP_SERVER_HOST | Server bind host | 0.0.0.0 |
| MCP_SERVER_PORT | Server port | 8080 |
| LOG_LEVEL | Logging level | INFO |
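As an illustration of how these variables and defaults could be consumed, the following sketch loads them into a frozen dataclass. It is not the contents of mcp-server/src/config.py: McpServerSettings and from_env are hypothetical names, and only the variable names and default values are taken from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class McpServerSettings:
    """Settings for the MCP server; defaults mirror the table above."""

    mongodb_uri: str = "mongodb://localhost:27017"
    mongodb_database: str = "voice_ai_db"
    mongodb_users_collection: str = "users"
    host: str = "0.0.0.0"
    port: int = 8080
    log_level: str = "INFO"

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "McpServerSettings":
        """Read each variable from env, falling back to the documented default."""
        defaults = cls()
        return cls(
            mongodb_uri=env.get("MONGODB_URI", defaults.mongodb_uri),
            mongodb_database=env.get("MONGODB_DATABASE", defaults.mongodb_database),
            mongodb_users_collection=env.get(
                "MONGODB_USERS_COLLECTION", defaults.mongodb_users_collection
            ),
            host=env.get("MCP_SERVER_HOST", defaults.host),
            # Environment values are strings, so the port is converted explicitly.
            port=int(env.get("MCP_SERVER_PORT", defaults.port)),
            log_level=env.get("LOG_LEVEL", defaults.log_level),
        )
```

Accepting the environment as a mapping argument keeps the loader easy to exercise with a plain dictionary instead of the real process environment.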
Voice Agent Environment Variables
| Variable | Description | Default |
|---|---|---|
| LIVEKIT_URL | LiveKit server URL | ws://localhost:7880 |
| LIVEKIT_API_KEY | LiveKit API key | - |
| LIVEKIT_API_SECRET | LiveKit API secret | - |
| GOOGLE_API_KEY | Google Gemini API key | - |
| GEMINI_MODEL | Gemini model name | gemini-2.0-flash-exp |
| GEMINI_VOICE | Voice for TTS | Puck |
| GEMINI_TEMPERATURE | Response temperature | 0.7 |
| MCP_SERVER_URL | MCP server URL | http://localhost:8080 |
| LOG_LEVEL | Logging level | INFO |
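Three of these variables have no default, so a missing value is best caught at startup. The sketch below shows one way to fail fast. It assumes that a "-" default in the table means the variable is required, and load_voice_agent_env, REQUIRED, and DEFAULTS are hypothetical names rather than the agent's real configuration code.

```python
import os
from typing import Mapping

# Assumption: variables listed with "-" as their default are required.
REQUIRED = ("LIVEKIT_API_KEY", "LIVEKIT_API_SECRET", "GOOGLE_API_KEY")

# Defaults copied from the table above (kept as strings, like real env values).
DEFAULTS = {
    "LIVEKIT_URL": "ws://localhost:7880",
    "GEMINI_MODEL": "gemini-2.0-flash-exp",
    "GEMINI_VOICE": "Puck",
    "GEMINI_TEMPERATURE": "0.7",
    "MCP_SERVER_URL": "http://localhost:8080",
    "LOG_LEVEL": "INFO",
}


def load_voice_agent_env(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the resolved settings, or raise if a required variable is unset."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    resolved = dict(DEFAULTS)
    # Values present in the environment override the documented defaults.
    resolved.update({k: env[k] for k in (*REQUIRED, *DEFAULTS) if env.get(k)})
    return resolved
```

Reporting every missing variable in a single error avoids fixing them one restart at a time.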
Production Deployment
Security Considerations
- MongoDB: Use strong passwords, enable authentication, consider TLS
- LiveKit: Use LiveKit Cloud or secure your self-hosted instance
- API Keys: Never commit API keys, use secrets management
- Network: Use private networks between services
Scaling
- MCP Server: Stateless, can be horizontally scaled
- Voice Agent: Scale based on concurrent room requirements
- MongoDB: Use replica sets for high availability
Troubleshooting
Common Issues
MCP Server can't connect to MongoDB:
# Check MongoDB is running
docker-compose ps mongodb
# Check MongoDB logs
docker-compose logs mongodb
Voice Agent can't connect to MCP Server:
# Verify MCP server is healthy
curl http://localhost:8080/health
# Check network connectivity
docker-compose exec voice-agent curl http://mcp-server:8080/health
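If the voice agent starts before the MCP server is ready, a short retry loop around the health check can rule out a startup race. The sketch below is one possible approach, not part of the repository: http_ok and wait_until_healthy are hypothetical helpers, and the only detail taken from this README is the /health endpoint.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False


def wait_until_healthy(probe: Callable[[], bool], attempts: int = 10,
                       delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call probe up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For example, wait_until_healthy(lambda: http_ok("http://localhost:8080/health")) returns True once the server responds and False if it never does within the allotted attempts.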
Gemini API errors:
- Verify your GOOGLE_API_KEY is valid
- Check API quotas in Google Cloud Console
- Ensure Gemini API is enabled for your project
License
MIT License - See LICENSE file for details.