livekit-gemini-mcp-prototype

SaharshPamecha/livekit-gemini-mcp-prototype


The Model Context Protocol (MCP) server in this project is a FastAPI service that exposes MongoDB operations through a JSON-RPC style API, covering user data management and interaction logging.

Voice AI Microservices with LiveKit, Gemini, and MongoDB MCP

A production-ready microservices architecture featuring a real-time voice AI agent powered by LiveKit and Google Gemini, integrated with a Model Context Protocol (MCP) server for MongoDB user data operations.

Architecture Overview

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│                 │     │                 │     │                 │
│  LiveKit Room   │◄───►│  Voice Agent    │◄───►│   MCP Server    │
│   (WebRTC)      │     │ (Gemini + LK)   │     │  (FastAPI)      │
│                 │     │                 │     │                 │
└─────────────────┘     └─────────────────┘     └────────┬────────┘
                                                         │
                                                         ▼
                                                ┌─────────────────┐
                                                │                 │
                                                │    MongoDB      │
                                                │   (Users DB)    │
                                                │                 │
                                                └─────────────────┘

Components

1. MCP Server (mcp-server/)

A FastAPI-based server implementing the Model Context Protocol pattern for MongoDB operations.

Features:

  • JSON-RPC style API for function calls
  • User CRUD operations
  • Preference management
  • Interaction history tracking
  • Async MongoDB operations with Motor

Exposed Functions:

  • get_user_by_id - Fetch user by unique ID
  • get_user_by_email - Fetch user by email
  • get_user_preferences - Get user settings
  • update_user_preferences - Update user settings
  • get_user_history - Get interaction history
  • log_interaction - Log user interactions
  • create_user - Create new user
  • list_users - List all users with pagination
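As a rough illustration of the JSON-RPC style call pattern, the sketch below builds a request envelope and posts it to the server's /mcp endpoint using only the standard library. The helper names build_request and call_mcp are hypothetical and not part of this repository; the envelope fields mirror the curl examples in this README.

```python
import json
import urllib.request
from typing import Any


def build_request(method: str, params: dict[str, Any], request_id: str = "1") -> dict[str, Any]:
    """Build a JSON-RPC 2.0 style envelope for one MCP function call."""
    return {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}


def call_mcp(base_url: str, method: str, params: dict[str, Any], timeout: float = 5.0) -> dict[str, Any]:
    """POST the envelope to <base_url>/mcp and return the decoded JSON body.

    Hypothetical helper: the response is returned as-is because its exact
    shape is not documented in this README.
    """
    body = json.dumps(build_request(method, params)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/mcp",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running locally, call_mcp("http://localhost:8080", "get_user_by_id", {"user_id": "user_001"}) should be equivalent to the curl lookup shown under Quick Start.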

2. Voice Agent (voice-agent/)

A LiveKit agent using Google Gemini's real-time multimodal API for voice conversations.

Features:

  • Real-time voice conversations with Gemini 2.0
  • Automatic user identification from LiveKit participant identity
  • MCP function calling for user-specific data
  • Personalized responses based on user context
  • Interaction logging
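One way the identity-based function calling could work is sketched below: the participant identity from the LiveKit room is injected as user_id into every user-scoped MCP call, so the model cannot address another user's data. This is an illustrative assumption, not the repository's actual code; route_tool_call, USER_SCOPED_TOOLS, and the send callback are hypothetical names.

```python
from typing import Any, Callable

# Assumed subset of MCP functions that operate on a single user's data.
USER_SCOPED_TOOLS = {
    "get_user_preferences",
    "update_user_preferences",
    "get_user_history",
    "log_interaction",
}


def route_tool_call(
    identity: str,
    tool_name: str,
    arguments: dict[str, Any],
    send: Callable[[str, dict[str, Any]], Any],
) -> Any:
    """Forward a model-requested tool call to the MCP server for one user.

    `send` is any callable that takes (method, params) and performs the
    request, for example a thin wrapper around an HTTP client.
    """
    if tool_name not in USER_SCOPED_TOOLS:
        raise ValueError(f"unsupported tool: {tool_name}")
    # The participant identity always wins over a model-supplied user_id.
    params = {**arguments, "user_id": identity}
    return send(tool_name, params)
```

Passing send as a parameter keeps the routing logic testable without a running MCP server.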

Prerequisites

  • Docker and Docker Compose
  • Google Cloud API key with Gemini API access
  • (Optional) LiveKit Cloud account for production

Quick Start

1. Clone and Configure

cd "CAI with MongoDB MCP"

# Copy environment template
cp .env.example .env

# Edit .env with your credentials
nano .env

2. Set Required Environment Variables

# Required: Google Gemini API Key
GOOGLE_API_KEY=your_google_api_key_here

# Optional: Change MongoDB credentials
MONGO_ROOT_USERNAME=admin
MONGO_ROOT_PASSWORD=your_secure_password

3. Start Services

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f

# Check service health
docker-compose ps

4. Verify Services

# Check MCP Server health
curl http://localhost:8080/health

# List available MCP functions
curl http://localhost:8080/functions

# Test user lookup
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc": "2.0", "method": "get_user_by_id", "params": {"user_id": "user_001"}, "id": "1"}'
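When scripting these checks, it can help to separate successful results from errors. The sketch below assumes the server replies in the standard JSON-RPC 2.0 shape (a "result" member on success, an "error" object with "code" and "message" on failure); this README does not document the response format, so treat that as an assumption. McpError and unwrap_response are hypothetical names.

```python
from typing import Any, Optional


class McpError(Exception):
    """Raised when a response carries an error or is malformed."""

    def __init__(self, code: Optional[int], message: str) -> None:
        super().__init__(f"MCP error {code}: {message}")
        self.code = code
        self.message = message


def unwrap_response(payload: dict[str, Any]) -> Any:
    """Return the result of a JSON-RPC 2.0 style response or raise McpError."""
    error = payload.get("error")
    if error is not None:
        raise McpError(error.get("code"), error.get("message", "unknown error"))
    if "result" not in payload:
        raise McpError(None, "response has neither result nor error")
    return payload["result"]
```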

Development

Running Services Individually

MCP Server:

cd mcp-server
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Set environment variables
export MONGODB_URI="mongodb://localhost:27017"
export MONGODB_DATABASE="voice_ai_db"

# Run server
uvicorn src.server:app --reload --port 8080

Voice Agent:

cd voice-agent
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Set environment variables
export LIVEKIT_URL="ws://localhost:7880"
export LIVEKIT_API_KEY="devkey"
export LIVEKIT_API_SECRET="secret"
export GOOGLE_API_KEY="your_key"
export MCP_SERVER_URL="http://localhost:8080"

# Run agent
python -m src.main start

Testing MCP Functions

# Create a user
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "create_user",
    "params": {
      "user_id": "test_user",
      "name": "Test User",
      "email": "test@example.com"
    },
    "id": "1"
  }'

# Get user preferences
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "get_user_preferences",
    "params": {"user_id": "user_001"},
    "id": "2"
  }'

# Log an interaction
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "log_interaction",
    "params": {
      "user_id": "user_001",
      "interaction_type": "voice_query",
      "content": "What is my account balance?"
    },
    "id": "3"
  }'

Connecting to the Voice Agent

Using LiveKit Meet (Quick Test)

  1. Go to LiveKit Meet
  2. Enter your LiveKit server URL: ws://localhost:7880
  3. Use API key devkey and secret secret
  4. Join a room - the agent will automatically connect

Programmatic Connection

from livekit import api

# Generate a join token for a user. The chained calls are wrapped in
# parentheses so the expression can span several lines.
token = (
    api.AccessToken(api_key="devkey", api_secret="secret")
    .with_identity("user_001")  # This ID is used to identify the user
    .with_grants(api.VideoGrants(room_join=True, room="my-room"))
    .to_jwt()
)
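To check which identity a generated token carries, the claims segment of the JWT can be decoded locally. The sketch below only decodes; it does not verify the signature, so it is suitable for debugging and nothing else. It assumes, per LiveKit's token format, that the participant identity is stored in the "sub" claim; decode_jwt_claims is a hypothetical helper name.

```python
import base64
import json
from typing import Any


def decode_jwt_claims(token: str) -> dict[str, Any]:
    """Decode the claims segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    payload = parts[1]
    # JWTs strip base64url padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

For a token created as above, decode_jwt_claims(token).get("sub") would be expected to return the identity passed to with_identity.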

Project Structure

CAI with MongoDB MCP/
├── docker-compose.yml          # Docker orchestration
├── .env.example                # Environment template
├── .gitignore
├── README.md
├── scripts/
│   └── mongo-init.js           # MongoDB initialization
├── mcp-server/
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── .env.example
│   └── src/
│       ├── __init__.py
│       ├── config.py           # Configuration management
│       ├── database.py         # MongoDB connection & repos
│       ├── mcp_functions.py    # MCP function implementations
│       └── server.py           # FastAPI server
└── voice-agent/
    ├── Dockerfile
    ├── requirements.txt
    ├── .env.example
    └── src/
        ├── __init__.py
        ├── config.py           # Configuration management
        ├── mcp_client.py       # MCP server client
        ├── function_handler.py # Function call routing
        ├── tools.py            # LiveKit tool definitions
        ├── agent.py            # Agent implementation
        └── main.py             # Entry point

Configuration Reference

MCP Server Environment Variables

Variable                   Description                 Default
MONGODB_URI                MongoDB connection string   mongodb://localhost:27017
MONGODB_DATABASE           Database name               voice_ai_db
MONGODB_USERS_COLLECTION   Users collection name       users
MCP_SERVER_HOST            Server bind host            0.0.0.0
MCP_SERVER_PORT            Server port                 8080
LOG_LEVEL                  Logging level               INFO
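A minimal sketch of how these variables might be read with their documented defaults is shown below. It is not the contents of mcp-server/src/config.py, which this README does not show; McpServerSettings and load_settings are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class McpServerSettings:
    mongodb_uri: str
    mongodb_database: str
    users_collection: str
    host: str
    port: int
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> McpServerSettings:
    """Read MCP server settings, falling back to the defaults listed above."""
    return McpServerSettings(
        mongodb_uri=env.get("MONGODB_URI", "mongodb://localhost:27017"),
        mongodb_database=env.get("MONGODB_DATABASE", "voice_ai_db"),
        users_collection=env.get("MONGODB_USERS_COLLECTION", "users"),
        host=env.get("MCP_SERVER_HOST", "0.0.0.0"),
        port=int(env.get("MCP_SERVER_PORT", "8080")),  # env values are strings
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Accepting the environment as a mapping makes it easy to pass a plain dict in tests instead of mutating os.environ.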

Voice Agent Environment Variables

Variable             Description            Default
LIVEKIT_URL          LiveKit server URL     ws://localhost:7880
LIVEKIT_API_KEY      LiveKit API key        -
LIVEKIT_API_SECRET   LiveKit API secret     -
GOOGLE_API_KEY       Google Gemini API key  -
GEMINI_MODEL         Gemini model name      gemini-2.0-flash-exp
GEMINI_VOICE         Voice for TTS          Puck
GEMINI_TEMPERATURE   Response temperature   0.7
MCP_SERVER_URL       MCP server URL         http://localhost:8080
LOG_LEVEL            Logging level          INFO
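The three variables without a default have to be supplied before the agent can start. A small startup check along these lines could surface missing values early; it is an illustrative sketch, and REQUIRED_VARIABLES and missing_required are hypothetical names rather than code from voice-agent/src/config.py.

```python
from typing import Mapping

# Variables listed above with no default value.
REQUIRED_VARIABLES = ("LIVEKIT_API_KEY", "LIVEKIT_API_SECRET", "GOOGLE_API_KEY")


def missing_required(env: Mapping[str, str]) -> list[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_VARIABLES if not env.get(name, "").strip()]
```

Calling missing_required(os.environ) at startup and exiting with a clear message when the list is non-empty avoids a less obvious failure later, when the agent first contacts LiveKit or Gemini.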

Production Deployment

Security Considerations

  1. MongoDB: Use strong passwords, enable authentication, consider TLS
  2. LiveKit: Use LiveKit Cloud or secure your self-hosted instance
  3. API Keys: Never commit API keys, use secrets management
  4. Network: Use private networks between services

Scaling

  • MCP Server: Stateless, can be horizontally scaled
  • Voice Agent: Scale based on concurrent room requirements
  • MongoDB: Use replica sets for high availability

Troubleshooting

Common Issues

MCP Server can't connect to MongoDB:

# Check MongoDB is running
docker-compose ps mongodb

# Check MongoDB logs
docker-compose logs mongodb

Voice Agent can't connect to MCP Server:

# Verify MCP server is healthy
curl http://localhost:8080/health

# Check network connectivity
docker-compose exec voice-agent curl http://mcp-server:8080/health
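If the agent starts before the MCP server is ready, a short polling loop against the /health endpoint can help tell a slow startup from a real connectivity problem. The sketch below is an assumption about how such a check might look; http_probe and wait_until_healthy are hypothetical names, and it treats any HTTP 200 response as healthy.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    attempts: int = 5,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

From inside the voice-agent container, this could be invoked as wait_until_healthy(lambda: http_probe("http://mcp-server:8080/health")).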

Gemini API errors:

  • Verify your GOOGLE_API_KEY is valid
  • Check API quotas in Google Cloud Console
  • Ensure Gemini API is enabled for your project

License

MIT License - See LICENSE file for details.