fw2274/flight_agent
This repository provides a comprehensive solution for finding flights using the Model Context Protocol.
Multi-Agent Voice-Activated Flight Search System
A comprehensive flight search agent with voice input capabilities using Google ADK, Amadeus API, LangGraph, and Model Context Protocol (MCP).
Table of Contents
- Overview
- Quick Start
- System Architecture
- Voice Integration Setup
- Usage Guide
- Technical Details
- Troubleshooting
- Integration Examples
Overview
This project implements a multi-agent system that lets users search for flights using either voice commands or text input. It is designed to improve accessibility for users who have difficulty typing or using a keyboard, including individuals with motor impairments or visual disabilities, as well as anyone who simply prefers voice interaction.
System Flow
┌─────────────────┐
│ User speaks │
│ flight query │
└────────┬────────┘
│
▼
┌─────────────────────────┐
│ Voice-to-Text MCP Server│ ← Rust-based, uses Whisper AI
│ (Rust + Whisper) │
└────────┬────────────────┘
│
▼ (JSON-RPC)
┌─────────────────────────┐
│ voice_mcp_client.py │ ← Python MCP client
│ (Python MCP Client) │
└────────┬────────────────┘
│
▼
┌─────────────────────────┐
│ flight_search_vtt.py │ ← Enhanced flight search
│ (Interpreter + Executor)│
└────────┬────────────────┘
│
▼
┌─────────────────────────┐
│ LangGraph Flight Search │ ← Existing Amadeus integration
│ (agent_graph.py) │
└─────────────────────────┘
Key Capabilities
- 🎤 Voice Input: Speak your flight requirements naturally
- 💬 Text Input: Traditional text query support
- 🤖 Dual-Agent System: Interpreter agent + Executor agent
- ✈️ Real Flight Data: Amadeus API integration
- 🧠 Smart Parsing: Gemini AI for natural language understanding
- 🎯 Accurate Results: Structured flight search with IATA codes and ISO dates
Quick Start
Text Input Mode
Prerequisites:
- Python 3.10+
- Amadeus API credentials (from the Amadeus for Developers portal)
- Google API key
Setup:
# 1. Setup environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
# 2. Configure credentials
echo "GOOGLE_API_KEY=your_key_here" >> .env
echo "AMADEUS_API_KEY=your_key_here" >> .env
echo "AMADEUS_API_SECRET=your_secret_here" >> .env
# 3. Run test search
python flight_search.py --query "Find a round-trip flight from ATL to JFK on Dec 02 returning Dec 15 for 2 adults in economy"
Flags: --verbose (stream tool calls), --debug (full timeline)
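If the first search fails immediately, it is usually a credentials problem. A minimal sketch for verifying the .env file loads correctly, assuming the scripts read credentials via python-dotenv as the integration examples further below do (check_env.py is a hypothetical helper, not part of the repo):

# check_env.py -- hypothetical helper; verifies the credentials the scripts expect
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory

REQUIRED = ["GOOGLE_API_KEY", "AMADEUS_API_KEY", "AMADEUS_API_SECRET"]
missing = [name for name in REQUIRED if not os.getenv(name)]

if missing:
    raise SystemExit(f"Missing keys in .env: {', '.join(missing)}")
print("All required credentials are set.")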
Voice Input Mode
Additional Prerequisites:
- Rust
- Microphone
- ~200MB disk space for Whisper model
Setup:
# 1. Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env
# 2. Build voice-to-text MCP server
cd voice-to-text-mcp
cargo build --release
./scripts/download-models.sh # Choose ggml-base.en.bin
cd ..
# 3. Verify setup
./test_voice_setup.sh
# 4. Test voice input
python3 flight_search_vtt.py --voice
What to say:
- "Find a round trip from Atlanta to New York, December first to December fifteenth, two adults, economy"
- "I need a flight from San Francisco to Chicago on January tenth, business class"
System Architecture
Three-Agent Architecture
Agent 1: Voice Recognition Agent
- Captures spoken input from users and converts it into text
- Technology: Whisper AI via MCP server (Rust-based)
- Hardware acceleration: Metal/CoreML (macOS), CUDA (Linux/Windows)
Agent 2: Information Extraction Agent
- Processes transcribed text and extracts structured parameters
- Model: Gemini 2.5 Flash Lite
- Extracts:
- Origin and destination cities/airports (IATA codes)
- Departure and return dates (ISO format)
- Number of passengers (adults, children, infants)
- Cabin class preferences
- Special requirements or preferences
Agent 3: Flight Search Agent
- Executes flight search based on structured information
- Framework: LangGraph
- API: Amadeus (real flight data)
- Output: Flight options with prices, times, airlines, duration
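Conceptually, the three agents form a simple pipeline: speech becomes text, text becomes structured parameters, and parameters drive the search. Below is a minimal sketch of that hand-off; it uses the real VoiceToTextMCPClient (shown in the Integration Examples), while interpret and search are placeholder callables standing in for the Gemini interpreter and the LangGraph/Amadeus executor:

from voice_mcp_client import VoiceToTextMCPClient

def run_pipeline(interpret, search):
    """interpret: text -> dict of flight params; search: dict -> flight offers.
    Both callables are placeholders for Agents 2 and 3 described above."""
    voice = VoiceToTextMCPClient(
        mcp_server_path="voice-to-text-mcp/target/release/voice-to-text-mcp",
        model_path="voice-to-text-mcp/models/ggml-base.en.bin",
    )
    text = voice.listen(timeout_ms=30000, silence_timeout_ms=2000)  # Agent 1: speech -> text
    params = interpret(text)                                        # Agent 2: text -> structured params
    return search(params)                                           # Agent 3: params -> flight offers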
Component Details
The Amadeus tooling comes from langgraph_travel_agent (vendored under langgraph_travel_agent/backend). We wrap its agent_graph.py primitives so Google ADK can call them directly:
- agent_graph_module.search_flights → exposed to ADK via an async search_flights wrapper
- agent_graph_module.amadeus client → validated on startup
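Roughly, the wrapper just gives ADK an async callable over the existing LangChain tool. A hedged sketch (the import path and invoke call are assumptions; the real wrapper in the entry scripts may differ):

import asyncio
from langgraph_travel_agent.backend import agent_graph as agent_graph_module

async def search_flights(**params) -> str:
    """Async wrapper exposing the vendored LangChain tool to Google ADK."""
    # The vendored tool is synchronous, so run it off the event loop thread.
    return await asyncio.to_thread(agent_graph_module.search_flights.invoke, params)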
Files you'll care about:
- flight_search.py — main entry; wires Gemini to the LangGraph tool (text input only)
- flight_search_vtt.py — flight search with voice-to-text integration
- voice_mcp_client.py — Python client for the voice-to-text MCP server
- langgraph_travel_agent/backend/agent_graph.py — search_flights LangChain tool and Amadeus plumbing
- voice-to-text-mcp/ — Rust-based MCP server for speech recognition
Voice Integration Setup
Step 1: Install Rust
# Install Rust via rustup
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Reload shell configuration
source $HOME/.cargo/env
# Verify installation
cargo --version
# Expected: cargo 1.91.1 (or later)
Step 2: Build MCP Server
cd voice-to-text-mcp
# Build release version (takes 2-3 minutes first time)
cargo build --release
# Verify binary was created
ls -lh target/release/voice-to-text-mcp
# Expected: ~4.5MB binary
cd ..
Step 3: Download Whisper Model
Choose the right model for your needs:
| Model | Size | Speed | Accuracy | Use Case |
|---|---|---|---|---|
| ggml-tiny.en.bin | 75MB | Very Fast | Good | Testing, prototyping |
| ggml-base.en.bin | 142MB | Fast | Better | General use (recommended) ⭐ |
| ggml-small.en.bin | 466MB | Slower | Best | High accuracy needs |
Interactive download:
cd voice-to-text-mcp
./scripts/download-models.sh
# Choose: ggml-base.en.bin
cd ..
Manual download:
cd voice-to-text-mcp/models/
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin
cd ../..
Step 4: Verify Setup
./test_voice_setup.sh
What it checks:
- ✓ Rust installation (cargo)
- ✓ MCP repository presence
- ✓ Binary build status
- ✓ Whisper model availability
- ✓ Python integration files
- ✓ Environment variables
Step 5: Test Voice Input
# Basic voice input test
python3 flight_search_vtt.py --voice
# Voice with debug output
python3 flight_search_vtt.py --voice --debug --verbose
Expected flow:
- 🎤 Listening... prompt appears (max 30s, silence timeout 2s)
- Speak your flight requirement
- Recording stops after 2 seconds of silence
- Transcription appears
- Interpreter parses parameters
- Executor searches flights
- Results displayed
Usage Guide
Command Reference
flight_search_vtt.py options:
--voice Enable voice input
--voice-timeout N Recording timeout in milliseconds (default: 30000)
--voice-silence-timeout N Silence timeout in milliseconds (default: 2000)
--mcp-server PATH Path to MCP server binary
--mcp-model PATH Path to Whisper model
--query TEXT Use text query instead of voice
--debug Show debug timeline
--verbose Show tool calls/responses
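These options map naturally onto Python's standard argparse module. A hypothetical sketch of how they might be declared (the actual flight_search_vtt.py may wire them differently):

import argparse

parser = argparse.ArgumentParser(description="Voice-enabled flight search")
parser.add_argument("--voice", action="store_true", help="Enable voice input")
parser.add_argument("--voice-timeout", type=int, default=30000,
                    help="Recording timeout in milliseconds")
parser.add_argument("--voice-silence-timeout", type=int, default=2000,
                    help="Silence timeout in milliseconds")
parser.add_argument("--mcp-server", help="Path to MCP server binary")
parser.add_argument("--mcp-model", help="Path to Whisper model")
parser.add_argument("--query", help="Use text query instead of voice")
parser.add_argument("--debug", action="store_true", help="Show debug timeline")
parser.add_argument("--verbose", action="store_true", help="Show tool calls/responses")
args = parser.parse_args()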
Voice Input Best Practices
For best transcription results:
- Environment: Speak in a quiet room, reduce background noise
- Speaking style: Speak clearly at normal pace, use complete sentences
- Dates: State dates explicitly ("December fifteenth" not "12/15")
- Pauses: Pause briefly between thoughts (silence detection helps)
Example good inputs:
"Find a round trip from Atlanta to New York,
departing December 1st, returning December 15th,
for 2 adults in economy"
"I need a flight from San Francisco to Chicago
on January 10th, one way, business class"
"Search for flights from LAX to JFK,
leaving next Friday, returning the following Monday"
Timeout Configuration
# Quick commands (10 seconds)
python3 flight_search_vtt.py --voice --voice-timeout 10000 --voice-silence-timeout 1000
# Normal use (30 seconds) - DEFAULT
python3 flight_search_vtt.py --voice --voice-timeout 30000 --voice-silence-timeout 2000
# Detailed descriptions (60 seconds)
python3 flight_search_vtt.py --voice --voice-timeout 60000 --voice-silence-timeout 3000
Example Session
$ python3 flight_search_vtt.py --voice
🎤 Listening... (max 30s, silence timeout 2s)
Speak your flight requirement now!
# User says: "Find a round trip from Atlanta to New York,
# departing December 1st, returning December 15th,
# for 2 adults in economy"
✓ Transcribed: 'Find a round trip from Atlanta to New York,
departing December 1st, returning December 15th,
for 2 adults in economy'
🧭 Interpreter Agent
✓ Interpreter output:
{
"originLocationCode": "ATL",
"destinationLocationCode": "JFK",
"departureDate": "2025-12-01",
"returnDate": "2025-12-15",
"adults": 2,
"travelClass": "ECONOMY"
}
🛠️ Executor Agent
→ Flight search: ATL → JFK
Departure: 2025-12-01, Return: 2025-12-15, Adults: 2, Class: ECONOMY
✓ Received 3 flight results
📋 FLIGHT SEARCH RESULTS
[Flight options listed here...]
✅ Search complete!
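The interpreter's JSON maps almost one-to-one onto an Amadeus flight-offers query. A sketch of that mapping using the Amadeus Python SDK directly; the actual call lives inside the vendored agent_graph.py and may be shaped differently:

import os
from amadeus import Client, ResponseError

amadeus = Client(
    client_id=os.getenv("AMADEUS_API_KEY"),
    client_secret=os.getenv("AMADEUS_API_SECRET"),
)

# Structured output from the interpreter agent, as in the session above
params = {
    "originLocationCode": "ATL",
    "destinationLocationCode": "JFK",
    "departureDate": "2025-12-01",
    "returnDate": "2025-12-15",
    "adults": 2,
    "travelClass": "ECONOMY",
}

try:
    # Feed the interpreter output straight into the flight-offers search
    offers = amadeus.shopping.flight_offers_search.get(**params, max=3).data
    print(f"Received {len(offers)} flight results")
except ResponseError as error:
    print(f"Amadeus error: {error}")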
Technical Details
MCP (Model Context Protocol) Architecture
What is MCP?
- Language-agnostic communication protocol
- JSON-RPC 2.0 based
- Enables tools to work across different languages
Communication:
- Transport: stdio (stdin/stdout)
- Protocol: JSON-RPC 2.0
- Tools exposed: listen, transcribe_file
Benefits:
- Language-agnostic (Rust server, Python client)
- Standardized protocol
- Isolated concerns (audio processing separate from business logic)
- Reusable components
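Under the hood, voice_mcp_client.py talks to the Rust binary over stdin/stdout with one JSON-RPC message per line. A hedged sketch of the raw exchange, assuming the server follows the standard MCP initialize/tools-call handshake and that the listen tool takes the same argument names as the Python client's listen() method:

import json
import subprocess

# Spawn the Rust MCP server and speak raw JSON-RPC 2.0 over stdio
proc = subprocess.Popen(
    ["voice-to-text-mcp/target/release/voice-to-text-mcp"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)

def send(message: dict) -> None:
    proc.stdin.write(json.dumps(message) + "\n")  # one JSON object per line
    proc.stdin.flush()

# Standard MCP handshake (field names per the MCP spec; exact version may vary)
send({"jsonrpc": "2.0", "id": 1, "method": "initialize",
      "params": {"protocolVersion": "2024-11-05", "capabilities": {},
                 "clientInfo": {"name": "example-client", "version": "0.1"}}})
print(proc.stdout.readline())                      # initialize result
send({"jsonrpc": "2.0", "method": "notifications/initialized"})

# Invoke the `listen` tool (argument names assumed from the Python client)
send({"jsonrpc": "2.0", "id": 2, "method": "tools/call",
      "params": {"name": "listen",
                 "arguments": {"timeout_ms": 30000, "silence_timeout_ms": 2000}}})
print(proc.stdout.readline())                      # response carrying the transcription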
Voice Processing
Whisper AI Integration:
- OpenAI Whisper (speech recognition)
- Quantized models (ggml format)
- English-only variants (.en suffix)
Hardware Acceleration:
- macOS: Metal GPU + CoreML (Apple Neural Engine)
- Linux/Windows: CUDA (NVIDIA GPUs)
- Fallback: CPU-only
Recording format:
- Sample rate: 16kHz
- Channels: Mono
- Format: WAV (PCM)
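If you feed pre-recorded audio to transcribe_file, it helps to confirm the file matches this format first. A small check using Python's built-in wave module (16-bit sample width is assumed, the rest comes from the list above):

import wave

def check_recording_format(path: str) -> None:
    """Warn if a WAV file does not match the 16 kHz mono PCM format above."""
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != 16000:
            print(f"warning: sample rate is {wav.getframerate()} Hz, expected 16000")
        if wav.getnchannels() != 1:
            print(f"warning: {wav.getnchannels()} channels, expected mono")
        if wav.getsampwidth() != 2:
            print(f"warning: sample width {wav.getsampwidth()} bytes, expected 16-bit PCM")

check_recording_format("audio.wav")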
Auto-stop logic:
- Records up to timeout_ms milliseconds
- Stops early if silence_timeout_ms of silence is detected
- Silence threshold: -30 dB
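The silence threshold works on signal level in decibels relative to full scale. An illustrative Python sketch of the idea, not the server's actual Rust implementation:

import math

SILENCE_THRESHOLD_DB = -30.0

def is_silent(samples: list[float]) -> bool:
    """Treat a frame of samples in [-1.0, 1.0] as silent when its RMS level
    falls below -30 dBFS, mirroring the threshold described above."""
    if not samples:
        return True
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    level_db = 20 * math.log10(rms) if rms > 0 else float("-inf")
    return level_db < SILENCE_THRESHOLD_DB

# Recording stops once frames stay silent for silence_timeout_ms in a row.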
Google ADK Integration
Agent configuration:
# Interpreter Agent
model = "gemini-2.5-flash-lite"
temperature = 0.3 # Low for consistent parsing
# Executor Agent
model = "gemini-2.5-flash-lite"
tools = [search_flights]
LangGraph Integration
Key file: langgraph_travel_agent/backend/agent_graph.py
Enhanced error handling:
from amadeus import ResponseError

try:
    # Illustrative placeholder -- the real Amadeus call happens inside agent_graph.py
    response = run_amadeus_search(params)
except ResponseError as error:
    # Start from the attributes the Amadeus SDK sets on the exception itself
    error_code = getattr(error, 'code', 'UNKNOWN')
    error_description = getattr(error, 'description', str(error))
    # Prefer the structured error details from the response body when present
    if hasattr(error, 'response') and error.response:
        error_body = error.response.body
        if isinstance(error_body, dict):
            errors = error_body.get('errors', [])
            if errors:
                first_error = errors[0]
                error_code = first_error.get('code', error_code)
                error_description = first_error.get('detail', error_description)
Troubleshooting
Setup Issues
"cargo: command not found"
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env
"MCP server binary not found"
cd voice-to-text-mcp && cargo build --release
"Whisper model not found"
cd voice-to-text-mcp && ./scripts/download-models.sh
Missing API keys
# Check .env file exists
cat .env
# Verify all keys are set
grep GOOGLE_API_KEY .env
grep AMADEUS_API_KEY .env
grep AMADEUS_API_SECRET .env
Voice Input Issues
"No input device available"
Checklist:
- Microphone is connected and working
- Microphone is not in use by another application
- System has microphone permissions
macOS:
System Settings → Privacy & Security → Microphone
Ensure Terminal has access
Linux:
# Check audio devices
arecord -l
# Test microphone
arecord -d 5 test.wav
aplay test.wav
"Recording cuts off too early"
# Increase silence timeout
python3 flight_search_vtt.py --voice --voice-silence-timeout 5000
# Increase overall timeout
python3 flight_search_vtt.py --voice --voice-timeout 45000
"Poor transcription quality"
Try:
- Use a better model (ggml-small.en.bin)
- Speak more clearly and slowly
- Reduce background noise
- Increase silence timeout for longer pauses
Flight Search Issues
"No results returned"
Checklist:
- Dates are in the future
- Origin/destination are valid IATA codes or city names
- Travel dates are realistic
Debug:
python3 flight_search_vtt.py --query "Your query" --debug --verbose
Common Amadeus API error codes:
- 38194: Invalid origin/destination → Use valid IATA codes (e.g., ATL, JFK)
- 477: Invalid date format → Use YYYY-MM-DD format
- 4926: No flights available → Try different dates or route
- 38187: Invalid passenger count → Check adults/children/infants counts
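When surfacing these to users, a small lookup keeps messages actionable. A sketch (codes from the list above; the helper and wording are ours, not part of the repo):

# Hypothetical helper mapping known Amadeus error codes to user-facing hints
AMADEUS_ERROR_HINTS = {
    "38194": "Invalid origin/destination - use valid IATA codes such as ATL or JFK.",
    "477": "Invalid date format - dates must be YYYY-MM-DD.",
    "4926": "No flights available - try different dates or another route.",
    "38187": "Invalid passenger count - check adults/children/infants values.",
}

def explain_amadeus_error(code) -> str:
    return AMADEUS_ERROR_HINTS.get(str(code), f"Amadeus error {code}: see the API docs.")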
Integration Examples
Using the MCP Client in Other Projects
The VoiceToTextMCPClient from voice_mcp_client.py can be used in any Python project:
from voice_mcp_client import VoiceToTextMCPClient
# Initialize client
client = VoiceToTextMCPClient(
mcp_server_path="voice-to-text-mcp/target/release/voice-to-text-mcp",
model_path="voice-to-text-mcp/models/ggml-base.en.bin"
)
# Listen for voice input
user_input = client.listen(
timeout_ms=30000, # 30 seconds max
silence_timeout_ms=2000, # Stop after 2s silence
auto_stop=True # Enable auto-stop
)
print(f"User said: {user_input}")
# Transcribe existing audio file
transcript = client.transcribe_file("audio.wav")
print(f"Transcription: {transcript}")
Custom Flight Search Workflow
from voice_mcp_client import VoiceToTextMCPClient
import os
from dotenv import load_dotenv
# Load environment
load_dotenv()
# Initialize voice client
voice_client = VoiceToTextMCPClient(
"voice-to-text-mcp/target/release/voice-to-text-mcp",
"voice-to-text-mcp/models/ggml-base.en.bin"
)
# Get voice input
print("🎤 Speak your flight requirement...")
query = voice_client.listen(timeout_ms=30000)
print(f"✓ Heard: {query}")
# Process with your custom agent
# ... your code here ...
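One way to finish the workflow without writing a custom agent is to hand the transcribed query to the existing text-mode entry point via its documented --query flag. A sketch that continues the snippet above:

import subprocess

# Reuse the existing interpreter/executor pipeline through its CLI
subprocess.run(
    ["python3", "flight_search_vtt.py", "--query", query, "--verbose"],
    check=True,
)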
Happy flying! ✈️🎤