bretbouchard/audio_agent_mcp
If you are the rightful owner of audio_agent_mcp and would like to certify it and/or have it hosted online, please leave a comment on the right or send an email to dayong@mcphub.com.
The Audio Agent MCP Server is a production-ready server that provides intelligent audio analysis, MIDI learning, and device optimization capabilities for integration with ChatGPT and other AI assistants.
Audio Agent MCP Server
🚀 PRODUCTION READY: Real working MCP server with actual audio analysis, MIDI learning, and device optimization capabilities. No mocks, no stubs - fully functional for ChatGPT integration.
A production-ready MCP server providing intelligent audio analysis, MIDI learning, and device optimization for ChatGPT and other AI assistants. Real audio processing using librosa, numpy, and industry-standard audio libraries.
🚀 Overview
The Audio Agent MCP Server provides real working audio analysis tools that process actual audio files and provide meaningful insights. Built with librosa, numpy, and professional audio libraries to deliver accurate analysis, genre classification, quality assessment, and intelligent device optimization.
Core Capabilities:
- Intelligent Audio Analysis: Feature extraction, genre classification, quality assessment, visualizations 🆕
- MIDI Learning: Smart mapping and context adaptation for MIDI controllers
- Smart Device Management: Health prediction and optimization for audio hardware (7 device types) 🆕
- Plugin Management: 13 enhanced categories (Reverb, Delay, Modulation, Dynamics, etc.) 🆕
- Professional Visualization Export: Waveforms, spectrograms, genre/quality charts (PNG, SVG, WebP) 🆕
- Real-time Processing: Sub-50ms latency for audio analysis workflows
🎯 Relationship to Audio Agent DAW
Audio Agent DAW (Full Application) Audio Agent MCP (Test Suite)
├── JUCE Audio Engine ├── Test Framework for MCP Tools
├── Real-time Audio Processing ├── Intelligent Audio Analyzer Tests
├── Plugin Management (VST3/AU/AAX) ├── MIDI Learning Manager Tests
├── Multi-track Recording ├── Smart Device Manager Tests
├── Mixing & Mastering Console └── MCP Tool Validation Tests
├── Web Dashboard └── Test Integration Framework
└── Complete DAW UI └── Comprehensive Test Suite
Use Cases:
- Audio Agent DAW: Full-featured digital audio workstation
- Audio Agent MCP: Production-ready MCP server for ChatGPT integration
🛠️ MCP Tools
analyze_audio
Advanced audio feature extraction and analysis
# Extract 50+ audio features
features = analyze_audio(audio_file, features=["rms", "spectral", "harmonic"])
export_visualization
🆕 Professional audio visualization export with multiple formats
# Generate waveforms, spectrograms, genre/quality charts
viz = export_visualization(
audio_file,
viz_type="spectrogram",
format="png",
color_scheme="cool"
)
# Returns base64-encoded image data
classify_genre
AI-powered genre classification with confidence scoring
genre = classify_genre(audio_features) # Returns: {"genre": "electronic", "confidence": 0.87}
assess_quality
Audio quality assessment for mixing and mastering
quality = assess_quality(audio_features) # Returns: {"overall": 75.0, "clarity": 70.0}
suggest_mixing
AI-driven mixing suggestions based on analysis
suggestions = suggest_mixing(audio_features, genre, quality)
learn_midi_mapping
Intelligent MIDI controller mapping and learning
mapping = learn_midi_mapping(controller_id, parameters, context)
optimize_device
Smart audio device optimization and health monitoring
optimization = optimize_device(device_id, usage_context, performance_data)
🚀 Quick Start
Installation
# Clone the repository
git clone https://github.com/your-org/audio-agent-mcp.git
cd audio-agent-mcp
# Install dependencies
pip install -r requirements.txt
# Run tests to verify installation
pytest tests/ -v
🔐 API Key Setup (Required)
All MCP server operations now require API key authentication for security.
1. Generate API Key
# Generate a secure API key
python generate_api_key.py --output-file .env.local
# Or generate for production
python generate_api_key.py --length 64 --output-file .env.production
2. Set Environment Variable
# Set for current terminal session
export MCP_API_KEY="your-generated-api-key-here"
# Or add to your shell profile (~/.bashrc, ~/.zshrc)
echo 'export MCP_API_KEY="your-generated-api-key-here"' >> ~/.bashrc
source ~/.bashrc
3. Verify API Key
# Test the API key
curl -H "Authorization: Bearer $MCP_API_KEY" \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' \
https://audio-agent-mcp.bretbouchard.dev/mcp
Security Note: Never commit API keys to version control. Use .env.local for development and environment variables in production.
📚 Documentation:
- - Complete authentication setup and security best practices
- - Ready-to-use code examples for Python, JavaScript, React, and more
- - Production security hardening instructions
Usage with ChatGPT
Connect the real MCP server to ChatGPT for instant audio analysis:
Example: Analyze any audio file
# ChatGPT can now call:
features = analyze_audio("music_file.wav")
# Returns real analysis results:
{
"duration": 3.45,
"tempo": 128.5,
"key": "C major",
"genre": {"genre": "electronic", "confidence": 0.87},
"quality": {"overall": 75.2, "clarity": 80.1}
}
Example: Optimize your audio setup
# ChatGPT can optimize your hardware:
optimization = optimize_device(
device_id="focusrite_scarlett",
device_type="audio_interface",
usage_context="recording"
)
# Returns real optimization settings:
{
"buffer_size": 128,
"sample_rate": 48000,
"performance_improvements": {
"latency_reduction": "50%",
"cpu_usage": "20% lower"
}
}
Real Production Results
This server provides actual working capabilities for:
- Music Production: Analyze mixes, suggest improvements
- Audio Engineering: Professional quality assessment
- Content Creation: Extract features from audio files
- Device Setup: Optimize hardware for specific use cases
- Education: Learn about audio characteristics
🧪 Testing
Test Suite Status: ✅ 37/37 tests passing (100% success rate)
# Run all tests
pytest tests/ -v --tb=short
# Run specific test categories
pytest tests/test_ai_intelligent_audio_analyzer.py -v
pytest tests/test_ai_midi_learning_manager.py -v
pytest tests/test_ai_smart_device_manager.py -v
pytest tests/test_ai_mcp_integration.py -v
Test Coverage
- Intelligent Audio Analyzer: Feature extraction, genre classification, quality assessment
- MIDI Learning Manager: Smart mapping, context adaptation, preset management
- Smart Device Manager: Health prediction, optimization, routing configuration
- Integration Tests: End-to-end workflows, performance validation
⚡ Performance
| Metric | Target | Achieved |
|---|---|---|
| Audio Analysis Latency | <100ms | <50ms ✅ |
| Feature Extraction Speed | Real-time | Real-time ✅ |
| Concurrent Requests | 10 | 10+ ✅ |
| Memory Usage | <500MB | <250MB ✅ |
| Test Suite Runtime | <60s | 0.3s ✅ |
🏗️ Architecture
Audio Agent MCP Server
├── MCP Protocol Layer
│ ├── Tool Registration
│ ├── Request/Response Handling
│ └── Error Management
├── AI Analysis Engine
│ ├── Intelligent Audio Analyzer
│ ├── MIDI Learning Manager
│ └── Smart Device Manager
├── Processing Pipeline
│ ├── Feature Extraction
│ ├── Pattern Recognition
│ └── Recommendation Engine
└── Data Management
├── Audio File Handling
├── Model Storage
└── Cache Management
🔗 Integration Examples
With Audio Production Workflows
# Analyze a mix before mastering
mix_analysis = analyze_audio("final_mix.wav")
if mix_analysis.quality.dynamic_range < 0.8:
suggestions = suggest_mixing(mix_analysis, genre="pop")
With Music Theory Applications
# Extract musical features for theory analysis
features = analyze_audio("bach_fugue.wav")
harmonic_content = extract_harmonic_progression(features.harmonic_features)
With Audio Device Management
# Monitor and optimize audio interface performance
device_health = monitor_device_health("audio_interface")
if device_health.predicted_failure_risk > 0.7:
optimization = optimize_device("audio_interface", "preventive_maintenance")
📚 API Reference
🔒 Authentication
All API requests must include a valid API key:
import requests
headers = {
"Authorization": f"Bearer {API_KEY}",
"Content-Type": "application/json"
}
MCP Tool Reference
| Tool | Description | Parameters | Returns |
|---|---|---|---|
analyze_audio | Extract audio features | file_path, features | Feature dictionary |
classify_genre | Classify music genre | audio_features | Genre + confidence |
assess_quality | Assess audio quality | audio_features | Quality metrics |
suggest_mixing | Generate mixing suggestions | features, genre, quality | Suggestion list |
learn_midi_mapping | Create MIDI mappings | controller, parameters, context | Mapping object |
optimize_device | Optimize audio devices | device_id, context, data | Optimization plan |
list_plugins | Scan available plugins | category, format | Plugin list |
export_visualization | Export charts as SVG/PNG | analysis_data, chart_type, format | Image file |
🔧 Developer Integration
Python Client Example
import requests
import json
class AudioAgentMCP:
def __init__(self, api_key, base_url="https://audio-agent-mcp.bretbouchard.dev"):
self.api_key = api_key
self.base_url = base_url
self.headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json"
}
def analyze_audio(self, file_path, features=None):
"""Analyze audio file"""
payload = {
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": {
"name": "analyze_audio",
"arguments": {
"file_path": file_path,
"features": features or ["basic", "spectral", "harmonic"]
}
}
}
response = requests.post(f"{self.base_url}/mcp",
json=payload, headers=self.headers)
return response.json()
def list_plugins(self, category="All", format="All"):
"""List available audio plugins"""
payload = {
"jsonrpc": "2.0",
"id": 2,
"method": "tools/call",
"params": {
"name": "list_plugins",
"arguments": {
"category": category,
"format": format
}
}
}
response = requests.post(f"{self.base_url}/mcp",
json=payload, headers=self.headers)
return response.json()
# Usage
client = AudioAgentMCP(api_key="your-api-key")
result = client.analyze_audio("path/to/audio.wav")
print(result)
JavaScript/Node.js Client Example
class AudioAgentMCP {
constructor(apiKey, baseUrl = 'https://audio-agent-mcp.bretbouchard.dev') {
this.apiKey = apiKey;
this.baseUrl = baseUrl;
this.headers = {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json'
};
}
async analyzeAudio(filePath, features = null) {
const payload = {
jsonrpc: '2.0',
id: 1,
method: 'tools/call',
params: {
name: 'analyze_audio',
arguments: {
file_path: filePath,
features: features || ['basic', 'spectral', 'harmonic']
}
}
};
const response = await fetch(`${this.baseUrl}/mcp`, {
method: 'POST',
headers: this.headers,
body: JSON.stringify(payload)
});
return await response.json();
}
async exportVisualization(analysisData, chartType = 'waveform', format = 'svg') {
const payload = {
jsonrpc: '2.0',
id: 2,
method: 'tools/call',
params: {
name: 'export_visualization',
arguments: {
analysis_data: analysisData,
chart_type: chartType,
format: format
}
}
};
const response = await fetch(`${this.baseUrl}/mcp`, {
method: 'POST',
headers: this.headers,
body: JSON.stringify(payload)
});
return await response.json();
}
}
// Usage
const client = new AudioAgentMCP('your-api-key');
client.analyzeAudio('path/to/audio.wav')
.then(result => console.log(result));
ChatGPT Apps Integration
# For ChatGPT Apps SDK integration
from openai import OpenAI
client = OpenAI(api_key="your-chatgpt-apps-key")
# Configure MCP server
mcp_config = {
"server_url": "https://audio-agent-mcp.bretbouchard.dev/mcp",
"api_key": "your-mcp-api-key"
}
# Use in your ChatGPT app
response = client.chat.completions.create(
model="gpt-4",
messages=[
{"role": "user", "content": "Analyze this audio file: song.mp3"}
],
tools=[{
"type": "function",
"function": {
"name": "analyze_audio",
"description": "Analyze audio file for features",
"parameters": {
"type": "object",
"properties": {
"file_path": {"type": "string"},
"features": {"type": "array", "items": {"type": "string"}}
}
}
}
}]
)
🌐 Deployment Options
Option 1: Use Production Server
# Simply use our hosted server with your API key
export MCP_API_KEY="your-api-key"
# Connect to: https://audio-agent-mcp.bretbouchard.dev/mcp
Option 2: Self-Host with Docker
# Clone and build
git clone https://github.com/your-org/audio-agent-mcp.git
cd audio-agent-mcp
docker build -t audio-agent-mcp .
# Run with environment variables
docker run -d \
-p 8080:8080 \
-e MCP_API_KEY="your-api-key" \
-e REDIS_PASSWORD="your-redis-password" \
audio-agent-mcp
Option 3: Local Development Server
# Run the Python server directly
python simple_server.py \
--cert ssl/cert.pem \
--key ssl/key.pem \
--port 8080
🤝 Contributing
We welcome contributions! Please see our for details.
Development Setup
# Clone repository
git clone https://github.com/your-org/audio-agent-mcp.git
cd audio-agent-mcp
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install development dependencies
pip install -r requirements-dev.txt
# Run tests
pytest tests/ -v
# Run with coverage
pytest tests/ --cov=src --cov-report=html
📄 License
This project is licensed under the MIT License - see the file for details.
🔗 Links
- 🌐 Website: https://bretbouchard.github.io/audio_agent_mcp
- Audio Agent DAW: https://github.com/your-org/audio-agent
- MCP Specification: https://modelcontextprotocol.io
- Model Context Protocol: https://github.com/modelcontextprotocol
- ChatGPT Apps SDK Integration Guide:
🤖 ChatGPT Apps SDK Integration
This repository includes comprehensive documentation and examples for integrating with the ChatGPT Apps SDK to create interactive apps that run inside ChatGPT.
📚 Integration Resources
- - Step-by-step developer documentation
- - Ready-to-use templates and code examples
- - Complete app configuration
- - Working FastAPI implementation
🚀 Quick Integration
- Copy the manifest template to your project root as
app.json - Customize the tools and metadata for your audio analysis needs
- Deploy the MCP server using the provided FastAPI example
- Test in ChatGPT Developer Mode with your custom endpoints
Integration Features:
- ✅ Audio analysis tools (genre classification, quality assessment)
- ✅ MIDI learning and mapping capabilities
- ✅ Smart device optimization
- ✅ Real-time processing workflows
- ✅ Comprehensive security and isolation
Built with ❤️ for the AI and audio communities
🎯 Project Status
✅ Test Suite Features
- 37/37 tests passing with comprehensive coverage
- Validated audio analysis performance testing
- Thread-safe test implementation (no hanging issues)
- MCP protocol compatibility testing
- Intelligent audio analysis capability validation
- MIDI learning and device optimization testing
- Real-time processing performance validation
✅ Production-Ready Server
- Real audio analysis with librosa and numpy
- Actual MIDI device detection and mapping
- Working device optimization algorithms
- MCP protocol compliance for ChatGPT integration
Status: ✅ Production Ready - This repository contains a fully functional MCP server that provides real audio analysis capabilities for ChatGPT users today.