Database Agent MCP Server - Phase 2A
A natural language to SQL query generator with MCP (Model Context Protocol) server capabilities. This is Phase 2A of a planned AI agent that will eventually provide autonomous database interaction capabilities.
Current Phase: SQL Query Generator with Basic Schema Integration
Future Vision: Autonomous Database AI Agent
Features (Phase 2A)
Current Capabilities
- Natural Language to SQL: Convert natural language prompts to SQL queries
- LLMWrapper Integration: Uses your existing `llmwrapper` for LLM interactions
- MCP Protocol: Implements Model Context Protocol for tool integration
- FastAPI Server: RESTful API endpoints for easy integration
- Health Monitoring: Built-in health checks and monitoring
- Comprehensive Logging: Detailed logging with configurable levels
- Configuration Management: YAML-based configuration with environment variable overrides
- Schema Loading & Caching: Loads and caches database schema using `schema-graph-builder` (see below)
Planned Features (Phase 2B+)
- Schema-Aware Generation: Use database schema for context-aware SQL generation
- Query Execution: Execute generated SQL and return results
- Autonomous Decision Making: Plan and execute multi-step database operations
- Memory & Learning: Remember user preferences and learn from interactions
- Interactive Refinement: Ask clarifying questions and refine queries
- Relationship Detection: Automatic foreign key relationship inference
Schema Integration (Phase 2A)
What's New in Phase 2A?
- SchemaManager: Loads, caches, and provides access to the database schema (tables, relationships, metadata) using the `schema-graph-builder` package.
- Schema Context: Schema is now available for future use in prompt enhancement, validation, and agent reasoning (but not yet used for SQL generation).
- Health Checks: SchemaManager exposes health and summary info for monitoring.
- Unit Tests: Comprehensive tests for schema loading, caching, and health.
How Schema Integration Works
- On startup, if enabled in config, the agent loads the schema from your database using `schema-graph-builder`.
- The schema (tables, columns, relationships) is cached and periodically refreshed.
- The schema is not yet used to influence SQL generation, but is available for future phases.
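To make the load-and-cache flow concrete, here is a minimal sketch of what SchemaManager-style behavior looks like. The class name mirrors `schema_manager.py`, but the internals are assumptions: since `schema-graph-builder`'s API is not documented in this README, the sketch uses SQLAlchemy's inspector as a stand-in for the actual schema extraction call.

```python
# Hypothetical sketch of SchemaManager-style load-and-cache behavior.
# The real implementation lives in src/database_agent/schema_manager.py and uses
# schema-graph-builder; SQLAlchemy inspection is used here only as a stand-in.
import time
from sqlalchemy import create_engine, inspect


class SchemaManagerSketch:
    def __init__(self, database_url: str, refresh_interval: int = 3600):
        self.database_url = database_url
        self.refresh_interval = refresh_interval  # seconds, as in llm_config.yaml
        self._schema = None
        self._loaded_at = 0.0

    def get_schema(self) -> dict:
        """Return the cached schema, reloading it when the cache is stale."""
        if self._schema is None or time.time() - self._loaded_at > self.refresh_interval:
            self._schema = self._load_schema()
            self._loaded_at = time.time()
        return self._schema

    def _load_schema(self) -> dict:
        """Introspect tables, columns, and foreign keys from the database."""
        engine = create_engine(self.database_url)
        inspector = inspect(engine)
        return {
            table: {
                "columns": [col["name"] for col in inspector.get_columns(table)],
                "foreign_keys": inspector.get_foreign_keys(table),
            }
            for table in inspector.get_table_names()
        }

    def health(self) -> dict:
        """Summary info of the kind a health check might expose."""
        schema = self.get_schema()
        return {"tables": len(schema), "loaded_at": self._loaded_at}
```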
Configuration Example
Edit `config/llm_config.yaml`:
schema:
  enabled: true
  database_url: "sqlite:///:memory:"
  refresh_interval: 3600  # seconds
Requirements
- `schema-graph-builder` must be installed (see Installation below).
- The schema config section must be present and enabled.
Prerequisites
- Python 3.8+
- Your existing `llmwrapper` library (installed with `pip install -e ../llmwrapper`)
- Your existing `schema-graph-builder` library (installed with `pip install -e ../schema-graph-builder`)
- OpenAI API key (or other LLM provider)
Installation
- Clone and set up the project:
# Navigate to the database-agent directory
cd database-agent
# Install dependencies
pip install -r requirements.txt
# Install your existing packages in development mode
pip install -e ../llmwrapper
pip install -e ../schema-graph-builder
- Set up environment variables:
# Copy the example environment file
cp env.example .env
# Edit .env with your API keys
nano .env
- Configure your LLM provider and schema integration:
# Edit config/llm_config.yaml
llm:
  provider: "openai"  # or anthropic, gemini, grok, ollama
  model: "gpt-4"      # Model name for your provider
  api_key: ""         # Will be overridden by OPENAI_API_KEY env var

schema:
  enabled: true
  database_url: "sqlite:///:memory:"
  refresh_interval: 3600
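After installing, you can sanity-check that the editable installs are importable. The module names below are assumptions inferred from the package names (the hyphenated `schema-graph-builder` is assumed to import as `schema_graph_builder`); adjust them if your packages expose different module names.

```bash
# Quick import check (module names assumed from the package names)
python -c "import llmwrapper; print('llmwrapper OK')"
python -c "import schema_graph_builder; print('schema-graph-builder OK')"
```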
Component Status Table
Feature | Status |
---|---|
Natural language → SQL (LLM) | ✅ Complete |
SchemaManager (load/cache) | ✅ Complete |
Schema context API | ✅ Complete |
Unit tests for schema features | ✅ Complete |
Query execution | ❌ Not yet |
Schema-aware prompt/validation | ❌ Not yet |
Log Files & .gitignore
- All log files (`*.log`, `*.log.*`) and the `logs/` directory are ignored by git (see `.gitignore`).
- You do not need to manually clean up logs before committing.
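The relevant `.gitignore` entries look along these lines (check the repository's `.gitignore` for the exact patterns):

```
# Log output (ignored by git)
*.log
*.log.*
logs/
```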
Testing
# Run all tests (including schema integration)
pytest tests/
# Run with coverage
pytest tests/ --cov=src --cov-report=html
# Run specific test
pytest tests/test_database_agent.py -v
All tests are currently passing.
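If you want to add your own endpoint tests, a FastAPI TestClient test along these lines should work. It assumes `src/mcp_server.py` exposes the FastAPI instance as `app` and that `src` is importable from the project root; adjust the import to match the actual module layout.

```python
# tests/test_health_endpoint.py  (illustrative; the import path is an assumption)
from fastapi.testclient import TestClient

from src.mcp_server import app  # assumes the FastAPI instance is named `app`

client = TestClient(app)


def test_root_returns_server_info():
    response = client.get("/")
    assert response.status_code == 200
    assert "Database Agent MCP Server" in response.json()["message"]


def test_health_endpoint_responds():
    response = client.get("/health")
    assert response.status_code == 200
```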
Project Structure
database-agent/
├── README.md                    # This file
├── requirements.txt             # Python dependencies (includes schema-graph-builder)
├── env.example                  # Environment variables template
├── architecture_diagram.md      # System architecture documentation
├── config/
│   └── llm_config.yaml          # Configuration file
├── src/
│   ├── mcp_server.py            # Main MCP server (FastAPI)
│   ├── database_agent/
│   │   ├── agent.py             # Core database agent (Phase 1: SQL generator)
│   │   ├── llm_integration.py   # LLMWrapper integration
│   │   ├── schema_manager.py    # SchemaManager (Phase 2A)
│   │   └── tools/
│   │       └── query_tool.py    # MCP query tool
│   └── utils/
│       ├── config_loader.py     # Configuration management
│       └── logger.py            # Logging setup
├── tests/
│   ├── test_database_agent.py   # Unit tests (Phase 1)
│   └── test_schema_manager.py   # Unit tests (Phase 2A)
└── examples/
    └── basic_usage.py           # Usage example
API Reference
Endpoints
GET /
- Description: Server information
- Response:
{"message": "Database Agent MCP Server", "version": "1.0.0"}
GET /health
- Description: Health check
- Response: Health status with LLM status and timestamp
POST /generate-sql
- Description: Generate SQL from natural language (Phase 1: Generation only)
- Request Body:
{ "prompt": "Show me all users who made orders in the last month" }
- Response:
{ "sql_query": "SELECT u.* FROM users u JOIN orders o ON u.user_id = o.user_id WHERE o.order_date >= DATE_SUB(NOW(), INTERVAL 1 MONTH);", "explanation": "Generated SQL query for: Show me all users who made orders in the last month", "prompt": "Show me all users who made orders in the last month", "timestamp": "2024-01-15T10:30:45.123456" }
- Note: This endpoint generates SQL but doesn't execute it. Query execution will be added in Phase 2.
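For example, a client can call the endpoint with a plain HTTP request. The sketch below uses the `requests` library and assumes the server is running on the default `localhost:8000` from the configuration below.

```python
# Minimal client for POST /generate-sql (assumes the server runs on localhost:8000)
import requests

payload = {"prompt": "Show me all users who made orders in the last month"}
response = requests.post("http://localhost:8000/generate-sql", json=payload, timeout=30)
response.raise_for_status()

result = response.json()
print(result["sql_query"])     # generated SQL (not executed in this phase)
print(result["explanation"])   # human-readable explanation
```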
GET /tools
- Description: Get available MCP tools
- Response: List of available tools with schemas
Configuration
Environment Variables
Variable | Description | Default |
---|---|---|
OPENAI_API_KEY | OpenAI API key | Required |
LLM_PROVIDER | LLM provider (openai, anthropic, etc.) | openai |
LLM_MODEL | Model name | gpt-4 |
SERVER_HOST | Server host | localhost |
SERVER_PORT | Server port | 8000 |
LOG_LEVEL | Logging level | INFO |
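A `.env` built from the variables above might look like this (copy `env.example` and fill in your own values; the values shown are placeholders):

```
OPENAI_API_KEY=sk-your-key-here
LLM_PROVIDER=openai
LLM_MODEL=gpt-4
SERVER_HOST=localhost
SERVER_PORT=8000
LOG_LEVEL=INFO
```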
Configuration File
Edit `config/llm_config.yaml`:
# LLM Configuration
llm:
  provider: "openai"
  model: "gpt-4"
  api_key: ""
  timeout: 30

# Server Configuration
server:
  host: "localhost"
  port: 8000
  debug: false

# Logging Configuration
logging:
  level: "INFO"
  format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
  file: "logs/database_agent.log"
Troubleshooting
Common Issues
- LLMWrapper not found
  `ModuleNotFoundError: No module named 'llmwrapper'`
  - Install llmwrapper in development mode: `pip install -e ../llmwrapper`
  - Ensure you're in the database-agent directory when running the command
- API Key not set
  `Failed to initialize LLM: API key not provided`
  - Set the `OPENAI_API_KEY` environment variable
  - Or add it to your `.env` file
- Port already in use
  `Error: [Errno 48] Address already in use`
  - Change the port in config or use the `--port` argument
  - Kill the process using the port
- SQL not executing
  Note: This is expected behavior in Phase 1
  - Phase 1 only generates SQL, doesn't execute it
  - Query execution will be added in Phase 2
Logs
Logs are written to:
- Console output
- `logs/database_agent.log` (if configured)
Check logs for detailed error information.
Current Limitations (Phase 2A)
- No Query Execution: Generated SQL is not executed against databases
- No Schema Context: SQL generation doesn't use database schema information
- No Memory: Each request is processed independently
- No Validation: Generated SQL is not validated against actual database structure
- No Interactive Refinement: Cannot ask clarifying questions
These limitations will be addressed in future phases.
Development Roadmap
Phase 1: SQL Query Generator ✅ (Complete)
- ✅ Natural language to SQL conversion
- ✅ MCP server with REST API
- ✅ LLM integration
- ✅ Basic error handling and logging
Phase 2: Schema-Aware Generator (In Progress)
- Schema Integration: Use your `schema-graph-builder` for database context
- Relationship Detection: Automatic foreign key relationship inference
- Query Validation: Validate generated SQL against schema
- Enhanced Prompts: Schema-aware SQL generation
- Query Execution: Execute queries and return results
Phase 3: Autonomous Agent (Future)
- Memory System: Remember conversation history and user preferences
- Planning Engine: Break complex requests into executable steps
- Tool Orchestration: Multiple specialized tools beyond SQL generation
- Interactive Capabilities: Ask clarifying questions when requests are ambiguous
- Learning & Adaptation: Improve responses based on user feedback
Phase 4: Full AI Agent (Vision)
- Autonomous Decision Making: Plan and execute multi-step operations
- Context Awareness: Understand database state and user intent
- Proactive Suggestions: Suggest optimizations and insights
- Natural Conversation: Maintain context across multiple interactions
Schema Integration (Phase 2A)
Schema integration allows the agent to load and cache your actual database schema using the `schema-graph-builder` library. This enables future schema-aware SQL generation and validation.
What it does
- Loads tables, columns, and relationships from your database
- Caches schema for fast access
- Provides schema context for future LLM-based SQL generation
How to enable
- Install `schema-graph-builder` (see below)
- Add a `schema` section to your `config/llm_config.yaml`:
schema:
  enabled: true
  database_url: "postgresql://user:pass@localhost/dbname"
  refresh_interval: 3600  # seconds
- The agent will automatically load and cache your schema if enabled.
Dependency
You must install `schema-graph-builder` in development mode:
pip install -e ../schema-graph-builder
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
License
This project is licensed under the MIT License.