applied-ai-systems/aas-lancedb-mcp
The AAS LanceDB MCP Server is an advanced server designed for managing datastores with integrated sentence transformers for text embedding.
AAS LanceDB MCP Server
A Model Context Protocol (MCP) server that gives AI agents database-like operations over LanceDB, with embeddings generated automatically by the BGE-M3 multilingual model.
Why This MCP Server?
- Database-like Interface: Works like SQLite MCP - create tables, CRUD operations, migrations
- Automatic Embeddings: BGE-M3 generates 1024D multilingual embeddings for searchable text
- Semantic Search: Natural language search across your data using vector similarity
- Rich Resources: Dynamic database inspection (schemas, samples, statistics)
- Intelligent Prompts: AI guidance for schema design, optimization, troubleshooting
- Safe Migrations: Built-in table migration with validation and automatic backups
- Multilingual: BGE-M3 provides excellent performance across 100+ languages
Quick Start
Install & Run with uvx (Recommended)
# Run directly without installation
uvx aas-lancedb-mcp --help
# Or install globally
uv tool install aas-lancedb-mcp
aas-lancedb-mcp --version
Install from Source
git clone https://github.com/applied-ai-systems/aas-lancedb-mcp.git
cd aas-lancedb-mcp
uv tool install .
MCP Capabilities Overview
10 Database Tools
Tool | Purpose | Example |
---|---|---|
create_table | Create tables with schema | Create products table with searchable descriptions |
list_tables | Show all tables | Get overview of database contents |
describe_table | Get table schema & info | Understand table structure and metadata |
drop_table | Delete tables | Remove unused tables |
insert | Add data (auto-embeddings) | Insert product with searchable description |
select | Query with filtering/sorting | Find products by price range |
update | Modify data (auto-embeddings) | Update product info with new description |
delete | Remove rows by conditions | Delete discontinued products |
search | Semantic text search | "Find sustainable products" → matches related items |
migrate_table | Safe schema changes | Add columns or change structure safely |
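Each tool in the table above is reached through MCP's standard tools/call method, which is a JSON-RPC 2.0 request carrying the tool name and its arguments. The sketch below wraps a shorthand of the form {"tool": ..., "arguments": ...} in that envelope. The helper name to_tools_call and the id counter are illustrative, and the empty argument object for list_tables is an assumption rather than something this README specifies.

```python
import json
from itertools import count

_ids = count(1)  # illustrative request-id generator


def to_tools_call(shorthand: dict) -> dict:
    """Wrap {"tool": ..., "arguments": ...} in an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {
            "name": shorthand["tool"],
            "arguments": shorthand.get("arguments", {}),
        },
    }


# Assumption: list_tables accepts an empty argument object.
request = to_tools_call({"tool": "list_tables", "arguments": {}})
print(json.dumps(request, indent=2))
```

An MCP client library normally builds this envelope for you; the sketch is only meant to show what travels over the wire.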
Dynamic Resources
Resources provide AI agents with real-time database insights:
- lancedb://overview - Complete database statistics and table summary
- lancedb://tables/{name}/schema - Table schema, columns, searchable fields
- lancedb://tables/{name}/sample - Sample data for understanding contents
- lancedb://tables/{name}/stats - Column statistics, data quality metrics
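The resource URIs above follow a simple pattern, so a client can build and take them apart with the standard library. This is a minimal sketch: the helper names table_resource_uri and parse_resource_uri are hypothetical, and the code assumes table names need no escaping.

```python
from urllib.parse import urlparse

RESOURCE_KINDS = ("schema", "sample", "stats")


def table_resource_uri(table: str, kind: str) -> str:
    """Build a per-table resource URI such as lancedb://tables/products/schema."""
    if kind not in RESOURCE_KINDS:
        raise ValueError(f"unknown resource kind: {kind!r}")
    return f"lancedb://tables/{table}/{kind}"


def parse_resource_uri(uri: str) -> tuple[str, ...]:
    """Split a lancedb:// URI into its path segments."""
    parsed = urlparse(uri)
    if parsed.scheme != "lancedb":
        raise ValueError(f"not a lancedb URI: {uri!r}")
    # urlparse treats the first segment after "//" as the network location.
    parts = [parsed.netloc, *parsed.path.strip("/").split("/")]
    return tuple(p for p in parts if p)


print(table_resource_uri("products", "sample"))
print(parse_resource_uri("lancedb://tables/products/schema"))
```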
5 Intelligent Prompts
AI-powered guidance for database operations:
- analyze_table - Generate insights about data patterns and quality
- design_schema - Help design optimal table schemas for use cases
- optimize_queries - Performance optimization recommendations
- troubleshoot_performance - Diagnose and solve database issues
- migration_planning - Plan safe schema migrations step-by-step
Usage Examples
Creating a Product Catalog
{
  "tool": "create_table",
  "arguments": {
    "schema": {
      "name": "products",
      "columns": [
        {"name": "title", "type": "text", "required": true, "searchable": true},
        {"name": "description", "type": "text", "searchable": true},
        {"name": "price", "type": "float", "required": true},
        {"name": "category", "type": "text", "required": true},
        {"name": "metadata", "type": "json"}
      ],
      "description": "E-commerce product catalog with semantic search"
    }
  }
}
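When schemas are generated by a script rather than written by hand, it can help to build the create_table payload from typed objects and check it before sending. The sketch below mirrors the JSON shape shown above. The Column class and the create_table_call helper are hypothetical conveniences, not part of this server, and omitting false flags assumes the server treats a missing required or searchable key as false.

```python
import json
from dataclasses import dataclass


@dataclass
class Column:
    name: str
    type: str
    required: bool = False
    searchable: bool = False

    def to_dict(self) -> dict:
        # Emit only flags that are set, matching the example payload's shape.
        d = {"name": self.name, "type": self.type}
        if self.required:
            d["required"] = True
        if self.searchable:
            d["searchable"] = True
        return d


def create_table_call(name: str, columns: list[Column], description: str = "") -> dict:
    names = [c.name for c in columns]
    if len(names) != len(set(names)):
        raise ValueError("duplicate column names")
    schema = {"name": name, "columns": [c.to_dict() for c in columns], "description": description}
    return {"tool": "create_table", "arguments": {"schema": schema}}


columns = [
    Column("subject", "text", required=True, searchable=True),
    Column("priority", "text"),
]
call = create_table_call("tickets", columns, "Support tickets")
searchable = [c.name for c in columns if c.searchable]  # columns that would be embedded
print(json.dumps(call, indent=2))
```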
Adding Products (Embeddings Generated Automatically)
{
  "tool": "insert",
  "arguments": {
    "data": {
      "table_name": "products",
      "data": {
        "title": "Eco-Friendly Water Bottle",
        "description": "Sustainable stainless steel water bottle with insulation",
        "price": 24.99,
        "category": "sustainability",
        "metadata": {"brand": "EcoLife", "material": "stainless_steel"}
      }
    }
  }
}
Semantic Search (Natural Language)
{
  "tool": "search",
  "arguments": {
    "query": {
      "table_name": "products",
      "query": "environmentally friendly drinking containers",
      "limit": 5
    }
  }
}
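Conceptually, a semantic search like the one above embeds the query text and ranks rows by how close their stored vectors are to it. The toy sketch below illustrates that ranking step only. The 3-dimensional vectors and the extra product rows are made up for illustration (BGE-M3 vectors have 1024 dimensions), and cosine similarity is an assumption, since this README does not state which distance metric the server uses.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec: list[float], rows: list[tuple[str, list[float]]], limit: int = 5):
    """Return the `limit` most similar rows as (score, title) pairs, best first."""
    scored = [(cosine(query_vec, vec), title) for title, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Made-up vectors standing in for real embeddings.
rows = [
    ("Eco-Friendly Water Bottle", [0.9, 0.1, 0.0]),
    ("Leather Office Chair", [0.1, 0.9, 0.2]),
    ("Bamboo Travel Mug", [0.8, 0.2, 0.1]),
]
query_vec = [1.0, 0.0, 0.0]  # pretend embedding of the query text
results = top_k(query_vec, rows, limit=2)
for score, title in results:
    print(f"{score:.3f}  {title}")
```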
Database Inspection (Resources)
{
  "resource": "lancedb://tables/products/sample"
}
Returns sample product data for AI agents to understand the table structure.
AI Guidance (Prompts)
{
  "prompt": "design_schema",
  "arguments": {
    "use_case": "Customer support ticket system",
    "data_types": "ticket text, priority levels, timestamps",
    "search_requirements": "semantic search across ticket descriptions"
  }
}
Returns AI-generated recommendations for optimal table design.
Configuration
Claude Desktop Setup
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "aas-lancedb": {
      "command": "aas-lancedb-mcp",
      "args": ["--db-uri", "~/my_database"],
      "env": {
        "EMBEDDING_MODEL": "BAAI/bge-m3"
      }
    }
  }
}
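If the config file already lists other servers, merging the entry programmatically avoids overwriting them. The following is a sketch under stated assumptions: add_server is a hypothetical helper, the caller supplies the path to claude_desktop_config.json (its location varies by platform), and the "~" is expanded up front because a client that launches the command without a shell may pass it through literally.

```python
import json
import os
import tempfile
from pathlib import Path


def add_server(config_path: Path, db_uri: str) -> dict:
    """Merge an aas-lancedb entry into an MCP client config, keeping other servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["aas-lancedb"] = {
        "command": "aas-lancedb-mcp",
        # Expand "~" here rather than relying on the client or server to do it.
        "args": ["--db-uri", os.path.expanduser(db_uri)],
        "env": {"EMBEDDING_MODEL": "BAAI/bge-m3"},
    }
    config_path.write_text(json.dumps(config, indent=2))
    return config


# Demo against a temporary file so no real configuration is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude_desktop_config.json"
    path.write_text(json.dumps({"mcpServers": {"other": {"command": "other-server"}}}))
    merged = add_server(path, "~/my_database")
    on_disk = json.loads(path.read_text())
```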
Environment Variables
export LANCEDB_URI="./my_database" # Database location
export EMBEDDING_MODEL="BAAI/bge-m3" # Embedding model (default)
export EMBEDDING_DEVICE="cpu" # cpu or cuda
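A wrapper script or test harness might read the same variables before launching the server. The sketch below shows one way to do that. Only the BAAI/bge-m3 default comes from this README; the fallback values for LANCEDB_URI and EMBEDDING_DEVICE, and the Settings and load_settings names, are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    db_uri: str
    embedding_model: str
    embedding_device: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the three documented variables, falling back to assumed defaults."""
    env = os.environ if env is None else env
    device = env.get("EMBEDDING_DEVICE", "cpu")  # assumed default
    if device not in ("cpu", "cuda"):
        raise ValueError(f"EMBEDDING_DEVICE must be 'cpu' or 'cuda', got {device!r}")
    return Settings(
        db_uri=env.get("LANCEDB_URI", "./my_database"),  # assumed default
        embedding_model=env.get("EMBEDDING_MODEL", "BAAI/bge-m3"),  # documented default
        embedding_device=device,
    )


defaults = load_settings({})
gpu = load_settings({"EMBEDDING_DEVICE": "cuda", "LANCEDB_URI": "/data/db"})
print(defaults)
print(gpu)
```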
Command Line Options
aas-lancedb-mcp --help # Show help
aas-lancedb-mcp --version # Show version
aas-lancedb-mcp --db-uri ./my_db # Custom database path
Architecture
Enhanced MCP Server Architecture
├── Tools (10) - Database operations (CRUD, search, migrate)
├── Resources (dynamic) - Real-time database introspection
├── Prompts (5) - AI guidance for database tasks
├── BGE-M3 Embeddings - Automatic 1024D multilingual vectors
├── Safe Migrations - Schema evolution with validation
└── Rich Metadata - Column types, constraints, statistics
Key Technical Features
- Database-like Interface: Familiar SQL-style operations hiding vector complexity
- Automatic Embedding Generation: BGE-M3 creates vectors for searchable text columns
- Hybrid Search: Combine semantic similarity with traditional filtering
- Dynamic Resources: Real-time database inspection for AI agents
- Contextual Prompts: AI guidance using actual database state
- Migration Safety: Backup, validate, and rollback capabilities
- Multilingual: BGE-M3 excels across 100+ languages
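The backup, validate, and rollback pattern named under Migration Safety can be illustrated on an in-memory table. This is a generic sketch of the pattern, not the server's migrate_table implementation: the table layout (a dict with "columns" and "rows") and the migrate function are hypothetical.

```python
import copy


def migrate(table: dict, add_columns: dict, validate) -> dict:
    """Add columns with default values; restore the backup if anything fails."""
    backup = copy.deepcopy(table)  # 1. back up the current state
    try:
        for name, default in add_columns.items():  # 2. apply the change
            if name in table["columns"]:
                raise ValueError(f"column already exists: {name}")
            table["columns"].append(name)
            for row in table["rows"]:
                row[name] = default
        if not validate(table):  # 3. validate the result
            raise ValueError("validation failed")
        return table
    except Exception:
        table.clear()  # 4. roll back to the backup
        table.update(backup)
        raise


def rows_match_columns(t: dict) -> bool:
    return all(set(row) == set(t["columns"]) for row in t["rows"])


table = {"columns": ["title", "price"], "rows": [{"title": "Bottle", "price": 24.99}]}
migrate(table, {"in_stock": True}, rows_match_columns)

try:
    migrate(table, {"price": 0.0}, rows_match_columns)  # conflicts with an existing column
except ValueError as exc:
    print(f"rolled back: {exc}")
```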
Development & Testing
# Clone and setup
git clone https://github.com/applied-ai-systems/aas-lancedb-mcp.git
cd aas-lancedb-mcp
# Install dependencies
uv sync --all-extras
# Run tests
uv run pytest
# Run tests with coverage
uv run pytest --cov=src --cov-report=term-missing
# Format and lint
uv run ruff format .
uv run ruff check .
# Test CLI
uv run aas-lancedb-mcp --help
Performance & Scalability
- BGE-M3 Embeddings: 1024 dimensions, excellent multilingual performance
- LanceDB Backend: Columnar vector database optimized for scale
- Efficient Operations: Automatic embedding caching and batch processing
- Memory Management: Lazy loading and streaming for large datasets
- Search Performance: HNSW indexing for fast vector similarity search
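To make the caching and batching idea concrete, here is a minimal sketch of an embedder wrapper that computes each distinct text once and sends uncached texts to the model in fixed-size batches. It is not the server's code: CachedEmbedder and batched are hypothetical names, and fake_embed stands in for a real model call (in practice embed_batch might wrap a Sentence Transformers model's encode method).

```python
from typing import Callable, Iterator


def batched(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


class CachedEmbedder:
    def __init__(self, embed_batch: Callable[[list[str]], list[list[float]]], batch_size: int = 32):
        self._embed_batch = embed_batch
        self._batch_size = batch_size
        self._cache: dict[str, list[float]] = {}

    def embed(self, texts: list[str]) -> list[list[float]]:
        # Deduplicate uncached texts while preserving their order.
        missing = list(dict.fromkeys(t for t in texts if t not in self._cache))
        for chunk in batched(missing, self._batch_size):
            for text, vector in zip(chunk, self._embed_batch(chunk)):
                self._cache[text] = vector
        return [self._cache[t] for t in texts]


calls: list[int] = []  # records the size of each batch sent to the "model"


def fake_embed(batch: list[str]) -> list[list[float]]:
    calls.append(len(batch))
    return [[float(len(text))] for text in batch]


embedder = CachedEmbedder(fake_embed, batch_size=2)
first = embedder.embed(["a", "bb", "ccc", "a"])
second = embedder.embed(["bb", "dddd"])  # "bb" is served from the cache
```

An unbounded dict is the simplest cache; a long-running process would likely want a size limit or eviction policy.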
Contributing
- Fork the repository
- Create feature branch (git checkout -b feature/amazing-feature)
- Make changes with tests (pytest tests/)
- Format code (uv run ruff format .)
- Submit Pull Request
License
MIT License - see the license file in the repository for details.
Acknowledgments
- LanceDB - High-performance columnar vector database
- BGE-M3 - State-of-the-art multilingual embeddings
- Model Context Protocol - Standardized AI tool integration
- Sentence Transformers - Easy-to-use embedding framework
Related MCP Projects
- MCP Servers - Official MCP server collection
- FastMCP - Fast Pythonic MCP framework
- SQLite MCP - Database MCP inspiration
Ready to supercharge your AI agents with powerful database capabilities?
uvx aas-lancedb-mcp --help