LocalData MCP Server
A comprehensive MCP server for databases, spreadsheets, and structured data files with security features, performance optimization, and extensive format support.
What's New in v1.3.1 🚀
Memory-Safe Query Architecture
- Intelligent Pre-Query Analysis: Automatic query complexity assessment using COUNT(*) and sample row analysis
- Memory-Bounded Streaming: Predictable memory usage with configurable limits and a streaming pipeline
- Smart Token Management: AI-optimized response sizes with automatic chunking for large datasets
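Under the hood, this amounts to estimating a result's size before running the full query. The sketch below illustrates the idea assuming a DB-API style connection; the function name, threshold, and routing labels are illustrative assumptions, not the server's internal API:

# Illustrative sketch of pre-query analysis (not the server's actual code).
def assess_query(conn, query, limit_bytes=100 * 1024 * 1024, sample_size=10):
    # Estimate result cardinality by wrapping the query in COUNT(*).
    count = conn.execute(f"SELECT COUNT(*) FROM ({query}) AS q").fetchone()[0]
    # Sample a few rows to estimate the average row size in bytes.
    rows = conn.execute(f"SELECT * FROM ({query}) AS q LIMIT {sample_size}").fetchall()
    avg_row_bytes = sum(len(str(r)) for r in rows) / max(len(rows), 1)
    # Route large results through the streaming pipeline, small ones directly.
    return "stream" if count * avg_row_bytes > limit_bytes else "direct"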
Advanced Configuration System
- Dual Configuration Mode: Simple environment variables for basic setups, powerful YAML for complex scenarios
- Hot Configuration Reload: Dynamic configuration updates without service restart
- Multi-Database Support: Granular per-database settings with timeout and memory controls
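For complex setups, all of this can live in a single YAML file. The layout below is a hedged sketch: every key name is an assumption made for illustration, not the server's documented schema:

# Hypothetical localdata.yaml (all key names are assumptions)
databases:
  analytics:
    type: postgresql
    connection: postgresql://user:pass@localhost/analytics
    query_timeout: 30        # seconds, overrides the global default
    memory_limit: 512MB      # per-database memory bound
  local:
    type: sqlite
    connection: ./data.sqlite
logging:
  format: json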
Enhanced Security & Performance
- SQL Query Validation: Rejects all non-SELECT operations
- Configurable Timeouts: Per-database and global query timeout enforcement
- Structured Logging: JSON logging with detailed query metrics and security events
- Connection Optimization: Intelligent connection pooling and resource management
Developer Experience Improvements
- Rich Response Metadata: Query execution stats, memory usage, and optimization hints
- Progressive Data Loading: Chunk-based access for massive datasets
- Enhanced Error Messages: Actionable guidance for configuration and query issues
- Backward Compatibility: 100% API compatibility with automated migration tools
Ready to upgrade? See the migration guide for step-by-step instructions.
Table of Contents
- Features
- Quick Start
- Available Tools
- Supported Data Sources
- Security Features
- Performance & Scalability
- Examples
- Testing & Quality
- Troubleshooting
- Roadmap
- Contributing
Features
Multi-Database Support
- SQL Databases: PostgreSQL, MySQL, SQLite, DuckDB
- Modern Databases: MongoDB, Redis, Elasticsearch, InfluxDB, Neo4j, CouchDB
- Spreadsheets: Excel (.xlsx/.xls), LibreOffice Calc (.ods), Apple Numbers (.numbers)
- Structured Files: CSV, TSV, JSON, YAML, TOML, XML, INI
- Analytical Formats: Parquet, Feather, Arrow, HDF5
Advanced Security
- Path Security: Restricts file access to current working directory only
- SQL Injection Prevention: Parameterized queries and safe table identifiers
- Connection Limits: Maximum 10 concurrent database connections
- Input Validation: Comprehensive validation and sanitization
Large Dataset Handling
- Query Buffering: Automatic buffering for results with 100+ rows
- Large File Support: 100MB+ files automatically use temporary SQLite storage
- Chunk Retrieval: Paginated access to large result sets
- Auto-Cleanup: 10-minute expiry with file modification detection
Developer Experience
- Clean Tool Surface: 8 essential database operation tools
- Error Handling: Detailed, actionable error messages
- Thread Safety: Concurrent operation support
- Backward Compatible: All existing APIs preserved
Quick Start
Installation
# Using pip
pip install localdata-mcp
# Using uv (recommended)
uv tool install localdata-mcp
# Development installation
git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
pip install -e .
Configuration
Add to your MCP client configuration:
{
"mcpServers": {
"localdata": {
"command": "localdata-mcp",
"env": {}
}
}
}
Docker Usage: See the Docker documentation for container deployment and configuration.
Usage Examples
Connect to Databases
# PostgreSQL
connect_database("analytics", "postgresql", "postgresql://user:pass@localhost/db")
# SQLite
connect_database("local", "sqlite", "./data.sqlite")
# CSV Files
connect_database("csvdata", "csv", "./data.csv")
# JSON Files
connect_database("config", "json", "./config.json")
# Excel Spreadsheets (all sheets)
connect_database("sales", "xlsx", "./sales_data.xlsx")
# Excel with specific sheet
connect_database("q1data", "xlsx", "./quarterly.xlsx?sheet=Q1_Sales")
# LibreOffice Calc
connect_database("budget", "ods", "./budget_2024.ods")
# Tab-separated values
connect_database("exports", "tsv", "./export_data.tsv")
# XML structured data
connect_database("config_xml", "xml", "./config.xml")
# INI configuration files
connect_database("settings", "ini", "./app.ini")
# Analytical formats
connect_database("analytics", "parquet", "./data.parquet")
connect_database("features", "feather", "./features.feather")
connect_database("vectors", "arrow", "./vectors.arrow")
Query Data
# Execute queries with automatic result formatting
execute_query("analytics", "SELECT * FROM users LIMIT 50")
# Large result sets use buffering automatically
execute_query_json("analytics", "SELECT * FROM large_table")
Handle Large Results
# Get chunked results for large datasets
get_query_chunk("analytics_1640995200_a1b2", 101, "100")
# Check buffer status
get_buffered_query_info("analytics_1640995200_a1b2")
# Manual cleanup
clear_query_buffer("analytics_1640995200_a1b2")
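To drain a large buffer, chunks can be requested in a loop until an empty chunk comes back. Treat this as a sketch: the termination condition and the process() helper are assumptions about how you consume the responses:

# Page through a buffered result 100 rows at a time (sketch).
start = 1
while True:
    chunk = get_query_chunk("analytics_1640995200_a1b2", start, "100")
    if not chunk:        # assumption: an exhausted buffer returns an empty chunk
        break
    process(chunk)       # process() stands in for your own handling logic
    start += 100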
Available Tools
Tool | Description | Use Case |
---|---|---|
connect_database | Connect to databases/files | Initial setup |
disconnect_database | Close connections | Cleanup |
list_databases | Show active connections | Status check |
execute_query | Run SQL with automatic chunking | All query needs |
next_chunk | Get next chunk of large result sets | Large data |
describe_database | Show database schema | Exploration |
describe_table | Show table structure | Analysis |
find_table | Locate tables by name | Navigation |
Supported Data Sources
Detailed Connection Guide: see the dedicated guide for setup instructions, connection strings, and security practices.
SQL Databases
- PostgreSQL: Full support with connection pooling
- MySQL: Complete MySQL/MariaDB compatibility
- SQLite: Local file and in-memory databases
- DuckDB: High-performance analytical SQL database
Modern Databases
- MongoDB: Document store with collection queries and aggregation
- Redis: High-performance key-value store
- Elasticsearch: Full-text search and analytics engine
- InfluxDB: Time-series database for metrics and IoT data
- Neo4j: Graph database for relationship queries
- CouchDB: Document-oriented database with HTTP API
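These backends use the same connect_database pattern shown under Usage Examples. The type identifiers and URIs below follow each database's standard URI scheme but are unverified assumptions; check the connection guide for the exact strings:

# MongoDB (assumed type identifier; standard URI scheme)
connect_database("docs", "mongodb", "mongodb://localhost:27017/mydb")
# Redis
connect_database("cache", "redis", "redis://localhost:6379/0")
# Elasticsearch
connect_database("search", "elasticsearch", "http://localhost:9200")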
Structured Files
Spreadsheet Formats
- Excel (.xlsx, .xls): Full multi-sheet support with automatic table creation
- LibreOffice Calc (.ods): Complete ODS support with sheet handling
- Apple Numbers (.numbers): Native support for Numbers documents
- Multi-sheet handling: Each sheet becomes a separate queryable table
Text-Based Formats
- CSV: Automatic SQLite conversion for large files
- TSV: Tab-separated values with the same features as CSV
- JSON: Nested structure flattening (illustrated after this list)
- YAML: Configuration file support
- TOML: Settings and config files
- XML: Structured XML document parsing
- INI: Configuration file format support
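As a concrete illustration of JSON flattening, nested keys become flat columns. The underscore separator in the column names is an assumption; the default table name data matches the other file-based examples:

# Nested input (./config.json):
#   {"server": {"host": "localhost", "port": 8080}}
connect_database("config", "json", "./config.json")
# Nested keys become flat columns such as server_host and server_port:
execute_query("config", "SELECT server_host, server_port FROM data")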
Analytical Formats
- Parquet: High-performance columnar data format
- Feather: Fast binary format for data interchange
- Arrow: In-memory columnar format support
- HDF5: Hierarchical data format for scientific computing
Security Features
Path Security
# ✅ Allowed - current directory and subdirectories
"./data/users.csv"
"data/config.json"
"subdir/file.yaml"
# ❌ Blocked - parent directory access
"../etc/passwd"
"../../sensitive.db"
"/etc/hosts"
SQL Injection Prevention
# ✅ Safe - parameterized queries
describe_table("mydb", "users") # Validates table name
# ❌ Blocked - malicious input
describe_table("mydb", "users; DROP TABLE users; --")
Resource Limits
- Connection Limit: Maximum 10 concurrent connections
- File Size Threshold: 100MB triggers temporary storage
- Query Buffering: Automatic for 100+ row results
- Auto-Cleanup: Buffers expire after 10 minutes
Performance & Scalability
Large File Handling
- Files over 100MB automatically use temporary SQLite storage
- Memory-efficient streaming for large datasets
- Automatic cleanup of temporary files
Query Optimization
- Results with 100+ rows automatically use buffering system
- Chunk-based retrieval for large datasets
- File modification detection for cache invalidation
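Modification detection can be as simple as remembering a file's mtime when a buffer is built and comparing it on each read. A minimal sketch of the idea:

import os

def buffer_is_fresh(path: str, mtime_at_buffering: float) -> bool:
    # A buffer derived from a file is only valid while the file is unchanged.
    return os.path.getmtime(path) == mtime_at_buffering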
Concurrency
- Thread-safe connection management
- Concurrent query execution support
- Resource pooling and limits
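A lock-guarded registry is the usual shape of such a manager. The sketch below mirrors the documented 10-connection cap; the class and method names are illustrative, not the server's internals:

import threading

class ConnectionRegistry:
    MAX_CONNECTIONS = 10   # mirrors the documented connection limit

    def __init__(self):
        self._lock = threading.Lock()
        self._connections = {}

    def add(self, name, conn):
        # Guard both the limit check and the insert with one lock.
        with self._lock:
            if len(self._connections) >= self.MAX_CONNECTIONS:
                raise RuntimeError("connection limit reached")
            self._connections[name] = conn

    def remove(self, name):
        with self._lock:
            return self._connections.pop(name, None)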
Testing & Quality
Comprehensive Test Coverage
- 68% test coverage with 500+ test cases
- Import error handling and graceful degradation
- Security vulnerability testing
- Performance benchmarking with large datasets
- Modern database connection testing
Security Validated
- Path traversal prevention
- SQL injection protection
- Resource exhaustion testing
- Malicious input handling
Performance Tested
- Large file processing
- Concurrent connection handling
- Memory usage optimization
- Query response times
API Compatibility
All existing MCP tool signatures remain 100% backward compatible. New functionality is additive only:
- All original tools work unchanged
- Enhanced responses with additional metadata
- New buffering tools for large datasets
- Improved error messages and validation
Examples
Production Examples: See the examples documentation for production-ready usage patterns and complex scenarios.
Basic Database Operations
# Connect to SQLite
connect_database("sales", "sqlite", "./sales.db")
# Explore structure
describe_database("sales")
describe_table("sales", "orders")
# Query data
execute_query("sales", "SELECT product, SUM(amount) FROM orders GROUP BY product")
Large Dataset Processing
# Connect to large CSV
connect_database("bigdata", "csv", "./million_records.csv")
# Query returns buffer info for large results
result = execute_query_json("bigdata", "SELECT * FROM data WHERE category = 'A'")
# Access results in chunks
chunk = get_query_chunk("bigdata_1640995200_a1b2", 1, "1000")
Multi-Database Analysis
# Connect multiple sources
connect_database("postgres", "postgresql", "postgresql://localhost/prod")
connect_database("config", "yaml", "./config.yaml")
connect_database("logs", "json", "./logs.json")
# Query across sources (in application logic)
user_data = execute_query("postgres", "SELECT * FROM users")
config = read_text_file("./config.yaml", "yaml")
Multi-Sheet Spreadsheet Handling
LocalData MCP Server provides comprehensive support for multi-sheet spreadsheets (Excel and LibreOffice Calc):
Automatic Multi-Sheet Processing
# Connect to Excel file - all sheets become separate tables
connect_database("workbook", "xlsx", "./financial_data.xlsx")
# Query specific sheet (table names are sanitized sheet names)
execute_query("workbook", "SELECT * FROM Q1_Sales")
execute_query("workbook", "SELECT * FROM Q2_Budget")
execute_query("workbook", "SELECT * FROM Annual_Summary")
Single Sheet Selection
# Connect to specific sheet only using ?sheet=SheetName syntax
connect_database("q1only", "xlsx", "./financial_data.xlsx?sheet=Q1 Sales")
# The data is available as the default table
execute_query("q1only", "SELECT * FROM data")
Sheet Name Sanitization
Sheet names are automatically sanitized for SQL compatibility:
Original Sheet Name | SQL Table Name |
---|---|
"Q1 Sales" | Q1_Sales |
"2024-Budget" | _2024_Budget |
"Summary & Notes" | Summary__Notes |
Discovering Available Sheets
# Connect to multi-sheet workbook
connect_database("workbook", "xlsx", "./data.xlsx")
# List all available tables (sheets)
describe_database("workbook")
# Get sample data from specific sheet
get_table_sample("workbook", "Sheet1")
Troubleshooting
For comprehensive troubleshooting guidance, see the troubleshooting guide. For common questions, check the FAQ.
Roadmap
Completed (v1.1.0)
- Spreadsheet Formats: Excel (.xlsx/.xls), LibreOffice Calc (.ods) with full multi-sheet support
- Enhanced File Formats: XML, INI, TSV support
- Analytical Formats: Parquet, Feather, Arrow support
Planned Features
- Caching Layer: Configurable query result caching
- Connection Pooling: Advanced connection management
- Streaming APIs: Real-time data processing
- Monitoring Tools: Connection and performance metrics
- Export Capabilities: Query results to various formats
Contributing
Contributions welcome! Please read our contributing guidelines for details.
Development Setup
git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -e ".[dev]"
pytest
License
MIT License - see the LICENSE file for details.
Links
- GitHub: localdata-mcp
- PyPI: localdata-mcp
- MCP Protocol: Model Context Protocol
- FastMCP: FastMCP Framework
Tags
mcp
model-context-protocol
database
postgresql
mysql
sqlite
mongodb
spreadsheet
excel
xlsx
ods
csv
tsv
json
yaml
toml
xml
ini
parquet
feather
arrow
ai
machine-learning
data-integration
python
security
performance
Made with care for the MCP Community