# LocalData MCP Server
A powerful, secure MCP server for local databases and structured text files with advanced security features and large dataset handling.
## ✨ Features
### 🗄️ Multi-Database Support
- SQL Databases: PostgreSQL, MySQL, SQLite
- Document Databases: MongoDB
- Structured Files: CSV, JSON, YAML, TOML
### 🔒 Advanced Security
- Path Security: Restricts file access to current working directory only
- SQL Injection Prevention: Parameterized queries and safe table identifiers
- Connection Limits: Maximum 10 concurrent database connections
- Input Validation: Comprehensive validation and sanitization
### 📊 Large Dataset Handling
- Query Buffering: Automatic buffering for results with 100+ rows
- Large File Support: 100MB+ files automatically use temporary SQLite storage
- Chunk Retrieval: Paginated access to large result sets
- Auto-Cleanup: 10-minute expiry with file modification detection
### 🛠️ Developer Experience
- Comprehensive Tools: 14 database operation tools
- Error Handling: Detailed, actionable error messages
- Thread Safety: Concurrent operation support
- Backward Compatible: All existing APIs preserved
## 🚀 Quick Start
### Installation

```bash
# Using pip
pip install localdata-mcp

# Using uv (recommended)
uv tool install localdata-mcp

# Development installation
git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
pip install -e .
```
### Configuration

Add to your MCP client configuration:

```json
{
  "mcpServers": {
    "localdata": {
      "command": "localdata-mcp",
      "env": {}
    }
  }
}
```
### Usage Examples

#### Connect to Databases

```python
# PostgreSQL
connect_database("analytics", "postgresql", "postgresql://user:pass@localhost/db")

# SQLite
connect_database("local", "sqlite", "./data.sqlite")

# CSV Files
connect_database("csvdata", "csv", "./data.csv")

# JSON Files
connect_database("config", "json", "./config.json")
```

#### Query Data

```python
# Execute queries with automatic result formatting
execute_query("analytics", "SELECT * FROM users LIMIT 50")

# Large result sets use buffering automatically
execute_query_json("analytics", "SELECT * FROM large_table")
```

#### Handle Large Results

```python
# Get chunked results for large datasets
get_query_chunk("analytics_1640995200_a1b2", 101, "100")

# Check buffer status
get_buffered_query_info("analytics_1640995200_a1b2")

# Manual cleanup
clear_query_buffer("analytics_1640995200_a1b2")
```
## 🔧 Available Tools
| Tool | Description | Use Case |
|------|-------------|----------|
| `connect_database` | Connect to databases/files | Initial setup |
| `disconnect_database` | Close connections | Cleanup |
| `list_databases` | Show active connections | Status check |
| `execute_query` | Run SQL (markdown output) | Small results |
| `execute_query_json` | Run SQL (JSON output) | Large results |
| `describe_database` | Show schema/structure | Exploration |
| `describe_table` | Show table details | Analysis |
| `get_table_sample` | Preview table data | Quick look |
| `get_table_sample_json` | Preview (JSON format) | Development |
| `find_table` | Locate tables by name | Navigation |
| `read_text_file` | Read structured files | File access |
| `get_query_chunk` | Paginated result access | Large data |
| `get_buffered_query_info` | Buffer status info | Monitoring |
| `clear_query_buffer` | Manual buffer cleanup | Management |
## 📋 Supported Data Sources
### SQL Databases
- PostgreSQL: Full support with connection pooling
- MySQL: Complete MySQL/MariaDB compatibility
- SQLite: Local file and in-memory databases
### Document Databases
- MongoDB: Collection queries and aggregation
### Structured Files
- CSV: Large file automatic SQLite conversion
- JSON: Nested structure flattening (see the sketch below)
- YAML: Configuration file support
- TOML: Settings and config files
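
For a sense of what nested-structure flattening means in practice, here is a minimal sketch using `pandas.json_normalize`; the server's internal flattening logic may differ:

```python
import pandas as pd

# Nested keys become dot-separated columns, so nested JSON can be
# queried like a flat table.
data = [{"user": {"name": "Ada", "roles": ["admin"]}}]
df = pd.json_normalize(data)
print(df.columns.tolist())  # ['user.name', 'user.roles']
```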
## 🛡️ Security Features
### Path Security

```python
# ✅ Allowed - current directory and subdirectories
"./data/users.csv"
"data/config.json"
"subdir/file.yaml"

# ❌ Blocked - parent directories and absolute paths
"../etc/passwd"
"../../sensitive.db"
"/etc/hosts"
```
### SQL Injection Prevention

```python
# ✅ Safe - parameterized queries and validated identifiers
describe_table("mydb", "users")  # Validates table name

# ❌ Blocked - malicious input
describe_table("mydb", "users; DROP TABLE users; --")
```
### Resource Limits
- Connection Limit: Maximum 10 concurrent connections
- File Size Threshold: 100MB triggers temporary storage
- Query Buffering: Automatic for 100+ row results
- Auto-Cleanup: Buffers expire after 10 minutes
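
Taken together, the buffering rules amount to a fixed TTL plus a freshness check against the source file. A rough sketch of that logic (names are illustrative):

```python
import os
import time

BUFFER_TTL_SECONDS = 600  # buffers expire after 10 minutes

def buffer_expired(created_at: float, source_path: str | None) -> bool:
    """True once the TTL passes, or earlier if the source file was modified."""
    if time.time() - created_at > BUFFER_TTL_SECONDS:
        return True
    if source_path is not None and os.path.getmtime(source_path) > created_at:
        return True
    return False
```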
## 📈 Performance & Scalability
### Large File Handling
- Files over 100MB automatically use temporary SQLite storage
- Memory-efficient streaming for large datasets
- Automatic cleanup of temporary files
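
One plausible shape for that pipeline, sketched with a pandas chunked reader (the server's internals may differ):

```python
import sqlite3
import tempfile

import pandas as pd

def csv_to_temp_sqlite(csv_path: str, chunk_rows: int = 50_000) -> str:
    """Stream a large CSV into a temporary SQLite file, one chunk at a time."""
    tmp = tempfile.NamedTemporaryFile(suffix=".sqlite", delete=False)
    tmp.close()
    with sqlite3.connect(tmp.name) as conn:
        for chunk in pd.read_csv(csv_path, chunksize=chunk_rows):
            chunk.to_sql("data", conn, if_exists="append", index=False)
    return tmp.name  # removed later by the auto-cleanup pass
```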
### Query Optimization
- Results with 100+ rows automatically use buffering system
- Chunk-based retrieval for large datasets
- File modification detection for cache invalidation
### Concurrency
- Thread-safe connection management
- Concurrent query execution support
- Resource pooling and limits
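
A connection cap plus thread safety suggests a lock-guarded registry along these lines (a simplified sketch; names are illustrative):

```python
import threading

MAX_CONNECTIONS = 10

class ConnectionRegistry:
    """Lock-guarded map of named connections enforcing the 10-connection cap."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._connections: dict[str, object] = {}

    def add(self, name: str, conn: object) -> None:
        with self._lock:
            if name in self._connections:
                raise ValueError(f"{name!r} is already connected")
            if len(self._connections) >= MAX_CONNECTIONS:
                raise RuntimeError("connection limit reached (10)")
            self._connections[name] = conn

    def remove(self, name: str) -> None:
        with self._lock:
            self._connections.pop(name, None)
```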
## 🧪 Testing & Quality
### ✅ 100% Test Coverage
- 100+ comprehensive test cases
- Security vulnerability testing
- Performance benchmarking
- Edge case validation
### 🔒 Security Validated
- Path traversal prevention
- SQL injection protection
- Resource exhaustion testing
- Malicious input handling
### ⚡ Performance Tested
- Large file processing
- Concurrent connection handling
- Memory usage optimization
- Query response times
## 🔄 API Compatibility
All existing MCP tool signatures remain 100% backward compatible. New functionality is additive only:
- ✅ All original tools work unchanged
- ✅ Enhanced responses with additional metadata
- ✅ New buffering tools for large datasets
- ✅ Improved error messages and validation
## 📚 Examples
### Basic Database Operations

```python
# Connect to SQLite
connect_database("sales", "sqlite", "./sales.db")

# Explore structure
describe_database("sales")
describe_table("sales", "orders")

# Query data
execute_query("sales", "SELECT product, SUM(amount) FROM orders GROUP BY product")
```
### Large Dataset Processing

```python
# Connect to large CSV
connect_database("bigdata", "csv", "./million_records.csv")

# Query returns buffer info for large results
result = execute_query_json("bigdata", "SELECT * FROM data WHERE category = 'A'")

# Access results in chunks
chunk = get_query_chunk("bigdata_1640995200_a1b2", 1, "1000")
```
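
To walk an entire buffered result set, you can page until a chunk comes back empty. A minimal sketch; the buffer ID comes from the initial query response, and the shape of the chunk payload here is an assumption:

```python
# Hypothetical paging loop over a buffered result set.
start, size = 1, 1000
while True:
    chunk = get_query_chunk("bigdata_1640995200_a1b2", start, str(size))
    rows = chunk.get("rows", [])  # payload field name is illustrative
    if not rows:
        break
    handle(rows)                  # your processing function
    start += size
```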
### Multi-Database Analysis

```python
# Connect multiple sources
connect_database("postgres", "postgresql", "postgresql://localhost/prod")
connect_database("config", "yaml", "./config.yaml")
connect_database("logs", "json", "./logs.json")

# Query across sources (in application logic; see the sketch below)
user_data = execute_query("postgres", "SELECT * FROM users")
config = read_text_file("./config.yaml", "yaml")
```
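
The cross-source join itself happens client-side. A sketch under the assumption that the query results load cleanly into DataFrames; the column names here are hypothetical:

```python
import pandas as pd

# Combine the PostgreSQL result with the JSON log data in application code.
users = pd.DataFrame(user_data)       # from execute_query above
events = pd.read_json("./logs.json")  # or go through the "logs" connection
report = users.merge(events, left_on="id", right_on="user_id", how="left")
```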
## 🚧 Roadmap
- Enhanced File Formats: Excel, Parquet support
- Caching Layer: Configurable query result caching
- Connection Pooling: Advanced connection management
- Streaming APIs: Real-time data processing
- Monitoring Tools: Connection and performance metrics
## 🤝 Contributing

Contributions welcome! Please read our contributing guidelines for details.
### Development Setup

```bash
git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -e ".[dev]"
pytest
```
## 📄 License

MIT License - see the LICENSE file for details.
## 🔗 Links

- GitHub: [ChrisGVE/localdata-mcp](https://github.com/ChrisGVE/localdata-mcp)
- PyPI: [localdata-mcp](https://pypi.org/project/localdata-mcp/)
- MCP Protocol: Model Context Protocol
- FastMCP: FastMCP Framework
## 📖 Additional Resources

- Common questions and troubleshooting
- Comprehensive problem resolution
- Production-ready usage patterns
- Technical deep dive and use cases
## 🤔 Need Help?

- Issues: [GitHub Issues](https://github.com/ChrisGVE/localdata-mcp/issues)
- Discussions: [GitHub Discussions](https://github.com/ChrisGVE/localdata-mcp/discussions)
- Email: Available in GitHub profile
- Community: Join MCP community forums
## 🏷️ Tags

`mcp` `model-context-protocol` `database` `postgresql` `mysql` `sqlite` `mongodb` `csv` `json` `yaml` `toml` `ai` `machine-learning` `data-integration` `python` `security` `performance`
Made with ❤️ for the MCP Community