# Databricks MCP Server
A comprehensive Model Context Protocol (MCP) server that provides seamless integration between Claude AI and Databricks workspaces. This server enables natural language interaction with Databricks resources including notebooks, clusters, tables, and SQL warehouses across multiple workspace connections.
> 🐳 **Docker-based for complete cross-platform compatibility!** Works identically on Windows, WSL, Linux, and macOS.
## 🚀 Quick Start

### Prerequisites

- **Docker** installed (Get Docker), for Docker-based installation
- **Python 3.10+**, for pip-based installation (alternative to Docker)
- **Databricks workspace access** with a personal access token
- **Git** for cloning the repository

### Setup

```bash
# Clone the repository
git clone <repository-url>
cd databricks-mcp

# Create your credentials file
cp databricks_connections_example.json databricks_connections.json
# Edit databricks_connections.json with your Databricks host and token

# Register with Claude Code (builds image and registers automatically)
./register-mcp-claude-code.sh
```

That's it! The registration script:

- ✅ Builds the Docker image
- ✅ Creates a wrapper script that finds your config dynamically
- ✅ Registers globally, so it works from any terminal
- ✅ No need to re-register when opening new terminals

**Testing manually:**

```bash
# Test the server directly
docker run --rm -i \
  -v "$(pwd)/databricks_connections.json:/app/databricks_connections.json:ro" \
  databricks-mcp:latest
```

> **Windows PowerShell:** replace `$(pwd)` with `${PWD}`.
> **Windows Git Bash:** use `$(pwd)` as shown above.
## 📋 Features

### Core Capabilities

- **Multi-workspace support**: Connect to multiple Databricks workspaces simultaneously
- **Notebook management**: Create, list, and read notebooks with natural language
- **Cluster operations**: Monitor and manage compute clusters
- **Data exploration**: Browse catalogs, schemas, and tables intuitively
- **SQL execution**: Run queries on SQL warehouses with automatic warehouse selection
- **Table introspection**: Get detailed schema and metadata information

### Architecture

- **FastMCP framework**: High-performance MCP server implementation
- **Databricks SDK**: Official SDK for robust workspace integration
- **Docker containerization**: Complete cross-platform compatibility
- **Zero host dependencies**: No Python, pip, or uv installation required on the host
## ⚙️ Configuration

### Databricks Credentials Setup

Create a `databricks_connections.json` file in the project root with your workspace credentials:

```json
{
  "connections": {
    "default": {
      "name": "default",
      "host": "https://your-workspace.azuredatabricks.net/",
      "token": "dapi1234567890abcdef"
    },
    "staging": {
      "name": "staging",
      "host": "https://staging-workspace.azuredatabricks.net/",
      "token": "dapi0987654321fedcba"
    }
  }
}
```

> **Note:** This file is automatically ignored by git (see `.gitignore`). Never commit credentials to version control.
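If you script against this file yourself, a small stdlib-only loader can catch malformed entries early. This is an illustrative sketch, not part of the server; the `load_connections` helper and its checks are assumptions based on the schema shown above:

```python
import json
from pathlib import Path

REQUIRED_KEYS = {"name", "host", "token"}


def load_connections(path: str = "databricks_connections.json") -> dict:
    """Load the connections file and sanity-check each entry's shape."""
    data = json.loads(Path(path).read_text())
    connections = data.get("connections", {})
    for name, conn in connections.items():
        missing = REQUIRED_KEYS - conn.keys()
        if missing:
            raise ValueError(f"connection '{name}' is missing keys: {sorted(missing)}")
        if not conn["host"].startswith("https://"):
            raise ValueError(f"connection '{name}' host must start with https://")
    return connections
```

A missing `token` or a plain-`http` host then fails fast with a clear error instead of surfacing later as an opaque authentication failure.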
## 📦 Installation Methods

### Method 1: Docker (Recommended)

**Best for:** Cross-platform compatibility, isolated environments, production use

See the Quick Start above for Docker-based installation.

**Pros:**

- ✅ Works identically on Windows, WSL, Linux, and macOS
- ✅ No Python installation required on the host
- ✅ Complete isolation from the host system
- ✅ Easy to distribute and reproduce

### Method 2: pip Installation

**Best for:** Development, Python-native workflows, debugging

```bash
# Clone the repository
git clone <repository-url>
cd databricks-mcp

# Create your credentials file
cp databricks_connections_example.json databricks_connections.json
# Edit databricks_connections.json with your host and token

# Install in development mode
pip install -e .

# Verify installation
databricks-mcp --help
```

**Configure Claude Desktop for the pip installation:**

Edit your Claude Desktop config file (see locations in the Configuration section). Make sure `databricks_connections.json` is in the project directory where you installed the package.

```json
{
  "mcpServers": {
    "databricks": {
      "command": "databricks-mcp",
      "args": []
    }
  }
}
```

See `claude_desktop_config_pip.json` for a complete example.

**Configure Claude Code CLI for the pip installation:**

```bash
# Option 1: Global registration
claude mcp add -s user databricks databricks-mcp

# Option 2: Project-local .mcp.json
# See .mcp_example.json for configuration examples
```

**Pros:**

- ✅ Faster startup (no Docker overhead)
- ✅ Easier debugging with breakpoints
- ✅ Direct access to source code
- ✅ Native Python tooling support

**Cons:**

- ❌ Requires Python 3.10+ on the host
- ❌ May hit platform-specific dependency issues
- ❌ Needs virtual environment management

### Method 3: PyPI Installation (Future)

Once published to PyPI:

```bash
pip install databricks-mcp
databricks-mcp
```
## 🛠️ Configuration

### Claude Desktop Setup

1. **Locate your Claude Desktop config file:**
   - Windows: `%APPDATA%\Claude\claude_desktop_config.json`
   - macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
   - Linux: `~/.config/Claude/claude_desktop_config.json`

2. **Add this configuration:**

   ```json
   {
     "mcpServers": {
       "databricks": {
         "command": "docker",
         "args": [
           "run",
           "--rm",
           "-i",
           "-v",
           "/absolute/path/to/databricks-mcp/databricks_connections.json:/app/databricks_connections.json:ro",
           "databricks-mcp:latest"
         ]
       }
     }
   }
   ```

3. **Update the volume path:**
   - Windows: `C:\\Users\\YourName\\path\\to\\databricks-mcp\\databricks_connections.json:/app/databricks_connections.json:ro`
   - macOS: `/Users/yourname/path/to/databricks-mcp/databricks_connections.json:/app/databricks_connections.json:ro`
   - Linux/WSL: `/home/yourname/path/to/databricks-mcp/databricks_connections.json:/app/databricks_connections.json:ro`

4. **Restart Claude Desktop** and verify under Settings → Developer → MCP Servers.
### Claude Code CLI Setup

#### Option 1: Automated Registration with Wrapper (Recommended)

```bash
# Run the registration script (builds image and registers globally)
./register-mcp-claude-code.sh
```

**How it works:**

- Builds the Docker image
- Creates a wrapper script (`databricks-mcp-wrapper.sh`) that dynamically finds your config file
- Registers the wrapper globally (user scope)
- Works from any terminal, with no need to re-register

**This is the recommended approach because:**

- ✅ **One-time setup**: Register once, works everywhere
- ✅ **Dynamic config resolution**: The wrapper finds `databricks_connections.json` relative to itself
- ✅ **Portable**: Different users can clone the repo anywhere
- ✅ **No hardcoded paths**: The config location is determined at runtime, not at registration time
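The "find the config relative to itself" trick is worth understanding because it's what makes the registration location-independent. The actual wrapper is a shell script, so the Python below is only an illustration of the same technique, with a hypothetical helper name:

```python
from pathlib import Path


def config_path_for(script_path: str) -> Path:
    """Resolve databricks_connections.json next to the given script,
    regardless of the caller's current working directory."""
    script_dir = Path(script_path).resolve().parent
    return script_dir / "databricks_connections.json"
```

In a shell wrapper, the equivalent idiom is typically something like `CONFIG="$(cd "$(dirname "$0")" && pwd)/databricks_connections.json"`: the key point is deriving the config path from the script's own location, not from `$PWD`.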
#### Option 2: Manual Registration (Advanced)

If you prefer manual setup or need custom configuration:

```bash
# Build the Docker image
docker build -t databricks-mcp:latest .

# Make the wrapper executable
chmod +x databricks-mcp-wrapper.sh

# Register the wrapper
claude mcp add -s user databricks "$(pwd)/databricks-mcp-wrapper.sh"
```

#### Option 3: Project-Specific Setup

Create or edit `.mcp.json` in your project directory. See `.mcp_example.json` for detailed examples of:

- Wrapper script configuration (recommended)
- Direct Docker configuration
- pip-based configuration
- Platform-specific path examples
## 🔧 Available Tools

### Connection Management

- `list_databricks_connections()`: List all configured workspace connections
- `add_databricks_connection(name, host, token)`: Add a new workspace connection at runtime

### Notebook Operations

- `create_notebook(path, language, content, connection_name)`: Create new notebooks (Python, SQL, Scala, R)
- `list_notebooks(path, connection_name)`: Browse workspace directories and notebooks
- `get_notebook_content(path, connection_name)`: Read notebook source code

### Cluster Management

- `list_clusters(connection_name)`: View all clusters with status, configuration, and worker counts

### Data Catalog Operations

- `list_catalogs(connection_name)`: Browse all available Unity Catalog instances
- `list_schemas(catalog_name, connection_name)`: Explore schemas within catalogs
- `list_tables(catalog_name, schema_name, connection_name)`: View tables within schemas
- `get_table_info(table_name, catalog_name, schema_name, connection_name)`: Get detailed table metadata including columns, types, and comments

### SQL Execution

- `execute_sql_query(query, warehouse_id, connection_name)`: Run SQL queries with automatic warehouse selection and result formatting
- `list_sql_warehouses(connection_name)`: View available SQL compute resources
## 💬 Usage Examples

### Creating Notebooks

> Claude, create a Python notebook at '/Users/myname/data_analysis.py' with some basic pandas data exploration code for analyzing customer data.

### Data Exploration

> Claude, show me all the catalogs available, then list the tables in the 'sales' schema of the 'production' catalog, and describe the structure of the 'customers' table.

### SQL Queries

> Claude, run a query to get the top 10 customers by total purchase amount from the sales.customers table, and show me which SQL warehouse was used.

### Multi-Workspace Operations

> Claude, add a connection to our staging environment at 'https://staging.azuredatabricks.net/' called 'staging', then compare the table schemas between production and staging for the users table.

### Cluster Monitoring

> Claude, show me the status of all clusters and identify any that are currently running but might be idle.
## 🐛 Troubleshooting

### Docker Issues

**Problem:** "Docker not found" or "command not found: docker"

- **Solution:** Install Docker:
  - Windows/macOS: Docker Desktop
  - Linux: Docker Engine
- **Verify:** `docker --version`

**Problem:** "Cannot connect to Docker daemon"

- **Windows/macOS:** Launch the Docker Desktop application
- **Linux:** `sudo systemctl start docker`

**Problem:** "Permission denied" on Linux

- **Solution:** Add your user to the docker group:

  ```bash
  sudo usermod -aG docker $USER
  newgrp docker
  ```

**Problem:** Build fails

- **Check:** Internet connection
- **Try:** `docker pull python:3.11-slim` to pre-pull the base image
### Configuration Issues

**Problem:** "Server disconnected" or "Connection failed"

- **Check:** The Docker image exists: `docker images | grep databricks-mcp`
- **Rebuild:** `docker build -t databricks-mcp:latest .`
- **Verify:** The config uses an absolute path to `databricks_connections.json`

**Problem:** Volume mount errors on Windows

- **Use:** Double backslashes in JSON: `C:\\Users\\Name\\...`
- **Or:** Forward slashes: `C:/Users/Name/...`

**Problem:** "File not found" for `databricks_connections.json`

- **Check:** The file exists: `ls databricks_connections.json`
- **Verify:** You are using an absolute path in the configuration
- **Note:** Relative paths don't work with Docker volumes
### Claude Code CLI Issues

**Problem:** "Need to re-register every time I open a new terminal"

- **Root cause:** Using a project-local `.mcp.json` instead of global registration
- **Solution:** Run `./register-mcp-claude-code.sh`, which registers with the `-s user` flag (global scope)
- **Verify:** Run `claude mcp list` from any directory; it should show the databricks server
- **Note:** With the wrapper script approach you only register once!

**Problem:** Registration script fails

- **Solution:** `chmod +x register-mcp-claude-code.sh`
- **Check:** Docker is running
- **Check:** `databricks-mcp-wrapper.sh` exists in the project directory
- **Alternative:** Use manual registration

**Problem:** Server not showing in `claude mcp list`

- **Solution:** Re-run `./register-mcp-claude-code.sh`
- **Check:** Run the command from a different directory; if the server only shows up in the project dir, it isn't globally registered
- **Clean:** `claude mcp remove -s user databricks`, then re-register

**Problem:** "databricks_connections.json not found" error

- **Root cause:** The wrapper script can't find the config file
- **Solution:** Ensure `databricks_connections.json` exists in the same directory as `databricks-mcp-wrapper.sh`
- **Check:** `ls -la databricks_connections.json` in the project root
- **Verify:** Run the wrapper directly: `./databricks-mcp-wrapper.sh`
### Authentication Issues

**Problem:** "Invalid credentials"

- **Check:** `databricks_connections.json` exists and is mounted correctly
- **Verify:** The host URL format: `https://workspace.azuredatabricks.net/`
- **Test:** The token hasn't expired
- **Debug:** Run the container directly:

  ```bash
  docker run --rm -i \
    -v "$(pwd)/databricks_connections.json:/app/databricks_connections.json:ro" \
    databricks-mcp:latest
  ```
### General Issues

**Problem:** Code changes not reflected

- **Solution:** Rebuild the image after changes: `docker build -t databricks-mcp:latest .`
- **Clean build:** `docker build --no-cache -t databricks-mcp:latest .`

**Problem:** Multiple server instances

- **Solution:** Use unique names: `databricks-prod`, `databricks-staging`
- **Check:** Claude Desktop settings or `claude mcp list`
## 🔒 Security Best Practices

### Credential Management

- **Never commit tokens** to version control
- **Rotate tokens regularly** following your organization's security policy
- **Consider service principals** for production deployments

### Access Control

- **Principle of least privilege:** Grant the minimum necessary permissions
- **Separate environments:** Use different tokens for dev/staging/prod
- **Monitor usage:** Review MCP server access logs regularly

### Configuration Security

- **Restrict file permissions** on configuration files containing tokens
- **Use workspace-scoped tokens** when possible
- **Implement token expiration policies**
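To make "restrict file permissions" concrete: on Linux/macOS the credentials file should usually be readable only by its owner (`chmod 600 databricks_connections.json`). The stdlib-only check below is an illustrative sketch, not part of the project; the `is_owner_only` helper is a hypothetical name:

```python
import os
import stat


def is_owner_only(path: str) -> bool:
    """True if no group/other permission bits are set (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Mask off the owner bits; anything left means group/other access.
    return mode & 0o077 == 0
```

A check like this could run at server startup to warn when the token file is world- or group-readable (the check is a POSIX-permissions concept and does not apply to Windows ACLs).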
## 📁 Project Structure

```text
databricks-mcp/
├── src/
│   └── databricks_mcp/
│       ├── __init__.py
│       └── server.py                    # Main MCP server implementation
├── Dockerfile                           # Docker container definition
├── docker-compose.yml                   # Docker Compose configuration
├── databricks-mcp-wrapper.sh            # Wrapper script for dynamic config resolution
├── register-mcp-claude-code.sh          # Automated registration script
├── pyproject.toml                       # Python dependencies and package metadata
├── README.md                            # This file
│
├── .mcp.json                            # Your Claude Code CLI config (gitignored)
├── .mcp_example.json                    # Project-local config example
├── .gitignore                           # Git ignore rules
│
├── databricks_connections.json          # Your credentials (gitignored)
├── databricks_connections_example.json  # Credentials template
├── claude_desktop_config.json           # Your Claude Desktop config (gitignored)
├── claude_desktop_config_example.json   # Claude Desktop config example
└── claude_desktop_config_pip.json       # Pip-based Claude Desktop config example
```

### Key Components

- **`databricks-mcp-wrapper.sh`**: Wrapper script that dynamically finds the config at runtime
- **`register-mcp-claude-code.sh`**: Automated Docker build and registration
- **`server.py`**: FastMCP-based server with 12 Databricks integration tools
- **`Dockerfile`**: Optimized single-stage build using pip
- **`docker-compose.yml`**: Optional simplified Docker setup
## 📄 Configuration Files Reference

This section documents all configuration files, including hidden files that may not be visible in your file explorer.

### Core Configuration Files

| File | Purpose | Required | Gitignored | Example File |
|---|---|---|---|---|
| `databricks_connections.json` | Your Databricks workspace credentials | ✅ Yes | ✅ Yes | `databricks_connections_example.json` |
| `.mcp.json` | Project-local MCP server config for Claude Code | ❌ No | ✅ Yes | `.mcp_example.json` |
| `claude_desktop_config.json` | Claude Desktop MCP server config | ❌ No* | ✅ Yes | `claude_desktop_config_example.json` or `claude_desktop_config_pip.json` |

\* Required only if using Claude Desktop (not the Claude Code CLI)
### Configuration File Details

#### `databricks_connections.json` (Required)

- **Location:** Project root
- **Purpose:** Stores Databricks workspace credentials and connection information
- **Format:**

```json
{
  "connections": {
    "default": {
      "name": "default",
      "host": "https://your-workspace.azuredatabricks.net/",
      "token": "dapi1234567890abcdef"
    }
  }
}
```

**Security:** Never commit this file! It's automatically ignored via `.gitignore`.

#### `.mcp.json` (Optional)

- **Location:** Project root or your Claude Code project directory
- **Purpose:** Project-local MCP server configuration for the Claude Code CLI
- **When to use:** If you prefer project-specific configuration over global registration
- **See:** `.mcp_example.json` for detailed configuration examples

#### `claude_desktop_config.json` (Claude Desktop only)

- **Location:** Platform-specific:
  - Windows: `%APPDATA%\Claude\claude_desktop_config.json`
  - macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
  - Linux: `~/.config/Claude/claude_desktop_config.json`
- **Purpose:** Global MCP server configuration for the Claude Desktop application
- **See:** `claude_desktop_config_example.json` for configuration examples
### Hidden Files (start with `.`)

To view hidden files:

- **Linux/macOS:** `ls -la`
- **Windows PowerShell:** `Get-ChildItem -Force`
- **Windows File Explorer:** View → Show → Hidden items
- **VS Code:** Files are visible by default

### Where Configuration Is Stored

**Claude Code CLI** (after running `./register-mcp-claude-code.sh`):

- User-scope config: `~/.config/claude/mcp.json` (or similar, depending on platform)
- Project-scope config: `.mcp.json` in your project directory

**Claude Desktop:**

- See the platform-specific paths above
## 🚀 Development

### Local Development Setup

For iterative development, mount your source code:

```bash
docker run --rm -it \
  -v "$(pwd)/src:/app/src" \
  -v "$(pwd)/databricks_connections.json:/app/databricks_connections.json:ro" \
  databricks-mcp:latest
```

### Adding New Tools

1. Edit `src/databricks_mcp/server.py`
2. Add new `@mcp.tool()`-decorated functions
3. Rebuild: `docker build -t databricks-mcp:latest .`
4. Test with Claude
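FastMCP's `@mcp.tool()` decorator registers a function as a callable tool. The dependency-free stand-in below mimics that registration pattern so you can see the shape a new tool takes; the `TOOLS` registry, the `tool` decorator, and the example function are illustrative assumptions, not the server's actual code:

```python
from typing import Callable

# Stand-in registry; FastMCP keeps its own internal equivalent.
TOOLS: dict[str, Callable] = {}


def tool(fn: Callable) -> Callable:
    """Stand-in for FastMCP's @mcp.tool(): record the function by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def echo_connection(connection_name: str = "default") -> str:
    """A trivial example tool: report which connection would be used."""
    return f"Would use connection: {connection_name}"
```

In the real server you would decorate with `@mcp.tool()` (note the call parentheses) inside `server.py`; the function's signature and docstring become the tool's schema and description that Claude sees.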
### Adding Dependencies

1. Edit the `dependencies` section of `pyproject.toml`
2. Rebuild: `docker build -t databricks-mcp:latest .`

### Running Tests

```bash
docker run --rm -it databricks-mcp:latest pytest tests/
```

### Code Quality

```bash
# Format code
docker run --rm -it -v "$(pwd):/app" databricks-mcp:latest black src/

# Lint
docker run --rm -it -v "$(pwd):/app" databricks-mcp:latest ruff check src/

# Type check
docker run --rm -it -v "$(pwd):/app" databricks-mcp:latest mypy src/
```

### Using Docker Compose

```bash
# Start the service
docker-compose up

# Run tests
docker-compose exec databricks-mcp pytest

# Stop the service
docker-compose down
```
## 📄 License

MIT License; see the LICENSE file for details.

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes following the existing code style
4. Add tests for new functionality
5. Update documentation as needed
6. Submit a pull request