# Databricks MCP Server
A comprehensive Model Context Protocol (MCP) server that provides seamless integration between Claude AI and Databricks workspaces. This server enables natural language interaction with Databricks resources including notebooks, clusters, tables, and SQL warehouses across multiple workspace connections.
> 🐳 **Docker-based for complete cross-platform compatibility!** Works identically on Windows, WSL, Linux, and macOS.
## 🚀 Quick Start

### Prerequisites

- **Docker** installed (Get Docker), for Docker-based installation
- **Python 3.10+**, for pip-based installation (alternative to Docker)
- **Databricks workspace access** with a personal access token
- **Git** for cloning the repository

### Setup

```bash
# Clone the repository
git clone <repository-url>
cd databricks-mcp

# Create your credentials file
cp databricks_connections_example.json databricks_connections.json
# Edit databricks_connections.json with your Databricks host and token

# Register with Claude Code (builds image and registers automatically)
./register-mcp-claude-code.sh
```

That's it! The registration script:

- ✅ Builds the Docker image
- ✅ Creates a wrapper script that finds your config dynamically
- ✅ Registers globally, so it works from any terminal
- ✅ No need to re-register when opening new terminals

**Testing manually:**

```bash
# Test the server directly
docker run --rm -i \
  -v "$(pwd)/databricks_connections.json:/app/databricks_connections.json:ro" \
  databricks-mcp:latest
```

> **Windows PowerShell:** replace `$(pwd)` with `${PWD}`.
> **Windows Git Bash:** use `$(pwd)` as shown above.
## 📋 Features

### Core Capabilities

- **Multi-workspace support**: Connect to multiple Databricks workspaces simultaneously
- **Notebook management**: Create, list, and read notebooks with natural language
- **Cluster operations**: Monitor and manage compute clusters
- **Data exploration**: Browse catalogs, schemas, and tables intuitively
- **SQL execution**: Run queries on SQL warehouses with automatic warehouse selection
- **Table introspection**: Get detailed schema and metadata information

### Architecture

- **FastMCP framework**: High-performance MCP server implementation
- **Databricks SDK**: Official SDK for robust workspace integration
- **Docker containerization**: Complete cross-platform compatibility
- **Zero host dependencies**: No Python, pip, or uv installation required on the host
## ⚙️ Configuration

### Databricks Credentials Setup

Create a `databricks_connections.json` file in the project root with your workspace credentials:

```json
{
  "connections": {
    "default": {
      "name": "default",
      "host": "https://your-workspace.azuredatabricks.net/",
      "token": "dapi1234567890abcdef"
    },
    "staging": {
      "name": "staging",
      "host": "https://staging-workspace.azuredatabricks.net/",
      "token": "dapi0987654321fedcba"
    }
  }
}
```

> **Note:** This file is automatically ignored by git (see `.gitignore`). Never commit credentials to version control.
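If you script against this file yourself, a small stdlib-only loader can catch malformed entries early. This is an illustrative sketch, not part of the server; the `load_connections` helper and its checks are assumptions based on the schema shown above:

```python
import json
from pathlib import Path

REQUIRED_KEYS = {"name", "host", "token"}


def load_connections(path: str = "databricks_connections.json") -> dict:
    """Load the connections file and sanity-check each entry's shape."""
    data = json.loads(Path(path).read_text())
    connections = data.get("connections", {})
    for name, conn in connections.items():
        missing = REQUIRED_KEYS - conn.keys()
        if missing:
            raise ValueError(f"connection '{name}' is missing keys: {sorted(missing)}")
        if not conn["host"].startswith("https://"):
            raise ValueError(f"connection '{name}' host must start with https://")
    return connections
```

A missing `token` or a plain-`http` host then fails fast with a clear error instead of surfacing later as an opaque authentication failure.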
## 📦 Installation Methods

### Method 1: Docker (Recommended)

**Best for:** Cross-platform compatibility, isolated environments, production use

See the Quick Start above for Docker-based installation.

**Pros:**

- ✅ Works identically on Windows, WSL, Linux, and macOS
- ✅ No Python installation required on the host
- ✅ Complete isolation from the host system
- ✅ Easy to distribute and reproduce

### Method 2: pip Installation

**Best for:** Development, Python-native workflows, debugging

```bash
# Clone the repository
git clone <repository-url>
cd databricks-mcp

# Create your credentials file
cp databricks_connections_example.json databricks_connections.json
# Edit databricks_connections.json with your host and token

# Install in development mode
pip install -e .

# Verify installation
databricks-mcp --help
```

**Configure Claude Desktop for the pip installation:**

Edit your Claude Desktop config file (see locations in the Configuration section). Make sure `databricks_connections.json` is in the project directory where you installed the package.

```json
{
  "mcpServers": {
    "databricks": {
      "command": "databricks-mcp",
      "args": []
    }
  }
}
```

See `claude_desktop_config_pip.json` for a complete example.

**Configure Claude Code CLI for the pip installation:**

```bash
# Option 1: Global registration
claude mcp add -s user databricks databricks-mcp

# Option 2: Project-local .mcp.json
# See .mcp_example.json for configuration examples
```

**Pros:**

- ✅ Faster startup (no Docker overhead)
- ✅ Easier debugging with breakpoints
- ✅ Direct access to source code
- ✅ Native Python tooling support

**Cons:**

- ❌ Requires Python 3.10+ on the host
- ❌ May hit platform-specific dependency issues
- ❌ Needs virtual environment management

### Method 3: PyPI Installation (Future)

Once published to PyPI:

```bash
pip install databricks-mcp
databricks-mcp
```
## 🛠️ Configuration

### Claude Desktop Setup

1. **Locate your Claude Desktop config file:**
   - Windows: `%APPDATA%\Claude\claude_desktop_config.json`
   - macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
   - Linux: `~/.config/Claude/claude_desktop_config.json`

2. **Add this configuration:**

   ```json
   {
     "mcpServers": {
       "databricks": {
         "command": "docker",
         "args": [
           "run",
           "--rm",
           "-i",
           "-v",
           "/absolute/path/to/databricks-mcp/databricks_connections.json:/app/databricks_connections.json:ro",
           "databricks-mcp:latest"
         ]
       }
     }
   }
   ```

3. **Update the volume path:**
   - Windows: `C:\\Users\\YourName\\path\\to\\databricks-mcp\\databricks_connections.json:/app/databricks_connections.json:ro`
   - macOS: `/Users/yourname/path/to/databricks-mcp/databricks_connections.json:/app/databricks_connections.json:ro`
   - Linux/WSL: `/home/yourname/path/to/databricks-mcp/databricks_connections.json:/app/databricks_connections.json:ro`

4. **Restart Claude Desktop** and verify under Settings → Developer → MCP Servers.
### Claude Code CLI Setup

#### Option 1: Automated Registration with Wrapper (Recommended)

```bash
# Run the registration script (builds image and registers globally)
./register-mcp-claude-code.sh
```

**How it works:**

- Builds the Docker image
- Creates a wrapper script (`databricks-mcp-wrapper.sh`) that dynamically finds your config file
- Registers the wrapper globally (user scope)
- Works from any terminal, with no need to re-register

**This is the recommended approach because:**

- ✅ **One-time setup**: Register once, works everywhere
- ✅ **Dynamic config resolution**: The wrapper finds `databricks_connections.json` relative to itself
- ✅ **Portable**: Different users can clone the repo anywhere
- ✅ **No hardcoded paths**: The config location is determined at runtime, not at registration time
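The "find the config relative to itself" trick is worth understanding because it's what makes the registration location-independent. The actual wrapper is a shell script, so the Python below is only an illustration of the same technique, with a hypothetical helper name:

```python
from pathlib import Path


def config_path_for(script_path: str) -> Path:
    """Resolve databricks_connections.json next to the given script,
    regardless of the caller's current working directory."""
    script_dir = Path(script_path).resolve().parent
    return script_dir / "databricks_connections.json"
```

In a shell wrapper, the equivalent idiom is typically something like `CONFIG="$(cd "$(dirname "$0")" && pwd)/databricks_connections.json"`: the key point is deriving the config path from the script's own location, not from `$PWD`.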
#### Option 2: Manual Registration (Advanced)

If you prefer manual setup or need custom configuration:

```bash
# Build the Docker image
docker build -t databricks-mcp:latest .

# Make the wrapper executable
chmod +x databricks-mcp-wrapper.sh

# Register the wrapper
claude mcp add -s user databricks "$(pwd)/databricks-mcp-wrapper.sh"
```

#### Option 3: Project-Specific Setup

Create or edit `.mcp.json` in your project directory. See `.mcp_example.json` for detailed examples of:

- Wrapper script configuration (recommended)
- Direct Docker configuration
- pip-based configuration
- Platform-specific path examples
## 🔧 Available Tools

### Connection Management

- `list_databricks_connections()`: List all configured workspace connections
- `add_databricks_connection(name, host, token)`: Add a new workspace connection at runtime

### Notebook Operations

- `create_notebook(path, language, content, connection_name)`: Create new notebooks (Python, SQL, Scala, R)
- `list_notebooks(path, connection_name)`: Browse workspace directories and notebooks
- `get_notebook_content(path, connection_name)`: Read notebook source code

### Cluster Management

- `list_clusters(connection_name)`: View all clusters with status, configuration, and worker counts

### Data Catalog Operations

- `list_catalogs(connection_name)`: Browse all available Unity Catalog instances
- `list_schemas(catalog_name, connection_name)`: Explore schemas within catalogs
- `list_tables(catalog_name, schema_name, connection_name)`: View tables within schemas
- `get_table_info(table_name, catalog_name, schema_name, connection_name)`: Get detailed table metadata including columns, types, and comments

### SQL Execution

- `execute_sql_query(query, warehouse_id, connection_name)`: Run SQL queries with automatic warehouse selection and result formatting
- `list_sql_warehouses(connection_name)`: View available SQL compute resources
## 💬 Usage Examples

### Creating Notebooks

> Claude, create a Python notebook at '/Users/myname/data_analysis.py' with some basic pandas data exploration code for analyzing customer data.

### Data Exploration

> Claude, show me all the catalogs available, then list the tables in the 'sales' schema of the 'production' catalog, and describe the structure of the 'customers' table.

### SQL Queries

> Claude, run a query to get the top 10 customers by total purchase amount from the sales.customers table, and show me which SQL warehouse was used.

### Multi-Workspace Operations

> Claude, add a connection to our staging environment at 'https://staging.azuredatabricks.net/' called 'staging', then compare the table schemas between production and staging for the users table.

### Cluster Monitoring

> Claude, show me the status of all clusters and identify any that are currently running but might be idle.
## 🐛 Troubleshooting

### Docker Issues

**Problem:** "Docker not found" or "command not found: docker"

- **Solution:** Install Docker:
  - Windows/macOS: Docker Desktop
  - Linux: Docker Engine
- **Verify:** `docker --version`

**Problem:** "Cannot connect to Docker daemon"

- **Windows/macOS:** Launch the Docker Desktop application
- **Linux:** `sudo systemctl start docker`

**Problem:** "Permission denied" on Linux

- **Solution:** Add your user to the docker group:

  ```bash
  sudo usermod -aG docker $USER
  newgrp docker
  ```

**Problem:** Build fails

- **Check:** Internet connection
- **Try:** `docker pull python:3.11-slim` to pre-pull the base image
### Configuration Issues

**Problem:** "Server disconnected" or "Connection failed"

- **Check:** The Docker image exists: `docker images | grep databricks-mcp`
- **Rebuild:** `docker build -t databricks-mcp:latest .`
- **Verify:** The config uses an absolute path to `databricks_connections.json`

**Problem:** Volume mount errors on Windows

- **Use:** Double backslashes in JSON: `C:\\Users\\Name\\...`
- **Or:** Forward slashes: `C:/Users/Name/...`

**Problem:** "File not found" for `databricks_connections.json`

- **Check:** The file exists: `ls databricks_connections.json`
- **Verify:** You are using an absolute path in the configuration
- **Note:** Relative paths don't work with Docker volumes
### Claude Code CLI Issues

**Problem:** "Need to re-register every time I open a new terminal"

- **Root cause:** Using a project-local `.mcp.json` instead of global registration
- **Solution:** Run `./register-mcp-claude-code.sh`, which registers with the `-s user` flag (global scope)
- **Verify:** Run `claude mcp list` from any directory; it should show the databricks server
- **Note:** With the wrapper script approach you only register once!

**Problem:** Registration script fails

- **Solution:** `chmod +x register-mcp-claude-code.sh`
- **Check:** Docker is running
- **Check:** `databricks-mcp-wrapper.sh` exists in the project directory
- **Alternative:** Use manual registration

**Problem:** Server not showing in `claude mcp list`

- **Solution:** Re-run `./register-mcp-claude-code.sh`
- **Check:** Run the command from a different directory; if the server only shows up in the project dir, it isn't globally registered
- **Clean:** `claude mcp remove -s user databricks`, then re-register

**Problem:** "databricks_connections.json not found" error

- **Root cause:** The wrapper script can't find the config file
- **Solution:** Ensure `databricks_connections.json` exists in the same directory as `databricks-mcp-wrapper.sh`
- **Check:** `ls -la databricks_connections.json` in the project root
- **Verify:** Run the wrapper directly: `./databricks-mcp-wrapper.sh`
### Authentication Issues

**Problem:** "Invalid credentials"

- **Check:** `databricks_connections.json` exists and is mounted correctly
- **Verify:** The host URL format: `https://workspace.azuredatabricks.net/`
- **Test:** The token hasn't expired
- **Debug:** Run the container directly:

  ```bash
  docker run --rm -i \
    -v "$(pwd)/databricks_connections.json:/app/databricks_connections.json:ro" \
    databricks-mcp:latest
  ```
### General Issues

**Problem:** Code changes not reflected

- **Solution:** Rebuild the image after changes: `docker build -t databricks-mcp:latest .`
- **Clean build:** `docker build --no-cache -t databricks-mcp:latest .`

**Problem:** Multiple server instances

- **Solution:** Use unique names: `databricks-prod`, `databricks-staging`
- **Check:** Claude Desktop settings or `claude mcp list`
## 🔒 Security Best Practices

### Credential Management

- **Never commit tokens** to version control
- **Rotate tokens regularly** following your organization's security policy
- **Consider service principals** for production deployments

### Access Control

- **Principle of least privilege:** Grant the minimum necessary permissions
- **Separate environments:** Use different tokens for dev/staging/prod
- **Monitor usage:** Review MCP server access logs regularly

### Configuration Security

- **Restrict file permissions** on configuration files containing tokens
- **Use workspace-scoped tokens** when possible
- **Implement token expiration policies**
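To make "restrict file permissions" concrete: on Linux/macOS the credentials file should usually be readable only by its owner (`chmod 600 databricks_connections.json`). The stdlib-only check below is an illustrative sketch, not part of the project; the `is_owner_only` helper is a hypothetical name:

```python
import os
import stat


def is_owner_only(path: str) -> bool:
    """True if no group/other permission bits are set (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Mask off the owner bits; anything left means group/other access.
    return mode & 0o077 == 0
```

A check like this could run at server startup to warn when the token file is world- or group-readable (the check is a POSIX-permissions concept and does not apply to Windows ACLs).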
## 📁 Project Structure

```text
databricks-mcp/
├── src/
│   └── databricks_mcp/
│       ├── __init__.py
│       └── server.py                    # Main MCP server implementation
├── Dockerfile                           # Docker container definition
├── docker-compose.yml                   # Docker Compose configuration
├── databricks-mcp-wrapper.sh            # Wrapper script for dynamic config resolution
├── register-mcp-claude-code.sh          # Automated registration script
├── pyproject.toml                       # Python dependencies and package metadata
├── README.md                            # This file
│
├── .mcp.json                            # Your Claude Code CLI config (gitignored)
├── .mcp_example.json                    # Project-local config example
├── .gitignore                           # Git ignore rules
│
├── databricks_connections.json          # Your credentials (gitignored)
├── databricks_connections_example.json  # Credentials template
├── claude_desktop_config.json           # Your Claude Desktop config (gitignored)
├── claude_desktop_config_example.json   # Claude Desktop config example
└── claude_desktop_config_pip.json       # Pip-based Claude Desktop config example
```

### Key Components

- **`databricks-mcp-wrapper.sh`**: Wrapper script that dynamically finds the config at runtime
- **`register-mcp-claude-code.sh`**: Automated Docker build and registration
- **`server.py`**: FastMCP-based server with 12 Databricks integration tools
- **`Dockerfile`**: Optimized single-stage build using pip
- **`docker-compose.yml`**: Optional simplified Docker setup
## 📄 Configuration Files Reference

This section documents all configuration files, including hidden files that may not be visible in your file explorer.

### Core Configuration Files

| File | Purpose | Required | Gitignored | Example File |
|---|---|---|---|---|
| `databricks_connections.json` | Your Databricks workspace credentials | ✅ Yes | ✅ Yes | `databricks_connections_example.json` |
| `.mcp.json` | Project-local MCP server config for Claude Code | ❌ No | ✅ Yes | `.mcp_example.json` |
| `claude_desktop_config.json` | Claude Desktop MCP server config | ❌ No* | ✅ Yes | `claude_desktop_config_example.json` or `claude_desktop_config_pip.json` |

\* Required only if using Claude Desktop (not the Claude Code CLI)
### Configuration File Details

#### `databricks_connections.json` (Required)

- **Location:** Project root
- **Purpose:** Stores Databricks workspace credentials and connection information
- **Format:**

```json
{
  "connections": {
    "default": {
      "name": "default",
      "host": "https://your-workspace.azuredatabricks.net/",
      "token": "dapi1234567890abcdef"
    }
  }
}
```

**Security:** Never commit this file! It's automatically ignored via `.gitignore`.

#### `.mcp.json` (Optional)

- **Location:** Project root or your Claude Code project directory
- **Purpose:** Project-local MCP server configuration for the Claude Code CLI
- **When to use:** If you prefer project-specific configuration over global registration
- **See:** `.mcp_example.json` for detailed configuration examples

#### `claude_desktop_config.json` (Claude Desktop only)

- **Location:** Platform-specific:
  - Windows: `%APPDATA%\Claude\claude_desktop_config.json`
  - macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
  - Linux: `~/.config/Claude/claude_desktop_config.json`
- **Purpose:** Global MCP server configuration for the Claude Desktop application
- **See:** `claude_desktop_config_example.json` for configuration examples
### Hidden Files (start with `.`)

To view hidden files:

- **Linux/macOS:** `ls -la`
- **Windows PowerShell:** `Get-ChildItem -Force`
- **Windows File Explorer:** View → Show → Hidden items
- **VS Code:** Files are visible by default

### Where Configuration Is Stored

**Claude Code CLI** (after running `./register-mcp-claude-code.sh`):

- User-scope config: `~/.config/claude/mcp.json` (or similar, depending on platform)
- Project-scope config: `.mcp.json` in your project directory

**Claude Desktop:**

- See the platform-specific paths above
## 🚀 Development

### Local Development Setup

For iterative development, mount your source code:

```bash
docker run --rm -it \
  -v "$(pwd)/src:/app/src" \
  -v "$(pwd)/databricks_connections.json:/app/databricks_connections.json:ro" \
  databricks-mcp:latest
```

### Adding New Tools

1. Edit `src/databricks_mcp/server.py`
2. Add new `@mcp.tool()`-decorated functions
3. Rebuild: `docker build -t databricks-mcp:latest .`
4. Test with Claude
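FastMCP's `@mcp.tool()` decorator registers a function as a callable tool. The dependency-free stand-in below mimics that registration pattern so you can see the shape a new tool takes; the `TOOLS` registry, the `tool` decorator, and the example function are illustrative assumptions, not the server's actual code:

```python
from typing import Callable

# Stand-in registry; FastMCP keeps its own internal equivalent.
TOOLS: dict[str, Callable] = {}


def tool(fn: Callable) -> Callable:
    """Stand-in for FastMCP's @mcp.tool(): record the function by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def echo_connection(connection_name: str = "default") -> str:
    """A trivial example tool: report which connection would be used."""
    return f"Would use connection: {connection_name}"
```

In the real server you would decorate with `@mcp.tool()` (note the call parentheses) inside `server.py`; the function's signature and docstring become the tool's schema and description that Claude sees.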
### Adding Dependencies

1. Edit the `dependencies` section of `pyproject.toml`
2. Rebuild: `docker build -t databricks-mcp:latest .`

### Running Tests

```bash
docker run --rm -it databricks-mcp:latest pytest tests/
```

### Code Quality

```bash
# Format code
docker run --rm -it -v "$(pwd):/app" databricks-mcp:latest black src/

# Lint
docker run --rm -it -v "$(pwd):/app" databricks-mcp:latest ruff check src/

# Type check
docker run --rm -it -v "$(pwd):/app" databricks-mcp:latest mypy src/
```

### Using Docker Compose

```bash
# Start the service
docker-compose up

# Run tests
docker-compose exec databricks-mcp pytest

# Stop the service
docker-compose down
```
## 📄 License

MIT License; see the LICENSE file for details.

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes following the existing code style
4. Add tests for new functionality
5. Update documentation as needed
6. Submit a pull request