ramvlt/intellijidea-mcp-databricks

Databricks MCP Server

A Model Context Protocol (MCP) server that provides integration with Databricks, enabling AI assistants to interact with your Databricks workspace.

Features

  • SQL Execution: Run SQL queries on Databricks clusters using Databricks Connect
  • Workspace Management: List and read notebooks
  • Cluster Information: Get details about clusters
  • Jobs Management: List, monitor, and trigger job runs
  • DBFS Operations: List and read files from Databricks File System

Prerequisites

  • Node.js 18 or higher
  • A Databricks workspace
  • Databricks personal access token
  • A running Databricks cluster (for SQL execution)

Installation

  1. Clone or navigate to this repository:
cd /path/to/databricks-mcp-server
  2. Install dependencies:
npm install
  3. Build the server:
npm run build

Configuration

  1. Copy the example environment file:
cp .env.example .env
  2. Edit .env and configure your Databricks credentials:
DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
DATABRICKS_TOKEN=your-databricks-token
DATABRICKS_CLUSTER_ID=your-cluster-id
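To confirm the file is picked up, you can load it into your current shell and check the values (a minimal sketch; it assumes a POSIX shell and a `.env` in the current directory):

```shell
# Load the variables into the current shell (if .env exists) and
# confirm the required ones are present.
if [ -f .env ]; then
  set -a
  . ./.env
  set +a
fi

# Print the host and cluster ID to confirm they loaded; never echo the token itself.
echo "Host: ${DATABRICKS_HOST:-<not set>}"
echo "Cluster: ${DATABRICKS_CLUSTER_ID:-<not set>}"
```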

Getting Databricks Credentials

  1. DATABRICKS_HOST: Your Databricks workspace URL (e.g., https://dbc-12345678-9abc.cloud.databricks.com or https://adb-xxxxx.azuredatabricks.net)
  2. DATABRICKS_TOKEN: Generate a personal access token:
    • Go to User Settings > Developer > Access Tokens
    • Click "Generate New Token"
    • Copy the token value
  3. DATABRICKS_CLUSTER_ID: (Required for SQL execution and cluster operations)
    • Go to Compute in the sidebar
    • Click on your cluster name
    • Copy the Cluster ID from the URL or cluster details (e.g., 1234-567890-abc123)
    • Important: The cluster must be running or set to auto-start
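Before wiring the server into an MCP client, you can sanity-check all three values with a direct call to the Databricks Clusters REST API (a standard endpoint; this assumes the three environment variables are already exported in your shell):

```shell
# Returns JSON describing the cluster; look for "state": "RUNNING".
# A 401/403 means the token is wrong or lacks permission; a 400 usually
# means the cluster ID is invalid.
curl -s \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  "$DATABRICKS_HOST/api/2.0/clusters/get?cluster_id=$DATABRICKS_CLUSTER_ID"
```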

Usage with Claude Desktop

Add this configuration to your Claude Desktop config file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "databricks": {
      "command": "node",
      "args": ["/absolute/path/to/databricks-mcp-server/dist/index.js"],
      "env": {
        "DATABRICKS_HOST": "https://your-workspace.cloud.databricks.com",
        "DATABRICKS_TOKEN": "your-databricks-token",
        "DATABRICKS_CLUSTER_ID": "your-cluster-id"
      }
    }
  }
}

Restart Claude Desktop after updating the configuration.
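If Claude Desktop does not pick the server up, you can exercise it directly with the MCP Inspector, the standard debugging UI for MCP servers (the absolute path below is a placeholder; substitute your own):

```shell
# Launches a local web UI where you can list the server's tools
# and invoke them manually with test arguments.
DATABRICKS_HOST=https://your-workspace.cloud.databricks.com \
DATABRICKS_TOKEN=your-databricks-token \
DATABRICKS_CLUSTER_ID=your-cluster-id \
npx @modelcontextprotocol/inspector node /absolute/path/to/databricks-mcp-server/dist/index.js
```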

Available Tools

execute_sql

Execute SQL queries on your Databricks cluster using Spark SQL.

Parameters:
- query (required): SQL query to execute
- max_rows (optional): Maximum rows to return (default: 100)

Note: Requires a running cluster. Queries are executed using Spark SQL.
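For reference, a tool invocation from an assistant carries arguments shaped like the following (illustrative values; `samples.nyctaxi.trips` is one of Databricks' built-in sample tables):

```json
{
  "query": "SELECT * FROM samples.nyctaxi.trips LIMIT 10",
  "max_rows": 10
}
```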

list_notebooks

List notebooks in a workspace path.

Parameters:
- path (optional): Workspace path (default: "/")

get_notebook_content

Get the content of a notebook.

Parameters:
- path (required): Path to the notebook

list_clusters

List all clusters in the workspace.

get_cluster_info

Get detailed information about a specific cluster.

Parameters:
- cluster_id (required): The cluster ID

list_jobs

List all jobs in the workspace.

Parameters:
- limit (optional): Maximum jobs to return (default: 25)

get_job_runs

Get run history for a job.

Parameters:
- job_id (required): The job ID
- limit (optional): Maximum runs to return (default: 25)

run_job

Trigger a job run.

Parameters:
- job_id (required): The job ID to run
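If a triggered run fails or never appears, you can rule out permission or job-ID problems by calling the Jobs REST API directly (a standard endpoint; `123` below is a placeholder job ID):

```shell
# Triggers the job immediately; a successful response contains a "run_id".
curl -s -X POST \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"job_id": 123}' \
  "$DATABRICKS_HOST/api/2.1/jobs/run-now"
```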

list_dbfs

List files in DBFS.

Parameters:
- path (optional): DBFS path (default: "/")

read_dbfs_file

Read content from a DBFS file.

Parameters:
- path (required): DBFS path to the file

Development

Run in development mode:

npm run dev

Build:

npm run build

Security Notes

  • Never commit your .env file or expose your Databricks token
  • Use environment variables or secure secret management for production
  • Grant minimal necessary permissions to the Databricks token
  • Consider using service principals instead of personal access tokens for production use

Troubleshooting

"DATABRICKS_HOST and DATABRICKS_TOKEN must be set"

Make sure your environment variables are properly configured in the Claude Desktop config or .env file.

SQL execution errors

  • Verify that DATABRICKS_CLUSTER_ID is correctly set and points to a valid cluster
  • Ensure the cluster is running (not terminated or stopped)
  • Check that your token has permissions to execute commands on the cluster

Connection timeout

  • Check that your Databricks workspace is accessible and the token is valid
  • Verify the cluster is in "Running" state
  • Ensure your network allows connections to Databricks

Command execution takes too long

  • Cluster may be starting up (this can take several minutes)
  • Consider using a cluster with auto-termination disabled for faster response times

License

ISC