robkisk/dbx_mcp_server_demo

Databricks MCP Server Demo

A demonstration of building a custom Model Context Protocol (MCP) server that connects to Databricks using FastMCP and the Databricks Python SDK. The server provides tools for executing SQL queries, managing jobs, and working with Delta Live Tables pipelines.

🚄 Quick Start

For Claude Code Users

Add this to your MCP settings:

{
  "mcpServers": {
    "databricks": {
      "command": "uv",
      "args": ["run", "mcp-server"],
      "cwd": "/path/to/dbx_mcp_server_demo"
    }
  }
}

For Cursor Users

Add this to your settings.json:

{
  "mcpServers": {
    "databricks": {
      "command": "uv",
      "args": ["run", "mcp-server"],
      "cwd": "/absolute/path/to/dbx_mcp_server_demo"
    }
  }
}

Note: This configuration assumes you have a .env file with your Databricks credentials in the project directory. See the Setup section for details.

🚀 Features

The MCP server provides the following tools:

SQL Operations

  • execute_sql: Execute SQL queries against a Databricks warehouse
  • get_workspace_info: Get information about the current workspace

Job Management

  • list_jobs: List all Databricks jobs in the workspace
  • get_job_status: Get status and recent run information for a specific job
  • run_job: Trigger a job run

Pipeline Management

  • list_pipelines: List all Delta Live Tables pipelines in the workspace
  • get_pipeline_status: Get status and recent update information for a pipeline
  • start_pipeline: Start a pipeline update

🔧 Setup

Prerequisites

  1. Install UV package manager: https://docs.astral.sh/uv/getting-started/installation/
  2. Databricks CLI: https://docs.databricks.com/dev-tools/cli/databricks-cli.html
  3. Python 3.10: Required for this project

Installation

  1. Clone the repository and navigate to the project:

    git clone <repository-url>
    cd dbx_mcp_server_demo
    
  2. Install dependencies:

    uv sync
    
  3. Configure Databricks credentials:

    cp .env.example .env
    # Edit .env file with your Databricks credentials
    

    Required environment variables:

    DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
    DATABRICKS_TOKEN=your_personal_access_token
    DATABRICKS_WAREHOUSE_ID=your_warehouse_id  # Optional, can be provided per query
    
  4. Authenticate with Databricks CLI (optional, for bundle operations):

    databricks auth login
    
  5. Validate your setup (recommended):

    uv run python validate_setup.py
    

    This will check your environment and generate MCP configuration templates.
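The checks that validate_setup.py performs can be approximated with a short stdlib-only sketch. The variable names follow the .env template above; the `check_env` helper itself is illustrative, not the script's actual code:

```python
import os


def check_env(env: dict) -> list[str]:
    """Return human-readable problems with the environment; empty list means OK."""
    required = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")
    problems = [f"{name} is not set" for name in required if not env.get(name)]

    host = env.get("DATABRICKS_HOST", "")
    if host and not host.startswith("https://"):
        problems.append("DATABRICKS_HOST must include the full URL with https://")

    # The warehouse ID is optional; flag it as informational only.
    if not env.get("DATABRICKS_WAREHOUSE_ID"):
        problems.append("DATABRICKS_WAREHOUSE_ID is unset (optional; can be passed per query)")
    return problems


problems = check_env(dict(os.environ))
```

Anything the real script reports beyond this (dependency checks, config template generation) is outside the sketch.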

🏃 Running the MCP Server

Manual Server Startup

Start the MCP server manually:

uv run mcp-server

The server will start and listen for MCP protocol connections.

Integration with Claude Code

To use this MCP server with Claude Code, you need to configure it in your MCP settings:

  1. Open Claude Code Settings:

    • In Claude Code, open the settings/preferences
    • Navigate to the MCP configuration section
  2. Add the server configuration:

    {
      "mcpServers": {
        "databricks": {
          "command": "uv",
          "args": ["run", "mcp-server"],
          "cwd": "/path/to/your/dbx_mcp_server_demo"
        }
      }
    }
    
  3. Update the configuration:

    • Replace /path/to/your/dbx_mcp_server_demo with the actual path to your project
    • Ensure your .env file in the project directory contains your Databricks credentials
    • Save the configuration

Integration with Cursor

To use this MCP server with Cursor, configure it in your Cursor settings:

  1. Open Cursor Settings:

    • In Cursor, go to Settings → Extensions → MCP
    • Or edit your Cursor configuration file directly
  2. Add the server configuration:

    {
      "mcpServers": {
        "databricks": {
          "command": "uv",
          "args": ["run", "mcp-server"],
          "cwd": "/absolute/path/to/dbx_mcp_server_demo"
        }
      }
    }
    
  3. Configuration Notes:

    • Use absolute paths for the cwd parameter
    • Ensure UV is installed globally and available in your PATH
    • The server will automatically load credentials from your .env file
    • Restart Cursor after adding the configuration

Alternative: Direct Environment Variables

If you prefer to specify credentials directly in the MCP configuration (not recommended for security reasons), you can add them to the env section:

{
  "mcpServers": {
    "databricks": {
      "command": "uv",
      "args": ["run", "mcp-server"],
      "cwd": "/path/to/your/dbx_mcp_server_demo",
      "env": {
        "DATABRICKS_HOST": "https://your-workspace.cloud.databricks.com",
        "DATABRICKS_TOKEN": "your_personal_access_token",
        "DATABRICKS_WAREHOUSE_ID": "your_warehouse_id"
      }
    }
  }
}

Note: Environment variables in the env section will override any .env file settings. We recommend using the .env file approach for better security.

📝 Example Usage

See examples/mcp_client_example.py for a demonstration of how to interact with the MCP server tools:

uv run python examples/mcp_client_example.py

Example Tool Calls

  1. Execute SQL Query:

    # Execute a simple query
    result = execute_sql("SELECT current_timestamp() as now, 'Hello Databricks' as message")
    
    # Execute with specific warehouse
    result = execute_sql(
        query="SELECT * FROM my_table LIMIT 10",
        warehouse_id="your_warehouse_id"
    )
    
  2. Manage Jobs:

    # List all jobs
    jobs = list_jobs()
    
    # Get job status
    status = get_job_status(job_id=123456)
    
    # Run a job
    run_result = run_job(job_id=123456)
    
  3. Manage Pipelines:

    # List pipelines
    pipelines = list_pipelines()
    
    # Get pipeline status
    status = get_pipeline_status(pipeline_id="your_pipeline_id")
    
    # Start pipeline update
    update = start_pipeline(pipeline_id="your_pipeline_id")
    
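On the wire, each of the calls above is an MCP `tools/call` request carried over JSON-RPC 2.0. The payload below sketches what a client sends for the `execute_sql` example; the `id` value is arbitrary:

```python
import json

# JSON-RPC 2.0 envelope for an MCP tool invocation.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "execute_sql",
        "arguments": {
            "query": "SELECT * FROM my_table LIMIT 10",
            "warehouse_id": "your_warehouse_id",
        },
    },
}

wire = json.dumps(request)  # what actually travels over stdio to the server
```

Your MCP client builds and sends this for you; the sketch is only meant to demystify what "calling a tool" means at the protocol level.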

🧪 Testing

Run the test suite:

uv run python -m pytest tests/ -v

The tests include:

  • Configuration management tests
  • Mock tests for all MCP tools
  • Error handling validation

🏗️ Project Structure

dbx_mcp_server_demo/
├── mcp_server/
│   ├── __init__.py
│   ├── config.py          # Databricks configuration management
│   └── main.py            # MCP server implementation with tools
├── tests/
│   └── test_mcp_server.py # Test suite
├── examples/
│   └── mcp_client_example.py # Usage examples
├── .env.example           # Environment configuration template
├── pyproject.toml         # Project dependencies and scripts
└── README.md             # This file

📚 Understanding the Code

Configuration Management (mcp_server/config.py)

  • Uses Pydantic for configuration validation
  • Loads credentials from environment variables
  • Validates required settings before connecting

MCP Server (mcp_server/main.py)

  • Built with FastMCP for easy MCP protocol implementation
  • Uses Databricks SDK for all Databricks API interactions
  • Includes comprehensive error handling and logging
  • Each tool returns structured JSON responses

Key Design Patterns

  • Environment-based configuration: Secure credential management
  • Structured responses: Consistent JSON format for all tool outputs
  • Error handling: Graceful handling of missing credentials and API errors
  • Type safety: Pydantic models for configuration validation
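The structured-response and error-handling patterns combine naturally in a small decorator. This is a sketch of the idea, not the repository's exact code; `divide` is a stand-in for a real tool:

```python
import functools
from typing import Any, Callable, Dict


def structured(tool: Callable[..., Any]) -> Callable[..., Dict[str, Any]]:
    """Wrap a tool so it always returns a {"status": ...} dict, never raises."""
    @functools.wraps(tool)
    def wrapper(*args: Any, **kwargs: Any) -> Dict[str, Any]:
        try:
            return {"status": "success", "result": tool(*args, **kwargs)}
        except Exception as exc:  # surface API/config errors instead of crashing
            return {"status": "error", "error": str(exc)}
    return wrapper


@structured
def divide(a: float, b: float) -> float:
    return a / b
```

The consistent envelope means a client can branch on `status` without caring which tool it called.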

🔒 Security Notes

  • Never commit your .env file with real credentials
  • Use Databricks personal access tokens with minimal required permissions
  • Consider using Azure Key Vault, AWS Secrets Manager, or similar for production deployments
  • The server runs locally and doesn't expose any web endpoints

🛠️ Extending the Server

To add new tools:

  1. Create a new function in mcp_server/main.py:

    @mcp.tool()
    def my_new_tool(param: str) -> Dict[str, Any]:
        """Description of what this tool does."""
        # Implementation here
        return {"status": "success", "result": "value"}
    
  2. Add tests in tests/test_mcp_server.py

  3. Update documentation with the new tool capabilities
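For step 2, a test can exercise the new tool without touching a real workspace by mocking the SDK client. The names below (`my_new_tool`, the `get` call on the client) are hypothetical stand-ins for whatever your tool actually does:

```python
from unittest.mock import MagicMock


def my_new_tool(client, param: str) -> dict:
    """Hypothetical tool: fetches something via the client, returns structured JSON."""
    info = client.get(param)
    return {"status": "success", "result": info}


def test_my_new_tool():
    # A MagicMock stands in for the Databricks SDK client.
    fake_client = MagicMock()
    fake_client.get.return_value = "value"

    result = my_new_tool(fake_client, "key")

    assert result == {"status": "success", "result": "value"}
    fake_client.get.assert_called_once_with("key")
```

The existing tests in tests/test_mcp_server.py follow this same mock-the-client shape.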

📖 Additional Resources

🔧 MCP Client Configuration Details

Obtaining Databricks Credentials

  1. Get your Databricks Host:

    • Log into your Databricks workspace
    • Copy the URL (e.g., https://adb-123456789.10.azuredatabricks.net)
  2. Generate a Personal Access Token:

    • In Databricks, go to User Settings → Developer → Access tokens
    • Click "Generate new token"
    • Copy the token value (keep it secure!)
  3. Find your Warehouse ID (optional but recommended):

    • In Databricks, go to SQL → Warehouses
    • Click on your warehouse name
    • Copy the warehouse ID from the URL or settings

MCP Configuration Examples

Claude Code - Example Configuration

{
  "mcpServers": {
    "databricks": {
      "command": "uv",
      "args": ["run", "mcp-server"],
      "cwd": "/Users/yourname/projects/dbx_mcp_server_demo"
    }
  }
}

With .env file in project directory:

DATABRICKS_HOST=https://adb-1234567890123.14.azuredatabricks.net
DATABRICKS_TOKEN=dapi1234567890abcdef1234567890abcdef12  #gitleaks:allow
DATABRICKS_WAREHOUSE_ID=abc123def456

Cursor - Configuration File Location

  • macOS: ~/Library/Application Support/Cursor/User/settings.json
  • Windows: %APPDATA%\Cursor\User\settings.json
  • Linux: ~/.config/Cursor/User/settings.json

Alternative: Node.js/npx Setup

If you prefer using Node.js instead of UV:

  1. Create a startup script (start-mcp.js):

    #!/usr/bin/env node
    const { spawn } = require('child_process');
    const path = require('path');
    
    const server = spawn('uv', ['run', 'mcp-server'], {
      cwd: __dirname,
      stdio: 'inherit',
      env: { ...process.env }
    });
    
    server.on('exit', (code) => process.exit(code));
    
  2. MCP configuration:

    {
      "mcpServers": {
        "databricks": {
          "command": "node",
          "args": ["start-mcp.js"],
          "cwd": "/path/to/your/dbx_mcp_server_demo"
        }
      }
    }
    

Verifying MCP Integration

After configuring your MCP client:

  1. Check server startup: Look for the Databricks MCP server in your client's MCP server list
  2. Test a simple query: Try using the get_workspace_info tool
  3. Verify permissions: Test list_jobs to ensure your token has proper access

🐛 Troubleshooting

Common Issues

  1. "Databricks client not configured":

    • Check your .env file has the correct values
    • Verify your personal access token is valid
    • Ensure DATABRICKS_HOST includes the full URL with https://
    • Check that environment variables are properly set in your MCP configuration
  2. "No warehouse ID provided":

    • Set DATABRICKS_WAREHOUSE_ID in your .env file, or
    • Add it to the env section of your MCP configuration, or
    • Provide warehouse_id parameter when calling execute_sql
  3. "Command not found: uv":

    • Install UV: curl -LsSf https://astral.sh/uv/install.sh | sh
    • Ensure UV is in your PATH
    • Alternative: Use the Node.js startup script approach
  4. Permission errors:

    • Verify your token has necessary permissions for the operations you're trying to perform
    • Check that you have access to the specific jobs/pipelines you're trying to manage
    • Ensure your token hasn't expired
  5. MCP server not appearing in client:

    • Check that the cwd path is correct and absolute
    • Verify all dependencies are installed (uv sync)
    • Check client logs for connection errors
    • Restart your MCP client after configuration changes
  6. Server startup errors:

    • Run uv run mcp-server manually to see error messages
    • Check that all required environment variables are set
    • Verify Python 3.10 is available and used by UV

Validation Script

Before troubleshooting, run the setup validation script:

uv run python validate_setup.py

This will check your environment, dependencies, and generate MCP configuration templates.

Debug Mode

To debug MCP server issues:

# Run with debug logging
PYTHONPATH=. uv run python -c "
import logging
logging.basicConfig(level=logging.DEBUG)
from mcp_server.main import main
import asyncio
asyncio.run(main())
"

📄 License

This project is provided as a demonstration. Modify and use according to your organization's policies.