bricks-use

aymenfurter/bricks-use

3.2

If you are the rightful owner of bricks-use and would like to certify it and/or have it hosted online, please leave a comment on the right or send an email to henry@mcphub.com.

A powerful Model Context Protocol (MCP) server for executing Databricks SQL queries and comparing table data.

A powerful Model Context Protocol (MCP) server for executing Databricks SQL queries and comparing table data.

⚠️ This project is purely meant for demo purposes - use at your own risk!


Table of Contents

Features

FeatureDescription
Execute SQL QueriesRun any SQL query on Databricks with configurable result limits
Table InformationGet detailed information about tables including schema and row counts
Table ComparisonCompare two tables by downloading their data and running CLI diff
Quick ComparisonFast metadata-only comparison of tables

Quick Start

# 1. Clone and setup
git clone https://github.com/aymenfurter/bricks-use.git
cd bricks-use
python -m venv .venv && source .venv/bin/activate

# 2. Install dependencies
pip install -r requirements.txt

# 3. Configure environment (see setup section)
cp .env.example .env  # Edit with your credentials

# 4. Run the server
python databricks_server.py

Setup

Prerequisites

Python3.11 or higher
DatabricksWorkspace access
TokenPersonal access token

Environment Variables

Set the following environment variables or create a .env file:

# Databricks Configuration
DATABRICKS_SERVER_HOSTNAME=your-workspace.cloud.databricks.com
DATABRICKS_HTTP_PATH=/sql/1.0/warehouses/your-warehouse-id
DATABRICKS_ACCESS_TOKEN=your-personal-access-token

# Optional Settings
DATABRICKS_CATALOG=main                    # Defaults to 'main'
DATABRICKS_SCHEMA=default                  # Defaults to 'default'
DATABRICKS_TEMP_DIR=/tmp/databricks_mcp    # Temp directory

Installation

  1. Create and activate a virtual environment:

    python -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Run the server:

    python databricks_server.py
    

Tools

ToolPurposeKey Parameters
execute_queryExecute SQL queriesquery, limit
get_table_infoGet table metadatatable_name, catalog, schema
compare_tablesFull data comparisontable1, table2, diff_lines
quick_compare_tablesMetadata comparisontable1, table2

execute_query

Execute a SQL query on Databricks.

Parameters:

  • query (str): SQL query to execute
  • limit (int, optional): Maximum rows to return (default: 1000)

Example:

Execute this query: SELECT * FROM my_table WHERE status = 'active'

get_table_info

Get information about a Databricks table.

Parameters:

  • table_name (str): Name of the table
  • catalog (str, optional): Catalog name
  • schema (str, optional): Schema name

Example:

Get info for table 'users' in catalog 'production' and schema 'analytics'

compare_tables

Compare data between two tables by downloading full data and running diff.

Parameters:

  • table1 (str): First table name
  • table2 (str): Second table name
  • catalog1 (str, optional): Catalog for table1
  • schema1 (str, optional): Schema for table1
  • catalog2 (str, optional): Catalog for table2
  • schema2 (str, optional): Schema for table2
  • diff_lines (int, optional): Number of diff context lines (default: 10)

Example:

Compare tables 'users_old' and 'users_new' and show differences

quick_compare_tables

Quick metadata-only comparison without downloading data.

Parameters:

  • table1 (str): First table name
  • table2 (str): Second table name
  • catalog1 (str, optional): Catalog for table1
  • schema1 (str, optional): Schema for table1
  • catalog2 (str, optional): Catalog for table2
  • schema2 (str, optional): Schema for table2

Example:

Quick compare 'table_a' and 'table_b' schemas and row counts

VS Code MCP Integration

Add this configuration to your VS Code settings (mcp.json):

Click to expand VS Code configuration
{
    "inputs": [
        {
            "type": "promptString",
            "id": "databricks_server_hostname",
            "description": "Databricks Server Hostname"
        },
        {
            "type": "promptString",
            "id": "databricks_http_path",
            "description": "Databricks HTTP Path"
        },
        {
            "type": "promptString",
            "id": "databricks_access_token",
            "description": "Databricks Access Token",
            "password": true
        },
        {
            "type": "promptString",
            "id": "databricks_catalog",
            "description": "Databricks Catalog (default: main)"
        },
        {
            "type": "promptString",
            "id": "databricks_schema",
            "description": "Databricks Schema (default: default)"
        }
    ],
    "servers": {
        "databricks": {
            "command": "python",
            "args": [
                "${workspaceFolder}/databricks_server.py"
            ],
            "env": {
                "PYTHONUNBUFFERED": "1",
                "DATABRICKS_SERVER_HOSTNAME": "${input:databricks_server_hostname}",
                "DATABRICKS_HTTP_PATH": "${input:databricks_http_path}",
                "DATABRICKS_ACCESS_TOKEN": "${input:databricks_access_token}",
                "DATABRICKS_CATALOG": "${input:databricks_catalog}",
                "DATABRICKS_SCHEMA": "${input:databricks_schema}"
            },
            "workingDirectory": "${workspaceFolder}"
        }
    }
}

Usage Examples

Execute a Query

Execute this SQL query: SELECT customer_id, order_date, total_amount 
                       FROM orders 
                       WHERE order_date >= '2024-01-01' 
                       LIMIT 100

Compare Tables

Compare tables 'sales_2023' and 'sales_2024' and show me the differences

Quick Schema Comparison

Quick compare the schemas of 'old_users' and 'new_users' tables

License

This project is licensed under the MIT License.

Made with ❤️ for the Databricks community