BigQuery MCP Server
A Model Context Protocol (MCP) server for Google BigQuery, enabling AI assistants to query and interact with BigQuery datasets.
Features
- Query Catalog: Pre-defined, tested SQL queries exposed as MCP resources (no context window bloat!)
- Custom SQL Queries: Execute parameterized BigQuery SQL queries when needed
- Dataset Discovery: List datasets and tables in your project
- Schema Inspection: Get table schemas with field types and descriptions
- Simple Queries: Query tables with WHERE clauses and LIMIT without writing SQL
- Type-Safe: Built with Pydantic for robust data validation
- Configurable: YAML-based configuration with environment variable support
- Scalable: Add hundreds of business queries without impacting LLM performance
Installation
Prerequisites
- Python 3.13+ (recommended for full features)
- For Python 3.10+: Most features work, but type hints use the | union syntax (requires 3.10+)
- Google Cloud Platform account with BigQuery enabled
- GCP credentials (service account or default credentials)
Method 1: Using pip (Recommended)
Install from GCP Artifact Registry:
# Configure pip to use your GCP Artifact Registry
pip install bigquery-mcp \
--index-url https://us-central1-python.pkg.dev/<your-project-id>/<your-repo>/simple/
Or configure authentication once:
# Authenticate with GCP Artifact Registry
gcloud auth application-default login
pip install keyring
pip config set global.index-url https://us-central1-python.pkg.dev/<your-project-id>/<your-repo>/simple/
# Install the package
pip install bigquery-mcp
# Run the server
bigquery-mcp
Method 2: Using pipx (Isolated Install)
For isolated installation without affecting your global Python environment:
# Install pipx if not already installed
pip install pipx
pipx ensurepath
# Install bigquery-mcp from Artifact Registry
pipx install bigquery-mcp \
--pip-args="--index-url https://us-central1-python.pkg.dev/<your-project-id>/<your-repo>/simple/"
# Run the server
bigquery-mcp
Method 3: Using uv (Development)
For development with local source:
# Clone the repository
git clone <your-repo-url>
cd python-mcp
# Install dependencies
uv sync
# Run the server
uv run bigquery-mcp
Which Method Should I Use?
| Method | Best For | Pros | Cons |
|---|---|---|---|
| pip | Production, system-wide installs | Standard tooling, simple | May affect global Python environment |
| pipx | User installs, multiple versions | Isolated environments per app | Extra tool required |
| uv | Development, contributing | Fast, full dev environment | Requires source code |
Configuration
Option 1: Configuration File (Recommended)
Create config/bigquery.yaml:
project_id: "your-gcp-project-id"
dataset_id: "your-default-dataset" # optional
credentials_path: "/path/to/service-account.json" # optional
location: "US" # optional, default: US
max_results: 1000 # optional, default: 1000
Option 2: Environment Variables
export GCP_PROJECT_ID="your-gcp-project-id"
export GCP_DATASET_ID="your-default-dataset" # optional
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json" # optional
export GCP_LOCATION="US" # optional
export GCP_MAX_RESULTS="1000" # optional
Option 3: Custom Config Path
export BIGQUERY_MCP_CONFIG="/custom/path/to/config.yaml"
uv run bigquery-mcp
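For reference, one plausible resolution order is: an explicit BIGQUERY_MCP_CONFIG path, then config/bigquery.yaml, then environment variables. Below is a minimal loader sketch under that assumption (illustrative only, not the project's actual config.py; the field names mirror the options above):

import os

import yaml
from pydantic import BaseModel

class BigQueryConfig(BaseModel):
    project_id: str
    dataset_id: str | None = None
    credentials_path: str | None = None
    location: str = "US"
    max_results: int = 1000

def load_config() -> BigQueryConfig:
    # Explicit config path wins, then the default file location
    path = os.environ.get("BIGQUERY_MCP_CONFIG", "config/bigquery.yaml")
    if os.path.exists(path):
        with open(path) as f:
            return BigQueryConfig(**yaml.safe_load(f))
    # Fall back to environment variables
    return BigQueryConfig(
        project_id=os.environ["GCP_PROJECT_ID"],
        dataset_id=os.environ.get("GCP_DATASET_ID"),
        credentials_path=os.environ.get("GOOGLE_APPLICATION_CREDENTIALS"),
        location=os.environ.get("GCP_LOCATION", "US"),
        max_results=int(os.environ.get("GCP_MAX_RESULTS", "1000")),
    )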
Authentication
The server supports multiple authentication methods (in order of precedence):
- Service Account File: Specified in config or GOOGLE_APPLICATION_CREDENTIALS
- Application Default Credentials: Uses gcloud credentials
To set up gcloud credentials:
gcloud auth application-default login
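To illustrate that precedence in code, a client factory might look like the sketch below, using the standard google-auth and google-cloud-bigquery APIs (make_client is a hypothetical name, not the server's actual code):

import google.auth
from google.oauth2 import service_account
from google.cloud import bigquery

def make_client(project_id: str, credentials_path: str | None = None) -> bigquery.Client:
    if credentials_path:
        # 1. Explicit service account file, from config or
        #    GOOGLE_APPLICATION_CREDENTIALS
        creds = service_account.Credentials.from_service_account_file(credentials_path)
    else:
        # 2. Application Default Credentials
        #    (set up via: gcloud auth application-default login)
        creds, _ = google.auth.default()
    return bigquery.Client(project=project_id, credentials=creds)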
Query Catalog
The server includes a query catalog system that allows you to define pre-written SQL queries that the LLM can discover and execute. This avoids bloating the LLM context window with hundreds of tool definitions.
How It Works
- Define Queries: Create YAML files in config/queries/ with your business queries
- Exposed as Resources: Queries are automatically exposed as MCP resources
- LLM Discovery: The LLM can search and read query definitions via resources
- Execute by ID: Use the execute_catalog_query tool to run queries
Benefits
- No Context Bloat: Add hundreds of queries without impacting LLM performance
- Natural Language: LLM matches user requests to catalog queries by description/tags
- Consistent SQL: Pre-written, tested queries ensure reliability
- Easy Maintenance: Update queries in YAML files without changing code
Example Query Definition
queries:
- id: account_balance_reconciliation
name: Account Balance Reconciliation
description: Compare account balances between internal and external systems
category: reconciliation
tags: [reconciliation, accounting, balance]
parameters:
- name: start_date
type: DATE
description: Start date (YYYY-MM-DD)
required: true
- name: end_date
type: DATE
description: End date (YYYY-MM-DD)
required: true
sql: |
SELECT account_id, internal_balance, external_balance,
(internal_balance - external_balance) as discrepancy
FROM ...
WHERE date BETWEEN @start_date AND @end_date
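A loader for catalog files like this can stay small. The sketch below is illustrative (the model and function names are assumptions, not the actual models.py), using PyYAML and Pydantic as listed under Features:

from pathlib import Path

import yaml
from pydantic import BaseModel

class QueryParameter(BaseModel):
    name: str
    type: str
    description: str
    required: bool = False

class CatalogQuery(BaseModel):
    id: str
    name: str
    description: str
    category: str
    tags: list[str] = []
    parameters: list[QueryParameter] = []
    sql: str

def load_catalog(directory: str = "config/queries") -> dict[str, CatalogQuery]:
    catalog: dict[str, CatalogQuery] = {}
    for path in Path(directory).glob("*.yaml"):
        data = yaml.safe_load(path.read_text())
        for entry in data.get("queries", []):
            query = CatalogQuery(**entry)
            catalog[query.id] = query  # index by id for execute_catalog_query
    return catalog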
Example Use Cases
The included example queries demonstrate:
- Account Balance Reconciliation: Compare internal vs external balances
- Wallet Balance Reconciliation: Verify wallet transactions match balances
- Transaction Reconciliation: Match internal transactions with payment processors
- Daily Transaction Summaries: Revenue and volume metrics by day
- Customer Lifetime Value: LTV calculations and segmentation
- Revenue by Category: Monthly breakdowns and trends
See config/queries/README.md for full documentation on creating your own queries.
Available Tools
1. execute_catalog_query
Execute a pre-defined query from the catalog. Use this for common business operations.
Parameters:
- query_id (string, required): ID of the catalog query to execute
- parameters (object, optional): Query parameters as key-value pairs
- max_results (integer, optional): Maximum number of results to return
Example:
{
"query_id": "account_balance_reconciliation",
"parameters": {
"start_date": "2024-01-01",
"end_date": "2024-01-31"
}
}
Discovery:
- Use MCP resources (list_resources / read_resource) to discover available queries
- The LLM automatically matches user requests to catalog queries
2. execute_query
Execute a custom BigQuery SQL query with optional parameters.
Parameters:
- query (string, required): SQL query to execute (use @param_name for parameters)
- parameters (object, optional): Query parameters as key-value pairs
- max_results (integer, optional): Maximum number of results to return
Example:
{
"query": "SELECT * FROM `project.dataset.table` WHERE name = @name AND age > @min_age",
"parameters": {
"name": "John",
"min_age": 25
},
"max_results": 100
}
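The @param_name placeholders correspond to BigQuery named query parameters. Here is a sketch of running the example above with google-cloud-bigquery; the value-to-type mapping shown is an illustrative guess at the inference, not the server's confirmed behavior:

from google.cloud import bigquery

def to_query_parameter(name: str, value: object) -> bigquery.ScalarQueryParameter:
    # Infer a BigQuery type from the Python value
    # (bool must be checked before int, since bool subclasses int)
    if isinstance(value, bool):
        return bigquery.ScalarQueryParameter(name, "BOOL", value)
    if isinstance(value, int):
        return bigquery.ScalarQueryParameter(name, "INT64", value)
    if isinstance(value, float):
        return bigquery.ScalarQueryParameter(name, "FLOAT64", value)
    return bigquery.ScalarQueryParameter(name, "STRING", value)

client = bigquery.Client(project="your-gcp-project-id")
sql = "SELECT * FROM `project.dataset.table` WHERE name = @name AND age > @min_age"
job_config = bigquery.QueryJobConfig(
    query_parameters=[
        to_query_parameter("name", "John"),
        to_query_parameter("min_age", 25),
    ]
)
rows = client.query(sql, job_config=job_config).result(max_results=100)
print([dict(row) for row in rows])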
Catalog Query vs Custom SQL
The server provides intelligent guidance to the LLM on when to use catalog queries vs custom SQL:
- Prefer catalog queries for common business operations and reporting
- Use custom SQL for ad-hoc exploration or unique one-time queries
- Resources expose catalog without polluting the tool list
3. list_datasets
List all datasets in the BigQuery project.
Parameters: None
Returns:
{
"datasets": ["dataset1", "dataset2", "dataset3"]
}
4. list_tables
List all tables in a specific dataset.
Parameters:
- dataset_id (string, required): Dataset ID to list tables from
Returns:
{
"tables": ["table1", "table2", "table3"]
}
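Both listing tools are thin wrappers over the BigQuery client. Roughly (a sketch against the public google-cloud-bigquery API):

from google.cloud import bigquery

client = bigquery.Client(project="your-gcp-project-id")

# list_datasets: dataset IDs in the project
datasets = [d.dataset_id for d in client.list_datasets()]

# list_tables: table IDs in one dataset
tables = [t.table_id for t in client.list_tables("my_dataset")]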
5. get_table_schema
Get the schema of a specific table.
Parameters:
- dataset_id (string, required): Dataset ID
- table_id (string, required): Table ID
Returns:
{
"schema": [
{
"name": "id",
"type": "INTEGER",
"mode": "REQUIRED",
"description": "Unique identifier"
},
{
"name": "name",
"type": "STRING",
"mode": "NULLABLE",
"description": "User name"
}
]
}
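This mirrors the table's SchemaField metadata in google-cloud-bigquery; fetching it directly looks roughly like:

from google.cloud import bigquery

client = bigquery.Client(project="your-gcp-project-id")
table = client.get_table("my_dataset.users")  # dataset.table reference
schema = [
    {
        "name": field.name,
        "type": field.field_type,
        "mode": field.mode,
        "description": field.description,
    }
    for field in table.schema
]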
6. query_table
Query a table with optional WHERE clause and LIMIT.
Parameters:
- dataset_id (string, required): Dataset ID
- table_id (string, required): Table ID
- limit (integer, optional): Maximum number of rows to return
- where (string, optional): WHERE clause (without the WHERE keyword)
Example:
{
"dataset_id": "my_dataset",
"table_id": "users",
"limit": 10,
"where": "age > 18 AND country = 'US'"
}
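Conceptually the tool composes a plain SELECT from these inputs. A hedged sketch of that composition (the actual server may build and validate the statement differently):

def build_query(project: str, dataset_id: str, table_id: str,
                where: str | None = None, limit: int | None = None) -> str:
    # Backticks guard against reserved words in identifiers
    sql = f"SELECT * FROM `{project}.{dataset_id}.{table_id}`"
    if where:
        sql += f" WHERE {where}"
    if limit:
        sql += f" LIMIT {limit}"
    return sql

print(build_query("your-gcp-project-id", "my_dataset", "users",
                  where="age > 18 AND country = 'US'", limit=10))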
MCP Resources
The server exposes catalog queries as MCP resources:
- URI Format: bigquery://queries/{query_id}
- Discovery: Use list_resources to see all available catalog queries
- Details: Use read_resource to get full query details including SQL, parameters, and descriptions
- No Context Bloat: Resources don't consume tool slots or context window space
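From the client side, discovery might look like the sketch below, assuming the official mcp Python SDK and the environment-variable configuration from earlier:

import asyncio

from pydantic import AnyUrl

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    params = StdioServerParameters(
        command="bigquery-mcp",
        env={"GCP_PROJECT_ID": "your-project-id"},
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Enumerate catalog queries exposed as resources
            listed = await session.list_resources()
            for resource in listed.resources:
                print(resource.uri, "-", resource.name)
            # Read one query's full definition (SQL, parameters, description)
            detail = await session.read_resource(
                AnyUrl("bigquery://queries/account_balance_reconciliation")
            )
            print(detail.contents)

asyncio.run(main())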
Using with MCP Clients
Claude Desktop
Add to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS or %APPDATA%\Claude\claude_desktop_config.json on Windows):
Using pip or pipx (Recommended)
{
"mcpServers": {
"bigquery": {
"command": "bigquery-mcp",
"env": {
"GCP_PROJECT_ID": "your-project-id",
"GOOGLE_APPLICATION_CREDENTIALS": "/path/to/credentials.json"
}
}
}
}
Using uv (Development)
{
"mcpServers": {
"bigquery": {
"command": "uv",
"args": [
"--directory",
"/path/to/python-mcp",
"run",
"bigquery-mcp"
],
"env": {
"GCP_PROJECT_ID": "your-project-id",
"GOOGLE_APPLICATION_CREDENTIALS": "/path/to/credentials.json"
}
}
}
}
Other MCP Clients
The server communicates via stdio using the MCP protocol. Any MCP-compatible client can connect by running:
bigquery-mcp
Development
Running Type Checks
uv run basedpyright src/
Running Linter
uv run ruff check src/
Formatting Code
uv run ruff format src/
Running Tests
uv run pytest
Project Structure
python-mcp/
├── src/
│ └── bigquery_mcp/
│ ├── __init__.py
│ ├── server.py # Main MCP server
│ ├── bigquery_client.py # BigQuery client wrapper
│ ├── tools.py # Tool definitions
│ ├── models.py # Pydantic models
│ └── config.py # Configuration loader
├── config/
│ └── bigquery.yaml.example # Example configuration
├── examples/
│ └── example_queries.sql # Sample SQL queries
├── tests/ # Test files
├── pyproject.toml # Project configuration
└── README.md
Architecture
The server follows the MCP Toolbox pattern:
- Configuration Layer (config.py): Loads settings from YAML or environment variables
- Client Layer (bigquery_client.py): Wraps BigQuery API with async methods
- Tools Layer (tools.py): Defines MCP tools and handles parameter validation
- Server Layer (server.py): MCP server implementation with stdio transport
- Models Layer (models.py): Pydantic models for type safety
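For orientation only, wiring such layers together with the FastMCP helper from the mcp SDK could look like this sketch (the actual server.py may use a different structure or the lower-level Server API):

from mcp.server.fastmcp import FastMCP
from google.cloud import bigquery

mcp = FastMCP("bigquery")   # server layer: stdio MCP server
client = bigquery.Client()  # client layer: uses Application Default Credentials

@mcp.tool()                 # tools layer: each tool becomes an MCP tool definition
def list_datasets() -> dict:
    """List all datasets in the BigQuery project."""
    return {"datasets": [d.dataset_id for d in client.list_datasets()]}

if __name__ == "__main__":
    mcp.run()               # stdio transport by default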
Comparison to MCP Toolbox
This implementation is inspired by googleapis/genai-toolbox but:
- Python-based instead of Go
- BigQuery-specific instead of multi-database
- Simpler POC focused on core functionality
- Direct MCP integration without separate control plane
License
MIT License - see the LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.