HoloborodkoBohdan/gbq-mcp
BigQuery MCP Server
Production-ready MCP server for secure, read-only access to Google BigQuery datasets. Features table-level access control, query cost estimation, HTTP transport, and comprehensive security validation.
📚 Documentation:
- System design and architecture
- Windows + WSL configuration
- Share server with others
Quick Start
1. Install Dependencies
pip install -r requirements.txt
2. Setup Authentication
For public datasets: Place your service account JSON file at:
/mcp-gbq/service-account.json
The server will automatically use this file if it exists.
Alternative methods (if not using service-account.json):
# Option 1: Use your Google account
gcloud auth application-default login
# Option 2: Set environment variable
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
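For reference, the credential fallback described above can be expressed in a few lines. This is an illustrative sketch assuming the google-cloud-bigquery client, not necessarily the server's actual code:
# Illustrative credential fallback; the real server code may differ.
import os
from google.cloud import bigquery

SA_PATH = os.path.join(os.path.dirname(__file__), "service-account.json")

def make_client() -> bigquery.Client:
    if os.path.exists(SA_PATH):
        # Explicit service account file takes priority
        return bigquery.Client.from_service_account_json(SA_PATH)
    # Otherwise fall back to Application Default Credentials
    # (gcloud auth application-default login, or GOOGLE_APPLICATION_CREDENTIALS)
    return bigquery.Client()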
3. Run the Server
HTTP mode (default - recommended for deployment):
# Install HTTP dependencies
pip install fastapi uvicorn
# Run on default port 8000
python server.py
# Run on custom port
python server.py 8765
Stdio mode (for Desktop Agent integration):
python server.py --stdio
Ngrok mode (share with others):
# Install ngrok support
pip install pyngrok
# Start server with ngrok
python server.py --ngrok
# Returns public URL like: https://abc123.ngrok-free.app
# Share this URL with anyone!
📖 See the full ngrok guide for authentication, monitoring, and troubleshooting details.
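For context, pyngrok keeps tunnel setup simple; the essence of --ngrok mode is roughly the following sketch (illustrative, not the server's exact code):
# Sketch of exposing the local HTTP server via pyngrok.
from pyngrok import ngrok

tunnel = ngrok.connect(8000)               # open a tunnel to the local port
print(f"Public URL: {tunnel.public_url}")  # e.g. https://abc123.ngrok-free.app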
Connect to Desktop Agent
macOS/Linux
Add to your Desktop Agent config file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
{
"mcpServers": {
"bigquery": {
"command": "python",
"args": ["/absolute/path/to/mcp-gbq/server.py", "--stdio"]
}
}
}
Restart Desktop Agent. You should see the MCP server connected with 5 tools available.
Windows + WSL
📖 See the Windows + WSL guide for complete configuration details.
Quick configuration:
{
"mcpServers": {
"bigquery": {
"command": "wsl",
"args": [
"bash",
"-c",
"cd /home/YOUR_USERNAME/mcp-gbq && source venv/bin/activate && python server.py --stdio"
]
}
}
}
Replace YOUR_USERNAME with your WSL username.
Local HTTP Server
The server runs in HTTP mode by default for local testing and development.
Start Local HTTP Server
# Start HTTP server
python server.py 8000
# Test health endpoint
curl http://localhost:8000/health
# Access in browser
open http://localhost:8000
HTTP Endpoints
The server exposes the following HTTP endpoints:
GET /
Root endpoint with service information.
Response:
{
"service": "BigQuery MCP Server",
"status": "running",
"endpoints": {
"health": "/health",
"mcp": "/mcp"
}
}
GET /health
Health check endpoint for monitoring and load balancers.
Response:
{
"status": "healthy",
"service": "bigquery-mcp",
"timestamp": "2025-10-22T08:46:24.271778Z"
}
Use cases:
- Container health checks (Docker, Kubernetes)
- Load balancer health probes
- Monitoring systems (Prometheus, Datadog, etc.)
- Uptime monitoring
POST /mcp
MCP protocol endpoint for tool execution.
This is the main endpoint used by MCP clients to:
- List available tools
- Execute queries
- Get table schemas
- Estimate costs
Note: This endpoint uses the MCP protocol and is typically accessed through MCP clients, not directly via HTTP requests.
Testing with MCP Inspector
npx @modelcontextprotocol/inspector python server.py --stdio
Available Tools
The server provides 5 tools:
- get_query_limits - View current query limits and BigQuery configuration
- list_tables - See all available datasets and tables
- get_table_schema - View table structure and field types
- estimate_query_cost - Estimate query cost without executing (dry-run)
- bq_query - Run SELECT queries on allowed tables
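For programmatic access outside Desktop Agent, the official MCP Python SDK (pip install mcp) can talk to the server's /mcp endpoint and invoke these tools. A minimal sketch; note that the bq_query argument name used here ("sql") is an assumption, so inspect the real schema via list_tools first:
# Hedged sketch using the official `mcp` Python SDK over streamable HTTP.
import asyncio
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main() -> None:
    async with streamablehttp_client("http://localhost:8000/mcp") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([t.name for t in tools.tools])  # expect the 5 tools above
            # The argument name "sql" is an assumption; check the tool schema.
            result = await session.call_tool("bq_query", {"sql": "SELECT 1 AS test"})
            print(result.content)

asyncio.run(main())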
Available Resources
The server exposes 4 MCP resources for browsable data discovery:
bigquery://tables
List all available tables and patterns you have access to. Provides a quick overview of your data catalog.
bigquery://table/{table_id}/schema
Get detailed schema information for a specific table including:
- Row count and table size
- Creation and modification dates
- Field names, types, and descriptions
- Formatted as a readable markdown table
Example: bigquery://table/bigquery-public-data.iowa_liquor_sales.sales/schema
bigquery://datasets
Browse all accessible datasets with their configurations:
- Datasets with full access
- Individual table permissions
- Wildcard patterns
- Blacklisted tables
bigquery://limits
View current query limits and configuration:
- Max results per query
- Billing limits
- Cost information
- How to adjust settings
Benefits of Resources:
- 📖 Claude can browse and discover your data without executing tools
- 🔍 Natural data exploration and schema inspection
- 💡 Better context for query generation
- ⚡ Faster responses (no tool execution needed for metadata)
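Resources can also be read programmatically with the same SDK; a brief sketch over stdio transport, using the schema URI from the example above (text contents are assumed):
# Sketch: reading an MCP resource over stdio transport.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from pydantic import AnyUrl

async def main() -> None:
    params = StdioServerParameters(command="python", args=["server.py", "--stdio"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.read_resource(
                AnyUrl("bigquery://table/bigquery-public-data.iowa_liquor_sales.sales/schema")
            )
            for item in result.contents:           # text contents assumed
                print(getattr(item, "text", item))

asyncio.run(main())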
Security Features
Read-Only Enforcement:
- Only SELECT queries allowed
- Blocks: DELETE, UPDATE, INSERT, CREATE, DROP, ALTER, MERGE, TRUNCATE, REPLACE, GRANT, REVOKE
- Prevents SQL injection and multi-statement attacks
- Removes comments and string literals before validation
- Validates against allowed table whitelist
Query Limits (configurable via .env):
- Maximum 10,000 rows per query (default - set MAX_QUERY_RESULTS to change)
- 100 MB billing limit per query (default - set MAX_BYTES_BILLED_MB to change)
- Table access controlled by access-control.json
Validation Examples:
✓ SELECT * FROM table WHERE name = 'DELETE' -- String literals OK
✓ SELECT * FROM table /* comment */ -- Comments removed safely
✗ SELECT * FROM table; DROP TABLE users; -- Multi-statement blocked
✗ DELETE FROM table -- Non-SELECT blocked
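For illustration, a simplified version of such a validator might look like the sketch below; the server's actual validator is more thorough:
# Simplified read-only check: strip string literals and comments, then require
# exactly one statement that starts with SELECT and contains no write keywords.
import re

BLOCKED = r"\b(DELETE|UPDATE|INSERT|CREATE|DROP|ALTER|MERGE|TRUNCATE|REPLACE|GRANT|REVOKE)\b"

def is_read_only(sql: str) -> bool:
    cleaned = re.sub(r"'[^']*'|\"[^\"]*\"", "''", sql)                 # remove string literals
    cleaned = re.sub(r"--[^\n]*|/\*.*?\*/", " ", cleaned, flags=re.S)  # remove comments
    statements = [s for s in cleaned.split(";") if s.strip()]
    if len(statements) != 1:                                           # block multi-statement
        return False
    stmt = statements[0].strip().upper()
    if not stmt.startswith("SELECT"):                                  # non-SELECT blocked
        return False
    return re.search(BLOCKED, stmt) is None                            # block write keywords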
Current Datasets
- bigquery-public-data.iowa_liquor_sales.sales - Iowa liquor retail sales
- bigquery-public-data.austin_bikeshare.bikeshare_stations - Austin bike stations
- bigquery-public-data.austin_bikeshare.bikeshare_trips - Austin bike trip history
How to Use
Once connected to Desktop Agent, you can interact with BigQuery through natural language. Claude will automatically use the MCP tools.
Example Queries
List available tables:
You: What tables are available in BigQuery?
You: Show me what datasets I can query
Explore table schemas:
You: What's the schema for the Iowa liquor sales table?
You: Show me the fields in austin bikeshare trips
You: What columns are available in the bikeshare stations table?
Estimate query costs:
You: How much will this query cost to run? SELECT * FROM `bigquery-public-data.iowa_liquor_sales.sales`
You: Estimate the cost of querying all Austin bike trips
You: What's the data size for this query?
Query data:
You: Show me the top 10 liquor sales from Iowa
You: Which cities in Iowa have the highest liquor sales?
You: Get 5 recent bike trips from Austin
You: How many bike stations are there in each council district?
You: What's the average trip duration in the Austin bikeshare system?
Complex analysis:
You: Analyze Iowa liquor sales by category and show trends
You: Compare bike usage patterns between different Austin neighborhoods
You: Find the busiest bike stations in Austin
Example Conversation
You: What datasets do I have access to?
Claude: [Uses list_tables and shows 3 available tables with descriptions]
You: Show me the schema for austin bikeshare trips
Claude: [Uses get_table_schema and displays field names, types, and metadata]
You: Get the top 5 most popular start stations
Claude: [Uses bq_query with SQL and presents results]
You: Now show me the average trip duration for each subscriber type
Claude: [Executes another query and analyzes the data]
Direct SQL Queries
You can also write SQL directly:
You: Run this query:
SELECT store_name, SUM(sale_dollars) as total
FROM `bigquery-public-data.iowa_liquor_sales.sales`
GROUP BY store_name
ORDER BY total DESC
LIMIT 10
Advanced Features
Automatic Cost Protection
All queries automatically perform dry-run cost estimation first!
The server protects against expensive queries by:
- Running a dry-run to estimate cost before execution
- Comparing estimated bytes against billing limits (100 MB default)
- Requiring user confirmation for queries that exceed limits
Example workflow:
You: Query all Iowa liquor sales for the entire year
Claude: [Runs dry-run first]
Claude: ⚠️ This query will process 2.5 GB and cost approximately $0.0125.
This exceeds the 100 MB limit. Do you want to proceed?
You: Yes, proceed
Claude: [Executes query with confirmed=True]
Claude: [Returns results with actual cost information]
Cost estimate includes:
- Bytes to be processed
- Data size in MB/GB
- Estimated cost in USD ($5 per TB)
- Comparison to configured limits
Benefits:
- Prevents accidental expensive queries
- User always knows the cost before execution
- No surprises on your BigQuery bill
- Can still run large queries after confirmation
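Under the hood this maps directly onto BigQuery's dry-run API. A minimal sketch of the estimation step using the google-cloud-bigquery client (the server's real implementation adds the confirmation handling described above):
# Minimal dry-run cost estimate with google-cloud-bigquery (illustrative).
from google.cloud import bigquery

MAX_BYTES_BILLED = 100 * 1024 * 1024  # 100 MB default limit

def estimate_cost(client: bigquery.Client, sql: str) -> dict:
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)    # no data is read or billed
    scanned = job.total_bytes_processed
    return {
        "bytes_processed": scanned,
        "estimated_cost_usd": scanned / 1e12 * 5,  # $5 per TB on-demand pricing
        "needs_confirmation": scanned > MAX_BYTES_BILLED,
    }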
Async Lifecycle Management
The server uses proper async context management for BigQuery client lifecycle:
- Automatic connection setup on startup
- Graceful cleanup on shutdown
- Resource pooling for better performance
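In Python this pattern is commonly written as an async context manager; a minimal sketch of the idea (not necessarily the server's exact code):
# Sketch of async client lifecycle management (illustrative).
from contextlib import asynccontextmanager
from google.cloud import bigquery

@asynccontextmanager
async def bigquery_lifespan():
    client = bigquery.Client()   # connection setup on startup
    try:
        yield client             # server handles requests while open
    finally:
        client.close()           # graceful cleanup on shutdown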
HTTP Transport
Run the server with HTTP transport for local access or sharing via ngrok:
# Start HTTP server (local access)
python server.py 8000
# Share publicly via ngrok
python server.py --ngrok
# Access via HTTP endpoint
curl http://localhost:8000
Use cases:
- Local testing and development
- Share with others via ngrok
- Access from web applications
Cost Estimation (Dry-run)
Estimate query costs before execution:
# Via Desktop Agent
You: "Estimate cost for: SELECT * FROM bigquery-public-data.iowa_liquor_sales.sales WHERE date > '2020-01-01'"
# Response includes:
# - Bytes to be processed
# - Estimated cost in USD
# - Data size in MB/GB
Benefits:
- Avoid expensive queries
- Budget planning
- Query optimization
Testing
Run All Tests
python tests/run_all_tests.py
Expected output:
Total Tests: 54
Passed: 54 ✓
Failed: 0
Pass Rate: 100.0%
🎉 All tests passed! 🎉
Test Coverage
✅ SQL validation, features, and imports - 54 tests total. See the tests/ directory for details.
What's Next
Add More Datasets
Edit access-control.json to add more tables or datasets:
{
"allowed_tables": [
"bigquery-public-data.iowa_liquor_sales.sales",
"bigquery-public-data.dataset_name.table_name"
],
"allowed_datasets": {
"bigquery-public-data.austin_bikeshare": {
"allow_all_tables": true,
"blacklisted_tables": [],
"description": "Austin bike sharing system"
},
"bigquery-public-data.dataset_name": {
"allow_all_tables": true,
"blacklisted_tables": ["sensitive_table"],
"description": "Your dataset description"
}
},
"allowed_patterns": [
"bigquery-public-data.*"
]
}
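To see how the three rule types interact, here is a hypothetical evaluation function (illustrative only; the server's actual enforcement logic may differ):
# Hypothetical evaluation of access-control.json rules (illustrative).
import fnmatch
import json

def is_table_allowed(table_id: str, config: dict) -> bool:
    # 1. Explicit per-table allow list
    if table_id in config.get("allowed_tables", []):
        return True
    # 2. Dataset-level rules with optional blacklist
    dataset, _, table = table_id.rpartition(".")
    rules = config.get("allowed_datasets", {}).get(dataset)
    if rules and rules.get("allow_all_tables"):
        return table not in rules.get("blacklisted_tables", [])
    # 3. Wildcard patterns
    return any(fnmatch.fnmatch(table_id, p)
               for p in config.get("allowed_patterns", []))

with open("access-control.json") as f:
    cfg = json.load(f)
print(is_table_allowed("bigquery-public-data.austin_bikeshare.bikeshare_trips", cfg))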
Browse public datasets: https://console.cloud.google.com/marketplace/browse?filter=solution-type:dataset
Add Features
Potential enhancements:
- Query result caching
- Rate limiting per user
- Query history logging
- Custom prompt templates for common queries
- Support for parameterized queries
Troubleshooting
Permission Denied Errors
If you see "Permission denied" or "403" errors:
1. Check your service account has the right roles:
Your service account needs at minimum:
- BigQuery User role (to run queries)
- BigQuery Data Viewer role (to read data)
In Google Cloud Console:
- Go to IAM & Admin > Service Accounts
- Find your service account
- Click "Permissions" tab
- Grant roles: BigQuery User and BigQuery Data Viewer
2. Enable billing on your project:
Public datasets are free to query, but you still need billing enabled:
- Go to Billing in Google Cloud Console
- Link a billing account to your project
3. For public datasets:
If querying public datasets (like bigquery-public-data.*), your service account needs:
- BigQuery User role on YOUR project (where queries run)
- Billing enabled on YOUR project
You don't need special permissions on the bigquery-public-data project.
4. Verify service account file:
# Check service account file exists
ls -la /mcp-gbq/service-account.json
# Verify it's valid JSON
cat service-account.json | jq .project_id
Table Not Found Errors
"Table not found" or "404":
- Verify table ID is correct: project.dataset.table
- Check table exists in BigQuery console
- Ensure table is in the allowed list (use list_tables to verify)
Query Errors
"Exceeded budget":
- Queries are limited to 100 MB billing
- Add a LIMIT clause to reduce data scanned
- Use WHERE to filter data before scanning
Connection errors:
- Restart Desktop Agent
- Check WSL is running: wsl --status in Windows
- Verify Python and venv work: wsl bash -c "cd /mcp-gbq && source venv/bin/activate && python --version"
Test Your Setup
Run this in WSL to test authentication:
cd ~/mcp-gbq
source venv/bin/activate
python -c "
from google.cloud import bigquery
import json

with open('service-account.json') as f:
    info = json.load(f)

project = info['project_id']
client = bigquery.Client.from_service_account_json('service-account.json', project=project)
query = 'SELECT 1 as test'
result = list(client.query(query).result())
print(f'✓ Authentication works! Project: {project}')
print(f'✓ Test query result: {result}')
"
If this works, the MCP server should work too.