
mcp-bigquery

by caron14

Safe BigQuery exploration through Model Context Protocol

Documentation | Quick Start | Tools | Examples


📌 What is this?

mcp-bigquery is an MCP (Model Context Protocol) server that enables AI assistants like Claude to safely interact with Google BigQuery.

🎯 Key Features

```mermaid
graph LR
    A[AI Assistant] -->|MCP Protocol| B[mcp-bigquery]
    B -->|Dry-run Only| C[BigQuery API]
    B -.->|❌ Never Executes| D[Actual Query Execution]
```

  • 🛡️ 100% Safe: All operations are dry-run only (never executes queries)

  • 💰 Cost Transparency: See costs before running any query

  • 🔍 Complete Analysis: Analyze SQL structure, dependencies, and performance

  • 📊 Schema Explorer: Browse datasets, tables, and columns with ease
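
The dry-run guarantee maps onto a standard BigQuery API flag rather than any custom sandboxing. A rough sketch of the mechanism, assuming the official `google-cloud-bigquery` client (`dry_run_bytes` and `format_scan_size` are illustrative helpers, not part of this package):

```python
def dry_run_bytes(sql: str, project: str) -> int:
    """Validate a query and return the bytes it would scan, without running it."""
    from google.cloud import bigquery  # lazy import; needs GCP credentials

    client = bigquery.Client(project=project)
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)  # returns immediately; nothing executes
    return job.total_bytes_processed

def format_scan_size(num_bytes: int) -> str:
    """Human-readable scan size for reporting."""
    return f"{num_bytes / 1e9:.2f} GB"
```

With `dry_run=True`, BigQuery only plans the query, so there is no way for a tool call to touch or bill for actual data.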

⚡ Why use mcp-bigquery?

| Problem | Solution with mcp-bigquery |
| --- | --- |
| 💸 Accidentally running expensive queries | Check costs before execution |
| 🐛 Wasting time on SQL errors | Detect syntax errors before running |
| 🗺️ Unknown table structures | Easily explore schemas |
| ⚠️ AI executing dangerous operations | Everything is read-only and safe |

🚀 Quick Start (4 minutes)

Step 1: Install (1 minute)

```bash
pip install mcp-bigquery
```

Step 2: Authenticate with Google Cloud (2 minutes)

```bash
# For personal accounts
gcloud auth application-default login

# For service accounts
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
```

Step 3: Configure Claude Desktop (1 minute)

Open your Claude Desktop config:

  • Mac: ~/Library/Application Support/Claude/claude_desktop_config.json

  • Windows: %APPDATA%\Claude\claude_desktop_config.json

Add this configuration (replace `your-gcp-project-id` with your own GCP project ID):

```json
{
  "mcpServers": {
    "mcp-bigquery": {
      "command": "mcp-bigquery",
      "env": {
        "BQ_PROJECT": "your-gcp-project-id"
      }
    }
  }
}
```

Step 4: Test It!

Restart Claude Desktop and try these questions:

  • "What datasets are available in my BigQuery project?"

  • "Can you estimate the cost of: SELECT * FROM dataset.table"

  • "Show me the schema for the users table"

🛠️ Available Tools

📝 SQL Validation & Analysis

| Tool | Purpose | When to Use |
| --- | --- | --- |
| `bq_validate_sql` | Check SQL syntax | Before running any query |
| `bq_dry_run_sql` | Get cost estimates & metadata | 💰 To check costs |
| `bq_analyze_query_structure` | Analyze query complexity | To improve performance |
| `bq_extract_dependencies` | Extract table dependencies | To understand data lineage |
| `bq_validate_query_syntax` | Detailed error analysis | To debug SQL errors |

🔍 Schema Discovery

| Tool | Purpose | When to Use |
| --- | --- | --- |
| `bq_list_datasets` | List all datasets | To explore your project |
| `bq_list_tables` | List tables with partitioning info | To browse a dataset |
| `bq_describe_table` | Get detailed table schema | To understand columns |
| `bq_get_table_info` | Complete table metadata | To get statistics |
| `bq_query_info_schema` | Query INFORMATION_SCHEMA | For advanced metadata queries |
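
For orientation, `bq_query_info_schema` builds on BigQuery's standard `INFORMATION_SCHEMA` views. A typical metadata query of the kind it can dry-run (the dataset name here is a placeholder, not part of this package):

```python
# Lists base tables and their CREATE statements in one dataset.
# `my_dataset` is illustrative; INFORMATION_SCHEMA.TABLES is a standard
# BigQuery view whose columns include table_name, table_type, and ddl.
INFO_SCHEMA_SQL = """
SELECT table_name, table_type, ddl
FROM `my_dataset.INFORMATION_SCHEMA.TABLES`
WHERE table_type = 'BASE TABLE'
ORDER BY table_name
"""
```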

⚡ Performance Optimization

| Tool | Purpose | When to Use |
| --- | --- | --- |
| `bq_analyze_query_performance` | Analyze performance | To optimize queries |

💡 Real-World Examples

Example 1: Check Costs Before Running

```python
# Before running an expensive query...
query = "SELECT * FROM `bigquery-public-data.github_repos.commits`"

# First, check the cost
result = bq_dry_run_sql(sql=query)
print(f"Estimated cost: ${result['usdEstimate']}")
print(f"Data processed: {result['totalBytesProcessed'] / 1e9:.2f} GB")

# Output:
# Estimated cost: $12.50
# Data processed: 2500.00 GB
```

Example 2: Understand Table Structure

```python
# Check table schema
result = bq_describe_table(
    dataset_id="your_dataset",
    table_id="users",
)

# Output:
# ├── user_id (INTEGER, REQUIRED)
# ├── email (STRING, NULLABLE)
# ├── created_at (TIMESTAMP, REQUIRED)
# └── profile (RECORD, REPEATED)
#     ├── name (STRING)
#     └── age (INTEGER)
```

Example 3: Get Optimization Suggestions

```python
# Analyze a slow query
query = """
SELECT *
FROM large_table
WHERE date > '2024-01-01'
"""
result = bq_analyze_query_performance(sql=query)

# Output:
# Performance Score: 45/100 (Needs Improvement)
#
# Optimization Suggestions:
# 1. Avoid SELECT * - specify only needed columns
# 2. Use partition filter on date field
# 3. Consider adding LIMIT clause
```

Example 4: Track Data Dependencies

```python
# Understand query dependencies
query = """
WITH user_stats AS (
    SELECT user_id, COUNT(*) AS order_count
    FROM orders
    GROUP BY user_id
)
SELECT u.name, s.order_count
FROM users u
JOIN user_stats s ON u.id = s.user_id
"""
result = bq_extract_dependencies(sql=query)

# Output:
# Tables: ['orders', 'users']
# Columns: ['user_id', 'name', 'id']
# Dependency Graph:
#   orders → user_stats → final_result
#   users  → final_result
```

🎨 How It Works

```
Your Code ←→ Claude/AI Assistant
        ↓ MCP Protocol
   mcp-bigquery
        ↓
BigQuery API (Dry-run)
        ↓
BigQuery (Never executes actual queries)
```

⚙️ Configuration

Environment Variables

```bash
export BQ_PROJECT="my-project"         # GCP Project ID (required)
export BQ_LOCATION="asia-northeast1"   # Region (optional)
export SAFE_PRICE_PER_TIB="5.0"        # Price per TiB (default: $5)
export DEBUG="true"                    # Enable debug logging
```
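
`SAFE_PRICE_PER_TIB` presumably feeds the cost estimate via BigQuery's on-demand pricing model: bytes scanned, divided by one TiB, times the per-TiB price. A minimal sketch of that arithmetic (the helper name is illustrative, not this package's API):

```python
TIB = 2**40  # one tebibyte in bytes

def usd_estimate(total_bytes_processed: int, price_per_tib: float = 5.0) -> float:
    """On-demand cost estimate: (bytes scanned / 1 TiB) * price per TiB."""
    return round(total_bytes_processed / TIB * price_per_tib, 4)
```

For example, a dry run reporting exactly 1 TiB scanned yields an estimate of `price_per_tib` dollars.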

Full Claude Desktop Configuration

```json
{
  "mcpServers": {
    "mcp-bigquery": {
      "command": "mcp-bigquery",
      "env": {
        "BQ_PROJECT": "my-production-project",
        "BQ_LOCATION": "asia-northeast1",
        "SAFE_PRICE_PER_TIB": "6.0",
        "DEBUG": "false"
      }
    }
  }
}
```

🔧 Troubleshooting

Common Issues & Solutions

❌ Authentication Error

Error: Could not automatically determine credentials

Solution:

```bash
gcloud auth application-default login
```

❌ Permission Error

Error: User does not have bigquery.tables.get permission

Solution: Grant BigQuery Data Viewer role

```bash
gcloud projects add-iam-policy-binding YOUR_PROJECT \
  --member="user:your-email@example.com" \
  --role="roles/bigquery.dataViewer"
```

❌ Project Not Set

Error: Project ID is required

Solution: Set BQ_PROJECT in your configuration

Debug Mode

If issues persist, enable debug mode:

```json
{
  "env": {
    "DEBUG": "true",
    "BQ_PROJECT": "your-project"
  }
}
```

📚 Learn More

Getting Started

For Developers

🚦 Project Status

| Version | Release Date | Key Features |
| --- | --- | --- |
| v0.4.1 | 2025-01-22 | Better error handling, debug logging |
| v0.4.0 | 2025-01-22 | Added 6 schema discovery tools |
| v0.3.0 | 2025-01-17 | SQL analysis engine |
| v0.2.0 | 2025-01-16 | Basic validation & dry-run |

🤝 Contributing

Pull requests are welcome! See our Contributing Guide.

```bash
# Setup development environment
git clone https://github.com/caron14/mcp-bigquery.git
cd mcp-bigquery
pip install -e ".[dev]"

# Run tests
pytest tests/
```

📄 License

MIT License - see LICENSE for details.

🙏 Acknowledgments

  • Google BigQuery team for the excellent API

  • Anthropic for the MCP protocol

  • All contributors and users


Built for safe BigQuery exploration 🛡️

Report Bug · Request Feature · Discussions
