# Project Summary - GCP Sales Analytics POC
## Overview
This is a complete proof-of-concept implementation demonstrating an AI-powered sales analytics system that intelligently queries multiple data sources using the Anthropic Claude API and Model Context Protocol (MCP).
## What Was Built
### 1. Infrastructure (Terraform)
- **Cloud SQL PostgreSQL instance** with public IP access
- Automated deployment with configurable regions
- Database creation and user management
- Cost-optimized settings for POC (db-f1-micro)
### 2. Database Schema & Data
- **Three tables**: customers, orders, vendors
- **Synthetic data generation**: 50 records per table
- Realistic transactional data using Faker library
- Includes relationships and indexes for performance
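As an illustration of what the seed data looks like, here is a simplified stand-in for the Faker-based generator in `generate_seed_data.py`. It uses only the standard library's `random` module (the real script uses Faker), and the name/state pools are illustrative assumptions; the field names match the `customers` schema below.

```python
import random
import string

# Illustrative value pools (the real generator uses the Faker library).
FIRST_NAMES = ["Ava", "Liam", "Noah", "Mia", "Zoe", "Eli"]
LAST_NAMES = ["Smith", "Jones", "Garcia", "Chen", "Patel", "Kim"]
STATES = ["CA", "NY", "TX", "WA", "IL"]

def make_customer(customer_id: int) -> dict:
    """Produce one synthetic row shaped like the customers table."""
    first = random.choice(FIRST_NAMES)
    last = random.choice(LAST_NAMES)
    return {
        "customer_id": customer_id,
        "first_name": first,
        "last_name": last,
        "email": f"{first.lower()}.{last.lower()}{customer_id}@example.com",
        "city": "Springfield",
        "state": random.choice(STATES),
        "zip_code": "".join(random.choices(string.digits, k=5)),
        "country": "USA",
    }

# 50 records, matching the per-table count used in the POC
customers = [make_customer(i) for i in range(1, 51)]
```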
### 3. MCP Server
- Standards-compliant Model Context Protocol server
- **Six tools** for data access:
- `query_cloudsql` - Execute SQL against Cloud SQL
- `query_bigquery` - Execute SQL against BigQuery
- `list_cloudsql_tables` - List Cloud SQL tables
- `list_bigquery_tables` - List BigQuery tables
- `get_cloudsql_schema` - Get Cloud SQL table schema
- `get_bigquery_schema` - Get BigQuery table schema
- Security: SELECT-only queries, connection pooling
- Error handling and logging
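The SELECT-only restriction can be implemented with a small guard run before any query reaches the database. The sketch below is an assumption about the approach, not a copy of `mcp_server/server.py`; it is deliberately conservative (it also rejects SELECTs that merely *contain* a write keyword, e.g. inside a string literal).

```python
import re

# Write/DDL keywords that disqualify a statement outright.
WRITE_KEYWORDS = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|revoke)\b",
    re.IGNORECASE,
)

def is_safe_select(sql: str) -> bool:
    """Allow only a single SELECT (or WITH ... SELECT) statement."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:  # embedded semicolon => multiple statements
        return False
    if not re.match(r"^\s*(select|with)\b", stripped, re.IGNORECASE):
        return False
    return not WRITE_KEYWORDS.search(stripped)
```

A guard like this is defense in depth, not a substitute for running against a read-only database role.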
### 4. ADK Agent (Two Versions)
**Version 1: agent.py** (Conceptual)
- Demonstrates agentic architecture
- Tool calling framework
- Placeholder tool execution
**Version 2: integrated_agent.py** (Functional)
- Fully integrated with Cloud SQL and BigQuery
- Direct tool execution
- Interactive CLI interface
- Intelligent data source selection
- Context-aware query routing
### 5. Automation Scripts
- **deploy.sh**: One-command deployment
- **cleanup.sh**: Safe resource teardown
- **generate_seed_data.py**: Synthetic data generation
- **test_agent.py**: Automated testing
### 6. Documentation
- **README.md**: Comprehensive documentation
- **QUICKSTART.md**: 15-minute setup guide
- **PROJECT_SUMMARY.md**: This file
- Inline code comments throughout
## Key Features
### Intelligent Query Routing
The agent automatically determines which data source to use:
```
Customer question → Analyze keywords → Choose data source → Execute query
```
**Cloud SQL**: customers, vendors, recent orders
**BigQuery**: products, inventory, analytics, historical data
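In the actual agent, Claude picks a tool based on the tool descriptions; the heuristic above can nonetheless be sketched as a plain function. The keyword lists here are assumptions chosen to mirror the mapping just described.

```python
# Assumed keyword lists mirroring the routing table above.
CLOUDSQL_KEYWORDS = {"customer", "customers", "vendor", "vendors", "order", "orders"}
BIGQUERY_KEYWORDS = {"product", "products", "inventory", "event", "events", "historical"}

def choose_data_source(question: str) -> str:
    """Pick a data source from keywords in the user's question."""
    words = set(question.lower().replace("?", "").split())
    if words & CLOUDSQL_KEYWORDS:
        return "cloudsql"
    if words & BIGQUERY_KEYWORDS:
        return "bigquery"
    return "cloudsql"  # default to the transactional store
```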
### Architecture Highlights
1. **Agentic Loop**: Multi-turn conversations with tool calling
2. **MCP Compliance**: Standards-based tool interface
3. **Error Handling**: Graceful degradation and helpful errors
4. **Security**: SQL injection prevention, credential management
5. **Observability**: Logging and query visibility
## File Structure
```
gcp_mcp/
├── README.md # Main documentation
├── QUICKSTART.md # Quick start guide
├── PROJECT_SUMMARY.md # This file
├── requirements.txt # Python dependencies
├── .env.example # Environment template
├── .gitignore # Git ignore rules
│
├── terraform/ # Infrastructure as Code
│ ├── main.tf # Cloud SQL resources
│ ├── variables.tf # Input variables
│ ├── outputs.tf # Output values
│ └── terraform.tfvars.example # Variable values template
│
├── data/
│ └── schema.sql # Database schema (3 tables)
│
├── scripts/
│ ├── deploy.sh # Automated deployment (Bash)
│ ├── cleanup.sh # Resource cleanup (Bash)
│ ├── generate_seed_data.py # Synthetic data generator
│ └── test_agent.py # Automated tests
│
├── mcp_server/
│ └── server.py # MCP server implementation
│
└── adk_agent/
├── agent.py # Conceptual agent (framework)
└── integrated_agent.py # Functional agent (complete)
```
## Technologies Used
- **Cloud Platform**: Google Cloud Platform (GCP)
- Cloud SQL (PostgreSQL 15)
- BigQuery (thelook_ecommerce public dataset)
- **Infrastructure**: Terraform 1.0+
- **Language**: Python 3.9+
- **Key Libraries**:
- `anthropic` - Claude API client
- `google-cloud-bigquery` - BigQuery client
- `psycopg2` - PostgreSQL client
- `mcp` - Model Context Protocol
- `faker` - Synthetic data generation
- **AI Model**: Claude 3.5 Sonnet
## Data Sources
### Cloud SQL Database
```sql
-- customers table (50 records)
- customer_id, first_name, last_name, email
- phone, address, city, state, zip_code, country
- created_at, updated_at
-- vendors table (50 records)
- vendor_id, vendor_name, contact_name, email
- phone, address, city, state, zip_code, country
- created_at, updated_at
-- orders table (50 records)
- order_id, customer_id, vendor_id
- order_date, ship_date, total_amount, status
- product_name, quantity, unit_price
- created_at, updated_at
```
### BigQuery Dataset
```
bigquery-public-data.thelook_ecommerce
- products, users, events
- inventory_items, order_items, orders
- distribution_centers
```
## How to Use
### Quick Start
```bash
# 1. Setup
cp .env.example .env
# Edit .env with your ANTHROPIC_API_KEY
# 2. Deploy
./scripts/deploy.sh
# 3. Run
python3 adk_agent/integrated_agent.py
```
### Example Queries
```
📊 You: What are the total sales from Cloud SQL?
📊 You: How many customers do we have?
📊 You: Show me orders with status 'delivered'
📊 You: What's the average order value?
📊 You: List the top 5 vendors by order volume
```
## Agent Decision Making
The agent uses a sophisticated decision-making process:
1. **Question Analysis**: Parse user intent and keywords
2. **Schema Discovery**: Optionally list tables to understand data
3. **Source Selection**: Choose Cloud SQL or BigQuery based on:
- Data type (transactional vs analytical)
- Keywords (customers/vendors → Cloud SQL)
- Query complexity (aggregations, joins)
- Data recency requirements
4. **Query Formulation**: Generate appropriate SQL
5. **Execution**: Run query against selected source
6. **Presentation**: Format results with context and insights
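The six steps above form a multi-turn loop. The sketch below shows that control flow with the Claude call abstracted behind a `call_model` callable, so the shape is visible without network access; in `integrated_agent.py` the call is presumably `client.messages.create(...)` with the six MCP tools attached, and the reply format here is a simplified assumption.

```python
def run_agent(question, call_model, execute_tool, max_turns=5):
    """Agentic loop: ask the model, run any tool it requests, repeat.

    call_model(messages) returns either {"tool": ..., "args": ...}
    (a tool request) or {"answer": ...} (a final answer).
    """
    messages = [{"role": "user", "content": question}]
    for _ in range(max_turns):
        reply = call_model(messages)
        if "answer" in reply:  # step 6: present results
            return reply["answer"]
        # steps 4-5: the model formulated a query; execute it
        result = execute_tool(reply["tool"], reply["args"])
        messages.append({"role": "assistant", "content": str(reply)})
        messages.append({"role": "user", "content": f"Tool result: {result}"})
    return "Reached turn limit without a final answer."
```

Capping the loop with `max_turns` keeps a confused model from requesting tools indefinitely.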
## Security Considerations
### Current (POC)
- Public IP with authorized networks
- Environment variable credentials
- SELECT-only queries
- Basic SQL injection prevention
### Production Recommendations
- Cloud SQL Proxy or private IP
- Secret Manager for credentials
- VPC networking
- IAM authentication
- Query validation and sanitization
- Rate limiting
- Audit logging
- Cloud SQL backups
## Cost Optimization
- db-f1-micro instance (~$7-10/month)
- Backups disabled for POC
- Public dataset (BigQuery) - minimal cost
- Manual cleanup script provided
**Remember to run `./scripts/cleanup.sh` when done!**
## Extensibility
This POC can be extended with:
1. **Additional Data Sources**
- Cloud Spanner
- Firestore
- External APIs
- CSV/JSON files
2. **Advanced Features**
- Data visualization
- Report generation
- Scheduled queries
- Email notifications
- Slack integration
3. **Enhanced Agent Capabilities**
- Multi-step reasoning
- Query optimization suggestions
- Data quality checks
- Anomaly detection
- Predictive analytics
4. **Production Features**
- Caching layer (Redis)
- Query result pagination
- Streaming responses
- Monitoring (Cloud Monitoring)
- Alerting (Cloud Alerting)
- CI/CD pipeline
## Testing
```bash
# Run automated tests
python3 scripts/test_agent.py
# Test individual components
python3 -c "from adk_agent.integrated_agent import IntegratedSalesAgent; print('OK')"
# Test Cloud SQL connection
psql -h $DB_HOST -U $CLOUDSQL_USER -d $CLOUDSQL_DATABASE -c "SELECT COUNT(*) FROM orders"
# Test BigQuery access
bq query --use_legacy_sql=false "SELECT COUNT(*) FROM \`bigquery-public-data.thelook_ecommerce.products\`"
```
## Troubleshooting
Common issues and solutions are documented in:
- [README.md](README.md#troubleshooting)
- [QUICKSTART.md](QUICKSTART.md#troubleshooting)
## Success Criteria
✅ Cloud SQL deployed and accessible
✅ Database schema created with 3 tables
✅ 150 synthetic records generated (50 per table)
✅ BigQuery public dataset accessible
✅ MCP server implements 6 tools
✅ ADK agent intelligently routes queries
✅ Automated deployment script
✅ Comprehensive documentation
✅ Working example queries
## Lessons Learned
1. **MCP Standard**: Provides clean abstraction for tool interfaces
2. **Agentic Patterns**: Multi-turn conversations enable complex tasks
3. **Data Source Selection**: Clear tool descriptions guide agent decisions
4. **Error Handling**: Critical for production reliability
5. **Documentation**: Essential for POC handoff and maintenance
## Next Steps
1. **Immediate**:
- Test with real data
- Add more example queries
- Performance benchmarking
2. **Short Term**:
- Implement caching
- Add data visualization
- Create web interface
3. **Long Term**:
- Production hardening
- Multi-tenant support
- Advanced analytics
- MLOps integration
## Resources
- [Anthropic Claude API](https://docs.anthropic.com/)
- [Model Context Protocol](https://modelcontextprotocol.io/)
- [Google Cloud SQL](https://cloud.google.com/sql/docs)
- [BigQuery Public Datasets](https://cloud.google.com/bigquery/public-data)
- [Terraform GCP Provider](https://registry.terraform.io/providers/hashicorp/google/latest/docs)
## Credits
Built as a proof of concept demonstrating:
- Multi-source data querying
- Intelligent agent decision making
- MCP server implementation
- Infrastructure as Code
- Production-ready patterns
## License
Proof of concept for demonstration purposes.
---
**Questions?** Check the [README.md](README.md) or [QUICKSTART.md](QUICKSTART.md)