README.mdā¢10.3 kB
# TNTM Google Sheets Analytics MCP Server

A clean, practical MCP (Model Context Protocol) server for analyzing Google Sheets data with multi-tab support. Built for Claude Code and other MCP-compatible AI assistants by TNTM.
## š Features
- **Smart Sync** - Sync Google Sheets with configurable row limits to prevent timeouts
- **Multi-tab Support** - Query across multiple sheets with SQL JOINs
- **SQL Queries** - Direct SQL access to synced data
- **Sheet Analysis** - Get suggestions for cross-sheet queries
- **Quick Preview** - Preview sheets without full sync
- **Performance Optimized** - Row limits and result pagination for large datasets
## š Prerequisites
- Python 3.8+
- Claude Code or another MCP-compatible client
- Google Cloud Project with Sheets API enabled
- OAuth2 credentials from Google Cloud Console
## š ļø Setup
### ā” One-Click Setup with Claude Code (Recommended)
1. **Drag this project folder into Claude Code**
2. **Ask Claude Code**: *"Follow the README instructions to install this MCP server into Claude Code"*
3. **Get Google OAuth credentials** (Claude Code will guide you through this):
- Go to [Google Cloud Console](https://console.cloud.google.com/)
- Create a new project or select existing one
- Enable the Google Sheets API
- Create OAuth2 credentials (Desktop Application)
- Download and save as `credentials.json` in the project root
That's it! Claude Code will handle virtual environments, dependencies, and OAuth setup automatically.
### š Automated Installation (Alternative)
For non-Claude Code users or manual setup:
#### Option 1: Shell Script (macOS/Linux)
```bash
# Download and run the automated installer
curl -sSL https://raw.githubusercontent.com/yourusername/google-sheet-analytics-mcp/main/install.sh | bash
# Or clone first, then run
git clone https://github.com/yourusername/google-sheet-analytics-mcp.git
cd google-sheet-analytics-mcp
./install.sh
```
#### Option 2: Python Script (All platforms)
```bash
# Clone the repository
git clone https://github.com/yourusername/google-sheet-analytics-mcp.git
cd google-sheet-analytics-mcp
# Run the Python installer
python3 setup.py
```
#### Option 3: Manual Step-by-step
```bash
# 1. Create virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 2. Install dependencies
pip install -e .
# 3. Install MCP server
mcp install src/mcp_server.py --name google-sheets-analytics --with-editable .
# 4. Setup OAuth (after adding credentials.json)
python src/auth/oauth_setup.py
```
### š Getting Google Credentials
Before first use, you need OAuth2 credentials:
1. Go to [Google Cloud Console](https://console.cloud.google.com/)
2. Create a new project or select existing one
3. Enable the **Google Sheets API**
4. Go to **APIs & Services > Credentials**
5. Click **Create Credentials > OAuth 2.0 Client IDs**
6. Choose **Desktop Application**
7. Download the JSON file
8. Save it as `credentials.json` in the project root
### š First Run - OAuth Setup
After adding your `credentials.json` file, run the OAuth setup:
```bash
python src/auth/oauth_setup.py
```
This will:
1. Open your browser for Google authentication
2. Create a `token.json` file with your access credentials
3. Verify the connection works
**You only need to do this once!** After setup, all MCP tools will work automatically.
## š§ Tools
### `smart_sync`
Sync Google Sheet data with intelligent chunking for large datasets.
```
Use smart_sync with url "https://docs.google.com/spreadsheets/d/your_sheet_id" and max_rows 100000
```
- `url` (required): Google Sheets URL
- `max_rows` (optional): Max rows per sheet (default: 100000, supports up to 1M+)
- `sheets` (optional): Array of specific sheet names to sync
**Auto-scaling behavior:**
- Sheets <10K rows: Single fetch
- Sheets 10K-100K rows: 10K row chunks
- Sheets >100K rows: 50K row chunks with sampling
### `query_sheets`
Run SQL queries on synced data, including JOINs across tabs.
```
Use query_sheets with query "SELECT * FROM sheet1 JOIN sheet2 ON sheet1.id = sheet2.id LIMIT 10"
```
- `query` (required): SQL query to execute
### `list_synced_sheets`
View all synced sheets and their table names.
```
Use list_synced_sheets
```
### `analyze_sheets`
Get suggestions for queries across multiple sheets.
```
Use analyze_sheets with question "How can I combine sales data with customer data?"
```
- `question` (required): What you want to analyze
### `get_sheet_preview`
Quick preview without syncing.
```
Use get_sheet_preview with url "https://docs.google.com/spreadsheets/d/your_sheet_id" and rows 20
```
- `url` (required): Google Sheets URL
- `sheet_name` (optional): Specific sheet to preview
- `rows` (optional): Number of rows to preview (default: 10)
## š How It Works
1. **Authentication** - Uses OAuth2 to securely access Google Sheets API
2. **Sync** - Downloads sheet data to local SQLite database with configurable limits
3. **Query** - Enables SQL queries across all synced sheets
4. **Multi-tab** - Each sheet becomes a separate table, joinable via SQL
## šļø Project Structure
```
google-sheet-analytics-mcp/
āāā src/
ā āāā mcp_server.py # Main MCP server implementation
ā āāā auth/
ā āāā oauth_setup.py # OAuth authentication module
āāā pyproject.toml # Modern Python package configuration
āāā credentials.json.example # Example OAuth credentials format
āāā README.md # This file
āāā LICENSE # MIT License
āāā CLAUDE.md # Claude-specific instructions
āāā data/ # Runtime data (created automatically)
āāā token.json # OAuth token (created during setup)
āāā sheets_data.sqlite # Local database (created on first sync)
```
## ā” Performance
### Scale & Capacity
- **1 Million Row Support**: Handles sheets with up to 1M rows efficiently
- **Chunked Processing**: Automatically chunks large sheets (>10K rows) for optimal performance
- **Bulk Operations**: 50-100x faster inserts using batch processing
- **Configurable Limits**: Default 1000 rows, expandable to 1M+ rows per sheet
### Optimizations
- **Smart Caching**: Skip unchanged sheets, 5-minute cache TTL
- **Streaming Queries**: Results streamed in batches to prevent memory overflow
- **Progressive Hashing**: Samples large datasets for efficient change detection
- **Dynamic Indexing**: Auto-creates indexes on large tables for faster queries
- **Memory Management**: Automatic cleanup after processing large datasets
### Performance Metrics
- **Sync Speed**: 50,000-100,000 rows/second (vs 1,000 rows/second previously)
- **Query Response**: <1 second for most queries on 1M rows
- **Memory Usage**: Constant ~200-500MB regardless of dataset size
- **1M Row Sync Time**: ~10-20 seconds
## š Example Use Cases
### Multi-tab Analysis
```sql
-- Combine sales data with customer information
SELECT
s.product_name,
s.sales_amount,
c.customer_name,
c.customer_segment
FROM sales_data s
JOIN customer_data c ON s.customer_id = c.id
WHERE s.sales_amount > 1000
```
### Cross-sheet Aggregation
```sql
-- Total revenue by region from multiple sheets
SELECT
region,
SUM(amount) as total_revenue
FROM (
SELECT region, amount FROM q1_sales
UNION ALL
SELECT region, amount FROM q2_sales
)
GROUP BY region
ORDER BY total_revenue DESC
```
## š Security
- OAuth2 authentication with Google
- Credentials stored locally (never committed to repo)
- Read-only access to Google Sheets
- Local SQLite database (no external data transmission)
## š Troubleshooting
### Installation Issues
| Issue | Solution |
|-------|----------|
| "Failed to reconnect to google-sheets-analytics" | Run automated setup: `python3 setup.py` or `./install.sh` |
| "ModuleNotFoundError: No module named 'google'" | Dependencies not installed - use automated installer or manual venv setup |
| "externally-managed-environment" | Use virtual environment (automated installers handle this) |
| "MCP server not appearing" | Check Claude Code config and restart app |
### Common Runtime Issues
| Issue | Solution |
|-------|----------|
| "No credentials found" | Ensure `credentials.json` exists in project root or `config/` directory |
| "Authentication failed" | Check token status with `venv/bin/python src/auth/oauth_setup.py --status` |
| "Token expired" | Run `venv/bin/python src/auth/oauth_setup.py --test` (auto-refreshes) |
| "Sync timeout" | Reduce `max_rows` parameter in smart_sync |
| "Tools not appearing" | Restart Claude Desktop after configuration |
| "Rate limit errors" | Wait a few minutes and try again with smaller batches |
### OAuth Troubleshooting
- **Check status**: `venv/bin/python src/auth/oauth_setup.py --status`
- **Test auth**: `venv/bin/python src/auth/oauth_setup.py --test`
- **Reset OAuth**: `venv/bin/python src/auth/oauth_setup.py --reset`
- **Manual setup**: `venv/bin/python src/auth/oauth_setup.py --manual`
### MCP Server Not Appearing
1. Verify config: `cat ~/.config/claude-code/config.json`
2. Check the config includes the google-sheets-analytics server
3. Ensure the virtual environment and dependencies are properly installed
4. Check that the Python path in the config is correct
### Database Issues
- Database location: `data/sheets_data.sqlite`
- Reset database: Delete the file and re-sync
- Check synced sheets: Use the `list_synced_sheets` tool
## š¤ Contributing
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## š License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## š Acknowledgments
- Built for the [Model Context Protocol](https://modelcontextprotocol.io/)
- Designed for [Claude Code](https://claude.ai/code)
- Uses [Google Sheets API](https://developers.google.com/sheets/api)
---
**Need help?** Open an issue on GitHub or check the troubleshooting section above.