This MCP server provides comprehensive access to Google's Gemini AI models for chat, file processing, image generation, batch operations, and embeddings generation.
Core Chat & Conversation: Engage in multi-turn chat sessions with Gemini models (2.5-pro, 2.5-flash, 2.0-flash-exp) with text messages and optional file attachments. Manage conversation sessions (start, continue, clear) and configure generation parameters (temperature 0-2, max tokens up to 500K).
File Management: Upload single or multiple files (2-40+) with automatic MIME type detection, parallel processing, and retry logic. List, retrieve metadata, delete individual files, or bulk cleanup all files. Files automatically expire after 48 hours with 20GB project storage limit.
Image Generation: Create new images or edit existing ones using the Gemini 2.5 Flash Image model with text prompts, supporting various aspect ratios and batch generation.
Batch Processing (50% Cost Savings): Process large-scale async tasks with ~24-hour turnaround. Complete automated workflows handle ingestion, upload, job creation, polling, and results download. Supports CSV, JSON, TXT, MD, and JSONL formats with intelligent content conversion.
Embeddings Generation: Create 1536-dimensional embeddings using gemini-embedding-001 model with 8 task types (SEMANTIC_SIMILARITY, CLASSIFICATION, CLUSTERING, RETRIEVAL_DOCUMENT, RETRIEVAL_QUERY, CODE_RETRIEVAL_QUERY, QUESTION_ANSWERING, FACT_VERIFICATION). Includes AI-powered task type selector for optimal recommendations.
Job Management: Monitor batch job status with optional auto-polling (PENDING → RUNNING → SUCCEEDED/FAILED), cancel running jobs, download and parse results, and delete completed jobs.
System Resources: Access metadata on available Gemini models (gemini://models/available) and active conversation sessions (gemini://conversations/active).
Gemini MCP Server
An MCP server that provides access to Google's Gemini model suite.
✨ Features
Support for Gemini models from 1.5 through 2.5 Pro
Nano Banana (Gemini 2.5 Flash Image) image generation
Embeddings
File Upload
Batch (NLP and Embeddings)
🚀 Quick Start
Option 1: NPX (No Install Required)
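Assuming the package is published as `gemini-mcp-server` (substitute the actual published name) and your Claude Code version supports the `--env` flag, registration might look like:

```shell
# Register the server with Claude Code, running it via npx on demand
claude mcp add gemini --env GEMINI_API_KEY="your-key-here" -- npx -y gemini-mcp-server
```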
Option 2: Global Install
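A sketch of the global route, again assuming `gemini-mcp-server` as a placeholder for the actual package name:

```shell
# Install once globally, then point Claude Code at the installed binary
npm install -g gemini-mcp-server
claude mcp add gemini --env GEMINI_API_KEY="your-key-here" -- gemini-mcp-server
```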
Option 3: Local Project Install
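For a project-local install, the registration would point at the binary inside `node_modules` (package name is a placeholder):

```shell
# Install into the current project and register the local binary
npm install gemini-mcp-server
claude mcp add gemini --env GEMINI_API_KEY="your-key-here" -- ./node_modules/.bin/gemini-mcp-server
```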
After any installation method, restart Claude Code and you're ready to use Gemini.
Shell Environment
File:
`~/.zshrc` or `~/.bashrc`
Format:
export GEMINI_API_KEY="your-key-here"
Usage
MCP Tools
The server provides the following tools:
chat
Send a message to Gemini with optional file attachments.
Parameters:
- `message` (required): The message to send
- `model` (optional): Model to use (gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite)
- `files` (optional): Array of files with base64-encoded data
- `temperature` (optional): Controls randomness (0.0-2.0)
- `maxTokens` (optional): Maximum response tokens
- `conversationId` (optional): Continue an existing conversation
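For illustration, a `chat` call's arguments might look like this (all values are examples only):

```json
{
  "message": "Summarize the attached report in three bullet points.",
  "model": "gemini-2.5-flash",
  "temperature": 0.7,
  "maxTokens": 2048,
  "conversationId": "report-review"
}
```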
start_conversation
Start a new conversation session.
Parameters:
- `id` (optional): Custom conversation ID
clear_conversation
Clear a conversation session.
Parameters:
- `id` (required): Conversation ID to clear
generate_images
Generate images from text prompts or edit existing images using Gemini 2.5 Flash Image model.
Parameters:
- `prompt` (required): Text description of the image to generate, or editing instructions
- `aspectRatio` (optional): Image aspect ratio: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` (default: `1:1`)
- `numImages` (optional): Number of images to generate, 1-4 (default: 1). Note: makes sequential API calls, ~10-15 s per image.
- `inputImageUri` (optional): File URI from an uploaded file, for image editing (omit for text-to-image generation)
- `outputDir` (optional): Directory to save generated images (default: `./generated-images`)
- `temperature` (optional): Controls randomness (0.0-2.0, default: 1.0)
Returns:
Array of generated images with file paths and base64 data
Token usage (~1,290-1,300 tokens per image)
All images include SynthID watermark
Performance Note: The Gemini API generates one image per request. When numImages > 1, the tool makes multiple sequential API calls to generate the requested number of images. Expect ~10-15 seconds per image.
Text-to-Image Example:
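A plausible set of `generate_images` arguments for text-to-image generation (illustrative values):

```json
{
  "prompt": "A watercolor painting of a lighthouse at sunset",
  "aspectRatio": "16:9",
  "numImages": 2,
  "outputDir": "./generated-images"
}
```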
Image Editing Example:
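And for editing, the same tool with an input image; the `inputImageUri` value below is a placeholder for a URI returned by the file upload tools:

```json
{
  "prompt": "Replace the sky with a starry night",
  "inputImageUri": "files/your-uploaded-file",
  "aspectRatio": "1:1"
}
```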
🆕 Batch API Tools (v0.3.0)
Process large-scale tasks asynchronously at 50% of the standard cost, with roughly 24-hour turnaround.
Content Generation
Simple (Automated):
Advanced (Manual Control):
Embeddings
Simple (Automated):
Advanced (Manual Control):
Task Types (8 options):
- `SEMANTIC_SIMILARITY` - Compare text similarity
- `CLASSIFICATION` - Categorize content
- `CLUSTERING` - Group similar items
- `RETRIEVAL_DOCUMENT` - Build search indexes
- `RETRIEVAL_QUERY` - Search queries
- `CODE_RETRIEVAL_QUERY` - Code search
- `QUESTION_ANSWERING` - Q&A systems
- `FACT_VERIFICATION` - Fact-checking
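At the underlying Gemini API level, a task-typed embedding request body looks roughly like this (a sketch; field names follow the public `embedContent` endpoint, which this server wraps):

```json
{
  "model": "models/gemini-embedding-001",
  "content": { "parts": [{ "text": "How do I reset my password?" }] },
  "taskType": "RETRIEVAL_QUERY",
  "outputDimensionality": 1536
}
```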
Job Management
Supported Input Formats:
CSV (converts rows to requests)
JSON (wraps objects as requests)
TXT (splits lines as requests)
MD (markdown sections as requests)
JSONL (ready to use)
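The CSV conversion step can be sketched as follows. This is a minimal illustration, not the server's actual implementation: it assumes a header row and no quoted fields, and the `key`/`request` envelope follows the Gemini Batch API's JSONL input format.

```typescript
// Sketch: convert simple CSV rows into Gemini batch JSONL requests.
// Assumes comma-separated values with a header row and no quoting;
// the real converter is more robust.
function csvToBatchJsonl(csv: string): string {
  const [header, ...rows] = csv.trim().split("\n");
  const columns = header.split(",");
  return rows
    .map((row, index) => {
      const values = row.split(",");
      // Fold each row's cells into a single prompt string, one "column: value" per line.
      const prompt = columns.map((col, i) => `${col}: ${values[i]}`).join("\n");
      return JSON.stringify({
        key: `request-${index}`,
        request: { contents: [{ role: "user", parts: [{ text: prompt }] }] },
      });
    })
    .join("\n");
}
```

Each output line is one self-contained request, which is what lets the batch endpoint process rows independently.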
MCP Resources
gemini://models/available
Information about available Gemini models and their capabilities.
gemini://conversations/active
List of active conversation sessions with metadata.
🔧 Troubleshooting
Connection Failures
If Claude Code fails to connect:
Verify your API key is correct
Check that the command path is correct (for local installs)
Restart Claude Code after configuration changes
🔒 Security
API keys are never logged or echoed
Files created with 600 permissions (user read/write only)
Masked input during key entry
Real API validation before storage
🤝 Contributing
Contributions are welcome! This package is designed to be production-ready with:
Full TypeScript types
Comprehensive error handling
Automatic retry logic
Real API validation
📄 License
MIT - see LICENSE file
🙋 Support
MCP Protocol: https://modelcontextprotocol.io
Gemini API Docs: https://ai.google.dev/docs