Gemini MCP Server for Claude Desktop
Provides image generation capabilities using Google's Gemini AI models with customizable parameters like style and temperature
Gemini MCP Server with Smart Tool Intelligence
Welcome to the Gemini MCP Server, the first MCP server with Smart Tool Intelligence - a revolutionary self-learning system that adapts to your preferences and improves over time. This comprehensive platform provides 7 AI-powered tools with automatic prompt enhancement and context awareness.
Features Overview
7 AI-Powered Tools
Image Generation - Create images from text prompts using Gemini 2.0 Flash
Image Editing - Edit existing images with natural language instructions
Chat - Interactive conversations with context-aware responses
Audio Transcription - Convert audio to text with optional verbatim mode
Code Execution - Run Python code in a secure sandbox environment
Video Analysis - Analyze video content for summaries, transcripts, and insights
Image Analysis - Extract objects, text, and detailed descriptions from images
Smart Tool Intelligence System (First in MCP Ecosystem)
Self-Learning - Automatically learns from successful interactions
Context Detection - Recognizes consciousness research, coding, and debugging contexts
Pattern Recognition - Identifies usage patterns and user preferences
Prompt Enhancement - Refines prompts for better AI model performance
Persistent Memory - Stores learned preferences across sessions
Automatic Migration - Seamlessly upgrades preference storage
Quick Start
Installation
git clone https://github.com/Garblesnarff/gemini-mcp-server.git
cd gemini-mcp-server
npm install
Configuration
Get your Gemini API key from Google AI Studio
Copy the environment template:
cp .env.example .env
Edit .env and add your API key:
GEMINI_API_KEY=your_actual_api_key_here
OUTPUT_DIR=/path/to/your/output/directory # Optional
DEBUG=false # Optional
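As a rough illustration of how these three variables might be consumed, the sketch below reads them from an environment object and applies defaults. The function name loadConfig and the returned field names are hypothetical and are not taken from the repository's config.js; the fallback output directory mirrors the default listed under Logs Location, and the error text mirrors the "Missing GEMINI_API_KEY" entry under Troubleshooting.

```javascript
// Hypothetical helper; the repository's own config.js may be organised differently.
function loadConfig(env, homeDir) {
  const apiKey = (env.GEMINI_API_KEY || "").trim();
  if (!apiKey) {
    // The only required setting: fail early when it is absent.
    throw new Error("Missing GEMINI_API_KEY");
  }
  return {
    apiKey,
    // OUTPUT_DIR is optional; fall back to the documented default location.
    outputDir: env.OUTPUT_DIR || `${homeDir}/Claude/gemini-images`,
    // DEBUG is optional; in this sketch only the literal string "true" enables it.
    debug: env.DEBUG === "true",
  };
}

const config = loadConfig(
  { GEMINI_API_KEY: "your_actual_api_key_here", DEBUG: "false" },
  "/home/user"
);
console.log(config);
```

In a running Node.js process the same function could be called with process.env and os.homedir() instead of the literal values used here.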
Running the Server
npm start
# or for development with debug logging:
npm run dev
Integration with Claude Desktop
Add to your Claude Desktop config (claude_desktop_config.json):
{
  "mcpServers": {
    "gemini": {
      "command": "node",
      "args": ["/path/to/gemini-mcp-server/gemini-server.js"],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here"
      }
    }
  }
}
Tool Reference
1. Image Generation (generate_image)
Generate images from text descriptions using Gemini 2.0 Flash.
Parameters:
prompt (string, required) - Description of the image to generate
context (string, optional) - Context for Smart Tool Intelligence enhancement
Example:
{
  "prompt": "A serene mountain landscape at sunset with vibrant colors",
  "context": "artistic"
}
Returns:
{
  "content": [{
    "type": "text",
    "text": "Generated a beautiful mountain landscape image."
  }, {
    "type": "image",
    "data": "base64_image_data",
    "mimeType": "image/png"
  }]
}
2. Image Editing (gemini-edit-image)
Edit existing images using natural language instructions.
Parameters:
image_path (string, required) - Path to the image file to edit
edit_instruction (string, required) - Description of desired changes
context (string, optional) - Context for enhancement
Example:
{
  "image_path": "/path/to/image.jpg",
  "edit_instruction": "Add shooting stars to the night sky",
  "context": "artistic"
}
3. Chat (gemini-chat)
Interactive conversations with Gemini AI that learns your preferences.
Parameters:
message (string, required) - Your message or question
context (string, optional) - Context for Smart Tool Intelligence
Example:
{
  "message": "Explain quantum computing in simple terms",
  "context": "consciousness"
}
Setting context to "consciousness" applies the academic rigor enhancement.
4. Audio Transcription (gemini-transcribe-audio)
Convert audio files to text with Smart Tool Intelligence enhancement.
Parameters:
file_path (string, required) - Path to audio file (MP3, WAV, FLAC, AAC, OGG, WEBM, M4A)
language (string, optional) - Language hint for better accuracy
context (string, optional) - Use "verbatim" for exact word-for-word transcription
preserve_spelled_acronyms (boolean, optional) - Keep U-R-L instead of URL
Example (Standard):
{
  "file_path": "/path/to/audio.mp3",
  "language": "en"
}
Example (Verbatim Mode):
{
  "file_path": "/path/to/audio.mp3",
  "context": "verbatim",
  "preserve_spelled_acronyms": true
}
Verbatim Mode Features:
Captures all "um", "uh", "like", repeated words
Preserves emotional expressions: [laughs], [sighs], [clears throat]
Maintains original punctuation and sentence structure
No summarization or cleanup
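The two transcription modes differ only in the arguments passed to the tool, so a small helper can assemble them. The sketch below is illustrative: buildTranscriptionArgs and its option names (verbatim, preserveSpelledAcronyms) are hypothetical, while the argument keys and the extension list come from the parameter description above.

```javascript
// Extensions taken from the supported audio formats listed for file_path.
const AUDIO_EXTENSIONS = ["mp3", "wav", "flac", "aac", "ogg", "webm", "m4a"];

// Hypothetical helper that assembles arguments for gemini-transcribe-audio.
function buildTranscriptionArgs(filePath, options = {}) {
  const extension = filePath.split(".").pop().toLowerCase();
  if (!AUDIO_EXTENSIONS.includes(extension)) {
    throw new Error(`Unsupported audio format: ${extension}`);
  }
  const args = { file_path: filePath };
  if (options.language) args.language = options.language;
  // Verbatim mode is selected through the context parameter.
  if (options.verbatim) args.context = "verbatim";
  if (options.preserveSpelledAcronyms) args.preserve_spelled_acronyms = true;
  return args;
}

const verbatimArgs = buildTranscriptionArgs("/path/to/audio.mp3", {
  verbatim: true,
  preserveSpelledAcronyms: true,
});
console.log(JSON.stringify(verbatimArgs));
```

Rejecting unsupported extensions before the call is a design choice of this sketch; the server may perform its own validation.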
5. Code Execution (gemini-code-execute)
Execute Python code in a secure sandbox environment.
Parameters:
code (string, required) - Python code to execute
context (string, optional) - Context for enhancement
Example:
{
  "code": "import pandas as pd\ndata = {'x': [1,2,3], 'y': [4,5,6]}\ndf = pd.DataFrame(data)\nprint(df.describe())",
  "context": "code"
}
6. Video Analysis (gemini-analyze-video)
Analyze video content for summaries, transcripts, and detailed insights.
Parameters:
file_path (string, required) - Path to video file (MP4, MOV, AVI, WEBM, MKV, FLV)
analysis_type (string, optional) - "summary", "transcript", "objects", "detailed", "custom"
context (string, optional) - Context for enhancement
Example:
{
  "file_path": "/path/to/video.mp4",
  "analysis_type": "detailed"
}
7. Image Analysis (gemini-analyze-image)
Extract detailed information from images including objects, text, and descriptions.
Parameters:
file_path (string, required) - Path to image file (JPEG, PNG, WebP, HEIC, HEIF, BMP, GIF)
analysis_type (string, optional) - "summary", "objects", "text", "detailed", "custom"
context (string, optional) - Context for enhancement
Example:
{
  "file_path": "/path/to/image.jpg",
  "analysis_type": "objects"
}
Smart Tool Intelligence System
How It Works
The Smart Tool Intelligence system is the first of its kind in the MCP ecosystem. It automatically:
Detects Context - Recognizes if you're doing consciousness research, coding, debugging, etc.
Enhances Prompts - Adds relevant instructions based on learned patterns
Learns Patterns - Stores successful interaction patterns for future use
Adapts Over Time - Gets better at helping you with each interaction
Context Types
The system recognizes these contexts and applies appropriate enhancements:
consciousness - Adds academic rigor, citations, detailed explanations
code - Includes practical examples, working code, best practices
debugging - Focuses on root cause analysis and specific fixes
general - Applies comprehensive, structured responses
verbatim - For audio transcription, provides exact word-for-word output
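One simple way to recognize these contexts is keyword matching. The sketch below is a minimal illustration under that assumption: the regular expressions are invented for the example and are not the repository's actual patterns. Contexts are checked in the order consciousness, code, debugging, with general as the fallback; verbatim is omitted because it is requested explicitly through the context parameter.

```javascript
// Illustrative keyword patterns; the repository's real patterns may differ.
const CONTEXT_PATTERNS = [
  ["consciousness", /\b(consciousness|qualia|awareness|sentience)\b/i],
  ["code", /\b(function|class|code|script|refactor|implement)\b/i],
  ["debugging", /\b(debug|stack trace|exception|error|bug)\b/i],
];

// Returns the first context whose pattern matches, or "general".
function detectContext(prompt) {
  for (const [context, pattern] of CONTEXT_PATTERNS) {
    if (pattern.test(prompt)) return context;
  }
  return "general";
}

console.log(detectContext("What is consciousness?"));
console.log(detectContext("Refactor this function"));
console.log(detectContext("Find the bug causing this stack trace"));
console.log(detectContext("Tell me a joke"));
```

Because the first match wins, a prompt that mentions both code and an error is classified by whichever pattern comes earlier in the list, so the ordering is itself a design decision.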
Storage Location
Preferences are stored internally at ./data/tool-preferences.json with automatic migration from external storage.
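The layout of tool-preferences.json is not described here, so the following is only a sketch of one plausible shape: a versioned object holding a list of stored patterns per context. The field names (version, patterns, toolName, original, enhanced) and the recordPattern helper are assumptions made for illustration.

```javascript
// Hypothetical schema: the real layout of tool-preferences.json is not documented here.
// Returns a new preferences object with the entry appended under its context.
function recordPattern(preferences, entry) {
  const { context, toolName, original, enhanced } = entry;
  const existing = preferences.patterns[context] || [];
  return {
    ...preferences,
    patterns: {
      ...preferences.patterns,
      [context]: [...existing, { toolName, original, enhanced }],
    },
  };
}

const empty = { version: 1, patterns: {} };
const updated = recordPattern(empty, {
  context: "code",
  toolName: "gemini-chat",
  original: "Explain closures",
  enhanced: "Explain closures\n\nInclude practical examples and working code.",
});

// This string is what a store like this would write to ./data/tool-preferences.json.
const serialized = JSON.stringify(updated, null, 2);
console.log(serialized);
```

Keeping the update as a pure function that returns a new object makes it easy to test separately from whatever code reads and writes the file.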
Implementing Smart Tool Intelligence in Your MCP Server
Want to add this revolutionary capability to your own MCP server? Here's how:
1. Core Architecture
// src/intelligence/context-detector.js
class ContextDetector {
  detectContext(prompt, toolName) {
    // Implement pattern matching for different contexts
    if (this.isConsciousnessContext(prompt)) return 'consciousness';
    if (this.isCodeContext(prompt)) return 'code';
    if (this.isDebuggingContext(prompt)) return 'debugging';
    return 'general';
  }
}
// src/intelligence/prompt-enhancer.js
class PromptEnhancer {
  enhancePrompt(originalPrompt, context, toolName) {
    // Apply context-specific enhancements
    const enhancement = this.getEnhancementForContext(context);
    return `${originalPrompt}\n\n${enhancement}`;
  }
}
// src/intelligence/preference-store.js
class PreferencesManager {
  async storePattern(original, enhanced, context, toolName, success) {
    // Store successful patterns for future learning
  }
  async getPatterns(context) {
    // Retrieve learned patterns for context
  }
}
2. Integration Pattern
// In your tool's execute method:
async execute(args) {
  const intelligence = IntelligenceSystem.getInstance();
  // Detect context and enhance prompt
  const context = args.context || intelligence.contextDetector.detectContext(args.prompt, this.name);
  const enhancedPrompt = await intelligence.enhancePrompt(args.prompt, context, this.name);
  // Execute with enhanced prompt
  const result = await this.geminiService.generateContent(enhancedPrompt);
  // Store successful pattern
  await intelligence.storeSuccessfulPattern(args.prompt, enhancedPrompt, context, this.name);
  return result;
}
3. Key Implementation Files
Study these files from this repository:
src/intelligence/index.js - Main intelligence coordinator
src/intelligence/context-detector.js - Context recognition logic
src/intelligence/prompt-enhancer.js - Enhancement application
src/intelligence/preference-store.js - Pattern storage and retrieval
src/tools/base-tool.js - Integration with tool execution
Testing
Run Test Suite
# Test basic functionality
npm test
# Test Smart Tool Intelligence
node test-tool-intelligence-full.js
# Test internal storage
node test-internal-storage.js
# Test verbatim transcription
node test-verbatim-mode.js
Manual Testing Examples
# Test image generation
echo '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"generate_image","arguments":{"prompt":"A cute robot reading a book"}}}' | node gemini-server.js
# Test chat with consciousness context
echo '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"gemini-chat","arguments":{"message":"What is consciousness?","context":"consciousness"}}}' | node gemini-server.js
Performance & Limits
File Size Limits
Images: 20MB (JPEG, PNG, WebP, HEIC, HEIF, BMP, GIF)
Audio: 20MB (MP3, WAV, FLAC, AAC, OGG, WEBM, M4A)
Video: 100MB (MP4, MOV, AVI, WEBM, MKV, FLV)
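A client can check these limits before sending a file to the server. The sketch below is a minimal example of such a pre-check; checkFileSize is a hypothetical helper, and it assumes 1 MB means 1024 * 1024 bytes, which the limits above do not specify.

```javascript
// Assumption: 1 MB = 1024 * 1024 bytes; the documented limits do not say which unit is meant.
const MB = 1024 * 1024;
const SIZE_LIMITS = { image: 20 * MB, audio: 20 * MB, video: 100 * MB };

// Hypothetical pre-check: reports whether a file of the given kind fits its limit.
function checkFileSize(kind, sizeInBytes) {
  const limit = SIZE_LIMITS[kind];
  if (limit === undefined) {
    throw new Error(`Unknown file kind: ${kind}`);
  }
  return { ok: sizeInBytes <= limit, limit };
}

console.log(checkFileSize("audio", 5 * MB));
console.log(checkFileSize("video", 150 * MB));
```

The size would typically come from fs.statSync(path).size in Node.js; it is passed in as a number here so the check stays independent of the filesystem.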
API Rate Limits
Follows Google Gemini API rate limits
Built-in error handling and retry logic
Graceful degradation on quota exceeded
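The retry behaviour is not specified in detail, so the following is a generic sketch of retrying with exponential backoff rather than the server's actual logic. The helper name withRetry and its options (retries, baseDelayMs, sleep) are assumptions; this version retries on every error, whereas a real implementation would likely retry only transient failures.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Generic retry helper; the option names are illustrative, not the server's API.
async function withRetry(operation, options = {}) {
  const { retries = 3, baseDelayMs = 500, sleep = defaultSleep } = options;
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      // Double the wait after each failure: baseDelayMs, 2x, 4x, ...
      if (attempt < retries) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  // All attempts failed: surface the last error to the caller.
  throw lastError;
}

// Demo: an operation that fails twice, then succeeds on the third attempt.
let calls = 0;
const flaky = async () => {
  calls += 1;
  if (calls < 3) throw new Error("quota exceeded");
  return "ok";
};

const demo = withRetry(flaky, { baseDelayMs: 10 });
demo.then((value) => console.log(value, "after", calls, "calls"));
```

Injecting sleep as an option keeps the helper testable without real delays.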
Architecture Deep Dive
Modular Design
src/
├── server.js                   # MCP protocol handler
├── config.js                   # Configuration management
├── tools/                      # Tool implementations
│   ├── index.js                # Tool registry & dispatcher
│   ├── base-tool.js            # Abstract base class
│   ├── chat.js                 # Chat tool
│   ├── image-generation.js     # Image generation tool
│   ├── image-editing.js        # Image editing tool
│   ├── audio-transcription.js  # Audio transcription tool
│   ├── code-execution.js       # Code execution tool
│   ├── video-analysis.js       # Video analysis tool
│   └── image-analysis.js       # Image analysis tool
├── intelligence/               # Smart Tool Intelligence
│   ├── index.js                # Intelligence coordinator
│   ├── context-detector.js     # Context recognition
│   ├── prompt-enhancer.js      # Prompt enhancement
│   └── preference-store.js     # Pattern storage
├── gemini/                     # Gemini API integration
│   ├── gemini-service.js       # API service layer
│   └── request-handler.js      # Request formatting
└── utils/                      # Utilities
    ├── logger.js               # Logging system
    └── file-utils.js           # File operations
Intelligence System Flow
Request Received → Tool's execute method called
Context Detection → Analyze prompt for context clues
Pattern Retrieval → Get relevant learned patterns
Prompt Enhancement → Apply context-specific improvements
API Execution → Send enhanced prompt to Gemini
Pattern Storage → Store successful interaction pattern
Response Return → Return enhanced result to user
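The seven steps can be traced in a single function. The sketch below is an illustration, not the repository's code: handleToolRequest and the injected deps object are hypothetical, and the Gemini call is replaced by a stub so the example runs offline. It also shows where the pattern retrieval step would slot in, which the integration pattern shown earlier does not include.

```javascript
// Each collaborator is injected so the sketch runs without the Gemini API.
async function handleToolRequest(args, deps) {
  // Steps 1-2: the request arrives and a context is resolved.
  const context = args.context || deps.detectContext(args.prompt);
  // Step 3: fetch patterns previously stored for this context.
  const patterns = await deps.getPatterns(context);
  // Step 4: build the enhanced prompt.
  const enhancedPrompt = deps.enhancePrompt(args.prompt, context, patterns);
  // Step 5: call the model (stubbed below).
  const result = await deps.generate(enhancedPrompt);
  // Step 6: record the pattern only after the call succeeded.
  await deps.storePattern({ original: args.prompt, enhanced: enhancedPrompt, context });
  // Step 7: hand the result back to the caller.
  return result;
}

// Stand-in implementations, invented for this example.
const stored = [];
const deps = {
  detectContext: (prompt) => (/\bbug\b/i.test(prompt) ? "debugging" : "general"),
  getPatterns: async (context) => stored.filter((p) => p.context === context),
  enhancePrompt: (prompt, context) => `${prompt}\n\n[context: ${context}]`,
  generate: async (prompt) => ({ content: [{ type: "text", text: `echo: ${prompt}` }] }),
  storePattern: async (pattern) => { stored.push(pattern); },
};

const pending = handleToolRequest({ prompt: "Find the bug in my loop" }, deps);
pending.then((result) => console.log(result.content[0].text));
```

Because storage happens after the model call, a failed request leaves no stored pattern, which matches the description of storing only successful interactions.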
Customization
Adding New Contexts
// In src/intelligence/context-detector.js
isMyCustomContext(prompt) {
  const patterns = [
    /custom pattern 1/i,
    /custom pattern 2/i
  ];
  return patterns.some(pattern => pattern.test(prompt));
}
// In src/intelligence/prompt-enhancer.js
getEnhancementForContext(context) {
  const enhancements = {
    'my_custom_context': 'Apply my custom enhancement instructions here.',
    // ... other contexts
  };
  return enhancements[context] || enhancements.general;
}
Adding New Tools
Create tool file in src/tools/my-new-tool.js
Extend BaseTool class
Implement execute method with intelligence integration
Register in src/tools/index.js
// src/tools/my-new-tool.js
class MyNewTool extends BaseTool {
  constructor(geminiService, intelligenceSystem) {
    super('my-new-tool', 'Description of my tool', geminiService, intelligenceSystem);
  }
  async execute(args) {
    // Use intelligence system for enhancement
    const context = args.context || this.detectContext(args.input);
    const enhancedPrompt = await this.enhancePrompt(args.input, context);
    // Your tool logic here
    const result = await this.geminiService.someMethod(enhancedPrompt);
    // Store successful pattern
    await this.storeSuccessfulPattern(args.input, enhancedPrompt, context);
    return result;
  }
}
Troubleshooting
Common Issues
"Missing GEMINI_API_KEY" Error
# Ensure .env file exists and contains your API key
cp .env.example .env
# Edit .env and add: GEMINI_API_KEY=your_key_here
"File not found" Errors
# Ensure file paths are absolute and files exist
# Check file permissions and formats
Intelligence System Not Learning
# Check data directory permissions
ls -la data/
# Verify tool-preferences.json is writable
Debug Mode
DEBUG=true npm start
# or
npm run dev
Logs Location
Application logs: Console output
Intelligence patterns: ./data/tool-preferences.json
Generated images: $OUTPUT_DIR (default: ~/Claude/gemini-images)
Contributing
We welcome contributions! This project represents a new paradigm in MCP server development.
Development Setup
git clone https://github.com/Garblesnarff/gemini-mcp-server.git
cd gemini-mcp-server
npm install
npm run dev
Areas for Contribution
New Contexts - Add support for specialized domains
Enhanced Patterns - Improve learning algorithms
New Tools - Expand Gemini AI capabilities
Performance - Optimize intelligence system performance
Documentation - Improve guides and examples
Roadmap
Multi-language Support - Context detection in multiple languages
Advanced Analytics - Usage patterns and performance metrics
Tool Chaining - Intelligent coordination between multiple tools
Custom Models - Support for fine-tuned Gemini models
Collaborative Learning - Share anonymized patterns across instances
Visual Interface - Web-based configuration and monitoring
Why This Matters
This is the first MCP server that truly learns and adapts. Traditional MCP servers are static - they do the same thing every time. Our Smart Tool Intelligence system represents a paradigm shift toward AI tools that become more helpful over time.
For Users: Better results with less effort as the system learns your preferences.
For Developers: A blueprint for building truly intelligent, adaptive AI tools.
For the MCP Ecosystem: A new standard for what MCP servers can become.
License
This project is licensed under the MIT License - feel free to use, modify, and distribute.
Acknowledgments
Built with:
Google Gemini AI - Powering the core AI capabilities
Model Context Protocol - Enabling seamless integration
Node.js & NPM - Runtime and package management
Claude & Rob - Human-AI collaboration at its finest
Ready to experience the future of MCP servers? Get started now and watch your AI tools become smarter with every interaction!