Provides image generation capabilities using Google's Gemini AI models with customizable parameters like style and temperature
Gemini MCP Server with Smart Tool Intelligence
Welcome to the Gemini MCP Server, the first MCP server with Smart Tool Intelligence - a revolutionary self-learning system that adapts to your preferences and improves over time. This comprehensive platform provides 7 AI-powered tools with automatic prompt enhancement and context awareness.
Features Overview
7 AI-Powered Tools
Image Generation - Create images from text prompts using Gemini 2.0 Flash
Image Editing - Edit existing images with natural language instructions
Chat - Interactive conversations with context-aware responses
Audio Transcription - Convert audio to text with optional verbatim mode
Code Execution - Run Python code in a secure sandbox environment
Video Analysis - Analyze video content for summaries, transcripts, and insights
Image Analysis - Extract objects, text, and detailed descriptions from images
Smart Tool Intelligence System (First in MCP Ecosystem)
Self-Learning - Automatically learns from successful interactions
Context Detection - Recognizes consciousness research, coding, debugging contexts
Pattern Recognition - Identifies usage patterns and user preferences
Prompt Enhancement - Refines prompts for better AI model performance
Persistent Memory - Stores learned preferences across sessions
Automatic Migration - Seamlessly upgrades preference storage
Quick Start
Installation
Configuration
Get your Gemini API key from Google AI Studio
Copy the environment template:
cp .env.example .env
Edit .env and add your API key:
GEMINI_API_KEY=your_actual_api_key_here
OUTPUT_DIR=/path/to/your/output/directory # Optional
DEBUG=false # Optional
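As a rough sketch of how these variables could be consumed, the snippet below validates the required key and applies defaults for the optional ones. The function name loadConfig and the returned field names are hypothetical; the default output directory is the one this README lists under "Logs Location". The actual server may load its configuration differently.

```javascript
// Sketch only: loadConfig and its return shape are assumptions for illustration.
const os = require('node:os');
const path = require('node:path');

// Build a config object from environment variables.
// Throws when the required GEMINI_API_KEY is missing or blank.
function loadConfig(env = process.env) {
  const apiKey = (env.GEMINI_API_KEY || '').trim();
  if (!apiKey) {
    throw new Error('Missing GEMINI_API_KEY: set it in .env or the environment');
  }
  return {
    apiKey,
    // Default directory as documented in this README (~/Claude/gemini-images).
    outputDir: env.OUTPUT_DIR || path.join(os.homedir(), 'Claude', 'gemini-images'),
    // Environment values are strings, so compare against "true" explicitly.
    debug: String(env.DEBUG || 'false').toLowerCase() === 'true',
  };
}

const config = loadConfig({ GEMINI_API_KEY: 'your_actual_api_key_here', DEBUG: 'false' });
console.log(config.debug, config.outputDir);
```

Failing fast on a missing key surfaces the problem at startup rather than on the first tool call.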
Running the Server
Integration with Claude Desktop
Add to your Claude Desktop config (claude_desktop_config.json):
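For orientation, Claude Desktop reads MCP servers from an mcpServers object in which each entry has a command, args, and optional env. The sketch below builds such an entry in JavaScript and prints it as JSON. The helper name, the "gemini" label, and the entry-point path are placeholders chosen for illustration, not values taken from this repository.

```javascript
// Hypothetical helper: builds an object to merge into claude_desktop_config.json.
function buildClaudeDesktopConfig(serverEntryPath, apiKey) {
  return {
    mcpServers: {
      // "gemini" is an arbitrary label chosen for this sketch.
      gemini: {
        command: 'node',
        args: [serverEntryPath],
        env: { GEMINI_API_KEY: apiKey },
      },
    },
  };
}

// The path below is a placeholder; substitute the server's real entry point.
const desktopConfig = buildClaudeDesktopConfig(
  '/path/to/server/entry.js',
  'your_actual_api_key_here',
);
console.log(JSON.stringify(desktopConfig, null, 2));
```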
Tool Reference
1. Image Generation (generate_image)
Generate images from text descriptions using Gemini 2.0 Flash.
Parameters:
prompt (string, required) - Description of the image to generate
context (string, optional) - Context for Smart Tool Intelligence enhancement
Example:
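A minimal sketch of how an MCP client might invoke this tool, assuming the standard MCP tools/call JSON-RPC request shape. The helper buildToolCall and the request id are hypothetical; only the tool name and the parameter names come from the list above.

```javascript
// Hypothetical helper that wraps tool arguments in an MCP "tools/call" request.
function buildToolCall(id, name, args) {
  if (typeof name !== 'string' || name.length === 0) {
    throw new Error('tool name must be a non-empty string');
  }
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

const request = buildToolCall(1, 'generate_image', {
  prompt: 'A futuristic city at night with neon lights',
  context: 'general', // optional
});
console.log(JSON.stringify(request, null, 2));
```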
Returns:
2. Image Editing (gemini-edit-image)
Edit existing images using natural language instructions.
Parameters:
image_path (string, required) - Path to the image file to edit
edit_instruction (string, required) - Description of desired changes
context (string, optional) - Context for enhancement
Example:
3. Chat (gemini-chat)
Interactive conversations with Gemini AI that learns your preferences.
Parameters:
message (string, required) - Your message or question
context (string, optional) - Context for Smart Tool Intelligence
Example:
4. Audio Transcription (gemini-transcribe-audio)
Convert audio files to text with Smart Tool Intelligence enhancement.
Parameters:
file_path (string, required) - Path to audio file (MP3, WAV, FLAC, AAC, OGG, WEBM, M4A)
language (string, optional) - Language hint for better accuracy
context (string, optional) - Use "verbatim" for exact word-for-word transcription
preserve_spelled_acronyms (boolean, optional) - Keep U-R-L instead of URL
Example (Standard):
Example (Verbatim Mode):
Verbatim Mode Features:
Captures all "um", "uh", "like", repeated words
Preserves emotional expressions: [laughs], [sighs], [clears throat]
Maintains original punctuation and sentence structure
No summarization or cleanup
5. Code Execution (gemini-code-execute)
Execute Python code in a secure sandbox environment.
Parameters:
code (string, required) - Python code to execute
context (string, optional) - Context for enhancement
Example:
6. Video Analysis (gemini-analyze-video)
Analyze video content for summaries, transcripts, and detailed insights.
Parameters:
file_path (string, required) - Path to video file (MP4, MOV, AVI, WEBM, MKV, FLV)
analysis_type (string, optional) - "summary", "transcript", "objects", "detailed", "custom"
context (string, optional) - Context for enhancement
Example:
7. Image Analysis (gemini-analyze-image)
Extract detailed information from images including objects, text, and descriptions.
Parameters:
file_path (string, required) - Path to image file (JPEG, PNG, WebP, HEIC, HEIF, BMP, GIF)
analysis_type (string, optional) - "summary", "objects", "text", "detailed", "custom"
context (string, optional) - Context for enhancement
Example:
Smart Tool Intelligence System
How It Works
The Smart Tool Intelligence system is the first of its kind in the MCP ecosystem. It automatically:
Detects Context - Recognizes if you're doing consciousness research, coding, debugging, etc.
Enhances Prompts - Adds relevant instructions based on learned patterns
Learns Patterns - Stores successful interaction patterns for future use
Adapts Over Time - Gets better at helping you with each interaction
Context Types
The system recognizes these contexts and applies appropriate enhancements:
consciousness - Adds academic rigor, citations, detailed explanations
code - Includes practical examples, working code, best practices
debugging - Focuses on root cause analysis and specific fixes
general - Applies comprehensive, structured responses
verbatim - For audio transcription, provides exact word-for-word output
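To make the idea concrete, here is a small keyword-based sketch of context detection and enhancement. The keyword lists, function names, and enhancement wording are assumptions made for illustration; the repository's own detector may rely on entirely different signals.

```javascript
// Hypothetical keyword lists; plain substring matching is deliberately naive
// and can misfire (for example "api" also matches inside "rapid").
const CONTEXT_KEYWORDS = {
  debugging: ['error', 'bug', 'stack trace', 'exception', 'crash'],
  code: ['function', 'class', 'implement', 'refactor', 'api'],
  consciousness: ['consciousness', 'qualia', 'awareness', 'phenomenal'],
};

// Instruction text paraphrased from the context descriptions above.
const ENHANCEMENTS = {
  consciousness: 'Respond with academic rigor, citations, and detailed explanations.',
  code: 'Include practical examples, working code, and best practices.',
  debugging: 'Focus on root cause analysis and specific fixes.',
  general: 'Give a comprehensive, structured response.',
  verbatim: 'Provide an exact word-for-word transcription.',
};

// Pick the context with the most keyword hits; an explicit context wins.
function detectContext(prompt, explicitContext) {
  if (explicitContext) return explicitContext;
  const text = prompt.toLowerCase();
  let best = 'general';
  let bestScore = 0;
  for (const [context, keywords] of Object.entries(CONTEXT_KEYWORDS)) {
    const score = keywords.filter((keyword) => text.includes(keyword)).length;
    if (score > bestScore) {
      best = context;
      bestScore = score;
    }
  }
  return best;
}

// Append the context-specific instruction to the user's prompt.
function enhancePrompt(prompt, context) {
  const extra = ENHANCEMENTS[context] || ENHANCEMENTS.general;
  return `${prompt}\n\n${extra}`;
}

const detected = detectContext('Implement a class for parsing dates');
console.log(detected);
console.log(enhancePrompt('Implement a class for parsing dates', detected));
```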
Storage Location
Preferences are stored internally at ./data/tool-preferences.json with automatic migration from external storage.
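A minimal sketch of JSON-backed preference storage follows, assuming a simple success counter per tool and context. The class name, the file schema (version, patterns), and the method names are hypothetical; the real preference file may be structured differently. The demo writes to a temporary directory rather than ./data.

```javascript
// Sketch only: PreferenceStore and its JSON schema are illustrative assumptions.
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

class PreferenceStore {
  constructor(filePath = './data/tool-preferences.json') {
    this.filePath = filePath;
    this.data = this.load();
  }

  // Read the file if it exists; start empty when it does not.
  load() {
    try {
      return JSON.parse(fs.readFileSync(this.filePath, 'utf8'));
    } catch (err) {
      if (err.code === 'ENOENT') return { version: 1, patterns: {} };
      throw err; // corrupt JSON or permission problems should not be hidden
    }
  }

  // Count one successful interaction for a tool/context pair and persist it.
  recordSuccess(toolName, context) {
    const key = `${toolName}:${context}`;
    const entry = this.data.patterns[key] || { count: 0, lastUsed: null };
    entry.count += 1;
    entry.lastUsed = new Date().toISOString();
    this.data.patterns[key] = entry;
    this.save();
    return entry;
  }

  save() {
    fs.mkdirSync(path.dirname(this.filePath), { recursive: true });
    fs.writeFileSync(this.filePath, JSON.stringify(this.data, null, 2));
  }
}

// Demo against a temporary file so real preferences are never touched.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'prefs-'));
const demoPath = path.join(demoDir, 'tool-preferences.json');
new PreferenceStore(demoPath).recordSuccess('gemini-chat', 'code');
console.log(new PreferenceStore(demoPath).data.patterns['gemini-chat:code'].count); // 1
```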
Implementing Smart Tool Intelligence in Your MCP Server
Want to add this revolutionary capability to your own MCP server? Here's how:
1. Core Architecture
2. Integration Pattern
3. Key Implementation Files
Study these files from this repository:
src/intelligence/index.js - Main intelligence coordinator
src/intelligence/context-detector.js - Context recognition logic
src/intelligence/prompt-enhancer.js - Enhancement application
src/intelligence/preference-store.js - Pattern storage and retrieval
src/tools/base-tool.js - Integration with tool execution
Testing
Run Test Suite
Manual Testing Examples
Performance & Limits
File Size Limits
Images: 20MB (JPEG, PNG, WebP, HEIC, HEIF, BMP, GIF)
Audio: 20MB (MP3, WAV, FLAC, AAC, OGG, WEBM, M4A)
Video: 100MB (MP4, MOV, AVI, WEBM, MKV, FLV)
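These limits could be checked before any upload with a small validator such as the sketch below. The function and constant names are hypothetical, treating 1 MB as 1024 * 1024 bytes is an assumption, and "jpg" is added alongside "jpeg" as a common alias that the list above does not mention.

```javascript
// Assumption: 1 MB = 1024 * 1024 bytes; the server may count differently.
const MB = 1024 * 1024;

// Limits and formats copied from the list above ("jpg" is an added alias).
const LIMITS = {
  image: { maxBytes: 20 * MB, extensions: ['jpeg', 'jpg', 'png', 'webp', 'heic', 'heif', 'bmp', 'gif'] },
  audio: { maxBytes: 20 * MB, extensions: ['mp3', 'wav', 'flac', 'aac', 'ogg', 'webm', 'm4a'] },
  video: { maxBytes: 100 * MB, extensions: ['mp4', 'mov', 'avi', 'webm', 'mkv', 'flv'] },
};

// Return { ok: true } or { ok: false, reason } without touching the filesystem.
function validateFile(kind, fileName, sizeBytes) {
  const limit = LIMITS[kind];
  if (!limit) return { ok: false, reason: `Unknown media kind: ${kind}` };

  const dot = fileName.lastIndexOf('.');
  const ext = dot === -1 ? '' : fileName.slice(dot + 1).toLowerCase();
  if (!limit.extensions.includes(ext)) {
    return { ok: false, reason: `Unsupported ${kind} format: "${ext}"` };
  }
  if (sizeBytes > limit.maxBytes) {
    return { ok: false, reason: `File exceeds the ${limit.maxBytes / MB}MB ${kind} limit` };
  }
  return { ok: true };
}

console.log(validateFile('video', 'demo.mp4', 150 * MB));
```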
API Rate Limits
Follows Google Gemini API rate limits
Built-in error handling and retry logic
Graceful degradation on quota exceeded
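The README does not show how its retry logic is implemented. One common approach is exponential backoff, sketched below; the function name, option names, and default values are all assumptions rather than the server's actual behavior.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry an async operation, doubling the delay after each failed attempt.
// `isRetryable` lets callers stop early on errors that will never succeed.
async function withRetry(operation, options = {}) {
  const { retries = 3, baseDelayMs = 500, isRetryable = () => true } = options;
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await operation(attempt);
    } catch (err) {
      if (attempt >= retries || !isRetryable(err)) throw err;
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}

// Demo: fail twice, then succeed on the third attempt.
let demoCalls = 0;
withRetry(async () => {
  demoCalls += 1;
  if (demoCalls < 3) throw new Error('temporary failure');
  return `succeeded after ${demoCalls} attempts`;
}, { baseDelayMs: 10 }).then(console.log);
```

In practice, isRetryable would typically allow rate-limit and transient server errors while rejecting invalid requests immediately.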
Architecture Deep Dive
Modular Design
Intelligence System Flow
Request Received → Tool's execute method called
Context Detection → Analyze prompt for context clues
Pattern Retrieval → Get relevant learned patterns
Prompt Enhancement → Apply context-specific improvements
API Execution → Send enhanced prompt to Gemini
Pattern Storage → Store successful interaction pattern
Response Return → Return enhanced result to user
Customization
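The seven steps above can be sketched as a single async function. Every name here is an illustrative stand-in: the in-memory Map replaces the preference file, guessContext is a deliberately crude detector, and callModel is injected so the example runs without contacting Gemini.

```javascript
// context -> { instructions: string[], successes: number } (in-memory stand-in)
const learned = new Map([
  ['debugging', { instructions: ['Focus on root cause analysis and specific fixes.'], successes: 0 }],
]);

function patternsFor(context) {
  if (!learned.has(context)) learned.set(context, { instructions: [], successes: 0 });
  return learned.get(context);
}

// Crude detector used only for this sketch.
function guessContext(prompt) {
  return /error|bug|exception|crash/i.test(prompt) ? 'debugging' : 'general';
}

// 1. Request received: a tool's execute method would call this function.
async function executeWithIntelligence(prompt, callModel) {
  const context = guessContext(prompt);                 // 2. Context detection
  const patterns = patternsFor(context);                // 3. Pattern retrieval
  const enhanced = [prompt, ...patterns.instructions]   // 4. Prompt enhancement
    .join('\n\n');
  const response = await callModel(enhanced);           // 5. API execution
  patterns.successes += 1;                              // 6. Pattern storage (success only)
  return { context, enhanced, response };               // 7. Response return
}

// Demo with a fake model that reports the size of the prompt it received.
const fakeModel = async (text) => `model saw ${text.length} characters`;
executeWithIntelligence('Summarize the plot of Hamlet', fakeModel)
  .then((result) => console.log(result.context, '|', result.response));
```

Because the counter is updated only after callModel resolves, a failed API call leaves the learned state unchanged.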
Adding New Contexts
Adding New Tools
Create tool file in src/tools/my-new-tool.js
Extend BaseTool class
Implement execute method with intelligence integration
Register in src/tools/index.js
Troubleshooting
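As a rough illustration of these four steps, the sketch below defines a stand-in BaseTool, extends it, and registers the result. The real BaseTool in src/tools/base-tool.js almost certainly has a richer interface (including the intelligence hooks), so the constructor signature, the registry, and the example tool itself are assumptions. The return value follows the MCP convention of a content array of typed items.

```javascript
// Stand-in for the repository's BaseTool; its actual interface may differ.
class BaseTool {
  constructor(name, description) {
    this.name = name;
    this.description = description;
  }

  async execute(_args) {
    throw new Error(`${this.name} must implement execute()`);
  }
}

// Illustrative tool: counts whitespace-separated words in a text.
class WordCountTool extends BaseTool {
  constructor() {
    super('my-new-tool', 'Counts words in a text (illustrative only)');
  }

  async execute(args) {
    if (typeof args.text !== 'string') {
      throw new Error('text (string) is required');
    }
    const words = args.text.trim().split(/\s+/).filter(Boolean);
    return { content: [{ type: 'text', text: `${words.length} words` }] };
  }
}

// Hypothetical registry standing in for src/tools/index.js.
const tools = new Map();
function registerTool(tool) {
  tools.set(tool.name, tool);
  return tool;
}

registerTool(new WordCountTool());
tools.get('my-new-tool')
  .execute({ text: 'hello adaptive world' })
  .then((result) => console.log(result.content[0].text));
```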
Common Issues
"Missing GEMINI_API_KEY" Error
"File not found" Errors
Intelligence System Not Learning
Debug Mode
Logs Location
Application logs: Console output
Intelligence patterns: ./data/tool-preferences.json
Generated images: $OUTPUT_DIR (default: ~/Claude/gemini-images)
Contributing
We welcome contributions! This project represents a new paradigm in MCP server development.
Development Setup
Areas for Contribution
New Contexts - Add support for specialized domains
Enhanced Patterns - Improve learning algorithms
New Tools - Expand Gemini AI capabilities
Performance - Optimize intelligence system performance
Documentation - Improve guides and examples
Roadmap
Multi-language Support - Context detection in multiple languages
Advanced Analytics - Usage patterns and performance metrics
Tool Chaining - Intelligent coordination between multiple tools
Custom Models - Support for fine-tuned Gemini models
Collaborative Learning - Share anonymized patterns across instances
Visual Interface - Web-based configuration and monitoring
Why This Matters
This is the first MCP server that truly learns and adapts. Traditional MCP servers are static - they do the same thing every time. Our Smart Tool Intelligence system represents a paradigm shift toward AI tools that become more helpful over time.
For Users: Better results with less effort as the system learns your preferences.
For Developers: A blueprint for building truly intelligent, adaptive AI tools.
For the MCP Ecosystem: A new standard for what MCP servers can become.
License
This project is licensed under the MIT License - feel free to use, modify, and distribute.
Acknowledgments
Built with:
Google Gemini AI - Powering the core AI capabilities
Model Context Protocol - Enabling seamless integration
Node.js & NPM - Runtime and package management
Claude & Rob - Human-AI collaboration at its finest
Ready to experience the future of MCP servers? Get started now and watch your AI tools become smarter with every interaction!