
Gemini MCP Server for Claude Desktop

# Gemini MCP Server with Smart Tool Intelligence

Welcome to the Gemini MCP Server, the **first MCP server with Smart Tool Intelligence** - a revolutionary self-learning system that adapts to your preferences and improves over time. This comprehensive platform provides 7 AI-powered tools with automatic prompt enhancement and context awareness.

## 🚀 Features Overview

### 🤖 7 AI-Powered Tools

- **Image Generation** - Create images from text prompts using Gemini 2.0 Flash
- **Image Editing** - Edit existing images with natural language instructions
- **Chat** - Interactive conversations with context-aware responses
- **Audio Transcription** - Convert audio to text with optional verbatim mode
- **Code Execution** - Run Python code in a secure sandbox environment
- **Video Analysis** - Analyze video content for summaries, transcripts, and insights
- **Image Analysis** - Extract objects, text, and detailed descriptions from images

### 🧠 Smart Tool Intelligence System *(First in MCP Ecosystem)*

- **Self-Learning** - Automatically learns from successful interactions
- **Context Detection** - Recognizes consciousness research, coding, debugging contexts
- **Pattern Recognition** - Identifies usage patterns and user preferences
- **Prompt Enhancement** - Refines prompts for better AI model performance
- **Persistent Memory** - Stores learned preferences across sessions
- **Automatic Migration** - Seamlessly upgrades preference storage

## 📦 Quick Start

### Installation

```bash
git clone https://github.com/Garblesnarff/gemini-mcp-server.git
cd gemini-mcp-server
npm install
```

### Configuration

1. Get your Gemini API key from [Google AI Studio](https://aistudio.google.com/app/apikey)
2. Copy the environment template:
   ```bash
   cp .env.example .env
   ```
3. Edit `.env` and add your API key:
   ```env
   GEMINI_API_KEY=your_actual_api_key_here
   OUTPUT_DIR=/path/to/your/output/directory  # Optional
   DEBUG=false  # Optional
   ```

### Running the Server

```bash
npm start
# or for development with debug logging:
npm run dev
```

### Integration with Claude Desktop

Add to your Claude Desktop config (`claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "gemini": {
      "command": "node",
      "args": ["/path/to/gemini-mcp-server/gemini-server.js"],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here"
      }
    }
  }
}
```

## 🛠️ Tool Reference

### 1. Image Generation (`generate_image`)

Generate images from text descriptions using Gemini 2.0 Flash.

**Parameters:**
- `prompt` (string, required) - Description of the image to generate
- `context` (string, optional) - Context for Smart Tool Intelligence enhancement

**Example:**
```javascript
{
  "prompt": "A serene mountain landscape at sunset with vibrant colors",
  "context": "artistic"
}
```

**Returns:**
```javascript
{
  "content": [{
    "type": "text",
    "text": "Generated a beautiful mountain landscape image."
  }, {
    "type": "image",
    "data": "base64_image_data",
    "mimeType": "image/png"
  }]
}
```

### 2. Image Editing (`gemini-edit-image`)

Edit existing images using natural language instructions.

**Parameters:**
- `image_path` (string, required) - Path to the image file to edit
- `edit_instruction` (string, required) - Description of desired changes
- `context` (string, optional) - Context for enhancement

**Example:**
```javascript
{
  "image_path": "/path/to/image.jpg",
  "edit_instruction": "Add shooting stars to the night sky",
  "context": "artistic"
}
```

### 3. Chat (`gemini-chat`)

Interactive conversations with Gemini AI that learns your preferences.
**Parameters:**
- `message` (string, required) - Your message or question
- `context` (string, optional) - Context for Smart Tool Intelligence

**Example:**
```javascript
{
  "message": "Explain quantum computing in simple terms",
  "context": "consciousness"  // Will apply academic rigor enhancement
}
```

### 4. Audio Transcription (`gemini-transcribe-audio`)

Convert audio files to text with Smart Tool Intelligence enhancement.

**Parameters:**
- `file_path` (string, required) - Path to audio file (MP3, WAV, FLAC, AAC, OGG, WEBM, M4A)
- `language` (string, optional) - Language hint for better accuracy
- `context` (string, optional) - Use "verbatim" for exact word-for-word transcription
- `preserve_spelled_acronyms` (boolean, optional) - Keep U-R-L instead of URL

**Example (Standard):**
```javascript
{
  "file_path": "/path/to/audio.mp3",
  "language": "en"
}
```

**Example (Verbatim Mode):**
```javascript
{
  "file_path": "/path/to/audio.mp3",
  "context": "verbatim",  // Gets exact word-for-word transcription
  "preserve_spelled_acronyms": true
}
```

**Verbatim Mode Features:**
- Captures all "um", "uh", "like", repeated words
- Preserves emotional expressions: [laughs], [sighs], [clears throat]
- Maintains original punctuation and sentence structure
- No summarization or cleanup

### 5. Code Execution (`gemini-code-execute`)

Execute Python code in a secure sandbox environment.

**Parameters:**
- `code` (string, required) - Python code to execute
- `context` (string, optional) - Context for enhancement

**Example:**
```javascript
{
  "code": "import pandas as pd\ndata = {'x': [1,2,3], 'y': [4,5,6]}\ndf = pd.DataFrame(data)\nprint(df.describe())",
  "context": "code"
}
```

### 6. Video Analysis (`gemini-analyze-video`)

Analyze video content for summaries, transcripts, and detailed insights.
**Parameters:**
- `file_path` (string, required) - Path to video file (MP4, MOV, AVI, WEBM, MKV, FLV)
- `analysis_type` (string, optional) - "summary", "transcript", "objects", "detailed", "custom"
- `context` (string, optional) - Context for enhancement

**Example:**
```javascript
{
  "file_path": "/path/to/video.mp4",
  "analysis_type": "detailed"
}
```

### 7. Image Analysis (`gemini-analyze-image`)

Extract detailed information from images including objects, text, and descriptions.

**Parameters:**
- `file_path` (string, required) - Path to image file (JPEG, PNG, WebP, HEIC, HEIF, BMP, GIF)
- `analysis_type` (string, optional) - "summary", "objects", "text", "detailed", "custom"
- `context` (string, optional) - Context for enhancement

**Example:**
```javascript
{
  "file_path": "/path/to/image.jpg",
  "analysis_type": "objects"
}
```

## 🧠 Smart Tool Intelligence System

### How It Works

The Smart Tool Intelligence system is the first of its kind in the MCP ecosystem. It automatically:

1. **Detects Context** - Recognizes if you're doing consciousness research, coding, debugging, etc.
2. **Enhances Prompts** - Adds relevant instructions based on learned patterns
3. **Learns Patterns** - Stores successful interaction patterns for future use
4. **Adapts Over Time** - Gets better at helping you with each interaction

### Context Types

The system recognizes these contexts and applies appropriate enhancements:

- **`consciousness`** - Adds academic rigor, citations, detailed explanations
- **`code`** - Includes practical examples, working code, best practices
- **`debugging`** - Focuses on root cause analysis and specific fixes
- **`general`** - Applies comprehensive, structured responses
- **`verbatim`** - For audio transcription, provides exact word-for-word output

### Storage Location

Preferences are stored internally at `./data/tool-preferences.json` with automatic migration from external storage.
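To make the storage idea concrete, here is a minimal sketch of a JSON-backed preference file that groups learned patterns by context. It is an illustration only: the `{ version, patterns }` schema and the `loadPreferences` and `storePattern` helpers are assumptions made for this example, and the actual layout of `tool-preferences.json` in this repository may differ.

```javascript
// Minimal sketch of a JSON-backed preference store.
// ASSUMPTION: the { version, patterns } schema and both helper names are
// hypothetical; the real layout of tool-preferences.json may differ.
const fs = require('node:fs');
const path = require('node:path');

// Build a fresh, empty store so callers never share a mutable default.
function emptyPreferences() {
  return { version: 1, patterns: {} };
}

// Read the preferences file, falling back to an empty store if it is missing.
function loadPreferences(filePath) {
  if (!fs.existsSync(filePath)) {
    return emptyPreferences();
  }
  return JSON.parse(fs.readFileSync(filePath, 'utf8'));
}

// Record one prompt/enhancement pair under its context and persist the file.
function storePattern(filePath, context, original, enhanced) {
  const prefs = loadPreferences(filePath);
  if (!prefs.patterns[context]) {
    prefs.patterns[context] = [];
  }
  prefs.patterns[context].push({
    original,
    enhanced,
    storedAt: new Date().toISOString(),
  });
  // Create the data directory on first write.
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  fs.writeFileSync(filePath, JSON.stringify(prefs, null, 2));
  return prefs;
}
```

Synchronous file calls keep the sketch short; a server handling concurrent requests would more likely use the promise-based `fs` API and guard against overlapping writes.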
### Implementing Smart Tool Intelligence in Your MCP Server

Want to add this revolutionary capability to your own MCP server? Here's how:

#### 1. Core Architecture

```javascript
// src/intelligence/context-detector.js
class ContextDetector {
  detectContext(prompt, toolName) {
    // Implement pattern matching for different contexts
    if (this.isConsciousnessContext(prompt)) return 'consciousness';
    if (this.isCodeContext(prompt)) return 'code';
    if (this.isDebuggingContext(prompt)) return 'debugging';
    return 'general';
  }
}

// src/intelligence/prompt-enhancer.js
class PromptEnhancer {
  enhancePrompt(originalPrompt, context, toolName) {
    // Apply context-specific enhancements
    const enhancement = this.getEnhancementForContext(context);
    return `${originalPrompt}\n\n${enhancement}`;
  }
}

// src/intelligence/preference-store.js
class PreferencesManager {
  async storePattern(original, enhanced, context, toolName, success) {
    // Store successful patterns for future learning
  }

  async getPatterns(context) {
    // Retrieve learned patterns for context
  }
}
```

#### 2. Integration Pattern

```javascript
// In your tool's execute method:
async execute(args) {
  const intelligence = IntelligenceSystem.getInstance();

  // Detect context and enhance prompt
  const context = args.context || intelligence.contextDetector.detectContext(args.prompt, this.name);
  const enhancedPrompt = await intelligence.enhancePrompt(args.prompt, context, this.name);

  // Execute with enhanced prompt
  const result = await this.geminiService.generateContent(enhancedPrompt);

  // Store successful pattern
  await intelligence.storeSuccessfulPattern(args.prompt, enhancedPrompt, context, this.name);

  return result;
}
```

#### 3. Key Implementation Files

Study these files from this repository:

- `src/intelligence/index.js` - Main intelligence coordinator
- `src/intelligence/context-detector.js` - Context recognition logic
- `src/intelligence/prompt-enhancer.js` - Enhancement application
- `src/intelligence/preference-store.js` - Pattern storage and retrieval
- `src/tools/base-tool.js` - Integration with tool execution

## 🧪 Testing

### Run Test Suite

```bash
# Test basic functionality
npm test

# Test Smart Tool Intelligence
node test-tool-intelligence-full.js

# Test internal storage
node test-internal-storage.js

# Test verbatim transcription
node test-verbatim-mode.js
```

### Manual Testing Examples

```bash
# Test image generation
echo '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"generate_image","arguments":{"prompt":"A cute robot reading a book"}}}' | node gemini-server.js

# Test chat with consciousness context
echo '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"gemini-chat","arguments":{"message":"What is consciousness?","context":"consciousness"}}}' | node gemini-server.js
```

## 📊 Performance & Limits

### File Size Limits

- **Images**: 20MB (JPEG, PNG, WebP, HEIC, HEIF, BMP, GIF)
- **Audio**: 20MB (MP3, WAV, FLAC, AAC, OGG, WEBM, M4A)
- **Video**: 100MB (MP4, MOV, AVI, WEBM, MKV, FLV)

### API Rate Limits

- Follows Google Gemini API rate limits
- Built-in error handling and retry logic
- Graceful degradation on quota exceeded

## 🏗️ Architecture Deep Dive

### Modular Design

```
src/
├── server.js                  # MCP protocol handler
├── config.js                  # Configuration management
├── tools/                     # Tool implementations
│   ├── index.js               # Tool registry & dispatcher
│   ├── base-tool.js           # Abstract base class
│   ├── chat.js                # Chat tool
│   ├── image-generation.js    # Image generation tool
│   ├── image-editing.js       # Image editing tool
│   ├── audio-transcription.js # Audio transcription tool
│   ├── code-execution.js      # Code execution tool
│   ├── video-analysis.js      # Video analysis tool
│   └── image-analysis.js      # Image analysis tool
├── intelligence/              # Smart Tool Intelligence
│   ├── index.js               # Intelligence coordinator
│   ├── context-detector.js    # Context recognition
│   ├── prompt-enhancer.js     # Prompt enhancement
│   └── preference-store.js    # Pattern storage
├── gemini/                    # Gemini API integration
│   ├── gemini-service.js      # API service layer
│   └── request-handler.js     # Request formatting
└── utils/                     # Utilities
    ├── logger.js              # Logging system
    └── file-utils.js          # File operations
```

### Intelligence System Flow

1. **Request Received** → Tool's execute method called
2. **Context Detection** → Analyze prompt for context clues
3. **Pattern Retrieval** → Get relevant learned patterns
4. **Prompt Enhancement** → Apply context-specific improvements
5. **API Execution** → Send enhanced prompt to Gemini
6. **Pattern Storage** → Store successful interaction pattern
7. **Response Return** → Return enhanced result to user

## 🔧 Customization

### Adding New Contexts

```javascript
// In src/intelligence/context-detector.js
isMyCustomContext(prompt) {
  const patterns = [
    /custom pattern 1/i,
    /custom pattern 2/i
  ];
  return patterns.some(pattern => pattern.test(prompt));
}

// In src/intelligence/prompt-enhancer.js
getEnhancementForContext(context) {
  const enhancements = {
    'my_custom_context': 'Apply my custom enhancement instructions here.',
    // ... other contexts
  };
  return enhancements[context] || enhancements.general;
}
```

### Adding New Tools

1. Create tool file in `src/tools/my-new-tool.js`
2. Extend `BaseTool` class
3. Implement `execute` method with intelligence integration
4. Register in `src/tools/index.js`

```javascript
// src/tools/my-new-tool.js
class MyNewTool extends BaseTool {
  constructor(geminiService, intelligenceSystem) {
    super('my-new-tool', 'Description of my tool', geminiService, intelligenceSystem);
  }

  async execute(args) {
    // Use intelligence system for enhancement
    const context = args.context || this.detectContext(args.input);
    const enhancedPrompt = await this.enhancePrompt(args.input, context);

    // Your tool logic here
    const result = await this.geminiService.someMethod(enhancedPrompt);

    // Store successful pattern
    await this.storeSuccessfulPattern(args.input, enhancedPrompt, context);

    return result;
  }
}
```

## 🐛 Troubleshooting

### Common Issues

**"Missing GEMINI_API_KEY" Error**
```bash
# Ensure .env file exists and contains your API key
cp .env.example .env
# Edit .env and add: GEMINI_API_KEY=your_key_here
```

**"File not found" Errors**
```bash
# Ensure file paths are absolute and files exist
# Check file permissions and formats
```

**Intelligence System Not Learning**
```bash
# Check data directory permissions
ls -la data/
# Verify tool-preferences.json is writable
```

### Debug Mode

```bash
DEBUG=true npm start
# or
npm run dev
```

### Logs Location

- Application logs: Console output
- Intelligence patterns: `./data/tool-preferences.json`
- Generated images: `$OUTPUT_DIR` (default: `~/Claude/gemini-images`)

## 🤝 Contributing

We welcome contributions! This project represents a new paradigm in MCP server development.
### Development Setup

```bash
git clone https://github.com/Garblesnarff/gemini-mcp-server.git
cd gemini-mcp-server
npm install
npm run dev
```

### Areas for Contribution

- **New Contexts** - Add support for specialized domains
- **Enhanced Patterns** - Improve learning algorithms
- **New Tools** - Expand Gemini AI capabilities
- **Performance** - Optimize intelligence system performance
- **Documentation** - Improve guides and examples

## 📈 Roadmap

- [ ] **Multi-language Support** - Context detection in multiple languages
- [ ] **Advanced Analytics** - Usage patterns and performance metrics
- [ ] **Tool Chaining** - Intelligent coordination between multiple tools
- [ ] **Custom Models** - Support for fine-tuned Gemini models
- [ ] **Collaborative Learning** - Share anonymized patterns across instances
- [ ] **Visual Interface** - Web-based configuration and monitoring

## 🌟 Why This Matters

This is the **first MCP server that truly learns and adapts**. Traditional MCP servers are static - they do the same thing every time. Our Smart Tool Intelligence system represents a paradigm shift toward AI tools that become more helpful over time.

**For Users**: Better results with less effort as the system learns your preferences.

**For Developers**: A blueprint for building truly intelligent, adaptive AI tools.

**For the MCP Ecosystem**: A new standard for what MCP servers can become.

## 📄 License

This project is licensed under the [MIT License](LICENSE) - feel free to use, modify, and distribute.

## 🙏 Acknowledgments

Built with:

- **Google Gemini AI** - Powering the core AI capabilities
- **Model Context Protocol** - Enabling seamless integration
- **Node.js & NPM** - Runtime and package management
- **Claude & Rob** - Human-AI collaboration at its finest

---

**Ready to experience the future of MCP servers?** [Get started now](#quick-start) and watch your AI tools become smarter with every interaction! 🚀
