generate_image

Create images from text descriptions using Google's Gemini AI model, with optional context for style enhancement.

Instructions

Generate an image using Google's Gemini 2.0 Flash Experimental model (with learned user preferences)

Input Schema

Name     Required  Description                                                                                      Default
prompt   Yes       Text description of the desired image                                                            —
context  No        Optional context for intelligent enhancement (e.g., "artistic", "photorealistic", "technical")   —

Implementation Reference

  • The main handler function (execute) for the generate_image tool. It validates inputs, optionally enhances the prompt using the intelligence system, calls the Gemini service to generate the image, saves the base64 image data to a file, learns from the interaction, and returns a text response with the file path.
    async execute(args) {
      const prompt = validateNonEmptyString(args.prompt, 'prompt');
      const context = args.context ? validateString(args.context, 'context') : null;

      log(`Generating image: "${prompt}" with context: ${context || 'general'}`, this.name);

      try {
        let enhancedPrompt = prompt;
        if (this.intelligenceSystem.initialized) {
          try {
            enhancedPrompt = await this.intelligenceSystem.enhancePrompt(prompt, context, this.name);
            log('Applied Tool Intelligence enhancement', this.name);
          } catch (err) {
            log(`Tool Intelligence enhancement failed: ${err.message}`, this.name);
          }
        }

        const formattedPrompt = `Create a detailed and high-quality image of: ${enhancedPrompt}`;
        const imageData = await this.geminiService.generateImage('IMAGE_GENERATION', formattedPrompt);

        if (imageData) {
          log('Successfully extracted image data', this.name);
          ensureDirectoryExists(config.OUTPUT_DIR, this.name);

          const timestamp = Date.now();
          const hash = crypto.createHash('md5').update(prompt).digest('hex');
          const imageName = `gemini-${hash}-${timestamp}.png`;
          const imagePath = path.join(config.OUTPUT_DIR, imageName);

          fs.writeFileSync(imagePath, Buffer.from(imageData, 'base64'));
          log(`Image saved to: ${imagePath}`, this.name);

          if (this.intelligenceSystem.initialized) {
            try {
              await this.intelligenceSystem.learnFromInteraction(
                prompt,
                enhancedPrompt,
                `Image generated successfully: ${imagePath}`,
                context,
                this.name,
              );
              log('Tool Intelligence learned from interaction', this.name);
            } catch (err) {
              log(`Tool Intelligence learning failed: ${err.message}`, this.name);
            }
          }

          let finalResponse = `✓ Image successfully generated from prompt: "${prompt}"\n\nYou can find the image at: ${imagePath}`;
          if (context && this.intelligenceSystem.initialized) {
            finalResponse += `\n\n---\n_Enhancement applied based on context: ${context}_`;
          }

          return {
            content: [
              {
                type: 'text',
                text: finalResponse,
              },
            ],
          };
        }

        log('No image data found in response', this.name);
        return {
          content: [
            {
              type: 'text',
              text: `Could not generate image for: "${prompt}". No image data was returned by Gemini API.`,
            },
          ],
        };
      } catch (error) {
        log(`Error generating image: ${error.message}`, this.name);
        throw new Error(`Error generating image: ${error.message}`);
      }
    }
  • JSON schema defining the input parameters for the generate_image tool: required 'prompt' and optional 'context'.
    {
      type: 'object',
      properties: {
        prompt: {
          type: 'string',
          description: 'Text description of the desired image',
        },
        context: {
          type: 'string',
          description: 'Optional context for intelligent enhancement (e.g., "artistic", "photorealistic", "technical")',
        },
      },
      required: ['prompt'],
    },
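A minimal runtime check mirroring this schema can be sketched as follows. This is an illustration, not the server's actual validator (the handler uses its own `validateNonEmptyString`/`validateString` helpers):

```javascript
// Sketch of argument validation matching the JSON schema above:
// 'prompt' is a required non-empty string; 'context' is an optional string.
function validateArgs(args) {
  if (typeof args.prompt !== 'string' || args.prompt.trim() === '') {
    throw new Error('prompt is required and must be a non-empty string');
  }
  if (args.context !== undefined && typeof args.context !== 'string') {
    throw new Error('context, when provided, must be a string');
  }
  return { prompt: args.prompt, context: args.context ?? null };
}
```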
  • Registers the ImageGenerationTool instance in the central tool registry using the registerTool function.
    registerTool(new ImageGenerationTool(intelligenceSystem, geminiService));
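The registry itself is not shown on this page; a hypothetical minimal version of the kind `registerTool` implies — a name-keyed map that rejects duplicates — might look like this (the actual implementation lives in the server's registry module):

```javascript
// Hypothetical sketch of a central tool registry keyed by tool name.
const tools = new Map();

function registerTool(tool) {
  if (tools.has(tool.name)) {
    throw new Error(`Tool already registered: ${tool.name}`);
  }
  tools.set(tool.name, tool);
}
```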
  • Helper method in GeminiService that performs the actual image generation by calling the Gemini API with the provided model and prompt, and extracts base64 image data from the response.
    async generateImage(modelType, prompt) {
      try {
        const modelConfig = getGeminiModelConfig(modelType);

        // Pass only the model name to getGenerativeModel
        const model = this.genAI.getGenerativeModel({ model: modelConfig.model });

        const content = formatTextPrompt(prompt); // Image generation also uses a text prompt

        // Pass the generationConfig to the generateContent method
        const result = await model.generateContent({
          contents: [{ parts: content }],
          generationConfig: modelConfig.generationConfig,
        });

        log(`Image generation response received from Gemini API for model type: ${modelType}`, 'gemini-service');
        return extractImageData(result.response?.candidates?.[0]);
      } catch (error) {
        log(`Error generating image with Gemini API for model type ${modelType}: ${error.message}`, 'gemini-service');
        throw new Error(`Gemini image generation failed: ${error.message}`);
      }
    }
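`extractImageData` is not shown on this page. One plausible sketch, assuming the standard Gemini response shape where binary output arrives as a base64 string in an `inlineData` part of the candidate's content, would be:

```javascript
// Sketch (assumption, not the server's actual code): pull the first
// base64 inlineData payload out of a Gemini response candidate.
function extractImageData(candidate) {
  const parts = candidate?.content?.parts ?? [];
  const imagePart = parts.find((p) => p.inlineData && p.inlineData.data);
  return imagePart ? imagePart.inlineData.data : null;
}
```

Returning `null` when no image part is present is what lets the `execute` handler above fall through to its "No image data found in response" branch.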

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Garblesnarff/gemini-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.