generate_image_from_text
Create images from text descriptions using Google's Gemini AI model. Provide a text prompt to generate visual content through the MCP protocol.
Instructions
Generate an image based on the given text prompt using Google's Gemini model.
Args:
prompt: User's text prompt describing the desired image to generate
Returns:
Path to the generated image file using Gemini's image generation capabilities
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | User's text prompt describing the desired image to generate | |
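As a rough sketch of how a client might invoke this tool over MCP, the snippet below uses the Python MCP client SDK's stdio transport; the server launch command and the example prompt are assumptions for illustration, not part of this tool's definition.

```python
# Hedged client-side sketch: calling generate_image_from_text over stdio.
# The launch command and prompt text below are assumptions for illustration.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    server = StdioServerParameters(
        command="uv",
        args=["run", "mcp-server-gemini-image-generator"],  # assumed launch command
    )
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "generate_image_from_text",
                {"prompt": "a watercolor fox in a snowy forest"},
            )
            print(result.content)


asyncio.run(main())
```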
Implementation Reference
- The handler function for the 'generate_image_from_text' tool, registered via the @mcp.tool() decorator. It translates the prompt, builds the detailed generation contents with get_image_generation_prompt, runs the image generation via process_image_with_gemini, and handles errors.

```python
@mcp.tool()
async def generate_image_from_text(prompt: str) -> Tuple[bytes, str]:
    """Generate an image based on the given text prompt using Google's Gemini model.

    Args:
        prompt: User's text prompt describing the desired image to generate

    Returns:
        Path to the generated image file using Gemini's image generation capabilities
    """
    try:
        # Translate the prompt to English
        translated_prompt = await translate_prompt(prompt)

        # Create detailed generation prompt
        contents = get_image_generation_prompt(translated_prompt)

        # Process with Gemini and return the result
        return await process_image_with_gemini([contents], prompt)
    except Exception as e:
        error_msg = f"Error generating image: {str(e)}"
        logger.error(error_msg)
        return error_msg
```
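Note that the success path above returns an (image, path) tuple while the except branch returns a plain error string; below is a hedged sketch of how a caller inside the same server module might tell the two apart.

```python
# Sketch only: distinguishing the (image, path) tuple returned on success from
# the error string returned by the except branch above. Assumes this runs in
# the server module where generate_image_from_text and logger are in scope.
async def demo_generate() -> None:
    result = await generate_image_from_text("a low-poly mountain landscape at dusk")
    if isinstance(result, tuple):
        image_data, image_path = result
        logger.info(f"Image saved to {image_path}")
    else:
        logger.error(f"Generation failed: {result}")
```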
- src/mcp_server_gemini_image_generator/server.py:246 (registration): The @mcp.tool() decorator registers the generate_image_from_text function as an MCP tool.

```python
@mcp.tool()
```
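For context, this is roughly how an @mcp.tool() registration fits into a FastMCP server; the server name and entry point below are assumptions rather than this project's exact setup.

```python
# Hedged sketch of typical FastMCP wiring around the decorator (names assumed).
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("gemini-image-generator")  # assumed server name


@mcp.tool()
async def generate_image_from_text(prompt: str):
    ...  # handler body as shown above


if __name__ == "__main__":
    mcp.run()
```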
- Core helper function that calls the Gemini API for image generation/processing, generates a filename, saves the image, and returns the image bytes and saved path. Used by the handler.

```python
async def process_image_with_gemini(
    contents: List[Any],
    prompt: str,
    model: str = "gemini-2.0-flash-preview-image-generation"
) -> Tuple[bytes, str]:
    """Process an image request with Gemini and save the result.

    Args:
        contents: List containing the prompt and optionally an image
        prompt: Original prompt for filename generation
        model: Gemini model to use

    Returns:
        Path to the saved image file
    """
    # Call Gemini Vision API
    gemini_response = await call_gemini(
        contents,
        model=model,
        config=types.GenerateContentConfig(
            response_modalities=['Text', 'Image']
        )
    )

    # Generate a filename for the image
    filename = await convert_prompt_to_filename(prompt)

    # Save the image and return the path
    saved_image_path = await save_image(gemini_response, filename)

    return gemini_response, saved_image_path
```
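A hedged usage sketch mirroring the handler: the generation prompt is wrapped in a list, and the returned image data and saved path are unpacked (the example prompt is an assumption).

```python
# Usage sketch only; assumes get_image_generation_prompt and logger from the
# server module are in scope, and leaves the model at its default value.
async def demo_process() -> None:
    text = "a minimalist poster of a lighthouse"
    contents = [get_image_generation_prompt(text)]
    image_data, saved_path = await process_image_with_gemini(contents, prompt=text)
    logger.info(f"Saved generated image to {saved_path}")
```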
- Helper function that translates the input prompt to English using Gemini for better results. Called by the handler.

```python
async def translate_prompt(text: str) -> str:
    """Translate and optimize the user's prompt to English for better image generation results.

    Args:
        text: The original prompt in any language

    Returns:
        English translation of the prompt with preserved intent
    """
    try:
        # Create a prompt for translation with strict intent preservation
        prompt = get_translate_prompt(text)

        # Call Gemini and get the translated prompt
        translated_prompt = await call_gemini(prompt, text_only=True)

        logger.info(f"Original prompt: {text}")
        logger.info(f"Translated prompt: {translated_prompt}")

        return translated_prompt
    except Exception as e:
        logger.error(f"Error translating prompt: {str(e)}")
        # Return original text if translation fails
        return text
```
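A short usage sketch of the fallback above: because a failed translation returns the original text, the caller can feed the result straight into prompt construction (the Korean example prompt is an assumption).

```python
# translate_prompt returns the original text when the Gemini call fails, so
# there is no need to special-case translation errors before building contents.
async def demo_translate():
    translated = await translate_prompt("눈 덮인 숲속의 여우 수채화")  # "watercolor of a fox in a snowy forest"
    return get_image_generation_prompt(translated)
```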