Garblesnarff

Gemini MCP Server for Claude Desktop

gemini-nano-banana-pro

Generate professional 4K images using AI with multiple reference images, character consistency, and studio-grade controls for various creative needs.

Instructions

Generate professional images with Nano Banana Pro (Gemini 3 Pro Image): 4K resolution, up to 14 reference images, advanced text rendering, character consistency, and studio-grade controls

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| prompt | Yes | Text description of the desired image or editing instruction | |
| mode | No | Generation mode: fusion (blend up to 14 images), consistency (maintain character/style for up to 5 characters), targeted_edit (precise localized edits), template (follow layout), standard (basic generation) | |
| resolution | No | Output resolution: 1k (1024px), 2k (2048px), or 4k (4096px). Higher resolutions cost more. | |
| aspect_ratio | No | Aspect ratio for the generated image | |
| reference_images | No | Optional array of file paths to reference images (up to 14 for Nano Banana Pro) | |
| context | No | Optional context for intelligent enhancement (e.g., "professional", "artistic", "infographic") | |
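
As a rough illustration of the parameters listed above, the sketch below fills in the fallbacks that the handler under Implementation Reference uses (`standard`, `1k`, `1:1`, no reference images) and rejects values outside the enums in the JSON schema. The helper name `normalizeArgs` and the sample prompt are hypothetical and are not part of the server.

```javascript
// Hypothetical helper (not part of the server): apply the handler's fallbacks
// and reject values outside the enums listed in the tool's JSON schema.
const MODES = ['fusion', 'consistency', 'targeted_edit', 'template', 'standard'];
const RESOLUTIONS = ['1k', '2k', '4k'];
const ASPECT_RATIOS = ['1:1', '2:3', '3:2', '3:4', '4:3', '4:5', '5:4', '9:16', '16:9', '21:9'];

function normalizeArgs(args) {
  if (typeof args.prompt !== 'string' || args.prompt.trim() === '') {
    throw new Error('prompt is required and must be a non-empty string');
  }

  const normalized = {
    prompt: args.prompt,
    mode: args.mode || 'standard',
    resolution: args.resolution || '1k',
    aspect_ratio: args.aspect_ratio || '1:1',
    reference_images: args.reference_images || [],
  };
  if (args.context) {
    normalized.context = args.context;
  }

  if (!MODES.includes(normalized.mode)) {
    throw new Error(`Unknown mode: ${normalized.mode}`);
  }
  if (!RESOLUTIONS.includes(normalized.resolution)) {
    throw new Error(`Unknown resolution: ${normalized.resolution}`);
  }
  if (!ASPECT_RATIOS.includes(normalized.aspect_ratio)) {
    throw new Error(`Unknown aspect ratio: ${normalized.aspect_ratio}`);
  }
  return normalized;
}

// Only `prompt` is required; everything else falls back to a default.
const example = normalizeArgs({
  prompt: 'A product shot of a ceramic mug',
  resolution: '4k',
  aspect_ratio: '16:9',
});
console.log(example);
```

Checking arguments before the call is optional; the handler performs its own validation and throws on bad input.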

Implementation Reference

  • The main handler function 'execute' that implements the gemini-nano-banana-pro tool logic: validates inputs, processes reference images, enhances prompt with intelligence system, calls OpenRouter service for image generation, saves output, and returns formatted response.
    async execute(args) {
      const prompt = validateNonEmptyString(args.prompt, 'prompt');
      const mode = args.mode || 'standard';
      const resolution = args.resolution || '1k';
      const aspectRatio = args.aspect_ratio || '1:1';
      const referenceImagePaths = args.reference_images || [];
      const context = args.context ? validateString(args.context, 'context') : null;
    
      log(`Nano Banana Pro: mode="${mode}", resolution="${resolution}", aspect_ratio="${aspectRatio}", prompt="${prompt}"`, this.name);
    
      try {
        // Validate reference images count
        if (referenceImagePaths.length > 14) {
          throw new Error(`Nano Banana Pro supports up to 14 reference images, got ${referenceImagePaths.length}`);
        }
    
        // Process reference images
        const referenceImages = [];
        if (referenceImagePaths.length > 0) {
          log(`Processing ${referenceImagePaths.length} reference images`, this.name);
    
          for (let i = 0; i < referenceImagePaths.length; i++) {
            const imagePath = referenceImagePaths[i];
            log(`Processing reference image ${i + 1}/${referenceImagePaths.length}: ${imagePath}`, this.name);
    
            try {
              if (!path.isAbsolute(imagePath)) {
                throw new Error(`File path must be absolute, got relative path: ${imagePath}`);
              }
    
              if (!fs.existsSync(imagePath)) {
                throw new Error(`Reference image file not found: ${imagePath}`);
              }
    
              validateFileSize(imagePath, config.MAX_IMAGE_SIZE_MB);
              const imageBuffer = readFileAsBuffer(imagePath);
              const mimeType = getMimeType(imagePath, config.SUPPORTED_IMAGE_MIMES);
    
              referenceImages.push({
                data: imageBuffer.toString('base64'),
                mimeType,
              });
    
              log(`✓ Loaded reference image ${i + 1}: ${imagePath} (${(imageBuffer.length / 1024).toFixed(2)}KB)`, this.name);
            } catch (fileError) {
              throw new Error(`Reference Image Error: Failed to process image ${i + 1} (${imagePath}): ${fileError.message}`);
            }
          }
        }
    
        // Validate mode requirements
        if (['fusion', 'consistency', 'template'].includes(mode) && referenceImages.length === 0) {
          throw new Error(`Mode "${mode}" requires at least one reference image`);
        }
    
        if (mode === 'fusion' && referenceImages.length < 2) {
          throw new Error('Fusion mode requires at least 2 reference images');
        }
    
        // Apply intelligent enhancement
        let enhancedPrompt = prompt;
        if (this.intelligenceSystem.initialized) {
          try {
            const contextForEnhancement = context || mode;
            enhancedPrompt = await this.intelligenceSystem.enhancePrompt(prompt, contextForEnhancement, this.name);
            log('Applied Tool Intelligence enhancement', this.name);
          } catch (err) {
            log(`Tool Intelligence enhancement failed: ${err.message}`, this.name);
          }
        }
    
        // Check if OpenRouter is available
        if (!openRouterService.isServiceAvailable()) {
          throw new Error('OpenRouter service is not available. Nano Banana Pro requires OpenRouter.');
        }
    
        // Generate image using Nano Banana Pro
        log('Generating image with Nano Banana Pro via OpenRouter', this.name);
        const imageData = await openRouterService.generateNanaBananaProImage(
          enhancedPrompt,
          referenceImages,
          {
            mode,
            resolution,
            aspect_ratio: aspectRatio,
          }
        );
    
        if (imageData) {
          log('Successfully generated Nano Banana Pro image', this.name);
    
          ensureDirectoryExists(config.OUTPUT_DIR, this.name);
    
          const timestamp = Date.now();
          const hash = crypto.createHash('md5').update(prompt + mode + resolution).digest('hex').substring(0, 8);
          const imageName = `nanobananapro-${mode}-${resolution}-${hash}-${timestamp}.png`;
          const imagePath = path.join(config.OUTPUT_DIR, imageName);
    
          fs.writeFileSync(imagePath, Buffer.from(imageData, 'base64'));
          log(`Image saved to: ${imagePath}`, this.name);
    
          // Learn from interaction
          if (this.intelligenceSystem.initialized) {
            try {
              const resultDescription = `Nano Banana Pro image generated (${mode}, ${resolution}): ${imagePath}`;
              await this.intelligenceSystem.learnFromInteraction(
                prompt,
                enhancedPrompt,
                resultDescription,
                context || mode,
                this.name
              );
            } catch (err) {
              log(`Tool Intelligence learning failed: ${err.message}`, this.name);
            }
          }
    
          // Build response
          let finalResponse = `✓ **Nano Banana Pro** image successfully generated\n\n`;
          finalResponse += `**Mode:** ${mode}\n`;
          finalResponse += `**Resolution:** ${resolution}\n`;
          finalResponse += `**Aspect Ratio:** ${aspectRatio}\n`;
          finalResponse += `**Prompt:** "${prompt}"\n`;
          finalResponse += `**Output:** ${imagePath}`;
    
          if (referenceImages.length > 0) {
            finalResponse += `\n**Reference Images:** ${referenceImages.length} image(s)`;
          }
    
          // Cost estimation
          const costEstimate = this.estimateCost(resolution);
          finalResponse += `\n\n💰 **Estimated Cost:** ~$${costEstimate.toFixed(3)}`;
    
          // Mode-specific details
          switch (mode) {
            case 'fusion':
              finalResponse += `\n\n**Fusion Details:** Blended ${referenceImages.length} images with advanced reasoning`;
              break;
            case 'consistency':
              finalResponse += `\n\n**Consistency Details:** Maintained character/style (supports up to 5 characters)`;
              break;
            case 'targeted_edit':
              finalResponse += `\n\n**Edit Details:** Applied precise localized modifications`;
              break;
            case 'template':
              finalResponse += `\n\n**Template Details:** Followed layout and structure from reference`;
              break;
          }
    
          // Nano Banana Pro capabilities note
          finalResponse += `\n\n---\n_Powered by Nano Banana Pro (Gemini 3 Pro Image) via OpenRouter_`;
    
          return {
            content: [
              {
                type: 'text',
                text: finalResponse,
              },
            ],
          };
        }
    
        throw new Error('No image data returned from Nano Banana Pro');
    
      } catch (error) {
        log(`Nano Banana Pro error: ${error.message}`, this.name);
    
        if (error.message.includes('Reference Image Error:')) {
          throw new Error(`${error.message}\n\nSupported formats: ${Object.keys(config.SUPPORTED_IMAGE_MIMES).join(', ')}`);
        } else if (/\bmode\b/i.test(error.message) && error.message.includes('requires')) {
          // Case-insensitive match so 'Fusion mode requires ...' also gets the note below
          throw new Error(`${error.message}\n\nNote: Fusion needs 2+ images, consistency/template need 1+`);
        } else {
          throw new Error(`Nano Banana Pro failed: ${error.message}`);
        }
      }
    }
  • JSON schema defining the input parameters for the tool: prompt (required), mode, resolution, aspect_ratio, reference_images, context.
    {
      type: 'object',
      properties: {
        prompt: {
          type: 'string',
          description: 'Text description of the desired image or editing instruction',
        },
        mode: {
          type: 'string',
          enum: ['fusion', 'consistency', 'targeted_edit', 'template', 'standard'],
          description: 'Generation mode: fusion (blend up to 14 images), consistency (maintain character/style for up to 5 characters), targeted_edit (precise localized edits), template (follow layout), standard (basic generation)',
        },
        resolution: {
          type: 'string',
          enum: ['1k', '2k', '4k'],
          description: 'Output resolution: 1k (1024px), 2k (2048px), or 4k (4096px). Higher resolutions cost more.',
        },
        aspect_ratio: {
          type: 'string',
          enum: ['1:1', '2:3', '3:2', '3:4', '4:3', '4:5', '5:4', '9:16', '16:9', '21:9'],
          description: 'Aspect ratio for the generated image',
        },
        reference_images: {
          type: 'array',
          items: {
            type: 'string',
          },
          description: 'Optional array of file paths to reference images (up to 14 for Nano Banana Pro)',
        },
        context: {
          type: 'string',
          description: 'Optional context for intelligent enhancement (e.g., "professional", "artistic", "infographic")',
        },
      },
      required: ['prompt'],
    },
  • Registration of the NanoBananaProTool instance in the tools registry, which uses the name 'gemini-nano-banana-pro' defined in its constructor.
    registerTool(new NanoBananaProTool(intelligenceSystem, geminiService));
  • Import of the NanoBananaProTool class module.
    const NanoBananaProTool = require('./nano-banana-pro');
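
The registration and import lines above depend on code this page does not show. The following is a minimal sketch of how a name-keyed registry could work, assuming each tool exposes a `name` string and an async `execute` method. The `tools` map, `callTool`, and `StubNanoBananaProTool` are hypothetical stand-ins, not the server's implementation.

```javascript
// Hypothetical registry sketch; the server's real registerTool is not shown on this page.
const tools = new Map();

function registerTool(tool) {
  if (!tool || typeof tool.name !== 'string' || typeof tool.execute !== 'function') {
    throw new Error('A tool needs a string name and an execute() method');
  }
  if (tools.has(tool.name)) {
    throw new Error(`Tool already registered: ${tool.name}`);
  }
  tools.set(tool.name, tool);
}

// Stand-in for NanoBananaProTool: only the tool name and the response shape
// are taken from the page, the body is a placeholder.
class StubNanoBananaProTool {
  constructor() {
    this.name = 'gemini-nano-banana-pro';
  }

  async execute(args) {
    return { content: [{ type: 'text', text: `stub for prompt "${args.prompt}"` }] };
  }
}

registerTool(new StubNanoBananaProTool());

// Look a tool up by name and run it with the caller's arguments.
async function callTool(name, args) {
  const tool = tools.get(name);
  if (!tool) {
    throw new Error(`Unknown tool: ${name}`);
  }
  return tool.execute(args);
}

callTool('gemini-nano-banana-pro', { prompt: 'a red bicycle' })
  .then((result) => console.log(result.content[0].text));
```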
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions key features like resolution options, reference image limits, and 'studio-grade controls', which adds useful context beyond basic generation. However, it doesn't cover important behavioral aspects like rate limits, authentication needs, cost implications (implied by 'Higher resolutions cost more' in schema but not in description), or what happens with invalid inputs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
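
One side effect visible in the handler is that each successful call writes a PNG file into the configured output directory. Below is a minimal sketch of that naming scheme, using a made-up output directory because the real value of `config.OUTPUT_DIR` is not shown on this page. The function name `buildOutputPath` is hypothetical.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

// Mirrors the naming scheme in the handler: an 8-character MD5 prefix of
// prompt + mode + resolution, followed by a millisecond timestamp.
function buildOutputPath(outputDir, prompt, mode, resolution, timestamp = Date.now()) {
  const hash = crypto
    .createHash('md5')
    .update(prompt + mode + resolution)
    .digest('hex')
    .substring(0, 8);
  return path.join(outputDir, `nanobananapro-${mode}-${resolution}-${hash}-${timestamp}.png`);
}

// '/tmp/out' is an assumed directory used only for this example.
console.log(buildOutputPath('/tmp/out', 'a red bicycle', 'standard', '1k'));
```

Because the timestamp is part of the name, repeating the same prompt produces a new file rather than overwriting the previous one, so the output directory grows with every call.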

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured as a single sentence listing key features. It's appropriately sized for a complex tool with 6 parameters, though it could be more front-loaded by starting with the core purpose more clearly. Every phrase adds value by highlighting distinctive capabilities of this specific implementation.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex image generation tool with 6 parameters and no annotations or output schema, the description provides adequate but incomplete context. It covers the tool's high-level capabilities and some key features, but doesn't address important aspects like output format, error conditions, or how it differs from sibling tools. The absence of an output schema means the description should ideally mention what gets returned, but it doesn't.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
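
For reference, the handler shown under Implementation Reference returns a single text item whose body is a Markdown summary that includes an `**Output:**` line with the saved file path. The sketch below shows how a client might read that path. The sample path and the helper `extractOutputPath` are illustrative assumptions, not part of the server.

```javascript
// Shape taken from the handler's return statement: one text item in `content`.
// The prompt and file path below are made up for illustration.
const sampleResult = {
  content: [
    {
      type: 'text',
      text: [
        '✓ **Nano Banana Pro** image successfully generated',
        '',
        '**Mode:** standard',
        '**Resolution:** 1k',
        '**Aspect Ratio:** 1:1',
        '**Prompt:** "a red bicycle"',
        '**Output:** /tmp/output/nanobananapro-standard-1k-0a1b2c3d-1700000000000.png',
      ].join('\n'),
    },
  ],
};

// Hypothetical client-side helper: pull the saved file path out of the summary text.
function extractOutputPath(result) {
  const textItem = (result.content || []).find((item) => item.type === 'text');
  if (!textItem) {
    return null;
  }
  const match = textItem.text.match(/^\*\*Output:\*\* (.+)$/m);
  return match ? match[1].trim() : null;
}

console.log(extractOutputPath(sampleResult));
```

The image itself is not returned inline; a caller has to open the file at the reported path.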

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 6 parameters thoroughly. The description adds minimal parameter semantics beyond what's in the schema - it mentions '4K resolution' and 'up to 14 reference images' which align with schema fields, but doesn't provide additional context about parameter interactions or usage patterns. The baseline of 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
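
One parameter interaction the schema does not express is the link between `mode` and the number of reference images, which the handler enforces at runtime. The sketch below restates those checks as a standalone pre-flight function. The name `checkReferenceImages` is hypothetical; the limits and messages are copied from the handler shown under Implementation Reference.

```javascript
const MAX_REFERENCE_IMAGES = 14;

// Returns an error message, or null when the combination passes the
// same checks the handler applies.
function checkReferenceImages(mode, referenceCount) {
  if (referenceCount > MAX_REFERENCE_IMAGES) {
    return `Nano Banana Pro supports up to 14 reference images, got ${referenceCount}`;
  }
  if (['fusion', 'consistency', 'template'].includes(mode) && referenceCount === 0) {
    return `Mode "${mode}" requires at least one reference image`;
  }
  if (mode === 'fusion' && referenceCount < 2) {
    return 'Fusion mode requires at least 2 reference images';
  }
  return null;
}

console.log(checkReferenceImages('standard', 0)); // null
console.log(checkReferenceImages('fusion', 1));   // Fusion mode requires at least 2 reference images
```

In the handler as shown, `standard` and `targeted_edit` have no minimum, so they pass with zero reference images.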

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Generate professional images with Nano Banana Pro (Gemini 3 Pro Image)'. It specifies the verb ('Generate'), resource ('professional images'), and technology ('Nano Banana Pro/Gemini 3 Pro Image'). However, it doesn't explicitly differentiate from sibling tools like 'gemini-advanced-image' or 'generate_image', which likely serve similar image generation purposes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It lists features like '4K resolution' and 'up to 14 reference images', but doesn't mention sibling tools such as 'gemini-advanced-image' or 'gemini-edit-image' for comparison. There's no explicit when/when-not usage advice or prerequisites stated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Garblesnarff/gemini-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.