mcp_imagen_generate

Generate high-quality images from English text prompts using the Google Imagen 3 model. Supports photorealistic, artistic, and style-specific outputs, and every generated image carries a SynthID watermark. Useful for custom visuals where you need control over aspect ratio and person generation.

Instructions

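Generates high-quality images from text prompts using the Google Imagen 3 model. Imagen 3 excels at photorealism, artistic detail, and specific art styles (impressionism, anime, and so on). Generated images always include a SynthID watermark. Only English prompts are currently supported.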

Input Schema

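Name | Required | Description | Default
aspectRatio | No | Image aspect ratio | 1:1
fileName | No | Image file name to save as (without extension) |
model | No | Imagen model ID to use (e.g. imagen-3.0-generate-002) | imagen-3.0-generate-002
numberOfImages | No | Number of images to generate (1-4) | 1
personGeneration | No | Whether to allow generating images of people (DONT_ALLOW: block images of people, ALLOW_ADULT: allow images of adults only) | ALLOW_ADULT
prompt | Yes | Text prompt for image generation. Write it in English. |
saveDir | No | Directory in which to save images | ./temp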
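
The defaults and constraints in this schema can be applied on the caller's side before invoking the tool. The following is a minimal sketch, not code from the server: `normalizeImagenArgs` and its types are hypothetical names, and the integer check on `numberOfImages` is an added assumption (the schema only declares `type: 'number'` with a 1 to 4 range).

```typescript
// Hypothetical client-side helper (not part of the server's source).
// It fills in the schema's documented defaults and rejects values that
// fall outside the documented range and enums.
interface ImagenArgs {
  prompt: string;
  model?: string;
  numberOfImages?: number;
  aspectRatio?: string;
  personGeneration?: string;
  saveDir?: string;
  fileName?: string;
}

interface NormalizedImagenArgs {
  prompt: string;
  model: string;
  numberOfImages: number;
  aspectRatio: string;
  personGeneration: string;
  saveDir: string;
  fileName: string | undefined;
}

const ASPECT_RATIOS: readonly string[] = ['1:1', '3:4', '4:3', '9:16', '16:9'];
const PERSON_MODES: readonly string[] = ['DONT_ALLOW', 'ALLOW_ADULT'];

function normalizeImagenArgs(args: ImagenArgs): NormalizedImagenArgs {
  if (typeof args.prompt !== 'string' || args.prompt.trim() === '') {
    throw new Error('prompt is required');
  }
  // Same fallback as the handler: an empty or missing model uses the default.
  const model = args.model || 'imagen-3.0-generate-002';
  if (model.indexOf('imagen') === -1) {
    throw new Error('model name must contain "imagen"');
  }
  const numberOfImages = args.numberOfImages ?? 1;
  // Assumption: only whole numbers make sense for an image count.
  if (Math.floor(numberOfImages) !== numberOfImages || numberOfImages < 1 || numberOfImages > 4) {
    throw new Error('numberOfImages must be an integer from 1 to 4');
  }
  const aspectRatio = args.aspectRatio ?? '1:1';
  if (ASPECT_RATIOS.indexOf(aspectRatio) === -1) {
    throw new Error('unsupported aspectRatio: ' + aspectRatio);
  }
  const personGeneration = args.personGeneration ?? 'ALLOW_ADULT';
  if (PERSON_MODES.indexOf(personGeneration) === -1) {
    throw new Error('unsupported personGeneration: ' + personGeneration);
  }
  return {
    prompt: args.prompt,
    model,
    numberOfImages,
    aspectRatio,
    personGeneration,
    saveDir: args.saveDir ?? './temp',
    fileName: args.fileName,
  };
}
```

Calling `normalizeImagenArgs({ prompt: 'a red fox' })` yields the default model, one image, a `1:1` ratio, `ALLOW_ADULT`, and `./temp` as the save directory; `fileName` stays undefined so the server can pick its own.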

Implementation Reference

  • Handler function for the 'mcp_imagen_generate' tool. Validates that an Imagen model is used and delegates image generation to geminiService.generateImage, returning the result as JSON.
    async handler(args: any): Promise<ToolResponse> {
      try {
        // Uses the generateImage method, but pinned to an Imagen model
        const modelName = args.model || 'imagen-3.0-generate-002';
        if (!modelName.includes('imagen')) {
          throw new Error('This tool only supports Imagen models. The model name must contain "imagen".');
        }
        
        const result = await geminiService.generateImage({
          ...args,
          model: modelName
        });
        
        return {
          content: [{
            type: 'text',
            text: JSON.stringify(result, null, 2)
          }]
        };
      } catch (error) {
        return {
          content: [{
            type: 'text',
            text: `Imagen image generation error: ${error instanceof Error ? error.message : String(error)}`
          }]
        };
      }
    }
  • Input schema for 'mcp_imagen_generate' defining parameters like prompt, model, numberOfImages, aspectRatio, etc.
    inputSchema: {
      type: 'object',
      properties: {
        model: {
          type: 'string',
          description: 'Imagen model ID to use (e.g. imagen-3.0-generate-002)',
          default: 'imagen-3.0-generate-002',
        },
        prompt: {
          type: 'string',
          description: 'Text prompt for image generation. Write it in English.',
        },
        numberOfImages: {
          type: 'number',
          description: 'Number of images to generate (1-4)',
          default: 1,
          minimum: 1,
          maximum: 4,
        },
        aspectRatio: {
          type: 'string',
          description: 'Image aspect ratio',
          default: '1:1',
          enum: ['1:1', '3:4', '4:3', '9:16', '16:9']
        },
        personGeneration: {
          type: 'string',
          description: 'Whether to allow generating images of people (DONT_ALLOW: block images of people, ALLOW_ADULT: allow images of adults only)',
          default: 'ALLOW_ADULT',
          enum: ['DONT_ALLOW', 'ALLOW_ADULT']
        },
        saveDir: {
          type: 'string',
          description: 'Directory in which to save images',
          default: './temp',
        },
        fileName: {
          type: 'string',
          description: 'Image file name to save as (without extension)',
        }
      },
      required: ['prompt']
    },
  • src/index.ts:51-51 (registration)
    Tool capability registration in MCP server capabilities object (disabled with false).
    mcp_imagen_generate: false,
  • Core helper function generateImageWithImagen that makes the actual API call to Google Imagen for image generation and saves the images to disk.
    private async generateImageWithImagen({
      model,
      prompt,
      numberOfImages = 1,
      aspectRatio = '1:1',
      personGeneration = 'ALLOW_ADULT',
      saveDir = './temp',
      fileName = `imagen-${Date.now()}`,
    }: {
      model: string;
      prompt: string;
      numberOfImages?: number;
      aspectRatio?: string;
      personGeneration?: string;
      saveDir?: string;
      fileName?: string;
    }) {
      const config = this.getRequestConfig();
      const url = `${this.baseUrl}/models/${model}:generateImages`;
    
      const response = await axios.post(
        url,
        {
          prompt,
          config: {
            numberOfImages,
            aspectRatio,
            personGeneration
          }
        },
        config
      );
    
      // Process the image response
      const generatedImages = response.data.generatedImages || [];
      const savedFiles = [];
    
      const fs = await import('fs');
      const path = await import('path');
    
      // Create the save directory if it does not exist
      if (!fs.existsSync(saveDir)) {
        fs.mkdirSync(saveDir, { recursive: true });
      }
    
      // Save the images
      for (let i = 0; i < generatedImages.length; i++) {
        const imageData = generatedImages[i]?.image?.imageBytes;
        if (imageData) {
          const buffer = Buffer.from(imageData, 'base64');
          const filePath = path.join(saveDir, `${fileName}-${i + 1}.png`);
          fs.writeFileSync(filePath, buffer);
          savedFiles.push(filePath);
        }
      }
    
      return {
        model: model,
        prompt: prompt,
        images: savedFiles,
        count: savedFiles.length,
        text: [],
      };
    }
  • Full tool registration object for 'mcp_imagen_generate' in the exported tools array.
    {
      name: 'mcp_imagen_generate',
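      description: 'Generates high-quality images from text prompts using the Google Imagen 3 model. Imagen 3 excels at photorealism, artistic detail, and specific art styles (impressionism, anime, and so on). Generated images always include a SynthID watermark. Only English prompts are currently supported.',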
      inputSchema: {
        type: 'object',
        properties: {
          model: {
            type: 'string',
            description: 'Imagen model ID to use (e.g. imagen-3.0-generate-002)',
            default: 'imagen-3.0-generate-002',
          },
          prompt: {
            type: 'string',
            description: 'Text prompt for image generation. Write it in English.',
          },
          numberOfImages: {
            type: 'number',
            description: 'Number of images to generate (1-4)',
            default: 1,
            minimum: 1,
            maximum: 4,
          },
          aspectRatio: {
            type: 'string',
            description: 'Image aspect ratio',
            default: '1:1',
            enum: ['1:1', '3:4', '4:3', '9:16', '16:9']
          },
          personGeneration: {
            type: 'string',
            description: 'Whether to allow generating images of people (DONT_ALLOW: block images of people, ALLOW_ADULT: allow images of adults only)',
            default: 'ALLOW_ADULT',
            enum: ['DONT_ALLOW', 'ALLOW_ADULT']
          },
          saveDir: {
            type: 'string',
            description: 'Directory in which to save images',
            default: './temp',
          },
          fileName: {
            type: 'string',
            description: 'Image file name to save as (without extension)',
          }
        },
        required: ['prompt']
      },
      async handler(args: any): Promise<ToolResponse> {
        try {
          // Uses the generateImage method, but pinned to an Imagen model
          const modelName = args.model || 'imagen-3.0-generate-002';
          if (!modelName.includes('imagen')) {
            throw new Error('This tool only supports Imagen models. The model name must contain "imagen".');
          }
          
          const result = await geminiService.generateImage({
            ...args,
            model: modelName
          });
          
          return {
            content: [{
              type: 'text',
              text: JSON.stringify(result, null, 2)
            }]
          };
        } catch (error) {
          return {
            content: [{
              type: 'text',
              text: `Imagen image generation error: ${error instanceof Error ? error.message : String(error)}`
            }]
          };
        }
      }
    },
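
The request shape and output file naming used by `generateImageWithImagen` can be isolated into two small pure functions. This is a sketch for illustration only: `buildImagenRequest` and `outputFileNames` are hypothetical names, and the base URL in the usage note is a placeholder, since the page does not show what `this.baseUrl` is set to.

```typescript
// Sketch only: mirrors the URL, request body, and file naming seen in
// generateImageWithImagen. The function names are hypothetical.
interface ImagenRequestOptions {
  model: string;
  prompt: string;
  numberOfImages?: number;
  aspectRatio?: string;
  personGeneration?: string;
}

function buildImagenRequest(baseUrl: string, options: ImagenRequestOptions) {
  // Same defaults as the helper's destructured parameters.
  const {
    model,
    prompt,
    numberOfImages = 1,
    aspectRatio = '1:1',
    personGeneration = 'ALLOW_ADULT',
  } = options;
  return {
    url: `${baseUrl}/models/${model}:generateImages`,
    body: { prompt, config: { numberOfImages, aspectRatio, personGeneration } },
  };
}

// One name per image, numbered from 1, always with a .png extension.
// Assumes every returned image carries image bytes; the helper skips
// entries without them, which would leave gaps in the numbering.
function outputFileNames(fileName: string, imageCount: number): string[] {
  const names: string[] = [];
  for (let i = 0; i < imageCount; i++) {
    names.push(`${fileName}-${i + 1}.png`);
  }
  return names;
}
```

For example, `buildImagenRequest('https://example.invalid/v1', { model: 'imagen-3.0-generate-002', prompt: 'a red fox' })` targets `https://example.invalid/v1/models/imagen-3.0-generate-002:generateImages`, and `outputFileNames('shot', 2)` gives `shot-1.png` and `shot-2.png`, which the helper then joins onto `saveDir`.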
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It does well by mentioning: 1) the SynthID watermark that will always be included, 2) the current limitation to English prompts only, and 3) the model's specific strengths. However, it doesn't mention potential rate limits, authentication requirements, or error conditions that might be relevant for an image generation API.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly structured and concise - four sentences that each add distinct value: 1) core functionality, 2) model capabilities, 3) watermark information, 4) language limitation. No wasted words, front-loaded with the most important information first.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 7-parameter tool with no annotations and no output schema, the description does quite well. It covers the model's capabilities, important behavioral constraints (watermark, language), and gives context about when to use it. The main gap is the lack of information about what the tool returns (image format, file location, error responses) since there's no output schema.
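
Although there is no output schema, the implementation reference suggests what a caller receives: on success the handler returns a single text item containing pretty-printed JSON, and on failure a single text item containing a plain error message. A minimal sketch of how a client might tell the two apart follows. `ImagenResult`, `ToolTextResponse`, and `parseImagenResponse` are hypothetical names, and the result fields assume that `geminiService.generateImage` passes through the object built by `generateImageWithImagen`, which the page does not confirm.

```typescript
// Assumed result shape, copied from the object returned by
// generateImageWithImagen in the implementation reference.
interface ImagenResult {
  model: string;
  prompt: string;
  images: string[]; // file paths of the saved PNGs
  count: number;
  text: string[];
}

// Minimal stand-in for the handler's ToolResponse type.
interface ToolTextResponse {
  content: { type: string; text: string }[];
}

type ParsedImagenResponse =
  | { kind: 'result'; result: ImagenResult }
  | { kind: 'error'; message: string };

function parseImagenResponse(response: ToolTextResponse): ParsedImagenResponse {
  const first = response.content[0];
  const text = first ? first.text : '';
  try {
    const value = JSON.parse(text);
    // Treat the payload as a result only if it has the expected fields.
    if (
      value !== null &&
      typeof value === 'object' &&
      Array.isArray(value.images) &&
      typeof value.count === 'number'
    ) {
      return { kind: 'result', result: value as ImagenResult };
    }
  } catch (e) {
    // Not JSON: the handler's catch branch returns a plain message instead.
  }
  return { kind: 'error', message: text };
}
```

Parsing the JSON rather than matching on the error message prefix keeps the check independent of the message wording.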

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 7 parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema. It mentions English prompts, which aligns with the prompt parameter's description, but this is redundant. Baseline 3 is appropriate when the schema does all the parameter documentation work.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: it generates high-quality images from text prompts using the Google Imagen 3 model. It names the exact model and distinguishes itself from sibling tools by mentioning Imagen 3's specific strengths (photorealism, artistic detail, specific art styles).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context about when to use this tool: for generating images with photorealism, artistic detail, or specific art styles using Imagen 3. It also states that only English prompts are currently supported, which is important usage guidance. However, it doesn't explicitly compare the tool with alternatives such as the various Gemini image generation siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/bigdata-coss/agent_mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.