Glama

Submit Image Generation Task

submit_image_generation

Asynchronous image generation: submit a prompt and output path to create images with customizable aspect ratio, size, style, or subject reference. Returns a task ID for batch processing; call task_barrier for final results.

Instructions

Generate images asynchronously. RECOMMENDED: Submit multiple tasks in batch to saturate rate limits, then call task_barrier once to wait for all completions. Returns task ID only - actual files available after task_barrier.
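The submit-then-barrier pattern described above can be sketched with a small in-memory stand-in. This is a minimal illustration, not the server's task manager: `SketchTaskManager`, `TaskResult`, the `img-N` ID format, and the `demo` helper are all hypothetical names chosen for the example.

```typescript
// Hypothetical in-memory stand-in for the submit/barrier flow; not the real implementation.
type TaskResult<T> =
  | { taskId: string; status: "done"; value: T }
  | { taskId: string; status: "failed"; error: string };

class SketchTaskManager<T> {
  private nextId = 1;
  private pending = new Map<string, Promise<TaskResult<T>>>();

  // Schedules the work and returns only an ID, mirroring submit_image_generation.
  submit(work: () => Promise<T>): { taskId: string } {
    const taskId = `img-${this.nextId++}`;
    const settled = Promise.resolve()
      .then(work)
      .then(
        (value): TaskResult<T> => ({ taskId, status: "done", value }),
        (err: unknown): TaskResult<T> => ({
          taskId,
          status: "failed",
          error: err instanceof Error ? err.message : String(err),
        }),
      );
    this.pending.set(taskId, settled);
    return { taskId };
  }

  // Waits for every submitted task at once, mirroring a single task_barrier call.
  async barrier(): Promise<TaskResult<T>[]> {
    const results = await Promise.all(Array.from(this.pending.values()));
    this.pending.clear();
    return results;
  }
}

// Usage: submit a batch first, then wait once for all of it.
async function demo(): Promise<TaskResult<string>[]> {
  const manager = new SketchTaskManager<string>();
  const prompts = ["a red fox", "a blue whale", "a green parrot"];
  for (const prompt of prompts) {
    // Stand-in for real generation: resolves to a made-up output path.
    manager.submit(async () => `/tmp/${prompt.replace(/ /g, "-")}.png`);
  }
  return manager.barrier();
}
```

Because failures are captured as results rather than rejections, one failed task in this sketch does not prevent the barrier from reporting the others.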

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| prompt | Yes | | |
| outputFile | Yes | Absolute path for generated image | |
| aspectRatio | No | Aspect ratio for the image. Options: 1:1, 16:9, 4:3, 3:2, 2:3, 3:4, 9:16, 21:9 | 1:1 |
| customSize | No | Custom image dimensions (width x height in pixels). Range: 512-2048, must be multiples of 8. Total resolution should stay under 2M pixels. Only supported with image-01 model (cannot be used with style parameter). When both customSize and aspectRatio are set, aspectRatio takes precedence | |
| seed | No | Random seed for reproducible results | |
| subjectReference | No | File path to a portrait image for maintaining facial characteristics in generated images. Only supported with image-01 model (cannot be used with style parameter). Provide a clear frontal face photo for best results. Supports local file paths and URLs. Max 10MB, formats: jpg, jpeg, png | |
| style | No | Art style control settings. Uses image-01-live model which does not support customSize or subjectReference parameters. Cannot be combined with customSize or subjectReference | |
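The customSize limits listed above can be checked before submitting a task. The sketch below is illustrative only: the function and constant names are hypothetical, and reading "2M pixels" as 2,000,000 is an assumption. Note that the Zod schema shown in the implementation reference validates each dimension but does not appear to enforce the total-pixel limit, so that check is an extra here.

```typescript
interface CustomSize {
  width: number;
  height: number;
}

// Limits taken from the customSize description; the names are illustrative.
const MIN_DIMENSION = 512;
const MAX_DIMENSION = 2048;
const DIMENSION_STEP = 8;
const MAX_TOTAL_PIXELS = 2_000_000; // assumption: "2M" read as 2,000,000

// Returns a list of human-readable problems; an empty list means the size looks valid.
function customSizeProblems(size: CustomSize): string[] {
  const problems: string[] = [];
  const sides = [
    { name: "width", value: size.width },
    { name: "height", value: size.height },
  ];
  for (const { name, value } of sides) {
    if (value < MIN_DIMENSION || value > MAX_DIMENSION) {
      problems.push(`${name} must be between ${MIN_DIMENSION} and ${MAX_DIMENSION}`);
    }
    if (value % DIMENSION_STEP !== 0) {
      problems.push(`${name} must be a multiple of ${DIMENSION_STEP}`);
    }
  }
  if (size.width * size.height >= MAX_TOTAL_PIXELS) {
    problems.push(`total resolution should stay under ${MAX_TOTAL_PIXELS} pixels`);
  }
  return problems;
}
```

Under these assumptions a 1024 x 1024 request passes, while 2048 x 2048 satisfies both per-dimension rules yet exceeds the total-pixel limit.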

Implementation Reference

  • The main handler for 'submit_image_generation'. It registers the tool via server.registerTool, validates parameters using validateImageParams, submits the task via taskManager.submitImageTask which calls imageService.generateImage, and returns the task ID. On error, it logs via ErrorHandler and returns a user-friendly error message.
    server.registerTool(
      "submit_image_generation",
      {
        title: "Submit Image Generation Task",
        description: "Generate images asynchronously. RECOMMENDED: Submit multiple tasks in batch to saturate rate limits, then call task_barrier once to wait for all completions. Returns task ID only - actual files available after task_barrier.",
        inputSchema: imageGenerationSchema.shape
      },
      async (params: unknown): Promise<ToolResponse> => {
        try {
          const validatedParams = validateImageParams(params);
          const { taskId } = await taskManager.submitImageTask(async () => {
            return await imageService.generateImage(validatedParams);
          });
    
          return {
            content: [{
              type: "text",
              text: `Task ${taskId} submitted`
            }]
          };
        } catch (error: any) {
          ErrorHandler.logError(error, { tool: 'submit_image_generation', params });
          return {
            content: [{
              type: "text",
              text: `❌ Failed to submit image generation task: ${ErrorHandler.formatErrorForUser(error)}`
            }]
          };
        }
      }
    );
  • Input schema definition (imageGenerationSchema) using Zod. Defines the expected parameters: prompt, outputFile, aspectRatio, customSize, seed, subjectReference, and style. Used as inputSchema on line 62 of the registration.
    export const imageGenerationSchema = z.object({
      prompt: z.string()
        .min(1, 'Prompt is required')
        .max(CONSTRAINTS.IMAGE.PROMPT_MAX_LENGTH, `Prompt must not exceed ${CONSTRAINTS.IMAGE.PROMPT_MAX_LENGTH} characters`),
      
      outputFile: filePathSchema.describe('Absolute path for generated image'),
      
      
      aspectRatio: z.enum(CONSTRAINTS.IMAGE.ASPECT_RATIOS as readonly [AspectRatio, ...AspectRatio[]])
        .default('1:1' as AspectRatio)
        .describe(`Aspect ratio for the image. Options: ${CONSTRAINTS.IMAGE.ASPECT_RATIOS.join(', ')}`),
        
      customSize: z.object({
        width: z.number()
          .min(CONSTRAINTS.IMAGE.MIN_DIMENSION)
          .max(CONSTRAINTS.IMAGE.MAX_DIMENSION)
          .multipleOf(CONSTRAINTS.IMAGE.DIMENSION_STEP),
        height: z.number()
          .min(CONSTRAINTS.IMAGE.MIN_DIMENSION)
          .max(CONSTRAINTS.IMAGE.MAX_DIMENSION)
          .multipleOf(CONSTRAINTS.IMAGE.DIMENSION_STEP)
      }).optional().describe('Custom image dimensions (width x height in pixels). Range: 512-2048, must be multiples of 8. Total resolution should stay under 2M pixels. Only supported with image-01 model (cannot be used with style parameter). When both customSize and aspectRatio are set, aspectRatio takes precedence'),
      
        
      seed: positiveIntSchema.optional().describe('Random seed for reproducible results'),
      
      subjectReference: z.string().optional().describe('File path to a portrait image for maintaining facial characteristics in generated images. Only supported with image-01 model (cannot be used with style parameter). Provide a clear frontal face photo for best results. Supports local file paths and URLs. Max 10MB, formats: jpg, jpeg, png'),
      
      style: z.object({
        style_type: z.enum(CONSTRAINTS.IMAGE.STYLE_TYPES as readonly [StyleType, ...StyleType[]])
          .describe(`Art style type. Options: ${CONSTRAINTS.IMAGE.STYLE_TYPES.join(', ')}`),
        style_weight: z.number()
          .min(CONSTRAINTS.IMAGE.STYLE_WEIGHT_MIN, 'Style weight must be greater than 0')
          .max(CONSTRAINTS.IMAGE.STYLE_WEIGHT_MAX, 'Style weight must not exceed 1')
          .default(0.8)
          .describe('Style control weight (0-1]. Higher values apply stronger style effects. Default: 0.8')
      }).optional().describe('Art style control settings. Uses image-01-live model which does not support customSize or subjectReference parameters. Cannot be combined with customSize or subjectReference'),
      
    });
  • Validation function (validateImageParams) that parses and validates the input parameters using the Zod schema, including custom logic to ensure incompatible parameters (style vs customSize/subjectReference) are rejected.
    export function validateImageParams(params: unknown): ImageGenerationParams {
      try {
        const parsed = imageGenerationSchema.parse(params);
        
        // Manual validation for incompatible parameter combinations
        const hasStyle = !!parsed.style;
        const hasCustomSize = !!parsed.customSize;
        const hasSubjectReference = !!parsed.subjectReference;
        
        if (hasStyle && hasCustomSize) {
          throw new Error('Style parameter (image-01-live model) cannot be combined with customSize (image-01 model feature)');
        }
        
        if (hasStyle && hasSubjectReference) {
          throw new Error('Style parameter (image-01-live model) cannot be combined with subjectReference (image-01 model feature)');
        }
        
        return parsed;
      } catch (error) {
        if (error instanceof z.ZodError) {
          const messages = error.errors.map(e => `${e.path.join('.')}: ${e.message}`);
          throw new Error(`Validation failed: ${messages.join(', ')}`);
        }
        throw error;
      }
    }
  • src/index.ts:57-87 (registration)
    The registration is part of the handler block shown above: server.registerTool('submit_image_generation', ...) on lines 57-87 registers the tool with the MCP server.
  • ImageGenerationService.generateImage — the core logic that builds the API payload, makes the request to the Minimax API, processes the response (downloads/saves images), and returns the result. Called by the handler via taskManager.submitImageTask.
    async generateImage(params: ImageGenerationParams): Promise<ImageGenerationResult> {
      try {
        // Build API payload (MCP handles validation)
        const payload = this.buildPayload(params);
        
        // Make API request
        const response = await this.post(API_CONFIG.ENDPOINTS.IMAGE_GENERATION, payload) as ImageGenerationResponse;
        
        // Process response
        return await this.processImageResponse(response, params);
        
      } catch (error: any) {
        const processedError = ErrorHandler.handleAPIError(error);
        ErrorHandler.logError(processedError, { service: 'image', params });
        
        // Throw the error so task manager can properly mark it as failed
        throw processedError;
      }
    }
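The schema descriptions above tie `style` to the image-01-live model and tie `customSize` and `subjectReference` to image-01. The listing does not show `buildPayload`, so the following is only a hypothetical sketch of how a model could be derived from validated parameters. `selectModel`, `ModelRelevantParams`, and the choice of image-01 as the fallback are assumptions, not code from the repository.

```typescript
type ImageModel = "image-01" | "image-01-live";

// Only the fields that influence model choice; a hypothetical subset of the real params.
interface ModelRelevantParams {
  style?: { style_type: string; style_weight?: number };
  customSize?: { width: number; height: number };
  subjectReference?: string;
}

// Picks a model and rejects the combinations that validateImageParams also rejects.
function selectModel(params: ModelRelevantParams): ImageModel {
  if (params.style) {
    if (params.customSize) {
      throw new Error("style (image-01-live) cannot be combined with customSize (image-01)");
    }
    if (params.subjectReference) {
      throw new Error("style (image-01-live) cannot be combined with subjectReference (image-01)");
    }
    return "image-01-live";
  }
  // Assumption: requests without style fall back to image-01.
  return "image-01";
}

// Usage with placeholder values ("some-style" is not a real style type from the source).
const plainModel = selectModel({ customSize: { width: 1024, height: 768 } });
const styledModel = selectModel({ style: { style_type: "some-style", style_weight: 0.8 } });
```

Keeping the conflict checks next to the model choice means a caller cannot obtain a model name for a parameter set the API would reject.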
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description fully carries the burden. It specifies asynchronous generation, that only a task ID is returned, and that files become available after task_barrier. It lacks details on rate limits, idempotency, or cost, but covers the essential behavioral traits for a generation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences: the first gives the primary purpose, the second provides a key recommendation and return behavior. Every sentence adds value, and the structure is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema, the description explains the return value (task ID) and how to get results (task_barrier). It also suggests batch usage. It does not cover error handling or prerequisites, but for a parameter-heavy tool with good schema descriptions, this is largely sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 86% (high), so the schema already documents most parameters. The description adds no parameter-specific information beyond what is in the schema, meeting the baseline of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
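The 86% coverage figure is consistent with the input schema if coverage is taken to mean the share of parameters that carry a description. That definition is an assumption, since the scoring formula is not stated on the page. Six of the seven parameters have a description and `prompt` has none:

```typescript
// Whether each parameter has a description in the input schema table.
const hasDescription: Record<string, boolean> = {
  prompt: false,
  outputFile: true,
  aspectRatio: true,
  customSize: true,
  seed: true,
  subjectReference: true,
  style: true,
};

const parameterNames = Object.keys(hasDescription);
const describedCount = parameterNames.filter((name) => hasDescription[name]).length;

// 6 / 7 = 0.857..., which rounds to 86.
const coveragePercent = Math.round((describedCount / parameterNames.length) * 100);
```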

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description starts with 'Generate images asynchronously,' clearly stating the verb and resource. It distinguishes itself from sibling tool task_barrier by explaining that it returns only a task ID and that actual files are retrieved via task_barrier.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description recommends batch submission to saturate rate limits and using task_barrier for waiting. It provides a clear usage pattern but does not explicitly contrast with submit_speech_generation or state when not to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/PsychArch/minimax-mcp-tools'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.