Glama

generateImage

Create custom images from text descriptions using AI. Specify prompts, dimensions, and parameters to generate visual content.

Instructions

Generate an image based on a prompt

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| prompt | No | | |
| negative_prompt | No | | |
| width | No | | |
| height | No | | |
| steps | No | | |
| seed | No | | |
| guidance_scale | No | | |
| random_string | No | | |
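
Since neither the schema nor the description documents these fields, here is an illustrative set of arguments an agent might pass. Every field is optional, and all values below are assumptions based on common txt2img conventions, not taken from the source:

```typescript
// Illustrative generateImage arguments; every field is optional and
// these particular values are examples, not documented defaults.
const exampleArgs = {
  prompt: "A lighthouse at dusk, watercolor style",
  negative_prompt: "blurry, low quality",
  width: 512,
  height: 512,
  steps: 8,
  seed: 42,
  guidance_scale: 7.5,
};

console.log(JSON.stringify(exampleArgs));
```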

Implementation Reference

  • Core handler function that validates input parameters, merges them with defaults, calls the Draw Things API for txt2img generation, saves the image to disk, and returns a formatted result or error.
    async generateImage(
      inputParams: Partial<ImageGenerationParams> = {}
    ): Promise<DrawThingsGenerationResult> {
      try {
        // handle input params
        let params: Partial<ImageGenerationParams> = {};
    
        // validate params
        try {
          const validationResult = validateImageGenerationParams(inputParams);
          if (validationResult.valid) {
            params = inputParams;
          } else {
            console.error("parameter validation failed, using default params");
          }
        } catch (error) {
          console.error("parameter validation error:", error);
        }
    
        // handle random_string special case
        if (
          params.random_string &&
          (!params.prompt || Object.keys(params).length === 1)
        ) {
          params.prompt = params.random_string;
          delete params.random_string;
        }
    
        // ensure prompt
        if (!params.prompt) {
          params.prompt = inputParams.prompt || defaultParams.prompt;
        }
    
        // merge params
        const requestParams = {
          ...defaultParams,
          ...params,
          seed: params.seed ?? Math.floor(Math.random() * 2147483647),
        };
    
        console.error(`using prompt: "${requestParams.prompt}"`);

        // send request to the Draw Things API
        console.error("sending request to Draw Things API...");
        const response = await this.axios.post(
          "/sdapi/v1/txt2img",
          requestParams
        );
    
        // handle response
        if (
          !response.data ||
          !response.data.images ||
          response.data.images.length === 0
        ) {
          throw new Error("API did not return image data");
        }
    
        // handle image data
        const imageData = response.data.images[0];
    
        // format image data
        const formattedImageData = imageData.startsWith("data:image/")
          ? imageData
          : `data:image/png;base64,${imageData}`;
    
        console.error("image generation succeeded");

        // the actual start time is not tracked; approximate by assuming
        // generation took about 2 seconds
        const startTime = Date.now() - 2000;
        const endTime = Date.now();
        
        // automatically save the generated image
        const timestamp = new Date().toISOString().replace(/[:.]/g, "-");
        const defaultFileName = `generated-image-${timestamp}.png`;
        
        // save the generated image
        const imagePath = await this.saveImage({
          base64Data: formattedImageData,
          fileName: defaultFileName
        });
        
        return {
          isError: false,
          imageData: formattedImageData,
          imagePath: imagePath,
          metadata: {
            alt: `Image generated from prompt: ${requestParams.prompt}`,
            inference_time_ms: endTime - startTime,
          }
        };
      } catch (error) {
        console.error("image generation error:", error);
    
        // error message
        let errorMessage = "unknown error";
    
        if (error instanceof Error) {
          errorMessage = error.message;
        }
    
        // handle axios error
        const axiosError = error as any;
        if (axiosError.response) {
          errorMessage = `API error: ${axiosError.response.status} - ${
            axiosError.response.data?.error || axiosError.message
          }`;
        } else if (axiosError.code === "ECONNREFUSED") {
          errorMessage =
            "Cannot connect to the Draw Things API. Please ensure Draw Things is running and its API is enabled.";
        } else if (axiosError.code === "ETIMEDOUT") {
          errorMessage =
            "Connection to the Draw Things API timed out. Image generation may take longer than expected, or the API is not responding.";
        }
    
        return {
          isError: true,
          errorMessage,
        };
      }
    }
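
The parameter-merge step in the handler can be sketched in isolation. The `defaultParams` object below is a trimmed stand-in for the real defaults, used only to show the spread order and the nullish-coalescing seed fallback:

```typescript
// Trimmed stand-in for the real defaultParams object (illustrative values).
const defaultParams = { prompt: "A cute dog", steps: 8, width: 320, seed: -1 };

// User-supplied params; seed is deliberately omitted here.
const params: { prompt?: string; steps?: number; seed?: number } = {
  prompt: "sunset over the ocean",
};

// Later spreads win, so user params override defaults; the nullish-coalescing
// expression then guarantees a concrete non-negative seed even when the user
// omits one (as the handler does).
const requestParams = {
  ...defaultParams,
  ...params,
  seed: params.seed ?? Math.floor(Math.random() * 2147483647),
};
```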
  • src/index.ts:155-233 (registration)
    MCP server.tool registration for 'generateImage': defines the Zod input schema, normalizes incoming MCP params, delegates to the service handler, and returns a standardized MCP content response.
    server.tool(
      "generateImage",
      "Generate an image based on a prompt",
      paramsSchema,
      async (mcpParams: any) => {
        try {
          log("Received image generation request");
          log(`mcpParams====== ${JSON.stringify(mcpParams)}`);
          // handle ai prompts
          const parameters =
            mcpParams?.params?.arguments || mcpParams?.arguments || mcpParams || {};
    
          if (parameters.prompt) {
            log(`Using provided prompt: ${parameters.prompt}`);
          } else {
            log("No prompt provided, using default");
            parameters.prompt = "A cute dog";
          }
    
          // Generate image
          const result: ImageGenerationResult =
            await drawThingsService.generateImage(parameters);
    
          // Handle generation result
          if (result.isError) {
            log(`Error generating image: ${result.errorMessage}`);
            throw new Error(result.errorMessage || "Unknown error");
          }
    
          if (!result.imageData && (!result.images || result.images.length === 0)) {
            log("No image data returned from generation");
            throw new Error("No image data returned from generation");
          }
    
          const imageData =
            result.imageData ||
            (result.images && result.images.length > 0
              ? result.images[0]
              : undefined);
          if (!imageData) {
            log("No valid image data available");
            throw new Error("No valid image data available");
          }
    
          log("Successfully generated image, returning directly via MCP");
    
          // the actual start time is not tracked; approximate by assuming
          // generation took about 2 seconds (used only as a fallback when
          // the service metadata is missing)
          const startTime = Date.now() - 2000;
          const endTime = Date.now();
    
          // build the response format
          const responseData = {
            image_paths: result.imagePath ? [result.imagePath] : [],
            metadata: {
              alt: `Image generated from prompt: ${parameters.prompt}`,
              inference_time_ms:
                result.metadata?.inference_time_ms || endTime - startTime,
            },
          };
    
          return {
            content: [
              {
                type: "text",
                text: JSON.stringify(responseData, null, 2),
              },
            ],
          };
        } catch (error) {
          log(
            `Error handling image generation: ${
              error instanceof Error ? error.message : String(error)
            }`
          );
          await logError(error);
          throw error;
        }
      }
    );
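
Note that the registration returns the image path as a JSON text block rather than inline image content. The response shape can be reproduced standalone (the path, prompt, and timing below are placeholder values):

```typescript
// Placeholder values standing in for a real generation result.
const imagePath = "images/example-generated-image.png";
const prompt = "A cute dog";
const inferenceTimeMs = 2000;

const responseData = {
  image_paths: [imagePath],
  metadata: {
    alt: `Image generated from prompt: ${prompt}`,
    inference_time_ms: inferenceTimeMs,
  },
};

// Standard MCP content response: a single text block of pretty-printed JSON.
const mcpResponse = {
  content: [{ type: "text", text: JSON.stringify(responseData, null, 2) }],
};
```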
  • TypeScript interface defining the input parameters for image generation, used for type safety in the handler.
    export interface ImageGenerationParams {
      // basic params
      prompt?: string;
      negative_prompt?: string;
    
      // size params
      width?: number;
      height?: number;
    
      // generate control params
      steps?: number;
      seed?: number;
      guidance_scale?: number;
    
      // model params
      model?: string;
      sampler?: string;
    
      // MCP special params
      random_string?: string;
    
      // allow other params
      [key: string]: any;
    }
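
The trailing index signature means callers may pass extra backend-specific keys without type errors. A quick sketch (the interface is abbreviated from above, and `sampler_noise` is a hypothetical extra key, not a documented parameter):

```typescript
// Abbreviated version of ImageGenerationParams; the index signature is the
// part being demonstrated.
interface ImageGenerationParams {
  prompt?: string;
  width?: number;
  [key: string]: any; // allow other params
}

// "sampler_noise" is hypothetical; the index signature lets it through
// without widening the interface.
const p: ImageGenerationParams = {
  prompt: "test",
  width: 512,
  sampler_noise: 0.5,
};
```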
  • Zod schema for input validation in the MCP tool registration.
    const paramsSchema = {
      prompt: z.string().optional(),
      negative_prompt: z.string().optional(),
      width: z.number().optional(),
      height: z.number().optional(),
      steps: z.number().optional(),
      seed: z.number().optional(),
      guidance_scale: z.number().optional(),
      random_string: z.string().optional(),
    };
  • Default parameters used by the handler when user params are incomplete.
    export const defaultParams: ImageGenerationParams = {
      speed_up_with_guidance_embed: true,
      motion_scale: 127,
      image_guidance: 1.5,
      tiled_decoding: false,
      decoding_tile_height: 640,
      negative_prompt_for_image_prior: true,
      batch_size: 1,
      decoding_tile_overlap: 128,
      separate_open_clip_g: false,
      hires_fix_height: 960,
      decoding_tile_width: 640,
      diffusion_tile_height: 1024,
      num_frames: 14,
      stage_2_guidance: 1,
      t5_text_encoder_decoding: true,
      mask_blur_outset: 0,
      resolution_dependent_shift: true,
      model: "flux_1_schnell_q5p.ckpt",
      hires_fix: false,
      strength: 1,
      loras: [],
      diffusion_tile_width: 1024,
      diffusion_tile_overlap: 128,
      original_width: 512,
      seed: -1,
      zero_negative_prompt: false,
      upscaler_scale: 0,
      steps: 8,
      upscaler: null,
      mask_blur: 1.5,
      sampler: "DPM++ 2M AYS",
      width: 320,
      negative_original_width: 512,
      batch_count: 1,
      refiner_model: null,
      shift: 1,
      stage_2_shift: 1,
      open_clip_g_text: null,
      crop_left: 0,
      controls: [],
      start_frame_guidance: 1,
      original_height: 512,
      image_prior_steps: 5,
      guiding_frame_noise: 0.019999999552965164,
      clip_weight: 1,
      clip_skip: 1,
      crop_top: 0,
      negative_original_height: 512,
      preserve_original_after_inpaint: true,
      separate_clip_l: false,
      guidance_embed: 3.5,
      negative_aesthetic_score: 2.5,
      aesthetic_score: 6,
      clip_l_text: null,
      hires_fix_strength: 0.699999988079071,
      guidance_scale: 7.5,
      stochastic_sampling_gamma: 0.3,
      seed_mode: "Scale Alike",
      target_width: 512,
      hires_fix_width: 960,
      tiled_diffusion: false,
      fps: 5,
      refiner_start: 0.8500000238418579,
      height: 512,
      prompt: "A cute koala sitting on a eucalyptus tree, watercolor style, beautiful lighting, detailed",
      negative_prompt: "deformed, distorted, unnatural pose, extra limbs, blurry, low quality, ugly, bad anatomy, poor details, mutated, text, watermark"
    }; 
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It only states the action ('Generate an image') without mentioning any behavioral traits like rate limits, authentication needs, output format, or potential side effects (e.g., resource usage). For a tool with 8 parameters and no annotation coverage, this is inadequate, as it leaves critical operational details unspecified.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with a single sentence, 'Generate an image based on a prompt', which is front-loaded and wastes no words. It efficiently communicates the core function without unnecessary elaboration, making it easy to parse quickly. This is optimal for brevity, though it may sacrifice detail.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (8 parameters, no annotations, no output schema), the description is incomplete. It doesn't explain the return values, error conditions, or how parameters interact, leaving significant gaps for the agent to operate effectively. While conciseness is high, it lacks the necessary context to handle the tool's full scope, making it insufficient for robust use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds no meaning beyond the input schema, which has 8 parameters and 0% schema description coverage. Parameters like 'negative_prompt', 'steps', and 'guidance_scale' are undocumented in both the schema and description, leaving their purposes unclear. This fails to compensate for the low coverage, resulting in poor parameter understanding for the agent.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Generate an image based on a prompt' clearly states the verb ('Generate') and resource ('image'), making the purpose understandable. However, it's somewhat vague about the specific method or technology (e.g., AI model type), and with no siblings, it doesn't need differentiation but lacks depth. This meets the minimum viable standard without being tautological or misleading.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool, such as scenarios for image generation, alternatives, or prerequisites. It merely states the basic function without context, leaving the agent to infer usage. This is a clear gap, as it doesn't help distinguish from potential implicit alternatives or set expectations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
