Glama

generateImage

Create custom images from text descriptions using AI. Specify prompts, dimensions, and parameters to generate visual content.

Instructions

Generate an image based on a prompt

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| prompt | No | | |
| negative_prompt | No | | |
| width | No | | |
| height | No | | |
| steps | No | | |
| seed | No | | |
| guidance_scale | No | | |
| random_string | No | | |
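
Since neither the schema nor the description documents these fields, here is an illustrative set of arguments an agent might pass. Every field is optional, and all values below are assumptions based on common txt2img conventions, not taken from the source:

```typescript
// Illustrative generateImage arguments; every field is optional and
// these particular values are examples, not documented defaults.
const exampleArgs = {
  prompt: "A lighthouse at dusk, watercolor style",
  negative_prompt: "blurry, low quality",
  width: 512,
  height: 512,
  steps: 8,
  seed: 42,
  guidance_scale: 7.5,
};

console.log(JSON.stringify(exampleArgs));
```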

Implementation Reference

  • Core handler function that validates input parameters, merges them with defaults, calls the Draw Things API for txt2img generation, saves the image to disk, and returns a formatted result or error.
    async generateImage(
      inputParams: Partial<ImageGenerationParams> = {}
    ): Promise<DrawThingsGenerationResult> {
      try {
        // handle input params
        let params: Partial<ImageGenerationParams> = {};
    
        // validate params
        try {
          const validationResult = validateImageGenerationParams(inputParams);
          if (validationResult.valid) {
            params = inputParams;
          } else {
            console.error("parameter validation failed, using default params");
          }
        } catch (error) {
          console.error("parameter validation error:", error);
        }
    
        // handle random_string special case
        if (
          params.random_string &&
          (!params.prompt || Object.keys(params).length === 1)
        ) {
          params.prompt = params.random_string;
          delete params.random_string;
        }
    
        // ensure prompt
        if (!params.prompt) {
          params.prompt = inputParams.prompt || defaultParams.prompt;
        }
    
        // merge params
        const requestParams = {
          ...defaultParams,
          ...params,
          seed: params.seed ?? Math.floor(Math.random() * 2147483647),
        };
    
        console.error(`using prompt: "${requestParams.prompt}"`);

        // send request to the Draw Things API
        console.error("sending request to Draw Things API...");
        const response = await this.axios.post(
          "/sdapi/v1/txt2img",
          requestParams
        );
    
        // handle response
        if (
          !response.data ||
          !response.data.images ||
          response.data.images.length === 0
        ) {
          throw new Error("API did not return image data");
        }
    
        // handle image data
        const imageData = response.data.images[0];
    
        // format image data
        const formattedImageData = imageData.startsWith("data:image/")
          ? imageData
          : `data:image/png;base64,${imageData}`;
    
        console.error("image generation succeeded");

        // the actual start time is not tracked; approximate by assuming
        // generation took about 2 seconds
        const startTime = Date.now() - 2000;
        const endTime = Date.now();
        
        // automatically save the generated image
        const timestamp = new Date().toISOString().replace(/[:.]/g, "-");
        const defaultFileName = `generated-image-${timestamp}.png`;
        
        // save the generated image
        const imagePath = await this.saveImage({
          base64Data: formattedImageData,
          fileName: defaultFileName
        });
        
        return {
          isError: false,
          imageData: formattedImageData,
          imagePath: imagePath,
          metadata: {
            alt: `Image generated from prompt: ${requestParams.prompt}`,
            inference_time_ms: endTime - startTime,
          }
        };
      } catch (error) {
        console.error("image generation error:", error);
    
        // error message
        let errorMessage = "unknown error";
    
        if (error instanceof Error) {
          errorMessage = error.message;
        }
    
        // handle axios error
        const axiosError = error as any;
        if (axiosError.response) {
          errorMessage = `API error: ${axiosError.response.status} - ${
            axiosError.response.data?.error || axiosError.message
          }`;
        } else if (axiosError.code === "ECONNREFUSED") {
          errorMessage =
            "Cannot connect to the Draw Things API. Please ensure Draw Things is running and its API is enabled.";
        } else if (axiosError.code === "ETIMEDOUT") {
          errorMessage =
            "Connection to the Draw Things API timed out. Image generation may take longer than expected, or the API is not responding.";
        }
    
        return {
          isError: true,
          errorMessage,
        };
      }
    }
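
The parameter-merge step in the handler can be sketched in isolation. The `defaultParams` object below is a trimmed stand-in for the real defaults, used only to show the spread order and the nullish-coalescing seed fallback:

```typescript
// Trimmed stand-in for the real defaultParams object (illustrative values).
const defaultParams = { prompt: "A cute dog", steps: 8, width: 320, seed: -1 };

// User-supplied params; seed is deliberately omitted here.
const params: { prompt?: string; steps?: number; seed?: number } = {
  prompt: "sunset over the ocean",
};

// Later spreads win, so user params override defaults; the nullish-coalescing
// expression then guarantees a concrete non-negative seed even when the user
// omits one (as the handler does).
const requestParams = {
  ...defaultParams,
  ...params,
  seed: params.seed ?? Math.floor(Math.random() * 2147483647),
};
```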
  • src/index.ts:155-233 (registration)
    MCP server.tool registration for 'generateImage': defines the Zod input schema, normalizes incoming MCP params, delegates to the service handler, and returns a standardized MCP content response.
    server.tool(
      "generateImage",
      "Generate an image based on a prompt",
      paramsSchema,
      async (mcpParams: any) => {
        try {
          log("Received image generation request");
          log(`mcpParams====== ${JSON.stringify(mcpParams)}`);
          // handle ai prompts
          const parameters =
            mcpParams?.params?.arguments || mcpParams?.arguments || mcpParams || {};
    
          if (parameters.prompt) {
            log(`Using provided prompt: ${parameters.prompt}`);
          } else {
            log("No prompt provided, using default");
            parameters.prompt = "A cute dog";
          }
    
          // Generate image
          const result: ImageGenerationResult =
            await drawThingsService.generateImage(parameters);
    
          // Handle generation result
          if (result.isError) {
            log(`Error generating image: ${result.errorMessage}`);
            throw new Error(result.errorMessage || "Unknown error");
          }
    
          if (!result.imageData && (!result.images || result.images.length === 0)) {
            log("No image data returned from generation");
            throw new Error("No image data returned from generation");
          }
    
          const imageData =
            result.imageData ||
            (result.images && result.images.length > 0
              ? result.images[0]
              : undefined);
          if (!imageData) {
            log("No valid image data available");
            throw new Error("No valid image data available");
          }
    
          log("Successfully generated image, returning directly via MCP");
    
          // the actual start time is not tracked; approximate by assuming
          // generation took about 2 seconds (used only as a fallback when
          // the service metadata is missing)
          const startTime = Date.now() - 2000;
          const endTime = Date.now();
    
          // build the response format
          const responseData = {
            image_paths: result.imagePath ? [result.imagePath] : [],
            metadata: {
              alt: `Image generated from prompt: ${parameters.prompt}`,
              inference_time_ms:
                result.metadata?.inference_time_ms || endTime - startTime,
            },
          };
    
          return {
            content: [
              {
                type: "text",
                text: JSON.stringify(responseData, null, 2),
              },
            ],
          };
        } catch (error) {
          log(
            `Error handling image generation: ${
              error instanceof Error ? error.message : String(error)
            }`
          );
          await logError(error);
          throw error;
        }
      }
    );
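
Note that the registration returns the image path as a JSON text block rather than inline image content. The response shape can be reproduced standalone (the path, prompt, and timing below are placeholder values):

```typescript
// Placeholder values standing in for a real generation result.
const imagePath = "images/example-generated-image.png";
const prompt = "A cute dog";
const inferenceTimeMs = 2000;

const responseData = {
  image_paths: [imagePath],
  metadata: {
    alt: `Image generated from prompt: ${prompt}`,
    inference_time_ms: inferenceTimeMs,
  },
};

// Standard MCP content response: a single text block of pretty-printed JSON.
const mcpResponse = {
  content: [{ type: "text", text: JSON.stringify(responseData, null, 2) }],
};
```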
  • TypeScript interface defining the input parameters for image generation, used for type safety in the handler.
    export interface ImageGenerationParams {
      // basic params
      prompt?: string;
      negative_prompt?: string;
    
      // size params
      width?: number;
      height?: number;
    
      // generate control params
      steps?: number;
      seed?: number;
      guidance_scale?: number;
    
      // model params
      model?: string;
      sampler?: string;
    
      // MCP special params
      random_string?: string;
    
      // allow other params
      [key: string]: any;
    }
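
The trailing index signature means callers may pass extra backend-specific keys without type errors. A quick sketch (the interface is abbreviated from above, and `sampler_noise` is a hypothetical extra key, not a documented parameter):

```typescript
// Abbreviated version of ImageGenerationParams; the index signature is the
// part being demonstrated.
interface ImageGenerationParams {
  prompt?: string;
  width?: number;
  [key: string]: any; // allow other params
}

// "sampler_noise" is hypothetical; the index signature lets it through
// without widening the interface.
const p: ImageGenerationParams = {
  prompt: "test",
  width: 512,
  sampler_noise: 0.5,
};
```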
  • Zod schema for input validation in the MCP tool registration.
    const paramsSchema = {
      prompt: z.string().optional(),
      negative_prompt: z.string().optional(),
      width: z.number().optional(),
      height: z.number().optional(),
      steps: z.number().optional(),
      seed: z.number().optional(),
      guidance_scale: z.number().optional(),
      random_string: z.string().optional(),
    };
  • Default parameters used by the handler when user params are incomplete.
    export const defaultParams: ImageGenerationParams = {
      speed_up_with_guidance_embed: true,
      motion_scale: 127,
      image_guidance: 1.5,
      tiled_decoding: false,
      decoding_tile_height: 640,
      negative_prompt_for_image_prior: true,
      batch_size: 1,
      decoding_tile_overlap: 128,
      separate_open_clip_g: false,
      hires_fix_height: 960,
      decoding_tile_width: 640,
      diffusion_tile_height: 1024,
      num_frames: 14,
      stage_2_guidance: 1,
      t5_text_encoder_decoding: true,
      mask_blur_outset: 0,
      resolution_dependent_shift: true,
      model: "flux_1_schnell_q5p.ckpt",
      hires_fix: false,
      strength: 1,
      loras: [],
      diffusion_tile_width: 1024,
      diffusion_tile_overlap: 128,
      original_width: 512,
      seed: -1,
      zero_negative_prompt: false,
      upscaler_scale: 0,
      steps: 8,
      upscaler: null,
      mask_blur: 1.5,
      sampler: "DPM++ 2M AYS",
      width: 320,
      negative_original_width: 512,
      batch_count: 1,
      refiner_model: null,
      shift: 1,
      stage_2_shift: 1,
      open_clip_g_text: null,
      crop_left: 0,
      controls: [],
      start_frame_guidance: 1,
      original_height: 512,
      image_prior_steps: 5,
      guiding_frame_noise: 0.019999999552965164,
      clip_weight: 1,
      clip_skip: 1,
      crop_top: 0,
      negative_original_height: 512,
      preserve_original_after_inpaint: true,
      separate_clip_l: false,
      guidance_embed: 3.5,
      negative_aesthetic_score: 2.5,
      aesthetic_score: 6,
      clip_l_text: null,
      hires_fix_strength: 0.699999988079071,
      guidance_scale: 7.5,
      stochastic_sampling_gamma: 0.3,
      seed_mode: "Scale Alike",
      target_width: 512,
      hires_fix_width: 960,
      tiled_diffusion: false,
      fps: 5,
      refiner_start: 0.8500000238418579,
      height: 512,
      prompt: "A cute koala sitting on a eucalyptus tree, watercolor style, beautiful lighting, detailed",
      negative_prompt: "deformed, distorted, unnatural pose, extra limbs, blurry, low quality, ugly, bad anatomy, poor details, mutated, text, watermark"
    }; 
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It only states the action ('Generate an image') without mentioning any behavioral traits like rate limits, authentication needs, output format, or potential side effects (e.g., resource usage). For a tool with 8 parameters and no annotation coverage, this is inadequate, as it leaves critical operational details unspecified.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with a single sentence, 'Generate an image based on a prompt', which is front-loaded and wastes no words. It efficiently communicates the core function without unnecessary elaboration, making it easy to parse quickly. This is optimal for brevity, though it may sacrifice detail.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (8 parameters, no annotations, no output schema), the description is incomplete. It doesn't explain the return values, error conditions, or how parameters interact, leaving significant gaps for the agent to operate effectively. While conciseness is high, it lacks the necessary context to handle the tool's full scope, making it insufficient for robust use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds no meaning beyond the input schema, which has 8 parameters and 0% schema description coverage. Parameters like 'negative_prompt', 'steps', and 'guidance_scale' are undocumented in both the schema and description, leaving their purposes unclear. This fails to compensate for the low coverage, resulting in poor parameter understanding for the agent.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Generate an image based on a prompt' clearly states the verb ('Generate') and resource ('image'), making the purpose understandable. However, it's somewhat vague about the specific method or technology (e.g., AI model type), and with no siblings, it doesn't need differentiation but lacks depth. This meets the minimum viable standard without being tautological or misleading.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool, such as scenarios for image generation, alternatives, or prerequisites. It merely states the basic function without context, leaving the agent to infer usage. This is a clear gap, as it doesn't help distinguish from potential implicit alternatives or set expectations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
