generate_image

Create images from text descriptions using Stable Diffusion via the Draw Things app. Generated files are saved to disk, with parameters for image size, inference steps, guidance scale, seed, and model selection.

Instructions

Generate an image from a text prompt using the Draw Things app. The image will be saved to disk and the file path returned.

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| prompt | Yes | Text description of the image to generate | — |
| negative_prompt | No | Elements to exclude from the generated image | — |
| width | No | Width of the generated image in pixels | 512 |
| height | No | Height of the generated image in pixels | 512 |
| steps | No | Number of inference steps | 20 |
| cfg_scale | No | Classifier-free guidance scale | 7.5 |
| seed | No | Random seed for reproducibility (-1 for random) | -1 |
| model | No | Model filename to use for generation (use list_models to see available models) | — |
| output_path | No | Custom file path to save the generated image | — |
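The defaults above apply whenever an optional parameter is omitted. A minimal sketch of how they resolve (the `GenerateImageParams` interface and `resolveDefaults` helper here are illustrative, not part of the server):

```typescript
// Hypothetical sketch: how documented defaults resolve when optional
// parameters are omitted. Only `prompt` is required.
interface GenerateImageParams {
  prompt: string;
  negative_prompt?: string;
  width?: number; // 64–2048, default 512
  height?: number; // 64–2048, default 512
  steps?: number; // 1–150, default 20
  cfg_scale?: number; // 1–30, default 7.5
  seed?: number; // -1 for random (default)
  model?: string;
  output_path?: string;
}

function resolveDefaults(params: GenerateImageParams) {
  return {
    ...params,
    width: params.width ?? 512,
    height: params.height ?? 512,
    steps: params.steps ?? 20,
    cfg_scale: params.cfg_scale ?? 7.5,
    seed: params.seed ?? -1,
  };
}

const resolved = resolveDefaults({ prompt: "a watercolor fox in a snowy forest" });
console.log(resolved.width, resolved.steps, resolved.seed); // 512 20 -1
```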

Implementation Reference

  • The main handler function that executes the generate_image tool logic, calling the DrawThingsClient to generate and save images based on parameters.
    export async function generateImage(
      client: DrawThingsClient,
      params: z.infer<typeof generateImageSchema>
    ): Promise<{ type: "text"; text: string }[]> {
      try {
        // Check if server is running first
        const status = await client.checkStatus();
        if (!status.running) {
          return [
            {
              type: "text",
              text: `Error: ${status.message}`,
            },
          ];
        }
    
        // Generate the image
        const response = await client.txt2img({
          prompt: params.prompt,
          negative_prompt: params.negative_prompt,
          width: params.width,
          height: params.height,
          steps: params.steps,
          cfg_scale: params.cfg_scale,
          seed: params.seed,
          model: params.model,
        });
    
        if (!response.images || response.images.length === 0) {
          return [
            {
              type: "text",
              text: "Error: No images were generated",
            },
          ];
        }
    
        // Save the image(s)
        const savedPaths: string[] = [];
        for (let i = 0; i < response.images.length; i++) {
          const outputPath =
            params.output_path && response.images.length === 1
              ? params.output_path
              : undefined;
          const path = await client.saveImage(response.images[i], outputPath);
          savedPaths.push(path);
        }
    
        return [
          {
            type: "text",
            text: JSON.stringify(
              {
                success: true,
                message: `Generated ${savedPaths.length} image(s)`,
                files: savedPaths,
                prompt: params.prompt,
                parameters: {
                  width: params.width || 512,
                  height: params.height || 512,
                  steps: params.steps || 20,
                  cfg_scale: params.cfg_scale || 7.5,
                  seed: params.seed ?? -1,
                },
              },
              null,
              2
            ),
          },
        ];
      } catch (error) {
        const message = error instanceof Error ? error.message : String(error);
        return [
          {
            type: "text",
            text: `Error generating image: ${message}`,
          },
        ];
      }
    }
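    The handler depends only on three client methods (`checkStatus`, `txt2img`, `saveImage`), so it can be exercised without the Draw Things app running. A minimal test double, with the interface shape inferred from the calls above (not the real `DrawThingsClient`):

    ```typescript
    // Minimal mock of the client surface generateImage calls -- a test
    // double whose method shapes are assumed from the handler above.
    interface MockDrawThingsClient {
      checkStatus(): Promise<{ running: boolean; message: string }>;
      txt2img(req: Record<string, unknown>): Promise<{ images: string[] }>;
      saveImage(image: string, outputPath?: string): Promise<string>;
    }

    const mockClient: MockDrawThingsClient = {
      async checkStatus() {
        return { running: true, message: "Draw Things is running" };
      },
      async txt2img() {
        // Return one fake base64 payload instead of hitting the local API.
        return { images: ["aGVsbG8="] };
      },
      async saveImage(_image, outputPath) {
        return outputPath ?? "/tmp/drawthings/output-0.png";
      },
    };

    // Note: the save loop above honors output_path only when exactly one
    // image is returned; additional images fall back to generated paths.
    mockClient.saveImage("aGVsbG8=", "/tmp/custom.png").then((p) => console.log(p)); // /tmp/custom.png
    ```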
  • Zod schema defining the input parameters and validation for the generate_image tool.
    export const generateImageSchema = z.object({
      prompt: z.string().describe("Text description of the image to generate"),
      negative_prompt: z
        .string()
        .optional()
        .describe("Elements to exclude from the generated image"),
      width: z
        .number()
        .int()
        .min(64)
        .max(2048)
        .optional()
        .describe("Width of the generated image in pixels (default: 512)"),
      height: z
        .number()
        .int()
        .min(64)
        .max(2048)
        .optional()
        .describe("Height of the generated image in pixels (default: 512)"),
      steps: z
        .number()
        .int()
        .min(1)
        .max(150)
        .optional()
        .describe("Number of inference steps (default: 20)"),
      cfg_scale: z
        .number()
        .min(1)
        .max(30)
        .optional()
        .describe("Classifier-free guidance scale (default: 7.5)"),
      seed: z
        .number()
        .int()
        .optional()
        .describe("Random seed for reproducibility (-1 for random)"),
      model: z
        .string()
        .optional()
        .describe("Model filename to use for generation (use list_models to see available models)"),
      output_path: z
        .string()
        .optional()
        .describe("Custom file path to save the generated image"),
    });
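    The numeric bounds in the schema amount to a handful of range checks. Restated in plain TypeScript for readers unfamiliar with Zod (`validateRange` is a hypothetical helper, not part of the server):

    ```typescript
    // Illustrative re-statement of the numeric constraints the Zod schema
    // enforces; validateRange is a hypothetical helper for exposition.
    function validateRange(
      name: string,
      value: number,
      min: number,
      max: number,
      integer: boolean
    ): string[] {
      const errors: string[] = [];
      if (integer && !Number.isInteger(value)) {
        errors.push(`${name} must be an integer`);
      }
      if (value < min || value > max) {
        errors.push(`${name} must be between ${min} and ${max}`);
      }
      return errors;
    }

    // Same bounds as the schema: width/height 64-2048 (int),
    // steps 1-150 (int), cfg_scale 1-30 (may be fractional).
    const problems = [
      ...validateRange("width", 4096, 64, 2048, true),
      ...validateRange("steps", 20, 1, 150, true),
      ...validateRange("cfg_scale", 7.5, 1, 30, false),
    ];
    console.log(problems); // ["width must be between 64 and 2048"]
    ```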
  • src/index.ts:57-65 (registration)
    Registration of the generate_image tool on the MCP server, linking the name, description, schema, and handler.
    server.tool(
      "generate_image",
      generateImageDescription,
      generateImageSchema.shape,
      async (params) => {
        const result = await generateImage(client, params);
        return { content: result };
      }
    );
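    Once registered, the tool is invoked over JSON-RPC. A sketch of the request an MCP client would send (shape per the MCP tools/call convention; argument values are illustrative):

    ```typescript
    // Illustrative MCP tools/call payload for invoking generate_image.
    // The envelope follows the JSON-RPC 2.0 shape MCP uses; values are examples.
    const callRequest = {
      jsonrpc: "2.0",
      id: 1,
      method: "tools/call",
      params: {
        name: "generate_image",
        arguments: {
          prompt: "a lighthouse at dusk, oil painting",
          negative_prompt: "blurry, low quality",
          steps: 30,
          cfg_scale: 7.5,
          seed: 42,
        },
      },
    };
    console.log(callRequest.params.name); // generate_image
    ```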
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It states that the image 'will be saved to disk and the file path returned,' which is useful context about output behavior. However, it doesn't mention potential side effects like disk space usage, performance implications (e.g., time-intensive generation), or error conditions (e.g., invalid prompts).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose. It avoids unnecessary words and gets straight to the point. However, it could separate the action from the output behavior more explicitly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with 9 parameters and no output schema, the description is minimally adequate. It covers the basic purpose and output behavior but lacks details on error handling, performance characteristics, or integration with sibling tools. Without annotations or an output schema, more context would be helpful for an AI agent to use this tool effectively in varied scenarios.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, so all parameters are documented in the schema. The description adds no additional parameter semantics beyond what's in the schema. It doesn't explain relationships between parameters (e.g., how 'steps' affects quality vs. speed) or provide usage examples. The baseline of 3 is appropriate given the comprehensive schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Generate an image'), the resource involved ('from a text prompt'), and the tool used ('using the Draw Things app'). It distinguishes itself from sibling tools like 'transform_image' by focusing on generation rather than modification, and from 'check_status' and 'get_config' by being a creative operation rather than informational.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention any prerequisites, constraints, or scenarios where other tools might be more appropriate. For example, it doesn't clarify if 'transform_image' should be used for editing existing images instead of generating new ones.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
