DALL-E 3 MCP Server

by chrisurf

generate_image

Generate images from text prompts with DALL-E 3, specifying size, quality, and vivid or natural style.

Instructions

Generate an image using DALL-E 3

Input Schema

| Name        | Required | Description                                | Default   |
|-------------|----------|--------------------------------------------|-----------|
| prompt      | Yes      | Text prompt for image generation           |           |
| output_path | Yes      | Full path where the image should be saved  |           |
| size        | No       | Image size                                 | 1024x1024 |
| quality     | No       | Image quality                              | hd        |
| style       | No       | Image style                                | vivid     |
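For illustration, a minimal call can pass only the two required fields and let the optional ones fall back to the schema defaults. The sketch below mirrors the schema above; the prompt and output path are hypothetical values, not from the source:

```typescript
// Shape of the tool arguments, mirroring the input schema above.
interface GenerateImageArgs {
  prompt: string;
  output_path: string;
  size?: '1024x1024' | '1024x1792' | '1792x1024';
  quality?: 'standard' | 'hd';
  style?: 'vivid' | 'natural';
}

// Hypothetical call supplying only the required parameters.
const args: GenerateImageArgs = {
  prompt: 'A watercolor fox in a snowy forest',
  output_path: '/tmp/fox.png',
};

// Optional fields fall back to the documented defaults.
const { size = '1024x1024', quality = 'hd', style = 'vivid' } = args;
console.log(size, quality, style); // 1024x1024 hd vivid
```

Note that the server applies these same defaults via destructuring inside its handler, so omitting the optional fields and passing them explicitly with default values behave identically.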

Implementation Reference

  • TypeScript interface defining the input schema for the generate_image tool: prompt (string), output_path (string), and optional size, quality, style fields.
    interface GenerateImageArgs {
      prompt: string;
      output_path: string;
      size?: '1024x1024' | '1024x1792' | '1792x1024';
      quality?: 'standard' | 'hd';
      style?: 'vivid' | 'natural';
    }
  • src/index.ts:63-116 (registration)
    Tool registration in setupToolHandlers(): lists 'generate_image' tool in ListToolsRequestSchema handler (lines 67-103) and routes incoming CallToolRequestSchema with name 'generate_image' to the generateImage method (lines 107-115).
    private setupToolHandlers(): void {
      this.server.setRequestHandler(ListToolsRequestSchema, async () => {
        return {
          tools: [
            {
              name: 'generate_image',
              description: 'Generate an image using DALL-E 3',
              inputSchema: {
                type: 'object',
                properties: {
                  prompt: {
                    type: 'string',
                    description: 'Text prompt for image generation',
                  },
                  output_path: {
                    type: 'string',
                    description: 'Full path where the image should be saved',
                  },
                  size: {
                    type: 'string',
                    enum: ['1024x1024', '1024x1792', '1792x1024'],
                    default: '1024x1024',
                    description: 'Image size',
                  },
                  quality: {
                    type: 'string',
                    enum: ['standard', 'hd'],
                    default: 'hd',
                    description: 'Image quality',
                  },
                  style: {
                    type: 'string',
                    enum: ['vivid', 'natural'],
                    default: 'vivid',
                    description: 'Image style',
                  },
                },
                required: ['prompt', 'output_path'],
              },
            },
          ],
        };
      });
    
      this.server.setRequestHandler(CallToolRequestSchema, async (request) => {
        const { name, arguments: args } = request.params;
    
        if (name === 'generate_image') {
          return await this.generateImage(args as unknown as GenerateImageArgs);
        } else {
          throw new McpError(ErrorCode.MethodNotFound, `Unknown tool: ${name}`);
        }
      });
    }
  • The main handler function generateImage that executes the tool logic: validates params, calls OpenAI DALL-E 3 API, downloads the generated image, saves it to disk, and returns success/failure response.
      private async generateImage(args: GenerateImageArgs) {
        const {
          prompt,
          output_path,
          size = '1024x1024',
          quality = 'hd',
          style = 'vivid',
        } = args;
    
        if (!prompt) {
          throw new McpError(ErrorCode.InvalidParams, 'Missing required parameter: prompt');
        }
    
        if (!output_path) {
          throw new McpError(ErrorCode.InvalidParams, 'Missing required parameter: output_path');
        }
    
        const apiKey = process.env.OPENAI_API_KEY;
        if (!apiKey) {
          throw new McpError(ErrorCode.InternalError, 'OPENAI_API_KEY environment variable not set');
        }
    
        try {
          console.error('[DALL-E 3] Starting image generation...');
          console.error('[DALL-E 3] Prompt:', prompt);
          console.error('[DALL-E 3] Output path:', output_path);
    
          const response = await fetch('https://api.openai.com/v1/images/generations', {
            method: 'POST',
            headers: {
              Authorization: `Bearer ${apiKey}`,
              'Content-Type': 'application/json',
            },
            body: JSON.stringify({
              model: 'dall-e-3',
              prompt,
              n: 1,
              size,
              quality,
              style,
            }),
          });
    
          if (!response.ok) {
            const errorText = await response.text();
            console.error('[DALL-E 3] API Error:', errorText);
            throw new McpError(ErrorCode.InternalError, `OpenAI API error: ${response.status} ${response.statusText} - ${errorText}`);
          }
    
          const data = (await response.json()) as OpenAIImageResponse;
          const imageUrl = data.data[0]?.url;
          const revisedPrompt = data.data[0]?.revised_prompt;
    
          if (!imageUrl) {
            throw new McpError(ErrorCode.InternalError, 'No image URL returned from OpenAI API');
          }
    
          console.error('[DALL-E 3] Generated image URL:', imageUrl);
          console.error('[DALL-E 3] Revised prompt:', revisedPrompt);
    
          const imageResponse = await fetch(imageUrl);
          if (!imageResponse.ok) {
            throw new McpError(ErrorCode.InternalError, `Failed to download image: ${imageResponse.status} ${imageResponse.statusText}`);
          }
    
          const imageBuffer = await imageResponse.arrayBuffer();
    
          let finalOutputPath = output_path;
          const stats = await stat(output_path).catch(() => null);
    
          if (stats?.isDirectory() || output_path.endsWith('/') || output_path.endsWith('\\')) {
            const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
            const promptSlug = prompt.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '').substring(0, 50);
            const filename = `dalle3-${promptSlug}-${timestamp}.png`;
            finalOutputPath = path.join(output_path, filename);
          console.error(`[DALL-E 3] Directory detected, using filename: ${filename}`);
          }
    
          const outputDir = path.dirname(finalOutputPath);
          await mkdir(outputDir, { recursive: true });
          await writeFile(finalOutputPath, Buffer.from(imageBuffer));
    
          const imageSizeKB = Math.round(imageBuffer.byteLength / 1024);
    
          console.error(`[DALL-E 3] ✅ Image saved successfully to: ${finalOutputPath}`);
          console.error(`[DALL-E 3] 📏 Image size: ${imageSizeKB} KB`);
    
          return {
            content: [
              {
                type: 'text',
                text: `✅ Image generated successfully!
    
    **Original Prompt:** ${prompt}
    **Revised Prompt:** ${revisedPrompt || 'N/A'}
    **Image URL:** ${imageUrl}
    **Saved to:** ${finalOutputPath}
    **Size:** ${size}
    **Quality:** ${quality}
    **Style:** ${style}
    **File Size:** ${imageSizeKB} KB
    
    The image has been saved to your specified location and is ready to use.`,
              },
            ],
          };
        } catch (error) {
          console.error('[DALL-E 3] Error:', error);
          if (error instanceof McpError) throw error;
          throw new McpError(ErrorCode.InternalError, `Failed to generate image: ${error instanceof Error ? error.message : 'Unknown error'}`);
        }
      }
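When output_path resolves to a directory, the handler above derives a filename from a prompt slug and a timestamp. A standalone sketch of that naming logic (the function name and the injectable clock parameter are ours, added for testability; the regex pipeline matches the handler):

```typescript
// Derive a safe PNG filename from a prompt, mirroring the handler's slug logic.
function dalleFilename(prompt: string, now: Date = new Date()): string {
  // ISO timestamp with ':' and '.' replaced so it is filesystem-safe.
  const timestamp = now.toISOString().replace(/[:.]/g, '-');
  // Lowercase, collapse non-alphanumeric runs to '-', trim edges, cap at 50 chars.
  const promptSlug = prompt
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '')
    .substring(0, 50);
  return `dalle3-${promptSlug}-${timestamp}.png`;
}

console.log(dalleFilename('A cat in space!', new Date('2024-01-02T03:04:05.678Z')));
// dalle3-a-cat-in-space-2024-01-02T03-04-05-678Z.png
```

The 50-character cap keeps long prompts from producing unwieldy paths, and the timestamp makes collisions between repeated prompts unlikely.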
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description should disclose behavioral traits such as the API dependency, cost, rate limits, or side effects. It only states 'using DALL-E 3' without elaboration and fails to inform the agent of the external-service implications.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single concise sentence that is front-loaded and contains no unnecessary words. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and 5 parameters, the description fails to explain return behavior, error handling, or the effect of parameters like size/quality/style. Incomplete for a non-trivial generation tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% coverage with descriptions for all parameters. Description adds no additional parameter context beyond the schema, achieving baseline score.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it generates an image using DALL-E 3, with a specific verb and resource. No siblings exist, so differentiation is not needed.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool or alternatives. Since there are no siblings, lack of exclusion criteria is less critical, but still no usage context provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
