generate

Create images from text prompts using Flux AI models, with customizable aspect ratios and dimensions for visual content generation.

Instructions

Generate an image from a text prompt

Input Schema

Name         | Required | Description                                   | Default
prompt       | Yes      | Text prompt for image generation              | -
model        | No       | Model to use for generation                   | flux.1.1-pro
aspect_ratio | No       | Aspect ratio of the output image              | -
width        | No       | Image width (ignored if aspect-ratio is set)  | -
height       | No       | Image height (ignored if aspect-ratio is set) | -
output       | No       | Output filename                               | generated.jpg
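For illustration, a caller-side check of these arguments might look like the sketch below. The allowed values and defaults come from the input schema above; the helper name `check_generate_args` is hypothetical, and the server performs its own validation regardless.

```python
# Hypothetical client-side check mirroring the published schema.
ALLOWED_MODELS = {"flux.1.1-pro", "flux.1-pro", "flux.1-dev", "flux.1.1-ultra"}
ALLOWED_RATIOS = {"1:1", "4:3", "3:4", "16:9", "9:16"}

def check_generate_args(args: dict) -> dict:
    """Validate a 'generate' argument dict and fill schema defaults."""
    if not args.get("prompt"):
        raise ValueError("'prompt' is required")
    # Apply schema defaults, then let caller-supplied values win
    out = {"model": "flux.1.1-pro", "output": "generated.jpg", **args}
    if out["model"] not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {out['model']}")
    if "aspect_ratio" in out and out["aspect_ratio"] not in ALLOWED_RATIOS:
        raise ValueError(f"unknown aspect ratio: {out['aspect_ratio']}")
    return out

args = check_generate_args({"prompt": "a red fox in snow", "aspect_ratio": "16:9"})
```

Note that when `aspect_ratio` is supplied, any `width`/`height` values are accepted but ignored downstream.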

Implementation Reference

  • MCP tool handler for 'generate': validates args, builds CLI command, executes Python fluxcli.py generate, returns output.
    case 'generate': {
        const args = request.params.arguments as GenerateArgs;
        // Validate required fields
        const prompt = this.validateRequiredString(args.prompt, 'prompt');
    
        // Validate optional numeric fields
        const width = this.validateNumber(args.width, 'width', 256, 2048);
        const height = this.validateNumber(args.height, 'height', 256, 2048);
    
        const cmdArgs = ['generate'];
        cmdArgs.push('--prompt', prompt);
        if (args.model) cmdArgs.push('--model', args.model);
        if (args.aspect_ratio) cmdArgs.push('--aspect-ratio', args.aspect_ratio);
        if (width) cmdArgs.push('--width', width.toString());
        if (height) cmdArgs.push('--height', height.toString());
        if (args.output) cmdArgs.push('--output', args.output);
    
        const output = await this.runPythonCommand(cmdArgs);
        return {
            content: [{ type: 'text', text: output }],
        };
    }
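The handler above appends a CLI flag only for arguments that are actually present. As a rough sketch, the same arg-to-flag mapping could be written in Python like this (the helper name `build_cli_args` is mine, not part of the project):

```python
def build_cli_args(args: dict) -> list:
    """Map 'generate' tool arguments to fluxcli.py flags (illustrative sketch)."""
    cmd = ["generate", "--prompt", args["prompt"]]
    # Optional fields become flags only when present, matching the TS handler
    for key, flag in [("model", "--model"), ("aspect_ratio", "--aspect-ratio"),
                      ("width", "--width"), ("height", "--height"),
                      ("output", "--output")]:
        if args.get(key):
            cmd += [flag, str(args[key])]
    return cmd
```

One consequence of this pattern is that falsy values (e.g. `width=0`) are silently dropped, which is harmless here because the handler already rejects widths below 256.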
  • src/index.ts:131-168 (registration)
    Registration of the 'generate' tool in ListTools response, including name, description, and input schema.
    {
        name: 'generate',
        description: 'Generate an image from a text prompt',
        inputSchema: {
            type: 'object',
            properties: {
                prompt: {
                    type: 'string',
                    description: 'Text prompt for image generation',
                },
                model: {
                    type: 'string',
                    description: 'Model to use for generation',
                    enum: ['flux.1.1-pro', 'flux.1-pro', 'flux.1-dev', 'flux.1.1-ultra'],
                    default: 'flux.1.1-pro',
                },
                aspect_ratio: {
                    type: 'string',
                    description: 'Aspect ratio of the output image',
                    enum: ['1:1', '4:3', '3:4', '16:9', '9:16'],
                },
                width: {
                    type: 'number',
                    description: 'Image width (ignored if aspect-ratio is set)',
                },
                height: {
                    type: 'number',
                    description: 'Image height (ignored if aspect-ratio is set)',
                },
                output: {
                    type: 'string',
                    description: 'Output filename',
                    default: 'generated.jpg',
                },
            },
            required: ['prompt'],
        },
    },
  • TypeScript interface defining the input arguments for the 'generate' tool.
    export interface GenerateArgs {
      prompt: string;
      model?: FluxModel;
      aspect_ratio?: AspectRatio;
      width?: number;
      height?: number;
      output?: string;
    }
  • Helper method to execute the Python CLI script (fluxcli.py) with given arguments, capturing output or error.
    private async runPythonCommand(args: string[]): Promise<string> {
        return new Promise((resolve, reject) => {
            // Validate arguments
            if (!args || args.length === 0) {
                reject(new Error('No command arguments provided'));
                return;
            }
    
            // Use python from virtual environment if available
            const pythonPath = process.env.VIRTUAL_ENV ?
                `${process.env.VIRTUAL_ENV}/bin/python` : 'python3';
    
            const childProcess = spawn(pythonPath, ['fluxcli.py', ...args], {
                cwd: this.fluxPath,
                env: process.env, // Pass through all environment variables
            });
    
            let output = '';
            let errorOutput = '';
    
            childProcess.stdout?.on('data', (data) => {
                output += data.toString();
            });
    
            childProcess.stderr?.on('data', (data) => {
                errorOutput += data.toString();
            });
    
            childProcess.on('error', (error) => {
                reject(new Error(`Failed to spawn Python process: ${error.message}`));
            });
    
            childProcess.on('close', (code) => {
                if (code === 0) {
                    resolve(output);
                } else {
                    reject(new Error(`Flux command failed (exit code ${code}): ${errorOutput}`));
                }
            });
        });
    }
  • Core image generation logic in Python CLI: constructs payload for FLUX API endpoint based on model, submits request, polls for result URL.
    def generate_image(self, prompt: str, model: str = "flux.1.1-pro", width: int = None, height: int = None, aspect_ratio: str = None) -> Optional[str]:
        """Generate an image using any FLUX model."""
        endpoint = {
            "flux.1.1-pro": "/v1/flux-pro-1.1",
            "flux.1-pro": "/v1/flux-pro",
            "flux.1-dev": "/v1/flux-dev",
            "flux.1.1-ultra": "/v1/flux-pro-1.1-ultra",
        }.get(model)
        
        if not endpoint:
            raise ValueError(f"Unknown model: {model}")
        
        # Set default dimensions based on aspect ratio if provided
        if aspect_ratio:
            if aspect_ratio == '1:1':
                width, height = 1024, 1024
            elif aspect_ratio == '4:3':
                width, height = 1024, 768
            elif aspect_ratio == '3:4':
                width, height = 768, 1024
            elif aspect_ratio == '16:9':
                width, height = 1024, 576
            elif aspect_ratio == '9:16':
                width, height = 576, 1024
        else:
            # Fall back to defaults for any dimension not provided
            width = width or 1024
            height = height or 768
        
        payload = {
            "prompt": prompt,
            "width": width,
            "height": height,
            "aspect_ratio": aspect_ratio if aspect_ratio else None
        }
        response = requests.post(
            f"{self.base_url}{endpoint}",
            json=payload,
            headers=self.headers
        )
        # Surface HTTP errors before trying to parse the response body
        response.raise_for_status()
        
        task_id = response.json().get('id')
        if not task_id:
            print("Failed to start generation task")
            return None
            
        result = self.get_task_result(task_id)
        if result and result.get('result', {}).get('sample'):
            return result['result']['sample']
        return None
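The if/elif chain above maps each supported aspect ratio to fixed dimensions. The same logic can be expressed as a table-driven sketch (the mapping values are taken directly from `generate_image`; the function name `resolve_dims` is mine):

```python
# Ratio-to-dimension mapping, as hard-coded in generate_image
RATIO_DIMS = {
    "1:1": (1024, 1024),
    "4:3": (1024, 768),
    "3:4": (768, 1024),
    "16:9": (1024, 576),
    "9:16": (576, 1024),
}

def resolve_dims(aspect_ratio=None, width=None, height=None):
    """Resolve final dimensions: aspect ratio wins; else default to 1024x768."""
    if aspect_ratio:
        return RATIO_DIMS[aspect_ratio]
    return (width or 1024, height or 768)
```

This makes the precedence explicit: an aspect ratio overrides any explicit width/height, which matches the "ignored if aspect-ratio is set" note in the input schema.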
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure but offers minimal information. It mentions generation but doesn't cover critical aspects like whether this is a read-only or destructive operation, potential rate limits, authentication needs, or what the output entails (e.g., image format, storage location). This leaves significant gaps in understanding the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded with a single, clear sentence that directly states the tool's core function. There is no wasted language or redundancy, making it efficient and easy to parse, though this brevity contributes to gaps in other dimensions like guidelines and transparency.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of a 6-parameter image generation tool with no annotations and no output schema, the description is incomplete. It fails to address behavioral traits, usage context, or output details (e.g., what is returned, error handling), leaving the agent under-informed for effective tool invocation in a real-world scenario.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds no parameter semantics beyond what the input schema already provides, as schema description coverage is 100%. The schema thoroughly documents all 6 parameters, including enums for 'model' and 'aspect_ratio', defaults, and dependencies (e.g., 'width'/'height' ignored if 'aspect-ratio' set). Thus, the description meets the baseline but doesn't enhance parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with a specific verb ('generate') and resource ('image from a text prompt'), making it immediately understandable. However, it doesn't differentiate this tool from siblings such as 'img2img' or 'inpaint', which likely also generate images but from different inputs, leaving room for confusion about when to choose this specific tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'control', 'img2img', or 'inpaint'. It lacks context about prerequisites, such as needing a text prompt as input, or exclusions, like not being suitable for image-to-image transformations. This absence leaves the agent without clear direction for tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
