
FAL Image/Video MCP Server

by RamboRogers

hidream

Generate high-resolution images from text prompts using FAL AI models, with customizable sizes and batch options for creative projects.

Instructions

HiDream I1 - High-resolution image generation

Input Schema

Name        Required  Description                       Default
prompt      Yes       Text prompt for image generation  -
image_size  No        -                                 landscape_4_3
num_images  No        -                                 -
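
The schema above can be mirrored on the client side before a call is made. The following is a minimal sketch, not part of the server: `HiDreamArgs`, `IMAGE_SIZES`, and `validateHiDreamArgs` are hypothetical names, and the enum values and the 1 to 4 range for `num_images` are taken from the schema-generation code quoted in the Implementation Reference.

```typescript
// Hypothetical client-side mirror of the 'hidream' input schema.
// Enum values and numeric bounds are copied from the quoted schema code.
const IMAGE_SIZES = [
  'square_hd', 'square', 'portrait_4_3',
  'portrait_16_9', 'landscape_4_3', 'landscape_16_9',
] as const;
type ImageSize = (typeof IMAGE_SIZES)[number];

interface HiDreamArgs {
  prompt: string;          // required
  image_size?: ImageSize;  // schema default: 'landscape_4_3'
  num_images?: number;     // schema range: 1 to 4
}

// Returns a list of problems; an empty list means the arguments conform.
function validateHiDreamArgs(args: unknown): string[] {
  if (typeof args !== 'object' || args === null) {
    return ['arguments must be an object'];
  }
  const a = args as Record<string, unknown>;
  const errors: string[] = [];

  if (typeof a.prompt !== 'string') {
    errors.push('prompt is required and must be a string');
  }

  const size = a.image_size;
  if (size !== undefined && !IMAGE_SIZES.some((s) => s === size)) {
    errors.push(`image_size must be one of: ${IMAGE_SIZES.join(', ')}`);
  }

  const count = a.num_images;
  if (count !== undefined && (typeof count !== 'number' || count < 1 || count > 4)) {
    errors.push('num_images must be a number between 1 and 4');
  }
  return errors;
}

// Example arguments that satisfy the schema.
const example: HiDreamArgs = { prompt: 'a red fox in snow', image_size: 'square_hd', num_images: 2 };
```

Checking arguments locally like this only catches schema violations; it says nothing about whether the FAL endpoint will accept the request.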

Implementation Reference

  • src/index.ts:107-107 (registration)
    The registration entry for the 'hidream' tool in the MODEL_REGISTRY.imageGeneration array. This defines the tool's ID, FAL endpoint, name, and description used for dynamic tool registration and execution.
    { id: 'hidream', endpoint: 'fal-ai/hidream-i1-full', name: 'HiDream I1', description: 'High-resolution image generation' },
  • Dynamically generates the input schema for image generation tools like 'hidream', including parameters such as prompt, image_size, num_images, and model-specific options.
    if (category === 'imageGeneration') {
      baseSchema.inputSchema.properties = {
        prompt: { type: 'string', description: 'Text prompt for image generation' },
        image_size: { type: 'string', enum: ['square_hd', 'square', 'portrait_4_3', 'portrait_16_9', 'landscape_4_3', 'landscape_16_9'], default: 'landscape_4_3' },
        num_images: { type: 'number', default: 1, minimum: 1, maximum: 4 },
      };
      baseSchema.inputSchema.required = ['prompt'];
      
      // Add model-specific parameters
      if (model.id.includes('flux') || model.id.includes('stable_diffusion')) {
        baseSchema.inputSchema.properties.num_inference_steps = { type: 'number', default: 25, minimum: 1, maximum: 50 };
        baseSchema.inputSchema.properties.guidance_scale = { type: 'number', default: 3.5, minimum: 1, maximum: 20 };
      }
      if (model.id.includes('stable_diffusion') || model.id === 'ideogram_v3') {
        baseSchema.inputSchema.properties.negative_prompt = { type: 'string', description: 'Negative prompt' };
      }
    } else if (category === 'textToVideo') {
  • Dispatch logic in the tool call handler that routes 'hidream' calls (found in imageGeneration registry) to the specific handleImageGeneration function.
    // Determine category and handle accordingly
    if (MODEL_REGISTRY.imageGeneration.find(m => m.id === name)) {
      return await this.handleImageGeneration(args, model);
    } else if (MODEL_REGISTRY.textToVideo.find(m => m.id === name)) {
      return await this.handleTextToVideo(args, model);
    } else if (MODEL_REGISTRY.imageToVideo.find(m => m.id === name)) {
      return await this.handleImageToVideo(args, model);
    }
  • Core handler implementation for 'hidream': extracts arguments, configures FAL client, calls fal.subscribe('fal-ai/hidream-i1-full'), processes image outputs (downloads, data URLs, auto-open), and returns formatted content.
    private async handleImageGeneration(args: any, model: any) {
      const {
        prompt,
        image_size = 'landscape_4_3',
        num_inference_steps = 25,
        guidance_scale = 3.5,
        num_images = 1,
        negative_prompt,
        safety_tolerance,
        raw,
      } = args;
    
      try {
        // Configure FAL client lazily with query config override
        configureFalClient(this.currentQueryConfig);
        const inputParams: any = { prompt };
        
        // Add common parameters
        if (image_size) inputParams.image_size = image_size;
        if (num_images > 1) inputParams.num_images = num_images;
        
        // Add model-specific parameters based on model capabilities
        if (model.id.includes('flux') || model.id.includes('stable_diffusion')) {
          if (num_inference_steps) inputParams.num_inference_steps = num_inference_steps;
          if (guidance_scale) inputParams.guidance_scale = guidance_scale;
        }
        if ((model.id.includes('stable_diffusion') || model.id === 'ideogram_v3') && negative_prompt) {
          inputParams.negative_prompt = negative_prompt;
        }
        if (model.id.includes('flux_pro') && safety_tolerance) {
          inputParams.safety_tolerance = safety_tolerance;
        }
        if (model.id === 'flux_pro_ultra' && raw !== undefined) {
          inputParams.raw = raw;
        }
    
        const result = await fal.subscribe(model.endpoint, { input: inputParams });
        const imageData = result.data as FalImageResult;
    
        const processedImages = await downloadAndProcessImages(imageData.images, model.id);
    
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify({
                model: model.name,
                id: model.id,
                endpoint: model.endpoint,
                prompt,
                images: processedImages,
                metadata: inputParams,
                download_path: DOWNLOAD_PATH,
                data_url_settings: {
                  enabled: ENABLE_DATA_URLS,
                  max_size_mb: Math.round(MAX_DATA_URL_SIZE / 1024 / 1024),
                },
                autoopen_settings: {
                  enabled: AUTOOPEN,
                  note: AUTOOPEN ? "Files automatically opened with default application" : "Auto-open disabled"
                },
              }, null, 2),
            },
          ],
        };
      } catch (error) {
        throw new Error(`${model.name} generation failed: ${error}`);
      }
    }
  • Helper function used to retrieve the model configuration (endpoint etc.) for 'hidream' by its ID during tool dispatch.
    function getModelById(id: string) {
      const allModels = getAllModels();
      return allModels.find(model => model.id === id);
    }
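
To see which parameters actually reach the FAL endpoint for 'hidream', the parameter-selection branches of the handler above can be isolated. This is a sketch only: `buildInputParams` and `ModelEntry` are hypothetical names that do not appear in the source, and the function makes no network call. It reproduces the conditions quoted from `handleImageGeneration`.

```typescript
// Shape of a registry entry, inferred from the quoted 'hidream' registration line.
interface ModelEntry {
  id: string;
  endpoint: string;
  name: string;
  description: string;
}

const hidream: ModelEntry = {
  id: 'hidream',
  endpoint: 'fal-ai/hidream-i1-full',
  name: 'HiDream I1',
  description: 'High-resolution image generation',
};

// Hypothetical helper: same branching as the quoted handler, minus the FAL call.
function buildInputParams(args: Record<string, any>, model: ModelEntry): Record<string, unknown> {
  const {
    prompt,
    image_size = 'landscape_4_3',
    num_inference_steps = 25,
    guidance_scale = 3.5,
    num_images = 1,
    negative_prompt,
    safety_tolerance,
    raw,
  } = args;

  const inputParams: Record<string, unknown> = { prompt };

  // Common parameters; num_images is only forwarded when more than one is requested.
  if (image_size) inputParams.image_size = image_size;
  if (num_images > 1) inputParams.num_images = num_images;

  // Model-specific parameters; none of these branches match the id 'hidream'.
  if (model.id.includes('flux') || model.id.includes('stable_diffusion')) {
    if (num_inference_steps) inputParams.num_inference_steps = num_inference_steps;
    if (guidance_scale) inputParams.guidance_scale = guidance_scale;
  }
  if ((model.id.includes('stable_diffusion') || model.id === 'ideogram_v3') && negative_prompt) {
    inputParams.negative_prompt = negative_prompt;
  }
  if (model.id.includes('flux_pro') && safety_tolerance) {
    inputParams.safety_tolerance = safety_tolerance;
  }
  if (model.id === 'flux_pro_ultra' && raw !== undefined) {
    inputParams.raw = raw;
  }
  return inputParams;
}

// For 'hidream', extra options such as guidance_scale or negative_prompt are dropped.
const params = buildInputParams(
  { prompt: 'a lighthouse at dusk', guidance_scale: 7, negative_prompt: 'blurry' },
  hidream,
);
```

Under these conditions a 'hidream' call forwards only `prompt`, `image_size`, and (when greater than 1) `num_images`, which matches the three parameters listed in the input schema.
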
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. While 'High-resolution image generation' implies a generative operation, the description doesn't disclose important behavioral traits: whether this requires authentication, rate limits, costs, what happens when generation fails, output format, or any side effects. For a generative AI tool with zero annotation coverage, this is a significant gap in behavioral transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at just 5 words ('HiDream I1 - High-resolution image generation'). It's front-loaded with the essential purpose statement and contains zero wasted words. Every element earns its place, making it efficient for an agent to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of an image generation tool with 3 parameters, no annotations, and no output schema, the description is insufficiently complete. It doesn't explain what 'HiDream I1' means, doesn't describe the output (format, quality, limitations), and provides no context about when this tool should be chosen over the many alternatives. For a generative tool in a crowded namespace, more contextual information is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description provides no parameter information beyond what is in the schema. Schema description coverage is only 33% (only the 'prompt' parameter has a description), and the description does not compensate for the undocumented 'image_size' and 'num_images' parameters. The schema itself does provide enums and defaults for these parameters, which establishes a baseline understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as 'High-resolution image generation', which is a specific verb+resource combination. However, it doesn't distinguish this tool from its many sibling image generation tools (like flux_dev, hunyuan_image, imagen4, etc.), which all appear to perform similar functions. The description lacks differentiation that would help an agent choose between these alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus the many alternatives. With 22 sibling tools including numerous image generation options (flux_dev, hunyuan_image, imagen4, pixverse_image, etc.), the agent receives no help in selecting this specific tool. There's no mention of when this tool is preferred, what makes it unique, or any prerequisites for its use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/RamboRogers/fal-image-video-mcp'
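
The same request can be made from TypeScript. The sketch below assumes a runtime with a global `fetch` (for example Node 18 or later) and assumes that the URL follows an owner/repository pattern, which is inferred from the single example above rather than documented here. `serverUrl` and `fetchServerInfo` are hypothetical names, and the response is left as `unknown` because this page does not describe its shape.

```typescript
// Base path taken from the curl example; the owner/repo pattern is an assumption.
const MCP_API_BASE = 'https://glama.ai/api/mcp/v1/servers';

// Builds the server URL for a given owner and repository name.
function serverUrl(owner: string, repo: string): string {
  return `${MCP_API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Fetches the server record; the JSON shape is not documented on this page.
async function fetchServerInfo(owner: string, repo: string): Promise<unknown> {
  const response = await fetch(serverUrl(owner, repo));
  if (!response.ok) {
    throw new Error(`MCP directory API request failed: ${response.status} ${response.statusText}`);
  }
  return response.json();
}

// Usage (performs a network request when called):
// fetchServerInfo('RamboRogers', 'fal-image-video-mcp').then((info) => console.log(info));
```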

If you have feedback or need assistance with the MCP directory API, please join our Discord server.