
Doubao Image/Video Generation MCP Server

by 156554395

generate_image

Generate images from text prompts, transform existing images with new descriptions, or blend multiple reference images to create new visuals using Doubao Seedream models.

Instructions

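Generate images with the Doubao Seedream models.

Supported features:

  • Text-to-image: generate an image from a text prompt

  • Image-to-image: generate a new image from an input image and a prompt

  • Multi-image fusion: blend multiple reference images into a new image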

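Parameters:

  • prompt: text describing the image (required)

  • endpoint_id: inference endpoint ID (recommended); the Endpoint ID obtained after creating an inference endpoint in the Volcano Engine console

  • model: model name (optional, default: doubao-seedream-4-5)

    • doubao-seedream-4-5: latest 4.0 model, supports 4K resolution

    • doubao-seedream-3-0-t2i: 3.0 text-to-image model

    Note: using a model name directly may require account permissions; endpoint_id is recommended instead

  • size: image size (optional, default: 2560x1440). Supported sizes: 2560x1440, 2048x2048, 2304x1728, 1728x2304, 1440x2560, 2496x1664, 1664x2496, 3024x1296

  • image_url: reference image URL (optional, for image-to-image)

  • ref_image_urls: array of reference image URLs (optional, for multi-image fusion)

  • req_key: request identifier (optional, for tracing)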

Important: if you encounter an InvalidEndpointOrModel.NotFound error, create an inference endpoint in the Volcano Engine console and pass its ID via the endpoint_id parameter.
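
As a rough illustration of how these parameters fit together, the sketch below assembles an arguments dictionary for generate_image and wraps it in an MCP `tools/call` JSON-RPC request. The helper names `build_generate_image_args` and `as_tool_call` and the value `ep-example` are hypothetical. Sending endpoint_id in place of model only mirrors the recommendation above; how the server behaves when both are set is not documented here.

```python
import json
from typing import Optional


def build_generate_image_args(
    prompt: str,
    endpoint_id: Optional[str] = None,
    model: str = "doubao-seedream-4-5",
    size: str = "2560x1440",
    watermark: bool = False,
) -> dict:
    """Assemble arguments for generate_image (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    args = {"prompt": prompt, "size": size, "watermark": watermark}
    # Assumption: send endpoint_id when available, otherwise fall back to model.
    if endpoint_id:
        args["endpoint_id"] = endpoint_id
    else:
        args["model"] = model
    return args


def as_tool_call(arguments: dict, request_id: int = 1) -> str:
    """Wrap the arguments in an MCP tools/call JSON-RPC 2.0 request."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": "generate_image", "arguments": arguments},
        },
        ensure_ascii=False,
    )


if __name__ == "__main__":
    # "ep-example" is a placeholder, not a real endpoint ID.
    args = build_generate_image_args("a lighthouse at dusk", endpoint_id="ep-example")
    print(as_tool_call(args))
```

Most MCP clients build this request for you, so in practice only the arguments dictionary is likely to matter.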

Input Schema

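Name | Required | Description | Default
prompt | Yes | Image description text |
endpoint_id | No | Inference endpoint ID (created in the Volcano Engine console) |
model | No | Model selection; default: doubao-seedream-4-5 |
size | No | Image size; default: 2560x1440 (note: the Doubao API requires images of at least 3,686,400 pixels) |
image_url | No | Reference image URL (image-to-image) |
ref_image_urls | No | Array of reference image URLs (multi-image fusion) |
req_key | No | Request identifier |
watermark | No | Whether to add a watermark; default: false |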
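
The size constraint can be checked on the client before a request is sent. The following is a minimal sketch: the size list and the 3,686,400-pixel minimum are taken from the parameter notes on this page, while `pixel_count` and `check_size` are hypothetical helper names and the server's own validation may differ.

```python
MIN_PIXELS = 3_686_400  # minimum stated in the size parameter's description

# Sizes listed as supported in the tool description.
SUPPORTED_SIZES = (
    "2560x1440", "2048x2048", "2304x1728", "1728x2304",
    "1440x2560", "2496x1664", "1664x2496", "3024x1296",
)


def pixel_count(size: str) -> int:
    """Return width * height for a 'WIDTHxHEIGHT' string."""
    width, height = (int(part) for part in size.lower().split("x"))
    return width * height


def check_size(size: str) -> str:
    """Return size unchanged if it is listed and meets the pixel minimum."""
    if size not in SUPPORTED_SIZES:
        raise ValueError(f"unsupported size: {size}")
    if pixel_count(size) < MIN_PIXELS:
        raise ValueError(f"{size} is below {MIN_PIXELS} pixels")
    return size
```

The default 2560x1440 sits exactly at the minimum (2560 × 1440 = 3,686,400), and every other listed size is larger.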
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses important behavioral traits: the tool generates images using a specific AI model, supports multiple generation modes, has default values for optional parameters, requires specific permissions for direct model use, and includes error handling guidance for InvalidEndpointOrModel.NotFound. It doesn't mention rate limits, costs, or output format details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (features, parameter descriptions, important note), uses bullet points for readability, and every sentence adds value. It is appropriately sized for an 8-parameter tool with multiple functions and includes troubleshooting information without being verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex image generation tool with 8 parameters, 100% schema coverage, and no output schema, the description is quite complete. It covers purpose, usage modes, parameter semantics, and error handling. The main gap is that it does not explain the output format (for example, whether the tool returns an image URL or base64 data), which matters because there is no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description adds significant value by explaining parameter purposes beyond the schema: it clarifies that prompt is required, endpoint_id is recommended over model for permission reasons, model has specific version differences, size has resolution requirements, and image_url/ref_image_urls enable specific generation modes. A minor gap is that the watermark parameter is present in the schema but not mentioned in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: generating images with the Doubao Seedream model. It specifies three distinct functions (text-to-image, image-to-image, multi-image fusion) and distinguishes itself from sibling tools (generate_video, query_video_task) by focusing on image generation rather than video.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use different parameters (e.g., image_url for image-to-image, ref_image_urls for multi-image fusion). It mentions an alternative approach (using endpoint_id vs. model name) and includes troubleshooting guidance for errors. However, it doesn't explicitly state when NOT to use this tool versus sibling tools or other alternatives.
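
The parameter-to-mode mapping noted here can be sketched as a small helper. `infer_mode` is a hypothetical name, and giving ref_image_urls precedence when both image parameters are set is an assumption; the description does not say how the server resolves that case.

```python
from typing import Optional, Sequence


def infer_mode(
    image_url: Optional[str] = None,
    ref_image_urls: Optional[Sequence[str]] = None,
) -> str:
    """Guess which generation mode a set of arguments selects."""
    # Assumption: reference images take precedence over a single image_url.
    if ref_image_urls:
        return "multi-image fusion"
    if image_url:
        return "image-to-image"
    # Neither image parameter is set, so only the prompt drives generation.
    return "text-to-image"
```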

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/156554395/doubao-image-video-mcp'
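
The same request can be made from Python with the standard library. This is a minimal sketch that assumes the endpoint returns JSON and that the two path segments are an owner and a server name; `server_url` and `fetch_server` are hypothetical helper names, and the response fields are not described on this page.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory URL for one server."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> dict:
    """GET the server record and parse it as JSON (assumed response format)."""
    request = urllib.request.Request(
        server_url(owner, name),
        headers={"Accept": "application/json"},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example (performs a network request):
# info = fetch_server("156554395", "doubao-image-video-mcp")
# print(json.dumps(info, indent=2, ensure_ascii=False))
```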

If you have feedback or need assistance with the MCP directory API, please join our Discord server.