
Tongyi Wanxiang MCP Server

by Suixinlei

wanx-t2i-image-generation-result

Fetch text-to-image generation results from Alibaba Cloud's Tongyi Wanxiang API. Exposed as a TypeScript MCP server tool that accepts a task ID and returns the generated image outputs.

Instructions

Fetch the text-to-image generation result from Alibaba Cloud's Wanxiang text-to-image model.

Input Schema

Name       Required   Description   Default
task_id    Yes        —             —

Implementation Reference

  • The core handler function for the "wanx-t2i-image-generation-result" tool. It takes a task_id, polls the task status until completion using the helper pollTaskUntilDone, and returns the results as a text content block with JSON-stringified output.
    async ({ task_id }) => {
      const result = await pollTaskUntilDone(task_id);
      return {
        content: [{ type: "text", text: JSON.stringify(result.output.results) }],
      };
    }
  • Zod schema for the tool input parameters: requires a 'task_id' string.
    { task_id: z.string() },
  • src/index.ts:40-50 (registration)
    Registration of the MCP tool "wanx-t2i-image-generation-result" using McpServer.tool(), specifying name, Chinese description, input schema, and handler function.
    server.tool(
      "wanx-t2i-image-generation-result",
      "获取阿里云万相文生图大模型的文生图结果", // i.e., "Fetch the Alibaba Cloud Wanxiang text-to-image generation result"
      { task_id: z.string() },
      async ({ task_id }) => {
        const result = await pollTaskUntilDone(task_id);
        return {
          content: [{ type: "text", text: JSON.stringify(result.output.results) }],
        };
      }
    );
  • Helper function that polls the Alibaba Cloud Wanxiang task status using getTaskStatus until it succeeds, fails, or times out. Directly called by the tool handler.
    export const pollTaskUntilDone = async (taskId: string) => {
      let retries = 0;
    
      while (retries < config.maxRetries) {
        const taskData = await getTaskStatus(taskId);
        const status = taskData.output.task_status;
    
        if (status === "SUCCEEDED" || status === "FAILED") {
          return taskData;
        }
    
        // Wait a while before querying again
        await new Promise((resolve) => setTimeout(resolve, config.pollingInterval));
        retries++;
      }
    
      throw new Error("Task polling timeout");
    };
  • Supporting helper that queries the task status from the API endpoint `/tasks/{taskId}` using axios and the configured API key. Called by pollTaskUntilDone.
    export const getTaskStatus = async (taskId: string) => {
      try {
        const apiKey = config.api.apiKey;
    
        if (!apiKey) {
          throw new Error("API key is not configured");
        }
    
        const response = await axios.get(`${config.api.baseUrl}/tasks/${taskId}`, {
          headers: {
            Authorization: `Bearer ${apiKey}`,
          },
        });
    
        return response.data;
      } catch (error: any) {
        if (error.response) {
          throw new Error(
            error.response.data.message || "Failed to get task status"
          );
        }
        throw error;
      }
    };
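The poll-until-done pattern above can be exercised without touching the network. Below is a minimal, self-contained sketch with a mocked `getTaskStatus` and hypothetical `config` values (the real helper calls the Alibaba Cloud API); it succeeds on the third poll to show the retry path:

```typescript
// Self-contained sketch of the polling loop; getTaskStatus is mocked and
// config values are hypothetical stand-ins for the server's real config.
type TaskData = { output: { task_status: string; results?: unknown[] } };

const config = { maxRetries: 5, pollingInterval: 10 };

let calls = 0;
const getTaskStatus = async (_taskId: string): Promise<TaskData> => {
  calls++;
  // Report RUNNING twice, then SUCCEEDED, to exercise the retry branch.
  return { output: { task_status: calls >= 3 ? "SUCCEEDED" : "RUNNING" } };
};

const pollTaskUntilDone = async (taskId: string): Promise<TaskData> => {
  for (let retries = 0; retries < config.maxRetries; retries++) {
    const taskData = await getTaskStatus(taskId);
    const status = taskData.output.task_status;
    if (status === "SUCCEEDED" || status === "FAILED") return taskData;
    // Wait before querying again
    await new Promise((resolve) => setTimeout(resolve, config.pollingInterval));
  }
  throw new Error("Task polling timeout");
};

pollTaskUntilDone("demo-task").then((d) =>
  console.log(d.output.task_status, calls) // → SUCCEEDED 3
);
```

Terminal states (`SUCCEEDED`/`FAILED`) short-circuit the loop; anything else sleeps for the polling interval and retries until the cap is hit.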
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool 'gets' results, implying a read-only operation, but doesn't specify behavioral traits such as whether it requires authentication, what rate limits apply, what happens if the task_id is invalid (e.g., errors, null returns), or the format of the results (e.g., image data, JSON metadata). This leaves significant gaps in understanding how the tool behaves in practice.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
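For illustration, here is a hedged sketch of the kind of disclosure the review asks for: behavioral hints shaped like the MCP ToolAnnotations fields plus a fuller description. The field names follow the MCP specification; the wording is illustrative, not the author's:

```typescript
// Sketch only: annotations and a description that would disclose the
// behavior the review finds missing. Values reflect what the quoted
// implementation actually does (read-only polling of an external API).
const annotations = {
  readOnlyHint: true,   // only fetches task status; mutates nothing
  idempotentHint: true, // re-polling the same task_id is safe
  openWorldHint: true,  // calls the external Alibaba Cloud API
};

const description =
  "Fetch finished text-to-image results for a task_id returned by a prior " +
  "generation call. Requires a configured API key; polls up to a retry " +
  "limit and throws on timeout or on an invalid task_id.";

console.log(annotations.readOnlyHint, description.length > 0); // → true true
```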

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence in Chinese that directly states the tool's purpose without unnecessary words. It's appropriately sized for a simple tool, though it could be more front-loaded with key details (e.g., clarifying 'result' as image retrieval). There's no wasted text, earning a high score for conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (a result-retrieval operation with 1 parameter), lack of annotations, no output schema, and 0% schema description coverage, the description is incomplete. It doesn't explain the parameter's semantics, behavioral aspects like error handling or output format, or how it integrates with sibling tools. This makes it inadequate for an AI agent to use the tool correctly without additional context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 1 parameter (task_id) with 0% description coverage, meaning the schema provides no semantic information. The description doesn't add any meaning beyond the schema—it doesn't explain what 'task_id' is (e.g., an ID from a prior generation request), its format, or how to obtain it. With low schema coverage and no compensation in the description, this falls short of the baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
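As a sketch of the fix, the schema itself could carry the missing semantics. The JSON Schema form below is hypothetical and the wording illustrative; with zod, the same effect could be had via `z.string().describe(...)`:

```typescript
// Hypothetical enriched input schema (plain JSON Schema form) giving
// task_id the semantics the review says are missing.
const inputSchema = {
  type: "object",
  properties: {
    task_id: {
      type: "string",
      description:
        "Task ID returned by a prior text-to-image generation request; " +
        "used to poll Alibaba Cloud for the finished image URLs.",
    },
  },
  required: ["task_id"],
};

console.log(inputSchema.required[0]); // → task_id
```

Even a one-line `description` like this closes the 0% coverage gap without changing the tool's behavior.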

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool '获取阿里云万相文生图大模型的文生图结果' which translates to 'Get the text-to-image generation results of Alibaba Cloud Wanxiang text-to-image model.' This specifies the action (get results) and resource (text-to-image generation results), but it's somewhat vague about what exactly 'results' entails (e.g., images, metadata, status). It doesn't clearly distinguish from sibling tools like 'wanx-t2i-image-generation' (which likely initiates generation) or 'wanx-t2v-video-generation-result' (which handles video results).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a task_id from a previous generation request), exclusions, or how it relates to siblings like 'wanx-t2i-image-generation' (presumably for initiating tasks) or 'wanx-t2v-video-generation-result' (for video results). Usage is implied only through the name 'result,' but no explicit context is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Suixinlei/tongyi-wanx-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server