process_image_to_description
Generate a text description of an image by analyzing its visual content. Optionally specify a detail level or a focus area to control what the description covers.
Instructions
Process an image and generate a description.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| image_url | Yes | Image URL | |
| detail_level | No | Level of detail for the description: `brief`, `standard`, or `detailed` | standard |
| focus | No | What the description should focus on, e.g. objects, scene, people, emotions | |
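The schema above can be mirrored by a small argument parser. The following is a minimal sketch, not the server's own code: the names `DescribeImageArgs`, `isDetailLevel`, and `parseDescribeImageArgs` are hypothetical, and the English error messages are illustrative. It checks types explicitly because a plain `String(...)` coercion turns a missing value into the non-empty string `"undefined"`.

```typescript
// Hypothetical types mirroring the input schema table above.
type DetailLevel = "brief" | "standard" | "detailed";

interface DescribeImageArgs {
  image_url: string;
  detail_level: DetailLevel;
  focus: string;
}

// Type guard for the three allowed detail levels.
function isDetailLevel(value: unknown): value is DetailLevel {
  return value === "brief" || value === "standard" || value === "detailed";
}

// Validate raw tool arguments and apply the documented defaults
// ("standard" for detail_level, empty string for focus).
function parseDescribeImageArgs(
  raw: Record<string, unknown> | undefined
): DescribeImageArgs {
  const args: Record<string, unknown> = raw ?? {};

  const imageUrl = args["image_url"];
  if (typeof imageUrl !== "string" || imageUrl.trim() === "") {
    throw new Error("image_url is required");
  }

  const level = args["detail_level"] ?? "standard";
  if (!isDetailLevel(level)) {
    throw new Error("detail_level must be brief, standard, or detailed");
  }

  const focus = args["focus"];
  return {
    image_url: imageUrl,
    detail_level: level,
    focus: typeof focus === "string" ? focus : "",
  };
}
```

Calling `parseDescribeImageArgs({ image_url: "https://example.com/cat.jpg" })` (a placeholder URL) fills in `detail_level: "standard"` and `focus: ""`, while an empty or missing `image_url` throws before any download is attempted.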
Implementation Reference
- src/index.ts:251-295 (handler) The main handler for the `process_image_to_description` tool. It reads the image_url, detail_level, and focus arguments, downloads the image, builds a prompt from those arguments, calls the processImageWithQwen helper to get the AI-generated description, and returns the result as text content, or as an error result if processing fails.

      case "process_image_to_description": {
        const imageUrl = String(request.params.arguments?.image_url);
        const detailLevel = String(request.params.arguments?.detail_level || "standard");
        const focus = String(request.params.arguments?.focus || "");

        if (!imageUrl) {
          throw new Error("图像URL是必需的"); // "Image URL is required"
        }

        try {
          // Download the image
          const imagePath = await downloadImage(imageUrl);

          // Build the prompt
          let prompt = "请描述这张图片"; // "Please describe this image"
          if (detailLevel === "brief") {
            prompt += ",给出简洁的描述"; // ", give a brief description"
          } else if (detailLevel === "detailed") {
            // ", provide a detailed description, including details, background, and context"
            prompt += ",提供详细的描述,包括细节、背景和上下文";
          }

          if (focus) {
            prompt += `,重点关注${focus}`; // ", focusing on <focus>"
          }

          // Process the image
          const result = await processImageWithQwen(imagePath, prompt);

          return {
            content: [{ type: "text", text: result }]
          };
        } catch (error: any) {
          return {
            // "Failed to process image: <message>"
            content: [{ type: "text", text: `处理图像失败: ${error.message}` }],
            isError: true
          };
        }
      }
- src/index.ts:181-201 (schema) Input schema definition for the tool. It declares three parameters, image_url (required), detail_level (an enum defaulting to "standard"), and focus (optional), and is used to validate the tool arguments.

      inputSchema: {
        type: "object",
        properties: {
          image_url: {
            type: "string",
            description: "图像URL" // "Image URL"
          },
          detail_level: {
            type: "string",
            // "Level of detail: brief, standard, or detailed"
            description: "描述详细程度:简洁(brief)、标准(standard)或详细(detailed)",
            enum: ["brief", "standard", "detailed"],
            default: "standard"
          },
          focus: {
            type: "string",
            // "Focus of the description, e.g. objects, scene, people, emotions"
            description: "描述重点,例如:物体、场景、人物、情感等",
            default: ""
          }
        },
        required: ["image_url"]
      }
- src/index.ts:178-202 (registration) Tool registration in the ListToolsRequestSchema handler. It defines the name, description, and input schema that MCP clients use to discover the tool.

      {
        name: "process_image_to_description",
        description: "处理图像并生成描述", // "Process an image and generate a description"
        inputSchema: {
          type: "object",
          properties: {
            image_url: {
              type: "string",
              description: "图像URL" // "Image URL"
            },
            detail_level: {
              type: "string",
              // "Level of detail: brief, standard, or detailed"
              description: "描述详细程度:简洁(brief)、标准(standard)或详细(detailed)",
              enum: ["brief", "standard", "detailed"],
              default: "standard"
            },
            focus: {
              type: "string",
              // "Focus of the description, e.g. objects, scene, people, emotions"
              description: "描述重点,例如:物体、场景、人物、情感等",
              default: ""
            }
          },
          required: ["image_url"]
        }
      }
- src/index.ts:74-131 (helper) Core helper that calls the Qwen VL API. It reads the image file and base64-encodes it, sends the prompt and image to the API, extracts the text response, and deletes the temp file afterwards. Used by both image processing tools.

      async function processImageWithQwen(imagePath: string, prompt: string): Promise<string> {
        try {
          // Read the image file
          const imageBuffer = fs.readFileSync(imagePath);
          const base64Image = imageBuffer.toString('base64');

          // Prepare the request payload
          const requestData = {
            model: "qwen-vl-plus",
            input: {
              messages: [
                {
                  role: "user",
                  content: [
                    { type: "text", text: prompt },
                    { type: "image", image: base64Image }
                  ]
                }
              ]
            },
            parameters: {}
          };

          // Send the request to the API
          const response = await axios.post(API_ENDPOINT, requestData, {
            headers: {
              'Authorization': `Bearer ${API_KEY}`,
              'Content-Type': 'application/json'
            }
          });

          // Handle the response
          if (response.data && response.data.output && response.data.output.text) {
            return response.data.output.text;
          } else {
            throw new Error("API响应格式不正确"); // "Unexpected API response format"
          }
        } catch (error: any) {
          console.error("调用qwen2.5-vl模型API失败:", error); // "Failed to call the qwen2.5-vl model API"
          if (error.response) {
            console.error("API响应:", error.response.data); // "API response"
          }
          throw new Error(`处理图像失败: ${error.message}`); // "Failed to process image: <message>"
        } finally {
          // Clean up the temp file
          try {
            fs.unlinkSync(imagePath);
          } catch (e) {
            console.error("清理临时文件失败:", e); // "Failed to clean up temp file"
          }
        }
      }
- src/index.ts:47-66 (helper) Helper that downloads an image from a URL to a temp file. It generates a unique filename, streams the response body to disk, and returns the file path. Used by the tool handlers.

      async function downloadImage(imageUrl: string): Promise<string> {
        try {
          // Generate a unique filename for the image
          const filename = `${Date.now()}-${Math.random().toString(36).substring(2, 15)}.jpg`;
          const filePath = path.join(TEMP_DIR, filename);

          // Download the image
          const response = await axios({
            method: 'GET',
            url: imageUrl,
            responseType: 'stream'
          });

          await streamPipeline(response.data, fs.createWriteStream(filePath));
          return filePath;
        } catch (error: any) {
          console.error("下载图像失败:", error); // "Failed to download image"
          throw new Error(`下载图像失败: ${error.message}`); // "Failed to download image: <message>"
        }
      }
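The prompt-construction step inside the handler can be read in isolation. The sketch below extracts it into a pure function so the effect of each parameter is easy to see. The function name `buildDescriptionPrompt` is hypothetical and does not appear in the source. The prompt fragments are copied from the handler as shown above.

```typescript
type DetailLevel = "brief" | "standard" | "detailed";

// Hypothetical helper: builds the same prompt string the handler assembles inline.
function buildDescriptionPrompt(
  detailLevel: DetailLevel = "standard",
  focus: string = ""
): string {
  // Base instruction: "Please describe this image"
  let prompt = "请描述这张图片";

  if (detailLevel === "brief") {
    // ", give a brief description"
    prompt += ",给出简洁的描述";
  } else if (detailLevel === "detailed") {
    // ", provide a detailed description, including details, background, and context"
    prompt += ",提供详细的描述,包括细节、背景和上下文";
  }
  // "standard" adds nothing: the base instruction is used as-is.

  if (focus) {
    // ", focusing on <focus>"
    prompt += `,重点关注${focus}`;
  }

  return prompt;
}
```

With the defaults, the function returns only the base instruction. A call such as `buildDescriptionPrompt("brief", "人物")` appends the brevity clause followed by the focus clause, in that order.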