
Image Processing MCP Server

by duke0317

crop_image

Crop images by specifying pixel coordinates to remove unwanted areas or focus on specific sections. Define left, top, right, and bottom boundaries to extract desired portions from image files or base64 data.

Instructions

Crop an image.

Input Schema

| Name | Required | Description | Default |
| ---- | -------- | ----------- | ------- |
| image_source | Yes | Image source: a file path or base64-encoded image data | |
| left | Yes | Left boundary of the crop region (pixels) | |
| top | Yes | Top boundary of the crop region (pixels) | |
| right | Yes | Right boundary of the crop region (pixels) | |
| bottom | Yes | Bottom boundary of the crop region (pixels) | |
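The crop box follows PIL's convention: `(left, top, right, bottom)` with the origin at the image's top-left corner, where the right and bottom edges are exclusive. A quick sanity check of the resulting dimensions (pure Python, no Pillow required; the helper name is illustrative, not part of the server):

```python
def cropped_size(left: int, top: int, right: int, bottom: int) -> tuple[int, int]:
    """Width and height of the region PIL's Image.crop((left, top, right, bottom)) returns."""
    return (right - left, bottom - top)

# Cutting a 100x80 region out of a larger image:
assert cropped_size(10, 20, 110, 100) == (100, 80)
```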

Output Schema

| Name | Required | Description | Default |
| ---- | -------- | ----------- | ------- |
| result | Yes | | |

Implementation Reference

  • Core handler function that loads the image, validates crop coordinates, performs the crop using PIL Image.crop, and returns JSON result with base64 output.
    async def crop_image(image_data: str, left: int, top: int, right: int, bottom: int) -> list[TextContent]:
        """
        Crop an image.
        
        Args:
            image_data: image data (base64-encoded)
            left, top, right, bottom: crop coordinates
            
        Returns:
            the cropped image data
        """
        try:
            # Validate parameters
            if not image_data:
                raise ValidationError("Image data must not be empty")
            
            # Load the image
            image = processor.load_image(image_data)
            image_width, image_height = image.size
            
            # Validate the crop coordinates
            if not validate_crop_coordinates(left, top, right, bottom, image_width, image_height):
                raise ValidationError(f"Invalid crop coordinates: ({left}, {top}, {right}, {bottom}), image size: {image_width}x{image_height}")
            
            # Crop the image
            cropped_image = image.crop((left, top, right, bottom))
            
            # Output the cropped image
            output_info = processor.output_image(cropped_image, f"crop_{left}_{top}_{right}_{bottom}")
            
            result = {
                "success": True,
                "message": f"Image cropped successfully: ({left}, {top}, {right}, {bottom})",
                "data": {
                    **output_info,
                    "original_size": (image_width, image_height),
                    "crop_box": (left, top, right, bottom),
                    "cropped_size": cropped_image.size
                }
            }
            
            return [TextContent(type="text", text=json.dumps(result, ensure_ascii=False))]
            
        except ValidationError as e:
            error_result = {
                "success": False,
                "error": f"Parameter validation failed: {str(e)}"
            }
            return [TextContent(type="text", text=json.dumps(error_result, ensure_ascii=False))]
            
        except Exception as e:
            error_result = {
                "success": False,
                "error": f"Image cropping failed: {str(e)}"
            }
            return [TextContent(type="text", text=json.dumps(error_result, ensure_ascii=False))]
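The handler delegates bounds checking to `validate_crop_coordinates`, whose implementation is not shown on this page. A plausible sketch of the checks it would need to perform (the name and signature match the call above; the body is an assumption, and the server's actual implementation may differ):

```python
def validate_crop_coordinates(left: int, top: int, right: int, bottom: int,
                              image_width: int, image_height: int) -> bool:
    """Return True if the crop box is well-ordered and lies inside the image.

    Sketch only -- assumed behavior, not the server's verified source.
    """
    # The box must have positive width and height.
    if left >= right or top >= bottom:
        return False
    # The box must lie entirely within the image bounds.
    if left < 0 or top < 0 or right > image_width or bottom > image_height:
        return False
    return True
```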
  • main.py:188-204 (registration)
    Registers the crop_image tool with the FastMCP server (@mcp.tool()), provides the input schema via Annotated Field descriptions, and delegates execution to the transform_crop_image handler.
    @mcp.tool()
    def crop_image(
        image_source: Annotated[str, Field(description="Image source: a file path or base64-encoded image data")],
        left: Annotated[int, Field(description="Left boundary of the crop region (pixels)", ge=0)],
        top: Annotated[int, Field(description="Top boundary of the crop region (pixels)", ge=0)],
        right: Annotated[int, Field(description="Right boundary of the crop region (pixels)", gt=0)],
        bottom: Annotated[int, Field(description="Bottom boundary of the crop region (pixels)", gt=0)]
    ) -> str:
        """Crop an image."""
        try:
            result = safe_run_async(transform_crop_image(image_source, left, top, right, bottom))
            return result[0].text
        except Exception as e:
            return json.dumps({
                "success": False,
                "error": f"Failed to crop image: {str(e)}"
            }, ensure_ascii=False, indent=2)
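`safe_run_async` bridges the synchronous FastMCP tool function and the async handler. Its implementation is not shown on this page; a minimal version (name taken from the code above, body assumed) could look like:

```python
import asyncio
import concurrent.futures
from typing import Any, Coroutine

def safe_run_async(coro: Coroutine[Any, Any, Any]) -> Any:
    """Run a coroutine to completion from synchronous code.

    Sketch only: the server's actual helper may differ.
    """
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No event loop is running in this thread: safe to start one.
        return asyncio.run(coro)
    # A loop is already running: offload to a fresh loop in a worker thread
    # so we never call asyncio.run() inside a running loop.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()
```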
  • Defines the input schema for the crop_image tool as part of get_transform_tools(), including properties and validation rules (though registration uses pydantic in main.py).
    Tool(
        name="crop_image",
        description="Crop an image",
        inputSchema={
            "type": "object",
            "properties": {
                "image_data": {
                    "type": "string",
                    "description": "Image data (base64-encoded)"
                },
                "left": {
                    "type": "integer",
                    "description": "Left boundary coordinate",
                    "minimum": 0
                },
                "top": {
                    "type": "integer",
                    "description": "Top boundary coordinate",
                    "minimum": 0
                },
                "right": {
                    "type": "integer",
                    "description": "Right boundary coordinate"
                },
                "bottom": {
                    "type": "integer",
                    "description": "Bottom boundary coordinate"
                }
            },
            "required": ["image_data", "left", "top", "right", "bottom"]
        }
    ),
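Note that `right` and `bottom` carry no `"minimum"` keyword in this schema, unlike the `gt=0` constraints in the pydantic registration in main.py. A hand-rolled check of the schema's required fields and minimums (sketch only, written to avoid a `jsonschema` dependency; the helper name is illustrative):

```python
def check_crop_payload(payload: dict) -> list[str]:
    """Collect violations of the crop_image input schema above."""
    errors = []
    # All five fields are listed under "required".
    for field in ("image_data", "left", "top", "right", "bottom"):
        if field not in payload:
            errors.append(f"missing required field: {field}")
    # Only left and top declare "minimum": 0 in this schema.
    for field in ("left", "top"):
        value = payload.get(field)
        if isinstance(value, int) and value < 0:
            errors.append(f"{field} must be >= 0")
    return errors

# A well-formed payload produces no errors:
assert check_crop_payload(
    {"image_data": "x", "left": 0, "top": 0, "right": 5, "bottom": 5}
) == []
```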
Behavior: 1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. '裁剪图片' ('crop image') only states the action, with no information about side effects (e.g., whether it modifies the original image, creates a new file, requires specific permissions, or is rate-limited). For a mutation tool with zero annotation coverage, this is a critical gap that leaves the agent guessing about important behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

While extremely brief, this is under-specification rather than effective brevity. The description fails to convey necessary information and doesn't use its limited space efficiently: it could include key details without becoming verbose. It's too terse to be helpful.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (image manipulation with 5 required parameters), lack of annotations, and presence of an output schema, the description is incomplete. It doesn't address behavioral aspects, usage context, or output expectations. The output schema might cover return values, but the description should still provide operational context, which it fails to do.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with all parameters clearly documented in the input schema (image_source, left, top, right, bottom). The description adds no additional meaning beyond what the schema provides—it doesn't explain coordinate systems, units, or relationships between parameters. Given the high schema coverage, the baseline score of 3 is appropriate as the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '裁剪图片' (crop image) is a tautology that merely restates the tool name in Chinese without adding any meaningful clarification. It doesn't specify what resource is being cropped (images), what the output is, or how it differs from sibling tools like 'resize_image' or 'create_thumbnail_grid'. The purpose is implied but not explicitly stated beyond the name.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. The server exposes sibling image-manipulation tools (e.g., resize_image, rotate_image), but the description offers no context, prerequisites, or comparisons. It's completely silent on usage scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
