Image Tools - Background Removal, Upscaling & Face Restoration
Server Details
Background removal, 4x upscaling, and face restoration via GPU
- Status: Unhealthy
- Last Tested:
- Transport: Streamable HTTP
- URL:
- Repository: fasuizu-br/speech-ai-examples
- GitHub Stars: 0
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Available Tools
4 tools

check_image_service (Check Image Service)
A | Read-only | Idempotent
Check health status of Image API services and loaded models.
Returns: dict with keys:
- status (str): 'healthy' or error state
- models (dict): Loaded model status per capability
- version (str): API version
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
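For orientation, here is a minimal sketch of calling this tool over Streamable HTTP with the MCP Python SDK. This listing does not publish the server URL, so the endpoint below is a placeholder.

```python
# Minimal health-check sketch using the MCP Python SDK (pip install mcp).
# SERVER_URL is a placeholder; this listing does not publish the endpoint.
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

SERVER_URL = "https://example.com/mcp"  # placeholder endpoint


async def main() -> None:
    async with streamablehttp_client(SERVER_URL) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # check_image_service takes no parameters.
            result = await session.call_tool("check_image_service", {})
            # Documented return keys: status, models, version.
            print(result.content)


asyncio.run(main())
```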
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations cover safety profile (readOnly, idempotent, non-destructive, openWorld). Description adds valuable return value documentation detailing the dict structure with keys (status, models, version) and their semantics ('healthy' vs error states), which annotations do not cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Appropriately sized with clear docstring-style formatting. Returns section is structured and readable. No redundant text, though the multiline format with indentation consumes vertical space unnecessarily.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a zero-parameter health check tool. The manual documentation of return values compensates adequately for the absence of a formal output schema, providing sufficient completeness for agent invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Zero parameters present; baseline score applies per rubric. Description correctly does not invent parameter documentation where none exists.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Check' with clear resource 'health status of Image API services and loaded models'. Effectively distinguishes from siblings (remove_background, restore_face, upscale_image) which are image processing operations, while this is a service health diagnostic.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit when-to-use or when-not-to-use guidance provided. Usage is implied by the nature of the function (health check vs image manipulation), but lacks guidance such as 'call before processing operations' or troubleshooting contexts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
remove_background (Remove Background)
A | Read-only | Idempotent
Remove the background from an image.
Uses BiRefNet segmentation to precisely separate foreground from background. Returns a base64-encoded image with transparent background (PNG) or white background (WebP). Sub-500ms latency on GPU.
Args:
- image_base64: Base64-encoded image data (PNG, JPEG, or WebP).
- output_format: Output format -- 'png' (with transparency) or 'webp'.

Returns: dict with keys:
- image_base64 (str): Base64-encoded result image
- format (str): Output image format
- original_size (dict): Original width and height
- processing_ms (int): Processing time in milliseconds
| Name | Required | Description | Default |
|---|---|---|---|
| image_base64 | Yes | Base64-encoded image data. Supports PNG, JPEG, and WebP formats. | |
| output_format | No | Output image format: 'png' (default, with transparency) or 'webp' | png |
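To illustrate the base64 round trip this tool implies, here is a hedged sketch that encodes a local file into the documented arguments and decodes the documented result keys back to an image. The helper names and file paths are illustrative; transport is assumed to be a session like the one sketched under check_image_service.

```python
import base64


def build_remove_background_args(path: str, output_format: str = "png") -> dict:
    """Encode a local image into the documented tool arguments."""
    with open(path, "rb") as f:
        return {
            "image_base64": base64.b64encode(f.read()).decode("ascii"),
            # 'png' keeps transparency; 'webp' yields a white background.
            "output_format": output_format,
        }


def save_remove_background_result(result: dict, path: str) -> None:
    """Decode the documented result keys and write the image to disk."""
    with open(path, "wb") as f:
        f.write(base64.b64decode(result["image_base64"]))
    print(f"original: {result['original_size']}, took {result['processing_ms']} ms")
```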
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds substantial context beyond annotations: specifies BiRefNet algorithm, sub-500ms GPU latency, and detailed return structure (compensating for lack of output schema). Does not cover error cases or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections (summary, technical details, Args, Returns). The Returns dictionary is verbose but necessary given no output schema exists. Information is front-loaded with the primary action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive given constraints: provides full output structure documentation to compensate for missing output schema, covers all parameters via high-coverage schema, and includes implementation/performance details for informed usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, providing complete parameter documentation. The Args section in the description largely mirrors schema definitions without adding significant semantic value (e.g., no usage examples or edge case guidance).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action ('Remove the background') and resource ('from an image'), distinguishes from siblings (restore_face, upscale_image) by focusing on segmentation/removal rather than restoration or scaling.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage through output format specifications (PNG for transparency, WebP for white background), but lacks explicit when-to-use guidance or comparisons to sibling tools (e.g., when to choose this over other image processing options).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
restore_face (Restore Face)
A | Read-only | Idempotent
Restore and enhance faces in an image using GFPGAN.
Detects all faces via RetinaFace, restores quality (fixes blur, noise, compression artifacts), and pastes them back. Optionally enhances the background using Real-ESRGAN. GPU-accelerated, sub-3s latency.
Args:
- image_base64: Base64-encoded image data containing faces (PNG, JPEG, WebP).
- upscale: Output upscale factor -- 1 to 4 (default: 2).
- enhance_background: Whether to enhance background with Real-ESRGAN (default: true).

Returns: dict with keys:
- image (str): Base64-encoded restored image
- format (str): Output image format
- width (int): Output width
- height (int): Output height
- upscale (int): Scale factor applied
- processing_time_ms (float): Processing time in milliseconds
| Name | Required | Description | Default |
|---|---|---|---|
| image_base64 | Yes | Base64-encoded image data containing one or more faces. | |
| upscale | No | Output upscale factor: 1-4 (default: 2) | 2 |
| enhance_background | No | Enhance background with Real-ESRGAN (default: true) | true |
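Note that the documented result keys differ from remove_background: the image comes back under 'image' rather than 'image_base64', and timing under 'processing_time_ms'. A hedged sketch that validates the documented 1-4 range and handles that asymmetry; helper names are illustrative.

```python
import base64


def build_restore_face_args(
    path: str, upscale: int = 2, enhance_background: bool = True
) -> dict:
    """Encode a local image and validate the documented upscale range."""
    if not 1 <= upscale <= 4:  # documented constraint: 1-4
        raise ValueError("upscale must be between 1 and 4")
    with open(path, "rb") as f:
        return {
            "image_base64": base64.b64encode(f.read()).decode("ascii"),
            "upscale": upscale,
            "enhance_background": enhance_background,
        }


def save_restore_face_result(result: dict, path: str) -> None:
    """Decode the result; note the key here is 'image', not 'image_base64'."""
    with open(path, "wb") as f:
        f.write(base64.b64decode(result["image"]))
    print(f"{result['width']}x{result['height']} in {result['processing_time_ms']} ms")
```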
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Rich behavioral disclosure beyond annotations: details specific algorithms (RetinaFace detection, GFPGAN restoration, Real-ESRGAN background), lists artifact types fixed (blur, noise, compression), discloses performance characteristics (GPU-accelerated, sub-3s latency), and documents return structure. Annotations only cover safety/idempotency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured and front-loaded: purpose statement first, followed by technical mechanism, Args block, and Returns block. Every sentence earns its place; the Returns section is essential given no formal output schema exists. No redundant fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a 3-parameter image processing tool. Documents input requirements, operational constraints (GPU acceleration, latency), specific algorithms, and complete return value structure. Fully compensates for absence of output schema with detailed Returns documentation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, baseline is 3. Description adds file format constraints (PNG, JPEG, WebP) not present in schema's image_base64 description, and clarifies the technical mechanism (Real-ESRGAN) for background enhancement. Provides useful parameter context beyond structured schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description opens with specific verb-resource combination ('Restore and enhance faces') and identifies the specific technology (GFPGAN). It clearly distinguishes from sibling 'upscale_image' by focusing specifically on face restoration rather than general image upscaling.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear context that this tool is for face restoration specifically, implying when to use it (images containing faces needing quality fixes). However, lacks explicit comparison to siblings (e.g., 'use upscale_image instead for general upscaling without face detection').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
upscale_image (Upscale Image)
A | Read-only | Idempotent
Upscale image resolution using Real-ESRGAN.
Enhances image resolution by 2x or 4x using GPU-accelerated Real-ESRGAN super-resolution. Processes in tiles (256x256) to manage VRAM. Maximum output dimension: 8192x8192.
Args:
- image_base64: Base64-encoded image data (PNG, JPEG, or WebP).
- scale: Upscale factor -- 2 or 4 (default: 4).

Returns: dict with keys:
- image (str): Base64-encoded upscaled image
- format (str): Output image format
- width (int): Output width
- height (int): Output height
- scale (int): Scale factor applied
- processing_time_ms (float): Processing time in milliseconds
| Name | Required | Description | Default |
|---|---|---|---|
| image_base64 | Yes | Base64-encoded image data. Supports PNG, JPEG, and WebP formats. | |
| scale | No | Upscale factor: 2 or 4 (default: 4) | 4 |
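Since the description caps output at 8192x8192, a client can pre-check dimensions before sending a large payload. A hedged sketch under that assumption; Pillow is used only to read local image dimensions, and the tool's behavior when the cap is exceeded is not documented.

```python
import base64

from PIL import Image  # assumption: Pillow, used only to read dimensions locally

MAX_DIM = 8192  # documented maximum output dimension


def build_upscale_args(path: str, scale: int = 4) -> dict:
    """Validate documented constraints, then encode the image for upscale_image."""
    if scale not in (2, 4):
        raise ValueError("scale must be 2 or 4")
    with Image.open(path) as im:
        if im.width * scale > MAX_DIM or im.height * scale > MAX_DIM:
            raise ValueError(f"output would exceed {MAX_DIM}x{MAX_DIM}")
    with open(path, "rb") as f:
        return {
            "image_base64": base64.b64encode(f.read()).decode("ascii"),
            "scale": scale,
        }
```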
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds substantial behavioral context beyond the annotations, including: tile-based processing (256x256) for VRAM management, GPU acceleration, and the maximum output dimension constraint (8192x8192). It also documents the return structure (dict with image, format, dimensions, timing) which is not present in structured annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description uses a clean docstring format with clear sections (one-line summary, technical details, Args, Returns). Every sentence provides distinct value: technology identification, implementation details (tiling/VRAM), constraints (max dimensions), and I/O specifications. No waste or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of comprehensive annotations and the detailed description covering inputs, constraints, and return values (despite no formal output_schema), the definition is nearly complete. It could benefit from mentioning error conditions (e.g., what happens if max dimension is exceeded) or authentication requirements, but covers the essential behavioral contract well.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The Args section essentially mirrors the schema descriptions (base64 formats, scale values) without adding significant new semantic meaning such as validation rules, format nuances, or the relationship between input size and VRAM requirements. It meets but does not exceed what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action (upscale/enhance resolution), the technology used (Real-ESRGAN), and the specific scaling factors (2x or 4x). It distinguishes from siblings like 'remove_background' and 'restore_face' by focusing on general super-resolution rather than specific image editing tasks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the description clearly explains what the tool does, it does not provide explicit guidance on when to use this versus the sibling 'restore_face' tool, which might also use super-resolution techniques. No 'when-not-to-use' or alternative selection criteria are provided, though the specific mention of Real-ESRGAN and general upscaling implies its intended use case.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
```json
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
```

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
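As a quick self-check before waiting on Glama's crawler, you can fetch the file yourself. A small standard-library sketch; the domain is a placeholder for your server's.

```python
# Self-check: confirm /.well-known/glama.json is served and parses as JSON.
# "example.com" is a placeholder for your server's domain.
import json
import urllib.request

url = "https://example.com/.well-known/glama.json"
with urllib.request.urlopen(url) as resp:
    data = json.load(resp)

assert data.get("$schema") == "https://glama.ai/mcp/schemas/connector.json"
print("maintainers:", [m.get("email") for m in data.get("maintainers", [])])
```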
Claiming this connector lets you:
- Control your server's listing on Glama, including description and metadata
- Access analytics and receive server usage reports
- Get monitoring and health status updates for your server
- Feature your server to boost visibility and reach more users
For users:
- Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
- Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
- Centralized credential management – store and rotate API keys and OAuth tokens in one place
- Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently

For server owners:
- Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
- Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
- Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to connect to the server. This can happen for several reasons:
- The server is experiencing an outage
- The URL of the server is wrong
- Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.