Glama
by phuihock

object_detection

Identify and locate objects in an image using a vision-language model. Provide an image URL and the tool returns the objects it detects.

Instructions

Detect objects in an image using the DeepInfra OpenAI-compatible API with a multimodal model.

Input Schema

Name       Required  Description  Default
image_url  Yes
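
The table above lists a single required field. As a rough sketch, the equivalent JSON Schema and a minimal argument check might look like the following. The schema the server actually generates is not shown on this page, so `INPUT_SCHEMA` is an assumption inferred from the signature `image_url: str`, and `validate_arguments` is a hypothetical helper, not part of the server.

```python
# Assumed JSON Schema for the tool input, inferred from `image_url: str`.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {"image_url": {"type": "string"}},
    "required": ["image_url"],
}


def validate_arguments(arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    # Every required field must be present.
    for name in INPUT_SCHEMA["required"]:
        if name not in arguments:
            problems.append(f"missing required field: {name}")
    # Every supplied field must be known and have the declared type.
    for name, value in arguments.items():
        spec = INPUT_SCHEMA["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected field: {name}")
        elif spec["type"] == "string" and not isinstance(value, str):
            problems.append(f"{name} must be a string")
    return problems
```

Calling `validate_arguments({"image_url": "https://example.com/cat.jpg"})` yields an empty list, while an empty dict reports the missing field.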

Implementation Reference

  • The handler function for the 'object_detection' tool. It takes an image URL, prompts a vision-language model (openai/gpt-4o-mini by default) to detect objects, and returns the model's reply as a string. The prompt asks for JSON, but the handler does not parse or validate the reply.
    async def object_detection(image_url: str) -> str:
        """Detect objects in an image using DeepInfra OpenAI-compatible API with multimodal model."""
        model = DEFAULT_MODELS["object_detection"]
        try:
            response = await client.chat.completions.create(
                model=model,
                messages=[
                    {
                        "role": "user",
                        "content": [
                            {
                                "type": "text",
                                "text": "Analyze this image and detect all objects present. Provide a detailed list of objects you can see, their approximate locations if possible, and confidence scores. Format as JSON."
                            },
                            {
                                "type": "image_url",
                                "image_url": {"url": image_url}
                            }
                        ]
                    }
                ],
                max_tokens=500,
            )
            if response.choices:
                return response.choices[0].message.content
            else:
                return "No objects detected"
        except Exception as e:
            return f"Error detecting objects: {type(e).__name__}: {str(e)}"
  • Registers the object_detection tool with the FastMCP server using the @app.tool() decorator, but only when ENABLED_TOOLS contains "all" or "object_detection".
    if "all" in ENABLED_TOOLS or "object_detection" in ENABLED_TOOLS:
        @app.tool()
  • Configuration for the default model used by the object_detection tool, read from the MODEL_OBJECT_DETECTION environment variable and falling back to openai/gpt-4o-mini.
    "object_detection": os.getenv("MODEL_OBJECT_DETECTION", "openai/gpt-4o-mini"),
  • Function signature and docstring defining the input (image_url: str) and output (str) schema for the tool.
    async def object_detection(image_url: str) -> str:
        """Detect objects in an image using DeepInfra OpenAI-compatible API with multimodal model."""

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/phuihock/mcp-deeinfra'
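
The same request can be made from Python with the standard library. This is a minimal sketch: it assumes the endpoint returns a JSON body, which this page does not confirm, and `server_url` and `fetch_server` are illustrative names rather than part of any published client.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory URL for one MCP server."""
    return f"{BASE_URL}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0):
    """GET the server record and decode it, assuming a JSON response."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For the server on this page, the call would be `fetch_server("phuihock", "mcp-deeinfra")`.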

If you have feedback or need assistance with the MCP directory API, please join our Discord server.