object_detection
Identify and locate objects within images using computer vision technology. Upload an image URL to detect visual elements automatically.
Instructions
Detect objects in an image using the DeepInfra OpenAI-compatible API with a multimodal model.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| image_url | Yes | | |
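To make the schema concrete, here is a minimal sketch of the JSON-RPC message an MCP client would send to invoke this tool with its single required argument. The helper name `build_tool_call` and the example URL are hypothetical, and the non-empty check is an assumption about sensible input rather than something the server is known to enforce.

```python
import json


def build_tool_call(image_url: str, request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` request for the object_detection tool."""
    # Assumption: an empty value is never useful, since image_url is required.
    if not isinstance(image_url, str) or not image_url.strip():
        raise ValueError("image_url is required and must be a non-empty string")
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "object_detection",
            "arguments": {"image_url": image_url},
        },
    }
    return json.dumps(request)


# Placeholder URL for illustration only.
payload = build_tool_call("https://example.com/street.jpg")
print(payload)
```

In practice an MCP client library would normally build and send this message for you; the sketch only shows which fields carry the tool name and the `image_url` argument.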
Implementation Reference
- src/mcp_deepinfra/server.py:154-182 (handler) The handler function for the 'object_detection' tool. It takes an image URL, prompts a vision-language model (openai/gpt-4o-mini by default) to detect objects, and returns the model's reply as a string. The prompt asks for JSON output, but the handler does not parse or validate the reply.

      async def object_detection(image_url: str) -> str:
          """Detect objects in an image using DeepInfra OpenAI-compatible API with multimodal model."""
          model = DEFAULT_MODELS["object_detection"]
          try:
              response = await client.chat.completions.create(
                  model=model,
                  messages=[
                      {
                          "role": "user",
                          "content": [
                              {
                                  "type": "text",
                                  "text": (
                                      "Analyze this image and detect all objects present. "
                                      "Provide a detailed list of objects you can see, "
                                      "their approximate locations if possible, and "
                                      "confidence scores. Format as JSON."
                                  ),
                              },
                              {
                                  "type": "image_url",
                                  "image_url": {"url": image_url},
                              },
                          ],
                      }
                  ],
                  max_tokens=500,
              )
              if response.choices:
                  return response.choices[0].message.content
              else:
                  return "No objects detected"
          except Exception as e:
              return f"Error detecting objects: {type(e).__name__}: {str(e)}"
- src/mcp_deepinfra/server.py:152-153 (registration) Registers the object_detection tool with the FastMCP server using the @app.tool() decorator, conditionally based on the ENABLED_TOOLS configuration.

      if "all" in ENABLED_TOOLS or "object_detection" in ENABLED_TOOLS:
          @app.tool()
- src/mcp_deepinfra/server.py:37-37 (helper) Configuration for the default model used by the object_detection tool.

      "object_detection": os.getenv("MODEL_OBJECT_DETECTION", "openai/gpt-4o-mini"),
- src/mcp_deepinfra/server.py:154-155 (schema) Function signature and docstring defining the input (image_url: str) and output (str) schema for the tool.

      async def object_detection(image_url: str) -> str:
          """Detect objects in an image using DeepInfra OpenAI-compatible API with multimodal model."""
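Because the handler returns a plain string in every case (the model's reply, the fallback "No objects detected", or an "Error detecting objects: ..." message), a caller that wants structured data has to parse it. The following is a minimal sketch of one way to do that; `parse_detection_result` is a hypothetical helper, not part of the server, and the Markdown-fence handling assumes the model may wrap its JSON in a code fence, which is common but not guaranteed.

```python
import json
import re
from typing import Any, Optional

ERROR_PREFIX = "Error detecting objects:"   # prefix used by the handler's except branch
EMPTY_REPLY = "No objects detected"         # handler's fallback when there are no choices
FENCE = "`" * 3                             # Markdown code-fence marker
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_detection_result(text: Optional[str]) -> Optional[Any]:
    """Return the JSON payload from the tool's reply, or None if it is unusable."""
    if not text:
        return None
    stripped = text.strip()
    if stripped == EMPTY_REPLY or stripped.startswith(ERROR_PREFIX):
        return None
    # Unwrap a Markdown code fence if the model added one around the JSON.
    match = FENCED_BLOCK.search(stripped)
    candidate = match.group(1).strip() if match else stripped
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        # Free-form prose, or JSON cut off by the max_tokens limit.
        return None
```

Returning `None` for all unusable replies keeps the caller's logic simple; a caller that needs to distinguish errors from empty results could check for the error prefix before parsing.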