object_detection
Identify and locate objects within images using AI-powered computer vision. Provide an image URL and the tool detects objects and their approximate positions automatically.
Instructions
Detect objects in an image using DeepInfra OpenAI-compatible API with multimodal model.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| image_url | Yes | URL of the image to analyze for objects | (none) |
| model | No | Model to use for detection | `openai/gpt-4o-mini` |
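For orientation, here is a minimal sketch of invoking this tool from the standard MCP Python client over stdio. The launch command (`python -m mcp_deepinfra.server`), the `DEEPINFRA_API_KEY` variable name, and the example image URL are assumptions for illustration; only the `MODEL_OBJECT_DETECTION` override and the tool/argument names come from this page. Adjust to your setup.

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Assumed launch command for the server; substitute however you normally start mcp_deepinfra.
    server = StdioServerParameters(
        command="python",
        args=["-m", "mcp_deepinfra.server"],
        env={
            "DEEPINFRA_API_KEY": "YOUR_API_KEY",  # assumed variable name for the DeepInfra credential
            "MODEL_OBJECT_DETECTION": "openai/gpt-4o-mini",  # optional override of the default model
        },
    )
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "object_detection",
                arguments={"image_url": "https://example.com/photo.jpg"},
            )
            # The tool returns its answer as text content (expected to be JSON).
            print(result.content[0].text)


asyncio.run(main())
```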
Implementation Reference
- src/mcp_deepinfra/server.py:153-182 (handler): Handler function for the `object_detection` tool, registered with `@app.tool()`. It takes an `image_url`, uses a vision-language model to analyze the image for objects via DeepInfra's OpenAI-compatible API, and returns the model's response as a string (expected to be JSON; see the parsing sketch after this list).

  ```python
  @app.tool()
  async def object_detection(image_url: str) -> str:
      """Detect objects in an image using DeepInfra OpenAI-compatible API with multimodal model."""
      model = DEFAULT_MODELS["object_detection"]
      try:
          response = await client.chat.completions.create(
              model=model,
              messages=[
                  {
                      "role": "user",
                      "content": [
                          {
                              "type": "text",
                              "text": "Analyze this image and detect all objects present. Provide a detailed list of objects you can see, their approximate locations if possible, and confidence scores. Format as JSON."
                          },
                          {
                              "type": "image_url",
                              "image_url": {"url": image_url}
                          }
                      ]
                  }
              ],
              max_tokens=500,
          )
          if response.choices:
              return response.choices[0].message.content
          else:
              return "No objects detected"
      except Exception as e:
          return f"Error detecting objects: {type(e).__name__}: {str(e)}"
  ```
- src/mcp_deepinfra/server.py:37 (helper): Default model configuration for the `object_detection` tool in the `DEFAULT_MODELS` dictionary.

  ```python
  "object_detection": os.getenv("MODEL_OBJECT_DETECTION", "openai/gpt-4o-mini"),
  ```
- src/mcp_deepinfra/server.py:152 (registration): Conditional enabling of the `object_detection` tool based on the `ENABLED_TOOLS` environment variable.

  ```python
  if "all" in ENABLED_TOOLS or "object_detection" in ENABLED_TOOLS:
  ```
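The code that builds `ENABLED_TOOLS` from the environment is not shown on this page. Purely as a hypothetical reconstruction, assuming a comma-separated `ENABLED_TOOLS` variable that defaults to enabling everything, the membership test in the registration line could be backed by something like:

```python
import os

# Hypothetical sketch, not the actual server.py code: parse a comma-separated
# ENABLED_TOOLS env var (e.g. ENABLED_TOOLS="object_detection,text_generation"),
# defaulting to "all" so every tool is registered when the variable is unset.
ENABLED_TOOLS = {
    name.strip()
    for name in os.getenv("ENABLED_TOOLS", "all").split(",")
    if name.strip()
}

if "all" in ENABLED_TOOLS or "object_detection" in ENABLED_TOOLS:
    ...  # register the object_detection tool
```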
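Because the handler returns the model's raw completion rather than validated JSON, callers may want to parse the result defensively. A small sketch, assuming only the string returned by the tool; the fence-stripping heuristic is illustrative and not part of the server:

```python
import json
from typing import Any


def parse_detections(raw: str) -> Any:
    """Best-effort parse of the object_detection response into Python data."""
    text = raw.strip()
    # Models sometimes wrap JSON in a markdown code fence; strip it if present.
    if text.startswith("```"):
        text = text.strip("`").strip()
        if text.startswith("json"):
            text = text[len("json"):].strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # The model did not return valid JSON; fall back to None
        # (callers can still inspect the raw string).
        return None
```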