Provides multimodal analysis capabilities for images, audio, and video files using Perplexity's API to answer questions about media content
Perception-MCP
A lightweight Model Context Protocol (MCP) server that lets you ask any question about an image, audio, or video file and returns an answer powered by state-of-the-art multimodal models served through fal.ai.
Prerequisites
- Python 3.11+
- uv
- A fal.ai account & API key
- A Perplexity account & API key
Installation
Usage
Add Perception-MCP to Claude Desktop (v0.3.7+) by adding the following to your claude_desktop_config.json file:
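As a rough illustration of what an entry in that file generally looks like, here is a minimal Python sketch that adds a server entry under the `mcpServers` key that Claude Desktop reads. Everything specific in it is an assumption rather than this project's documented setup: the entry name `perception-mcp`, the `uv` arguments, the `/path/to/perception-mcp` placeholder, the `server.py` entry point, and the environment variable names `FAL_KEY` and `PERPLEXITY_API_KEY`. Check the project repository for the actual values.

```python
import copy
import json


def add_server_entry(config, name, command, args, env):
    """Return a copy of `config` with one MCP server entry added under "mcpServers"."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    servers = updated.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args), "env": dict(env)}
    return updated


# All values below are hypothetical placeholders, not the project's real settings.
example = add_server_entry(
    {},
    name="perception-mcp",
    command="uv",
    args=["--directory", "/path/to/perception-mcp", "run", "server.py"],
    env={
        "FAL_KEY": "<your fal.ai API key>",
        "PERPLEXITY_API_KEY": "<your Perplexity API key>",
    },
)

# Print the JSON that would be merged into claude_desktop_config.json.
print(json.dumps(example, indent=2))
```

Because the helper copies the existing configuration before adding to it, any servers already listed in the file are preserved.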
Tools
Perception-MCP provides the following tools:
- query_image: Answer a question about an image's contents
- query_audio: Answer a question about an audio file's contents
- query_video: Answer a question about a video's contents
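MCP clients invoke tools like these by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request using only the standard library. The tool names come from the list above, but the argument names `image_path` and `question` are assumptions for illustration; the real parameter names are defined by the server's tool schema.

```python
import json
from itertools import count

# Tool names taken from the list above.
KNOWN_TOOLS = {"query_image", "query_audio", "query_video"}

_request_ids = count(1)  # simple incrementing JSON-RPC request ids


def build_tool_call(tool, arguments):
    """Build an MCP `tools/call` JSON-RPC request for one of the known tools."""
    if tool not in KNOWN_TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": dict(arguments)},
    }


# Hypothetical argument names; consult the server's tool schema for the real ones.
request = build_tool_call(
    "query_image",
    {"image_path": "/path/to/photo.jpg", "question": "What is in this picture?"},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client such as Claude Desktop constructs and sends these messages for you; the sketch only shows the shape of the request.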
Development
Running tests