Provides multimodal analysis capabilities for images, audio, and video files using Perplexity's API to answer questions about media content
Perception-MCP
A lightweight Model Context Protocol (MCP) server that lets you ask any question about an image, audio, or video file and returns an answer powered by state-of-the-art multimodal models served through fal.ai.
Prerequisites
Python 3.11+
A fal.ai account & API key
A Perplexity account & API key
Installation
git clone --recurse-submodules https://github.com/lintyourcode/perception-mcp.git
cd perception-mcp
cp mcp_agent.secrets_template.yaml mcp_agent.secrets.yaml
$EDITOR mcp_agent.secrets.yaml
Usage
To use Perception-MCP with Claude Desktop (v0.3.7+), add the following to your claude_desktop_config.json file:
{
  "mcpServers": {
    "perception-mcp": {
      "command": "fastmcp",
      "args": ["run", "perception-mcp", "serve"]
    }
  }
}
Tools
Perception-MCP provides the following tools:
query_image: Answer a question about an image's contents
query_audio: Answer a question about an audio file's contents
query_video: Answer a question about a video's contents
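As a rough illustration of how an MCP client invokes one of these tools, the sketch below builds the JSON-RPC 2.0 `tools/call` request defined by the Model Context Protocol. The helper `build_tool_call` and the argument names `path` and `question` are assumptions made for this example; this README does not document the tools' actual parameters, so check the server's tool schema before relying on them.

```python
import json


def build_tool_call(tool_name, arguments, request_id=1):
    """Build an MCP JSON-RPC 2.0 `tools/call` request (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# NOTE: "path" and "question" are assumed argument names, not confirmed
# by the Perception-MCP documentation.
request = build_tool_call(
    "query_image",
    {"path": "/tmp/photo.jpg", "question": "What is in this picture?"},
)

# Print the request as it would be serialized on the wire.
print(json.dumps(request, indent=2))
```

In practice a client such as Claude Desktop constructs and sends this message for you; the sketch only shows the shape of the request.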
Development
Running tests
uv run pytest -q