Provides multimodal analysis capabilities for image, audio, and video files, using Perplexity's API to answer questions about media content.
Perception-MCP
A lightweight Model Context Protocol (MCP) server that lets you ask any question about an image, audio, or video file and returns an answer powered by state-of-the-art multimodal models served through fal.ai.
Prerequisites
Python 3.11+
A fal.ai account & API key
A Perplexity account & API key
Installation
Usage
Add Perception-MCP to Claude Desktop (v0.3.7+) by adding the following to your claude_desktop_config.json file:
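As a rough sketch of what such an entry might look like: Claude Desktop reads MCP servers from the `mcpServers` object in claude_desktop_config.json, where each entry has a `command`, optional `args`, and optional `env`. Everything specific to Perception-MCP below is an assumption made for illustration, not taken from this project: the `perception` entry name, the `uv` launcher, the `/path/to/perception-mcp` directory, the `server.py` script name, and the environment variable names `FAL_KEY` and `PERPLEXITY_API_KEY`.

```python
import json


def add_perception_server(config: dict, command: str, args: list[str], env: dict[str, str]) -> dict:
    """Return a copy of `config` with a "perception" entry under "mcpServers"."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    # "perception" is a hypothetical entry name; any unique key works.
    servers["perception"] = {"command": command, "args": args, "env": env}
    updated["mcpServers"] = servers
    return updated


# Start from whatever claude_desktop_config.json already contains.
existing = {"mcpServers": {}}

config = add_perception_server(
    existing,
    command="uv",  # hypothetical launcher; use whatever starts the server on your machine
    args=["--directory", "/path/to/perception-mcp", "run", "server.py"],  # hypothetical path and script
    env={
        "FAL_KEY": "<your fal.ai API key>",  # assumed variable name
        "PERPLEXITY_API_KEY": "<your Perplexity API key>",  # assumed variable name
    },
)

# Paste the printed JSON into claude_desktop_config.json.
print(json.dumps(config, indent=2))
```

Building the entry as a copy leaves any servers already present in the config untouched.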
Tools
Perception-MCP provides the following tools:
query_image: Answer a question about an image's contents
query_audio: Answer a question about an audio file's contents
query_video: Answer a question about a video's contents
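MCP clients invoke tools like these with a JSON-RPC `tools/call` request. The sketch below builds such a request for `query_image` using only the standard library. The argument names `image_path` and `question` are hypothetical; the actual parameter names are defined by each tool's input schema, which a client can discover with `tools/list`.

```python
import json
from itertools import count

# Monotonically increasing JSON-RPC request ids.
_ids = count(1)


def build_tool_call(tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request for the given tool."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request)


message = build_tool_call(
    "query_image",
    {
        "image_path": "/path/to/photo.jpg",  # hypothetical argument name
        "question": "What is in this image?",  # hypothetical argument name
    },
)
print(message)
```

In Claude Desktop this request is generated for you; the sketch is only meant to show what a call to one of the tools looks like on the wire.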