Perception-MCP

by lintyourcode

A lightweight Model Context Protocol (MCP) server that lets you ask any question about an image, audio, or video file and returns an answer powered by state-of-the-art multimodal models served through fal.ai.

Prerequisites

Installation

git clone --recurse-submodules https://github.com/lintyourcode/perception-mcp.git
cd perception-mcp
cp mcp_agent.secrets_template.yaml mcp_agent.secrets.yaml
$EDITOR mcp_agent.secrets.yaml

Usage

To use Perception-MCP with Claude Desktop (v0.3.7+), add the following to your claude_desktop_config.json file:

{
  "mcpServers": {
    "perception-mcp": {
      "command": "fastmcp",
      "args": ["run", "perception-mcp", "serve"]
    }
  }
}
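
If claude_desktop_config.json already lists other servers, pasting the snippet over the file would drop them. The following is a minimal sketch of merging the entry in place instead. The helper name add_mcp_server and the constant PERCEPTION_ENTRY are hypothetical and not part of Perception-MCP. The path to the config file is left to the caller, since it differs between systems.

```python
import json
from pathlib import Path

# The server entry from the Usage section above, as a Python dict.
PERCEPTION_ENTRY = {
    "command": "fastmcp",
    "args": ["run", "perception-mcp", "serve"],
}


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Merge one server entry into a config file, keeping existing entries.

    Hypothetical helper for illustration; not shipped with Perception-MCP.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    # Create the "mcpServers" object if it is missing, then add or replace
    # only the entry with the given name.
    servers = config.setdefault("mcpServers", {})
    servers[name] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_mcp_server(Path("claude_desktop_config.json"), "perception-mcp", PERCEPTION_ENTRY) with the real path to your config file leaves any other configured servers untouched.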

Tools

Perception-MCP provides the following tools:

  • query_image: Answer a question about an image's contents
  • query_audio: Answer a question about an audio file's contents
  • query_video: Answer a question about a video's contents
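
An MCP client invokes these tools with a JSON-RPC tools/call request. The sketch below builds such a request to show its shape. The argument names "path" and "question" are assumptions made for illustration; this README does not document the tools' parameters, so check the schema the server reports through tools/list before relying on them. The helper build_tool_call is likewise hypothetical.

```python
import json
from itertools import count

# Tool names as listed in this README.
TOOLS = ("query_image", "query_audio", "query_video")

_request_ids = count(1)  # simple increasing JSON-RPC request ids


def build_tool_call(tool: str, arguments: dict) -> str:
    """Return a JSON-RPC 2.0 'tools/call' request as a JSON string.

    Hypothetical helper; it only builds the message and does not send it.
    """
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request)


# ASSUMPTION: "path" and "question" are illustrative argument names only.
message = build_tool_call(
    "query_image",
    {"path": "photo.jpg", "question": "What is in this picture?"},
)
```

In practice a client such as Claude Desktop builds and sends these messages for you; the sketch is only meant to make the tool interface concrete.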

Development

Running tests

uv run pytest -q
