
# MCP Video Recognition Server

An MCP (Model Context Protocol) server that provides tools for image, audio, and video recognition using Google's Gemini AI.

<a href="https://glama.ai/mcp/servers/@mario-andreschak/mcp_video_recognition">
  <img width="380" height="200" src="https://glama.ai/mcp/servers/@mario-andreschak/mcp_video_recognition/badge" alt="Video Recognition Server MCP server" />
</a>

## Features

- **Image Recognition**: Analyze and describe images using Google Gemini AI
- **Audio Recognition**: Analyze and transcribe audio using Google Gemini AI
- **Video Recognition**: Analyze and describe videos using Google Gemini AI

## Prerequisites

- Node.js 18 or higher
- Google Gemini API key

## Installation

### Manual Installation

1. Clone the repository:
   ```bash
   git clone https://github.com/yourusername/mcp-video-recognition.git
   cd mcp-video-recognition
   ```

2. Install dependencies:
   ```bash
   npm install
   ```

3. Build the project:
   ```bash
   npm run build
   ```

### Installing in [FLUJO](https://github.com/mario-andreschak/FLUJO/)

1. Click Add Server
2. Copy & Paste GitHub URL into FLUJO
3. Click Parse, Clone, Install, Build and Save.

### Installing via Configuration Files

To integrate this MCP server with Cline or other MCP clients via configuration files:

1. Open your Cline settings:
   - In VS Code, go to File -> Preferences -> Settings
   - Search for "Cline MCP Settings"
   - Click "Edit in settings.json"

2. Add the server configuration to the `mcpServers` object:
   ```json
   {
     "mcpServers": {
       "video-recognition": {
         "command": "node",
         "args": [
           "/path/to/mcp-video-recognition/dist/index.js"
         ],
         "disabled": false,
         "autoApprove": []
       }
     }
   }
   ```

3. Replace `/path/to/mcp-video-recognition/dist/index.js` with the actual path to the `index.js` file in your project directory. Use forward slashes (/) or double backslashes (\\\\) for the path on Windows.

4. Save the settings file. Cline should automatically connect to the server.

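
The path rule in step 3 of the configuration-file instructions is easy to get wrong by hand, so here is a minimal sketch of generating the entry instead. `buildServerEntry` and the example checkout directory are hypothetical names used only for illustration; the sketch assumes the build output lives at `dist/index.js` as shown in the configuration above.

```typescript
import * as path from "node:path";

// Shape of one entry in the `mcpServers` object from the configuration above.
interface McpServerEntry {
  command: string;
  args: string[];
  disabled: boolean;
  autoApprove: string[];
}

// Hypothetical helper: builds the "video-recognition" entry for a checkout
// directory. Backslashes are converted to forward slashes so the resulting
// path needs no escaping in JSON.
function buildServerEntry(projectDir: string): McpServerEntry {
  const normalizedDir = projectDir.replace(/\\/g, "/");
  const entryPoint = path.posix.join(normalizedDir, "dist", "index.js");
  return { command: "node", args: [entryPoint], disabled: false, autoApprove: [] };
}

// Example checkout location (an assumption; substitute your own directory).
const settings = {
  mcpServers: {
    "video-recognition": buildServerEntry("C:\\Users\\me\\mcp-video-recognition"),
  },
};

// Print the fragment to paste into the MCP settings file.
console.log(JSON.stringify(settings, null, 2));
```

If you prefer to keep backslashes in the path, `JSON.stringify` will escape each one as `\\` for you, which is the second form the instructions allow.
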
## Configuration

The server is configured using environment variables:

- `GOOGLE_API_KEY` (required): Your Google Gemini API key
- `TRANSPORT_TYPE`: Transport type to use (`stdio` or `sse`, defaults to `stdio`)
- `PORT`: Port number for SSE transport (defaults to 3000)
- `LOG_LEVEL`: Logging level (`verbose`, `debug`, `info`, `warn`, `error`, defaults to `info`)

## Usage

### Starting the Server

#### With stdio Transport (Default)

```bash
GOOGLE_API_KEY=your_api_key npm start
```

#### With SSE Transport

```bash
GOOGLE_API_KEY=your_api_key TRANSPORT_TYPE=sse PORT=3000 npm start
```

### Using the Tools

The server provides three tools that can be called by MCP clients:

#### Image Recognition

```json
{
  "name": "image_recognition",
  "arguments": {
    "filepath": "/path/to/image.jpg",
    "prompt": "Describe this image in detail",
    "modelname": "gemini-2.0-flash"
  }
}
```

#### Audio Recognition

```json
{
  "name": "audio_recognition",
  "arguments": {
    "filepath": "/path/to/audio.mp3",
    "prompt": "Transcribe this audio",
    "modelname": "gemini-2.0-flash"
  }
}
```

#### Video Recognition

```json
{
  "name": "video_recognition",
  "arguments": {
    "filepath": "/path/to/video.mp4",
    "prompt": "Describe what happens in this video",
    "modelname": "gemini-2.0-flash"
  }
}
```

### Tool Parameters

All tools accept the following parameters:

- `filepath` (required): Path to the media file to analyze
- `prompt` (optional): Custom prompt for the recognition (defaults to "Describe this content")
- `modelname` (optional): Gemini model to use for recognition (defaults to "gemini-2.0-flash")

## Development

### Running in Development Mode

```bash
GOOGLE_API_KEY=your_api_key npm run dev
```

### Project Structure

- `src/index.ts`: Entry point
- `src/server.ts`: MCP server implementation
- `src/tools/`: Tool implementations
- `src/services/`: Service implementations (Gemini API)
- `src/types/`: Type definitions
- `src/utils/`: Utility functions

## License

MIT
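
## Appendix: Configuration Sketch

The environment variables listed under Configuration, with their documented defaults, could be read and validated roughly as follows. This is only an illustrative sketch and is not taken from the server's source: `loadConfig`, `ServerConfig`, and the error messages are hypothetical, and the port range check is an added assumption.

```typescript
type TransportType = "stdio" | "sse";
type LogLevel = "verbose" | "debug" | "info" | "warn" | "error";

interface ServerConfig {
  googleApiKey: string;
  transportType: TransportType;
  port: number;
  logLevel: LogLevel;
}

const TRANSPORTS: readonly string[] = ["stdio", "sse"];
const LOG_LEVELS: readonly string[] = ["verbose", "debug", "info", "warn", "error"];

// Hypothetical helper: turns an environment-like object into a validated
// configuration, applying the defaults documented in the Configuration section.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const googleApiKey = env["GOOGLE_API_KEY"];
  if (!googleApiKey) {
    throw new Error("GOOGLE_API_KEY is required");
  }

  const transportType = env["TRANSPORT_TYPE"] ?? "stdio";
  if (!TRANSPORTS.includes(transportType)) {
    throw new Error(`Unsupported TRANSPORT_TYPE: ${transportType}`);
  }

  // Assumption: any integer in the valid TCP port range is accepted.
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env["PORT"]}`);
  }

  const logLevel = env["LOG_LEVEL"] ?? "info";
  if (!LOG_LEVELS.includes(logLevel)) {
    throw new Error(`Unsupported LOG_LEVEL: ${logLevel}`);
  }

  return {
    googleApiKey,
    transportType: transportType as TransportType,
    port,
    logLevel: logLevel as LogLevel,
  };
}

// Mirrors the SSE start command: only the key and transport are set,
// so PORT and LOG_LEVEL fall back to their defaults.
const config = loadConfig({ GOOGLE_API_KEY: "your_api_key", TRANSPORT_TYPE: "sse" });
console.log(config);
```

In a real process you would pass `process.env` instead of an object literal; taking the environment as a parameter keeps the function easy to test.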

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/mario-andreschak/mcp_video_recognition'
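
The same request can be made from TypeScript on Node.js 18 or higher, which ships a global `fetch`. The sketch below is a rough equivalent of the curl command; `serverInfoUrl` and `fetchServerInfo` are hypothetical helper names, and it assumes the endpoint responds with JSON. The response schema is not documented here, so the result is left as `unknown`.

```typescript
// Base URL taken from the curl example above.
const MCP_API_BASE = "https://glama.ai/api/mcp/v1/servers";

// Hypothetical helper: builds the directory URL for one server.
function serverInfoUrl(owner: string, repo: string): string {
  return `${MCP_API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Hypothetical helper: performs the same GET request as the curl command.
// Assumes a JSON response body; the caller is responsible for narrowing
// the `unknown` result to whatever shape the API actually returns.
async function fetchServerInfo(owner: string, repo: string): Promise<unknown> {
  const response = await fetch(serverInfoUrl(owner, repo));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Calling `fetchServerInfo("mario-andreschak", "mcp_video_recognition")` issues the request shown in the curl example.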
