226,405 tools. Last updated 2026-06-23 02:15

"A server for enabling an AI agent to view and analyze video files" matching MCP tools:

create_flow
Avocado AI
Create a new Avocado AI Flow pre-built with a node-graph pipeline, and return its id and direct URL so the user can open it on the canvas. You design the whole pipeline: pass the nodes and edges and the server validates socket compatibility, aligns video models to the input shape, lays the graph out left-to-right, and adds a caption per step. Edges reference nodes by 0-based index in the `nodes` array. This creates (does not run) the flow — the user runs it from the editor. Use the capability map below to choose node types, models, and handles: You are Avo, a senior creative-workflow designer inside Avocado AI's Flow editor. The user describes a creative goal; you respond with a node-graph proposal that the editor previews on the canvas. Think like a production director: design the FULL pipeline needed to get a polished result, not the minimum number of nodes. DESIGN PRINCIPLES — build capable, complete pipelines: - Match the pipeline's ambition to the request. A throwaway test is 2-3 nodes; a real deliverable (an ad, a UGC video, a product shot, a music video) is usually 5-12 nodes. Use up to 24 when it genuinely helps. - Prefer multi-stage quality: generate → refine (imageEditor) → upscale → animate, rather than a single generate node. Add an upscale step before any final image/video deliverable. - Use BRANCHING and FAN-OUT. One output can feed many nodes: e.g. one hero image → three different video models for variations the user can pick from; one script → both a voiceover and the video prompt. - Use PARALLEL TRACKS that converge: e.g. a voice track and an image track both feeding a lip-sync video; or a music track plus a visuals track. - Use the `llm` node to do creative thinking inside the graph — write or expand a script, brainstorm a prompt, turn a rough idea into a detailed image/video prompt — then wire its text output into the next node. - Pick the BEST model for each step (see the menus below). Don't leave everything on defaults — choosing models is a big part of the value. - Set per-node settings (aspect ratio, resolution, duration, voice, variations) when the request implies them (e.g. 'vertical' → 9:16, 'short' → duration 5, '3 options' → variations 3 or three branches). HARD RULES: - Use only the node types listed below. Never invent new ones. - Every edge must connect compatible socket types (text→text, image→image, audio→audio, video→video). - Give every runnable node a short `stepLabel` ('Step N — …') — it renders as a caption beneath that node. - `stickyNote` is only for standalone notes; never use it to caption a node (use `stepLabel`). Optionally add ONE stickyNote describing the workflow. - Any schema field you don't need must be `null` (numbers like `variations` too). MODEL MENUS (set the node's `model` to one of these ids): image (text-to-image) — `model` ids: • fal-ai/nano-banana-2 — fast, strong all-rounder (default) • fal-ai/gpt-image-2 — best instruction-following & legible text • fal-ai/bytedance/seedream/v5/lite/text-to-image — photoreal • fal-ai/flux-pro/v1.1-ultra — high detail / fidelity • fal-ai/nano-banana-pro — premium quality • fal-ai/recraft/v4/text-to-image — design, brand, vector-style • fal-ai/ideogram/v3 — posters & typography imageEditor (image + prompt → edited image) — `model` ids: • fal-ai/nano-banana-2/edit — default, multi-image (up to 14 inputs) • openai/gpt-image-2/edit — precise instruction edits • fal-ai/bytedance/seedream/v5/lite/edit — photoreal edits • fal-ai/flux-pro/kontext/max/text-to-image — style / context transfer • fal-ai/gemini-25-flash-image/edit — fast edits (the `image` input accepts MULTIPLE connections for compositing/restyle) imageUpscale (image → larger image) — `model` ids: • fal-ai/topaz/upscale/image — best quality (default) • fal-ai/recraft-crisp-upscale, fal-ai/clarity-upscaler, fal-ai/crystal-upscaler llm (text → text) — `model` ids: claude-haiku (default), gpt-4o-mini, kimi-k2, seed-1.8. Put the instruction in `prompt`. voice (text → speech) — pick a `voice` by name: Sarah (cheerful), Roger (deep), Laura (soft), Charlie (warm), George (bold), Callum (energetic), River (calm), Liam (reliable). The script comes from an upstream text/llm node wired into `in` — do NOT put the script in the voice node's prompt. music (text → music) — set `duration` to one of 30,60,90,120,180,240,300 (seconds). Put the music description in `prompt`. videoUpscale (video → sharper video) — add after a video node for final deliverables. No model field. VIDEO node — choose `model` to match the input shape (it drives which input handles the node renders): • Text → video: `kling3-pro`, `sora-2`, `veo3-1-fast`, `seedance-2.0-t2v`. Wire text to `prompt`. • Image → video (I2V): `veo3-1-fast`, `kling3-pro`, `seedance-2.0-i2v`, `hailuo-pro`. Wire the image to `image`. For keyframe models (`kling-o1`, `veo3-1`) wire `start-frame` + `end-frame`. • Lip-sync / talking-head: `fabric` (image + audio, NO prompt — never wire text into Fabric) or `infinitalk` (prompt + image + audio). Wire audio to `audio`. Audio-over-stills narration: `ltx2-audio`. • Multi-image reference / character consistency: `vidu` (≤7), `veo3-1-ref` (≤10), `kling-elements` (2-4 ordered frames), `happy-horse-ref` (≤9). Wire EACH image to the SAME `ref-images` handle (it accepts multiple connections). Never use the plain `image` handle. • Seedance reference (image + video + audio refs): `seedance-2.0-ref` / `seedance-2.0-ref-fast`. Wire to `ref-images` / `ref-videos` / `ref-audio`. • Motion control (drive a character with a motion video): `kling3-motion-control`. Wire character to `image`, motion clip (videoUpload) to `motion-video`. Edge handle hints: - When the target has multiple typed inputs (Video, Image Editor), set `toHandle` explicitly (`prompt`, `image`, `audio`, `ref-images`, `start-frame`, `end-frame`, `motion-video`). The editor otherwise picks the first type-compatible handle, which may be the wrong slot. - Never wire text into Fabric. Never wire a single image into a multi-ref model's `image` slot — use `ref-images`. Available node types (id — purpose — inputs / outputs): - text — Prompt — in: in<text> | out: out<text> - llm — LLM — in: in<text> | out: out<text> - upload — Upload — in: — | out: out<image> - videoUpload — Video Upload — in: — | out: out<video> - image — Image — in: in<text> | out: out<image> - imageEditor — Image Editor — in: prompt<text>, image<image> | out: out<image> - imageUpscale — Image Upscale — in: image<image> | out: out<image> - video — Video — in: prompt<text>, image<image>, start-frame<image>, end-frame<image>, ref-images<image>, ref-videos<video>, ref-audio<audio>, audio<audio>, motion-video<video> | out: out<video> - videoUpscale — Video Upscale — in: video<video> | out: out<video> - voice — Voice — in: in<text> | out: out<audio> - music — Music — in: in<text> | out: out<audio> - stickyNote — Sticky Note — in: in<annotation> | out: out<annotation> Edges reference nodes by index in the `nodes` array (0-based). In the examples below, any field not shown is `null`. EXAMPLES — study the PATTERNS (multi-stage, fan-out, parallel tracks), copy the handle names exactly: Example 1 — UGC talking-head with scripted voice + final upscale: nodes=[ {type:"llm",stepLabel:"Step 1 — Write a punchy 15s script",prompt:"Write a 15-second energetic UGC script for the product.",model:"claude-haiku"}, {type:"voice",stepLabel:"Step 2 — Voiceover",voice:"George"}, {type:"upload",stepLabel:"Step 3 — Upload character photo"}, {type:"video",stepLabel:"Step 4 — Lip-sync video",model:"fabric"}, {type:"videoUpscale",stepLabel:"Step 5 — Upscale to deliver"} ] edges=[ {fromIndex:0,toIndex:1,fromHandle:"out",toHandle:"in"}, {fromIndex:1,toIndex:3,fromHandle:"out",toHandle:"audio"}, {fromIndex:2,toIndex:3,fromHandle:"out",toHandle:"image"}, {fromIndex:3,toIndex:4,fromHandle:"out",toHandle:"video"} ] Example 2 — Text → image → refine → upscale (quality chain): nodes=[ {type:"text",stepLabel:"Step 1 — Prompt",prompt:"A cinematic product shot of a matte-black bottle on wet stone, golden hour"}, {type:"image",stepLabel:"Step 2 — Generate hero",model:"fal-ai/flux-pro/v1.1-ultra",aspectRatio:"4:3"}, {type:"imageEditor",stepLabel:"Step 3 — Add brand label",prompt:"Add a minimal embossed logo on the bottle",model:"fal-ai/nano-banana-2/edit"}, {type:"imageUpscale",stepLabel:"Step 4 — Upscale",model:"fal-ai/topaz/upscale/image"} ] edges=[ {fromIndex:0,toIndex:1,fromHandle:"out",toHandle:"in"}, {fromIndex:1,toIndex:2,fromHandle:"out",toHandle:"image"}, {fromIndex:2,toIndex:3,fromHandle:"out",toHandle:"image"} ] Example 3 — Fan-out: one image → three video variations (different models): nodes=[ {type:"upload",stepLabel:"Step 1 — Source image"}, {type:"text",stepLabel:"Step 2 — Motion brief",prompt:"Slow cinematic push-in, gentle parallax"}, {type:"video",stepLabel:"Variation A — Veo",model:"veo3-1-fast",aspectRatio:"9:16",duration:"5"}, {type:"video",stepLabel:"Variation B — Kling",model:"kling3-pro",aspectRatio:"9:16",duration:"5"}, {type:"video",stepLabel:"Variation C — Seedance",model:"seedance-2.0-i2v",aspectRatio:"9:16",duration:"5"} ] edges=[ {fromIndex:0,toIndex:2,fromHandle:"out",toHandle:"image"}, {fromIndex:0,toIndex:3,fromHandle:"out",toHandle:"image"}, {fromIndex:0,toIndex:4,fromHandle:"out",toHandle:"image"}, {fromIndex:1,toIndex:2,fromHandle:"out",toHandle:"prompt"}, {fromIndex:1,toIndex:3,fromHandle:"out",toHandle:"prompt"}, {fromIndex:1,toIndex:4,fromHandle:"out",toHandle:"prompt"} ] Example 4 — Multi-image reference video (character consistency): nodes=[ {type:"upload",stepLabel:"Ref 1 — Character front"}, {type:"upload",stepLabel:"Ref 2 — Character side"}, {type:"upload",stepLabel:"Ref 3 — Outfit detail"}, {type:"text",stepLabel:"Scene prompt",prompt:"The character walks through a neon market at night"}, {type:"video",stepLabel:"Generate with refs",model:"veo3-1-ref",aspectRatio:"16:9"} ] edges=[ {fromIndex:0,toIndex:4,fromHandle:"out",toHandle:"ref-images"}, {fromIndex:1,toIndex:4,fromHandle:"out",toHandle:"ref-images"}, {fromIndex:2,toIndex:4,fromHandle:"out",toHandle:"ref-images"}, {fromIndex:3,toIndex:4,fromHandle:"out",toHandle:"prompt"} ] Example 5 — Music video: parallel music + visuals tracks converging: nodes=[ {type:"music",stepLabel:"Track 1 — Score",prompt:"Dreamy lo-fi beat, 90 BPM",duration:"60"}, {type:"text",stepLabel:"Track 2 — Scene",prompt:"A lone astronaut drifting past a glowing planet"}, {type:"image",stepLabel:"Keyframe",model:"fal-ai/nano-banana-pro",aspectRatio:"16:9"}, {type:"video",stepLabel:"Animate",model:"ltx2-audio",aspectRatio:"16:9"} ] edges=[ {fromIndex:1,toIndex:2,fromHandle:"out",toHandle:"in"}, {fromIndex:2,toIndex:3,fromHandle:"out",toHandle:"image"}, {fromIndex:0,toIndex:3,fromHandle:"out",toHandle:"audio"} ] Return only the structured object — no prose, no markdown.
Connector
transcribe_media
Botverse
Transcribe an already-uploaded video/audio file (from get_upload_url) into a speaker-labelled transcript. Same one-call pipeline and options as transcribe_from_url (attendee naming, srt/vtt, formatted docx/pdf). Use for local files or files larger than a URL fetch allows (up to 2 GB). CONSENT: you must have all parties' consent. Poll get_job_status (live stage) until complete, then get_download_url / get_output_content. ~$5 per hour of audio.
Connector
analyze_files
AXIS Toolbox — Agentic Commerce Codebase Intelligence
Analyze source files directly and generate the full 137-artifact AXIS bundle without using GitHub. Returns snapshot_id plus artifact listing; use this for local, generated, or unsaved code. Requires Authorization: Bearer <api_key>. Use analyze_repo for GitHub URLs or improve_my_agent_with_axis for recommendation-first agent hardening.
Connector
improve_my_agent_with_axis
AXIS Toolbox — Agentic Commerce Codebase Intelligence
Analyze an agent codebase and return a prioritized AXIS hardening plan. Requires Authorization: Bearer <api_key>; this creates a snapshot and may return auth, quota, file-limit, or validation errors. Example: pass your agent source files to see missing AGENTS.md, CLAUDE.md, and MCP config gaps. Use this when you want recommendations and missing-context detection. Use analyze_files instead when you want the full artifact bundle directly.
Connector
publish_draft
prxhub
Single-call publish by draft_id. Build the draft with start_draft → add_sources → add_claims → set_synthesis, then call publish_draft({ draft_id }). The server compiles, signs, uploads, and returns the published bundle URL. Requires an authenticated agent account — register via register_agent + register_agent_poll first if your MCP session isn't already bound to an agent. Bundle size cap is 50 MB. prxhub signs a server-side agent attestation into `attestations/agent.<keyId>.sig.json` inside the stored tarball, so verifiers can confirm the bundle was published by this agent without trusting client-side crypto.
Connector
fvs_download_final_video
Future Video Studio
Download a completed Future Video Studio final render URL to a local file. Use this only after fvs_get_render_status or fvs_get_paid_render_status returns a final_video_url for a completed render. The tool performs an unauthenticated HTTPS GET to that signed URL and writes the response bytes to output_path on the MCP server's local filesystem. It does not call the FVS Agent API, spend wallet credits, require FVS_AGENT_API_KEY, cancel jobs, or modify remote render state. Side effects and constraints: output_path is a local filesystem path for the MCP server process, parent directories are created, existing files are not replaced unless overwrite is true, and large videos may take minutes to download. The request timeout is 600 seconds. Use a fresh status check to refresh expired signed URLs, and do not pass arbitrary or untrusted URLs.
Connector

Matching MCP Servers

Street View MCP
Location Services Developer Tools
vlad-ds
A
license
A
quality
D
maintenance
A server that enables AI models to fetch and display Google Street View imagery, allowing users to create virtual tours by viewing streets and landmarks from anywhere.
Last updated 2025-05-14
4
7
MIT
Video to Text MCP Server
Multimedia Processing Audio Processing Speech Processing
strzhao
A
license
B
quality
D
maintenance
Enables downloading videos from platforms like YouTube and converting them to text using OpenAI Whisper and ffmpeg. It supports multiple output formats including TXT, JSON, SRT, and VTT for transcriptions.
Last updated 2026-01-13
2
2
ISC

Matching MCP Connectors

Agent to Agent communication
Connect any two AI agents and let them talk directly. Claude to Claude, Claude to OpenAI, or any MCP-compatible agent
ctaylor86-mcp-video-download-serverOAuth
Connect your video workflows to cloud storage. Organize and access video assets across projects wi…

intel_agent
TunnelMind Data API
Probes a domain for known AI agent integration signals: `llms.txt`, `ai.txt`, `/.well-known/ai-plugin.json`, `openapi.json`, `swagger.json`, MCP manifest, MCP SSE endpoint. Returns a score based on the count of signals detected. Use this to assess whether a domain is ready for agent-to-agent interaction. Use this tool when: - You want to know whether a domain exposes an MCP server or OpenAPI spec for agents. - You are cataloguing the AI-agent-ready surface of a set of domains. - You need to decide whether to attempt programmatic API access to a domain. Do NOT use this tool when: - You need tracker/surveillance data about the domain — use `get_domain` instead. - You need the robots.txt AI crawler policy — use `intel_robots` instead. - You need HTTP security posture — use `intel_http` instead. Inputs: - `domain` (query, required): Domain to probe. Returns: - Boolean flags per signal (`llms_txt`, `ai_plugin`, `openapi`, `mcp_manifest`, `mcp_endpoint`, `mcp_sse`). - `agent_surface_score`: integer 0-8, count of signals detected. Cost: - Free. No API key required. Latency: - Typical: 2-5s (parallel probes), p99: 8s.
Connector
generate_video
Avocado AI
Generate an AI video. Thirteen models: seedance-2.0-t2v / -t2v-fast (text only), seedance-2.0-i2v / -i2v-fast (REQUIRE an image), seedance-2.0-ref / -ref-fast (REFERENCE-to-video: locks character/style across generations from reference images — pass reference_image_urls and/or reference_file_ids; ideal for keeping a Storyboard Studio character consistent), kling3-standard (720p, 5-10s), kling3-pro (1080p, 5-10s), kling3-4k & kling-o3-4k (4K, 3-15s; all four Kling 3.x variants support BOTH text-to-video and image-to-video — supplying image_url or file_id automatically picks image mode), grok-imagine-video-v1-5 (480p/720p, 1-15s, REQUIRES an image — image-to-video only), happy-horse-t2v (Happy Horse text-to-video, 720p/1080p, 3-15s, with native audio + lip-sync), happy-horse-i2v (Happy Horse image-to-video, REQUIRES an image, 720p/1080p, 3-15s). For image-to-video on any host: call prepare_image_upload first, then pass the returned file_id here. Renders take 2-10 minutes; the inline result card polls for completion. Pricing is per-second, varies by model and resolution.
Connector
assets.list
AI Design Blueprint
Public — list downloadable doctrine and agent asset artifacts (skill packs, rule packs, MCP setup snippets) the user can drop into their AI coding tool to import the Blueprint as native skill/rule files. Returns a list of assets with name, format (one of: zip / md / markdown / mdc / json / toml / text — the full vocabulary), pack_version, download_url, and platform target (Claude Code, Cursor, Codex, Gemini, Qwen). The response also carries `count` (length of `assets`) for symmetry with principles.list / clusters.list / guides.list. WHEN TO CALL: the user asks how to bring the Blueprint into their coding agent, or wants to install it as a local skill/rule file. WHEN NOT TO CALL: for the live MCP tools themselves — those are already available through this server. For doctrine content, prefer principles.list/get and guides.list/get. BEHAVIOR: read-only, idempotent, no auth required. Asset artefacts are regenerated on every deploy from the canonical doctrine.
Connector
files_read
DialogBrain
Read **text content** of an attached file. Works for: .txt, .md, .json, code files, and PDFs (after files.ingest extracts text). DO NOT call on binary files — for IMAGES use `files.get_base64`, for AUDIO/VIDEO it cannot be transcribed via this tool, and for non-PDF DOCUMENTS run `files.ingest` first, THEN files.read. Calling on a binary mime-type returns an error — saves you a turn to read the routing hint before deciding.
Connector
files_read
dialogbrain
Read **text content** of an attached file. Works for: .txt, .md, .json, code files, and PDFs (after files.ingest extracts text). DO NOT call on binary files — for IMAGES use `files.get_base64`, for AUDIO/VIDEO it cannot be transcribed via this tool, and for non-PDF DOCUMENTS run `files.ingest` first, THEN files.read. Calling on a binary mime-type returns an error — saves you a turn to read the routing hint before deciding.
Connector
get_upload_url
Botverse
Get a presigned PUT URL to upload any file — video, audio, or document (markdown, HTML, DOCX, etc.). The URL expires in 15 minutes. PUT raw file bytes directly to the URL. After upload, pass the object_key to transcode_video (for video) or convert_file (for documents). IMPORTANT: this flow needs direct outbound network access to Botverse's S3 bucket. In sandboxed agent environments (claude.ai, sandboxed desktop apps, Cursor) that route traffic through a proxy allowlist, the PUT is blocked and the upload fails. In those environments do NOT use this tool — use convert_content or transcode_content (inline content, body under 500 KB) for files you already have, or convert_from_url / transcode_from_url for anything available at a public URL. Neither needs an upload step.
Connector
create_upload_session
SendIt
Create a browser upload link for media files. ALWAYS use this when the user shares an image or video in chat — their file is local and cannot be passed directly to publish_content. WORKFLOW: 1. Call this tool to get an uploadUrl 2. Give the user the link to open in their browser and upload their file 3. After upload, call get_upload_session to get the public media URL(s) 4. Use the returned URL with publish_content or schedule_content Supports up to 20 files per session. Expires in 15 minutes.
Connector
ask_video
Reka
Ask a question about one or more videos with visual analysis. Most effective on focused time ranges — use start/end to specify the segment to analyze. BEFORE calling this tool, read the reka://docs/guide resource for recommended workflows. In most cases, you should first: - search_videos to find WHEN something happens, then pass those timestamps here as start/end - segment_video to detect and locate specific objects - get_transcript to read what was said For single-video questions, pass video_id with start/end. For cross-video questions, pass videos — a list of video references with start/end each. For follow-up questions, pass conversation_id from the previous response. You can add start/end to drill into a specific moment while keeping the conversation context. Requires qa_only or full pipeline.
Connector
videos_generate
dialogbrain
Generate a short video (5-10s) from a text prompt using BytePlus Seedance. Optionally accepts up to 12 image file IDs from the user's attached files (visible in the [ATTACHMENTS] block) as `reference_file_ids` for style and composition. Returns immediately with a job_id; the video is delivered back via continuation when the job completes (~30-90s for fast model, ~2-5min for pro). Reference images are temporarily re-hosted on a third-party CDN (imgbb) for the duration of generation and deleted on completion — don't submit confidential references. Gated behind a workspace opt-in flag.
Connector
send
RogerThat
Send a message to another agent on the channel you joined, or to 'all' to broadcast. Requires a prior join() in this session. The 'to' field accepts: a callsign ('front'), an index ('#1' or '1') from roster(), or 'all'. If omitted, defaults to 'all' (broadcast — walkie-talkie default). Optional `priority` tags urgency (min|low|default|high|urgent). Optional `suggested_replies` hints up to 4 canned replies that human-in-the-loop UIs (like the /remote phone view) render as tappable chips — agent receivers can read them too and pick one. Optional `attachments` carries up to 4 small inline files (≤512KB base64 total) — designed for sporadic screenshots / PDFs; bigger files should be hosted externally and pasted as a URL. Optional `kind`: set 'status' to send an ephemeral 'working on it' signal instead of a normal message (see the `kind` field).
Connector
content_catalog
mcp-knowledge
Browse the Gapup gold-standard content catalogue — video games, films, TV series and music. Returns franchises with their works (title, release year). When to use this tool: an agent needs structured, audited metadata for a cultural franchise, wants to resolve a title to a canonical entity, or browses a domain's catalogue before requesting enrichment. Inputs: a content domain and an optional case-insensitive name filter. Each franchise id can be passed to content_enrichment for its fine-grained tag profile.
Connector
malware_get_cross_server_routes
website-search
Get Lenny Zeltser's Malware cross-server handoff routes — when this MCP server can't fulfill a request, which other MCP servers (or fallback workflows) to consult. Surfaces a compact subset of `malware_load_context`. This server never requests your sample, analysis notes, or indicators and instructs your AI to keep them local—guidelines and the report template flow to your AI for local analysis.
Connector
assessment_review_report
website-search
Get Lenny Zeltser's expert criteria for reviewing an existing security assessment report or brief. Surfaces the 17 info-assessment review items across five groups (Key Takeaways, Assessment Scope, Prioritized Findings, Remediation Suggestions, Assessment Methodology), cross-cutting criteria, the risk-adjusted severity model, anti-patterns, and a pointer to rating_score_writing for a numeric score. This server never requests your assessment notes or report and instructs your AI to keep them local—the templates and guidelines flow to your AI for local analysis.
Connector
assessment_get_cross_server_routes
website-search
Get Lenny Zeltser's Security Assessment cross-server handoff routes — when this MCP server can't fulfill a request, which other MCP servers (or fallback workflows) to consult. Surfaces a compact subset of `assessment_load_context`. This server never requests your assessment notes or report and instructs your AI to keep them local—the templates and guidelines flow to your AI for local analysis.
Connector