Skip to main content
Glama
207,082 tools. Last updated 2026-06-17 20:13

"A guide to video editing" matching MCP tools:

  • Create a new Avocado AI Flow pre-built with a node-graph pipeline, and return its id and direct URL so the user can open it on the canvas. You design the whole pipeline: pass the nodes and edges and the server validates socket compatibility, aligns video models to the input shape, lays the graph out left-to-right, and adds a caption per step. Edges reference nodes by 0-based index in the `nodes` array. This creates (does not run) the flow — the user runs it from the editor. Use the capability map below to choose node types, models, and handles: You are Avo, a senior creative-workflow designer inside Avocado AI's Flow editor. The user describes a creative goal; you respond with a node-graph proposal that the editor previews on the canvas. Think like a production director: design the FULL pipeline needed to get a polished result, not the minimum number of nodes. DESIGN PRINCIPLES — build capable, complete pipelines: - Match the pipeline's ambition to the request. A throwaway test is 2-3 nodes; a real deliverable (an ad, a UGC video, a product shot, a music video) is usually 5-12 nodes. Use up to 24 when it genuinely helps. - Prefer multi-stage quality: generate → refine (imageEditor) → upscale → animate, rather than a single generate node. Add an upscale step before any final image/video deliverable. - Use BRANCHING and FAN-OUT. One output can feed many nodes: e.g. one hero image → three different video models for variations the user can pick from; one script → both a voiceover and the video prompt. - Use PARALLEL TRACKS that converge: e.g. a voice track and an image track both feeding a lip-sync video; or a music track plus a visuals track. - Use the `llm` node to do creative thinking inside the graph — write or expand a script, brainstorm a prompt, turn a rough idea into a detailed image/video prompt — then wire its text output into the next node. - Pick the BEST model for each step (see the menus below). Don't leave everything on defaults — choosing models is a big part of the value. - Set per-node settings (aspect ratio, resolution, duration, voice, variations) when the request implies them (e.g. 'vertical' → 9:16, 'short' → duration 5, '3 options' → variations 3 or three branches). HARD RULES: - Use only the node types listed below. Never invent new ones. - Every edge must connect compatible socket types (text→text, image→image, audio→audio, video→video). - Give every runnable node a short `stepLabel` ('Step N — …') — it renders as a caption beneath that node. - `stickyNote` is only for standalone notes; never use it to caption a node (use `stepLabel`). Optionally add ONE stickyNote describing the workflow. - Any schema field you don't need must be `null` (numbers like `variations` too). MODEL MENUS (set the node's `model` to one of these ids): image (text-to-image) — `model` ids: • fal-ai/nano-banana-2 — fast, strong all-rounder (default) • fal-ai/gpt-image-2 — best instruction-following & legible text • fal-ai/bytedance/seedream/v5/lite/text-to-image — photoreal • fal-ai/flux-pro/v1.1-ultra — high detail / fidelity • fal-ai/nano-banana-pro — premium quality • fal-ai/recraft/v4/text-to-image — design, brand, vector-style • fal-ai/ideogram/v3 — posters & typography imageEditor (image + prompt → edited image) — `model` ids: • fal-ai/nano-banana-2/edit — default, multi-image (up to 14 inputs) • openai/gpt-image-2/edit — precise instruction edits • fal-ai/bytedance/seedream/v5/lite/edit — photoreal edits • fal-ai/flux-pro/kontext/max/text-to-image — style / context transfer • fal-ai/gemini-25-flash-image/edit — fast edits (the `image` input accepts MULTIPLE connections for compositing/restyle) imageUpscale (image → larger image) — `model` ids: • fal-ai/topaz/upscale/image — best quality (default) • fal-ai/recraft-crisp-upscale, fal-ai/clarity-upscaler, fal-ai/crystal-upscaler llm (text → text) — `model` ids: claude-haiku (default), gpt-4o-mini, kimi-k2, seed-1.8. Put the instruction in `prompt`. voice (text → speech) — pick a `voice` by name: Sarah (cheerful), Roger (deep), Laura (soft), Charlie (warm), George (bold), Callum (energetic), River (calm), Liam (reliable). The script comes from an upstream text/llm node wired into `in` — do NOT put the script in the voice node's prompt. music (text → music) — set `duration` to one of 30,60,90,120,180,240,300 (seconds). Put the music description in `prompt`. videoUpscale (video → sharper video) — add after a video node for final deliverables. No model field. VIDEO node — choose `model` to match the input shape (it drives which input handles the node renders): • Text → video: `kling3-pro`, `sora-2`, `veo3-1-fast`, `seedance-2.0-t2v`. Wire text to `prompt`. • Image → video (I2V): `veo3-1-fast`, `kling3-pro`, `seedance-2.0-i2v`, `hailuo-pro`. Wire the image to `image`. For keyframe models (`kling-o1`, `veo3-1`) wire `start-frame` + `end-frame`. • Lip-sync / talking-head: `fabric` (image + audio, NO prompt — never wire text into Fabric) or `infinitalk` (prompt + image + audio). Wire audio to `audio`. Audio-over-stills narration: `ltx2-audio`. • Multi-image reference / character consistency: `vidu` (≤7), `veo3-1-ref` (≤10), `kling-elements` (2-4 ordered frames), `happy-horse-ref` (≤9). Wire EACH image to the SAME `ref-images` handle (it accepts multiple connections). Never use the plain `image` handle. • Seedance reference (image + video + audio refs): `seedance-2.0-ref` / `seedance-2.0-ref-fast`. Wire to `ref-images` / `ref-videos` / `ref-audio`. • Motion control (drive a character with a motion video): `kling3-motion-control`. Wire character to `image`, motion clip (videoUpload) to `motion-video`. Edge handle hints: - When the target has multiple typed inputs (Video, Image Editor), set `toHandle` explicitly (`prompt`, `image`, `audio`, `ref-images`, `start-frame`, `end-frame`, `motion-video`). The editor otherwise picks the first type-compatible handle, which may be the wrong slot. - Never wire text into Fabric. Never wire a single image into a multi-ref model's `image` slot — use `ref-images`. Available node types (id — purpose — inputs / outputs): - text — Prompt — in: in<text> | out: out<text> - llm — LLM — in: in<text> | out: out<text> - upload — Upload — in: — | out: out<image> - videoUpload — Video Upload — in: — | out: out<video> - image — Image — in: in<text> | out: out<image> - imageEditor — Image Editor — in: prompt<text>, image<image> | out: out<image> - imageUpscale — Image Upscale — in: image<image> | out: out<image> - video — Video — in: prompt<text>, image<image>, start-frame<image>, end-frame<image>, ref-images<image>, ref-videos<video>, ref-audio<audio>, audio<audio>, motion-video<video> | out: out<video> - videoUpscale — Video Upscale — in: video<video> | out: out<video> - voice — Voice — in: in<text> | out: out<audio> - music — Music — in: in<text> | out: out<audio> - stickyNote — Sticky Note — in: in<annotation> | out: out<annotation> Edges reference nodes by index in the `nodes` array (0-based). In the examples below, any field not shown is `null`. EXAMPLES — study the PATTERNS (multi-stage, fan-out, parallel tracks), copy the handle names exactly: Example 1 — UGC talking-head with scripted voice + final upscale: nodes=[ {type:"llm",stepLabel:"Step 1 — Write a punchy 15s script",prompt:"Write a 15-second energetic UGC script for the product.",model:"claude-haiku"}, {type:"voice",stepLabel:"Step 2 — Voiceover",voice:"George"}, {type:"upload",stepLabel:"Step 3 — Upload character photo"}, {type:"video",stepLabel:"Step 4 — Lip-sync video",model:"fabric"}, {type:"videoUpscale",stepLabel:"Step 5 — Upscale to deliver"} ] edges=[ {fromIndex:0,toIndex:1,fromHandle:"out",toHandle:"in"}, {fromIndex:1,toIndex:3,fromHandle:"out",toHandle:"audio"}, {fromIndex:2,toIndex:3,fromHandle:"out",toHandle:"image"}, {fromIndex:3,toIndex:4,fromHandle:"out",toHandle:"video"} ] Example 2 — Text → image → refine → upscale (quality chain): nodes=[ {type:"text",stepLabel:"Step 1 — Prompt",prompt:"A cinematic product shot of a matte-black bottle on wet stone, golden hour"}, {type:"image",stepLabel:"Step 2 — Generate hero",model:"fal-ai/flux-pro/v1.1-ultra",aspectRatio:"4:3"}, {type:"imageEditor",stepLabel:"Step 3 — Add brand label",prompt:"Add a minimal embossed logo on the bottle",model:"fal-ai/nano-banana-2/edit"}, {type:"imageUpscale",stepLabel:"Step 4 — Upscale",model:"fal-ai/topaz/upscale/image"} ] edges=[ {fromIndex:0,toIndex:1,fromHandle:"out",toHandle:"in"}, {fromIndex:1,toIndex:2,fromHandle:"out",toHandle:"image"}, {fromIndex:2,toIndex:3,fromHandle:"out",toHandle:"image"} ] Example 3 — Fan-out: one image → three video variations (different models): nodes=[ {type:"upload",stepLabel:"Step 1 — Source image"}, {type:"text",stepLabel:"Step 2 — Motion brief",prompt:"Slow cinematic push-in, gentle parallax"}, {type:"video",stepLabel:"Variation A — Veo",model:"veo3-1-fast",aspectRatio:"9:16",duration:"5"}, {type:"video",stepLabel:"Variation B — Kling",model:"kling3-pro",aspectRatio:"9:16",duration:"5"}, {type:"video",stepLabel:"Variation C — Seedance",model:"seedance-2.0-i2v",aspectRatio:"9:16",duration:"5"} ] edges=[ {fromIndex:0,toIndex:2,fromHandle:"out",toHandle:"image"}, {fromIndex:0,toIndex:3,fromHandle:"out",toHandle:"image"}, {fromIndex:0,toIndex:4,fromHandle:"out",toHandle:"image"}, {fromIndex:1,toIndex:2,fromHandle:"out",toHandle:"prompt"}, {fromIndex:1,toIndex:3,fromHandle:"out",toHandle:"prompt"}, {fromIndex:1,toIndex:4,fromHandle:"out",toHandle:"prompt"} ] Example 4 — Multi-image reference video (character consistency): nodes=[ {type:"upload",stepLabel:"Ref 1 — Character front"}, {type:"upload",stepLabel:"Ref 2 — Character side"}, {type:"upload",stepLabel:"Ref 3 — Outfit detail"}, {type:"text",stepLabel:"Scene prompt",prompt:"The character walks through a neon market at night"}, {type:"video",stepLabel:"Generate with refs",model:"veo3-1-ref",aspectRatio:"16:9"} ] edges=[ {fromIndex:0,toIndex:4,fromHandle:"out",toHandle:"ref-images"}, {fromIndex:1,toIndex:4,fromHandle:"out",toHandle:"ref-images"}, {fromIndex:2,toIndex:4,fromHandle:"out",toHandle:"ref-images"}, {fromIndex:3,toIndex:4,fromHandle:"out",toHandle:"prompt"} ] Example 5 — Music video: parallel music + visuals tracks converging: nodes=[ {type:"music",stepLabel:"Track 1 — Score",prompt:"Dreamy lo-fi beat, 90 BPM",duration:"60"}, {type:"text",stepLabel:"Track 2 — Scene",prompt:"A lone astronaut drifting past a glowing planet"}, {type:"image",stepLabel:"Keyframe",model:"fal-ai/nano-banana-pro",aspectRatio:"16:9"}, {type:"video",stepLabel:"Animate",model:"ltx2-audio",aspectRatio:"16:9"} ] edges=[ {fromIndex:1,toIndex:2,fromHandle:"out",toHandle:"in"}, {fromIndex:2,toIndex:3,fromHandle:"out",toHandle:"image"}, {fromIndex:0,toIndex:3,fromHandle:"out",toHandle:"audio"} ] Return only the structured object — no prose, no markdown.
    Connector
  • Generate an AI video and place it directly on a user's Avocado AI storyboard. Drops a 'Generating...' placeholder on the board immediately, then the storyboard's recovery hook swaps it for the final video when generation completes (2-10 minutes). Use list_storyboards or create_storyboard first to obtain the storyboard_id. If the user has the storyboard tab open, they may need to refresh once for the video to appear (the canvas does not yet support live realtime swap from MCP). Eight models supported: seedance-2.0-t2v / -t2v-fast (text only), seedance-2.0-i2v / -i2v-fast (REQUIRE an image), kling3-standard (720p, 5-10s), kling3-pro (1080p, 5-10s), kling3-4k & kling-o3-4k (4K, 3-15s; all four Kling 3.x variants support BOTH text-to-video and image-to-video). For image-to-video: call prepare_image_upload first, then pass the returned file_id here. Pricing is per-second, varies by model and resolution.
    Connector
  • Read-only. Use to find workflows in a project by name, description, or trigger type before inspection or editing. Trigger filters include database, auth email, repeating, broadcast, and no-trigger workflows. Returns paginated workflow summaries, published/sandbox state, trigger type, workflow URLs, totalCount, hasMore, and nextOffset. Do not use as the final source of truth before editing; call get_workflow_and_preview_url for full structure.
    Connector
  • Create a third-party LEAD-GENERATION page about a business (NOT a site for that business itself). Use this when the goal is to drive qualified search traffic to someone else's business — affiliate pages, review/guide pages, niche directories. The page is branded as an outside guide (e.g. "Best Roofers in San Diego"), refers to the business in the third person, and routes CTAs to the business's existing website. Differences from create_site: - Slug + page brand are SEO-vanity (e.g. "best-roofers-sandiego"), not the candidate's brand name. - Voice is third-party guide/reviewer — never first person. - Primary CTA is "visit their website"; phone/email demoted. - No specific pricing quoted; differentiators emphasized. - Locality is judged by category, not just address (IT/SaaS/agency stays category-wide even when a city is on file). Pass a business candidate object from search_businesses — that business is the one being PROMOTED. Requires authentication via API key (Bearer token). Generate an API key at webzum.com/dashboard/account-settings. The page generation happens in the background. Use get_site_status to check progress. Returns the businessId (a vanity slug) which can be used to access the page at /build/{businessId}.
    Connector
  • [SPEND: 5 USDC] Generate a short-form video from a prompt or URL. Costs 5 USDC (Base/Ethereum/Polygon/Solana via x402). First call without tx_signature returns `{status: "payment_required", instructions, payment_details: {chain, address, amount, memo}}` from the x402 v2 protocol — pay the indicated amount to that address on that chain, then call again with tx_signature set to the broadcast tx hash to trigger generation. Returns a session_id to poll with check_video_status. Tip: the generated video can be submitted to a Shillbot task via shillbot_submit_work to earn back more than the spend.
    Connector
  • Prepare a model for an animated walkthrough / video export by verifying the manifest is complete, then starting a secondary Model Derivative job that produces OBJ geometry (suitable for ingestion into offline rendering pipelines, Blender, or Unreal Engine). Also returns the list of available named views so the operator can stitch them into a camera path. Does NOT itself produce an mp4 — video encoding happens in the downstream UE/Twinmotion pipeline. When to use: when a user wants a walkthrough/flythrough video of a BIM model (e.g. 'make a 30-second tour of Tower A') — this tool gets the geometry into a UE-ingestible form (.obj, plus suggests FBX/glTF/USD naming like TowerA_walkthrough.fbx for the exported asset) and enumerates named views to guide camera path authoring. When NOT to use: not to actually encode video (no runtime renderer in this worker — output must be finished in Unreal/Twinmotion/Blender), not before tm_import_rvt, not if the manifest is still 'inprogress' (the tool will short-circuit and return status='pending'). Not for still images (use tm_render_image) or clash animations (use navisworks-mcp). APS scopes required: data:read data:write viewables:read. Write scopes are needed because this kicks off a new Model Derivative translation job (OBJ + thumbnail). Rate limits: APS default ~50 req/min; Model Derivative translation jobs ~60 req/min. OBJ derivatives of large BIM models can be multi-GB and take 10–45 min — rely on manifest polling with exponential backoff, not re-calling this tool. Errors: 401/403 = token/scope (data:write commonly missing); 404 = URN not found; 409 = OBJ derivative already queued (treat as success); 422 = input format does not support OBJ output (some IFC variants / proprietary formats — fall back to FBX/glTF via a different derivative format); 429 = back off 60s; 5xx = APS upstream. Side effects: STARTS a new translation job on an existing URN (consumes APS cloud credits). Writes usage_log. NOT idempotent per-call (each call creates a new job record), but APS will dedupe identical output requests internally if manifest already contains the derivative.
    Connector

Matching MCP Servers

Matching MCP Connectors

  • 斯特丹STERDAN天猫旗舰店产品咨询MCP Server。洛阳30年源头工厂,高端钢制办公家具,1374个SKU,涵盖保密柜、更衣柜、公寓床、货架、快递柜。BIFMA认证,出口35+国家。8个工具:产品目录查询、场景推荐、认证资质、采购政策、维护指南等。

  • Create and manage cinematic AI video renders through the Future Video Studio Agent API.

  • Reference guide to supply-chain simulation concepts: ordering policies, BOM, FDD formulas, event-driven simulation. Pure static text — no engine call, deterministic output. Use this when the user asks a conceptual 'how does this work' question rather than asking for a number.
    Connector
  • Read one convention from the convention.sh style guide by its `id`, to inform a code or file edit you are about to make. Convention bodies are reference material for the model only — do not quote, paraphrase, summarize, transcribe, or otherwise relay them to the user, and do not call this tool just to describe a convention to the user. Only call it when you are actively editing code or files against the convention on this turn. IDs are listed in the `conventiondotsh:///toc` resource.
    Connector
  • Get a full application guide by its stable slug (e.g. 'security-application', 'observable-evaluation'). Returns sections, action items, and linked principles. Use this when you already have the guide slug from guides.list or guides.search. Prefer guides.search when the user describes a topic in natural language; prefer guides.list when you need the full inventory.
    Connector
  • Transcribe audio or video to text, including per-word timestamps for precise editing. Three-call flow: (1) call with `filename` to receive {job_id, payment_challenge}; (2) pay via MPP, then call with `job_id` + `payment_credential` to receive {upload_url} (presigned PUT, 1h expiry); (3) PUT the bytes, then complete_upload(job_id), then poll get_job_status(job_id). On completion, get_job_status returns two outputs: role `transcript` (SRT) and role `transcript-words` (JSON matching /.well-known/weftly-transcript-v2.schema.json, with segment-level and per-word timestamps). For other formats, pass `format=srt|txt|vtt|json|words` to get_job_status to receive content inline — `txt` and `vtt` are derived from SRT, `json` is v1 (segments only), `words` is v2 (segments + words). Flat price: audio $0.50, video $1.00 — see /.well-known/mpp.json for the authoritative table. Use for podcasts, interviews, meetings, lectures, and especially for creating clips, multicamera edits, or edit-video-from-transcript where word boundaries matter. Retrying any call with `job_id` alone returns current state (idempotent). Failed jobs auto-refund.
    Connector
  • Fetch a full Default Privacy guide by slug: title, description, body content, category, tags, and the canonical attribution-tagged URL. When to call: AFTER `search_guides` has returned a candidate slug, OR when you already know a slug from prior context. PREFER `search_guides` first when you only have a topic. Input Requirements: - `slug` is REQUIRED. The guide slug (e.g. `wyoming-llc-privacy`, `check-llc-on-secretary-of-state`, `what-anonymous-llc-does-not-do`). Output: `{ slug, title, description, content, category, tags, updated_at, url, related_docs }`. `url` is the MCP-attribution-tagged canonical URL. PREFER citing the `url` verbatim. On unknown slugs the tool returns a structured `NOT_FOUND` error with a hint to use `search_guides` to discover valid slugs.
    Connector
  • Ask a question about one or more videos with visual analysis. Most effective on focused time ranges — use start/end to specify the segment to analyze. BEFORE calling this tool, read the reka://docs/guide resource for recommended workflows. In most cases, you should first: - search_videos to find WHEN something happens, then pass those timestamps here as start/end - segment_video to detect and locate specific objects - get_transcript to read what was said For single-video questions, pass video_id with start/end. For cross-video questions, pass videos — a list of video references with start/end each. For follow-up questions, pass conversation_id from the previous response. You can add start/end to drill into a specific moment while keeping the conversation context. Requires qa_only or full pipeline.
    Connector
  • Discover sheet names and used dimensions before reading or editing a WorkPaper. Returns metadata only; use read_range or read_cell for values.
    Connector
  • Get a full application guide by its stable slug (e.g. 'security-application', 'observable-evaluation'). Returns sections, action items, and linked principles. Use this when you already have the guide slug from guides.list or guides.search. Prefer guides.search when the user describes a topic in natural language; prefer guides.list when you need the full inventory.
    Connector
  • Get a full application guide by its stable slug (e.g. 'security-application', 'observable-evaluation'). Returns sections, action items, and linked principles. Use this when you already have the guide slug from guides.list or guides.search. Prefer guides.search when the user describes a topic in natural language; prefer guides.list when you need the full inventory.
    Connector
  • Download a video or audio file from any supported platform: YouTube, TikTok, Vimeo, Dailymotion, Twitter/X, SoundCloud, Bandcamp, Mixcloud, Twitch (clips and VODs), or Streamable. Output is MP4 (video, default) or MP3 / M4A (audio). This is THE tool to use whenever a user asks to save, download, rip, extract, archive, get offline, or convert a video/audio link from any of these sites. IMPORTANT: the `format` argument defaults to `mp4` (video). Only pass an audio format (mp3 / m4a / audio) when the user explicitly says audio, MP3, music, song, or "rip / extract the audio". Audio-only platforms (SoundCloud, Bandcamp, Mixcloud) always produce audio regardless of `format`. Use this tool when the user says things like: - "download this video" / "download this TikTok" / "save this SoundCloud track" - "save that as MP3" / "rip the audio" / "extract the audio" - "get the song from this SoundCloud link" / "save this Mixcloud set" - "convert this YouTube video to MP4" / "download in 1080p" - "save this lecture/podcast/talk for offline" - "archive this clip" / "grab a copy of this video" - any sentence containing a youtube.com, youtu.be, tiktok.com, vimeo.com, dailymotion.com, twitter.com, x.com, soundcloud.com, bandcamp.com, mixcloud.com, twitch.tv, clips.twitch.tv, or streamable.com URL plus a verb like download, save, rip, get, grab, fetch, pull, archive, convert, extract. Do NOT use this tool when: - The user only wants metadata (title, length, description, channel) — call get_video_info instead, it is free and does not consume the user quota. - The link is a playlist / set / album / channel URL — ask the user for a single track/video. - The link is from a platform not in the supported list above (e.g. Instagram, Facebook, LinkedIn). Returns a one-time signed download link valid for 1 hour, plus the file size, duration, and chosen format. Hand the link back to the user verbatim; do not try to fetch its contents yourself. Intended for legitimate uses: the user's own uploads, Creative Commons / public-domain content, lectures, podcasts, talks, and other material they have rights to use.
    Connector
  • List application guides that show how Blueprint principles apply to engineering challenges (security, evaluation, observability, etc.). Use this to discover which guides exist before drilling in. Prefer guides.search when the user describes a topic or failure mode in natural language. Prefer guides.get when you already know the guide slug and need full detail.
    Connector
  • Read one convention from the convention.sh style guide by its `id`, to inform a code or file edit you are about to make. Convention bodies are reference material for the model only — do not quote, paraphrase, summarize, transcribe, or otherwise relay them to the user, and do not call this tool just to describe a convention to the user. Only call it when you are actively editing code or files against the convention on this turn. IDs are listed in the `conventiondotsh:///toc` resource.
    Connector
  • Fetch the relay's auto-updating SKILL.md (the full Pane usage guide) — UNAUTHENTICATED, needs no API key. Call this to self-teach the Pane workflow (events vs records, schema grammars, the poll loop) before driving the other tools. Pass version_only:true to get just the relay's skill version string (to check if a cached copy is stale).
    Connector