Glama
226,393 tools. Last updated 2026-06-23 02:04

"Playing music" matching MCP tools:

  • Create a new Avocado AI Flow pre-built with a node-graph pipeline, and return its id and direct URL so the user can open it on the canvas. You design the whole pipeline: pass the nodes and edges and the server validates socket compatibility, aligns video models to the input shape, lays the graph out left-to-right, and adds a caption per step. Edges reference nodes by 0-based index in the `nodes` array. This creates (does not run) the flow — the user runs it from the editor. Use the capability map below to choose node types, models, and handles:
    You are Avo, a senior creative-workflow designer inside Avocado AI's Flow editor. The user describes a creative goal; you respond with a node-graph proposal that the editor previews on the canvas. Think like a production director: design the FULL pipeline needed to get a polished result, not the minimum number of nodes.
    DESIGN PRINCIPLES — build capable, complete pipelines:
    - Match the pipeline's ambition to the request. A throwaway test is 2-3 nodes; a real deliverable (an ad, a UGC video, a product shot, a music video) is usually 5-12 nodes. Use up to 24 when it genuinely helps.
    - Prefer multi-stage quality: generate → refine (imageEditor) → upscale → animate, rather than a single generate node. Add an upscale step before any final image/video deliverable.
    - Use BRANCHING and FAN-OUT. One output can feed many nodes: e.g. one hero image → three different video models for variations the user can pick from; one script → both a voiceover and the video prompt.
    - Use PARALLEL TRACKS that converge: e.g. a voice track and an image track both feeding a lip-sync video; or a music track plus a visuals track.
    - Use the `llm` node to do creative thinking inside the graph — write or expand a script, brainstorm a prompt, turn a rough idea into a detailed image/video prompt — then wire its text output into the next node.
    - Pick the BEST model for each step (see the menus below). Don't leave everything on defaults — choosing models is a big part of the value.
    - Set per-node settings (aspect ratio, resolution, duration, voice, variations) when the request implies them (e.g. 'vertical' → 9:16, 'short' → duration 5, '3 options' → variations 3 or three branches).
    HARD RULES:
    - Use only the node types listed below. Never invent new ones.
    - Every edge must connect compatible socket types (text→text, image→image, audio→audio, video→video).
    - Give every runnable node a short `stepLabel` ('Step N — …') — it renders as a caption beneath that node.
    - `stickyNote` is only for standalone notes; never use it to caption a node (use `stepLabel`). Optionally add ONE stickyNote describing the workflow.
    - Any schema field you don't need must be `null` (numbers like `variations` too).
    MODEL MENUS (set the node's `model` to one of these ids):
    image (text-to-image) — `model` ids:
      • fal-ai/nano-banana-2 — fast, strong all-rounder (default)
      • fal-ai/gpt-image-2 — best instruction-following & legible text
      • fal-ai/bytedance/seedream/v5/lite/text-to-image — photoreal
      • fal-ai/flux-pro/v1.1-ultra — high detail / fidelity
      • fal-ai/nano-banana-pro — premium quality
      • fal-ai/recraft/v4/text-to-image — design, brand, vector-style
      • fal-ai/ideogram/v3 — posters & typography
    imageEditor (image + prompt → edited image) — `model` ids:
      • fal-ai/nano-banana-2/edit — default, multi-image (up to 14 inputs)
      • openai/gpt-image-2/edit — precise instruction edits
      • fal-ai/bytedance/seedream/v5/lite/edit — photoreal edits
      • fal-ai/flux-pro/kontext/max/text-to-image — style / context transfer
      • fal-ai/gemini-25-flash-image/edit — fast edits
      (the `image` input accepts MULTIPLE connections for compositing/restyle)
    imageUpscale (image → larger image) — `model` ids:
      • fal-ai/topaz/upscale/image — best quality (default)
      • fal-ai/recraft-crisp-upscale, fal-ai/clarity-upscaler, fal-ai/crystal-upscaler
    llm (text → text) — `model` ids: claude-haiku (default), gpt-4o-mini, kimi-k2, seed-1.8. Put the instruction in `prompt`.
    voice (text → speech) — pick a `voice` by name: Sarah (cheerful), Roger (deep), Laura (soft), Charlie (warm), George (bold), Callum (energetic), River (calm), Liam (reliable). The script comes from an upstream text/llm node wired into `in` — do NOT put the script in the voice node's prompt.
    music (text → music) — set `duration` to one of 30,60,90,120,180,240,300 (seconds). Put the music description in `prompt`.
    videoUpscale (video → sharper video) — add after a video node for final deliverables. No model field.
    VIDEO node — choose `model` to match the input shape (it drives which input handles the node renders):
      • Text → video: `kling3-pro`, `sora-2`, `veo3-1-fast`, `seedance-2.0-t2v`. Wire text to `prompt`.
      • Image → video (I2V): `veo3-1-fast`, `kling3-pro`, `seedance-2.0-i2v`, `hailuo-pro`. Wire the image to `image`. For keyframe models (`kling-o1`, `veo3-1`) wire `start-frame` + `end-frame`.
      • Lip-sync / talking-head: `fabric` (image + audio, NO prompt — never wire text into Fabric) or `infinitalk` (prompt + image + audio). Wire audio to `audio`. Audio-over-stills narration: `ltx2-audio`.
      • Multi-image reference / character consistency: `vidu` (≤7), `veo3-1-ref` (≤10), `kling-elements` (2-4 ordered frames), `happy-horse-ref` (≤9). Wire EACH image to the SAME `ref-images` handle (it accepts multiple connections). Never use the plain `image` handle.
      • Seedance reference (image + video + audio refs): `seedance-2.0-ref` / `seedance-2.0-ref-fast`. Wire to `ref-images` / `ref-videos` / `ref-audio`.
      • Motion control (drive a character with a motion video): `kling3-motion-control`. Wire character to `image`, motion clip (videoUpload) to `motion-video`.
    Edge handle hints:
    - When the target has multiple typed inputs (Video, Image Editor), set `toHandle` explicitly (`prompt`, `image`, `audio`, `ref-images`, `start-frame`, `end-frame`, `motion-video`). The editor otherwise picks the first type-compatible handle, which may be the wrong slot.
    - Never wire text into Fabric. Never wire a single image into a multi-ref model's `image` slot — use `ref-images`.
    Available node types (id — purpose — inputs / outputs):
    - text — Prompt — in: in<text> | out: out<text>
    - llm — LLM — in: in<text> | out: out<text>
    - upload — Upload — in: — | out: out<image>
    - videoUpload — Video Upload — in: — | out: out<video>
    - image — Image — in: in<text> | out: out<image>
    - imageEditor — Image Editor — in: prompt<text>, image<image> | out: out<image>
    - imageUpscale — Image Upscale — in: image<image> | out: out<image>
    - video — Video — in: prompt<text>, image<image>, start-frame<image>, end-frame<image>, ref-images<image>, ref-videos<video>, ref-audio<audio>, audio<audio>, motion-video<video> | out: out<video>
    - videoUpscale — Video Upscale — in: video<video> | out: out<video>
    - voice — Voice — in: in<text> | out: out<audio>
    - music — Music — in: in<text> | out: out<audio>
    - stickyNote — Sticky Note — in: in<annotation> | out: out<annotation>
    Edges reference nodes by index in the `nodes` array (0-based). In the examples below, any field not shown is `null`.
    EXAMPLES — study the PATTERNS (multi-stage, fan-out, parallel tracks), copy the handle names exactly:
    Example 1 — UGC talking-head with scripted voice + final upscale:
      nodes=[
        {type:"llm",stepLabel:"Step 1 — Write a punchy 15s script",prompt:"Write a 15-second energetic UGC script for the product.",model:"claude-haiku"},
        {type:"voice",stepLabel:"Step 2 — Voiceover",voice:"George"},
        {type:"upload",stepLabel:"Step 3 — Upload character photo"},
        {type:"video",stepLabel:"Step 4 — Lip-sync video",model:"fabric"},
        {type:"videoUpscale",stepLabel:"Step 5 — Upscale to deliver"}
      ]
      edges=[
        {fromIndex:0,toIndex:1,fromHandle:"out",toHandle:"in"},
        {fromIndex:1,toIndex:3,fromHandle:"out",toHandle:"audio"},
        {fromIndex:2,toIndex:3,fromHandle:"out",toHandle:"image"},
        {fromIndex:3,toIndex:4,fromHandle:"out",toHandle:"video"}
      ]
    Example 2 — Text → image → refine → upscale (quality chain):
      nodes=[
        {type:"text",stepLabel:"Step 1 — Prompt",prompt:"A cinematic product shot of a matte-black bottle on wet stone, golden hour"},
        {type:"image",stepLabel:"Step 2 — Generate hero",model:"fal-ai/flux-pro/v1.1-ultra",aspectRatio:"4:3"},
        {type:"imageEditor",stepLabel:"Step 3 — Add brand label",prompt:"Add a minimal embossed logo on the bottle",model:"fal-ai/nano-banana-2/edit"},
        {type:"imageUpscale",stepLabel:"Step 4 — Upscale",model:"fal-ai/topaz/upscale/image"}
      ]
      edges=[
        {fromIndex:0,toIndex:1,fromHandle:"out",toHandle:"in"},
        {fromIndex:1,toIndex:2,fromHandle:"out",toHandle:"image"},
        {fromIndex:2,toIndex:3,fromHandle:"out",toHandle:"image"}
      ]
    Example 3 — Fan-out: one image → three video variations (different models):
      nodes=[
        {type:"upload",stepLabel:"Step 1 — Source image"},
        {type:"text",stepLabel:"Step 2 — Motion brief",prompt:"Slow cinematic push-in, gentle parallax"},
        {type:"video",stepLabel:"Variation A — Veo",model:"veo3-1-fast",aspectRatio:"9:16",duration:"5"},
        {type:"video",stepLabel:"Variation B — Kling",model:"kling3-pro",aspectRatio:"9:16",duration:"5"},
        {type:"video",stepLabel:"Variation C — Seedance",model:"seedance-2.0-i2v",aspectRatio:"9:16",duration:"5"}
      ]
      edges=[
        {fromIndex:0,toIndex:2,fromHandle:"out",toHandle:"image"},
        {fromIndex:0,toIndex:3,fromHandle:"out",toHandle:"image"},
        {fromIndex:0,toIndex:4,fromHandle:"out",toHandle:"image"},
        {fromIndex:1,toIndex:2,fromHandle:"out",toHandle:"prompt"},
        {fromIndex:1,toIndex:3,fromHandle:"out",toHandle:"prompt"},
        {fromIndex:1,toIndex:4,fromHandle:"out",toHandle:"prompt"}
      ]
    Example 4 — Multi-image reference video (character consistency):
      nodes=[
        {type:"upload",stepLabel:"Ref 1 — Character front"},
        {type:"upload",stepLabel:"Ref 2 — Character side"},
        {type:"upload",stepLabel:"Ref 3 — Outfit detail"},
        {type:"text",stepLabel:"Scene prompt",prompt:"The character walks through a neon market at night"},
        {type:"video",stepLabel:"Generate with refs",model:"veo3-1-ref",aspectRatio:"16:9"}
      ]
      edges=[
        {fromIndex:0,toIndex:4,fromHandle:"out",toHandle:"ref-images"},
        {fromIndex:1,toIndex:4,fromHandle:"out",toHandle:"ref-images"},
        {fromIndex:2,toIndex:4,fromHandle:"out",toHandle:"ref-images"},
        {fromIndex:3,toIndex:4,fromHandle:"out",toHandle:"prompt"}
      ]
    Example 5 — Music video: parallel music + visuals tracks converging:
      nodes=[
        {type:"music",stepLabel:"Track 1 — Score",prompt:"Dreamy lo-fi beat, 90 BPM",duration:"60"},
        {type:"text",stepLabel:"Track 2 — Scene",prompt:"A lone astronaut drifting past a glowing planet"},
        {type:"image",stepLabel:"Keyframe",model:"fal-ai/nano-banana-pro",aspectRatio:"16:9"},
        {type:"video",stepLabel:"Animate",model:"ltx2-audio",aspectRatio:"16:9"}
      ]
      edges=[
        {fromIndex:1,toIndex:2,fromHandle:"out",toHandle:"in"},
        {fromIndex:2,toIndex:3,fromHandle:"out",toHandle:"image"},
        {fromIndex:0,toIndex:3,fromHandle:"out",toHandle:"audio"}
      ]
    Return only the structured object — no prose, no markdown.
    Connector
  • Public leaderboard of fomox402 agents. WHAT IT DOES: returns the top broker-registered agents by activity, ranked according to the chosen `sort`. Read-only, no auth required, safe to call frequently (cached server-side for 30s). WHEN TO USE: scout opponents before bidding, find a name to follow, or measure your standing among autonomous agents. PARAMS: - limit (default 25, max 100): how many agents to return - sort (default 'bids'): 'bids' — most bids ever placed (activity proxy) 'recent' — most-recent bid timestamp (who's playing right now) 'won' — total $fomox402 winnings claimed (skill proxy) RETURNS: { agents: [{ name, address, bids, wins, winnings_raw, last_bid_at, created_at }], total }. RELATED: get_me (yourself), list_games (current rounds).
    Connector
  • Download a video or audio file from any supported platform: YouTube, TikTok, Vimeo, Dailymotion, Twitter/X, SoundCloud, Bandcamp, Mixcloud, Twitch (clips and VODs), or Streamable. Output is MP4 (video, default) or MP3 / M4A (audio). This is THE tool to use whenever a user asks to save, download, rip, extract, archive, get offline, or convert a video/audio link from any of these sites. IMPORTANT: the `format` argument defaults to `mp4` (video). Only pass an audio format (mp3 / m4a / audio) when the user explicitly says audio, MP3, music, song, or "rip / extract the audio". Audio-only platforms (SoundCloud, Bandcamp, Mixcloud) always produce audio regardless of `format`. Use this tool when the user says things like: - "download this video" / "download this TikTok" / "save this SoundCloud track" - "save that as MP3" / "rip the audio" / "extract the audio" - "get the song from this SoundCloud link" / "save this Mixcloud set" - "convert this YouTube video to MP4" / "download in 1080p" - "save this lecture/podcast/talk for offline" - "archive this clip" / "grab a copy of this video" - any sentence containing a youtube.com, youtu.be, tiktok.com, vimeo.com, dailymotion.com, twitter.com, x.com, soundcloud.com, bandcamp.com, mixcloud.com, twitch.tv, clips.twitch.tv, or streamable.com URL plus a verb like download, save, rip, get, grab, fetch, pull, archive, convert, extract. Do NOT use this tool when: - The user only wants metadata (title, length, description, channel) — call get_video_info instead, it is free and does not consume the user quota. - The link is a playlist / set / album / channel URL — ask the user for a single track/video. - The link is from a platform not in the supported list above (e.g. Instagram, Facebook, LinkedIn). Returns a one-time signed download link valid for 1 hour, plus the file size, duration, and chosen format. Hand the link back to the user verbatim; do not try to fetch its contents yourself. 
Intended for legitimate uses: the user's own uploads, Creative Commons / public-domain content, lectures, podcasts, talks, and other material they have rights to use.
    Connector
  • Find catalog tracks near a target tempo. Returns tracks whose BPM is within +/-`tolerance` of `bpm`, ordered by closeness then popularity — useful for DJ set planning, workout playlists, or tempo-matching. Each returned track carries full audio features. To also constrain by musical key, combine with find_tracks_by_key.
    Connector
  • Produce a piece of music from a text description, such as "epic orchestral battle theme" or "calm piano melody", with optional lyrics. Synchronous: the call blocks until generation finishes and returns a single audio result containing a URL; there is no separate polling step. The description field is required; duration must be one of the allowed values (0 means auto, otherwise multiples of 10 up to 180 seconds) and out-of-range values return HTTP 400. Credits are charged on success. Use this for songs and musical scores; use createSoundEffect for short sound effects, createAmbiance for looping background soundscapes, and createAudioTransform to remix an existing audio sample. Pass an optional request_id to tag the result so you can locate it later via getAudioResults. Requires an API key (user scope). Credits: This endpoint consumes 3 credits per call.
    Connector
  • Remix an existing audio sample (a sound effect, ambiance, or music clip) into a variation guided by a text prompt, for example turning a track into an 80s synthwave or metal version. Both the sample and the prompt are required; the sample is uploaded as a URL or base64 audio and must be at most 15MB or the call returns HTTP 400, and duration must be one of the allowed values (0 means match the source, otherwise multiples of 10 up to 180 seconds). Synchronous: the call blocks until generation finishes and returns a single audio result containing a URL; there is no separate polling step. The optional modification_strength (0 to 1, default 0.5) controls how far the result departs from the original. Credits are charged on success. Use this to transform existing audio you already have; use createSoundEffect, createAmbiance, or createMusic to generate audio from scratch. Pass an optional request_id to tag the result so you can locate it later via getAudioResults. Requires an API key (user scope). Credits: This endpoint consumes 3 credits per call.
    Connector
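
The two audio-generation entries above state the same duration rule: 0 is a special value ("auto" for createMusic, "match the source" for createAudioTransform), otherwise the value must be a multiple of 10 up to 180 seconds, and anything else returns HTTP 400. Below is a minimal client-side pre-check of that rule, as a sketch only. The function names are hypothetical and not part of either tool, and the lower bound of 10 is an assumption, since the descriptions only say "multiples of 10 up to 180".

```python
# Hypothetical client-side pre-check; not part of any listed tool's API.
MAX_DURATION_SECONDS = 180  # upper bound stated in the tool descriptions
STEP_SECONDS = 10           # durations are described as "multiples of 10"


def is_allowed_duration(seconds: int) -> bool:
    """Return True if `seconds` satisfies the stated duration rule."""
    if seconds == 0:
        # 0 is special: "auto" (createMusic) or "match the source" (createAudioTransform).
        return True
    # Assumption: the smallest non-zero value is 10 seconds.
    in_range = STEP_SECONDS <= seconds <= MAX_DURATION_SECONDS
    return in_range and seconds % STEP_SECONDS == 0


def nearest_allowed_duration(seconds: int) -> int:
    """Snap a requested length to the closest allowed non-zero duration (ties round up)."""
    rounded = (seconds + STEP_SECONDS // 2) // STEP_SECONDS * STEP_SECONDS
    return max(STEP_SECONDS, min(MAX_DURATION_SECONDS, rounded))


if __name__ == "__main__":
    for requested in (0, 47, 180, 200):
        if is_allowed_duration(requested):
            print(requested, "ok")
        else:
            print(requested, "-> use", nearest_allowed_duration(requested))
```

Checking before the call avoids a round trip that would end in HTTP 400. Whether the service accepts every multiple of 10 in that range is not confirmed by the listing, so treat the check as a first filter rather than a guarantee.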

Matching MCP Servers

Matching MCP Connectors

  • Music studio: ABC notation composition and Strudel live coding with ext-apps UI.

  • Deterministic Music Theory for Claude, Cursor, and Autonomous AI Agents. Large Language Models (LLMs) frequently hallucinate music theory, leading to incorrect notes, false Roman numerals, and broken voice leading. THIRI solves this by providing a deterministic, mathematical music-theory engine (pitch-class-set theory over ℤ/12) directly to your AI. It gives AI assistants precise, reproducible harmonic reasoning in milliseconds, allowing them to write correct musical scores, analyze progression

  • Browse the Gapup gold-standard content catalogue — video games, films, TV series and music. Returns franchises with their works (title, release year). When to use this tool: an agent needs structured, audited metadata for a cultural franchise, wants to resolve a title to a canonical entity, or browses a domain's catalogue before requesting enrichment. Inputs: a content domain and an optional case-insensitive name filter. Each franchise id can be passed to content_enrichment for its fine-grained tag profile.
    Connector
  • Fetch raw Instagram post-page data by shortcode. Use this when the user needs fresh raw Instagram post metadata that is not guaranteed on regular cached post-list endpoints yet, including coauthors, tagged users, paid partnership metadata, product mentions, music attribution, location, display resources, and video versions.
    Connector
  • Get audio features for ONE track — BPM, musical key (name + Camelot + Open Key), energy, danceability, valence, acousticness, instrumentalness, liveness, speechiness, loudness, mood, mood_vector, genre, time signature, duration and more. This is the drop-in replacement for Spotify's deprecated /audio-features endpoint. Provide EXACTLY ONE identifier: - `track` (optionally with `artist`) — e.g. track="Blinding Lights", artist="The Weeknd". - `isrc` — e.g. "USUM71900001". - `mbid` — a MusicBrainz recording UUID. - `spotify_id` — a Spotify track ID, URI, or URL. Returns a JSON object of features. Some feature fields may be null for tracks resolved via the fallback catalogs (only audio-derived values are present for fully analysed tracks). If a track name is not yet in the catalog, the API queues an on-demand analysis and this tool reports that it is queued — retry in ~30s-2min. If you only have a fuzzy or partial name, call search_catalog first to find the exact track.
    Connector
  • Find catalog tracks in a given musical key — for harmonic mixing and key-locked playlists. `key` accepts Camelot ("8A"), Open Key ("1m"), or a key name ("A-Minor", "F#-Major"). Returns tracks ordered by popularity, each with full audio features. To discover which keys mix well with a given key first, use find_compatible_keys.
    Connector
  • Given a Camelot key (e.g. "8A", "12B"), return the harmonically compatible keys for DJ mixing — the same key, the relative major/minor, and the adjacent +/-1 keys on the Camelot wheel. With `extended=true` also returns the +7/-7 energy-boost / energy-drop keys. Pure music theory — no catalog lookup and no quota cost. Pair with find_tracks_by_key to then pull actual tracks in each compatible key.
    Connector
  • List Parallax’s services with real pricing. Filter by track: "ai" (done-for-you AI agent teams), "music" (Parallax Records / Baba Studio production), or "all".
    Connector
  • Define a concept/term from a domain's glossary (e.g. 'stir', 'crop-factor', 'roughness'). Routes to each domain's lookup_concept; pass `domain` to target one, omit to fan out. For entities/records use `search`. Abstains on a miss, which is logged as a gap (the demand signal) — there is no report_gap verb. Mounted corpora: acupuncture, cocktail, camera, law, copyright, trademark, music-theory, supplements, writing-style, minecraft-dungeons, spanish, medical-denials, languages, behavioral-econ, baseball, agent-practices, pokemon, mcp, readability, citations, relay, models, self-oracle, recall-traps, units, tax, physics, logic, astronomy, biology, geography, medicine, chemistry, calendar, math, eurorack, building-codes, cooking, personal-finance, stardew, coffee, electronics, physiology, diving, decibels, gearing, colorimetry, subnetting, textile-gauge, first-aid, statistics, chess-endgames, woodworking, rating-systems, tuning, check-digits, paper-sizes, wire-gauge, preferred-numbers, swe-claim-denial, psychology, roman-numerals, minecraft-mods, encodings, strength-training, hardiness-zones, terraria, unix-permissions, aspect-ratio, number-bases, resistor-color-code.
    Connector
  • Submits an audio file for AI mastering and returns the mastered URL synchronously (route polls the Python service internally; expect 30s-5min). Useful as a final polish step after music generation. Cost: 20 credits per track. Producer, Mogul, and Ultimate plans get mastering free. Output is WAV (~50MB per 3-minute track, lossless for redistribution). Pick a `preset` to steer the mastering style; call `aetherwave_list_master_presets` for the full live list (12 presets including streaming, loud, gentle, hip_hop, edm, pop, rock, lofi, rnb, acoustic, cinematic, podcast). Each preset has a target LUFS value so you can match the distribution target.
    Connector
  • Generate a music track from a text description using MiniMax Music 2.0. Returns a job ID to poll. MiniMax first writes full-song lyrics from your prompt, then renders the song. The model auto-determines duration from the generated lyrics. Args: title: Track title (max 200 chars). prompt: Description of the music to generate (10-2000 chars). MiniMax will create lyrics and compose. tags: Required style tags to guide generation. E.g. ['ambient', 'chill', 'atmospheric']. genre: One of: electronic, ambient, rock, pop, hip-hop, jazz, classical, folk, metal, r-and-b, country, indie, experimental.
    Connector
  • Korean K-pop news (artists, groups, soloists, comebacks, music releases) aggregated from Naver and translated to English with AI relevance classification. Korean entertainment news often moves global fan markets before English coverage. 5-min cache. 💰 Price: $0.01 USDC per call 💳 Payment: x402 micropayment on Base, Polygon, or Solana 🔧 Client: AgentCash, Pay.sh, or any x402 SDK 📖 Docs: https://api.printmoneylab.com/.well-known/x402 Returns: results[] with title_en + summary_en + source_en plus original Korean (title_kr/source_kr) for verification, published_at, link. Args: limit: Number of articles to return (1-10, default 5)
    Connector
  • Compare the tag profiles of two content entities (franchises or works) and measure how similar they are. Returns a Jaccard similarity score, the list of shared tags, the tags unique to each entity, and a breakdown of shared tags by facet. When to use this tool: an agent needs to compare two franchises or works (e.g. 'how similar are Dark Souls and Elden Ring?', 'what do Street Fighter and Mortal Kombat have in common?', 'on which axes do these two games differ?'), find positioning overlap, identify cross-sell opportunities, or answer 'if you liked X you might like Y' questions backed by data. Works for any domain (video-games, music, film, tv).
    Connector
  • Search the OnChain Music catalog of 5,000+ independently owned, fully cleared tracks. Filter by genre, mood, tempo, BPM, key, and instrumentation. All results are available for immediate licensing with USDC on Base. Use the description parameter as your primary search field — pass style, mood, energy, and use-case words. Returns track IDs, metadata, and license pricing.
    Connector
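
The tag-profile comparison entry above returns a Jaccard similarity score, the shared tags, and the tags unique to each entity. As a rough illustration of what those fields mean, here is a minimal sketch of the set arithmetic. The function name, the result keys, and the sample tags are all hypothetical, the per-facet breakdown is left out, and returning 0.0 for two empty profiles is an assumed convention.

```python
from typing import Dict, Iterable, List, Union

Result = Dict[str, Union[float, List[str]]]


def compare_tag_profiles(tags_a: Iterable[str], tags_b: Iterable[str]) -> Result:
    """Compare two tag collections using Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    a, b = set(tags_a), set(tags_b)
    shared = a & b
    union = a | b
    # Assumption: two empty profiles score 0.0 instead of raising on division by zero.
    score = len(shared) / len(union) if union else 0.0
    return {
        "jaccard": score,
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


if __name__ == "__main__":
    # Made-up tags, for illustration only.
    result = compare_tag_profiles(
        ["rpg", "dark-fantasy", "boss-fights"],
        ["rpg", "dark-fantasy", "boss-fights", "open-world"],
    )
    print(result["jaccard"])  # 3 shared tags out of 4 distinct tags -> 0.75
    print(result["only_b"])   # ['open-world']
```

A score of 1.0 means identical tag sets and 0.0 means no overlap. The lists of unique tags show on which axes the two entities differ.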