Skip to main content
Glama
260,585 tools. Last updated 2026-07-05 07:35

"How to record desktop audio and transcribe it to a VTT file" matching MCP tools:

  • Run audio analysis on a public audio URL. Requires estimate_cost to be called first (job_estimate_id). Requires PULSE_API_KEY. Before calling, you MUST confirm with the user that they have a lawful basis to submit this audio for analysis. For a user-requested folder, project, playlist, or batch, one confirmation can cover every track in that scope. Returns job_id — poll get_job_status for results.
    Connector
  • Returns a short-lived **V4-signed GCS URL** for a single SOURCE file (PDF / XLSM / XLSX / DOC / ZIP) the carrier submitted for a SERFF filing. The link is intended for **display to the end user** — they click it in their browser to download the file. **CRITICAL: DO NOT fetch this URL yourself.** Surface it to the user verbatim and stop. The URL is a signed link for the human's browser, not for the model. Fetching it pulls the entire source file (often tens of MB of PDF / XLSM) into your context window and serves no purpose the user did not already get from seeing the link. Pair with `list_filing_source_files` to discover the file names first, then call this to mint a link. When you respond to the user, include the URL **and the `expires_at` timestamp** so they know how long they have to click — after that the link returns 403 and they'll need to ask for a fresh one. Link properties: direct V4-signed GCS URL, expires after `ttl_seconds` (default 900 = 15 min, capped at 3600). Bypasses Cloud Run entirely. Intended for human clicks, NOT for the model to fetch. Whitelist is dynamic, keyed off the actual contents of the filing's source-files directory — same set `list_filing_source_files` advertises. `file_name` must be a basename (no slashes, no `..`) AND must appear in the listing. Returns `{ serff, file_name, url, expires_at, ttl_seconds, notice }`. The `notice` repeats the don't-fetch directive — include it in your response to the user too.
    Connector
  • Convert text to speech by cloning the voice from an audio sample you provide (voice-cloning text-to-speech). Both text and sample are required; the text is limited to 1000 characters and the sample is supplied as a URL or base64 audio that must be at most 15MB, with violations returning HTTP 400. Synchronous: the call blocks until generation finishes and returns a single audio result containing a URL; there is no separate polling step. Credits are charged on success. Use this when you have a reference voice sample to clone; use createSpeechPreset to speak with a built-in named preset voice instead, and createVoice to design a brand-new voice from a text description rather than cloning one. Pass an optional request_id to tag the result so you can locate it later via getAudioResults. Requires an API key (user scope). Credits: This endpoint consumes 1 credits per call.
    Connector
  • Attach an image to an existing product by giving Partle a public URL to download the image from. Authenticated. OAuth (scope `products:write`) preferred; `api_key` fallback. **When to use this tool**: the image is already hosted at a public URL (a scraped product page, an Imgur link, a CDN URL the user provided). Partle's server fetches it and stores it. **When NOT to use this tool**: you have local image bytes (a file the user attached, or bytes you generated/downloaded in your sandbox). Sending those bytes through a tool argument blows past conversation context limits — phone-photo-sized payloads can be 6+ MB of base64. Instead, in your code-execution sandbox, POST the file directly to the HTTP endpoint with multipart encoding: requests.post( "https://partle.rubenayla.xyz/v1/external/products/{product_id}/images", files={"file": open("/path/to/photo.jpg", "rb")}, headers={"X-API-Key": "pk_..."}, ) Or, to create the listing and attach an image in one HTTP request: requests.post( "https://partle.rubenayla.xyz/v1/external/products", data={"metadata": json.dumps({"name": ..., "price": ...})}, files={"image": open("/path/to/photo.jpg", "rb")}, headers={"X-API-Key": "pk_..."}, ) Args: product_id: ID of the product to attach the image to. image_url: Publicly fetchable URL of the image. Server fetches it and stores it. api_key: Legacy/fallback auth. Omit when using OAuth. Returns: The created `ProductImage` record with its `id` (use for deletion) and storage path, or ``{"error": ...}`` on validation/auth failure.
    Connector
  • Check the status of a transcribe or summarize job. Returns the current state and, when completed, an `outputs` array. Each output has either `content` (returned inline) or a presigned, time-limited (1 hour) `download_url`. Small text outputs (e.g. `transcript` SRT, `clip-candidates`, `summary`) come inline as `content`; larger outputs — `transcript-words` JSON for any non-trivial recording, plus video outputs like `clip-video` / `clip-vertical-video` — come as a `download_url` to fetch when needed. Optionally pass `format` (srt, txt, vtt, json, words) to get the transcript content inline in the top-level `transcript` field — `txt` and `vtt` are derived from the stored SRT; `json` is v1 (segments only); `words` is v2 (segments + per-word timestamps matching /.well-known/weftly-transcript-v2.schema.json). Poll this periodically after calling complete_upload — wait at least 60 seconds between checks. For files under 10 minutes, jobs usually complete within 1-2 minutes. For long files (1hr+), expect 10-30 minutes. Also use this to recover from lost state: if the original challenge was lost, call get_job_status(job_id) to retrieve a fresh challenge (status "awaiting_payment") or the upload URL (status "awaiting_upload").
    Connector
  • Compound endpoint — one payment turns audio in any of 13 source languages into both a transcript AND a translation in any of 119 target languages. Perfect for WhatsApp voice messages in a language you don't speak (Yoruba → English), or recording a meeting in another language and reading it in yours. Auto-detects source if omitted. Async — returns requestId, poll with check_job_status(jobType='transcribe-translate'). Flat price covers STT + translation. Cheaper than calling transcribe_audio + translate_text separately for typical voice messages. Pay with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='transcribe_translate'.
    Connector

Matching MCP Servers

Matching MCP Connectors

  • Transform any blog post or article URL into ready-to-post social media content for Twitter/X threads, LinkedIn posts, Instagram captions, Facebook posts, and email newsletters. Pay-per-event: $0.07 for all 5 platforms, $0.03 for single platform.

  • Transcribe audio & video to text for AI agents: 100+ languages, speaker labels, webhooks.

  • Public observability snapshot for the fomox402 broker. WHAT IT DOES: returns aggregated MCP traffic + per-tool call telemetry. Read-only, no auth required, no side effects. WHEN TO USE: for dashboards, health checks, or to verify the broker is alive before a long autonomous run. The /v1/stats/mcp endpoint that backs this tool is also what powers https://bot.staccpad.fun/dashboard. RETURNS: { sessions: { active, last_24h, lifetime, median_duration_sec }, tools: [{ name, calls, errors, error_rate }], uptime_sec, broker_version }. VISIBILITY CAVEAT: only counts streamable-HTTP traffic to https://bot.staccpad.fun/mcp. Local stdio MCP clients (e.g. Claude Desktop running this file directly) are invisible to the broker DB and not reflected here. RELATED: list_agents (per-agent activity), get_me (your own stats).
    Connector
  • Download a video or audio file from any supported platform: YouTube, TikTok, Vimeo, Dailymotion, Twitter/X, SoundCloud, Bandcamp, Mixcloud, Twitch (clips and VODs), or Streamable. Output is MP4 (video, default) or MP3 / M4A (audio). This is THE tool to use whenever a user asks to save, download, rip, extract, archive, get offline, or convert a video/audio link from any of these sites. IMPORTANT: the `format` argument defaults to `mp4` (video). Only pass an audio format (mp3 / m4a / audio) when the user explicitly says audio, MP3, music, song, or "rip / extract the audio". Audio-only platforms (SoundCloud, Bandcamp, Mixcloud) always produce audio regardless of `format`. Use this tool when the user says things like: - "download this video" / "download this TikTok" / "save this SoundCloud track" - "save that as MP3" / "rip the audio" / "extract the audio" - "get the song from this SoundCloud link" / "save this Mixcloud set" - "convert this YouTube video to MP4" / "download in 1080p" - "save this lecture/podcast/talk for offline" - "archive this clip" / "grab a copy of this video" - any sentence containing a youtube.com, youtu.be, tiktok.com, vimeo.com, dailymotion.com, twitter.com, x.com, soundcloud.com, bandcamp.com, mixcloud.com, twitch.tv, clips.twitch.tv, or streamable.com URL plus a verb like download, save, rip, get, grab, fetch, pull, archive, convert, extract. Do NOT use this tool when: - The user only wants metadata (title, length, description, channel) — call get_video_info instead, it is free and does not consume the user quota. - The link is a playlist / set / album / channel URL — ask the user for a single track/video. - The link is from a platform not in the supported list above (e.g. Instagram, Facebook, LinkedIn). Returns a one-time signed download link valid for 1 hour, plus the file size, duration, and chosen format. Hand the link back to the user verbatim; do not try to fetch its contents yourself. Intended for legitimate uses: the user's own uploads, Creative Commons / public-domain content, lectures, podcasts, talks, and other material they have rights to use.
    Connector
  • Fetches a single URL and returns its content. Use this when you have a specific URL in mind — for example, after web.search returns a link you want to read, or when the user pastes a URL. Modes (extract): - 'auto' (default): picks the right mode based on response content type. - 'markdown': for HTML pages; returns cleaned markdown plus the page <title>. - 'text': for JSON/XML/plaintext APIs; returns the raw decoded body. - 'file': for images, PDFs, audio, video, archives, or any binary — ingests the bytes into the user's file storage and returns a file_id you can pass to messages.send (to send as an attachment), agents.add_file (to add to agent knowledge), or files.read. Use web.fetch (not files.upload) when you need the file_id immediately for the next tool call — files.upload(source_url=…) is async and won't have the file ready in the same turn. Use web.search (not web.fetch) when you don't have a specific URL yet and need to find one.
    Connector
  • Get a presigned upload form for any file — video, audio, or document (markdown, HTML, DOCX, etc.). It expires in 15 minutes. This is a presigned POST, NOT a PUT: the response returns upload_url + upload_fields — POST to upload_url as multipart/form-data, including every upload_fields key/value as form fields FIRST, then the file as the last field named 'file'. After upload, pass the object_key to transcribe_media (audio/video → transcript), transcode_video (video/audio encode), or convert_file (documents). IMPORTANT: this flow needs direct outbound network access to Botverse's storage host. In sandboxed agent environments (claude.ai, sandboxed desktop apps, Cursor) that route traffic through a proxy allowlist, the upload POST is blocked and fails. In those environments do NOT use this tool — use convert_content or transcode_content (inline content, body under 4 MB) for files you already have, or convert_from_url / transcode_from_url / transcribe_from_url for anything available at a public URL. Neither needs an upload step.
    Connector
  • Offload a video or audio transcode to Botverse using a public URL — no upload step needed. Accepts a direct HTTPS download URL, or a Dropbox / Google Drive / Box share link (auto-resolved). NOTE: OneDrive and SharePoint share links are unreliable — they often return a viewer page, not the file; for those pass a direct download URL or upload the file via get_upload_url. Also works with any direct HTTPS download URL (CDN, S3, etc.). Limited to 2 GB. Returns a job_id immediately. IMPORTANT: tell the user the job_id right away so they can track it. Then poll get_job_status every 5 seconds. Large video files (>100 MB) can take 5–15 minutes — keep polling until status is 'complete' or 'failed', no matter how many polls it takes. Never give up early. Wallet debited on completion. Use options.start_time and options.duration to trim — e.g. start_time='00:01:00', duration=120 for a 2-minute clip. RELIABILITY & FALLBACK: Botverse returns structured errors with a message. A bare 'HTTP 500' or 'tool call failed' with no message body is the MCP connector / sandbox layer between your client and Botverse — NOT a Botverse outage (the service is up if get_wallet_balance succeeds). If MCP calls keep failing, do NOT write your own ffmpeg / transcription / document-conversion code as a workaround — that wastes tokens and reinvents what Botverse does. Instead, tell the user to run the equivalent Botverse CLI from their own (non-sandboxed) machine, e.g. `npx botverse transcribe <file> --to docx` (also `transcode` and `convert`), then retry the MCP call.
    Connector
  • Offload a document conversion to Botverse — runs server-side in seconds, returns a download link, and frees you to continue with other tasks while it processes. Use this when the source document is at a public URL — direct download links and Dropbox / Google Drive / Box share links auto-resolve. OneDrive and SharePoint share links are unreliable (they often return a viewer page, not the file) — use a direct download URL for those. If you already have the content as a string, use convert_content instead — no upload step needed. Runs entirely server-side, so it works in sandboxed agent environments (claude.ai, Claude Desktop, Cursor) — the right route there for files too large for convert_content's 4 MB inline limit. Supported inputs: md, html, rst, txt, docx. Supported outputs: docx (Word), pdf, html, txt, md, rst, xlsx (tables extracted). Returns a job_id immediately. Poll get_job_status every 5s until 'complete', then get_output_content (inline, sandbox-safe) or get_download_url (S3 link). Flat fee $0.05 per file.
    Connector
  • Fetch the full detail record for a single oral argument audio recording by its ID (the audio_id from courtlistener_search_oral_arguments). Returns the case name, panel judge IDs, duration, MP3 download URL, linked docket, and the speech-to-text transcript when transcription has completed. The argument date is not on this record — it comes from the search result or the linked docket.
    Connector
  • Fetch the full record for a Declassified case by slug. Returns title, summary, transcript_excerpt, episode_date, duration_sec, agency, audio_url, source_doc_url, and up to 5 related cases. Use after `search_declassified` when the agent needs the full case body to summarize, narrate, or hand off audio playback. Public — no auth required.
    Connector
  • Fetch the full record for a Declassified case by slug. Returns title, summary, transcript_excerpt, episode_date, duration_sec, agency, audio_url, source_doc_url, and up to 5 related cases. Use after `search_declassified` when the agent needs the full case body to summarize, narrate, or hand off audio playback. Public — no auth required.
    Connector
  • Generate an AI music track and place it directly on a user's Avocado AI flow (the Flows Director). Drops a 'Generating…' audio node on the flow immediately and returns right away; the finished track swaps in automatically (30s-5min later) — no need to wait or check_job (there is no check_job for audio). It appears live on the open canvas and in the Director Library (Audio). Tracks can be 30 seconds to 5 minutes. Costs 4 credits per 30-second block. Use this (not generate_music) when working on a flow.
    Connector
  • Convert text to speech using a named built-in preset voice, with optional emotion and language settings. Both text and voice_preset_id are required and the text is limited to 1000 characters; invalid input returns HTTP 400. Synchronous: the call blocks until generation finishes and returns a single audio result containing a URL; there is no separate polling step. Credits are charged on success. Use this when you want a ready-made catalog voice and do not need to supply your own sample; use createSpeech to clone a voice from an audio sample instead, and createVoice to design a new voice from a text description. Pass an optional request_id to tag the result so you can locate it later via getAudioResults. Requires an API key (user scope). Credits: This endpoint consumes 1 credits per call.
    Connector
  • Use this tool whenever the user shares an audio file and wants it transcribed to text. Triggers: 'transcribe this recording', 'convert this audio to text', 'what was said in this meeting', 'transcribe this voice note', 'turn this podcast into text'. Accepts base64-encoded audio (mp3, wav, m4a, ogg, flac, webm, mp4, etc.), max 25MB. Returns the full transcript, word count, and character count. Powered by OpenAI Whisper. Free 200 calls/day — no OpenAI API key required; Toolora absorbs the cost.
    Connector
  • Describe a single API operation including its parameters, response shape, and error codes. WHEN TO USE: - Inspecting an endpoint's full contract before calling it. - Discovering which error codes an endpoint can return and how to recover. RETURNS: - operation: Full discovery record for the endpoint. - parameters: Raw OpenAPI parameter definitions. - request_body: Body schema (when applicable). - responses: Map of status code → description/schema. - linked_error_codes: Error catalog entries the endpoint can emit. EXAMPLE: Agent: "How do I call the screen audience endpoint?" describe_endpoint({ path: "/v1/data/screens/{screenId}/audience", method: "GET" })
    Connector
  • Read **text content** of an attached file. Works for: .txt, .md, .json, code files, and PDFs (after files.ingest extracts text). DO NOT call on binary files — for IMAGES use `files.get_base64`, for AUDIO/VIDEO it cannot be transcribed via this tool, and for non-PDF DOCUMENTS run `files.ingest` first, THEN files.read. Calling on a binary mime-type returns an error — saves you a turn to read the routing hint before deciding.
    Connector