Glama
261,450 tools. Last updated 2026-07-05 13:08

"Using voice commands to interact with Claude on desktop" matching MCP tools:

  • Generates a voiceover from text using Hume Octave TTS. Audio uploaded to Spaces, signed URL returned (24h TTL by default). Charged in credits up-front based on script length (use quote_voiceover for a preview). Best for demo-video narration, tutorial audio, and any one-shot batch TTS. NOT a real-time conversational voice (use Hume EVI for that, different product). Voice options: pass voiceId for a specific Hume voice clone, or omit to use the deployment's default narrator (HUME_OCTAVE_VOICE_ID env var).
    Connector
  • Adds more data (and on some plans voice/SMS) to an existing eSIM the user already owns, without issuing a new eSIM. This SPENDS the wallet balance and cannot be undone. First call get_topup_options with the same ICCID to get valid top-up package codes and prices, then pass one here. Use when a user wants to extend an eSIM that is running low rather than buy a new one. Requires a connected agent wallet (OAuth or ak_live_ key).
    Connector
  • Search PikaSim PHONE-NUMBER eSIMs — plans that include a REAL carrier phone number (not VoIP) with voice calls, SMS, and data. US plans give a real +1 number on AT&T and T-Mobile; global plans cover 157 countries. Use this when a user wants to call or text, not just data. Each result shows its packageCode in [brackets] for purchase_phone_plan.
    Connector
  • Convert text to speech by cloning the voice from an audio sample you provide (voice-cloning text-to-speech). Both text and sample are required; the text is limited to 1000 characters and the sample is supplied as a URL or base64 audio that must be at most 15MB, with violations returning HTTP 400. Synchronous: the call blocks until generation finishes and returns a single audio result containing a URL; there is no separate polling step. Credits are charged on success. Use this when you have a reference voice sample to clone; use createSpeechPreset to speak with a built-in named preset voice instead, and createVoice to design a brand-new voice from a text description rather than cloning one. Pass an optional request_id to tag the result so you can locate it later via getAudioResults. Requires an API key (user scope). Credits: This endpoint consumes 1 credit per call.
    Connector
  • Compound endpoint — one payment turns audio in any of 13 source languages into both a transcript AND a translation in any of 119 target languages. Perfect for WhatsApp voice messages in a language you don't speak (Yoruba → English), or recording a meeting in another language and reading it in yours. Auto-detects source if omitted. Async — returns requestId, poll with check_job_status(jobType='transcribe-translate'). Flat price covers STT + translation. Cheaper than calling transcribe_audio + translate_text separately for typical voice messages. Pay with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='transcribe_translate'.
    Connector
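
The voice-cloning text-to-speech entry above states two input limits (text up to 1000 characters, sample up to 15MB) and says violations return HTTP 400. A minimal client-side pre-check might look like the sketch below. The function name `check_clone_request`, its parameters, and the reading of "15MB" as 15 × 1024 × 1024 bytes are assumptions made for illustration; none of them come from the listing or from the service's API.

```python
import base64

MAX_TEXT_CHARS = 1000                 # limit stated in the listing
MAX_SAMPLE_BYTES = 15 * 1024 * 1024   # assumption: "15MB" read as 15 MiB


def check_clone_request(text, sample_url=None, sample_base64=None):
    """Hypothetical pre-check: return a list of problems, empty if none found."""
    problems = []
    if not text:
        problems.append("text is required")
    elif len(text) > MAX_TEXT_CHARS:
        problems.append(
            f"text has {len(text)} characters, limit is {MAX_TEXT_CHARS}"
        )
    if not sample_url and not sample_base64:
        problems.append("an audio sample (URL or base64) is required")
    if sample_base64:
        try:
            # validate=True rejects characters outside the base64 alphabet
            size = len(base64.b64decode(sample_base64, validate=True))
        except ValueError:
            problems.append("sample is not valid base64")
        else:
            if size > MAX_SAMPLE_BYTES:
                problems.append(
                    f"sample is {size} bytes, limit is {MAX_SAMPLE_BYTES}"
                )
    return problems
```

Checking locally only avoids a round trip for obviously invalid input; the server remains the authority on what it accepts, and a sample passed by URL cannot be size-checked this way.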

Matching MCP Servers

  • License: F · Quality: - · Maintenance: C
    Enables natural voice interaction with Claude Code through speech-to-text, supporting wake word activation and multiple backends like Whisper and Google. It allows users to execute commands and control their coding environment hands-free via their microphone.
    Last updated
    2
  • License: F · Quality: B · Maintenance: D
    Enables Claude Code to send prompts to Claude Desktop using macOS automation and AppleScript. Supports conversation management and configurable response polling, though reading responses back is limited by Electron's accessibility APIs.
    Last updated
    2
    1

Matching MCP Connectors

  • Persistent project context for Claude. IANA-registered .faf format.

  • ship-on-friday MCP — wraps StupidAPIs (requires X-API-Key)

  • Public observability snapshot for the fomox402 broker. WHAT IT DOES: returns aggregated MCP traffic + per-tool call telemetry. Read-only, no auth required, no side effects. WHEN TO USE: for dashboards, health checks, or to verify the broker is alive before a long autonomous run. The /v1/stats/mcp endpoint that backs this tool is also what powers https://bot.staccpad.fun/dashboard. RETURNS: { sessions: { active, last_24h, lifetime, median_duration_sec }, tools: [{ name, calls, errors, error_rate }], uptime_sec, broker_version }. VISIBILITY CAVEAT: only counts streamable-HTTP traffic to https://bot.staccpad.fun/mcp. Local stdio MCP clients (e.g. Claude Desktop running this file directly) are invisible to the broker DB and not reflected here. RELATED: list_agents (per-agent activity), get_me (your own stats).
    Connector
  • FOR CLAUDE DESKTOP ONLY (with filesystem access). For Claude.ai/web: Use create_upload_session instead - it provides a browser upload link. Upload local media to cloud storage, returning a public HTTPS URL.
    WHEN TO USE:
      • Instagram, LinkedIn, Threads, X: REQUIRED for local files before calling publish_content
      • TikTok: NOT NEEDED - pass local path directly to publish_content
    SUPPORTED FORMATS:
      • Images: jpg, png, gif, webp (max 10MB)
      • Videos: mp4, mov, webm (max 100MB)
    Returns { url: 'https://...' } for use in publish_content mediaUrl parameter.
    Connector
  • Niche (nicheangle.com) story discovery: find stories worth writing about, then draft and publish platform-native social content (LinkedIn, X threads, Instagram, newsletter) from them. This is story discovery, not content generation: Niche reads primary sources, separates signal from noise, and clusters it into a ranked story slate with provenance, the editorial-intelligence step before any writing. Returns a session_id plus initial status; poll niche_session_state with the session_id until status is `cp1_awaiting_story` to read the slate. Brand profile: the run's voice/offer/CTA. You do NOT need niche_whoami to brand a run: OMIT `brand_id` and a single or default brand binds automatically (silently). On a MULTI-brand account, an omitted `brand_id` returns `brand_choice_required` with `brand_options[]` inline (the slate still lands). Ask the user which brand, then re-call with `brand_id` (or `brand_id:'none'` for a deliberately unbranded run); don't draft until one is chosen. Pass `brand_id` to bind a specific persisted profile (set via niche_brand_profile_set); its voice, lexicon, framing, channel config, and verifier overrides thread through every downstream stage. Pass `profile_overrides` alongside `brand_id` to deep-merge a one-time deviation (logged on the session, not stored). The effective profile is snapshotted at scan time; later updates to the persisted profile don't affect in-flight runs.
    Connector
  • Lists the Google Drive folders synced on this Mac (My Drive, Shared drives, per-account mounts). Start here to get valid paths for the other gdrive_* tools. Reads the folder Google Drive for Desktop already syncs — no Google API, no OAuth.
    Connector
  • Start a Camber agent chat. This is the tool to use for chatting with an agent. Agent runs can take minutes, longer than MCP tool timeouts allow (Claude Desktop cannot extend them), so this tool does NOT wait for the reply: it submits the message and returns immediately with a `conversation_id` and a clickable `chat_url`. The agent keeps working on the server after this returns. **You MUST follow up; the reply is NOT in this tool's result:** 1. After calling this tool you MUST tell the user the work is in progress and share the `chat_url` so they can watch it live. 2. Then immediately call the **`agents_chat_status`** tool with the returned `conversation_id` to get the agent's reply. That tool checks twice over 30 seconds; if the latest status is `running`, call it again. You MUST NOT end your turn until `agents_chat_status` returns status `idle` (done) or `failed`. **One run per conversation:** continuing a `conversation_id` that is still `running` fails with a "still generating a response" error. Either wait and retry after `agents_chat_status` reports it finished, or call again with `stop=true` to interrupt the current run and send the new message.
    Connector
  • Convert text to speech using a named built-in preset voice, with optional emotion and language settings. Both text and voice_preset_id are required and the text is limited to 1000 characters; invalid input returns HTTP 400. Synchronous: the call blocks until generation finishes and returns a single audio result containing a URL; there is no separate polling step. Credits are charged on success. Use this when you want a ready-made catalog voice and do not need to supply your own sample; use createSpeech to clone a voice from an audio sample instead, and createVoice to design a new voice from a text description. Pass an optional request_id to tag the result so you can locate it later via getAudioResults. Requires an API key (user scope). Credits: This endpoint consumes 1 credit per call.
    Connector
  • List active voice calls in this workspace. Use before calls.make on a Telegram account (only one MTProto call per account at a time) to check whether the line is free.
    Connector
  • Block until a voice call ends (status changes from 'active') or timeout elapses. Returns ended=true with final state when the call has ended; ended=false on timeout (re-issue to keep waiting). The returned state includes `outcome` so callers can branch on pickup vs. no-answer (answered/no_answer/busy/declined/failed/unknown). Default timeout 90s; cap 110s — bounded by nginx proxy_read_timeout 120s on /mcp.
    Connector
  • MANDATORY first step whenever the user attached an image in chat (or pointed at a local file on disk) and wants edit_image or image-to-video generation. Returns a signed PUT URL plus a file_id. How the bytes get uploaded depends on WHERE you run, and the discriminator is network access to the URL, not shell access: (a) Claude.ai (web, desktop, or mobile app): this tool renders an inline upload widget. The user drops the image into it; it uploads from their browser and pushes the file_id back automatically. Your code-execution sandbox has NO network route to the signed URL — a chat attachment sitting on the sandbox filesystem does NOT mean you can upload it. NEVER attempt curl/fetch/Python uploads from a sandbox and never investigate domain allowlists; just ask the user (one short sentence) to drop the image into the widget, then stop and wait. (b) Claude Code / a CLI with a real shell on the user's machine: run the ready-made curl PUT from the response text. Then call edit_image or generate_video with file_id=<returned id>. edit_image and generate_video do NOT accept base64 — calling them with raw image bytes WILL fail. This tool is the only working path for chat attachments. Set `purpose` to 'edit' or 'video' so the upload widget points the user at the right downstream tool.
    Connector
  • Your saved voices — one tool for the whole voice library. Users speak plain language and never know ids: resolve every voice by NAME yourself (call action "list" first if unsure) and never ask the user for an id. action="list" returns every saved voice with voice_id, name, kind and ready — kind "reference" is an instant voice match saved from a clip and kind "clone" is a trained voice (both speak through generate_audio: pass the NAME as its voice param); kind "avatar" voices drive talking_avatar_video. action="create" saves a NEW reference voice from a clip: voice_name plus audio_url (e.g. the url upload_media returned) or audio_base64 (+ format) — free, ready instantly. action="rename" renames a saved voice (voice_id takes the id OR the current name, new_name is the new name). action="clone" registers a voice for talking_avatar_video from audio_sample_url + voice_name (charged 2 credits). action="delete" removes a voice by voice_id or name.
    Connector
  • Extract voice primitives (register / sentence rhythm / lexicon preferences / punctuation habits) from post-shaped text and persist onto the user's VoiceProfile. The voice primitives thread into content generation so generated copy matches the user's actual writing voice. Two input shapes: pass `posts` (list of pre-collected text snippets, ≥80 chars each) or pass `url` (the server scrapes post-shaped snippets from the page: Substack / Medium / blog / X profile). Inline posts win when both are given. Inline post-shaped snippets need to be the user's own writing, not press articles or marketing copy. Returns the extracted primitives + a diff of what changed on the stored VoiceProfile.
    Connector
  • Generate spoken audio from text: narration, a voiceover, a read-aloud script, or a multi-voice dialogue. Pass text (up to 2048 chars) — the words to be spoken. To speak in one of YOUR saved voices, pass voice with the voice NAME (or id): users speak plain language and never know ids, so resolve the name yourself (the voice tool, action "list", shows every saved voice) and never ask the user for an id. Reference voices, trained clones and preset voices are all routed correctly by kind. To match a voice instantly from a clip instead, pass reference_audio_url (a short clip) or up to 3 reference_audio_urls and address them as @Audio1, @Audio2, @Audio3 in the text for dialogue. Alternatively pass image_url to voice a scene from a picture (cannot combine with reference audio). Optional speech_rate (-50..100), pitch (-12..12), loudness (-50..100). Returns a playable audio_url, duration_seconds, and generation_id (also saved to your library).
    Connector
  • Design a new voice from a character description (such as "deep-voiced warrior" or "cheerful young girl") and have it speak a short line of text, returning a sample of that newly created voice. Both voice_description and text are required, the spoken text is limited to 200 characters or the call returns HTTP 400, and type selects "human" or "non-human" voices. Synchronous: the call blocks until generation finishes and returns a single audio result containing a URL; there is no separate polling step. Credits are charged on success. Use this to invent and audition a voice from a description; use createSpeech for text-to-speech that clones a specific voice from an audio sample, and createSpeechPreset for text-to-speech using a named preset voice. Pass an optional request_id to tag the result so you can locate it later via getAudioResults. Requires an API key (user scope). Credits: This endpoint consumes 1 credit per call.
    Connector
  • Roll (regenerate) the personal proxy credential for a firewall. This invalidates the previous password and returns a new one with ready-to-use configuration commands. Only call this when the user explicitly needs new credentials — it will break any existing package manager configuration using the old password.
    Connector
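
The call-wait entry in the list above describes a bounded blocking call: it returns ended=true with the final state, or ended=false on timeout, in which case the caller re-issues it. A rough sketch of that re-issue loop follows. The names `wait_until_ended`, `wait_once`, and `make_stub` are hypothetical, and the stub stands in for the real tool call, whose exact invocation the listing does not show.

```python
def wait_until_ended(wait_once, max_attempts=5):
    """Re-issue a blocking wait until it reports ended=True.

    wait_once is any zero-argument callable returning a dict with an
    "ended" key (hypothetical stand-in for the real tool call).
    Returns (attempts_used, final_state), or (max_attempts, None) if the
    call never reported ended=True within the attempt budget.
    """
    for attempt in range(1, max_attempts + 1):
        state = wait_once()
        if state.get("ended"):
            return attempt, state
    return max_attempts, None


def make_stub(responses):
    """Build a fake wait_once that replays canned responses in order."""
    pending = iter(responses)
    return lambda: next(pending)


if __name__ == "__main__":
    stub = make_stub([
        {"ended": False},                        # first wait timed out
        {"ended": True, "outcome": "answered"},  # call finished
    ])
    attempts, state = wait_until_ended(stub)
    print(attempts, state["outcome"])
```

Capping the number of attempts keeps a caller from waiting indefinitely on a call that never ends; branching on `outcome` afterwards follows the values the entry lists (answered, no_answer, busy, declined, failed, unknown).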