Glama
260,522 tools. Last updated 2026-07-05 07:02

"Tool for controlling a webpage with voice commands and mouse movements" matching MCP tools:

  • Generates a voiceover from text using Hume Octave TTS. Audio uploaded to Spaces, signed URL returned (24h TTL by default). Charged in credits up-front based on script length (use quote_voiceover for a preview). Best for demo-video narration, tutorial audio, and any one-shot batch TTS. NOT a real-time conversational voice (use Hume EVI for that, different product). Voice options: pass voiceId for a specific Hume voice clone, or omit to use the deployment's default narrator (HUME_OCTAVE_VOICE_ID env var).
    Connector
  • Retrieves direct links to STRING evidence pages for protein–protein interaction pairs. Use this tool only when a STRING evidence page/link is needed. To determine whether an interaction is supported, use `string_interactions_query_set`. It returns URLs linking to STRING’s evidence pages, which display the underlying data sources (experimental results, publications, and curated databases) supporting each predicted interaction. A URL can be generated even for unsupported pairs; the URL is not itself an interaction verdict.
    Parameters:
      - **identifier_a**: Query protein identifier (Protein A)
      - **identifiers_b**: One or more target protein identifiers (Protein B), separated by `%0d`
      - **species**: NCBI taxonomy ID (e.g. `9606` for human or `10090` for mouse)
    Typical user questions that should trigger this tool:
      - "Can you show me the STRING evidence for this interaction?"
      - "Show me the details supporting this interaction."
      - "What supports the interaction between TP53 and MDM2?"
      - "Where can I find the STRING evidence for this pair?"
    Connector
  • Search PikaSim PHONE-NUMBER eSIMs — plans that include a REAL carrier phone number (not VoIP) with voice calls, SMS, and data. US plans give a real +1 number on AT&T and T-Mobile; global plans cover 157 countries. Use this when a user wants to call or text, not just data. Each result shows its packageCode in [brackets] for purchase_phone_plan.
    Connector
  • Convert text to speech by cloning the voice from an audio sample you provide (voice-cloning text-to-speech). Both text and sample are required; the text is limited to 1000 characters and the sample is supplied as a URL or base64 audio that must be at most 15MB, with violations returning HTTP 400. Synchronous: the call blocks until generation finishes and returns a single audio result containing a URL; there is no separate polling step. Credits are charged on success. Use this when you have a reference voice sample to clone; use createSpeechPreset to speak with a built-in named preset voice instead, and createVoice to design a brand-new voice from a text description rather than cloning one. Pass an optional request_id to tag the result so you can locate it later via getAudioResults. Requires an API key (user scope). Credits: This endpoint consumes 1 credit per call.
    Connector
  • Compound endpoint — one payment turns audio in any of 13 source languages into both a transcript AND a translation in any of 119 target languages. Perfect for WhatsApp voice messages in a language you don't speak (Yoruba → English), or recording a meeting in another language and reading it in yours. Auto-detects source if omitted. Async — returns requestId, poll with check_job_status(jobType='transcribe-translate'). Flat price covers STT + translation. Cheaper than calling transcribe_audio + translate_text separately for typical voice messages. Pay with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='transcribe_translate'.
    Connector
  • Convert text to speech using a named built-in preset voice, with optional emotion and language settings. Both text and voice_preset_id are required and the text is limited to 1000 characters; invalid input returns HTTP 400. Synchronous: the call blocks until generation finishes and returns a single audio result containing a URL; there is no separate polling step. Credits are charged on success. Use this when you want a ready-made catalog voice and do not need to supply your own sample; use createSpeech to clone a voice from an audio sample instead, and createVoice to design a new voice from a text description. Pass an optional request_id to tag the result so you can locate it later via getAudioResults. Requires an API key (user scope). Credits: This endpoint consumes 1 credit per call.
    Connector
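
The STRING evidence-link entry above takes one query protein, a `%0d`-separated list of target proteins, and an NCBI taxonomy ID. A minimal sketch of assembling those arguments follows. The function name `build_evidence_link_args` is hypothetical, and it assumes the tool accepts the literal `%0d` separator exactly as the description states; only the three parameter names come from the listing.

```python
from typing import Dict, List, Union


def build_evidence_link_args(
    identifier_a: str, identifiers_b: List[str], species: int
) -> Dict[str, Union[str, int]]:
    """Assemble arguments for the STRING evidence-link tool (hypothetical helper)."""
    if not identifiers_b:
        raise ValueError("identifiers_b needs at least one target protein")
    return {
        "identifier_a": identifier_a,
        # Assumption: targets are joined with the literal "%0d" separator
        # named in the tool description.
        "identifiers_b": "%0d".join(identifiers_b),
        "species": species,
    }


# Illustrative call: TP53 against two targets in human (taxonomy ID 9606).
example_args = build_evidence_link_args("TP53", ["MDM2", "MDM4"], 9606)
```

A returned link is not an interaction verdict, so a caller would still check support through `string_interactions_query_set` before presenting the pair as interacting.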

Matching MCP Servers

Matching MCP Connectors

  • Search the AI Tool Directory catalog: tool details, status checks (alive/acquired/deceased + cause and date), alternatives, and side-by-side comparisons. Read-only.

  • Measure voice/VoIP path quality -> estimated MOS + live network metrics

  • List recent execution traces for an agent — the same data as /admin/requests, scoped to one agent and readable by an LLM. Use this when an agent call timed out, drafted the wrong response, or you want to know which tool/LLM call burned the latency. Pair with `agents.trace_get` for full detail on a specific trace. Filters: `status`, `success`, `source` (single value or comma-separated: `agent,voice`), `date_from`/`date_to` (ISO-8601), pagination via `limit`/`offset`. Returns `returned_count`, `dropped_on_page` (should be 0 — positive means the backend agent_id predicate let something through), and `has_more`. Edge case: a raw page of all-dedup-dropped rows yields `returned_count=0, has_more=true`; re-call with `offset += limit`.
    Connector
  • Place a conversational voice-AI phone call to a business on a consumer's behalf and return a structured answer. THE differentiated capability: reach the ~60M long-tail SMBs that have NO API and NO booking page — only a phone number. An AI agent cannot pick up a phone and hold a conversation; this tool does. Give a plain-language objective; the voice AI navigates the call and extracts the answer. Business-directed (B2B), far less restricted than calling consumers — but the compliance gate still enforces recording consent per jurisdiction. Async: returns a call handle; poll get_outcome for the transcript + extracted fields. WHEN TO USE: Use when the target business has NO booking URL and NO API — only a phone number — and the consumer asked the agent to reach them (e.g. 'call this plumber and ask if they can come Tuesday', 'ask the salon if they take walk-ins this afternoon'). Also use to confirm details a booking page doesn't expose (real-time availability, custom quotes). WHEN NOT TO USE: Do NOT use when the business has a booking URL — use import_booking_url + schedule_appointment (cheaper, faster, deterministic). Do NOT use for calls to consumers/individuals (this tool is for reaching businesses). Do NOT use for marketing or telemarketing — the compliance gate and the B2B-only framing reject that. COST: $0.5 per_call LATENCY: ~45000ms EXECUTION: async_by_default (use get_outcome to retrieve result)
    Connector
  • Your saved voices — one tool for the whole voice library. Users speak plain language and never know ids: resolve every voice by NAME yourself (call action "list" first if unsure) and never ask the user for an id. action="list" returns every saved voice with voice_id, name, kind and ready — kind "reference" is an instant voice match saved from a clip and kind "clone" is a trained voice (both speak through generate_audio: pass the NAME as its voice param); kind "avatar" voices drive talking_avatar_video. action="create" saves a NEW reference voice from a clip: voice_name plus audio_url (e.g. the url upload_media returned) or audio_base64 (+ format) — free, ready instantly. action="rename" renames a saved voice (voice_id takes the id OR the current name, new_name is the new name). action="clone" registers a voice for talking_avatar_video from audio_sample_url + voice_name (charged 2 credits). action="delete" removes a voice by voice_id or name.
    Connector
  • Generate spoken audio from text: narration, a voiceover, a read-aloud script, or a multi-voice dialogue. Pass text (up to 2048 chars) — the words to be spoken. To speak in one of YOUR saved voices, pass voice with the voice NAME (or id): users speak plain language and never know ids, so resolve the name yourself (the voice tool, action "list", shows every saved voice) and never ask the user for an id. Reference voices, trained clones and preset voices are all routed correctly by kind. To match a voice instantly from a clip instead, pass reference_audio_url (a short clip) or up to 3 reference_audio_urls and address them as @Audio1, @Audio2, @Audio3 in the text for dialogue. Alternatively pass image_url to voice a scene from a picture (cannot combine with reference audio). Optional speech_rate (-50..100), pitch (-12..12), loudness (-50..100). Returns a playable audio_url, duration_seconds, and generation_id (also saved to your library).
    Connector
  • Design a new voice from a character description (such as "deep-voiced warrior" or "cheerful young girl") and have it speak a short line of text, returning a sample of that newly created voice. Both voice_description and text are required, the spoken text is limited to 200 characters or the call returns HTTP 400, and type selects "human" or "non-human" voices. Synchronous: the call blocks until generation finishes and returns a single audio result containing a URL; there is no separate polling step. Credits are charged on success. Use this to invent and audition a voice from a description; use createSpeech for text-to-speech that clones a specific voice from an audio sample, and createSpeechPreset for text-to-speech using a named preset voice. Pass an optional request_id to tag the result so you can locate it later via getAudioResults. Requires an API key (user scope). Credits: This endpoint consumes 1 credit per call.
    Connector
  • Roll (regenerate) the personal proxy credential for a firewall. This invalidates the previous password and returns a new one with ready-to-use configuration commands. Only call this when the user explicitly needs new credentials — it will break any existing package manager configuration using the old password.
    Connector
  • Fetch and convert a Microsoft Learn documentation webpage to markdown format. This tool retrieves the latest complete content of Microsoft documentation webpages including Azure, .NET, Microsoft 365, and other Microsoft technologies.
    When to Use This Tool:
      - When search results provide incomplete information or truncated content
      - When you need complete step-by-step procedures or tutorials
      - When you need troubleshooting sections, prerequisites, or detailed explanations
      - When search results reference a specific page that seems highly relevant
      - For comprehensive guides that require full context
    Usage Pattern: Use this tool AFTER microsoft_docs_search when you identify specific high-value pages that need complete content. The search tool gives you an overview; this tool gives you the complete picture.
    URL Requirements:
      - The URL must be a valid HTML documentation webpage from the microsoft.com domain
      - Binary files (PDF, DOCX, images, etc.) are not supported
    Output Format: markdown with headings, code blocks, tables, and links preserved.
    Connector
  • Text-to-speech with 3 tiers: OmniVoice Global (602+ languages including Yoruba, Bengali, Cebuano, Twi, zero-shot voice cloning, 100 chars/sat — use 'language' parameter with ISO code), Inworld Premium (#1 ranked TTS ELO 1217, emotion control, 40+ languages, 50 chars/sat), Minimax Studio (voice cloning from reference clip, 40+ languages, 10 chars/sat). Adjustable speed (0.5-2.0x). Returns audio URL. Pay with Bitcoin Lightning — no API key or signup needed. When NOT to use: not for phone calls (use place_call for one-shot broadcasts, ai_call for AI voice agents, or open_voice_bridge to drive the call with your own LLM). For rare/underserved languages (Yoruba, Twi, Marathi, Cebuano, etc.), pick OmniVoice Global via language= — Inworld/Minimax don't cover these. Requires create_payment with toolName='text_to_speech'.
    Connector
  • Statistically validated leading indicator signals evaluated against live supply chain data. Each signal is a Granger-causal relationship tested at p<=0.01 with directional accuracy >=55%. Signals predict commodity price movements, manufacturing shifts, and macroeconomic changes 1 week to 6 months ahead. Returns ACTIVE (threshold crossed — act now), WATCH (approaching threshold — prepare), or CLEAR status for each signal. 58 signals across 3 tiers organized by predictor group (GDI pillars, SMI regions, cross-index spreads). Used by commodity traders for forward-looking positioning, procurement teams for buy/defer timing, and hedge funds for alternative data signals.
    Connector
  • Alias of chieflab_status. Use as the FIRST tool when an agent session starts on a workspace that already has activity — recovers all open business loops with literal user commands. Same response shape as chieflab_status, same handler. If the user asked to launch the current repo and a recovered open loop looks unrelated, do not blindly resume it; start a fresh launch for the current repo.
    Connector
  • Use this when the user asks to read, extract, get the text/content/article of, or summarize a webpage/URL. Do NOT use for a visual screenshot (use rendex_screenshot). Extracts clean reader-mode content from any webpage as Markdown, JSON, or HTML. Runs the same Chromium render pass as a screenshot, so it captures content after JavaScript runs — handles SPAs that fetch-only readers miss. Strips nav, ads, and boilerplate, returning the article body plus title, byline, and excerpt. Great for feeding page content to an LLM, summarization, or RAG ingestion. Costs 1 render credit per call.
    Connector
  • List tone profiles for a strategy. Today returns at most one entry — the tone_of_voice synthesized by the Tone of Voice Synthesis agent (POWER-mode bundles only). The shape is list-stable so future multi-tone bundles plug in without changing the contract. Use this to align generation with the brand-tied voice DNA before writing copy, hooks, or scripts.
    Connector
  • NO AUTH / PUBLIC / READ-ONLY. Builds and validates a copy-pasteable authenticated /api/v2/{dataset}/runs HTTP request without sending it. This tool does not execute the request, query weather values, or return forecast data. Use gribstream_query_runs when the user asks for actual model-run forecast data or CSV/JSON/NDJSON data. Generated direct API requests include Accept-Encoding: gzip, and generated curl commands use --compressed so large responses can be transferred compressed when the client supports it. The request body must use exact selectors discovered from the catalog or shared-parameter tools, with coordinates in request.coordinates and selectors in request.variables.
    Connector
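
The agent trace listing above describes `limit`/`offset` pagination with an edge case: a page can come back with `returned_count=0` and `has_more=true`, and the caller should re-call with `offset += limit`. A minimal sketch of a loop that handles this follows. The `fetch_page` callable, the `traces` key, and the `max_pages` guard are assumptions for illustration; only `limit`, `offset`, `returned_count`, and `has_more` come from the listing.

```python
from typing import Any, Callable, Dict, Iterator

Page = Dict[str, Any]


def iter_traces(
    fetch_page: Callable[[int, int], Page], limit: int = 50, max_pages: int = 100
) -> Iterator[Any]:
    """Yield traces page by page, continuing past empty pages while has_more is set."""
    offset = 0
    for _ in range(max_pages):  # hypothetical safety cap against endless paging
        page = fetch_page(limit, offset)
        # "traces" is an assumed key name for the rows on a page.
        yield from page.get("traces", [])
        if not page.get("has_more", False):
            return
        # Advance by the requested limit, not by returned_count, so a page
        # whose rows were all dropped does not stall the loop.
        offset += limit


def fake_fetch_page(limit: int, offset: int) -> Page:
    """Stand-in backend: the first page is empty but reports more data."""
    rows = {0: [], 2: ["trace-a", "trace-b"]}.get(offset, [])
    return {"traces": rows, "returned_count": len(rows), "has_more": offset < 2}


collected = list(iter_traces(fake_fetch_page, limit=2))
```

Stopping on an empty page instead of on `has_more` would end the walk at the first all-dropped page and miss the traces behind it.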