AssemblyAI MCP Server

Submit Audio for Transcription

submit_transcription

Submit audio files for transcription processing using AssemblyAI's services. This tool handles asynchronous transcription jobs with options for speaker labels, language detection, and text formatting.

Instructions

Submit audio for transcription without waiting for completion

Input Schema

Name	Required	Description	Default
`audio`	Yes	The URL or local file path of the audio to transcribe
`options`	No	Optional transcription settings
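
As a rough illustration of the schema above, the sketch below builds an MCP `tools/call` JSON-RPC request for `submit_transcription`. Only the `audio` and `options` parameter names come from the table; the helper name `build_submit_request`, the example URL, and the `speaker_labels` key inside `options` are assumptions made for illustration, since the listing does not document the contents of `options`.

```python
import json


def build_submit_request(audio, options=None, request_id=1):
    """Build an MCP `tools/call` JSON-RPC request for submit_transcription.

    `audio` is a URL or local file path (required by the schema).
    `options` is optional and is omitted from the arguments when not given.
    """
    arguments = {"audio": audio}
    if options is not None:
        arguments["options"] = options
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "submit_transcription", "arguments": arguments},
    }


request = build_submit_request(
    "https://example.com/meeting.mp3",  # placeholder URL, not from the listing
    options={"speaker_labels": True},   # hypothetical option key (assumption)
)
print(json.dumps(request, indent=2))
```

Omitting `options` entirely yields a request whose arguments contain only `audio`, which matches the table's marking of `options` as not required.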

Tool Definition Quality

Grade A (3.6/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions 'without waiting for completion', which hints at asynchronous behavior, but fails to disclose critical details like how results are retrieved (e.g., via polling or callback), authentication requirements, rate limits, error handling, or what happens on submission failure. For a tool with no annotation coverage, this leaves significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose ('submit audio for transcription') and adds key behavioral context ('without waiting for completion'). There is no wasted verbiage, and every word earns its place by clarifying the tool's asynchronous nature.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of an asynchronous submission tool with no annotations and no output schema, the description is incomplete. It lacks details on how to handle the submission result (e.g., returns a job ID or status), error scenarios, or integration with sibling tools like 'get_transcript'. For a tool that likely involves background processing, more context is needed to use it effectively.
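
The gap described above can be made concrete with a submit-then-poll sketch. Everything about the result shapes here is an assumption, because the listing documents no output schema: the sketch supposes that `submit_transcription` returns a dictionary with an `id`, that `get_transcript` accepts that `id` and returns a `status`, and that pending jobs report `queued` or `processing`. `call_tool` is a hypothetical callable standing in for whatever MCP client is in use, and `fake_call_tool` exists only so the example runs offline.

```python
import time


def submit_and_poll(call_tool, audio, interval=0.0, max_attempts=10):
    """Submit audio, then poll until the job leaves the pending states.

    `call_tool(name, arguments)` is a hypothetical wrapper around an MCP client.
    The result shapes ({"id": ...} and {"status": ...}) are assumptions.
    """
    job = call_tool("submit_transcription", {"audio": audio})
    for _ in range(max_attempts):
        result = call_tool("get_transcript", {"id": job["id"]})
        if result["status"] not in ("queued", "processing"):
            return result  # finished, successfully or not
        time.sleep(interval)
    raise TimeoutError(f"job {job['id']} still pending after {max_attempts} polls")


# Offline stand-in for a real client: reports two pending polls, then completion.
_statuses = iter(["queued", "processing", "completed"])


def fake_call_tool(name, arguments):
    if name == "submit_transcription":
        return {"id": "job-123"}
    return {"id": arguments["id"], "status": next(_statuses)}


final = submit_and_poll(fake_call_tool, "https://example.com/meeting.mp3")
print(final["status"])
```

Bounding the loop with `max_attempts` keeps an agent from polling indefinitely if a job never leaves a pending state; a real client would likely use a non-zero `interval`.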

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters ('audio' and 'options') thoroughly. The description adds no additional parameter semantics beyond what's in the schema (e.g., it doesn't explain format constraints for 'audio' or default values for 'options'). With high schema coverage, the baseline is 3, as the description doesn't compensate but also doesn't detract.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('submit audio for transcription') and distinguishes it from siblings by specifying 'without waiting for completion', which implies an asynchronous operation. This differentiates it from tools like 'get_transcript' (which likely retrieves results) and 'transcribe_file/url' (which might be synchronous).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context by indicating this is for submitting audio without waiting, which implies it's appropriate for asynchronous processing. However, it doesn't explicitly state when NOT to use this tool or name specific alternatives among the siblings (e.g., 'use transcribe_file if you need immediate results'), leaving some guidance implicit rather than explicit.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/cogell/assembly-ai-mcp'
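
The same request can be sketched in Python using only the standard library. The URL is taken from the curl command above; reading the last two path segments as an owner and a server name, the helper names, and the expectation that the endpoint returns JSON are assumptions, as the response format is not shown here.

```python
import json
import urllib.request

BASE_URL = "https://glama.ai/api/mcp/v1/servers"


def build_server_request(owner, name):
    """Build (but do not send) a GET request for one server's directory entry."""
    return urllib.request.Request(f"{BASE_URL}/{owner}/{name}", method="GET")


def fetch_server(owner, name, timeout=10.0):
    """Send the request and decode the body, assuming the API returns JSON."""
    request = build_server_request(owner, name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


req = build_server_request("cogell", "assembly-ai-mcp")
print(req.get_method(), req.full_url)
```

Calling `fetch_server("cogell", "assembly-ai-mcp")` would perform the network request; it is left uncalled so the snippet runs without network access.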

If you have feedback or need assistance with the MCP directory API, please join our Discord server.