Glama

transcribe_audio

Transcribe local audio files on your Mac by providing a file path in an allowed folder. Supports multiple formats and optional model selection.

Instructions

Transcribe a local audio file using MacWhisper and return the transcript.

IMPORTANT: path must be a file on the user's Mac filesystem inside the configured allow-list (typically ~/Desktop or ~/Downloads). Files uploaded to the Claude chat window are NOT accessible — ask the user to save the file to their Desktop or Downloads folder first.

If this tool returns an access-denied error, do NOT attempt to transcribe the file by any other means (e.g. downloading a model, calling an external API, or using in-process speech recognition). Simply tell the user to save the file locally and retry with this tool.
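The allow-list rule above can be illustrated with a minimal sketch. This is not the server's actual implementation; the folder list and the `is_path_allowed` helper are hypothetical, and the real allow-list is whatever the user has configured.

```python
from pathlib import Path

# Assumed allow-list for illustration; the real one comes from the
# server's configuration.
ALLOWED_DIRS = ["~/Desktop", "~/Downloads"]


def is_path_allowed(path, allowed_dirs=ALLOWED_DIRS):
    """Return True if `path` resolves to a location inside an allowed folder."""
    # Expand "~" and collapse ".." and symlinks so a path cannot escape
    # the allowed folder by traversal.
    candidate = Path(path).expanduser().resolve()
    for folder in allowed_dirs:
        root = Path(folder).expanduser().resolve()
        # `parents` excludes the path itself, so the folder alone is rejected.
        if root in candidate.parents:
            return True
    return False
```

Under these assumptions, `~/Desktop/talk.m4a` passes the check, while `~/Documents/talk.m4a` and `~/Desktop/../Documents/talk.m4a` do not. A client could run a check like this before calling the tool and ask the user to move the file when it fails.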

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| path | Yes | Absolute or ``~``-prefixed path to an audio file on the local Mac filesystem inside the configured allow-list. Supported formats: m4a, mp3, mp4, mov, wav, aiff, flac. | |
| model | No | Optional model override in MacWhisper engine:model-id format, e.g. "whisperkit:openai_whisper-large-v3-v20240930". Use ``list_models()`` to see what is installed. Defaults to the model currently selected in MacWhisper. | |
| persist | No | If True, save the transcription to MacWhisper's history. Defaults to False. | |
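As a rough sketch of how a caller might assemble arguments that match this schema, the hypothetical `build_arguments` helper below checks the file extension against the supported formats and the `model` string against the engine:model-id shape. The case-insensitive extension match and the choice to omit unset optional fields are assumptions, not documented server behavior.

```python
from pathlib import Path

# Formats listed in the `path` parameter description.
SUPPORTED_FORMATS = {"m4a", "mp3", "mp4", "mov", "wav", "aiff", "flac"}


def build_arguments(path, model=None, persist=False):
    """Build a transcribe_audio argument dict, raising ValueError on bad input."""
    # Assumption: extensions are matched case-insensitively.
    suffix = Path(path).suffix.lower().lstrip(".")
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {suffix or '(none)'}")

    arguments = {"path": path}

    if model is not None:
        # Expect "engine:model-id"; split on the first colon only.
        engine, sep, model_id = model.partition(":")
        if not (sep and engine and model_id):
            raise ValueError("model must use the engine:model-id format")
        arguments["model"] = model

    # Only send `persist` when it differs from the documented default (False).
    if persist:
        arguments["persist"] = True

    return arguments
```

For example, `build_arguments("~/Desktop/talk.m4a", model="whisperkit:openai_whisper-large-v3-v20240930")` produces a dict with `path` and `model` keys, while a `.txt` path or a model string without a colon raises `ValueError`.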

Output Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| result | Yes | | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It reveals that the tool uses MacWhisper, requires files inside an allow-list, and returns a transcript, and it specifies what to do on an access-denied error. It does not explicitly state that the file is read without being modified, although 'transcribe' implies this. Overall, transparency is good.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear paragraphs and bolded warnings. It is concise but could be slightly trimmed without losing key information. The front-loading of the main purpose is good.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has an output schema (presumably defining the return value), the description does not need to detail the return format. It covers the essential constraint of file location and error handling. It is sufficiently complete for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds critical context beyond the schema: it emphasizes the path must be in the allow-list and not from chat uploads, and provides error handling guidance. This significantly aids correct parameter usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Transcribe a local audio file using MacWhisper and return the transcript.' This provides a specific verb (transcribe), resource (local audio file), and tool (MacWhisper). It is distinct from sibling tools like cancel_transcription or list_models.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states when to use the tool (a local audio file is available inside the allow-list), when it cannot be used (files uploaded to the chat are not accessible), and how to handle errors (do not attempt alternative transcription methods). It also tells the agent to ask the user to save the file locally before retrying.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/docdyhr/macwhisper-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.