Upload and Analyze Media
upload_and_analyze
Upload media from direct file URLs or social/video links (YouTube, Instagram, TikTok, etc.) for transcription. Returns a media ID for async processing; poll status and retrieve AI-powered insights.
Instructions
Upload and transcribe media from a URL: either a direct, public file URL or a shareable social/video link (YouTube, Instagram, TikTok, X, Facebook, Reddit, SoundCloud, and similar), which Speak resolves to the underlying media automatically. Vimeo links are not yet supported. The call returns media_id immediately. After it returns, poll get_media_status until state is 'processed' (typically 1-3 minutes for audio under 60 minutes), then call get_media_insights for AI summaries. This async pattern is required for remote MCP transports, because long blocking calls die at proxy idle timeouts.
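The upload, poll, and fetch sequence described above can be sketched as follows. This is a minimal illustration, not Speak's client code. The `call_tool` callable is a hypothetical stand-in for whatever MCP client function invokes a tool by name. The `media_id` argument name for get_media_status and get_media_insights, the flat result shape (`result["media_id"]`, `status["state"]`), and the polling interval and timeout values are all assumptions; the real payload may be nested under `data`.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def upload_and_wait(
    call_tool: ToolCaller,
    url: str,
    poll_interval: float = 10.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Upload media, poll until it is processed, then fetch insights."""
    # Assumption: the upload result exposes the ID under "media_id".
    media_id = call_tool("upload_and_analyze", {"url": url})["media_id"]

    waited = 0.0
    while True:
        # Assumption: get_media_status accepts "media_id" and returns "state".
        status = call_tool("get_media_status", {"media_id": media_id})
        if status.get("state") == "processed":
            break
        if waited >= timeout:
            raise TimeoutError(
                f"media {media_id} not processed after {timeout} seconds"
            )
        # Each call stays short; the waiting happens between calls.
        sleep(poll_interval)
        waited += poll_interval

    # Assumption: get_media_insights also accepts "media_id".
    return call_tool("get_media_insights", {"media_id": media_id})
```

Keeping every individual tool call short and doing the waiting on the client side is what avoids the proxy idle timeouts mentioned above. The `sleep` parameter is injectable only so the loop can be exercised without real delays.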
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Direct/public media file URL, or a shareable social/video page link (e.g. an Instagram reel, TikTok, YouTube, or X post URL) — page links are resolved to the underlying media server-side. Pass the URL the user gave you as-is. | |
| name | No | Display name for the media | Filename from URL |
| tags | No | Comma-separated tags | |
| folderId | No | Folder ID to place the media in | |
| mediaType | No | Media type | audio |
| sourceLanguage | No | BCP-47 language code (e.g., 'en-US', 'he-IL') | |
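As an illustration of the schema above, the following sketch assembles an arguments object for upload_and_analyze. The helper `build_upload_arguments` is hypothetical and not part of Speak's API; only the field names (`url`, `name`, `tags`, `folderId`, `mediaType`, `sourceLanguage`) come from the table. It assumes that omitted optional fields should be left out of the payload so the server applies its own defaults.

```python
from typing import Any, Dict, Iterable, Optional


def build_upload_arguments(
    url: str,
    name: Optional[str] = None,
    tags: Optional[Iterable[str]] = None,
    folder_id: Optional[str] = None,
    media_type: Optional[str] = None,
    source_language: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the arguments dict for upload_and_analyze (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")

    # Pass the URL through unchanged, as the schema description asks.
    arguments: Dict[str, Any] = {"url": url}

    optional = {
        "name": name,
        # The schema expects tags as one comma-separated string.
        "tags": ",".join(tags) if tags else None,
        "folderId": folder_id,
        "mediaType": media_type,
        "sourceLanguage": source_language,
    }
    # Drop unset fields so server-side defaults apply.
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return arguments


example = build_upload_arguments(
    "https://example.com/talk.mp3",
    tags=["interview", "q3"],
    source_language="en-US",
)
```

Here `example` contains only `url`, `tags` (joined as `"interview,q3"`), and `sourceLanguage`; the example URL and tag values are placeholders.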
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | Response payload from the Speak AI API | |