get_youtube_transcript
Extract and download YouTube video transcripts and metadata by providing the video URL and optional language preference. Ideal for content analysis, accessibility, or research.
Instructions
Download YouTube video transcript and metadata
Input Schema
TableJSON Schema
| Name | Required | Description | Default |
|---|---|---|---|
| language | No | ||
| url | Yes |
Implementation Reference
- src/index.ts:54-93 (handler)Executes the get_youtube_transcript tool: validates input, fetches transcript and metadata using helpers, returns JSON or error.case 'get_youtube_transcript': { const parsed = YoutubeTranscriptSchema.safeParse(args); if (!parsed.success) { throw new Error( `Invalid arguments for download_youtube_url: ${parsed.error}` ); } try { const [transcript, metadata] = await Promise.all([ getTranscript(parsed.data.url, { language: parsed.data.language }), getVideoMetadata(parsed.data.url), ]); return { content: [ { type: 'text', text: JSON.stringify({ title: metadata.title, description: metadata.description, transcript, }), }, ], }; } catch (error) { const errorMessage = error instanceof Error ? error.message : String(error); return { content: [ { type: 'text', text: `YouTube API Error: ${errorMessage}`, }, ], isError: true, }; } }
- src/index.ts:15-18 (schema)Zod schema defining input for get_youtube_transcript: url (required), language (optional).const YoutubeTranscriptSchema = z.object({ url: z.string(), language: z.string().optional(), });
- src/index.ts:39-43 (registration)Registers the get_youtube_transcript tool in the listTools handler with name, description, and schema.{ name: 'get_youtube_transcript', description: 'Download YouTube video transcript and metadata', inputSchema: zodToJsonSchema(YoutubeTranscriptSchema) as ToolInput, },
- src/youtube.ts:49-142 (helper)Core helper function to download and clean YouTube transcript using yt-dlp, handles temp files and errors.export async function getTranscript( url: string, options: TranscriptOptions = {} ): Promise<string> { const tempDir = options.tempDir || fs.mkdtempSync(path.join(os.tmpdir(), 'yt-')); // Use video ID as filename to avoid "File name too long" errors const shortFilenameTemplate = path.join(tempDir, '%(id)s.%(ext)s'); const args = [ '--write-sub', '--write-auto-sub', '--sub-lang', options.language || 'ja', '--skip-download', '--sub-format', 'vtt', '--output', shortFilenameTemplate, '--verbose', url, ]; try { await spawnPromise('yt-dlp', args); const files = fs.readdirSync(tempDir); const subtitleFiles = files.filter( (file) => file.endsWith('.vtt') || file.endsWith('.srt') ); if (subtitleFiles.length === 0) { throw new YouTubeError('No transcript found for this video'); } const content = fs.readFileSync( path.join(tempDir, subtitleFiles[0]), 'utf8' ); return cleanTranscript(content); } catch (error) { // Check if it's a filename length error and try fallback with even shorter name if (error instanceof Error && (error.message.includes('File name too long') || error.message.includes('ENOENT') || error.message.includes('Errno 36'))) { console.warn('Filename too long error detected, attempting fallback with timestamp-based filename'); // Fallback: Use timestamp-based filename const timestamp = Date.now(); const fallbackTemplate = path.join(tempDir, `yt_${timestamp}.%(ext)s`); const fallbackArgs = [...args]; const outputIndex = fallbackArgs.indexOf('--output'); if (outputIndex !== -1) { fallbackArgs[outputIndex + 1] = fallbackTemplate; } try { await spawnPromise('yt-dlp', fallbackArgs); const files = fs.readdirSync(tempDir); const subtitleFiles = files.filter( (file) => file.endsWith('.vtt') || file.endsWith('.srt') ); if (subtitleFiles.length === 0) { throw new YouTubeError('No transcript found for this video (fallback attempt)'); } const content = fs.readFileSync( path.join(tempDir, subtitleFiles[0]), 'utf8' ); return cleanTranscript(content); } catch (fallbackError) { if (fallbackError instanceof Error) { throw new YouTubeError(`Failed to get transcript even with fallback filename: ${fallbackError.message}`); } throw fallbackError; } } if (error instanceof Error) { throw new YouTubeError(`Failed to get transcript: ${error.message}`); } throw error; } finally { if (!options.tempDir) { rimraf.sync(tempDir); } } }
- src/youtube.ts:24-47 (helper)Helper to fetch YouTube video title and description using yt-dlp.export async function getVideoMetadata(url: string): Promise<VideoMetadata> { const args = [ '--skip-download', '--print', '%(title)s\n%(description)s', url, ]; try { const result = await spawnPromise('yt-dlp', args); const [title, ...descriptionParts] = result.split('\n'); const description = descriptionParts.join('\n'); return { title: title.trim(), description: description.trim(), }; } catch (error) { if (error instanceof Error) { throw new YouTubeError(`Failed to get video metadata: ${error.message}`); } throw error; } }