get_youtube_transcript
Downloads YouTube video transcripts and metadata. Provide a video URL and optional language to retrieve the transcript.
Instructions
Download YouTube video transcript and metadata
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | ||
| language | No |
Implementation Reference
- src/index.ts:40-43 (registration)Tool registration in ListToolsRequestSchema handler. Defines the tool named 'get_youtube_transcript' with a description and input schema (url, optional language).
name: 'get_youtube_transcript', description: 'Download YouTube video transcript and metadata', inputSchema: zodToJsonSchema(YoutubeTranscriptSchema) as ToolInput, }, - src/index.ts:15-18 (schema)Zod schema for input validation: requires a 'url' string and has an optional 'language' string.
const YoutubeTranscriptSchema = z.object({ url: z.string(), language: z.string().optional(), }); - src/index.ts:54-93 (handler)Main handler for the 'get_youtube_transcript' tool. Parses args, calls getTranscript and getVideoMetadata from youtube.ts, and returns JSON with title, description, and transcript.
case 'get_youtube_transcript': { const parsed = YoutubeTranscriptSchema.safeParse(args); if (!parsed.success) { throw new Error( `Invalid arguments for download_youtube_url: ${parsed.error}` ); } try { const [transcript, metadata] = await Promise.all([ getTranscript(parsed.data.url, { language: parsed.data.language }), getVideoMetadata(parsed.data.url), ]); return { content: [ { type: 'text', text: JSON.stringify({ title: metadata.title, description: metadata.description, transcript, }), }, ], }; } catch (error) { const errorMessage = error instanceof Error ? error.message : String(error); return { content: [ { type: 'text', text: `YouTube API Error: ${errorMessage}`, }, ], isError: true, }; } } - src/youtube.ts:49-142 (helper)Core helper function getTranscript that downloads YouTube subtitles using yt-dlp, handles filename too long errors with a fallback, and returns cleaned transcript text.
export async function getTranscript( url: string, options: TranscriptOptions = {} ): Promise<string> { const tempDir = options.tempDir || fs.mkdtempSync(path.join(os.tmpdir(), 'yt-')); // Use video ID as filename to avoid "File name too long" errors const shortFilenameTemplate = path.join(tempDir, '%(id)s.%(ext)s'); const args = [ '--write-sub', '--write-auto-sub', '--sub-lang', options.language || 'ja', '--skip-download', '--sub-format', 'vtt', '--output', shortFilenameTemplate, '--verbose', url, ]; try { await spawnPromise('yt-dlp', args); const files = fs.readdirSync(tempDir); const subtitleFiles = files.filter( (file) => file.endsWith('.vtt') || file.endsWith('.srt') ); if (subtitleFiles.length === 0) { throw new YouTubeError('No transcript found for this video'); } const content = fs.readFileSync( path.join(tempDir, subtitleFiles[0]), 'utf8' ); return cleanTranscript(content); } catch (error) { // Check if it's a filename length error and try fallback with even shorter name if (error instanceof Error && (error.message.includes('File name too long') || error.message.includes('ENOENT') || error.message.includes('Errno 36'))) { console.warn('Filename too long error detected, attempting fallback with timestamp-based filename'); // Fallback: Use timestamp-based filename const timestamp = Date.now(); const fallbackTemplate = path.join(tempDir, `yt_${timestamp}.%(ext)s`); const fallbackArgs = [...args]; const outputIndex = fallbackArgs.indexOf('--output'); if (outputIndex !== -1) { fallbackArgs[outputIndex + 1] = fallbackTemplate; } try { await spawnPromise('yt-dlp', fallbackArgs); const files = fs.readdirSync(tempDir); const subtitleFiles = files.filter( (file) => file.endsWith('.vtt') || file.endsWith('.srt') ); if (subtitleFiles.length === 0) { throw new YouTubeError('No transcript found for this video (fallback attempt)'); } const content = fs.readFileSync( path.join(tempDir, subtitleFiles[0]), 'utf8' ); return cleanTranscript(content); } catch (fallbackError) { if (fallbackError instanceof Error) { throw new YouTubeError(`Failed to get transcript even with fallback filename: ${fallbackError.message}`); } throw fallbackError; } } if (error instanceof Error) { throw new YouTubeError(`Failed to get transcript: ${error.message}`); } throw error; } finally { if (!options.tempDir) { rimraf.sync(tempDir); } } } - src/youtube.ts:24-47 (helper)Helper function getVideoMetadata that extracts video title and description using yt-dlp's --print option.
export async function getVideoMetadata(url: string): Promise<VideoMetadata> { const args = [ '--skip-download', '--print', '%(title)s\n%(description)s', url, ]; try { const result = await spawnPromise('yt-dlp', args); const [title, ...descriptionParts] = result.split('\n'); const description = descriptionParts.join('\n'); return { title: title.trim(), description: description.trim(), }; } catch (error) { if (error instanceof Error) { throw new YouTubeError(`Failed to get video metadata: ${error.message}`); } throw error; } }