get_transcripts
Extract transcripts from YouTube videos for analysis or processing. Specify language codes and enable paragraph breaks to format text.
Instructions
Extract and process transcripts from a YouTube video.
Parameters:
- `url` (string, required): YouTube video URL or ID.
- `lang` (string, optional, default 'en'): Language code for transcripts (e.g. 'en', 'uk', 'ja', 'ru', 'zh').
- `enableParagraphs` (boolean, optional, default false): Enable automatic paragraph breaks.
IMPORTANT: If the user does not specify a language code, DO NOT include the lang parameter in the tool call. Do not guess the language or use parts of the user query as the language code.
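The rule above can be illustrated with a small client-side sketch. The helper `buildGetTranscriptsArgs` and the placeholder URL are assumptions made for illustration and are not part of the server; the sketch only shows one way to leave `lang` out of the arguments unless the user supplied an explicit language code.

```typescript
// Arguments accepted by get_transcripts, per the parameter list above.
interface GetTranscriptsArgs {
  url: string;
  lang?: string;
  enableParagraphs?: boolean;
}

// Hypothetical helper (not part of the server): builds the arguments object
// and adds `lang` only when an explicit, non-empty language code was given.
function buildGetTranscriptsArgs(
  url: string,
  explicitLang?: string,
  enableParagraphs?: boolean
): GetTranscriptsArgs {
  const args: GetTranscriptsArgs = { url };
  if (explicitLang !== undefined && explicitLang.trim() !== "") {
    args.lang = explicitLang.trim();
  }
  if (enableParagraphs !== undefined) {
    args.enableParagraphs = enableParagraphs;
  }
  return args;
}

// Placeholder URL; VIDEO_ID stands in for a real video identifier.
const exampleUrl = "https://www.youtube.com/watch?v=VIDEO_ID";

// No language code given: `lang` is omitted, so the server default applies.
const withoutLang = buildGetTranscriptsArgs(exampleUrl);

// Explicit language code and paragraph breaks requested.
const withLang = buildGetTranscriptsArgs(exampleUrl, "uk", true);
```

Leaving the key out entirely, rather than sending an empty string or a guessed value, lets the schema default of 'en' take effect on the server side.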
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | YouTube video URL or ID | |
| lang | No | Language code for transcripts, default 'en' (e.g. 'en', 'uk', 'ja', 'ru', 'zh') | en |
| enableParagraphs | No | Enable automatic paragraph breaks, default `false` | false |
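To make the defaults in the table concrete, here is a dependency-free sketch of how raw input resolves to the values the handler sees. The server itself uses a Zod schema for this; the function `resolveInput` and its error messages are hypothetical and only mirror the table.

```typescript
// Raw arguments as they might arrive from a tool call.
interface RawInput {
  url?: unknown;
  lang?: unknown;
  enableParagraphs?: unknown;
}

// Input after the defaults from the table have been applied.
interface ResolvedInput {
  url: string;
  lang: string;
  enableParagraphs: boolean;
}

// Hypothetical stand-in for the schema: validates types and fills defaults.
function resolveInput(raw: RawInput): ResolvedInput {
  const url = raw.url;
  if (typeof url !== "string") {
    throw new Error("url is required and must be a string");
  }

  // lang defaults to "en" when omitted.
  const lang = raw.lang === undefined ? "en" : raw.lang;
  if (typeof lang !== "string") {
    throw new Error("lang must be a string");
  }

  // enableParagraphs defaults to false when omitted.
  const enableParagraphs =
    raw.enableParagraphs === undefined ? false : raw.enableParagraphs;
  if (typeof enableParagraphs !== "boolean") {
    throw new Error("enableParagraphs must be a boolean");
  }

  return { url, lang, enableParagraphs };
}

// Only the required field is supplied; the other two take their defaults.
const resolved = resolveInput({ url: "VIDEO_ID" });
```

Here `resolved.lang` is `"en"` and `resolved.enableParagraphs` is `false`, matching the Default column.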
Implementation Reference
- src/index.ts:68-107 (handler): The primary handler function for the 'get_transcripts' MCP tool. It processes input parameters, extracts the YouTube video ID, fetches transcripts via the extractor, formats the output text with optional paragraphs, and returns structured content with metadata including title, duration, and stats.

      async (input) => {
        try {
          const videoId = this.extractor.extractYoutubeId(input.url);
          console.error(`Processing transcripts for video: ${videoId}`);

          const { transcripts, title } = await this.extractor.getTranscripts({
            videoID: videoId,
            lang: input.lang
          });

          // Format text with optional paragraph breaks
          const formattedText = YouTubeUtils.formatTranscriptText(transcripts, {
            enableParagraphs: input.enableParagraphs
          });

          console.error(`Successfully extracted transcripts for "${title}" (${formattedText.length} chars)`);

          return {
            content: [{
              type: "text",
              text: `# ${title}\n\n${formattedText}`,
              metadata: {
                videoId,
                title,
                language: input.lang,
                timestamp: new Date().toISOString(),
                charCount: formattedText.length,
                transcriptCount: transcripts.length,
                totalDuration: YouTubeUtils.calculateTotalDuration(transcripts),
                paragraphsEnabled: input.enableParagraphs
              }
            }]
          };
        } catch (error) {
          if (error instanceof YouTubeTranscriptError || error instanceof McpError) {
            throw error;
          }
          throw new YouTubeTranscriptError(`Failed to process transcripts: ${(error as Error).message}`);
        }
      }
- src/index.ts:60-108 (registration): MCP tool registration for 'get_transcripts', specifying the tool name, a detailed description, the Zod input schema (url, lang, enableParagraphs), and the handler function.

      this.server.tool(
        "get_transcripts",
        `Extract and process transcripts from a YouTube video.\n\n**Parameters:**\n- \`url\` (string, required): YouTube video URL or ID.\n- \`lang\` (string, optional, default 'en'): Language code for transcripts (e.g. 'en', 'uk', 'ja', 'ru', 'zh').\n- \`enableParagraphs\` (boolean, optional, default false): Enable automatic paragraph breaks.\n\n**IMPORTANT:** If the user does *not* specify a language *code*, **DO NOT** include the \`lang\` parameter in the tool call. Do not guess the language or use parts of the user query as the language code.`,
        {
          url: z.string().describe("YouTube video URL or ID"),
          lang: z.string().default("en").describe("Language code for transcripts, default 'en' (e.g. 'en', 'uk', 'ja', 'ru', 'zh')"),
          enableParagraphs: z.boolean().default(false).describe("Enable automatic paragraph breaks, default `false`")
        },
        async (input) => {
          try {
            const videoId = this.extractor.extractYoutubeId(input.url);
            console.error(`Processing transcripts for video: ${videoId}`);

            const { transcripts, title } = await this.extractor.getTranscripts({
              videoID: videoId,
              lang: input.lang
            });

            // Format text with optional paragraph breaks
            const formattedText = YouTubeUtils.formatTranscriptText(transcripts, {
              enableParagraphs: input.enableParagraphs
            });

            console.error(`Successfully extracted transcripts for "${title}" (${formattedText.length} chars)`);

            return {
              content: [{
                type: "text",
                text: `# ${title}\n\n${formattedText}`,
                metadata: {
                  videoId,
                  title,
                  language: input.lang,
                  timestamp: new Date().toISOString(),
                  charCount: formattedText.length,
                  transcriptCount: transcripts.length,
                  totalDuration: YouTubeUtils.calculateTotalDuration(transcripts),
                  paragraphsEnabled: input.enableParagraphs
                }
              }]
            };
          } catch (error) {
            if (error instanceof YouTubeTranscriptError || error instanceof McpError) {
              throw error;
            }
            throw new YouTubeTranscriptError(`Failed to process transcripts: ${(error as Error).message}`);
          }
        }
      );
- src/index.ts:64-67 (schema): Zod schema defining the input parameters for the get_transcripts tool: required url (YouTube video ID/URL), optional lang (default 'en'), optional enableParagraphs (default false).

        url: z.string().describe("YouTube video URL or ID"),
        lang: z.string().default("en").describe("Language code for transcripts, default 'en' (e.g. 'en', 'uk', 'ja', 'ru', 'zh')"),
        enableParagraphs: z.boolean().default(false).describe("Enable automatic paragraph breaks, default `false`")
      },
- src/index.ts:20-33 (helper): Helper method in the YouTubeTranscriptExtractor class that wraps YouTubeTranscriptFetcher.fetchTranscripts, adding error handling and validation for empty transcripts.

      async getTranscripts({ videoID, lang }: TranscriptOptions): Promise<{ transcripts: Transcript[], title: string }> {
        try {
          const result = await YouTubeTranscriptFetcher.fetchTranscripts(videoID, { lang });
          if (result.transcripts.length === 0) {
            throw new YouTubeTranscriptError('No transcripts found');
          }
          return result;
        } catch (error) {
          if (error instanceof YouTubeTranscriptError || error instanceof McpError) {
            throw error;
          }
          throw new YouTubeTranscriptError(`Failed to fetch transcripts: ${(error as Error).message}`);
        }
      }
- src/youtube.ts:421-436 (helper): Core static method in YouTubeTranscriptFetcher that extracts the video ID, concurrently fetches the transcript data (config/content) and the video title, and returns the transcripts array and title. This is the primary implementation of the transcript fetching logic.

      static async fetchTranscripts(videoId: string, config?: { lang?: string }): Promise<{ transcripts: Transcript[], title: string }> {
        try {
          const identifier = this.extractVideoId(videoId);
          const [{ transcripts }, title] = await Promise.all([
            this.fetchTranscriptConfigAndContent(identifier, config?.lang),
            this.fetchVideoTitle(identifier)
          ]);
          return { transcripts, title };
        } catch (error) {
          if (error instanceof YouTubeTranscriptError || error instanceof McpError) {
            throw error;
          }
          throw new YouTubeTranscriptError(`Failed to fetch transcripts: ${(error as Error).message}`);
        }
      }
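The handler listed above returns its text in the form `# ${title}\n\n${formattedText}`. A client that wants the title and the transcript body separately could split on that layout. The sketch below is an assumption about how a consumer might do this; `formatTranscriptResult` and `parseTranscriptText` are hypothetical names and are not part of the server.

```typescript
// Title and transcript body recovered from the tool's text content.
interface ParsedTranscript {
  title: string;
  body: string;
}

// Mirrors the handler's template: a "# title" heading, a blank line, the text.
function formatTranscriptResult(title: string, formattedText: string): string {
  return `# ${title}\n\n${formattedText}`;
}

// Hypothetical client-side helper: splits the text at the first blank line.
// Assumes the title itself contains no blank line.
function parseTranscriptText(text: string): ParsedTranscript {
  const separator = text.indexOf("\n\n");
  if (text.slice(0, 2) !== "# " || separator === -1) {
    throw new Error("Unexpected transcript text format");
  }
  return {
    title: text.slice(2, separator),
    body: text.slice(separator + 2)
  };
}

// Round trip with made-up sample data; paragraph breaks in the body survive.
const sampleText = formatTranscriptResult(
  "Demo video",
  "First paragraph.\n\nSecond paragraph."
);
const parsed = parseTranscriptText(sampleText);
```

Splitting at the first blank line only keeps any paragraph breaks produced by `enableParagraphs` inside `body` intact.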