get_transcripts

Extract transcripts from YouTube videos for analysis or processing. Optionally specify a language code and enable paragraph breaks to format the text.

Instructions

Extract and process transcripts from a YouTube video.

Parameters:

  • url (string, required): YouTube video URL or ID.

  • lang (string, optional, default 'en'): Language code for transcripts (e.g. 'en', 'uk', 'ja', 'ru', 'zh').

  • enableParagraphs (boolean, optional, default false): Enable automatic paragraph breaks.

IMPORTANT: If the user does not specify a language code, DO NOT include the lang parameter in the tool call. Do not guess the language or use parts of the user query as the language code.
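A client can enforce the IMPORTANT rule by building the arguments object so that `lang` is present only when the user actually supplied a code. The following is a minimal sketch of that idea; `TranscriptArgs` and `buildTranscriptArgs` are hypothetical names used for illustration and are not part of the server.

```typescript
// Hypothetical client-side helper; not part of the server's code.
interface TranscriptArgs {
  url: string;
  lang?: string;
  enableParagraphs?: boolean;
}

function buildTranscriptArgs(
  url: string,
  options: { lang?: string; enableParagraphs?: boolean } = {}
): TranscriptArgs {
  const args: TranscriptArgs = { url };

  // Include `lang` only when the caller explicitly provided a non-empty code.
  // Otherwise the key is left out so the server-side default ('en') applies.
  if (options.lang !== undefined && options.lang.trim() !== "") {
    args.lang = options.lang.trim();
  }

  // Same idea for the boolean flag: omit it unless the caller set it.
  if (options.enableParagraphs !== undefined) {
    args.enableParagraphs = options.enableParagraphs;
  }

  return args;
}
```

Calling `buildTranscriptArgs("VIDEO_ID")` yields an object with only the `url` key, which keeps the tool call free of a guessed language.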

Input Schema

Name | Required | Description | Default
url | Yes | YouTube video URL or ID | (none)
lang | No | Language code for transcripts (e.g. 'en', 'uk', 'ja', 'ru', 'zh') | en
enableParagraphs | No | Enable automatic paragraph breaks | false
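Because `url` accepts either a full URL or a bare video ID, a caller may want to check the value before sending it. The sketch below shows one way to do that on the client side. It is not the server's `extractYoutubeId` (whose body is not shown on this page); the function name, the URL patterns, and the assumption that IDs are 11 characters of letters, digits, `-`, and `_` are all illustrative.

```typescript
// Hypothetical client-side sketch; the server's own extractYoutubeId is not shown here.
// Assumption: video IDs are 11 characters drawn from letters, digits, '-' and '_'.
const BARE_ID = /^[A-Za-z0-9_-]{11}$/;

// Loose match for common URL shapes (watch?v=, youtu.be/, /embed/, /shorts/).
// It does not strictly validate the host name.
const URL_WITH_ID =
  /(?:youtube\.com\/watch\?(?:[^#]*&)?v=|youtu\.be\/|youtube\.com\/(?:embed|shorts)\/)([A-Za-z0-9_-]{11})(?![A-Za-z0-9_-])/;

function extractVideoId(input: string): string | null {
  const trimmed = input.trim();
  if (BARE_ID.test(trimmed)) {
    return trimmed; // already a bare ID
  }
  const match = URL_WITH_ID.exec(trimmed);
  return match ? match[1] : null; // null when no ID-like segment is found
}
```

A `null` result lets the caller ask the user for a valid link instead of issuing a tool call that is likely to fail.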

Implementation Reference

  • The primary handler function for the 'get_transcripts' MCP tool. It processes input parameters, extracts the YouTube video ID, fetches transcripts via the extractor, formats the output text with optional paragraphs, and returns structured content with metadata including title, duration, and stats.
    async (input) => {
      try {
        const videoId = this.extractor.extractYoutubeId(input.url);
        console.error(`Processing transcripts for video: ${videoId}`);
        
        const { transcripts, title } = await this.extractor.getTranscripts({ 
          videoID: videoId, 
          lang: input.lang 
        });
        
        // Format text with optional paragraph breaks
        const formattedText = YouTubeUtils.formatTranscriptText(transcripts, {
          enableParagraphs: input.enableParagraphs
        });
          
        console.error(`Successfully extracted transcripts for "${title}" (${formattedText.length} chars)`);
        
        return {
          content: [{
            type: "text",
            text: `# ${title}\n\n${formattedText}`,
            metadata: {
              videoId,
              title,
              language: input.lang,
              timestamp: new Date().toISOString(),
              charCount: formattedText.length,
              transcriptCount: transcripts.length,
              totalDuration: YouTubeUtils.calculateTotalDuration(transcripts),
              paragraphsEnabled: input.enableParagraphs
            }
          }]
        };
      } catch (error) {
        if (error instanceof YouTubeTranscriptError || error instanceof McpError) {
          throw error;
        }
        throw new YouTubeTranscriptError(`Failed to process transcripts: ${(error as Error).message}`);
      }
    }
  • src/index.ts:60-108 (registration)
    MCP tool registration for 'get_transcripts'. It specifies the tool name, a detailed description, and the Zod input schema (url, lang, enableParagraphs), and passes the handler function inline.
    this.server.tool(
      "get_transcripts",
      `Extract and process transcripts from a YouTube video.\n\n**Parameters:**\n- \`url\` (string, required): YouTube video URL or ID.\n- \`lang\` (string, optional, default 'en'): Language code for transcripts (e.g. 'en', 'uk', 'ja', 'ru', 'zh').\n- \`enableParagraphs\` (boolean, optional, default false): Enable automatic paragraph breaks.\n\n**IMPORTANT:** If the user does *not* specify a language *code*, **DO NOT** include the \`lang\` parameter in the tool call. Do not guess the language or use parts of the user query as the language code.`,
      {
        url: z.string().describe("YouTube video URL or ID"),
        lang: z.string().default("en").describe("Language code for transcripts, default 'en' (e.g. 'en', 'uk', 'ja', 'ru', 'zh')"),
        enableParagraphs: z.boolean().default(false).describe("Enable automatic paragraph breaks, default `false`")
      },
      async (input) => {
        try {
          const videoId = this.extractor.extractYoutubeId(input.url);
          console.error(`Processing transcripts for video: ${videoId}`);
          
          const { transcripts, title } = await this.extractor.getTranscripts({ 
            videoID: videoId, 
            lang: input.lang 
          });
          
          // Format text with optional paragraph breaks
          const formattedText = YouTubeUtils.formatTranscriptText(transcripts, {
            enableParagraphs: input.enableParagraphs
          });
            
          console.error(`Successfully extracted transcripts for "${title}" (${formattedText.length} chars)`);
          
          return {
            content: [{
              type: "text",
              text: `# ${title}\n\n${formattedText}`,
              metadata: {
                videoId,
                title,
                language: input.lang,
                timestamp: new Date().toISOString(),
                charCount: formattedText.length,
                transcriptCount: transcripts.length,
                totalDuration: YouTubeUtils.calculateTotalDuration(transcripts),
                paragraphsEnabled: input.enableParagraphs
              }
            }]
          };
        } catch (error) {
          if (error instanceof YouTubeTranscriptError || error instanceof McpError) {
            throw error;
          }
          throw new YouTubeTranscriptError(`Failed to process transcripts: ${(error as Error).message}`);
        }
      }
    );
  • Zod schema defining the input parameters for the get_transcripts tool: required url (YouTube video ID/URL), optional lang (default 'en'), optional enableParagraphs (default false).
    {
      url: z.string().describe("YouTube video URL or ID"),
      lang: z.string().default("en").describe("Language code for transcripts, default 'en' (e.g. 'en', 'uk', 'ja', 'ru', 'zh')"),
      enableParagraphs: z.boolean().default(false).describe("Enable automatic paragraph breaks, default `false`")
    },
  • Helper method in YouTubeTranscriptExtractor class that wraps YouTubeTranscriptFetcher.fetchTranscripts, adding error handling and validation for empty transcripts.
    async getTranscripts({ videoID, lang }: TranscriptOptions): Promise<{ transcripts: Transcript[], title: string }> {
      try {
        const result = await YouTubeTranscriptFetcher.fetchTranscripts(videoID, { lang });
        if (result.transcripts.length === 0) {
          throw new YouTubeTranscriptError('No transcripts found');
        }
        return result;
      } catch (error) {
        if (error instanceof YouTubeTranscriptError || error instanceof McpError) {
          throw error;
        }
        throw new YouTubeTranscriptError(`Failed to fetch transcripts: ${(error as Error).message}`);
      }
    }
  • Core static method in YouTubeTranscriptFetcher that extracts the video ID, fetches the transcript data (config and content) and the video title concurrently, and returns the transcripts array together with the title. This is the primary implementation of the transcript-fetching logic.
    static async fetchTranscripts(videoId: string, config?: { lang?: string }): Promise<{ transcripts: Transcript[], title: string }> {
      try {
        const identifier = this.extractVideoId(videoId);
        const [{ transcripts }, title] = await Promise.all([
          this.fetchTranscriptConfigAndContent(identifier, config?.lang),
          this.fetchVideoTitle(identifier)
        ]);
        
        return { transcripts, title };
      } catch (error) {
        if (error instanceof YouTubeTranscriptError || error instanceof McpError) {
          throw error;
        }
        throw new YouTubeTranscriptError(`Failed to fetch transcripts: ${(error as Error).message}`);
      }
    }
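The three catch blocks above share one pattern: rethrow errors that are already of a known type, and wrap everything else in a domain error with a context prefix. The sketch below isolates that pattern. `TranscriptError` is a stand-in for the server's `YouTubeTranscriptError` (whose definition is not shown here), and `normalizeError` is a hypothetical helper. Unlike the `(error as Error).message` cast in the snippets, the sketch also handles thrown values that are not `Error` instances, for which `.message` would be `undefined`.

```typescript
// Stand-in for the server's YouTubeTranscriptError; its real definition is not shown.
class TranscriptError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "TranscriptError";
  }
}

// Hypothetical helper: return known errors unchanged, wrap anything else.
function normalizeError(error: unknown, context: string): TranscriptError {
  if (error instanceof TranscriptError) {
    return error; // already a domain error, keep the original message
  }
  // Thrown values are not guaranteed to be Error instances (e.g. `throw "oops"`),
  // so fall back to String(...) instead of reading `.message` blindly.
  const message = error instanceof Error ? error.message : String(error);
  return new TranscriptError(`${context}: ${message}`);
}
```

Inside a catch block this would be used as `throw normalizeError(error, "Failed to fetch transcripts")`, which keeps the wrapping logic in one place instead of repeating it in each function.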
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively explains the tool's core function and includes important behavioral guidance about parameter handling (the IMPORTANT note about not guessing language). However, it doesn't mention potential limitations like video availability, transcript existence, rate limits, or error conditions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with a clear purpose statement followed by parameter documentation and important usage notes. Every sentence serves a purpose, though the parameter list slightly duplicates schema information. The IMPORTANT section is appropriately emphasized for critical behavioral guidance.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and no output schema, the description provides adequate coverage for the tool's basic function and parameters. However, it lacks information about return values, error handling, and operational constraints that would be helpful for an agent. The IMPORTANT note adds valuable context, but more behavioral transparency would improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
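Although the description does not document return values, the shape of the result can be read off the handler's return statement in the Implementation Reference. The sketch below writes that shape down as TypeScript types and adds a small helper that splits the `text` field back into title and body. The interface names and `splitTranscriptText` are hypothetical, and the type of `totalDuration` is an assumption because `calculateTotalDuration` is not shown.

```typescript
// Shapes inferred from the handler's return statement; not an official output schema.
interface TranscriptMetadata {
  videoId: string;
  title: string;
  language: string;
  timestamp: string; // ISO 8601 string, from new Date().toISOString()
  charCount: number;
  transcriptCount: number;
  totalDuration: number; // assumption: numeric; type and unit are not shown in the source
  paragraphsEnabled: boolean;
}

interface TranscriptContentItem {
  type: "text";
  text: string; // built by the handler as "# <title>\n\n<formatted transcript>"
  metadata: TranscriptMetadata;
}

// Hypothetical helper: split the handler's text back into title and body.
function splitTranscriptText(text: string): { title: string; body: string } {
  const separator = text.indexOf("\n\n");
  if (!text.startsWith("# ") || separator === -1) {
    // Not in the expected "# title\n\nbody" form; treat everything as body.
    return { title: "", body: text };
  }
  return {
    title: text.slice(2, separator),
    body: text.slice(separator + 2),
  };
}
```

The helper only relies on the first blank line, so paragraph breaks inside the transcript body are preserved.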

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description repeats this information in a bulleted list without adding significant semantic context beyond what's in the schema. The IMPORTANT note about language parameter handling adds some value, but overall the description doesn't enhance parameter understanding beyond the structured schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('extract and process') and resource ('transcripts from a YouTube video'). It distinguishes itself from potential alternatives by focusing on transcript extraction rather than other video-related operations, though no sibling tools exist for direct comparison.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides some usage guidance through the IMPORTANT note about language parameter handling, but it doesn't explicitly state when to use this tool versus alternatives (e.g., when transcripts are needed vs. other video metadata). Since no sibling tools exist, this is less critical, but general context about appropriate use cases is missing.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/sinco-lab/mcp-youtube-transcript'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.