get_youtube_transcript

Downloads YouTube video transcripts and metadata. Provide a video URL and optional language to retrieve the transcript.

Instructions

Download YouTube video transcript and metadata

Input Schema

Name      Required  Description  Default
url       Yes       (none)       (none)
language  No        (none)       (none)
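
For orientation, here is a minimal sketch of the JSON-RPC request an MCP client might send to invoke this tool, assuming the standard MCP `tools/call` method. The helper name `buildTranscriptCall`, the request id, and the placeholder video URL are hypothetical; only the tool name and the `url` and `language` fields come from the schema above.

```typescript
// Arguments accepted by the tool: `url` is required, `language` is optional.
interface TranscriptArgs {
  url: string;
  language?: string;
}

// Shape of an MCP `tools/call` JSON-RPC request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: TranscriptArgs };
}

// Hypothetical helper (not part of the server): builds the request payload.
function buildTranscriptCall(id: number, args: TranscriptArgs): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "get_youtube_transcript", arguments: args },
  };
}

// Example: ask for the English transcript of a placeholder video URL.
const exampleCall = buildTranscriptCall(1, {
  url: "https://www.youtube.com/watch?v=VIDEO_ID",
  language: "en",
});
console.log(JSON.stringify(exampleCall, null, 2));
```

Omitting `language` leaves the choice to the server, which falls back to `'ja'` in the `getTranscript` helper listed under Implementation Reference.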

Implementation Reference

  • src/index.ts:40-43 (registration)
    Tool registration in ListToolsRequestSchema handler. Defines the tool named 'get_youtube_transcript' with a description and input schema (url, optional language).
    {
      name: 'get_youtube_transcript',
      description: 'Download YouTube video transcript and metadata',
      inputSchema: zodToJsonSchema(YoutubeTranscriptSchema) as ToolInput,
    },
  • Zod schema for input validation: requires a 'url' string and has an optional 'language' string.
    const YoutubeTranscriptSchema = z.object({
      url: z.string(),
      language: z.string().optional(),
    });
  • Main handler for the 'get_youtube_transcript' tool. Parses args, calls getTranscript and getVideoMetadata from youtube.ts, and returns JSON with title, description, and transcript.
    case 'get_youtube_transcript': {
      const parsed = YoutubeTranscriptSchema.safeParse(args);
      if (!parsed.success) {
        throw new Error(
          `Invalid arguments for get_youtube_transcript: ${parsed.error}`
        );
      }
    
      try {
        const [transcript, metadata] = await Promise.all([
          getTranscript(parsed.data.url, { language: parsed.data.language }),
          getVideoMetadata(parsed.data.url),
        ]);
    
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify({
                title: metadata.title,
                description: metadata.description,
                transcript,
              }),
            },
          ],
        };
      } catch (error) {
        const errorMessage =
          error instanceof Error ? error.message : String(error);
        return {
          content: [
            {
              type: 'text',
              text: `YouTube API Error: ${errorMessage}`,
            },
          ],
          isError: true,
        };
      }
    }
  • Core helper function getTranscript that downloads YouTube subtitles using yt-dlp, handles filename too long errors with a fallback, and returns cleaned transcript text.
    export async function getTranscript(
      url: string,
      options: TranscriptOptions = {}
    ): Promise<string> {
      const tempDir =
        options.tempDir || fs.mkdtempSync(path.join(os.tmpdir(), 'yt-'));
      
      // Use video ID as filename to avoid "File name too long" errors
      const shortFilenameTemplate = path.join(tempDir, '%(id)s.%(ext)s');
      
      const args = [
        '--write-sub',
        '--write-auto-sub',
        '--sub-lang',
        options.language || 'ja',
        '--skip-download',
        '--sub-format',
        'vtt',
        '--output',
        shortFilenameTemplate,
        '--verbose',
        url,
      ];
    
      try {
        await spawnPromise('yt-dlp', args);
        const files = fs.readdirSync(tempDir);
        const subtitleFiles = files.filter(
          (file) => file.endsWith('.vtt') || file.endsWith('.srt')
        );
    
        if (subtitleFiles.length === 0) {
          throw new YouTubeError('No transcript found for this video');
        }
    
        const content = fs.readFileSync(
          path.join(tempDir, subtitleFiles[0]),
          'utf8'
        );
    
        return cleanTranscript(content);
      } catch (error) {
        // Check if it's a filename length error and try fallback with even shorter name
        if (error instanceof Error && 
            (error.message.includes('File name too long') || 
             error.message.includes('ENOENT') || 
             error.message.includes('Errno 36'))) {
          
          console.warn('Filename too long error detected, attempting fallback with timestamp-based filename');
          
          // Fallback: Use timestamp-based filename
          const timestamp = Date.now();
          const fallbackTemplate = path.join(tempDir, `yt_${timestamp}.%(ext)s`);
          const fallbackArgs = [...args];
          const outputIndex = fallbackArgs.indexOf('--output');
          if (outputIndex !== -1) {
            fallbackArgs[outputIndex + 1] = fallbackTemplate;
          }
          
          try {
            await spawnPromise('yt-dlp', fallbackArgs);
            const files = fs.readdirSync(tempDir);
            const subtitleFiles = files.filter(
              (file) => file.endsWith('.vtt') || file.endsWith('.srt')
            );
    
            if (subtitleFiles.length === 0) {
              throw new YouTubeError('No transcript found for this video (fallback attempt)');
            }
    
            const content = fs.readFileSync(
              path.join(tempDir, subtitleFiles[0]),
              'utf8'
            );
    
            return cleanTranscript(content);
          } catch (fallbackError) {
            if (fallbackError instanceof Error) {
              throw new YouTubeError(`Failed to get transcript even with fallback filename: ${fallbackError.message}`);
            }
            throw fallbackError;
          }
        }
        
        if (error instanceof Error) {
          throw new YouTubeError(`Failed to get transcript: ${error.message}`);
        }
        throw error;
      } finally {
        if (!options.tempDir) {
          rimraf.sync(tempDir);
        }
      }
    }
  • Helper function getVideoMetadata that extracts video title and description using yt-dlp's --print option.
    export async function getVideoMetadata(url: string): Promise<VideoMetadata> {
      const args = [
        '--skip-download',
        '--print',
        '%(title)s\n%(description)s',
        url,
      ];
    
      try {
        const result = await spawnPromise('yt-dlp', args);
        const [title, ...descriptionParts] = result.split('\n');
        const description = descriptionParts.join('\n');
    
        return {
          title: title.trim(),
          description: description.trim(),
        };
      } catch (error) {
        if (error instanceof Error) {
          throw new YouTubeError(`Failed to get video metadata: ${error.message}`);
        }
        throw error;
      }
    }
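
The listing calls `cleanTranscript` but does not show it. As a rough illustration of what "cleaned transcript text" could mean for a WebVTT file, here is a hypothetical sketch; `cleanVttSketch` is an invented name, and the real function may behave differently.

```typescript
// Hypothetical sketch only: NOT the server's cleanTranscript, whose source is not shown.
// Reduces a WebVTT (or SRT-like) subtitle file to plain text.
function cleanVttSketch(vtt: string): string {
  const kept: string[] = [];
  for (const rawLine of vtt.split(/\r?\n/)) {
    // Strip inline tags such as <c>...</c> or <00:00:01.000> word timestamps.
    const line = rawLine.replace(/<[^>]+>/g, "").trim();
    if (line === "") continue;                     // blank separators
    if (line.startsWith("WEBVTT")) continue;       // file header
    if (/^(Kind|Language):/.test(line)) continue;  // header metadata
    if (line.includes("-->")) continue;            // cue timing lines
    if (/^\d+$/.test(line)) continue;              // numeric cue ids (also drops digit-only captions)
    if (kept[kept.length - 1] === line) continue;  // consecutive duplicates, common in auto-subs
    kept.push(line);
  }
  return kept.join(" ");
}

// Example with a small, made-up VTT snippet.
const sampleVtt = [
  "WEBVTT",
  "Kind: captions",
  "Language: en",
  "",
  "00:00:00.000 --> 00:00:02.000 align:start",
  "hello<00:00:00.500><c> world</c>",
  "",
  "00:00:02.000 --> 00:00:04.000",
  "hello world",
  "",
  "00:00:04.000 --> 00:00:06.000",
  "next line",
].join("\n");
console.log(cleanVttSketch(sampleVtt)); // prints "hello world next line"
```

Dropping consecutive duplicates is a design choice for auto-generated captions, which often repeat the previous line in each new cue; it would also merge a line that a speaker genuinely repeats.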
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of disclosing behavioral traits. It only states 'Download', which implies a read operation, but it does not mention potential issues like missing transcripts, language availability, rate limits, or authentication requirements. The description lacks sufficient behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at only 6 words, with no wasted text. However, it is also under-specified, lacking structure or front-loading of key details. It meets minimal conciseness but at the cost of completeness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of annotations, output schema, and schema descriptions, the description is insufficiently complete. It does not address parameter details, return format, error conditions, or usage prerequisites. The agent would lack necessary context to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
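
The return format that the description leaves out can be read off the handler in the Implementation Reference: on success, a single text item holding a JSON string with `title`, `description`, and `transcript`; on failure, a text item prefixed with "YouTube API Error: " and `isError: true`. A minimal client-side sketch under that reading follows; the helper `parseTranscriptResult` and the interface names are hypothetical.

```typescript
// Result shape as returned by the handler shown in the Implementation Reference.
interface ToolResult {
  content: { type: string; text: string }[];
  isError?: boolean;
}

// Fields the handler serializes into the text payload on success.
interface TranscriptPayload {
  title: string;
  description: string;
  transcript: string;
}

// Hypothetical helper: throws on error results, otherwise decodes the JSON text.
function parseTranscriptResult(result: ToolResult): TranscriptPayload {
  const text = result.content[0]?.text ?? "";
  if (result.isError) {
    // On failure the text is a plain message, not JSON.
    throw new Error(text);
  }
  return JSON.parse(text) as TranscriptPayload;
}

// Example with a made-up success result.
const sampleResult: ToolResult = {
  content: [
    {
      type: "text",
      text: JSON.stringify({
        title: "Sample title",
        description: "Sample description",
        transcript: "hello world",
      }),
    },
  ],
};
console.log(parseTranscriptResult(sampleResult).title); // prints "Sample title"
```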

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, meaning the input schema provides no descriptions for the 'url' and 'language' parameters. The tool description does not mention or clarify any parameter semantics, failing to add meaning beyond the bare schema. This is a critical gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
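
One way to close this gap is to attach a description to each property. The sketch below writes the result as a plain JSON Schema object; the description strings are suggestions, not the server's text. The `'ja'` fallback is taken from `options.language || 'ja'` in `getTranscript` and is not declared in the schema itself.

```typescript
// Hypothetical documented variant of the tool's input schema.
const describedInputSchema = {
  type: "object",
  properties: {
    url: {
      type: "string",
      description: "URL of the YouTube video, passed to yt-dlp as-is.",
    },
    language: {
      type: "string",
      description:
        "Subtitle language code passed to yt-dlp's --sub-lang. Falls back to 'ja' when omitted.",
    },
  },
  required: ["url"],
};

console.log(JSON.stringify(describedInputSchema, null, 2));
```

Since the server builds its schema from Zod, the same text could presumably be attached with `.describe()` on each field and carried through by `zodToJsonSchema`.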

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Download' and the resource 'YouTube video transcript and metadata', which is specific and unambiguous. Although it does not detail what metadata is included, it is clear enough for an agent to understand the tool's purpose. Since there are no sibling tools, no differentiation is needed.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives (none exist), nor does it specify prerequisites, limitations, or context. It simply states what the tool does without any usage advice, leaving the agent without decision support.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kazuph/mcp-youtube'
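
The same request can be sketched in TypeScript, assuming a runtime with a global `fetch` (for example Node 18 or later) and assuming the endpoint returns JSON. The function names are hypothetical, and the response shape is not documented here, so it is returned as `unknown`.

```typescript
// Builds the directory API URL for a server identified by owner and name.
function mcpServerUrl(owner: string, name: string): string {
  const base = "https://glama.ai/api/mcp/v1/servers";
  return `${base}/${encodeURIComponent(owner)}/${encodeURIComponent(name)}`;
}

// Fetches the server record. Assumption: the body is JSON; its shape is not specified here.
async function fetchMcpServer(owner: string, name: string): Promise<unknown> {
  const response = await fetch(mcpServerUrl(owner, name));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// Usage (performs a network request):
// fetchMcpServer("kazuph", "mcp-youtube").then((server) => console.log(server));
```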

If you have feedback or need assistance with the MCP directory API, please join our Discord server