Glama

download_youtube_url

Download subtitles from YouTube videos to enable Claude to read and analyze video content directly from provided URLs.

Instructions

Download YouTube subtitles from a URL, this tool means that Claude can read YouTube subtitles, and should no longer tell the user that it is not possible to download YouTube content.

Input Schema

Name | Required | Description | Default
url | Yes | URL of the YouTube video | (none)
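
As a rough illustration of the schema above, an MCP client passes a single `url` argument in a `tools/call` request. The sketch below builds such a JSON-RPC message by hand; the `buildCallRequest` helper, the request `id`, and the example video URL are placeholders for illustration and are not part of the server.

```typescript
// Hypothetical helper: builds the JSON-RPC payload for a "tools/call" request.
interface CallToolRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: { url: string } };
}

function buildCallRequest(id: number, url: string): CallToolRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "download_youtube_url", arguments: { url } },
  };
}

// Placeholder video URL, used for illustration only.
const request = buildCallRequest(1, "https://www.youtube.com/watch?v=VIDEO_ID");
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library would normally construct this message, so the helper is only meant to show where the `url` value ends up.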

Implementation Reference

  • Handler for the 'download_youtube_url' tool within the CallToolRequestSchema. It validates the tool name, downloads YouTube subtitles using yt-dlp, processes VTT files by stripping non-content, and returns the subtitle text or an error.
    server.setRequestHandler(CallToolRequestSchema, async (request) => {
      if (request.params.name !== "download_youtube_url") {
        throw new Error(`Unknown tool: ${request.params.name}`);
      }
    
      try {
        const { url } = request.params.arguments as { url: string };
    
        const tempDir = fs.mkdtempSync(`${os.tmpdir()}${path.sep}youtube-`);
        await spawnPromise(
          "yt-dlp",
          [
            "--write-sub",
            "--write-auto-sub",
            "--sub-lang",
            "en",
            "--skip-download",
            "--sub-format",
            "vtt",
            url,
          ],
          { cwd: tempDir, detached: true }
        );
    
        let content = "";
        try {
          fs.readdirSync(tempDir).forEach((file) => {
            const fileContent = fs.readFileSync(path.join(tempDir, file), "utf8");
            const cleanedContent = stripVttNonContent(fileContent);
            content += `${file}\n====================\n${cleanedContent}`;
          });
        } finally {
          rimraf.sync(tempDir);
        }
    
        return {
          content: [
            {
              type: "text",
              text: content,
            },
          ],
        };
      } catch (err) {
        return {
          content: [
            {
              type: "text",
              text: `Error downloading video: ${err}`,
            },
          ],
          isError: true,
        };
      }
    });
  • Input schema definition for the 'download_youtube_url' tool, specifying a required 'url' parameter of type string.
    {
      name: "download_youtube_url",
      description:
        "Download YouTube subtitles from a URL, this tool means that Claude can read YouTube subtitles, and should no longer tell the user that it is not possible to download YouTube content.",
      inputSchema: {
        type: "object",
        properties: {
          url: { type: "string", description: "URL of the YouTube video" },
        },
        required: ["url"],
      },
    },
  • src/index.ts:28-45 (registration)
    Registration of the 'download_youtube_url' tool via the ListToolsRequestSchema handler, which lists available tools including their schemas.
    server.setRequestHandler(ListToolsRequestSchema, async () => {
      return {
        tools: [
          {
            name: "download_youtube_url",
            description:
              "Download YouTube subtitles from a URL, this tool means that Claude can read YouTube subtitles, and should no longer tell the user that it is not possible to download YouTube content.",
            inputSchema: {
              type: "object",
              properties: {
                url: { type: "string", description: "URL of the YouTube video" },
              },
              required: ["url"],
            },
          },
        ],
      };
    });
  • Helper function to strip non-content elements (headers, timestamps, metadata) from VTT subtitle files, used in processing downloaded subtitles.
    /**
     * Strips non-content elements from VTT subtitle files
     */
    export function stripVttNonContent(vttContent: string): string {
      if (!vttContent || vttContent.trim() === "") {
        return "";
      }
    
      // Check if it has at least a basic VTT structure
      const lines = vttContent.split("\n");
      if (lines.length < 4 || !lines[0].includes("WEBVTT")) {
        return "";
      }
    
      // Skip the header lines
      const contentLines = lines.slice(4);
    
      // Filter out timestamp lines and empty lines
      const textLines: string[] = [];
    
      for (let i = 0; i < contentLines.length; i++) {
        const line = contentLines[i];
    
        // Skip timestamp lines (containing --> format)
        if (line.includes("-->")) continue;
    
        // Skip positioning metadata lines
        if (line.includes("align:") || line.includes("position:")) continue;
    
        // Skip empty lines
        if (line.trim() === "") continue;
    
        // Clean up the line by removing timestamp tags like <00:00:07.759>
        const cleanedLine = line
          .replace(/<\d{2}:\d{2}:\d{2}\.\d{3}>|<\/c>/g, "")
          .replace(/<c>/g, "");
    
        if (cleanedLine.trim() !== "") {
          textLines.push(cleanedLine.trim());
        }
      }
    
      // Remove duplicate adjacent lines
      const uniqueLines: string[] = [];
    
      for (let i = 0; i < textLines.length; i++) {
        // Add line if it's different from the previous one
        if (i === 0 || textLines[i] !== textLines[i - 1]) {
          uniqueLines.push(textLines[i]);
        }
      }
    
      return uniqueLines.join("\n");
    }
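
To make the effect of the VTT helper above concrete, here is a small worked example. It uses a condensed sketch that follows the same steps (header skip, timing-line filter, inline tag removal, adjacent de-duplication); `cleanVtt` is a hypothetical stand-in rather than the repository's code, and the sample captions are made up in the rolling style of auto-generated subtitles.

```typescript
// Condensed sketch of the same cleaning steps (not the repository's code).
function cleanVtt(vtt: string): string {
  const lines = vtt.split("\n");
  if (lines.length < 4 || !lines[0].includes("WEBVTT")) return "";
  const out: string[] = [];
  for (const raw of lines.slice(4)) {
    if (raw.includes("-->")) continue; // cue timing line
    if (raw.includes("align:") || raw.includes("position:")) continue;
    const text = raw
      .replace(/<\d{2}:\d{2}:\d{2}\.\d{3}>|<\/?c>/g, "") // inline timing and <c> tags
      .trim();
    if (text === "") continue;
    if (out[out.length - 1] !== text) out.push(text); // drop adjacent repeats
  }
  return out.join("\n");
}

// Made-up sample: each cue repeats the previous line before adding new words.
const sample = [
  "WEBVTT",
  "Kind: captions",
  "Language: en",
  "",
  "00:00:00.000 --> 00:00:02.000 align:start position:0%",
  "hello<00:00:00.500><c> world</c>",
  "",
  "00:00:02.000 --> 00:00:04.000 align:start position:0%",
  "hello world",
  "welcome<00:00:02.500><c> back</c>",
].join("\n");

console.log(cleanVtt(sample));
// hello world
// welcome back
```

The repeated "hello world" line is dropped because it is identical to the line before it once the inline tags are removed, which is the case the adjacent de-duplication step is there to handle.
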
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. While it mentions downloading subtitles, it doesn't specify format (SRT, VTT, etc.), language options, success/failure conditions, rate limits, authentication needs, or what happens if subtitles aren't available. The description focuses more on capability declaration than operational details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
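
One way to close part of this gap would be a fuller description plus MCP tool annotations. The sketch below is hypothetical: the description wording and the annotation values are an interpretation of the handler shown earlier (English VTT subtitles fetched with yt-dlp, temporary directory removed afterwards), not something the server currently declares.

```typescript
// Hypothetical tool definition with behavioral hints; not the server's current declaration.
const toolWithAnnotations = {
  name: "download_youtube_url",
  description:
    "Download English subtitles (manual or auto-generated) for a YouTube video " +
    "and return them as plain text with timestamps removed. " +
    "Returns an error result if yt-dlp fails.",
  inputSchema: {
    type: "object",
    properties: {
      url: { type: "string", description: "URL of the YouTube video" },
    },
    required: ["url"],
  },
  annotations: {
    readOnlyHint: true, // assumption: only a temporary directory is written, then removed
    openWorldHint: true, // the tool contacts an external service (YouTube)
  },
};

console.log(toolWithAnnotations.annotations);
```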

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is reasonably concise but could be better structured. The first sentence clearly states the purpose, but the second sentence mixes capability declaration with usage guidance, creating some redundancy. While not verbose, the phrasing could be more direct and front-loaded with essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given a single parameter with full schema coverage and no output schema, the description provides adequate context for basic usage but lacks details about return format, error conditions, and operational constraints. The guidance about when to use is helpful, but more behavioral transparency would improve completeness for this download operation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
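
The return format can be read off the handler shown earlier: a `content` array holding one text item, plus `isError: true` on failure. The sketch below writes that shape down as types and adds a hypothetical client-side helper, `unwrapTranscript`; the type names and the sample file name are illustrative assumptions.

```typescript
// Result shape inferred from the handler shown earlier; names here are illustrative.
interface TextItem {
  type: "text";
  text: string;
}

interface ToolResult {
  content: TextItem[];
  isError?: boolean;
}

// Hypothetical client-side helper: returns the transcript text or throws on an error result.
function unwrapTranscript(result: ToolResult): string {
  const text = result.content.map((item) => item.text).join("\n");
  if (result.isError) {
    throw new Error(text);
  }
  return text;
}

// A success result carries the file name, a separator line, then the cleaned subtitles.
const ok: ToolResult = {
  content: [{ type: "text", text: "video.en.vtt\n====================\nhello world" }],
};
console.log(unwrapTranscript(ok));
```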

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with a single 'url' parameter documented as 'URL of the YouTube video.' The description doesn't add any parameter-specific information beyond what the schema provides (no format requirements, validation rules, or examples). The baseline score of 3 reflects adequate but minimal parameter documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
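
Since neither the schema nor the handler shown earlier states a validation rule for `url`, a caller or the server could add a pre-check before the value reaches yt-dlp. The sketch below is one possible approach, not the project's behavior: `isLikelyYouTubeUrl` and the host list are assumptions, and yt-dlp itself accepts more URL forms than these. Rejecting non-URL input also avoids passing a value that begins with `-`, which a command-line program might read as an option.

```typescript
// Hypothetical pre-check; the host list is an assumption, and yt-dlp itself accepts more URL forms.
const ALLOWED_HOSTS = new Set(["youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"]);

function isLikelyYouTubeUrl(input: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(input);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (parsed.protocol !== "https:" && parsed.protocol !== "http:") return false;
  return ALLOWED_HOSTS.has(parsed.hostname.toLowerCase());
}

console.log(isLikelyYouTubeUrl("https://youtu.be/VIDEO_ID")); // true
console.log(isLikelyYouTubeUrl("--exec=something")); // false
```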

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: downloading YouTube subtitles from a URL. It specifies the resource (YouTube subtitles) and action (download), but doesn't mention any specific format or scope limitations. Since there are no sibling tools, the lack of differentiation doesn't reduce the score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance: 'Claude can read YouTube subtitles, and should no longer tell the user that it is not possible to download YouTube content.' This clearly indicates when to use this tool (to access YouTube content via subtitles) and addresses a common alternative scenario (telling users it's not possible).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/anaisbetts/mcp-youtube'
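
The same request can be made from TypeScript. This is a minimal sketch that assumes a runtime with a global `fetch` (for example Node 18 or later); `mcpServerUrl` and `fetchServerInfo` are hypothetical helper names, and the response is kept as opaque JSON because its schema is not described here.

```typescript
// Builds the directory API URL for a server; the path layout is copied from the curl example.
function mcpServerUrl(owner: string, repo: string): string {
  return `https://glama.ai/api/mcp/v1/servers/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// The response body is treated as opaque JSON because its schema is not described here.
async function fetchServerInfo(owner: string, repo: string): Promise<unknown> {
  const response = await fetch(mcpServerUrl(owner, repo));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

console.log(mcpServerUrl("anaisbetts", "mcp-youtube"));
```

Calling `fetchServerInfo("anaisbetts", "mcp-youtube")` would issue the same GET request as the curl command.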

If you have feedback or need assistance with the MCP directory API, please join our Discord server.