
fetch_transcript

Read-only

Fetch official YouTube transcript with per-segment timestamps and automatic language detection. Returns error if no captions exist — then use AI ASR transcript. Free.

Instructions

Fetch the existing official transcript (subtitles/captions) of a YouTube video, with per-segment timestamps and language detected. Errors with NO_CAPTIONS if the video has no captions — fall back to transcribe_video in that case to generate one with AI ASR. This call is free.

Input Schema

video_id (required): YouTube video ID (e.g. 'dQw4w9WgXcQ') or full YouTube URL.
lang (optional): ISO 639-1 language code to select among multilingual captions (e.g. 'en', 'zh', 'ja'). Omit for the video's default language.
save (optional): When true, also save the video to the user's Library in the same call. Bookmarks the meta row and flips has_asr when the transcript was produced by our ASR. Does NOT upload a summary — use save_to_library with kind='summary' or kind='both' for that.
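For reference, a complete request for this tool under the standard MCP JSON-RPC tools/call shape might look like the following (the id and argument values are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "fetch_transcript",
    "arguments": { "video_id": "dQw4w9WgXcQ", "lang": "en", "save": false }
  }
}
```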

Implementation Reference

  • Schema definition for the 'fetch_transcript' tool - defines name, description, annotations, and inputSchema with parameters: video_id (required), lang (optional ISO 639-1 code), save (optional boolean).
    {
      name: "fetch_transcript",
      description:
        "Fetch the existing official transcript (subtitles/captions) of a YouTube video, with per-segment timestamps and language detected. Errors with NO_CAPTIONS if the video has no captions — fall back to transcribe_video in that case to generate one with AI ASR. This call is free.",
      annotations: { title: "Fetch YouTube Video Transcript", ...ANN.YT_READ },
      inputSchema: {
        type: "object",
        properties: {
          video_id: {
            type: "string",
            description: "YouTube video ID (e.g. 'dQw4w9WgXcQ') or full YouTube URL.",
            minLength: 5,
          },
          lang: {
            type: "string",
            description:
              "ISO 639-1 language code to select among multilingual captions (e.g. 'en', 'zh', 'ja'). Omit for the video's default language.",
          },
          save: {
            type: "boolean",
            description:
              "When true, also save the video to the user's Library in the same call. Bookmarks the meta row and flips has_asr when the transcript was produced by our ASR. Does NOT upload a summary — use save_to_library with kind='summary' or kind='both' for that.",
          },
        },
        required: ["video_id"],
      },
    }
  • The generic CallToolRequestSchema handler that forwards all tool calls (including fetch_transcript) to the upstream API via callUpstream. No tool-specific fetch_transcript handler exists; every tool goes through this generic proxy pattern.
    server.setRequestHandler(CallToolRequestSchema, async (request) => {
      try {
        return await callUpstream(
          request.params.name,
          request.params.arguments || {}
        );
      } catch (err) {
        return {
          content: [{ type: "text", text: err.message || String(err) }],
          isError: true,
        };
      }
    });
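The catch branch above can be factored into a small helper for illustration. toErrorResult is a hypothetical name (the actual handler inlines this logic); it shows how any thrown error is shaped into an MCP result rather than propagating as a raw throw:

```javascript
// Hypothetical helper mirroring the handler's catch branch: a thrown error
// becomes a text content block with isError set, so the client always
// receives a well-formed tool result.
function toErrorResult(err) {
  return {
    content: [{ type: "text", text: err.message || String(err) }],
    isError: true,
  };
}
```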
  • src/index.js:443-446 (registration): Server instantiation with the tools capability. Tools are registered by including them in the TOOLS array, which includes fetch_transcript.
    const server = new Server(
      { name: "subdownload", version: "1.0.0" },
      { capabilities: { tools: {} } }
    );
  • The callUpstream helper function that forwards tool calls (by name and arguments) to the upstream MCP endpoint at api.subdownload.com/mcp. Used by the CallToolRequestSchema handler for all tools including fetch_transcript.
    async function callUpstream(name, args) {
      if (!API_KEY) {
        throw new Error(
          "SUBDOWNLOAD_API_KEY env var is not set. Get one at https://subdownload.com/account, then run with -e SUBDOWNLOAD_API_KEY=<your-key>."
        );
      }
      const res = await fetch(UPSTREAM_URL, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Accept: "application/json, text/event-stream",
          Authorization: `Bearer ${API_KEY}`,
        },
        body: JSON.stringify({
          jsonrpc: "2.0",
          id: Date.now(),
          method: "tools/call",
          params: { name, arguments: args },
        }),
      });
      const text = await res.text();
      let body;
      try {
        body = JSON.parse(text);
      } catch {
        throw new Error(
          `Upstream returned non-JSON response (HTTP ${res.status}): ${text.slice(0, 200)}`
        );
      }
      if (body.error) {
        throw new Error(body.error.message || JSON.stringify(body.error));
      }
      return body.result;
    }
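The NO_CAPTIONS fallback that the tool description recommends can be sketched on top of callUpstream. This is a sketch under assumptions: getTranscript is a hypothetical wrapper, `call` stands in for callUpstream (injected so the sketch is self-contained and testable without the network), and the check matches on the message text because callUpstream surfaces upstream errors as plain Error messages:

```javascript
// Hypothetical wrapper: try the free official transcript first, and only
// fall back to AI ASR (transcribe_video) when the video has no captions.
async function getTranscript(call, videoId, lang) {
  try {
    return await call("fetch_transcript", { video_id: videoId, lang });
  } catch (err) {
    if (String(err.message || err).includes("NO_CAPTIONS")) {
      return await call("transcribe_video", { video_id: videoId, lang });
    }
    throw err; // any other failure is not recoverable by ASR
  }
}
```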
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, and the description adds error behavior (NO_CAPTIONS), cost (free), and output details (timestamps, language). It could mention rate limits or output structure, but overall adds good context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three efficient sentences: first states purpose, second covers error and fallback, third notes cost. Front-loaded and no redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers purpose, parameters, error handling, and alternative. However, without an output schema, some detail on the return structure (e.g., exact format of segments) would improve completeness. Still, for a fetch tool with good annotations, it is fairly complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, yet the description adds significant meaning: explains how 'lang' selects among multilingual captions and details what 'save' does (saves to library, flips has_asr). This goes well beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool fetches official YouTube video transcripts with timestamps and language detection. It clearly differentiates from sibling tools by mentioning the fallback to transcribe_video when no captions exist.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance: it errors with NO_CAPTIONS and instructs to fall back to transcribe_video. This is a clear when-to-use and when-not-to-use instruction, naming the alternative tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
