Skip to main content
Glama

rip_transcript

Retrieve YouTube video transcripts and save them as Markdown or timestamped JSON files for AI agents, RAG pipelines, and LLM workflows.

Instructions

Extract the full transcript from a YouTube video. By default, saves the transcript as a Markdown file to disk and returns a resource_link + metadata (title, channel, language, duration, word count, saved path, preview). The full transcript text is NOT returned by default — this keeps context lean and gives the user a persistent file they can reuse. After calling, always tell the user where the file was saved. If you need the transcript text later (summarize, search, extract quotes), read the returned resource rather than re-ripping. Pass return_text: true only for short clips or when the user explicitly asks for the transcript inline. Pass format: 'segments' to save timestamped JSON instead of Markdown.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
urlYesYouTube video URL. Supports any YouTube URL format (watch, youtu.be, embed, shorts, or bare 11-char ID).
formatNoSaved file format. 'text' writes a Markdown file with YAML frontmatter and a continuous transcript block (best for RAG/LLM). 'segments' writes a JSON file with timestamped segments (best for chapter markers, precise citations). Default: text.
save_pathNoOptional override for where to save the transcript. Can be an absolute path, a ~/-relative path, a directory (file is named automatically), or a full file path. When you save outside the default directory, the returned resource_link still points to the saved file so it can be read later. Default: ~/rippr/transcripts/<slug>_<videoId>.<ext>
return_textNoIf true, include the full transcript text in the tool response alongside the resource_link. Default: false. Use true only for short clips or when the user explicitly wants the transcript inline.

Implementation Reference

  • Main handler for the rip_transcript tool. Receives a YouTube URL, extracts video ID, fetches the transcript via transcript.ts, saves it to disk via storage.ts, and returns a resource_link + metadata summary (title, channel, language, duration, word count, preview). Optionally returns the full transcript text if return_text: true.
    server.setRequestHandler(CallToolRequestSchema, async (request) => {
      if (request.params.name !== "rip_transcript") {
        return {
          content: [{ type: "text", text: `Unknown tool: ${request.params.name}` }],
          isError: true,
        };
      }
    
      const args = request.params.arguments as {
        url: string;
        format?: "text" | "segments";
        save_path?: string;
        return_text?: boolean;
      };
    
      const videoId = extractVideoId(args.url);
      if (!videoId) {
        return {
          content: [
            {
              type: "text",
              text: `Could not extract a YouTube video ID from: ${args.url}`,
            },
          ],
          isError: true,
        };
      }
    
      try {
        const result = await fetchTranscript(videoId);
        const format: "text" | "segments" = args.format || "text";
        const returnText = args.return_text === true;
    
        const saved = await saveTranscript({
          videoId,
          title: result.title,
          channel: result.channel,
          language: result.language,
          isAuto: result.isAuto,
          segments: result.segments,
          format,
          savePath: args.save_path,
        });
    
        const fullText = result.segments.map((s) => s.text).join(" ");
        const duration = result.segments.reduce(
          (acc, s) => Math.max(acc, s.start + s.duration),
          0
        );
        const wordCount = fullText.split(/\s+/).filter(Boolean).length;
        const previewSource = fullText.replace(/\s+/g, " ").trim();
        const preview =
          previewSource.slice(0, 240) + (previewSource.length > 240 ? "…" : "");
    
        const fileUri = pathToFileURL(saved.path).href;
        const mimeType =
          format === "segments" ? "application/json" : "text/markdown";
    
        const summaryText = [
          `✓ Saved transcript to: ${saved.path}`,
          ``,
          `Title:    ${result.title}`,
          `Channel:  ${result.channel}`,
          `Language: ${result.language}${result.isAuto ? " (auto-generated)" : ""}`,
          `Duration: ${formatDuration(duration)}`,
          `Words:    ${wordCount.toLocaleString()}`,
          `Segments: ${result.segments.length.toLocaleString()}`,
          `URL:      https://www.youtube.com/watch?v=${videoId}`,
          ``,
          `Preview:  ${preview}`,
          ``,
          returnText
            ? `Full transcript included below.`
            : `To read the full transcript, fetch the resource at: ${fileUri}`,
        ].join("\n");
    
        const content: Array<Record<string, unknown>> = [
          {
            type: "resource_link",
            uri: fileUri,
            name: saved.filename,
            mimeType,
            description: `Transcript of "${result.title}" by ${result.channel} (${wordCount.toLocaleString()} words)`,
          },
          {
            type: "text",
            text: summaryText,
          },
        ];
    
        if (returnText) {
          content.push({
            type: "text",
            text: `\n--- Full transcript ---\n\n${fullText}`,
          });
        }
    
        return { content };
      } catch (error: unknown) {
        const message = error instanceof Error ? error.message : String(error);
        return {
          content: [
            {
              type: "text",
              text: `Failed to extract transcript: ${message}`,
            },
          ],
          isError: true,
        };
      }
    });
  • Tool registration with input schema for rip_transcript. Defines the name, description, and inputSchema with properties: url (required string), format (optional 'text'|'segments'), save_path (optional string), return_text (optional boolean).
      name: "rip_transcript",
      description:
        "Extract the full transcript from a YouTube video. By default, saves the transcript as a Markdown file to disk and returns a resource_link + metadata (title, channel, language, duration, word count, saved path, preview). The full transcript text is NOT returned by default — this keeps context lean and gives the user a persistent file they can reuse. After calling, always tell the user where the file was saved. If you need the transcript text later (summarize, search, extract quotes), read the returned resource rather than re-ripping. Pass return_text: true only for short clips or when the user explicitly asks for the transcript inline. Pass format: 'segments' to save timestamped JSON instead of Markdown.",
      inputSchema: {
        type: "object" as const,
        properties: {
          url: {
            type: "string",
            description:
              "YouTube video URL. Supports any YouTube URL format (watch, youtu.be, embed, shorts, or bare 11-char ID).",
          },
          format: {
            type: "string",
            enum: ["text", "segments"],
            description:
              "Saved file format. 'text' writes a Markdown file with YAML frontmatter and a continuous transcript block (best for RAG/LLM). 'segments' writes a JSON file with timestamped segments (best for chapter markers, precise citations). Default: text.",
          },
          save_path: {
            type: "string",
            description:
              "Optional override for where to save the transcript. Can be an absolute path, a ~/-relative path, a directory (file is named automatically), or a full file path. When you save outside the default directory, the returned resource_link still points to the saved file so it can be read later. Default: ~/rippr/transcripts/<slug>_<videoId>.<ext>",
          },
          return_text: {
            type: "boolean",
            description:
              "If true, include the full transcript text in the tool response alongside the resource_link. Default: false. Use true only for short clips or when the user explicitly wants the transcript inline.",
          },
        },
        required: ["url"],
      },
    },
  • mcp/src/index.ts:53-88 (registration)
    Registration of the rip_transcript tool via ListToolsRequestSchema handler. The tool is listed in the tools array returned by the server.
    server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        {
          name: "rip_transcript",
          description:
            "Extract the full transcript from a YouTube video. By default, saves the transcript as a Markdown file to disk and returns a resource_link + metadata (title, channel, language, duration, word count, saved path, preview). The full transcript text is NOT returned by default — this keeps context lean and gives the user a persistent file they can reuse. After calling, always tell the user where the file was saved. If you need the transcript text later (summarize, search, extract quotes), read the returned resource rather than re-ripping. Pass return_text: true only for short clips or when the user explicitly asks for the transcript inline. Pass format: 'segments' to save timestamped JSON instead of Markdown.",
          inputSchema: {
            type: "object" as const,
            properties: {
              url: {
                type: "string",
                description:
                  "YouTube video URL. Supports any YouTube URL format (watch, youtu.be, embed, shorts, or bare 11-char ID).",
              },
              format: {
                type: "string",
                enum: ["text", "segments"],
                description:
                  "Saved file format. 'text' writes a Markdown file with YAML frontmatter and a continuous transcript block (best for RAG/LLM). 'segments' writes a JSON file with timestamped segments (best for chapter markers, precise citations). Default: text.",
              },
              save_path: {
                type: "string",
                description:
                  "Optional override for where to save the transcript. Can be an absolute path, a ~/-relative path, a directory (file is named automatically), or a full file path. When you save outside the default directory, the returned resource_link still points to the saved file so it can be read later. Default: ~/rippr/transcripts/<slug>_<videoId>.<ext>",
              },
              return_text: {
                type: "boolean",
                description:
                  "If true, include the full transcript text in the tool response alongside the resource_link. Default: false. Use true only for short clips or when the user explicitly wants the transcript inline.",
              },
            },
            required: ["url"],
          },
        },
      ],
    }));
  • fetchTranscript function - core transcript extraction logic. Tries InnerTube API first, falls back to HTML scraping, then fetches and parses caption XML. Returns title, channel, language, isAuto, and segments.
    export async function fetchTranscript(videoId: string): Promise<TranscriptResult> {
      let captionTracks: any[] | undefined;
      let metadata: { title: string; author: string; duration: number } | undefined;
    
      // Try InnerTube API first
      try {
        const result = await fetchViaInnerTube(videoId);
        captionTracks = result.captionTracks;
        metadata = result.metadata;
      } catch {
        // silent — fall through to HTML scraping
      }
    
      // Fallback: scrape watch page HTML
      if (!captionTracks || captionTracks.length === 0) {
        const result = await fetchViaWebPage(videoId);
        captionTracks = result.captionTracks;
        metadata = result.metadata;
      }
    
      if (!captionTracks || captionTracks.length === 0) {
        throw new Error(
          "No captions found. This video may not have subtitles enabled."
        );
      }
    
      const track = captionTracks[0];
      const segments = await fetchCaptionXml(track.baseUrl);
    
      return {
        title: metadata!.title,
        channel: metadata!.author,
        language: track.languageCode,
        isAuto: track.kind === "asr",
        segments,
      };
    }
  • saveTranscript function - saves the fetched transcript to disk as Markdown or JSON. Handles custom save paths, creates directories, and registers custom save paths.
    export async function saveTranscript(args: SaveArgs): Promise<SaveResult> {
      const ext = args.format === "segments" ? "json" : "md";
      const slug = slugify(args.title);
      const defaultFilename = `${slug}_${args.videoId}.${ext}`;
    
      let targetPath: string;
      if (args.savePath && args.savePath.trim().length > 0) {
        const expanded = expandHome(args.savePath.trim());
        let treatAsDir = false;
        try {
          const stat = await fs.stat(expanded);
          treatAsDir = stat.isDirectory();
        } catch {
          // Path doesn't exist yet. If it has a known file extension, treat as file.
          treatAsDir = !/\.(md|json|txt)$/i.test(expanded);
        }
        targetPath = treatAsDir ? path.join(expanded, defaultFilename) : expanded;
      } else {
        targetPath = path.join(DEFAULT_SAVE_DIR, defaultFilename);
      }
    
      await fs.mkdir(path.dirname(targetPath), { recursive: true });
    
      const contents =
        args.format === "segments"
          ? renderSegmentsJson(args)
          : renderMarkdown(args);
    
      await fs.writeFile(targetPath, contents, "utf8");
      await registerSavedPath(targetPath);
    
      return {
        path: targetPath,
        filename: path.basename(targetPath),
      };
    }
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Despite no annotations, the description fully discloses default behavior (saves file, does not return text), rationale (keep context lean, persistent file), and side effects (saves to disk). Could explicitly mention idempotency or rate limits, but overall strong.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Seven sentences, all contributing meaningful information. Front-loaded with purpose and key behavioral notes. No redundant or unnecessary content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all critical aspects: default behavior, parameter usage, post-call action (tell user where saved). No output schema, but description is sufficient for agent to correctly invoke and follow up. Minor missing detail on resource_link format, but not essential.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds significant value: explains default behavior for format and return_text, supported URL formats, and save_path resolution details. Provides usage recommendations that go beyond schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description states 'Extract the full transcript from a YouTube video' with a specific verb and resource, clearly defining the tool's function. No sibling tools exist, so differentiation is not an issue.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance: default saves file and returns metadata, advises against re-ripping by reading resource, specifies when to use return_text (short clips or explicit request) and format alternatives. Covers context and exclusions thoroughly.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/mrslbt/rippr'

If you have feedback or need assistance with the MCP directory API, please join our Discord server