Dumpling AI MCP Server

Official
by DumplingAI

get-youtube-transcript

Extract transcripts from YouTube videos with options for timestamps and language preferences to support content analysis and accessibility.

Instructions

Extract transcripts from YouTube videos with optional parameters for timestamps and language preferences.

Input Schema

Name                 Required  Description                      Default
videoUrl             Yes       URL of the YouTube video
includeTimestamps    No        Whether to include timestamps    true
timestampsToCombine  No        Number of timestamps to combine
preferredLanguage    No        Preferred language code
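
As a rough sketch, the input fields listed above can be mirrored as a TypeScript type. The names `GetYoutubeTranscriptInput` and `applyDefaults` are hypothetical; only the field names and the `true` default for `includeTimestamps` (declared in the Zod schema under Implementation Reference) come from this page. The `"en"` value is an assumed example, since the page only says "language code".

```typescript
// Hypothetical type mirroring the documented input fields.
interface GetYoutubeTranscriptInput {
  videoUrl: string;             // required
  includeTimestamps?: boolean;  // optional, schema default: true
  timestampsToCombine?: number; // optional
  preferredLanguage?: string;   // optional
}

// Hypothetical helper: fills in the one default the schema declares.
function applyDefaults(
  input: GetYoutubeTranscriptInput
): GetYoutubeTranscriptInput & { includeTimestamps: boolean } {
  return { ...input, includeTimestamps: input.includeTimestamps ?? true };
}

const args = applyDefaults({
  videoUrl: "https://www.youtube.com/watch?v=VIDEO_ID", // placeholder URL
  preferredLanguage: "en", // assumed format, not confirmed by the page
});
console.log(JSON.stringify(args, null, 2));
```

An explicit `includeTimestamps: false` is preserved, because `??` only replaces `undefined` and `null`.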

Implementation Reference

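  • Handler function that proxies the request to Dumpling AI's get-youtube-transcript API endpoint, authenticates with a bearer API key read from the DUMPLING_API_KEY environment variable, throws on any non-OK response, and returns the response data as a pretty-printed JSON string.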
    async ({
      videoUrl,
      includeTimestamps,
      timestampsToCombine,
      preferredLanguage,
    }) => {
      const apiKey = process.env.DUMPLING_API_KEY;
      if (!apiKey) throw new Error("DUMPLING_API_KEY not set");
      const response = await fetch(
        `${NWS_API_BASE}/api/v1/get-youtube-transcript`,
        {
          method: "POST",
          headers: {
            "Content-Type": "application/json",
            Authorization: `Bearer ${apiKey}`,
          },
          body: JSON.stringify({
            videoUrl,
            includeTimestamps,
            timestampsToCombine,
            preferredLanguage,
          }),
        }
      );
      if (!response.ok)
        throw new Error(`Failed: ${response.status} ${await response.text()}`);
      const data = await response.json();
      return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
    }
  • Zod schema defining the input parameters for the get-youtube-transcript tool.
    {
      videoUrl: z.string().url().describe("URL of the YouTube video"),
      includeTimestamps: z
        .boolean()
        .optional()
        .default(true)
        .describe("Whether to include timestamps"),
      timestampsToCombine: z
        .number()
        .optional()
        .describe("Number of timestamps to combine"),
      preferredLanguage: z
        .string()
        .optional()
        .describe("Preferred language code"),
    },
  • src/index.ts:16-64 (registration)
    MCP server.tool registration for the get-youtube-transcript tool, including description, input schema, and inline handler.
    server.tool(
      "get-youtube-transcript",
      "Extract transcripts from YouTube videos with optional parameters for timestamps and language preferences.",
      {
        videoUrl: z.string().url().describe("URL of the YouTube video"),
        includeTimestamps: z
          .boolean()
          .optional()
          .default(true)
          .describe("Whether to include timestamps"),
        timestampsToCombine: z
          .number()
          .optional()
          .describe("Number of timestamps to combine"),
        preferredLanguage: z
          .string()
          .optional()
          .describe("Preferred language code"),
      },
      async ({
        videoUrl,
        includeTimestamps,
        timestampsToCombine,
        preferredLanguage,
      }) => {
        const apiKey = process.env.DUMPLING_API_KEY;
        if (!apiKey) throw new Error("DUMPLING_API_KEY not set");
        const response = await fetch(
          `${NWS_API_BASE}/api/v1/get-youtube-transcript`,
          {
            method: "POST",
            headers: {
              "Content-Type": "application/json",
              Authorization: `Bearer ${apiKey}`,
            },
            body: JSON.stringify({
              videoUrl,
              includeTimestamps,
              timestampsToCombine,
              preferredLanguage,
            }),
          }
        );
        if (!response.ok)
          throw new Error(`Failed: ${response.status} ${await response.text()}`);
        const data = await response.json();
        return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
      }
    );
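
The handler above is hard to exercise without a network call and a real key. A minimal sketch of the same request construction as a pure function follows. `buildTranscriptRequest` and its `baseUrl` and `apiKey` parameters are hypothetical (the original reads the base URL from `NWS_API_BASE` and the key from `process.env.DUMPLING_API_KEY`), and the host in the usage lines is a placeholder, not the real endpoint.

```typescript
interface TranscriptArgs {
  videoUrl: string;
  includeTimestamps?: boolean;
  timestampsToCombine?: number;
  preferredLanguage?: string;
}

interface BuiltRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds the request the handler would send,
// without performing any I/O.
function buildTranscriptRequest(
  baseUrl: string,
  apiKey: string,
  args: TranscriptArgs
): BuiltRequest {
  // Same guard as the original handler.
  if (!apiKey) throw new Error("DUMPLING_API_KEY not set");
  return {
    url: `${baseUrl}/api/v1/get-youtube-transcript`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    // JSON.stringify omits properties whose value is undefined,
    // so optional arguments that were not provided are not sent.
    body: JSON.stringify(args),
  };
}

// Placeholder host and key, for illustration only.
const req = buildTranscriptRequest("https://api.example.invalid", "test-key", {
  videoUrl: "https://www.youtube.com/watch?v=VIDEO_ID",
  includeTimestamps: true,
});
console.log(req.url);
console.log(req.body);
```

Separating request construction from the `fetch` call is a design choice made for this sketch so that the URL, headers, and body can be checked in isolation.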
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. While it mentions 'optional parameters for timestamps and language preferences,' it doesn't describe what happens when transcripts aren't available, rate limits, authentication needs, error conditions, or the format/structure of the returned transcript. For a tool that interacts with external services, this leaves significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
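
One behavior this page does show, in the handler under Implementation Reference, is that a non-OK response is rethrown as an `Error` whose message has the form `Failed: <status> <body>`. A caller could recover the status code from that message along these lines. The helper name `parseHandlerError` is hypothetical, and treating 429 and 5xx as retryable is a common convention assumed here, not something this page documents for the Dumpling AI API. How an MCP client surfaces the thrown error (for example, with an added prefix) is also outside this sketch.

```typescript
interface ParsedFailure {
  status: number;
  body: string;
  retryable: boolean; // assumption: 429 and 5xx are worth retrying
}

// Hypothetical helper: parses messages shaped like "Failed: 404 Not found".
// Returns null when the message does not match that shape.
function parseHandlerError(message: string): ParsedFailure | null {
  const match = /^Failed: (\d{3}) ([\s\S]*)$/.exec(message);
  if (!match) return null;
  const status = Number(match[1]);
  return {
    status,
    body: match[2] ?? "",
    retryable: status === 429 || status >= 500,
  };
}

const failure = parseHandlerError("Failed: 429 Too Many Requests");
console.log(failure);
```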

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured sentence that efficiently communicates the core functionality and key optional features. Every word earns its place with no redundancy or fluff. It's appropriately sized for a tool with clear parameters documented elsewhere.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of extracting transcripts from an external service like YouTube, the description is insufficient. With no annotations and no output schema, it doesn't address critical aspects like error handling (e.g., if the video has no transcript), rate limits, authentication requirements, or the structure of the returned data. The agent lacks necessary context for reliable tool invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
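
The page does not document the transcript's structure, but the handler under Implementation Reference does fix the outer envelope: a single `text` content item holding the API response serialized with `JSON.stringify(data, null, 2)`. A minimal sketch of unwrapping that envelope follows. `ToolResult` and `unwrapToolResult` are hypothetical names, and the sample payload is invented purely to exercise the function; the real response fields are unknown here, so the result is typed `unknown`.

```typescript
interface TextContent {
  type: string;
  text: string;
}

interface ToolResult {
  content: TextContent[];
}

// Hypothetical helper: parses the JSON string the handler places in content[0].
function unwrapToolResult(result: ToolResult): unknown {
  const first = result.content[0];
  if (!first || first.type !== "text") {
    throw new Error("Expected a text content item at index 0");
  }
  return JSON.parse(first.text);
}

// Invented sample payload, shaped only like the envelope the handler returns.
const sample: ToolResult = {
  content: [{ type: "text", text: JSON.stringify({ example: true }, null, 2) }],
};
console.log(unwrapToolResult(sample));
```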

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description adds minimal value by mentioning 'optional parameters for timestamps and language preferences,' which loosely references 'includeTimestamps' and 'preferredLanguage' but doesn't provide additional context beyond what's in the schema. This meets the baseline of 3 for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Extract transcripts') and resource ('from YouTube videos'), making the purpose immediately understandable. It distinguishes itself from sibling tools like 'extract-audio' or 'extract-video' by focusing specifically on transcripts. However, it doesn't explicitly differentiate from potential transcript-related tools that might exist elsewhere, keeping it at a 4 rather than a 5.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention any prerequisites (like needing a valid YouTube URL), nor does it compare with sibling tools like 'extract-video' or 'extract-audio' that might handle different aspects of YouTube content. The agent must infer usage context solely from the tool name and parameters.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
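
The schema only requires `videoUrl` to be a syntactically valid URL, so a caller may want a cheap pre-check before invoking the tool. The sketch below is one such guard. `looksLikeYoutubeUrl` is a hypothetical helper, and the host list is an assumption about common YouTube hostnames, not a rule taken from the Dumpling AI API.

```typescript
// Assumed set of hostnames; extend or replace as needed.
const YOUTUBE_HOSTS = new Set<string>([
  "youtube.com",
  "www.youtube.com",
  "m.youtube.com",
  "youtu.be",
]);

// Hypothetical pre-check: true when the value parses as an http(s) URL
// whose hostname is in the assumed list above.
function looksLikeYoutubeUrl(value: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(value);
  } catch {
    return false; // not a parseable absolute URL
  }
  const isHttp = parsed.protocol === "https:" || parsed.protocol === "http:";
  return isHttp && YOUTUBE_HOSTS.has(parsed.hostname);
}

console.log(looksLikeYoutubeUrl("https://www.youtube.com/watch?v=VIDEO_ID"));
console.log(looksLikeYoutubeUrl("https://example.com/watch?v=VIDEO_ID"));
```

A check like this only filters obviously wrong input; it says nothing about whether the video exists or has a transcript.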

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/DumplingAI/mcp-server-dumplingai'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.