YouTube Transcript Downloader
Server Details
Clean YouTube transcripts for agents: single videos, channels, playlists, plus AI caption cleanup.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.8/5 across 4 of 4 tools scored.
Each tool targets a distinct use case: single video, channel recent videos, playlist, and transcript polishing. No overlapping purposes.
All tools follow a verb_noun pattern (get_transcript, get_channel_transcripts, get_playlist_transcripts, polish_transcript), consistent and predictable.
Four tools cover the core needs of a transcript downloader (single, channel, playlist, polish) without being excessive or insufficient.
The set covers the main operations for accessing transcripts, though additional features like language selection or format options are absent.
Available Tools
4 tools
get_channel_transcripts: Get Channel Transcripts
A, Read-only
Get transcripts for a YouTube channel's most recent videos (newest first) as timestamped markdown, one section per video. Use for research across a creator's recent output; for one known video use get_transcript. Read-only; requires an API key. Charges 1 credit per video that returns a transcript, including repeat calls; videos without captions are skipped free. A 10-video call typically costs up to 10 credits, so start with a small limit. Rate limit: 5 requests per 10 seconds.
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | YouTube channel URL or handle (e.g. https://www.youtube.com/@lexfridman or @lexfridman) | |
| limit | No | Number of most-recent videos to fetch, 1-50. Upper bound on the credit charge for this call. | 10 |
Output Schema
| Name | Required | Description |
|---|---|---|
| failed | Yes | Videos skipped without charge (no captions) |
| channel | Yes | Channel name |
| succeeded | Yes | Videos that returned a transcript (each charged 1 credit) |
| creditsUsed | Yes | Credits charged for this call |
| totalVideos | Yes | Videos attempted in this call |
| transcripts | Yes | All transcripts as timestamped markdown, one section per video, divider-separated |
| creditsRemaining | Yes | Account balance after this call |
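The credit rule above (at most 1 credit per video, `limit` between 1 and 50) can be turned into a small pre-flight check. This is a minimal sketch, not part of the server: the function name `affordable_limit` is hypothetical, and it assumes the caller already knows the balance from a previous call's `creditsRemaining`.

```python
MAX_LIMIT = 50  # upper end of the documented 1-50 range


def affordable_limit(requested: int, credits_remaining: int) -> int:
    """Clamp a requested `limit` so the worst-case charge fits the balance.

    The worst case is one credit per video, i.e. every video has captions.
    Hypothetical helper; the server does not expose this function.
    """
    if requested < 1:
        raise ValueError("limit must be at least 1")
    if credits_remaining < 1:
        raise ValueError("no credits left for a batch call")
    return min(requested, MAX_LIMIT, credits_remaining)
```

With a balance of 3 credits, a requested limit of 10 would be reduced to 3, so the call cannot be charged more than the account holds.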
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnlyHint, etc.), the description adds context: API key requirement, credit charges per transcript, free skipping for no captions, rate limit of 5/10s, and recommendation to limit video count.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Six sentences efficiently cover purpose, usage, safety, costs, and limits; no repetition or unnecessary text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that an output schema exists, the description fully explains behavior, credit model, skipping logic, and rate limit, making it self-contained for the agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions; the description adds practical context: url examples, limit as max credit bound, and default of 10.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves transcripts for a channel's most recent videos (newest first) as timestamped markdown, differentiating it from get_transcript which is for a single known video.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises use for research across recent output and directs to get_transcript for single videos, plus provides credit and rate limit guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_playlist_transcripts: Get Playlist Transcripts
A, Read-only
Get transcripts for the videos in a YouTube playlist (in playlist order) as timestamped markdown, one section per video. Use for working through a course, series, or curated list; for one known video use get_transcript. Read-only; requires an API key. Charges 1 credit per video that returns a transcript, including repeat calls; videos without captions are skipped free. A 10-video call typically costs up to 10 credits, so start with a small limit. Rate limit: 5 requests per 10 seconds.
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | YouTube playlist URL (e.g. https://www.youtube.com/playlist?list=PLxxxxxx) | |
| limit | No | Number of videos to fetch from the start of the playlist, 1-50. Upper bound on the credit charge for this call. | 10 |
Output Schema
| Name | Required | Description |
|---|---|---|
| failed | Yes | Videos skipped without charge (no captions) |
| playlist | Yes | Playlist title |
| succeeded | Yes | Videos that returned a transcript (each charged 1 credit) |
| creditsUsed | Yes | Credits charged for this call |
| totalVideos | Yes | Videos attempted in this call |
| transcripts | Yes | All transcripts as timestamped markdown, one section per video, divider-separated |
| creditsRemaining | Yes | Account balance after this call |
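All four tools state a rate limit of 5 requests per 10 seconds. A client can pace itself to stay under it; the sketch below uses a sliding window, which is an assumption, since the listing does not say whether the server counts in sliding or fixed windows. The class name `SlidingWindowLimiter` is hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` requests within any `window` seconds.

    Client-side sketch only; the server's own accounting may differ.
    """

    def __init__(self, max_calls: int = 5, window: float = 10.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window
        self.clock = clock
        self._stamps = deque()  # send times of recent requests, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before the next request (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.window - (now - self._stamps[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self._stamps.append(self.clock())
```

Before each tool call, sleep for `wait_time()` seconds if it is non-zero, then call `record()`. The injectable `clock` exists only to make the limiter testable without real delays.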
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnlyHint, etc.), description discloses authentication requirement ('requires an API key'), billing details ('charges 1 credit per video'), behavior for missing captions ('videos without captions are skipped free'), and rate limit ('5 requests per 10 seconds'). No contradiction with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Six sentences, no wasted words. The first states purpose and output format, and the second provides usage context. The remaining four cover access requirements, cost, and rate limit. Front-loaded with core information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only tool with 2 parameters, output schema exists, and annotations cover safety, the description provides complete context: output format, cost, rate limit, authentication, and usage notes. No gaps.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% and already documents both parameters. The description adds context: it presents limit as an upper bound on the credit charge and explains the credit implication. This added value justifies a score above the baseline of 3.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb+resource: 'Get transcripts for the videos in a YouTube playlist'. Distinguishes from sibling tool get_transcript: 'for one known video use get_transcript'. Specifies output format: timestamped markdown, one section per video.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'for working through a course, series, or curated list'. Provides alternative: 'for one known video use get_transcript'. Also includes credit cost and rate limit guidance.
get_transcript: Get YouTube Transcript
A, Read-only
Get the full transcript of a single YouTube video as timestamped markdown. Read-only: fetches existing captions, modifies nothing. Requires an API key; each successful call charges 1 credit, including repeat calls for the same video, so reuse a transcript already in context instead of re-fetching. Videos without captions return an error and cost nothing. Rate limit: 5 requests per 10 seconds.
| Name | Required | Description | Default |
|---|---|---|---|
| video | Yes | YouTube video ID (e.g. dQw4w9WgXcQ) or full video URL (youtube.com/watch?v=... or youtu.be/... forms) | |
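MCP tool invocations travel as JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a message for get_transcript to show the shape of the arguments. The helper name `build_tool_call` is hypothetical; in practice an MCP client library would construct the message and handle session setup, and how the API key is supplied is not stated in this listing, so it is left out.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message.

    Hypothetical helper for illustration; authentication is not shown.
    """
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# The single required argument is `video`, given as an ID or a full URL.
payload = build_tool_call("get_transcript", {"video": "dQw4w9WgXcQ"})
```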
Output Schema
| Name | Required | Description |
|---|---|---|
| title | Yes | Video title |
| videoId | Yes | YouTube video ID |
| transcript | Yes | Full transcript as timestamped markdown |
| creditsUsed | Yes | Credits charged for this call |
| creditsRemaining | Yes | Account balance after this call |
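Because repeat calls for the same video are charged again, a client may want to cache results by video ID. The following is a minimal sketch under stated assumptions: `video_id` and `TranscriptCache` are hypothetical names, `fetch` stands in for whatever function actually calls get_transcript, and only the ID and URL forms named in the parameter table are handled.

```python
from urllib.parse import parse_qs, urlparse


def video_id(video: str) -> str:
    """Reduce a bare ID, youtube.com/watch?v=... URL, or youtu.be/... URL to the ID."""
    if "/" not in video:
        return video  # already a bare ID
    # Accept URLs written without a scheme, e.g. "youtu.be/<id>".
    parsed = urlparse(video if "://" in video else "https://" + video)
    if parsed.netloc.endswith("youtu.be"):
        return parsed.path.lstrip("/")
    ids = parse_qs(parsed.query).get("v")
    if not ids:
        raise ValueError(f"no video ID found in {video!r}")
    return ids[0]


class TranscriptCache:
    """Remember fetched transcripts so the same video is only charged once."""

    def __init__(self, fetch):
        self._fetch = fetch  # assumed callable: video ID -> tool result
        self._store = {}

    def get(self, video: str):
        key = video_id(video)
        if key not in self._store:
            # Errors (e.g. no captions) propagate and are not cached.
            self._store[key] = self._fetch(key)
        return self._store[key]
```

Normalizing to the bare ID first means the same video requested as an ID, a watch URL, and a short URL results in a single fetch.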
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds rate limit (5 per 10 seconds), credit cost details, and error conditions beyond annotations. Confirms read-only nature ('fetches existing captions, modifies nothing') aligning with readOnlyHint=true. No contradictions.
Is the description appropriately sized, front-loaded, and free of redundancy?
Five sentences, each adding distinct value: purpose, read-only, usage advice, error handling, rate limit. No redundancy or fluff.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given single parameter, high schema coverage, and existing output schema, description fully covers necessary information for agent to use tool correctly and efficiently.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear description of video parameter. Description adds no extra meaning to the parameter beyond what schema provides, so baseline 3 is appropriate.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states verb 'Get', resource 'full transcript of a single YouTube video', and output format 'timestamped markdown'. Distinguishes from siblings like get_channel_transcripts and get_playlist_transcripts which handle multiple videos.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit guidance: API key required, credit cost and reusing transcripts to avoid charges, error behavior for videos without captions. Lacks explicit when-to-use versus siblings, but context signals partially compensate.
polish_transcript: Polish YouTube Transcript
A, Read-only
Get a cleaned-up transcript of a YouTube video's auto-generated captions: punctuation and capitalisation restored, filler and false starts removed, paragraphs added, misheard names fixed, faithful to what was said. Use when raw captions are too messy to read or quote; for a plain transcript use get_transcript. Read-only; requires an API key. Each call charges credits by transcript length (about 3 per 1,000 words, minimum 5), including repeat calls, so keep the result in context. Human-uploaded captions (already clean) and transcripts over ~7,000 words return an error without charging. Rate limit: 5 requests per 10 seconds.
| Name | Required | Description | Default |
|---|---|---|---|
| video | Yes | YouTube video ID (e.g. dQw4w9WgXcQ) or full video URL (youtube.com/watch?v=... or youtu.be/... forms) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| title | Yes | Video title |
| videoId | Yes | YouTube video ID |
| transcript | Yes | Cleaned transcript as timestamped markdown |
| creditsUsed | Yes | Credits charged for this call (scales by length) |
| creditsRemaining | Yes | Account balance after this call |
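The pricing rule for polish_transcript (about 3 credits per 1,000 words, minimum 5, error above roughly 7,000 words) can be estimated before calling. This is a rough sketch: the listing says "about", so the exact rounding is unknown, and rounding up is an assumption here, as is treating 7,000 as a hard cutoff. The function name is hypothetical.

```python
import math
from typing import Optional

CREDITS_PER_1000_WORDS = 3
MIN_CREDITS = 5
MAX_WORDS = 7000  # the listing says "~7,000", so this cutoff is approximate


def estimate_polish_credits(word_count: int) -> Optional[int]:
    """Estimate the charge for polishing a transcript of `word_count` words.

    Returns None when the transcript is likely too long to be accepted.
    Rounding up is an assumption; the actual charge is reported in creditsUsed.
    """
    if word_count > MAX_WORDS:
        return None
    scaled = math.ceil(word_count * CREDITS_PER_1000_WORDS / 1000)
    return max(MIN_CREDITS, scaled)
```

Under these assumptions a 1,000-word transcript falls under the minimum and costs 5 credits, while a 5,000-word transcript costs about 15. The `creditsUsed` field in the response remains the authoritative figure.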
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. Description adds critical context: requires API key, credit costs (3 per 1000 words, min 5), repeat calls charge, errors for human captions or long transcripts. No contradiction.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is detailed but every sentence serves a purpose. Could be slightly condensed, but maintains high information density without redundancy.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With one parameter fully covered, output schema present, and comprehensive descriptions of costs, errors, rate limits, and sibling differentiation, the tool definition is fully self-contained for an agent to invoke correctly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the 'video' parameter is documented as accepting an ID or URL forms (both youtube.com and youtu.be). This provides usage meaning beyond the bare schema structure.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Get a cleaned-up transcript...' with specific improvements (punctuation, filler removal, etc.) and distinguishes it from sibling 'get_transcript' for plain transcripts.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says when to use ('when raw captions are too messy to read or quote') vs 'get_transcript' ('for a plain transcript'). Also covers costs, rate limits, and error conditions, providing comprehensive guidance.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
    {
      "$schema": "https://glama.ai/mcp/schemas/connector.json",
      "maintainers": [{ "email": "your-email@example.com" }]
    }
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
- The server is experiencing an outage
- The URL of the server is wrong
- Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.