YouTube Transcript MCP Server

README.md•16.8 KiB

# YouTube Transcript Library A pure TypeScript library for fetching YouTube video transcripts. This library is designed to be used as a standalone package or integrated into other projects. ## Features - ✅ Fetch transcripts from YouTube videos - ✅ Support for multiple languages - ✅ Auto-generated and manual transcripts - ✅ List all available transcript languages - ✅ TypeScript native with full type safety - ✅ Zero dependencies (uses native fetch API) - ✅ Comprehensive error handling - ✅ Designed to be extracted as a separate npm package ## Architecture This library is a custom implementation in modern JavaScript: ``` yt-lib/ ├── src/ │ ├── index.ts # Main exports │ ├── types.ts # TypeScript types and interfaces │ ├── errors.ts # Custom error classes │ ├── parser.ts # XML transcript parser │ └── fetcher.ts # Core fetching logic ├── package.json # Package configuration └── README.md # This file ``` ## Usage ### Basic Usage ```typescript import { YouTubeTranscriptFetcher } from "./yt-lib/src/index.js"; const fetcher = new YouTubeTranscriptFetcher(); // Fetch a transcript const transcript = await fetcher.fetchTranscript("dQw4w9WgXcQ", { languages: ["en"], preserveFormatting: false, }); console.log(transcript.snippets); // [ // { text: "Never gonna give you up", start: 0, duration: 2.5 }, // { text: "Never gonna let you down", start: 2.5, duration: 2.8 }, // ... // ] ``` ### List Available Transcripts ```typescript const transcripts = await fetcher.listTranscripts("dQw4w9WgXcQ"); console.log(transcripts); // [ // { // videoId: "dQw4w9WgXcQ", // language: "English", // languageCode: "en", // isGenerated: false, // isTranslatable: true, // ... // }, // ... // ] ``` ### Multiple Language Preferences ```typescript // Try German first, then English, then Spanish const transcript = await fetcher.fetchTranscript("dQw4w9WgXcQ", { languages: ["de", "en", "es"], }); ``` ### Extract Video ID from URL ```typescript // All these work: await fetcher.fetchTranscript("https://www.youtube.com/watch?v=dQw4w9WgXcQ"); await fetcher.fetchTranscript("https://youtu.be/dQw4w9WgXcQ"); await fetcher.fetchTranscript("dQw4w9WgXcQ"); ``` ## API Reference ### YouTubeTranscriptFetcher The main class for fetching transcripts. #### Methods ##### `fetchTranscript(videoId, options?): Promise<FetchedTranscript>` Fetches a transcript for a video. **Parameters:** - `videoId` (string): Video ID or URL - `options` (object, optional): - `languages` (string[]): Language codes in priority order (default: `["en"]`) - `preserveFormatting` (boolean): Keep HTML formatting tags (default: `false`) **Returns:** `Promise<FetchedTranscript>` ##### `listTranscripts(videoId): Promise<TranscriptInfo[]>` Lists all available transcripts for a video. **Parameters:** - `videoId` (string): Video ID or URL **Returns:** `Promise<TranscriptInfo[]>` ##### `extractVideoId(input): string` Extracts video ID from various YouTube URL formats. **Parameters:** - `input` (string): Video ID or URL **Returns:** Video ID string ## Types ### TranscriptSnippet ```typescript interface TranscriptSnippet { text: string; start: number; duration: number; } ``` ### FetchedTranscript ```typescript interface FetchedTranscript { snippets: TranscriptSnippet[]; videoId: string; language: string; languageCode: string; isGenerated: boolean; } ``` ### TranscriptInfo ```typescript interface TranscriptInfo { videoId: string; url: string; language: string; languageCode: string; isGenerated: boolean; translationLanguages: TranslationLanguage[]; isTranslatable: boolean; } ``` ## Error Handling The library provides specific error classes for different failure scenarios: ```typescript import { YouTubeTranscriptError, TranscriptsDisabledError, NoTranscriptFoundError, VideoUnavailableError, RequestBlockedError, } from "./yt-lib/src/index.js"; try { const transcript = await fetcher.fetchTranscript("some-video-id"); } catch (error) { if (error instanceof TranscriptsDisabledError) { console.log("Transcripts are disabled for this video"); } else if (error instanceof NoTranscriptFoundError) { console.log("No transcript found in requested languages"); } else if (error instanceof RequestBlockedError) { console.log("Request was blocked by YouTube"); } } ``` ### Available Error Classes - `YouTubeTranscriptError` - Base error class - `TranscriptsDisabledError` - Transcripts disabled for video - `NoTranscriptFoundError` - No transcript in requested language - `VideoUnavailableError` - Video is unavailable - `InvalidVideoIdError` - Invalid video ID format - `RequestBlockedError` - Request blocked by YouTube - `YouTubeRequestFailedError` - HTTP request failed - `YouTubeDataUnparsableError` - Cannot parse YouTube data - `AgeRestrictedError` - Video is age-restricted - `VideoUnplayableError` - Video is unplayable - `TranscriptParseError` - Cannot parse transcript XML ## Extracting as a Separate Package This library is designed to be easily extracted and published as a standalone npm package: 1. **Copy the `yt-lib` folder** to a new repository 2. **Update package.json** with your package name and details 3. **Add a build script** if you want to compile TypeScript 4. **Publish to npm**: `npm publish` The library has zero external dependencies and uses only native Node.js APIs (fetch), making it lightweight and easy to maintain. ## How Transcript Retrieval Works This section explains the mechanism and flow of how transcripts are retrieved from YouTube, including URLs accessed and the underlying architecture. ### YouTube's Transcript Architecture YouTube stores video transcripts separately from the video itself. Transcripts are available in two forms: 1. **Manual Transcripts** - Uploaded by video creators 2. **Auto-Generated Transcripts** - Created by YouTube's speech recognition These transcripts are stored as **timed text tracks** (similar to subtitles) in XML format on YouTube's servers. ### The Retrieval Flow #### Step 1: Fetch the Video Page HTML **URL Accessed:** ``` https://www.youtube.com/watch?v={VIDEO_ID} ``` **What Happens:** - The library fetches the public YouTube video page (the same page you see in a browser) - This HTML contains embedded JavaScript with configuration data - No authentication is required for public videos **Why This Step:** YouTube embeds critical API configuration data in the page HTML, including: - The Innertube API key (YouTube's internal API) - Initial video metadata - Player configuration #### Step 2: Extract the Innertube API Key **What We Look For:** ```javascript "INNERTUBE_API_KEY": "AIzaSy..." ``` **What Happens:** - The library parses the HTML to find the `INNERTUBE_API_KEY` - This key is embedded in the page's JavaScript configuration - The key is public and changes periodically (YouTube rotates it) **Why This Matters:** The Innertube API is YouTube's internal API used by the web player. It provides access to: - Video metadata - Caption/transcript information - Playability status - Available quality levels #### Step 3: Call YouTube's Innertube API **URL Accessed:** ``` https://www.youtube.com/youtubei/v1/player?key={API_KEY} ``` **Request Payload:** ```json { "context": { "client": { "clientName": "ANDROID", "clientVersion": "20.10.38" } }, "videoId": "{VIDEO_ID}" } ``` **What Happens:** - The library makes a POST request to YouTube's Innertube API - We identify as an Android client (more reliable than web client) - YouTube returns comprehensive video metadata **Response Contains:** - `playabilityStatus` - Whether the video can be played - `captions.playerCaptionsTracklistRenderer` - **Caption track information** - `captionTracks[]` - Array of available transcripts - `translationLanguages[]` - Available translation options **Why Android Client:** The Android client identifier provides more consistent access to transcript data compared to web clients, which may have additional restrictions. #### Step 4: Parse Caption Track Metadata **What We Extract:** ```json { "captionTracks": [ { "baseUrl": "https://www.youtube.com/api/timedtext?...", "name": { "simpleText": "English" }, "languageCode": "en", "kind": "asr", // "asr" = auto-generated "isTranslatable": true } ] } ``` **Key Information:** - `baseUrl` - The URL to fetch the actual transcript XML - `languageCode` - ISO language code (e.g., 'en', 'es', 'ja') - `kind: "asr"` - Indicates auto-generated transcript - `isTranslatable` - Whether the transcript can be machine-translated **What Happens:** - The library catalogs all available transcripts - Identifies which are manual vs auto-generated - Builds a list of available languages #### Step 5: Fetch the Transcript XML **URL Accessed:** ``` https://www.youtube.com/api/timedtext? v={VIDEO_ID} &lang={LANGUAGE_CODE} &fmt=srv3 &name=... ``` **What Happens:** - The library requests the actual transcript data - YouTube returns timed text in XML format - This is the same data used by YouTube's subtitle/caption system **Response Format (XML):** ```xml <?xml version="1.0" encoding="utf-8" ?> <transcript> <text start="0" dur="2.5">Never gonna give you up</text> <text start="2.5" dur="2.8">Never gonna let you down</text> <text start="5.3" dur="3.1">Never gonna run around and desert you</text> </transcript> ``` **XML Structure:** - `<text>` - Individual transcript snippet - `start` - Timestamp in seconds when text appears - `dur` - Duration in seconds the text is displayed - Text content - The actual transcript text (may include HTML entities) #### Step 6: Parse the XML **What Happens:** - The library parses the XML using regex (avoiding XML parser dependencies) - Decodes HTML entities (`&` → `&`, `"` → `"`, etc.) - Removes HTML formatting tags (unless `preserveFormatting: true`) - Builds structured transcript objects **Output:** ```javascript { snippets: [ { text: "Never gonna give you up", start: 0, duration: 2.5 }, { text: "Never gonna let you down", start: 2.5, duration: 2.8 } ], videoId: "dQw4w9WgXcQ", language: "English", languageCode: "en", isGenerated: false } ``` ### Why This Approach Works **No API Key Required:** - Uses publicly accessible endpoints - The Innertube API key is embedded in the public web page - No OAuth or authentication needed for public videos **Mimics Normal Browser Behavior:** - Fetches the same data that the YouTube player uses - Uses standard HTTP requests - No web scraping of protected content **Reliable Access:** - Transcripts are served from YouTube's CDN - Same infrastructure that powers YouTube's subtitle system - High availability and performance ### When It Doesn't Work **1. Transcripts Disabled by Owner** - Video creator has disabled captions/subtitles - The Innertube API returns no `captionTracks` - Error: `TranscriptsDisabledError` **2. Age-Restricted Videos** - Require authentication to verify age - Cannot be accessed without logged-in session - Error: `AgeRestrictedError` **3. Private/Unlisted Videos with Restrictions** - May require authentication - May have disabled transcript access - Error: `VideoUnavailableError` **4. IP Blocking / Rate Limiting** - YouTube detects automated access - Triggers bot detection (reCAPTCHA) - Error: `RequestBlockedError` **5. Caption Metadata vs Actual Access** - Sometimes caption metadata exists but transcript URL returns 403 - This happens when owners disable programmatic access - The transcript appears available but fetching fails ### URLs Summary | Step | URL | Purpose | |------|-----|---------| | 1 | `https://www.youtube.com/watch?v={VIDEO_ID}` | Get video page HTML | | 2 | (HTML parsing) | Extract Innertube API key | | 3 | `https://www.youtube.com/youtubei/v1/player?key={KEY}` | Get video metadata & caption info | | 4 | (JSON parsing) | Extract transcript URLs | | 5 | `https://www.youtube.com/api/timedtext?v={ID}&lang={LANG}` | Get transcript XML | | 6 | (XML parsing) | Convert to structured data | ### Design Decisions - **No Dependencies**: Uses native fetch API and regex-based XML parsing - **JavaScript Native**: Pure JavaScript/ES modules, no compilation needed - **Error-First**: Comprehensive error handling with specific error types - **Modular**: Clean separation of concerns (parser, fetcher, types, errors) - **YouTube API Compatible**: Follows YouTube's Innertube API patterns - **Extensible**: Easy to add formatters, proxies, or other features ## Multi-Client Fallback System ### What It Does The library implements a **multi-client fallback mechanism** similar to yt-dlp. When fetching video metadata from YouTube's Innertube API, it tries multiple client types in order until one succeeds. ### Why It's Needed YouTube's Innertube API behaves differently depending on which client makes the request: - **ANDROID client**: Generally most reliable for transcripts, but may be blocked for some videos - **WEB client**: Provides translation languages, but sometimes restricted - **TV_EMBEDDED client**: Least restricted, works for age-gated content, but limited features Some videos work with one client but not others. By trying multiple clients automatically, we maximize success rate. ### How It Works ```javascript // When you create a fetcher const fetcher = new YouTubeTranscriptFetcher(); // Internally, it tries clients in this order: // 1. ANDROID (clientName: "ANDROID", clientVersion: "19.09.37") // 2. WEB (clientName: "WEB", clientVersion: "2.20250103.01.00") // 3. TV_EMBEDDED (clientName: "TVHTML5_SIMPLY_EMBEDDED_PLAYER") // If ANDROID fails → tries WEB // If WEB fails → tries TV_EMBEDDED // If all fail → throws the last error ``` ### Client Details **ANDROID Client:** ```javascript { context: { client: { clientName: "ANDROID", clientVersion: "19.09.37", androidSdkVersion: 30, }, }, } ``` - **Best for**: General transcript access - **Pros**: Most reliable for standard videos - **Cons**: May be blocked for restricted content **WEB Client:** ```javascript { context: { client: { clientName: "WEB", clientVersion: "2.20250103.01.00", }, }, } ``` - **Best for**: Translation languages - **Pros**: Provides full translation language list - **Cons**: Sometimes more restricted than Android **TV_EMBEDDED Client:** ```javascript { context: { client: { clientName: "TVHTML5_SIMPLY_EMBEDDED_PLAYER", clientVersion: "2.0", }, }, } ``` - **Best for**: Age-restricted or embedded videos - **Pros**: Least restrictions - **Cons**: May have limited metadata ### Debug Mode Enable debug logging to see which client succeeds: ```javascript const fetcher = new YouTubeTranscriptFetcher({ debug: true }); // You'll see logs like: // [yt-lib] Trying ANDROID client for video dQw4w9WgXcQ // [yt-lib] ANDROID response status: 200 OK // [yt-lib] ANDROID client succeeded ``` ### Transparent Error Reporting When a request fails, the library now provides **actual evidence** instead of speculation: ```javascript try { await fetcher.fetchTranscript(videoId); } catch (error) { // Error message includes: // - HTTP status code (e.g., 403, 404) // - Response body preview // - Actual URL that was accessed // - Which client failed console.error(error.message); // Example: // "HTTP 403: Forbidden // // EVIDENCE: Attempted to fetch transcript from: // https://www.youtube.com/api/timedtext?v=... // // Response preview: // <?xml version="1.0"?><error>Access denied</error> // // This is the actual error from YouTube's server, not speculation." } ``` ### What This Means for You **Without multi-client fallback:** ``` Video fails with ANDROID client → Error thrown → User sees failure ``` **With multi-client fallback:** ``` Video fails with ANDROID → Try WEB → WEB succeeds → User gets transcript ``` This significantly improves success rate, especially for: - Age-restricted videos - Region-restricted content - Videos with special access requirements - Videos that work better with specific clients ### Evidence-Based Error Handling The library follows a **"show, don't tell"** philosophy for errors: ❌ **Old approach** (speculation): ``` "Video owner has disabled programmatic transcript access" ``` ✅ **New approach** (evidence): ``` "HTTP 403: Forbidden EVIDENCE: GET https://www.youtube.com/api/timedtext?v=ABC... Response: <?xml version="1.0"?><error>Access denied</error> Tried clients: ANDROID (403), WEB (403), TV_EMBEDDED (403) All clients failed with same error." ``` This allows you to: 1. **See exactly what failed** (HTTP status, URL, response) 2. **Understand why it failed** (actual server response) 3. **Debug issues** (know which clients were tried) 4. **Report problems** (provide concrete evidence in bug reports) ## Contributing This library is part of the `youtube-transcript-mcp` project but is designed to be standalone. Contributions are welcome! ## License MIT

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/hancengiz/youtube-transcript-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

README.md•16.8 KiB