Glama
jkawamoto

YouTube Transcript MCP Server

get_timed_transcript

Extract timestamped transcripts from YouTube videos to analyze content, create subtitles, or study video material with precise time references.

Instructions

Retrieves the transcript of a YouTube video with timestamps.

Input Schema

Name         Required  Description                                         Default
url          Yes       The URL of the YouTube video
lang         No        The preferred language for the transcript           en
next_cursor  No        Cursor to retrieve the next page of the transcript
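
As a rough illustration of the schema above, the arguments object for a call might be assembled as follows. This is only a sketch: `build_arguments` is a hypothetical helper that is not part of the server, and the example URL is a placeholder. It sends `next_cursor` only when a previous response supplied one.

```python
from __future__ import annotations


def build_arguments(url: str, lang: str = "en", next_cursor: str | None = None) -> dict[str, str]:
    """Assemble the arguments for a get_timed_transcript call (hypothetical helper)."""
    if not url:
        # url is the only required field in the input schema.
        raise ValueError("url is required")
    args = {"url": url, "lang": lang}
    if next_cursor is not None:
        # Only include the cursor when continuing from a previous page.
        args["next_cursor"] = next_cursor
    return args


if __name__ == "__main__":
    # First page: no cursor yet. The video ID here is a placeholder.
    print(build_arguments("https://www.youtube.com/watch?v=VIDEO_ID"))
    # Later page: pass back the cursor returned by the previous call.
    print(build_arguments("https://www.youtube.com/watch?v=VIDEO_ID", lang="ja", next_cursor="120"))
```

Leaving `next_cursor` out on the first call matches the schema's default, which has no cursor.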

Implementation Reference

  • Main handler function for 'get_timed_transcript' tool. Parses video URL, fetches transcript snippets using helper, applies pagination and response limit if set, returns TimedTranscript with title, snippets, and next_cursor.
    @mcp.tool()
    async def get_timed_transcript(
        ctx: Context[ServerSession, AppContext],
        url: str = Field(description="The URL of the YouTube video"),
        lang: str = Field(description="The preferred language for the transcript", default="en"),
        next_cursor: str | None = Field(description="Cursor to retrieve the next page of the transcript", default=None),
    ) -> TimedTranscript:
        """Retrieves the transcript of a YouTube video with timestamps."""
    
        title, snippets = _get_transcript_snippets(ctx.request_context.lifespan_context, _parse_video_id(url), lang)
    
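        # response_limit is defined outside this excerpt;
        # None or a non-positive value disables pagination.
        if response_limit is None or response_limit <= 0: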
            return TimedTranscript(
                title=title, snippets=[TranscriptSnippet.from_fetched_transcript_snippet(s) for s in snippets]
            )
    
        res = []
        size = len(title) + 1
        cursor = None
        for i, s in islice(enumerate(snippets), int(next_cursor or 0), None):
            snippet = TranscriptSnippet.from_fetched_transcript_snippet(s)
            if size + len(snippet) + 1 > response_limit:
                cursor = str(i)
                break
            res.append(snippet)
            # Track the running size so the limit applies to the whole page.
            size += len(snippet) + 1
    
        return TimedTranscript(title=title, snippets=res, next_cursor=cursor)
  • Pydantic model defining the output schema for get_timed_transcript: title, list of timed snippets, and pagination cursor.
    class TimedTranscript(BaseModel):
        """Transcript of a YouTube video with timestamps."""
    
        title: str = Field(description="Title of the video")
        snippets: list[TranscriptSnippet] = Field(description="Transcript snippets of the video")
        next_cursor: str | None = Field(description="Cursor to retrieve the next page of the transcript", default=None)
  • Pydantic model for individual timed transcript snippet, used in TimedTranscript.snippets. Includes conversion from youtube_transcript_api snippet.
    class TranscriptSnippet(BaseModel):
        """Transcript snippet of a YouTube video."""
    
        text: str = Field(description="Text of the transcript snippet")
        start: float = Field(description="The timestamp at which this transcript snippet appears on screen in seconds.")
        duration: float = Field(description="The duration of the snippet in seconds.")
    
        def __len__(self) -> int:
            return len(self.model_dump_json())
    
        @classmethod
        def from_fetched_transcript_snippet(
            cls: type[TranscriptSnippet], snippet: FetchedTranscriptSnippet
        ) -> TranscriptSnippet:
            return cls(text=snippet.text, start=snippet.start, duration=snippet.duration)
  • Cached helper that fetches transcript snippets using YouTubeTranscriptApi, preferring the given language and falling back to English. Also scrapes the video title from the YouTube page.
    @lru_cache
    def _get_transcript_snippets(ctx: AppContext, video_id: str, lang: str) -> Tuple[str, list[FetchedTranscriptSnippet]]:
        if lang == "en":
            languages = ["en"]
        else:
            languages = [lang, "en"]
    
        page = ctx.http_client.get(
            f"https://www.youtube.com/watch?v={video_id}", headers={"Accept-Language": ",".join(languages)}
        )
        page.raise_for_status()
        soup = BeautifulSoup(page.text, "html.parser")
        title = soup.title.string if soup.title and soup.title.string else "Transcript"
    
        transcripts = ctx.ytt_api.fetch(video_id, languages=languages)
        return title, transcripts.snippets
  • Helper function to extract YouTube video ID from various URL formats (youtu.be or watch?v=).
    def _parse_video_id(url: str) -> str:
        parsed_url = urlparse(url)
        if parsed_url.hostname == "youtu.be":
            return parsed_url.path.lstrip("/")
        else:
            q = parse_qs(parsed_url.query).get("v")
            if q is None:
                raise ValueError(f"couldn't find a video ID from the provided URL: {url}.")
            return q[0]
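
The cursor-based pagination in the main handler can be tried out in isolation with a sketch like the one below. It assumes simplified stand-ins: `Snippet`, `Page`, `get_page`, and `fetch_all` are hypothetical names, not part of the server, and `json.dumps` replaces Pydantic's `model_dump_json` for measuring size. The sketch shows how a caller might follow `next_cursor` until it comes back as `None`.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from itertools import islice


@dataclass
class Snippet:
    """Stand-in for TranscriptSnippet (hypothetical, for illustration only)."""
    text: str
    start: float
    duration: float

    def __len__(self) -> int:
        # Size of the serialized snippet, mirroring the model's __len__.
        return len(json.dumps({"text": self.text, "start": self.start, "duration": self.duration}))


@dataclass
class Page:
    """Stand-in for TimedTranscript (hypothetical)."""
    title: str
    snippets: list[Snippet]
    next_cursor: str | None = None


def get_page(title: str, snippets: list[Snippet], response_limit: int, next_cursor: str | None = None) -> Page:
    """Collect snippets from the cursor position until the size budget is used up."""
    res: list[Snippet] = []
    size = len(title) + 1
    cursor = None
    for i, s in islice(enumerate(snippets), int(next_cursor or 0), None):
        if size + len(s) + 1 > response_limit:
            cursor = str(i)  # index of the first snippet that did not fit
            break
        res.append(s)
        size += len(s) + 1
    return Page(title, res, cursor)


def fetch_all(title: str, snippets: list[Snippet], response_limit: int) -> list[Snippet]:
    """Client-side loop: request pages until no cursor is returned."""
    collected: list[Snippet] = []
    cursor = None
    while True:
        page = get_page(title, snippets, response_limit, cursor)
        collected.extend(page.snippets)
        # Stop on the last page, or if a page is empty because a single
        # snippet exceeds the limit (the cursor would never advance).
        if page.next_cursor is None or not page.snippets:
            break
        cursor = page.next_cursor
    return collected


if __name__ == "__main__":
    demo = [Snippet(f"line {i}", float(i), 1.0) for i in range(10)]
    first = get_page("Demo", demo, response_limit=200)
    print(len(first.snippets), first.next_cursor)
    print(len(fetch_all("Demo", demo, response_limit=200)))
```

In this scheme the cursor is the index of the first snippet left out of the page, so a client only needs to pass it back unchanged. The empty-page guard in `fetch_all` is a defensive choice of this sketch, not a behavior taken from the server.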

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jkawamoto/mcp-youtube-transcript'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.