jkawamoto

YouTube Transcript MCP Server

get_timed_transcript

Extract timestamped transcripts from YouTube videos to analyze content, create subtitles, or study video material with precise time references.

Instructions

Retrieves the transcript of a YouTube video with timestamps.

Input Schema

Name         Required  Description                                         Default
url          Yes       The URL of the YouTube video
lang         No        The preferred language for the transcript           en
next_cursor  No        Cursor to retrieve the next page of the transcript
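
The `next_cursor` parameter suggests a paging loop on the caller's side: call the tool, collect `snippets`, and repeat with the returned `next_cursor` until none is returned. The following is a minimal sketch of that loop, not the server's or any client's actual API. `call_tool` is a hypothetical callable standing in for whatever MCP client is in use, and `fake_call_tool` is a stub that exists only to make the example runnable.

```python
from typing import Any, Callable, Optional


def fetch_all_snippets(
    call_tool: Callable[[str, dict[str, Any]], dict[str, Any]],
    url: str,
    lang: str = "en",
) -> tuple[str, list[dict[str, Any]]]:
    """Collect every page of get_timed_transcript into one list."""
    snippets: list[dict[str, Any]] = []
    cursor: Optional[str] = None
    while True:
        args: dict[str, Any] = {"url": url, "lang": lang}
        if cursor is not None:
            args["next_cursor"] = cursor
        page = call_tool("get_timed_transcript", args)
        snippets.extend(page["snippets"])
        cursor = page.get("next_cursor")
        if cursor is None:  # no further pages
            return page["title"], snippets


# Hypothetical stand-in for a real MCP client: serves two snippets per page.
_DATA = [{"text": f"line {i}", "start": float(i), "duration": 1.0} for i in range(5)]


def fake_call_tool(name: str, args: dict[str, Any]) -> dict[str, Any]:
    start = int(args.get("next_cursor") or 0)
    end = start + 2
    return {
        "title": "Demo",
        "snippets": _DATA[start:end],
        "next_cursor": str(end) if end < len(_DATA) else None,
    }


title, all_snippets = fetch_all_snippets(fake_call_tool, "https://youtu.be/VIDEO_ID")
```

Treating the cursor as an opaque string, as this loop does, keeps the caller independent of how the server encodes page positions.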

Output Schema

Name         Required  Description                                         Default
title        Yes       Title of the video
snippets     Yes       Transcript snippets of the video
next_cursor  No        Cursor to retrieve the next page of the transcript

Implementation Reference

  • Main handler function for 'get_timed_transcript' tool. Parses video URL, fetches transcript snippets using helper, applies pagination and response limit if set, returns TimedTranscript with title, snippets, and next_cursor.
    @mcp.tool()
    async def get_timed_transcript(
        ctx: Context[ServerSession, AppContext],
        url: str = Field(description="The URL of the YouTube video"),
        lang: str = Field(description="The preferred language for the transcript", default="en"),
        next_cursor: str | None = Field(description="Cursor to retrieve the next page of the transcript", default=None),
    ) -> TimedTranscript:
        """Retrieves the transcript of a YouTube video with timestamps."""
    
        title, snippets = _get_transcript_snippets(ctx.request_context.lifespan_context, _parse_video_id(url), lang)
    
        # NOTE: response_limit is defined outside this excerpt.
        if response_limit is None or response_limit <= 0:
            return TimedTranscript(
                title=title, snippets=[TranscriptSnippet.from_fetched_transcript_snippet(s) for s in snippets]
            )
    
        res = []
        size = len(title) + 1
        cursor = None
        for i, s in islice(enumerate(snippets), int(next_cursor or 0), None):
            snippet = TranscriptSnippet.from_fetched_transcript_snippet(s)
            if size + len(snippet) + 1 > response_limit:
                cursor = str(i)
                break
            res.append(snippet)
            size += len(snippet) + 1  # accumulate so the limit applies to the whole page
    
        return TimedTranscript(title=title, snippets=res, next_cursor=cursor)
  • Pydantic model defining the output schema for get_timed_transcript: title, list of timed snippets, and pagination cursor.
    class TimedTranscript(BaseModel):
        """Transcript of a YouTube video with timestamps."""
    
        title: str = Field(description="Title of the video")
        snippets: list[TranscriptSnippet] = Field(description="Transcript snippets of the video")
        next_cursor: str | None = Field(description="Cursor to retrieve the next page of the transcript", default=None)
  • Pydantic model for individual timed transcript snippet, used in TimedTranscript.snippets. Includes conversion from youtube_transcript_api snippet.
    class TranscriptSnippet(BaseModel):
        """Transcript snippet of a YouTube video."""
    
        text: str = Field(description="Text of the transcript snippet")
        start: float = Field(description="The timestamp at which this transcript snippet appears on screen in seconds.")
        duration: float = Field(description="The duration of the snippet in seconds.")
    
        def __len__(self) -> int:
            return len(self.model_dump_json())
    
        @classmethod
        def from_fetched_transcript_snippet(
            cls: type[TranscriptSnippet], snippet: FetchedTranscriptSnippet
        ) -> TranscriptSnippet:
            return cls(text=snippet.text, start=snippet.start, duration=snippet.duration)
  • Cached helper that fetches transcript snippets using YouTubeTranscriptApi, preferring the given language and falling back to English. It also scrapes the video title from the YouTube page.
    @lru_cache
    def _get_transcript_snippets(ctx: AppContext, video_id: str, lang: str) -> Tuple[str, list[FetchedTranscriptSnippet]]:
        if lang == "en":
            languages = ["en"]
        else:
            languages = [lang, "en"]
    
        page = ctx.http_client.get(
            f"https://www.youtube.com/watch?v={video_id}", headers={"Accept-Language": ",".join(languages)}
        )
        page.raise_for_status()
        soup = BeautifulSoup(page.text, "html.parser")
        title = soup.title.string if soup.title and soup.title.string else "Transcript"
    
        transcripts = ctx.ytt_api.fetch(video_id, languages=languages)
        return title, transcripts.snippets
  • Helper function to extract YouTube video ID from various URL formats (youtu.be or watch?v=).
    def _parse_video_id(url: str) -> str:
        parsed_url = urlparse(url)
        if parsed_url.hostname == "youtu.be":
            return parsed_url.path.lstrip("/")
        else:
            q = parse_qs(parsed_url.query).get("v")
            if q is None:
                raise ValueError(f"couldn't find a video ID from the provided URL: {url}.")
            return q[0]
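
To show which URL shapes the `_parse_video_id` helper above accepts, the sketch below repeats its logic with the imports it needs and runs it on a few inputs. `VIDEO_ID1` is a made-up placeholder, not a real video ID.

```python
from urllib.parse import parse_qs, urlparse


def _parse_video_id(url: str) -> str:
    parsed_url = urlparse(url)
    if parsed_url.hostname == "youtu.be":
        # Short links carry the ID in the path; the query string is ignored.
        return parsed_url.path.lstrip("/")
    q = parse_qs(parsed_url.query).get("v")
    if q is None:
        raise ValueError(f"couldn't find a video ID from the provided URL: {url}.")
    return q[0]


short = _parse_video_id("https://youtu.be/VIDEO_ID1?t=30")
long_ = _parse_video_id("https://www.youtube.com/watch?v=VIDEO_ID1&t=30s")

# Path-based forms have no "v" query parameter, so the helper rejects them.
try:
    _parse_video_id("https://www.youtube.com/shorts/VIDEO_ID1")
    shorts_supported = True
except ValueError:
    shorts_supported = False
```

As written, the helper covers only the two forms named in its description; a caller holding a path-based URL such as a `/shorts/` link would need to convert it to a `watch?v=` URL first.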
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden but only states what the tool does, not behavioral traits like pagination handling (implied by 'next_cursor'), rate limits, authentication needs, or error conditions. It mentions timestamps but doesn't detail format or structure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
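
The pagination behavior that this review finds undocumented can be read off the handler in the Implementation Reference: the cursor is the index of the first snippet that did not fit within a size budget, returned as a string. The following is a minimal sketch of that scheme on plain dicts. It assumes JSON-serialized length as the size measure and a running total that grows with each appended snippet; `paginate` and `limit` are hypothetical names, and how the server's own `response_limit` is configured is not shown on this page.

```python
import json
from itertools import islice
from typing import Optional


def paginate(title: str, snippets: list[dict], limit: int, next_cursor: Optional[str] = None):
    """Return (page, cursor); the page plus title stays within `limit` characters."""
    page: list[dict] = []
    size = len(title) + 1  # budget already spent on the title
    cursor = None
    start = int(next_cursor or 0)
    for i, s in islice(enumerate(snippets), start, None):
        cost = len(json.dumps(s)) + 1
        if size + cost > limit:
            cursor = str(i)  # index of the first snippet left out
            break
        page.append(s)
        size += cost  # assumed: the running total grows with each snippet
    return page, cursor


items = [{"text": "x" * 10, "start": float(i), "duration": 1.0} for i in range(4)]
cost = len(json.dumps(items[0])) + 1
limit = len("T") + 1 + 2 * cost  # room for exactly two snippets per page
first_page, cursor = paginate("T", items, limit)
second_page, last_cursor = paginate("T", items, limit, cursor)
```

In this sketch, a single snippet larger than the budget yields an empty page whose cursor does not advance, so a caller may want to stop when the returned cursor equals the one it sent.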

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose with zero wasted words, making it highly concise and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that the tool has an output schema and 100% schema coverage, the description is minimally adequate. However, for a retrieval tool that has no annotations and sits alongside sibling tools, it says little about how it differs from them or how it behaves, leaving gaps in completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents the parameters. The description adds nothing beyond implying 'timestamps' in the output, and it does not explain parameter interactions or usage nuances.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Retrieves') and resource ('transcript of a YouTube video with timestamps'), making the purpose evident. It distinguishes from 'get_transcript' by specifying 'with timestamps', but doesn't explicitly contrast with 'get_video_info'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus the sibling tools 'get_transcript' or 'get_video_info'. The description implies usage for timestamped transcripts but lacks explicit alternatives or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jkawamoto/mcp-youtube-transcript'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.