get_transcript

Extract YouTube video transcripts in your preferred language by providing the video URL. Supports pagination for lengthy transcripts to simplify text retrieval and processing.

Instructions

Retrieves the transcript of a YouTube video.

Input Schema

Name         Required  Description                                          Default
lang         No        The preferred language for the transcript            en
next_cursor  No        Cursor to retrieve the next page of the transcript
url          Yes       The URL of the YouTube video
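
The three inputs above suggest a simple client-side loop: call the tool, and while the response carries a next_cursor, call it again with that cursor. The following is a minimal sketch of that loop, not part of the server. The `call_tool` callable is a hypothetical stand-in for whatever MCP client call you use, and it is assumed to return a dict with the fields `title`, `transcript`, and `next_cursor`.

```python
from typing import Callable, Optional

# Hypothetical stand-in for an MCP client call: takes the tool arguments
# as a dict and returns the tool result as a dict (assumed shape).
ToolCall = Callable[[dict], dict]


def fetch_full_transcript(call_tool: ToolCall, url: str, lang: str = "en") -> str:
    """Request pages until the response no longer carries a cursor."""
    pages: list[str] = []
    cursor: Optional[str] = None
    while True:
        args = {"url": url, "lang": lang}
        if cursor is not None:
            args["next_cursor"] = cursor
        result = call_tool(args)
        pages.append(result["transcript"])
        new_cursor = result.get("next_cursor")
        if new_cursor is None:
            break
        if new_cursor == cursor:
            # Guard against a cursor that never advances.
            raise RuntimeError("cursor did not advance")
        cursor = new_cursor
    # Each page is a run of lines joined by "\n" with no trailing newline,
    # so joining pages with "\n" restores the full text.
    return "\n".join(pages)
```

Passing the cursor back unchanged is the only requirement; the client does not need to interpret its value.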

Implementation Reference

  • The main async handler for the 'get_transcript' tool, decorated with @mcp.tool(). It fetches the transcript and, when a response limit is set, paginates it.
    @mcp.tool()
    async def get_transcript(
        ctx: Context[ServerSession, AppContext],
        url: str = Field(description="The URL of the YouTube video"),
        lang: str = Field(description="The preferred language for the transcript", default="en"),
        next_cursor: str | None = Field(description="Cursor to retrieve the next page of the transcript", default=None),
    ) -> Transcript:
        """Retrieves the transcript of a YouTube video."""
        title, snippets = _get_transcript_snippets(ctx.request_context.lifespan_context, _parse_video_id(url), lang)
        transcripts = (item.text for item in snippets)

        # response_limit is not defined in this excerpt; it presumably comes
        # from the enclosing server function. No limit means no pagination.
        if response_limit is None or response_limit <= 0:
            return Transcript(title=title, transcript="\n".join(transcripts))

        # The cursor is the index of the first snippet of the next page.
        res = ""
        cursor = None
        for i, line in islice(enumerate(transcripts), int(next_cursor or 0), None):
            if len(res) + len(line) + 1 > response_limit:
                cursor = str(i)
                break
            res += f"{line}\n"

        return Transcript(title=title, transcript=res[:-1], next_cursor=cursor)
  • Pydantic model defining the output schema for the get_transcript tool response.
    class Transcript(BaseModel):
        """Transcript of a YouTube video."""

        title: str = Field(description="Title of the video")
        transcript: str = Field(description="Transcript of the video")
        next_cursor: str | None = Field(description="Cursor to retrieve the next page of the transcript", default=None)
  • Cached helper function that fetches transcript snippets and video title using YouTubeTranscriptApi.
    @lru_cache
    def _get_transcript_snippets(ctx: AppContext, video_id: str, lang: str) -> Tuple[str, list[FetchedTranscriptSnippet]]:
        if lang == "en":
            languages = ["en"]
        else:
            languages = [lang, "en"]

        page = ctx.http_client.get(
            f"https://www.youtube.com/watch?v={video_id}", headers={"Accept-Language": ",".join(languages)}
        )
        page.raise_for_status()
        soup = BeautifulSoup(page.text, "html.parser")
        title = soup.title.string if soup.title and soup.title.string else "Transcript"

        transcripts = ctx.ytt_api.fetch(video_id, languages=languages)

        return title, transcripts.snippets
  • Helper function to extract YouTube video ID from URL.
    def _parse_video_id(url: str) -> str:
        parsed_url = urlparse(url)
        if parsed_url.hostname == "youtu.be":
            return parsed_url.path.lstrip("/")
        else:
            q = parse_qs(parsed_url.query).get("v")
            if q is None:
                raise ValueError(f"couldn't find a video ID from the provided URL: {url}.")
            return q[0]
  • The server function initializes the FastMCP instance where tools like get_transcript are registered via decorators.
    mcp = FastMCP("Youtube Transcript", lifespan=partial(_app_lifespan, proxy_config=proxy_config))
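
To make the URL handling and the cursor arithmetic in the excerpts above easier to follow, here is a self-contained sketch that mirrors both without touching the network. `parse_video_id` follows the same rules as `_parse_video_id`; `paginate` is a hypothetical name for the loop inside `get_transcript`, and it assumes a positive `response_limit` (the handler returns the whole transcript when the limit is unset or not positive).

```python
from itertools import islice
from typing import Iterable, Optional, Tuple
from urllib.parse import parse_qs, urlparse


def parse_video_id(url: str) -> str:
    """youtu.be links use the path; other links use the `v` query parameter."""
    parsed = urlparse(url)
    if parsed.hostname == "youtu.be":
        return parsed.path.lstrip("/")
    q = parse_qs(parsed.query).get("v")
    if q is None:
        raise ValueError(f"couldn't find a video ID from the provided URL: {url}.")
    return q[0]


def paginate(
    lines: Iterable[str], response_limit: int, next_cursor: Optional[str] = None
) -> Tuple[str, Optional[str]]:
    """Return one page of text plus the cursor for the next page, if any."""
    res = ""
    cursor = None
    # Skip the lines already served; the cursor is a line index as a string.
    for i, line in islice(enumerate(lines), int(next_cursor or 0), None):
        # Each line costs its length plus one character for the newline.
        if len(res) + len(line) + 1 > response_limit:
            cursor = str(i)
            break
        res += f"{line}\n"
    return res[:-1], cursor  # drop the trailing newline


lines = ["hello", "world", "again"]
print(parse_video_id("https://www.youtube.com/watch?v=abc123&t=10s"))  # abc123
print(paginate(lines, 12))       # ('hello\nworld', '2')
print(paginate(lines, 12, "2"))  # ('again', None)
```

Two consequences follow from the logic as shown. URLs that carry the ID only in the path on a host other than youtu.be (for example a /shorts/ link) raise ValueError, because there is no `v` parameter. And a single line longer than the limit yields an empty page whose cursor points back at the same line, so a client looping on next_cursor may want to stop when the cursor does not advance.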

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jkawamoto/mcp-youtube-transcript'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.