Skip to main content
Glama

get_video_info

Retrieve detailed metadata such as title and description from a YouTube video by passing its URL or video ID.

Instructions

Get metadata from a YouTube video.

Args: url: YouTube URL or video ID

Returns: Dict containing video metadata

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
urlYes

Output Schema

TableJSON Schema
NameRequiredDescriptionDefault

No arguments

Implementation Reference

  • The core handler function for the 'get_video_info' tool. It extracts a video ID from a YouTube URL using extract_video_id helper, fetches metadata via yt-dlp, and returns a dict with id, title, description, duration, channel, upload_date, view_count, and available subtitle languages.
    def get_video_info(url: str) -> dict[str, Any]:
        """Get metadata from a YouTube video.
    
        Args:
            url: YouTube URL or video ID
    
        Returns:
            Dict containing video metadata
        """
        video_id = extract_video_id(url)
        video_url = f"https://www.youtube.com/watch?v={video_id}"
    
        ydl_opts = {
            "quiet": True,
            "no_warnings": True,
            "skip_download": True,
            "writesubtitles": True,
            "writeautomaticsub": True,
        }
    
        try:
            with yt_dlp.YoutubeDL(ydl_opts) as ydl:
                info = ydl.extract_info(video_url, download=False)
    
                if info is None:
                    return {
                        "id": video_id,
                        "error": "Failed to extract video info",
                    }
    
                # Collect available subtitle languages
                subtitles = info.get("subtitles", {})
                automatic_captions = info.get("automatic_captions", {})
                available_languages = sorted(set(subtitles.keys()) | set(automatic_captions.keys()))
    
                return {
                    "id": video_id,
                    "title": info.get("title"),
                    "description": info.get("description"),
                    "duration": info.get("duration"),
                    "channel": info.get("channel") or info.get("uploader"),
                    "upload_date": _format_date(info.get("upload_date")),
                    "view_count": info.get("view_count"),
                    "available_languages": available_languages,
                }
    
        except Exception as e:
            return {
                "id": video_id,
                "error": str(e),
            }
  • Registration of the get_video_info function as an MCP tool using FastMCP's decorator pattern (mcp.tool()(get_video_info)).
    mcp.tool()(get_video_info)
  • Helper function used within get_video_info to format the upload_date from YYYYMMDD to YYYY-MM-DD format.
    def _format_date(date_str: str | None) -> str | None:
        """Format YYYYMMDD to YYYY-MM-DD."""
        if not date_str or len(date_str) != 8:
            return date_str
        return f"{date_str[:4]}-{date_str[4:6]}-{date_str[6:8]}"
  • Helper utility used by get_video_info to extract a YouTube video ID from various URL formats (standard, shortlinks, embed, shorts, live, etc.) or a bare video ID.
    def extract_video_id(url: str) -> str:
        """Extract video ID from various forms of YouTube URLs.
    
        Handles standard, shortened, embed, shorts, live URLs,
        as well as regional domains and mobile versions.
    
        Args:
            url: YouTube URL or video ID
    
        Returns:
            The extracted video ID
    
        Raises:
            ValueError: If the video ID cannot be extracted
        """
        if not url:
            raise ValueError("URL is required")
    
        # Handle case where input is just the 11-character video ID
        if re.match(r"^[a-zA-Z0-9_-]{11}$", url):
            return url
    
        # Prepend scheme if not present
        if not re.match(r"https?://", url):
            url = "https://" + url
    
        parsed = urlparse(url)
        hostname = parsed.hostname
    
        if not hostname:
            raise ValueError(f"Could not parse URL: {url}")
    
        # youtu.be shortlinks
        if hostname == "youtu.be":
            video_id = parsed.path.lstrip("/").split("/")[0].split("?")[0]
            if video_id:
                return video_id
            raise ValueError(f"Could not extract video ID from URL: {url}")
    
        # youtube.com variants
        youtube_pattern = r"^(?:www\.|m\.|music\.)?youtube\.com$"
        if not re.match(youtube_pattern, hostname):
            raise ValueError(f"Not a YouTube URL: {url}")
    
        # /watch?v=VIDEO_ID
        if parsed.path == "/watch":
            query_params = parse_qs(parsed.query)
            video_ids = query_params.get("v")
            if video_ids:
                return video_ids[0]
            raise ValueError(f"Could not extract video ID from URL: {url}")
    
        # /embed/VIDEO_ID, /shorts/VIDEO_ID, /live/VIDEO_ID, /v/VIDEO_ID
        path_parts = parsed.path.split("/")
        if len(path_parts) >= 3 and path_parts[1] in ("embed", "shorts", "live", "v"):
            video_id = path_parts[2].split("?")[0]
            if video_id:
                return video_id
    
        raise ValueError(f"Could not extract video ID from URL: {url}")
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided; description only mentions it returns a dict of metadata but does not disclose side effects (e.g., network call) or whether it is read-only. Minimal behavioral disclosure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Short, front-loaded description with clear Args and Returns sections. Every sentence provides necessary information without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite low complexity and presence of output schema, the description lacks any usage context or constraints (e.g., authentication, rate limits). Adequate but not fully comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description adds value by specifying the 'url' parameter can be a URL or video ID, beyond the schema's type-only definition.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'Get metadata from a YouTube video', specifying the verb and resource. Differentiates from sibling 'get_transcript' which serves a different purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like 'get_transcript'. Missing context on appropriate scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kzmshx/youtube-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server