# YtMCP: Feasible Features Research & Architecture
## 1. Executive Summary
This document outlines the strictly feasible feature set for the YtMCP (YouTube Model Context Protocol) server. The analysis is based on a deep code inspection of the installed libraries (`yt-dlp`, `scrapetube`, `youtube-search-python`, `youtube-transcript-api`).
**Constraint**: All features must function **without** a rapid-API subscription or an official Google Cloud Console YouTube Data API key (which has strict quota limits). We rely on robust, open-source scraping libraries.
## 2. The "Green Light" Feature Set
The following 16 features are confirmed technically viable. They are categorized by their primary function.
### Category A: Core Discovery (Search)
*Powered primarily by `youtube-search-python` and `yt-dlp`.*
| Feature Name | Functionality | Underlying Library / Method |
| :--- | :--- | :--- |
| **`search_videos`** | Basic keyword search. | `youtubesearchpython.VideosSearch(query, limit=N)` |
| **`search_filtered`** | Search with filters (Duration, Date, Type). | `youtubesearchpython.CustomSearch` with constants like `VideoUploadDateFilter.thisWeek` or `VideoDurationFilter.long`. |
| **`get_trending_videos`** | Fetch current trending list. | `yt-dlp --flat-playlist https://www.youtube.com/feed/trending` (Fast & lightweight). |
| **`find_channels`** | Search specifically for creators. | `youtubesearchpython.ChannelsSearch` or `scrapetube.get_search(results_type='channel')`. |
| **`find_playlists`** | Search for organized playlists. | `youtubesearchpython.PlaylistsSearch`. |
### Category B: Video Intelligence (Deep Dive)
*Powered by `youtube-transcript-api` and `yt-dlp`.*
| Feature Name | Functionality | Underlying Library / Method |
| :--- | :--- | :--- |
| **`get_transcript`** | **CRITICAL**. Fetches time-synced subtitles. | `youtube_transcript_api.YouTubeTranscriptApi.get_transcript(video_id)`. Supports auto-generated & manual. |
| **`get_video_metadata`** | Views, Tags, Description, Likes, Publish Date. | `yt-dlp --dump-json --skip-download`. Provides the most comprehensive metadata object available. |
| **`get_video_chapters`** | Extracts "Key Moments" / timestamps. | `yt-dlp` metadata contains `chapters` array if available. |
| **`get_thumbnail`** | High-res thumbnail URLs. | `yt-dlp` or `youtubesearchpython` results object. |
| **`get_comments`** | **(Modified)**: Fetch top N comments. | `yt-dlp --get-comments --dump-json` (Warning: Slower than other tools, implement with strict limits). |
### Category C: Channel & Playlist Forensics
*Powered primarily by `scrapetube` (Best for lists).*
| Feature Name | Functionality | Underlying Library / Method |
| :--- | :--- | :--- |
| **`get_channel_videos`** | List all videos (Newest/Popular/Oldest). | `scrapetube.get_channel(channel_id, sort_by='newest')`. |
| **`get_channel_shorts`** | List Shorts specifically. | `scrapetube.get_channel(content_type='shorts')`. |
| **`get_channel_streams`** | List Live Streams (Past & Present). | `scrapetube.get_channel(content_type='streams')`. |
| **`get_playlist_items`** | Flatten a playlist into video IDs. | `scrapetube.get_playlist(playlist_id)`. |
| **`get_channel_about`** | Channel description & stats. | `youtubesearchpython.ChannelSearch(...).result()`. |
### Category D: Utilities
| Feature Name | Functionality | Underlying Library / Method |
| :--- | :--- | :--- |
| **`download_audio`** | Get playable audio URL or download mp3. | `yt-dlp -f bestaudio -g` (Generates stream URL) or download. |
---
## 4. Architecture Recommendation
The MCP server should expose these capabilities as **Tools**.
- **Global Rate Limiting**: Since we are scraping, we must implement a delay (e.g., 0.5s - 1s) between requests to avoid IP bans.
- **Failover Logic**: `youtube-search-python` and `scrapetube` sometimes compete on functionality. We will use `scrapetube` for *lists* (Channels/Playlists) because it is faster / lighter, and `youtube-search-python` for *search* and specific filtered queries.
- **Data Standardization**: All tools should return a simplified Markdown or JSON structure optimized for LLM context windows, stripping unnecessary raw HTML or massive debug blobs.