Glama
jikime

YouTube Toolbox

get_video_transcript

Extract transcripts or captions from YouTube videos to access spoken content in text format for analysis, translation, or accessibility purposes.

Instructions

Get transcript/captions for a YouTube video

Input Schema

Name      Required  Description  Default
video_id  Yes       -            -
language  No        -            ko
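
As a rough illustration of the schema above, the following sketch builds an arguments payload for the tool on the client side. The helper name `build_arguments` and the constant `DEFAULT_LANGUAGE` are hypothetical and not part of the server; the sketch only assumes what the table states, namely that `video_id` is required and `language` defaults to `ko`.

```python
from typing import Dict, Optional

# Default taken from the schema table; the constant name itself is hypothetical.
DEFAULT_LANGUAGE = "ko"


def build_arguments(video_id: str, language: Optional[str] = None) -> Dict[str, str]:
    """Build a get_video_transcript arguments payload (illustrative helper only)."""
    if not video_id:
        # video_id is the only required field in the schema.
        raise ValueError("video_id is required")
    # Apply the schema default when the caller does not choose a language.
    return {"video_id": video_id, "language": language or DEFAULT_LANGUAGE}


# "VIDEO_ID" is a placeholder, not a real video.
arguments = build_arguments("VIDEO_ID", language="en")
```

An MCP client would pass a dictionary of this shape as the tool arguments when calling `get_video_transcript`.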

Implementation Reference

  • Primary MCP tool handler implementing the get_video_transcript tool. It fetches video metadata through the YouTube API, retrieves transcript segments via a helper, formats each segment with a timestamp, and returns structured data containing the metadata, the segment list, and a plain timestamped text version.
    @mcp.tool(
        name="get_video_transcript",
        description="Get transcript/captions for a YouTube video",
    )
    async def get_video_transcript(video_id: str, language: Optional[str] = 'ko') -> Dict[str, Any]:
        """
        Get transcript/captions for a YouTube video
        
        Args:
            video_id (str): YouTube video ID
            language (str, optional): Language code (e.g., 'en', 'ko', 'fr')
        
        Returns:
            Dict[str, Any]: Transcript data
        """
        try:
            # Get video details for metadata
            video_data = youtube_service.get_video_details(video_id)
            
            if not video_data.get('items'):
                return {'error': f"Video with ID {video_id} not found"}
                
            video = video_data['items'][0]
            
            # Get transcript
            try:
                transcript_data = youtube_service.get_video_transcript(video_id, language)
                
                # Format transcript with timestamps
                formatted_transcript = []
                for segment in transcript_data:
                    text = getattr(segment, 'text', '')
                    start = getattr(segment, 'start', 0)
                    duration = getattr(segment, 'duration', 0)
                    
                    formatted_transcript.append({
                        'text': text,
                        'start': start,
                        'duration': duration,
                        'timestamp': youtube_service.format_time(int(start * 1000))
                    })
                
                # Create metadata
                metadata = {
                    'videoId': video.get('id'),
                    'title': video.get('snippet', {}).get('title'),
                    'channelTitle': video.get('snippet', {}).get('channelTitle'),
                    'language': language or 'default',
                    'segmentCount': len(transcript_data)
                }
                
                # Create timestamped text version
                timestamped_text = "\n".join([
                    f"[{item['timestamp']}] {item['text']}" 
                    for item in formatted_transcript
                ])
                
                return {
                    'metadata': metadata,
                    'transcript': formatted_transcript,
                    'text': timestamped_text,
                    'channelId': video.get('snippet', {}).get('channelId')
                }
            except Exception as e:
                return {
                    'error': f"Could not retrieve transcript: {str(e)}",
                    'videoId': video_id,
                    'title': video.get('snippet', {}).get('title')
                }
                
        except Exception as e:
            logger.exception(f"Error in get_video_transcript: {e}")
            return {'error': str(e)}
  • Core helper method in YouTubeService that fetches raw transcript segments using youtube_transcript_api. Supports language-specific transcripts with fallbacks to generated and English transcripts.
    def get_video_transcript(self, video_id: str, language: Optional[str] = 'ko') -> List[Dict[str, Any]]:
        """
        Get transcript for a specific YouTube video
        """
        video_id = self.parse_url(video_id)
        
        try:
            if language:
                transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
                try:
                    transcript = transcript_list.find_transcript([language])
                    return transcript.fetch()
                except NoTranscriptFound:
                    # Fallback to generated transcript if available
                    try:
                        transcript = transcript_list.find_generated_transcript([language])
                        return transcript.fetch()
                    except Exception:
                        # Final fallback to an English transcript
                        transcript = transcript_list.find_transcript(['en'])
                        return transcript.fetch()
            else:
                return YouTubeTranscriptApi.get_transcript(video_id)
                
        except (TranscriptsDisabled, NoTranscriptFound) as e:
            logger.error(f"No transcript available for video {video_id}: {e}")
            return []
        except Exception as e:
            logger.error(f"Error getting transcript for video {video_id}: {e}")
            raise
  • server.py:684-687 (registration)
    The tool is listed in the available-youtube-tools resource, which confirms its registration and provides its description.
    {"name": "get_video_transcript", "description": "Get transcript/captions for a YouTube video"},
    {"name": "get_related_videos", "description": "Get videos related to a specific YouTube video"},
    {"name": "get_trending_videos", "description": "Get trending videos on YouTube by region"},
    {"name": "get_video_enhanced_transcript", "description": "Advanced transcript extraction tool with filtering, search, and multi-video capabilities. Provides rich transcript data for detailed analysis and processing. Features: 1) Extract transcripts from multiple videos; 2) Filter by time ranges; 3) Search within transcripts; 4) Segment transcripts; 5) Format output in different ways; 6) Include video metadata."}
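
The formatting step in the handler above can be sketched in isolation as follows. This is a minimal sketch under stated assumptions: `format_time` is a hypothetical stand-in for `youtube_service.format_time`, whose real implementation and output format are not shown on this page, and the `field` accessor is an addition that tolerates both dictionary segments and attribute-style segments, whereas the handler itself reads fields with `getattr` only.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List


def format_time(milliseconds: int) -> str:
    """Hypothetical stand-in for youtube_service.format_time (ms to MM:SS or H:MM:SS)."""
    total_seconds = milliseconds // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes:02d}:{seconds:02d}"


def field(segment: Any, name: str, default: Any) -> Any:
    """Read a field from a segment that is either a dict or an object with attributes."""
    if isinstance(segment, dict):
        return segment.get(name, default)
    return getattr(segment, name, default)


def format_segments(segments: Iterable[Any]) -> List[Dict[str, Any]]:
    """Mirror the handler's per-segment structure: text, start, duration, timestamp."""
    formatted = []
    for segment in segments:
        start = field(segment, "start", 0)
        formatted.append({
            "text": field(segment, "text", ""),
            "start": start,
            "duration": field(segment, "duration", 0),
            "timestamp": format_time(int(start * 1000)),
        })
    return formatted


def timestamped_text(formatted: List[Dict[str, Any]]) -> str:
    """Join segments into the plain '[timestamp] text' form returned under 'text'."""
    return "\n".join(f"[{item['timestamp']}] {item['text']}" for item in formatted)


# Made-up sample segments: one dict and one attribute-style object.
sample = [
    {"text": "hello", "start": 0.0, "duration": 1.5},
    SimpleNamespace(text="world", start=75.2, duration=2.0),
]
formatted_sample = format_segments(sample)
text_sample = timestamped_text(formatted_sample)
```

Accepting both segment shapes is a design choice of this sketch: depending on the installed version of youtube_transcript_api, fetched segments may be plain dictionaries or objects, and `getattr` on a dictionary silently falls back to the default value.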

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jikime/py-mcp-youtube-toolbox'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.