get_bulk_transcripts
Extract transcripts from multiple YouTube videos in a single request, with output as text, JSON, or SRT and optional metadata. Specify a preferred language code to select which transcript language is requested.
Instructions
Extract transcripts from multiple YouTube videos
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| includeMetadata | No | Include metadata in response | true |
| language | No | Language code (e.g., "en", "es", "fr") | en |
| outputFormat | No | Output format: "text", "json", or "srt" | json |
| videoIds | Yes | Array of YouTube video IDs or URLs | |
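As an illustration of the table above, the sketch below builds a `tools/call` request for this tool. The `GetBulkTranscriptsArgs` interface and the `buildToolCall` helper are hypothetical names introduced here, and the video ID and URL are example values only.

```typescript
// Hypothetical argument type mirroring the input schema table.
interface GetBulkTranscriptsArgs {
  videoIds: string[];                       // required: IDs or URLs
  language?: string;                        // default "en"
  outputFormat?: "text" | "json" | "srt";   // default "json"
  includeMetadata?: boolean;                // default true
}

// Hypothetical helper: wraps the arguments in a JSON-RPC 2.0
// "tools/call" request as used by the Model Context Protocol.
function buildToolCall(id: number, args: GetBulkTranscriptsArgs) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "get_bulk_transcripts", arguments: args },
  };
}

// Example values only; omitted optional fields fall back to server defaults.
const call = buildToolCall(1, {
  videoIds: ["dQw4w9WgXcQ", "https://www.youtube.com/watch?v=9bZkp7q19f0"],
  language: "en",
  outputFormat: "json",
});

console.log(JSON.stringify(call, null, 2));
```

Only `videoIds` is required, so a minimal call can omit the other three fields.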
Implementation Reference
- Core handler function that implements getBulkTranscripts: it executes a Python bulk fetch script, processes the results, handles errors, and returns a structured BulkTranscriptResponse.

      public async getBulkTranscripts(
        request: BulkTranscriptRequest
      ): Promise<BulkTranscriptResponse> {
        try {
          this.logger.info(`Processing bulk request for ${request.videoIds.length} videos`);

          // Call Python script for bulk processing
          const videoIds = request.videoIds.map(id => this.extractVideoId(id)).join(',');
          const command = `python3 "${this.pythonScript}" bulk --video-ids "${videoIds}" --language "${request.language || 'en'}"`;

          const { stdout, stderr } = await execAsync(command);

          if (stderr) {
            this.logger.warn(`Python script warning: ${stderr}`);
          }

          const pythonResult: PythonBulkResult = JSON.parse(stdout);

          if (!pythonResult.success) {
            throw new Error('Bulk processing failed');
          }

          // Convert results to our format
          const results: TranscriptResponse[] = [];
          for (const result of pythonResult.results) {
            const transcript: TranscriptItem[] = result.transcript.map(item => ({
              text: item.text,
              start: item.start,
              duration: item.duration
            }));

            results.push({
              videoId: result.videoId,
              title: await this.getVideoTitle(result.videoId),
              language: result.language,
              transcript,
              metadata: {
                extractedAt: new Date().toISOString(),
                source: 'youtube-transcript-api',
                duration: result.metadata?.duration || transcript.reduce((acc, item) => acc + item.duration, 0)
              }
            });
          }

          return {
            results,
            errors: pythonResult.errors,
            summary: pythonResult.summary
          };
        } catch (error) {
          this.logger.error(`Failed to process bulk request:`, error);
          return {
            results: [],
            errors: request.videoIds.map(videoId => ({
              videoId: this.extractVideoId(videoId),
              error: error instanceof Error ? error.message : 'Unknown error'
            })),
            summary: {
              total: request.videoIds.length,
              successful: 0,
              failed: request.videoIds.length
            }
          };
        }
      }
- src/server/mcp-server.ts:117-147 (registration) Tool registration in getAvailableTools(): defines name 'get_bulk_transcripts', description, and inputSchema for MCP tool listing.

      {
        name: 'get_bulk_transcripts',
        description: 'Extract transcripts from multiple YouTube videos',
        inputSchema: {
          type: 'object',
          properties: {
            videoIds: {
              type: 'array',
              items: { type: 'string' },
              description: 'Array of YouTube video IDs or URLs'
            },
            language: {
              type: 'string',
              description: 'Language code (e.g., "en", "es", "fr")',
              default: 'en'
            },
            outputFormat: {
              type: 'string',
              enum: ['text', 'json', 'srt'],
              description: 'Output format',
              default: 'json'
            },
            includeMetadata: {
              type: 'boolean',
              description: 'Include metadata in response',
              default: true
            }
          },
          required: ['videoIds']
        }
      },
- src/server/mcp-server.ts:256-278 (handler) MCP server tool handler: validates arguments, creates the request object, calls transcriptService.getBulkTranscripts, and formats the response as MCP content.

      private async handleGetBulkTranscripts(args: any) {
        const { videoIds, language = 'en', outputFormat = 'json', includeMetadata = true } = args;

        if (!videoIds || !Array.isArray(videoIds) || videoIds.length === 0) {
          throw new McpError(ErrorCode.InvalidParams, 'videoIds array is required');
        }

        const request = { videoIds, language, outputFormat, includeMetadata };
        const result = await this.transcriptService.getBulkTranscripts(request);

        return {
          content: [{
            type: 'text',
            text: JSON.stringify(result, null, 2)
          }]
        };
      }
- src/server/mcp-server.ts:120-146 (schema) Input schema definition for the get_bulk_transcripts tool: specifies the videoIds array, language, outputFormat, and includeMetadata.

      inputSchema: {
        type: 'object',
        properties: {
          videoIds: {
            type: 'array',
            items: { type: 'string' },
            description: 'Array of YouTube video IDs or URLs'
          },
          language: {
            type: 'string',
            description: 'Language code (e.g., "en", "es", "fr")',
            default: 'en'
          },
          outputFormat: {
            type: 'string',
            enum: ['text', 'json', 'srt'],
            description: 'Output format',
            default: 'json'
          },
          includeMetadata: {
            type: 'boolean',
            description: 'Include metadata in response',
            default: true
          }
        },
        required: ['videoIds']
      }
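The core handler normalizes each entry with `extractVideoId` and passes a comma-separated list to the Python script, but the body of `extractVideoId` is not shown in the reference. The sketch below is one possible stand-in, assuming standard 11-character YouTube IDs and a few common URL shapes. Both `extractVideoId` as written here and the `buildBulkArgs` helper are hypothetical; only the `bulk`, `--video-ids`, and `--language` tokens come from the excerpt.

```typescript
// Hypothetical stand-in for the service's extractVideoId (its real body
// is not shown in the reference). Assumes 11-character video IDs.
function extractVideoId(input: string): string {
  const trimmed = input.trim();
  // Already a bare ID.
  if (/^[A-Za-z0-9_-]{11}$/.test(trimmed)) {
    return trimmed;
  }
  // watch?v=ID, youtu.be/ID, /embed/ID, /shorts/ID
  const match = trimmed.match(
    /(?:[?&]v=|youtu\.be\/|\/embed\/|\/shorts\/)([A-Za-z0-9_-]{11})/
  );
  // Fall back to the input unchanged if nothing recognizable is found.
  return match ? match[1] : trimmed;
}

// Hypothetical helper: builds the script arguments as an array instead of
// one interpolated shell string, so no shell quoting is involved.
function buildBulkArgs(
  scriptPath: string,
  videoIds: string[],
  language: string = "en"
): string[] {
  return [
    scriptPath,
    "bulk",
    "--video-ids",
    videoIds.map(extractVideoId).join(","),
    "--language",
    language,
  ];
}

const exampleArgs = buildBulkArgs("/path/to/script.py", [
  "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "https://youtu.be/9bZkp7q19f0?t=5",
]);
console.log(exampleArgs.join(" "));
```

An array like this could be passed to `execFile` from Node's `child_process` module, which does not go through a shell by default, rather than to a string-based `exec` call.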
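The schema accepts `srt` as an `outputFormat`, while the excerpts above serialize the result with `JSON.stringify` and do not show how the other formats are produced. As a rough sketch of what an SRT conversion over the `TranscriptItem` shape could look like, assuming `start` and `duration` are in seconds (an assumption, not confirmed by the reference), with `toSrtTimestamp` and `toSrt` as hypothetical helper names:

```typescript
// Shape taken from the fields used in the core handler excerpt.
interface TranscriptItem {
  text: string;
  start: number;     // assumed to be seconds
  duration: number;  // assumed to be seconds
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) {
    s = "0" + s;
  }
  return s;
}

// Hypothetical helper: seconds -> "HH:MM:SS,mmm" as used by SRT.
function toSrtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + "," + pad(ms, 3);
}

// Hypothetical helper: numbered cues separated by blank lines.
function toSrt(items: TranscriptItem[]): string {
  return items
    .map((item, i) => {
      const from = toSrtTimestamp(item.start);
      const to = toSrtTimestamp(item.start + item.duration);
      return String(i + 1) + "\n" + from + " --> " + to + "\n" + item.text + "\n";
    })
    .join("\n");
}

console.log(toSrt([
  { text: "Hello", start: 0, duration: 1.5 },
  { text: "World", start: 1.5, duration: 2 },
]));
```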
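The handler's checks and the schema's defaults can also be expressed as one standalone function. This is a minimal sketch, not the server's code: `normalizeArgs` and `BulkArgs` are hypothetical names, and the enum check on `outputFormat` is stricter than the handler excerpt, which only validates `videoIds`.

```typescript
type OutputFormat = "text" | "json" | "srt";

// Hypothetical fully-defaulted argument shape.
interface BulkArgs {
  videoIds: string[];
  language: string;
  outputFormat: OutputFormat;
  includeMetadata: boolean;
}

const OUTPUT_FORMATS: OutputFormat[] = ["text", "json", "srt"];

// Hypothetical helper: validates raw tool arguments and applies the
// defaults declared in the input schema (en, json, true).
function normalizeArgs(raw: unknown): BulkArgs {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("arguments must be an object");
  }
  const r = raw as Record<string, unknown>;

  const ids = r["videoIds"];
  if (!Array.isArray(ids) || ids.length === 0) {
    throw new Error("videoIds array is required");
  }
  const videoIds: string[] = [];
  for (const id of ids) {
    if (typeof id !== "string") {
      throw new Error("videoIds must contain only strings");
    }
    videoIds.push(id);
  }

  const language = r["language"] === undefined ? "en" : r["language"];
  if (typeof language !== "string") {
    throw new Error("language must be a string");
  }

  const format = r["outputFormat"] === undefined ? "json" : r["outputFormat"];
  if (typeof format !== "string" || OUTPUT_FORMATS.indexOf(format as OutputFormat) === -1) {
    throw new Error("outputFormat must be one of text, json, srt");
  }

  const meta = r["includeMetadata"] === undefined ? true : r["includeMetadata"];
  if (typeof meta !== "boolean") {
    throw new Error("includeMetadata must be a boolean");
  }

  return {
    videoIds,
    language,
    outputFormat: format as OutputFormat,
    includeMetadata: meta,
  };
}

console.log(normalizeArgs({ videoIds: ["dQw4w9WgXcQ"] }));
```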