get_bulk_transcripts
Extract transcripts from multiple YouTube videos in a single call, with a selectable language and output format, for batch processing.
Instructions
Extract transcripts from multiple YouTube videos
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| videoIds | Yes | Array of YouTube video IDs or URLs | |
| language | No | Language code (e.g., "en", "es", "fr") | en |
| outputFormat | No | Output format: `text`, `json`, or `srt` | json |
| includeMetadata | No | Include metadata in response | true |
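As an illustration of the schema above, the sketch below types the arguments and applies the documented defaults. The `BulkArgs` type and the `normalizeArgs` helper are hypothetical names introduced here, not part of the server's code, and the video IDs are placeholders.

```typescript
// Hypothetical type mirroring the input schema table (not taken from the server source).
type OutputFormat = "text" | "json" | "srt";

interface BulkArgs {
  videoIds: string[];
  language?: string;
  outputFormat?: OutputFormat;
  includeMetadata?: boolean;
}

// Apply the documented defaults and reject a missing or empty videoIds array.
function normalizeArgs(args: BulkArgs): Required<BulkArgs> {
  if (!Array.isArray(args.videoIds) || args.videoIds.length === 0) {
    throw new Error("videoIds array is required");
  }
  return {
    videoIds: args.videoIds,
    language: args.language ?? "en",
    outputFormat: args.outputFormat ?? "json",
    includeMetadata: args.includeMetadata ?? true,
  };
}

// Placeholder IDs and URLs; only videoIds is required.
const normalized = normalizeArgs({
  videoIds: ["abc123DEF45", "https://youtu.be/xyz_789-ABC"],
  language: "es",
});
console.log(JSON.stringify(normalized, null, 2));
```

Omitted optional fields fall back to `en`, `json`, and `true`, matching the Default column.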
Implementation Reference
- src/server/mcp-server.ts:256-278 (handler)
  The main handler function for the 'get_bulk_transcripts' tool in the MCP server. It validates input, constructs the request object, delegates to the transcript service, and returns a formatted MCP response.

  ```typescript
  private async handleGetBulkTranscripts(args: any) {
    const { videoIds, language = 'en', outputFormat = 'json', includeMetadata = true } = args;

    if (!videoIds || !Array.isArray(videoIds) || videoIds.length === 0) {
      throw new McpError(ErrorCode.InvalidParams, 'videoIds array is required');
    }

    const request = { videoIds, language, outputFormat, includeMetadata };
    const result = await this.transcriptService.getBulkTranscripts(request);

    return {
      content: [{
        type: 'text',
        text: JSON.stringify(result, null, 2)
      }]
    };
  }
  ```
- src/server/mcp-server.ts:117-147 (registration)
  Registration of the 'get_bulk_transcripts' tool in the server's list of available tools, defining its metadata and input schema for the MCP ListTools request.

  ```typescript
  {
    name: 'get_bulk_transcripts',
    description: 'Extract transcripts from multiple YouTube videos',
    inputSchema: {
      type: 'object',
      properties: {
        videoIds: {
          type: 'array',
          items: { type: 'string' },
          description: 'Array of YouTube video IDs or URLs'
        },
        language: {
          type: 'string',
          description: 'Language code (e.g., "en", "es", "fr")',
          default: 'en'
        },
        outputFormat: {
          type: 'string',
          enum: ['text', 'json', 'srt'],
          description: 'Output format',
          default: 'json'
        },
        includeMetadata: {
          type: 'boolean',
          description: 'Include metadata in response',
          default: true
        }
      },
      required: ['videoIds']
    }
  },
  ```
- src/types/index.ts:20-38 (schema)
  TypeScript type definitions for the BulkTranscriptRequest (input) and BulkTranscriptResponse (output) used by the tool implementation.

  ```typescript
  export interface BulkTranscriptRequest {
    videoIds: string[];
    outputFormat: 'text' | 'json' | 'srt';
    language?: string;
    includeMetadata?: boolean;
  }

  export interface BulkTranscriptResponse {
    results: TranscriptResponse[];
    errors: Array<{
      videoId: string;
      error: string;
    }>;
    summary: {
      total: number;
      successful: number;
      failed: number;
    };
  }
  ```
- Helper service method containing the core logic for fetching bulk transcripts. It invokes a Python script for extraction, processes the results, handles errors, and returns structured responses.

  ```typescript
  public async getBulkTranscripts(
    request: BulkTranscriptRequest
  ): Promise<BulkTranscriptResponse> {
    try {
      this.logger.info(`Processing bulk request for ${request.videoIds.length} videos`);

      // Call Python script for bulk processing
      const videoIds = request.videoIds.map(id => this.extractVideoId(id)).join(',');
      const command = `python3 "${this.pythonScript}" bulk --video-ids "${videoIds}" --language "${request.language || 'en'}"`;

      const { stdout, stderr } = await execAsync(command);

      if (stderr) {
        this.logger.warn(`Python script warning: ${stderr}`);
      }

      const pythonResult: PythonBulkResult = JSON.parse(stdout);

      if (!pythonResult.success) {
        throw new Error('Bulk processing failed');
      }

      // Convert results to our format
      const results: TranscriptResponse[] = [];

      for (const result of pythonResult.results) {
        const transcript: TranscriptItem[] = result.transcript.map(item => ({
          text: item.text,
          start: item.start,
          duration: item.duration
        }));

        results.push({
          videoId: result.videoId,
          title: await this.getVideoTitle(result.videoId),
          language: result.language,
          transcript,
          metadata: {
            extractedAt: new Date().toISOString(),
            source: 'youtube-transcript-api',
            duration: result.metadata?.duration || transcript.reduce((acc, item) => acc + item.duration, 0)
          }
        });
      }

      return {
        results,
        errors: pythonResult.errors,
        summary: pythonResult.summary
      };
    } catch (error) {
      this.logger.error(`Failed to process bulk request:`, error);
      return {
        results: [],
        errors: request.videoIds.map(videoId => ({
          videoId: this.extractVideoId(videoId),
          error: error instanceof Error ? error.message : 'Unknown error'
        })),
        summary: {
          total: request.videoIds.length,
          successful: 0,
          failed: request.videoIds.length
        }
      };
    }
  }
  ```
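The service method normalizes each entry with `extractVideoId`, whose body is not shown in this reference, and then interpolates the IDs into a shell command string. The sketch below illustrates both steps under stated assumptions and is not the project's implementation. `extractVideoIdSketch` is a hypothetical stand-in that assumes 11-character IDs and handles only a few common URL shapes. `buildBulkArgs` is a hypothetical helper that builds an argument array, which could be passed to Node's `execFile` so that no shell parses the user-supplied values.

```typescript
// Hypothetical stand-in for the service's extractVideoId (its real body is not shown).
// Assumption: a video ID is 11 characters drawn from [A-Za-z0-9_-].
function extractVideoIdSketch(input: string): string {
  const trimmed = input.trim();
  if (/^[A-Za-z0-9_-]{11}$/.test(trimmed)) {
    return trimmed; // already a bare ID
  }
  // Handles watch?v=, youtu.be/, /embed/ and /shorts/ URL shapes only.
  const match = trimmed.match(
    /(?:[?&]v=|youtu\.be\/|\/embed\/|\/shorts\/)([A-Za-z0-9_-]{11})/
  );
  const id = match?.[1];
  if (!id) {
    throw new Error(`Unrecognized video ID or URL: ${input}`);
  }
  return id;
}

// Hypothetical helper: build an argv array instead of a shell string.
// The subcommand and flag names are copied from the command in getBulkTranscripts.
function buildBulkArgs(
  scriptPath: string,
  inputs: string[],
  language: string = "en"
): string[] {
  const ids = inputs.map((entry) => extractVideoIdSketch(entry)).join(",");
  return [scriptPath, "bulk", "--video-ids", ids, "--language", language];
}

// Placeholder script path and video IDs, for illustration only.
const pythonArgs = buildBulkArgs("/path/to/script.py", [
  "https://www.youtube.com/watch?v=abc123DEF45&t=10s",
  "https://youtu.be/xyz_789-ABC",
]);
console.log(pythonArgs);
```

With an array like this, the call could take the form `execFile("python3", pythonArgs)` from `node:child_process`. Because `execFile` does not spawn a shell by default, quotes or metacharacters in an input cannot change the command that runs.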