get_next_chunk
Retrieve the next subtitle chunk for sequential translation processing after conversation detection, enabling chunk-by-chunk handling of large SRT files.
Instructions
📦 CHUNK RETRIEVAL FOR TRANSLATION WORKFLOW 📦
🎯 PURPOSE: Retrieves the next chunk from memory for sequential processing. Use this after detect_conversations with storeInMemory=true.
🔄 HOW IT WORKS:
- Automatically tracks which chunk to return next
- Returns actual chunk data with subtitle text content
- Advances to the next chunk automatically
- Returns null when all chunks have been processed
📥 PARAMETERS:
- sessionId: Session ID from the detect_conversations response
📤 RETURNS:
- chunk: Complete chunk data with subtitle text (or null if done)
- chunkIndex: Current chunk number (0-based)
- totalChunks: Total number of chunks available
- hasMore: Boolean indicating whether more chunks exist
- message: Status message
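Putting those fields together, the response payload can be modeled as below. The interface name is ours; the field set follows the handler implementation referenced under "Implementation Reference" (which also emits an extra `nextInstruction` hint not listed above).

```typescript
// Hypothetical interface name; fields mirror the tool's JSON response.
interface GetNextChunkResponse {
  success: boolean;
  chunk: unknown | null;    // subtitle chunk data, or null once exhausted
  chunkIndex: number;       // 0-based index of the chunk just returned
  totalChunks: number;
  hasMore: boolean;
  message: string;
  nextInstruction?: string; // present on successful chunk retrievals
}

// Example of the final response after every chunk has been consumed:
const done: GetNextChunkResponse = {
  success: true,
  chunk: null,
  chunkIndex: 3,
  totalChunks: 3,
  hasMore: false,
  message: 'All chunks have been processed',
};
```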
💡 USAGE PATTERN:
1. Call detect_conversations with storeInMemory=true
2. Get the sessionId from the response
3. Call get_next_chunk repeatedly until hasMore=false
4. Process each chunk for translation
5. Use translate_srt() on individual chunks
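The steps above can be sketched as a single loop. `callTool` stands in for whatever MCP client invocation your environment provides, and the `filePath` argument to detect_conversations and the argument shape passed to translate_srt are assumptions for illustration; only `storeInMemory`, `sessionId`, `chunk`, and `hasMore` come from this tool's documented contract.

```typescript
// Sketch of the usage pattern, assuming a generic `callTool` helper.
async function translateAllChunks(
  callTool: (name: string, args: object) => Promise<any>,
  filePath: string // hypothetical argument name for detect_conversations
): Promise<void> {
  // 1. Detect conversations and store the resulting chunks in memory
  const detect = await callTool('detect_conversations', {
    filePath,
    storeInMemory: true,
  });

  // 2. Keep the sessionId from the response
  const { sessionId } = detect;

  // 3-5. Pull chunks until hasMore=false, translating each one
  let hasMore = true;
  while (hasMore) {
    const res = await callTool('get_next_chunk', { sessionId });
    hasMore = res.hasMore;
    if (res.chunk !== null) {
      await callTool('translate_srt', { chunk: res.chunk }); // assumed arg shape
    }
  }
}
```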
📋 EXAMPLE: {"sessionId": "srt-session-123456789"}
⚠️ NOTE:
- Each call advances to the next chunk automatically
- Store the sessionId from the detect_conversations response
- Use this for chunk-by-chunk processing of large files
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| sessionId | Yes | Session ID from detect_conversations with storeInMemory=true | — |
Implementation Reference
- src/mcp/server.ts:618-667 (handler) — Implements the core logic of the get_next_chunk MCP tool: retrieves the next SRT chunk from session-specific memory storage, advances the chunk index, and returns structured chunk data, or a completion status once all chunks are processed.

```typescript
private async handleGetNextChunk(args: any) {
  const { sessionId } = args;

  if (!this.chunkMemory.has(sessionId)) {
    throw new Error(`Session ${sessionId} not found in memory`);
  }

  const chunks = this.chunkMemory.get(sessionId);
  const currentIndex = this.chunkIndex.get(sessionId) || 0;

  if (currentIndex >= chunks.length) {
    return {
      content: [
        {
          type: 'text',
          text: JSON.stringify({
            success: true,
            chunk: null,
            chunkIndex: currentIndex,
            totalChunks: chunks.length,
            hasMore: false,
            message: 'All chunks have been processed'
          }, null, 2),
        },
      ],
    };
  }

  const currentChunk = chunks[currentIndex];
  this.chunkIndex.set(sessionId, currentIndex + 1);

  return {
    content: [
      {
        type: 'text',
        text: JSON.stringify({
          success: true,
          chunk: currentChunk,
          chunkIndex: currentIndex,
          totalChunks: chunks.length,
          hasMore: currentIndex + 1 < chunks.length,
          message: `Retrieved chunk ${currentIndex + 1} of ${chunks.length}`,
          nextInstruction: currentIndex + 1 < chunks.length
            ? `Call get_next_chunk again to get chunk ${currentIndex + 2}`
            : 'All chunks have been retrieved'
        }, null, 2),
      },
    ],
  };
}
```
- src/mcp/server.ts:194-242 (registration) — Registers the get_next_chunk tool in the MCP server's tool list, including a detailed description of the sequential chunk-retrieval workflow and an input schema requiring a sessionId.

```typescript
{
  name: 'get_next_chunk',
  description: `📦 CHUNK RETRIEVAL FOR TRANSLATION WORKFLOW 📦

🎯 PURPOSE: Retrieves the next chunk from memory for sequential processing. Use this after detect_conversations with storeInMemory=true.

🔄 HOW IT WORKS:
- Automatically tracks which chunk to return next
- Returns actual chunk data with subtitle text content
- Advances to next chunk automatically
- Returns null when all chunks processed

📥 PARAMETERS:
- sessionId: Session ID from detect_conversations response

📤 RETURNS:
- chunk: Complete chunk data with subtitle text (or null if done)
- chunkIndex: Current chunk number (0-based)
- totalChunks: Total chunks available
- hasMore: Boolean indicating if more chunks exist
- message: Status message

💡 USAGE PATTERN:
1. Call detect_conversations with storeInMemory=true
2. Get sessionId from response
3. Call get_next_chunk repeatedly until hasMore=false
4. Process each chunk for translation
5. Use translate_srt() on individual chunks

📋 EXAMPLE: {"sessionId": "srt-session-123456789"}

⚠️ NOTE:
- Each call advances to the next chunk automatically
- Store sessionId from detect_conversations response
- Use this for chunk-by-chunk processing of large files`,
  inputSchema: {
    type: 'object',
    properties: {
      sessionId: {
        type: 'string',
        description: 'Session ID from detect_conversations with storeInMemory=true',
      },
    },
    required: ['sessionId'],
  },
},
```
- src/mcp/server.ts:232-241 (schema) — Defines the input schema for the get_next_chunk tool: a single required sessionId string parameter.

```typescript
inputSchema: {
  type: 'object',
  properties: {
    sessionId: {
      type: 'string',
      description: 'Session ID from detect_conversations with storeInMemory=true',
    },
  },
  required: ['sessionId'],
},
```
- src/mcp/server.ts:68-71 (helper) — Class properties used by the handler to store chunks and track the current index per session ID, enabling stateful sequential chunk retrieval.

```typescript
private translationService = TranslationServiceFactory.createService('chat-ai');
private chunkMemory = new Map<string, any>(); // Store chunks by session ID
private chunkIndex = new Map<string, number>(); // Track current chunk index per session
private todoManager = new SRTProcessingTodoManager('generic'); // Shared TODO manager
```
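The two Maps above implement a per-session cursor: one holds the chunk list, the other the next index to hand out. A minimal standalone sketch of that bookkeeping (class and method names are ours, not the server's):

```typescript
// Standalone sketch of the per-session cursor pattern used by the handler.
class ChunkStore {
  private chunkMemory = new Map<string, string[]>(); // chunks by session ID
  private chunkIndex = new Map<string, number>();    // next index per session

  store(sessionId: string, chunks: string[]): void {
    this.chunkMemory.set(sessionId, chunks);
    this.chunkIndex.set(sessionId, 0);
  }

  next(sessionId: string): { chunk: string | null; hasMore: boolean } {
    const chunks = this.chunkMemory.get(sessionId);
    if (!chunks) throw new Error(`Session ${sessionId} not found in memory`);
    const i = this.chunkIndex.get(sessionId) ?? 0;
    if (i >= chunks.length) return { chunk: null, hasMore: false };
    this.chunkIndex.set(sessionId, i + 1); // advance the cursor
    return { chunk: chunks[i], hasMore: i + 1 < chunks.length };
  }
}
```

As in the real handler, the final chunk is returned with hasMore=false, and a further call yields a null chunk.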
- Advanced conversation detection function that generates the SRT chunks (used by detect_conversations to populate the memory read by get_next_chunk). Configurable parameters control the chunking strategy for optimal AI processing.

```typescript
export function detectConversationsAdvanced(
  subtitles: SRTSubtitle[],
  options: {
    boundaryThreshold?: number;
    maxChunkSize?: number;
    minChunkSize?: number;
    enableSemanticAnalysis?: boolean;
    enableSpeakerDiarization?: boolean;
  } = {}
): SRTChunk[] {
  const {
    boundaryThreshold = 0.7,
    maxChunkSize = 20,
    minChunkSize = 2,
    enableSemanticAnalysis = true,
    enableSpeakerDiarization = true
  } = options;

  // First pass: Basic boundary detection with custom threshold
  const initialChunks = detectBasicBoundariesWithThreshold(subtitles, boundaryThreshold);
  let processedChunks = initialChunks;

  // Second pass: Semantic analysis (optional)
  if (enableSemanticAnalysis) {
    processedChunks = applySemanticAnalysis(processedChunks);
  }

  // Third pass: Speaker diarization (optional)
  if (enableSpeakerDiarization) {
    processedChunks = applySpeakerDiarization(processedChunks);
  }

  // Fourth pass: Size optimization
  processedChunks = optimizeChunkSizesWithLimits(processedChunks, maxChunkSize, minChunkSize);

  return processedChunks;
}
```