create_podcast
Generate podcast episodes from text or URLs with 1-2 speakers in quick, deep, or debate formats, automatically creating both content and audio.
Instructions
Create a podcast episode with full generation (text + audio). Supports one speaker (solo) or two speakers (dialogue); speakers can be given by name or ID. Choose from 3 generation modes: quick (3-5 min podcast), deep (8-15 min podcast), or debate (5-10 min podcast). Accepts text or URL sources. The tool polls automatically until generation completes, which may take several minutes; the handler checks status every 5 seconds, up to 60 times (about 5 minutes).
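A minimal sketch of the request an MCP client might send to invoke this tool. The `tools/call` method with `name` and `arguments` params follows the MCP JSON-RPC convention; the helper `buildCreatePodcastCall`, the speaker names, and the URL are hypothetical placeholders (real speaker names come from the get_speakers tool).

```typescript
// Argument shape taken from the tool's input schema.
type PodcastSource = {type: 'text' | 'url'; content: string};

type CreatePodcastArgs = {
  query?: string;
  sources?: PodcastSource[];
  speakers: string[]; // 1-2 names or IDs
  language?: string;
  mode?: 'quick' | 'deep' | 'debate';
};

// Hypothetical helper: wraps the arguments in an MCP `tools/call` JSON-RPC request.
function buildCreatePodcastCall(id: number, args: CreatePodcastArgs) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: {name: 'create_podcast', arguments: args},
  };
}

// Solo episode from a topic; mode is omitted, so the schema default ("quick") applies.
const solo = buildCreatePodcastCall(1, {
  query: 'What is the Model Context Protocol?',
  speakers: ['Alice'], // placeholder name
});

// Two-speaker debate built from a URL source.
const debate = buildCreatePodcastCall(2, {
  sources: [{type: 'url', content: 'https://example.com/article'}],
  speakers: ['Alice', 'Bob'], // placeholder names
  language: 'en',
  mode: 'debate',
});

console.log(JSON.stringify(solo, null, 2));
console.log(JSON.stringify(debate, null, 2));
```

Most MCP clients build this envelope themselves; the part a caller normally supplies is the `arguments` object.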
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| query | No | The content or topic for the podcast (optional if sources provided) | |
| sources | No | Additional sources (text or URLs) | |
| speakers | Yes | 1-2 speaker names or IDs. Use speaker names from get_speakers tool output (the "name" field, not speakerId). Full speaker IDs also supported. Names will be automatically resolved to IDs. | |
| language | No | Language code (e.g., "zh" for Chinese, "en" for English). Should match the selected speaker's language. If not specified, the first speaker's language is used; the handler falls back to "zh" when that is unavailable. | |
| mode | No | Generation mode (time indicates podcast audio length): "quick" (3-5 min audio): Fast-paced content for simple topics, news summaries, brief introductions. Supports 1-2 speakers. "deep" (8-15 min audio): Comprehensive analysis for detailed topics, in-depth explanations, thorough coverage. Supports 1-2 speakers. "debate" (5-10 min audio): Conversational discussion, interviews, dialogues, debates, Q&A sessions. Supports 1-2 speakers. Default: quick | quick |
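The defaults in the table can be sketched in plain TypeScript. This is an illustration only, not the project's code: `normalizeArgs` and `resolveLanguage` are hypothetical helpers, and the final "zh" fallback is taken from the handler code in the Implementation Reference.

```typescript
type Mode = 'quick' | 'deep' | 'debate';

type RawArgs = {
  query?: string;
  speakers: string[];
  language?: string;
  mode?: Mode;
};

// Hypothetical helper mirroring the table: 1-2 speakers required, mode defaults to "quick".
function normalizeArgs(raw: RawArgs): RawArgs & {mode: Mode} {
  if (raw.speakers.length < 1 || raw.speakers.length > 2) {
    throw new Error(`Expected 1-2 speakers, got ${raw.speakers.length}`);
  }

  return {...raw, mode: raw.mode ?? 'quick'};
}

// Language fallback chain: explicit argument, then the first speaker's language, then "zh".
function resolveLanguage(
  explicit: string | undefined,
  firstSpeakerLanguage: string | undefined,
): string {
  return explicit ?? firstSpeakerLanguage ?? 'zh';
}

const args = normalizeArgs({query: 'Weekly tech news', speakers: ['Alice']}); // placeholder name
console.log(args.mode); // "quick"
console.log(resolveLanguage(args.language, 'en')); // "en"
```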
Implementation Reference
- source/tools/podcast.ts:84-209 (handler) Core handler function for the 'create_podcast' tool. Resolves speakers, builds a CreatePodcastRequest, submits it to the API via the client, polls for completion using pollUntilComplete, and formats the episode info with formatPodcastEpisode.

  ```typescript
  async execute(args, {log}: {log: any}) {
    try {
      // Resolve speaker names/IDs to actual speaker IDs
      log.info('Resolving speaker identifiers', {
        input: args.speakers,
        language: args.language,
      });

      const resolvedSpeakers = await client.resolveSpeakers(args.speakers);
      const speakers = resolvedSpeakers.map((r) => ({
        speakerId: r.speakerId,
      }));

      const validationError = validateSpeakers(speakers, 1, 2);
      if (validationError) {
        return `Validation error: ${validationError}`;
      }

      // Use provided language or infer from resolved speaker
      const allSpeakers = await client.getCachedSpeakers();
      const resolvedSpeaker = allSpeakers.find(
        (s) => s.speakerId === resolvedSpeakers[0]?.speakerId,
      );
      const language = args.language ?? resolvedSpeaker?.language ?? 'zh';

      log.info('Creating podcast episode', {
        query: args.query?.slice(0, 50),
        speakerCount: speakers.length,
        sourcesCount: args.sources?.length ?? 0,
        language,
        mode: args.mode,
      });

      const requestData: CreatePodcastRequest = {
        speakers,
        mode: args.mode,
        language,
      };

      if (args.query) {
        requestData.query = args.query;
      }

      if (args.sources && args.sources.length > 0) {
        requestData.sources = args.sources as PodcastSource[];
      }

      const submitResponse = await client.podcast.createPodcast(requestData);

      if (submitResponse.code !== 0) {
        return `Failed to submit task: ${submitResponse.message ?? 'Unknown error'}`;
      }

      const episodeId = submitResponse.data?.episodeId;
      if (!episodeId) {
        return 'Failed to submit task: No episodeId returned';
      }

      log.info(`Podcast task submitted successfully`, {episodeId});

      const result = await pollUntilComplete(
        async () => {
          const statusResponse = await client.podcast.getEpisodeStatus(episodeId);
          if (statusResponse.code !== 0) {
            throw new Error(
              statusResponse.message ?? 'Failed to query status',
            );
          }

          if (!statusResponse.data) {
            throw new Error('No episode data returned');
          }

          return statusResponse.data;
        },
        {
          pollInterval: 5000,
          maxRetries: 60,
          onProgress(status, retry) {
            log.debug(`Podcast generation status: ${status}`, {
              episodeId,
              retry: `${retry}/60`,
            });
          },
        },
      );

      if (!result.success) {
        if (result.error) {
          log.error('Podcast generation failed', {
            episodeId,
            error: result.error,
          });
          return `Podcast generation failed: ${result.error}`;
        }

        log.warn('Podcast generation timeout', {
          episodeId,
          lastStatus: result.lastStatus,
        });
        return `Task is still processing after 5 minutes.\n\nEpisode ID: ${episodeId}\nLast Status: ${result.lastStatus}\n\nThe task is running in the background. Use get_podcast_status tool with this Episode ID to check progress.`;
      }

      const episode = result.data!;
      log.info('Podcast generation completed', {episodeId});

      const hasOutline = Boolean(episode.outline);
      const hasScripts = Boolean(
        episode.scripts && episode.scripts.length > 0,
      );

      return `Podcast Generation Completed
  Content Included:
  - Basic Info: Episode ID, Title, Speakers, Language, Status
  - Audio Files: ${episode.audioUrl ? 'Yes' : 'No'}
  - Outline: ${hasOutline ? 'Yes (see below)' : 'No'}
  - Scripts: ${hasScripts ? 'Yes (see below)' : 'No'}
  ${formatPodcastEpisode(episode)}`;
    } catch (error) {
      const errorMessage = formatError(error);
      log.error('Failed to create podcast', {error: errorMessage});
      return `Failed to create podcast: ${errorMessage}`;
    }
  },
  ```
- source/tools/podcast.ts:37-77 (schema) Zod input schema for the create_podcast tool, defining the parameters query, sources, speakers (1-2), language, and mode (quick/deep/debate).

  ```typescript
  parameters: z.object({
    query: z
      .string()
      .optional()
      .describe(
        'The content or topic for the podcast (optional if sources provided)',
      ),
    sources: z
      .array(
        z.object({
          type: z.enum(['text', 'url']).describe('Source type: text or url'),
          content: z
            .string()
            .describe('Source content (text content or URL)'),
        }),
      )
      .optional()
      .describe('Additional sources (text or URLs)'),
    speakers: z
      .array(z.string())
      .min(1)
      .max(2)
      .describe(
        '1-2 speaker names or IDs. Use speaker names from get_speakers tool output (the "name" field, not speakerId). Full speaker IDs also supported. Names will be automatically resolved to IDs.',
      ),
    language: z
      .string()
      .optional()
      .describe(
        'Language code (e.g., "zh" for Chinese, "en" for English). Should match the selected speaker\'s language. If not specified, will use the first speaker\'s language.',
      ),
    mode: z
      .enum(['quick', 'deep', 'debate'])
      .default('quick')
      .describe(
        'Generation mode (time indicates podcast audio length): ' +
          '"quick" (3-5 min audio): Fast-paced content for simple topics, news summaries, brief introductions. Supports 1-2 speakers. ' +
          '"deep" (8-15 min audio): Comprehensive analysis for detailed topics, in-depth explanations, thorough coverage. Supports 1-2 speakers. ' +
          '"debate" (5-10 min audio): Conversational discussion, interviews, dialogues, debates, Q&A sessions. Supports 1-2 speakers. Default: quick',
      ),
  }),
  ```
- source/tools/podcast.ts:33-210 (registration) MCP tool registration via the FastMCP server.addTool call within the registerPodcastTools function. The parameters and execute bodies repeat the (schema) and (handler) entries above verbatim and are elided here.

  ```typescript
  server.addTool({
    name: 'create_podcast',
    description:
      'Create a podcast episode with full generation (text + audio). Supports single-speaker (solo) or dual-speaker (dialogue) formats with 1-2 speakers (can use speaker names or IDs). Choose from 3 generation modes: quick (3-5 min podcast), deep (8-15 min podcast), or debate (5-10 min podcast). Accepts text or URL sources. This tool will automatically poll until generation is complete (may take several minutes).',
    parameters: z.object({
      // ... identical to the (schema) entry above (podcast.ts:37-77):
      // query, sources, speakers, language, mode
    }),
    annotations: {
      title: 'Create Podcast',
      openWorldHint: true,
      readOnlyHint: false,
    },
    // eslint-disable-next-line complexity
    async execute(args, {log}: {log: any}) {
      // ... identical to the (handler) entry above (podcast.ts:84-209)
    },
  });
  ```
- source/types/podcast.ts:29-35 (schema) TypeScript type CreatePodcastRequest, used for the API payload in the createPodcast call.

  ```typescript
  export type CreatePodcastRequest = {
    query?: string;
    sources?: PodcastSource[];
    speakers: Array<{speakerId: string}>;
    language?: string;
    mode?: 'quick' | 'deep' | 'debate';
  };
  ```
- source/tools/utils.ts:38-94 (helper) Utility function pollUntilComplete, used in the handler to wait for podcast generation to finish.

  ```typescript
  export async function pollUntilComplete<T extends {processStatus: string}>(
    fetchStatus: () => Promise<T>,
    options: {
      pollInterval?: number;
      maxRetries?: number;
      onProgress?: (status: string, retry: number) => void;
    } = {},
  ): Promise<PollResult<T>> {
    const {pollInterval = 5000, maxRetries = 120, onProgress} = options;
    let retries = 0;
    let lastStatus = 'pending';

    while (retries < maxRetries) {
      try {
        // eslint-disable-next-line no-await-in-loop
        const data = await fetchStatus();
        lastStatus = data.processStatus;

        if (onProgress) {
          onProgress(lastStatus, retries);
        }

        if (data.processStatus === 'success') {
          return {success: true, data, lastStatus};
        }

        if (data.processStatus === 'failed') {
          return {
            success: false,
            data,
            lastStatus,
            error:
              ((data as any).message ?? (data as any).failCode)
                ? `Error code: ${(data as any).failCode}`
                : 'Unknown error',
          };
        }

        // eslint-disable-next-line no-await-in-loop, no-promise-executor-return
        await new Promise((resolve) => setTimeout(resolve, pollInterval));
        retries++;
      } catch (error) {
        return {
          success: false,
          lastStatus,
          error: formatError(error),
        };
      }
    }

    return {
      success: false,
      lastStatus,
      error: `Timeout (exceeded ${(maxRetries * pollInterval) / 1000} seconds)`,
    };
  }
  ```
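In pollUntilComplete as quoted, a timeout is reported through the same `error` field as a failure, so a caller that checks `result.error` first treats both cases alike. A minimal sketch of one alternative that tags the outcome instead; `pollWithOutcome`, `PollOutcome`, and the fake status source are hypothetical and not part of the project.

```typescript
type Status = {processStatus: string};

// Tagged result: callers can branch on `kind` instead of inspecting an error string.
type PollOutcome<T> =
  | {kind: 'success'; data: T}
  | {kind: 'failed'; data: T}
  | {kind: 'timeout'; lastStatus: string};

async function pollWithOutcome<T extends Status>(
  fetchStatus: () => Promise<T>,
  pollInterval = 5000,
  maxRetries = 60,
): Promise<PollOutcome<T>> {
  let lastStatus = 'pending';
  for (let retry = 0; retry < maxRetries; retry++) {
    const data = await fetchStatus();
    lastStatus = data.processStatus;
    if (lastStatus === 'success') return {kind: 'success', data};
    if (lastStatus === 'failed') return {kind: 'failed', data};
    // Wait before the next status check.
    await new Promise<void>((resolve) => setTimeout(resolve, pollInterval));
  }

  // Retry budget exhausted without reaching a terminal status.
  return {kind: 'timeout', lastStatus};
}

// Fake status source for demonstration: reports "processing" for the first
// `successAfter` calls, then "success".
function fakeSource(successAfter: number): () => Promise<Status> {
  let calls = 0;
  return async () => ({
    processStatus: ++calls > successAfter ? 'success' : 'processing',
  });
}

async function demo(): Promise<[string, string]> {
  const done = await pollWithOutcome(fakeSource(2), 1, 5); // succeeds on the 3rd check
  const slow = await pollWithOutcome(fakeSource(100), 1, 3); // runs out of retries
  return [done.kind, slow.kind];
}

demo().then((kinds) => console.log(kinds));
```

With the defaults shown (5000 ms interval, 60 retries), the retry budget is 60 × 5 s = 300 s, matching the 5 minutes mentioned in the handler's timeout message.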