create_podcast

Generate podcast episodes from text or URLs with 1-2 speakers in quick, deep, or debate formats, automatically creating both content and audio.

Instructions

Create a podcast episode with full generation (text + audio). Supports single-speaker (solo) or dual-speaker (dialogue) formats with 1-2 speakers (can use speaker names or IDs). Choose from 3 generation modes: quick (3-5 min podcast), deep (8-15 min podcast), or debate (5-10 min podcast). Accepts text or URL sources. This tool will automatically poll until generation is complete (may take several minutes).

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| query | No | The content or topic for the podcast (optional if sources provided) | |
| sources | No | Additional sources (text or URLs) | |
| speakers | Yes | 1-2 speaker names or IDs. Use speaker names from get_speakers tool output (the "name" field, not speakerId). Full speaker IDs also supported. Names will be automatically resolved to IDs. | |
| language | No | Language code (e.g., "zh" for Chinese, "en" for English). Should match the selected speaker's language. If not specified, will use the first speaker's language. | |
| mode | No | Generation mode (time indicates podcast audio length): "quick" (3-5 min audio): fast-paced content for simple topics, news summaries, brief introductions. "deep" (8-15 min audio): comprehensive analysis for detailed topics, in-depth explanations, thorough coverage. "debate" (5-10 min audio): conversational discussion, interviews, dialogues, debates, Q&A sessions. All modes support 1-2 speakers. | quick |
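As a hedged illustration of the schema, a valid argument payload might look like the following TypeScript sketch. The `PodcastArgs` type name, speaker names, and topic below are hypothetical; real speaker names come from the get_speakers tool.

```typescript
// Hypothetical create_podcast argument payload.
// The PodcastArgs shape mirrors the documented schema; field values are made up.
type PodcastSource = {type: 'text' | 'url'; content: string};

type PodcastArgs = {
	query?: string;
	sources?: PodcastSource[];
	speakers: string[]; // 1-2 speaker names or IDs
	language?: string;
	mode?: 'quick' | 'deep' | 'debate'; // defaults to 'quick' when omitted
};

const args: PodcastArgs = {
	query: 'A short history of shortwave radio',
	speakers: ['Alice', 'Bob'], // placeholder names; use get_speakers output
	language: 'en',
	mode: 'debate',
};

console.log(args.speakers.length, args.mode); // → 2 debate
```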

Implementation Reference

  • Core handler function for 'create_podcast' tool. Resolves speakers, builds CreatePodcastRequest, submits to API via client, polls for completion using pollUntilComplete, formats episode info with formatPodcastEpisode.
    		async execute(args, {log}: {log: any}) {
    			try {
    				// Resolve speaker names/IDs to actual speaker IDs
    				log.info('Resolving speaker identifiers', {
    					input: args.speakers,
    					language: args.language,
    				});
    
    				const resolvedSpeakers = await client.resolveSpeakers(args.speakers);
    				const speakers = resolvedSpeakers.map((r) => ({
    					speakerId: r.speakerId,
    				}));
    				const validationError = validateSpeakers(speakers, 1, 2);
    				if (validationError) {
    					return `Validation error: ${validationError}`;
    				}
    
    				// Use provided language or infer from resolved speaker
    				const allSpeakers = await client.getCachedSpeakers();
    				const resolvedSpeaker = allSpeakers.find(
    					(s) => s.speakerId === resolvedSpeakers[0]?.speakerId,
    				);
    				const language = args.language ?? resolvedSpeaker?.language ?? 'zh';
    
    				log.info('Creating podcast episode', {
    					query: args.query?.slice(0, 50),
    					speakerCount: speakers.length,
    					sourcesCount: args.sources?.length ?? 0,
    					language,
    					mode: args.mode,
    				});
    
    				const requestData: CreatePodcastRequest = {
    					speakers,
    					mode: args.mode,
    					language,
    				};
    
    				if (args.query) {
    					requestData.query = args.query;
    				}
    
    				if (args.sources && args.sources.length > 0) {
    					requestData.sources = args.sources as PodcastSource[];
    				}
    
    				const submitResponse = await client.podcast.createPodcast(requestData);
    
    				if (submitResponse.code !== 0) {
    					return `Failed to submit task: ${submitResponse.message ?? 'Unknown error'}`;
    				}
    
    				const episodeId = submitResponse.data?.episodeId;
    				if (!episodeId) {
    					return 'Failed to submit task: No episodeId returned';
    				}
    
    				log.info(`Podcast task submitted successfully`, {episodeId});
    
    				const result = await pollUntilComplete(
    					async () => {
    						const statusResponse =
    							await client.podcast.getEpisodeStatus(episodeId);
    						if (statusResponse.code !== 0) {
    							throw new Error(
    								statusResponse.message ?? 'Failed to query status',
    							);
    						}
    
    						if (!statusResponse.data) {
    							throw new Error('No episode data returned');
    						}
    
    						return statusResponse.data;
    					},
    					{
    						pollInterval: 5000,
    						maxRetries: 60,
    						onProgress(status, retry) {
    							log.debug(`Podcast generation status: ${status}`, {
    								episodeId,
    								retry: `${retry}/60`,
    							});
    						},
    					},
    				);
    
    				if (!result.success) {
    					if (result.error) {
    						log.error('Podcast generation failed', {
    							episodeId,
    							error: result.error,
    						});
    						return `Podcast generation failed: ${result.error}`;
    					}
    
    					log.warn('Podcast generation timeout', {
    						episodeId,
    						lastStatus: result.lastStatus,
    					});
    					return `Task is still processing after 5 minutes.\n\nEpisode ID: ${episodeId}\nLast Status: ${result.lastStatus}\n\nThe task is running in the background. Use get_podcast_status tool with this Episode ID to check progress.`;
    				}
    
    				const episode = result.data!;
    				log.info('Podcast generation completed', {episodeId});
    
    				const hasOutline = Boolean(episode.outline);
    				const hasScripts = Boolean(
    					episode.scripts && episode.scripts.length > 0,
    				);
    
    				return `Podcast Generation Completed
    
    Content Included:
    - Basic Info: Episode ID, Title, Speakers, Language, Status
    - Audio Files: ${episode.audioUrl ? 'Yes' : 'No'}
    - Outline: ${hasOutline ? 'Yes (see below)' : 'No'}
    - Scripts: ${hasScripts ? 'Yes (see below)' : 'No'}
    
    ${formatPodcastEpisode(episode)}`;
    			} catch (error) {
    				const errorMessage = formatError(error);
    				log.error('Failed to create podcast', {error: errorMessage});
    				return `Failed to create podcast: ${errorMessage}`;
    			}
    		},
  • Zod input schema for the create_podcast tool defining parameters: query, sources, speakers (1-2), language, mode (quick/deep/debate).
    parameters: z.object({
    	query: z
    		.string()
    		.optional()
    		.describe(
    			'The content or topic for the podcast (optional if sources provided)',
    		),
    	sources: z
    		.array(
    			z.object({
    				type: z.enum(['text', 'url']).describe('Source type: text or url'),
    				content: z
    					.string()
    					.describe('Source content (text content or URL)'),
    			}),
    		)
    		.optional()
    		.describe('Additional sources (text or URLs)'),
    	speakers: z
    		.array(z.string())
    		.min(1)
    		.max(2)
    		.describe(
    			'1-2 speaker names or IDs. Use speaker names from get_speakers tool output (the "name" field, not speakerId). Full speaker IDs also supported. Names will be automatically resolved to IDs.',
    		),
    	language: z
    		.string()
    		.optional()
    		.describe(
    			'Language code (e.g., "zh" for Chinese, "en" for English). Should match the selected speaker\'s language. If not specified, will use the first speaker\'s language.',
    		),
    	mode: z
    		.enum(['quick', 'deep', 'debate'])
    		.default('quick')
    		.describe(
    			'Generation mode (time indicates podcast audio length): ' +
    				'"quick" (3-5 min audio): Fast-paced content for simple topics, news summaries, brief introductions. Supports 1-2 speakers. ' +
    				'"deep" (8-15 min audio): Comprehensive analysis for detailed topics, in-depth explanations, thorough coverage. Supports 1-2 speakers. ' +
    				'"debate" (5-10 min audio): Conversational discussion, interviews, dialogues, debates, Q&A sessions. Supports 1-2 speakers. Default: quick',
    		),
    }),
  • MCP tool registration via FastMCP server.addTool call within registerPodcastTools function.
    	server.addTool({
    		name: 'create_podcast',
    		description:
    			'Create a podcast episode with full generation (text + audio). Supports single-speaker (solo) or dual-speaker (dialogue) formats with 1-2 speakers (can use speaker names or IDs). Choose from 3 generation modes: quick (3-5 min podcast), deep (8-15 min podcast), or debate (5-10 min podcast). Accepts text or URL sources. This tool will automatically poll until generation is complete (may take several minutes).',
    		parameters: z.object({
    			// ...identical to the Zod input schema shown above...
    		}),
    		annotations: {
    			title: 'Create Podcast',
    			openWorldHint: true,
    			readOnlyHint: false,
    		},
    		// eslint-disable-next-line complexity
    		async execute(args, {log}: {log: any}) {
    			// ...identical to the core handler shown above...
    		},
    	});
  • TypeScript interface CreatePodcastRequest used for the API payload in createPodcast call.
    export type CreatePodcastRequest = {
    	query?: string;
    	sources?: PodcastSource[];
    	speakers: Array<{speakerId: string}>;
    	language?: string;
    	mode?: 'quick' | 'deep' | 'debate';
    };
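A minimal sketch of building this payload, assuming speaker IDs have already been resolved (the IDs and URL below are placeholders, not real values):

```typescript
// Types copied from the reference above; the speaker IDs and URL are made up.
type PodcastSource = {type: 'text' | 'url'; content: string};

type CreatePodcastRequest = {
	query?: string;
	sources?: PodcastSource[];
	speakers: Array<{speakerId: string}>;
	language?: string;
	mode?: 'quick' | 'deep' | 'debate';
};

// Resolved speaker IDs would normally come from client.resolveSpeakers.
const resolvedIds = ['spk_alice_001', 'spk_bob_002'];

const request: CreatePodcastRequest = {
	speakers: resolvedIds.map((speakerId) => ({speakerId})),
	language: 'en',
	mode: 'quick',
	sources: [{type: 'url', content: 'https://example.com/article'}],
};

console.log(request.speakers.length, request.mode); // → 2 quick
```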
  • Utility function pollUntilComplete used in handler to wait for podcast generation to finish.
    export async function pollUntilComplete<T extends {processStatus: string}>(
    	fetchStatus: () => Promise<T>,
    	options: {
    		pollInterval?: number;
    		maxRetries?: number;
    		onProgress?: (status: string, retry: number) => void;
    	} = {},
    ): Promise<PollResult<T>> {
    	const {pollInterval = 5000, maxRetries = 120, onProgress} = options;
    
    	let retries = 0;
    	let lastStatus = 'pending';
    
    	while (retries < maxRetries) {
    		try {
    			// eslint-disable-next-line no-await-in-loop
    			const data = await fetchStatus();
    			lastStatus = data.processStatus;
    
    			if (onProgress) {
    				onProgress(lastStatus, retries);
    			}
    
    			if (data.processStatus === 'success') {
    				return {success: true, data, lastStatus};
    			}
    
    			if (data.processStatus === 'failed') {
    				return {
    					success: false,
    					data,
    					lastStatus,
    				error:
    					// Prefer the server-provided message; fall back to the failure code.
    					(data as any).message ??
    					((data as any).failCode
    						? `Error code: ${(data as any).failCode}`
    						: 'Unknown error'),
    			}
    
    			// eslint-disable-next-line no-await-in-loop, no-promise-executor-return
    			await new Promise((resolve) => setTimeout(resolve, pollInterval));
    			retries++;
    		} catch (error) {
    			return {
    				success: false,
    				lastStatus,
    				error: formatError(error),
    			};
    		}
    	}
    
    	return {
    		success: false,
    		lastStatus,
    		error: `Timeout (exceeded ${(maxRetries * pollInterval) / 1000} seconds)`,
    	};
    }
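To illustrate the polling contract, here is a simplified, self-contained sketch of a pollUntilComplete-style loop driven by a fake status fetcher that reports "processing" twice before "success". The fetcher and the zero poll interval are contrived so the example runs instantly; real callers would pass something like client.podcast.getEpisodeStatus.

```typescript
// Simplified sketch of the polling contract shown above (not the real implementation).
type PollResult<T> = {
	success: boolean;
	data?: T;
	lastStatus: string;
	error?: string;
};

async function pollUntilComplete<T extends {processStatus: string}>(
	fetchStatus: () => Promise<T>,
	{pollInterval = 0, maxRetries = 10}: {pollInterval?: number; maxRetries?: number} = {},
): Promise<PollResult<T>> {
	let lastStatus = 'pending';
	for (let retries = 0; retries < maxRetries; retries++) {
		const data = await fetchStatus(); // one status check per iteration
		lastStatus = data.processStatus;
		if (lastStatus === 'success') return {success: true, data, lastStatus};
		if (lastStatus === 'failed') return {success: false, data, lastStatus, error: 'failed'};
		await new Promise((resolve) => setTimeout(resolve, pollInterval));
	}
	return {success: false, lastStatus, error: 'timeout'};
}

// Fake status fetcher: two 'processing' responses, then 'success'.
let calls = 0;
const fakeFetch = async () => ({processStatus: ++calls < 3 ? 'processing' : 'success'});

pollUntilComplete(fakeFetch).then((result) => {
	console.log(result.success, result.lastStatus, calls); // → true success 3
});
```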
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate this is a write operation (readOnlyHint: false) and supports open-world data (openWorldHint: true). The description adds valuable behavioral context beyond annotations by specifying that the tool automatically polls until generation is complete (which may take several minutes), disclosing the time-consuming nature of the operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in three sentences: first covers purpose and formats, second details generation modes and inputs, third explains the polling behavior. Every sentence provides essential information with zero wasted words, making it highly front-loaded and concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with 5 parameters, no output schema, and annotations covering basic safety, the description does well by explaining generation modes, polling behavior, and speaker resolution. However, it doesn't describe the return value or error conditions, leaving some gaps in completeness for a tool that performs a potentially lengthy operation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema already documents all 5 parameters thoroughly. The description adds minimal additional semantic context, such as clarifying that speakers can be names or IDs and that names are resolved automatically, but this is largely redundant with the schema. Baseline 3 is appropriate given the comprehensive schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool creates a podcast episode with full generation (text + audio), specifies supported formats (single-speaker or dual-speaker), and distinguishes it from siblings like create_podcast_text_only and generate_podcast_audio by emphasizing the comprehensive nature of the generation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on when to use this tool by detailing generation modes (quick, deep, debate) and input options (text or URL sources). However, it doesn't explicitly mention when to choose this over alternatives like create_podcast_text_only or generate_podcast_audio, which would be needed for a perfect score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/marswaveai/listenhub-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server