upload_caption
Add caption tracks to YouTube videos from SRT or WebVTT content. Set the language and track name, and optionally mark the track as a draft for review.
Instructions
Upload a caption track (SRT or WebVTT) to a video. Each call creates a new track, so use a distinct name per language/track, or set is_draft=true while iterating.
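One way to follow this advice while iterating is to derive a distinct track name from the language and a revision counter, and to keep is_draft set until the captions have been reviewed. The sketch below is illustrative only: `draftUploadArgs`, the naming pattern, and the placeholder video ID are assumptions, not part of the tool.

```typescript
// Argument shape mirroring the tool's input schema.
interface UploadCaptionArgs {
  video_id: string;
  language: string;
  name: string;
  caption_text: string;
  format: "srt" | "vtt";
  is_draft: boolean;
}

// Hypothetical helper: builds arguments for a draft upload with a
// revision-specific track name, so repeated uploads stay distinguishable.
function draftUploadArgs(
  videoId: string,
  language: string,
  revision: number,
  captionText: string,
): UploadCaptionArgs {
  return {
    video_id: videoId,
    language,
    name: `${language} draft r${revision}`, // naming pattern is an assumption
    caption_text: captionText,
    format: "srt",
    is_draft: true, // hidden from viewers until reviewed
  };
}

const sampleSrt = "1\n00:00:00,000 --> 00:00:02,000\nHello\n";
const firstDraft = draftUploadArgs("VIDEO_ID", "es", 1, sampleSrt);
const secondDraft = draftUploadArgs("VIDEO_ID", "es", 2, sampleSrt);
console.log(firstDraft.name, "|", secondDraft.name);
```

Because every call creates a new track, varying the name per revision keeps earlier drafts identifiable in the track list.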
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| video_id | Yes | Video ID the caption belongs to. | |
| language | Yes | BCP-47 language code, e.g. 'en', 'en-US', 'es', 'ja'. Must match a language the video supports. | |
| name | No | Caption track name shown in the player's caption menu. Empty string for the default track. | "" |
| caption_text | Yes | Caption content as a string (SRT or WebVTT format). Source this from a file or the model's output. | |
| format | No | Content type of caption_text: 'srt' (SubRip, application/x-subrip) or 'vtt' (WebVTT, text/vtt). | srt |
| is_draft | No | Draft captions aren't visible to viewers. Useful while reviewing auto-translations. | false |
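Since caption_text is passed as a plain string, it can be generated rather than read from a file. The following is a minimal sketch, assuming cues are available as millisecond offsets; `Cue`, `srtTimestamp`, and `toSrt` are hypothetical names introduced for illustration, and "VIDEO_ID" is a placeholder.

```typescript
// Hypothetical cue shape: start/end offsets in milliseconds plus the text.
interface Cue {
  startMs: number;
  endMs: number;
  text: string;
}

// Formats milliseconds as an SRT timestamp: HH:MM:SS,mmm
function srtTimestamp(ms: number): string {
  const pad = (n: number, width: number) => String(n).padStart(width, "0");
  const hours = Math.floor(ms / 3600000);
  const minutes = Math.floor((ms % 3600000) / 60000);
  const seconds = Math.floor((ms % 60000) / 1000);
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)},${pad(ms % 1000, 3)}`;
}

// Serializes cues as SRT blocks: index, time range, text, blank separator.
function toSrt(cues: Cue[]): string {
  return cues
    .map(
      (cue, i) =>
        `${i + 1}\n${srtTimestamp(cue.startMs)} --> ${srtTimestamp(cue.endMs)}\n${cue.text}\n`,
    )
    .join("\n");
}

const cues: Cue[] = [
  { startMs: 0, endMs: 1500, text: "Hi" },
  { startMs: 1500, endMs: 3000, text: "there" },
];

// Arguments for upload_caption; name and is_draft are omitted so the
// schema defaults from the table above apply.
const uploadArgs = {
  video_id: "VIDEO_ID",
  language: "en",
  caption_text: toSrt(cues),
  format: "srt" as const,
};
console.log(uploadArgs.caption_text);
```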
Implementation Reference
- src/tools/captions.ts:83-116 (handler): The handler function for the 'upload_caption' tool. It converts the caption text to bytes, determines the content type (SRT or VTT), calls client.insertCaption, and returns a formatted success message.

      async (args) => {
        const contentType =
          args.format === "vtt" ? "text/vtt" : "application/x-subrip";
        const bytes = new Uint8Array(Buffer.from(args.caption_text, "utf-8"));
        const result = (await client.insertCaption({
          videoId: args.video_id,
          language: args.language,
          name: args.name,
          isDraft: args.is_draft,
          body: bytes,
          captionContentType: contentType,
        })) as {
          id?: string;
          snippet?: { status?: string };
        };
        return {
          content: [
            {
              type: "text" as const,
              text: [
                `Uploaded caption track: ${result.id ?? "(unknown id)"}`,
                ` video: ${args.video_id}`,
                ` language: ${args.language}`,
                ` name: "${args.name}"`,
                ` format: ${args.format}`,
                ` status: ${result.snippet?.status ?? "?"}`,
                args.is_draft ? " (draft — not visible to viewers)" : "",
              ]
                .filter(Boolean)
                .join("\n"),
            },
          ],
        };
      },

- src/tools/captions.ts:6-36 (schema): The Zod schema for 'upload_caption' inputs: video_id, language, name (default ''), caption_text, format (srt/vtt, default 'srt'), and is_draft (default false).

      const uploadCaptionSchema = {
        video_id: z.string().describe("Video ID the caption belongs to."),
        language: z
          .string()
          .describe(
            "BCP-47 language code, e.g. 'en', 'en-US', 'es', 'ja'. Must match a language the video supports.",
          ),
        name: z
          .string()
          .default("")
          .describe(
            "Caption track name shown in the player's caption menu. Empty string for the default track.",
          ),
        caption_text: z
          .string()
          .describe(
            "Caption content as a string (SRT or WebVTT format). Source this from a file or the model's output.",
          ),
        format: z
          .enum(["srt", "vtt"])
          .default("srt")
          .describe(
            "Content type of caption_text: 'srt' (SubRip, application/x-subrip) or 'vtt' (WebVTT, text/vtt).",
          ),
        is_draft: z
          .boolean()
          .default(false)
          .describe(
            "Draft captions aren't visible to viewers. Useful while reviewing auto-translations.",
          ),
      };

- src/tools/captions.ts:79-117 (registration): The registration call on server.tool(...) that binds the name 'upload_caption' to its description, schema, and handler.

      server.tool(
        "upload_caption",
        "Upload a caption track (SRT or WebVTT) to a video. Creates a new track — use a distinct `name` per language/track, or `is_draft=true` while iterating.",
        uploadCaptionSchema,
        async (args) => {
          const contentType =
            args.format === "vtt" ? "text/vtt" : "application/x-subrip";
          const bytes = new Uint8Array(Buffer.from(args.caption_text, "utf-8"));
          const result = (await client.insertCaption({
            videoId: args.video_id,
            language: args.language,
            name: args.name,
            isDraft: args.is_draft,
            body: bytes,
            captionContentType: contentType,
          })) as {
            id?: string;
            snippet?: { status?: string };
          };
          return {
            content: [
              {
                type: "text" as const,
                text: [
                  `Uploaded caption track: ${result.id ?? "(unknown id)"}`,
                  ` video: ${args.video_id}`,
                  ` language: ${args.language}`,
                  ` name: "${args.name}"`,
                  ` format: ${args.format}`,
                  ` status: ${result.snippet?.status ?? "?"}`,
                  args.is_draft ? " (draft — not visible to viewers)" : "",
                ]
                  .filter(Boolean)
                  .join("\n"),
              },
            ],
          };
        },
      );

- src/server.ts:51-51 (registration): Where registerCaptionTools is called from the main server setup, passing the MCP server and YouTube client.

      registerCaptionTools(s, youtube);

- src/youtube/client.ts:231-273 (helper): The YouTube API helper that performs the multipart upload of the caption track (insertCaption). Constructs a multipart/related body with JSON metadata and caption content.

      async insertCaption(params: {
        videoId: string;
        language: string;
        name: string;
        isDraft: boolean;
        body: Uint8Array;
        captionContentType: string;
      }): Promise<unknown> {
        const boundary = `youtube-mcp-${Date.now().toString(16)}`;
        const metadata = JSON.stringify({
          snippet: {
            videoId: params.videoId,
            language: params.language,
            name: params.name,
            isDraft: params.isDraft,
          },
        });
        const opening = Buffer.from(
          `--${boundary}\r\nContent-Type: application/json; charset=UTF-8\r\n\r\n${metadata}\r\n--${boundary}\r\nContent-Type: ${params.captionContentType}\r\n\r\n`,
          "utf-8",
        );
        const closing = Buffer.from(`\r\n--${boundary}--\r\n`, "utf-8");
        const body = Buffer.concat([opening, Buffer.from(params.body), closing]);

        const url = new URL(`${UPLOAD_API}/captions`);
        url.searchParams.set("part", "snippet");
        url.searchParams.set("uploadType", "multipart");

        const token = await this.ensureAccessToken();
        const res = await fetch(url.toString(), {
          method: "POST",
          headers: {
            Authorization: `Bearer ${token}`,
            "Content-Type": `multipart/related; boundary=${boundary}`,
            "Content-Length": String(body.length),
          },
          body,
        });
        if (!res.ok) {
          throw new Error(
            `YouTube caption insert failed: ${res.status} ${await res.text()}`,
          );
        }
        return res.json();
      }
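
To see what the multipart/related body built by insertCaption looks like on the wire, the layout can be reproduced in isolation without any network call. This is a sketch under assumptions: `buildMultipartRelated` is a hypothetical standalone function (the real helper builds the body inline with Buffer), and the boundary and video ID are placeholders.

```typescript
// Hypothetical helper: assembles a two-part multipart/related body,
// JSON metadata first, then the raw caption bytes.
function buildMultipartRelated(
  boundary: string,
  metadata: unknown,
  media: Uint8Array,
  mediaType: string,
): Uint8Array {
  const enc = new TextEncoder();
  const opening = enc.encode(
    `--${boundary}\r\n` +
      `Content-Type: application/json; charset=UTF-8\r\n\r\n` +
      `${JSON.stringify(metadata)}\r\n` +
      `--${boundary}\r\n` +
      `Content-Type: ${mediaType}\r\n\r\n`,
  );
  const closing = enc.encode(`\r\n--${boundary}--\r\n`);

  // Concatenate opening + media + closing into one byte array.
  const out = new Uint8Array(opening.length + media.length + closing.length);
  out.set(opening, 0);
  out.set(media, opening.length);
  out.set(closing, opening.length + media.length);
  return out;
}

const srtBytes = new TextEncoder().encode(
  "1\n00:00:00,000 --> 00:00:02,000\nHello\n",
);
const exampleBody = buildMultipartRelated(
  "example-boundary", // placeholder; the helper above derives one from Date.now()
  { snippet: { videoId: "VIDEO_ID", language: "en", name: "", isDraft: false } },
  srtBytes,
  "application/x-subrip",
);
const exampleText = new TextDecoder().decode(exampleBody);
console.log(exampleText);
```

The same boundary string has to appear in the request's Content-Type header (`multipart/related; boundary=...`), which is why the helper generates it once and reuses it for both the body and the header.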