Glama
gaudiolab-jp

gaudio-developers-mcp

Official

gaudio_separate_audio

Separate audio stems (e.g., vocal, drums) or perform DME separation by uploading a file or reusing an upload ID. Poll until complete and receive download URLs.

Instructions

All-in-one audio separation: upload file (or reuse uploadId) → create job → poll until done → return download URLs. For Stem Separation, provide 'type' (e.g. 'vocal', 'vocal,drum'). For DME Separation, no type needed. Supports WAV, FLAC, MP3, M4A, MOV, MP4.

Input Schema

Name | Required | Description | Default
filePath | No | Path to local audio/video file. Either filePath or uploadId is required. |
uploadId | No | Existing uploadId to reuse (skips upload). Valid for 72 hours. |
model | Yes | Model name (e.g. gsep_music_hq_v1, gsep_dme_dtrack_v1) |
type | No | Stem type(s) for Stem Separation models, e.g. 'vocal', 'vocal,drum' |
pollInterval | No | Polling interval in seconds (default: 10) | 10
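
The parameter rules above can be sketched as a small pre-flight check. This is an illustrative sketch only: `SeparateAudioArgs` and `checkArgs` are hypothetical names, the `typeRequired` flag is passed in by the caller because the model registry is not shown here, and pairing `gsep_music_hq_v1` with Stem Separation and `gsep_dme_dtrack_v1` with DME Separation is an assumption based on the model names.

```typescript
// Hypothetical shape of the tool arguments, taken from the input schema table.
interface SeparateAudioArgs {
  filePath?: string;
  uploadId?: string;
  model: string;
  type?: string;
  pollInterval?: number;
}

// Hypothetical helper mirroring two of the handler's checks.
// Returns an error message, or null when the arguments look acceptable.
function checkArgs(args: SeparateAudioArgs, typeRequired: boolean): string | null {
  if (!args.filePath && !args.uploadId) {
    return "Either filePath or uploadId is required.";
  }
  if (typeRequired && !args.type) {
    return `Model ${args.model} requires 'type'.`;
  }
  return null;
}

// Stem Separation (assumed): a local file plus a comma-separated stem list.
const stemCall: SeparateAudioArgs = {
  filePath: "/path/to/song.wav", // placeholder path
  model: "gsep_music_hq_v1",
  type: "vocal,drum",
};

// DME Separation (assumed): reuse an earlier upload, no 'type' needed.
const dmeCall: SeparateAudioArgs = {
  uploadId: "<existing-upload-id>", // placeholder ID
  model: "gsep_dme_dtrack_v1",
};
```

Passing `typeRequired` explicitly keeps the sketch independent of the registry; in the server itself that flag comes from the model's registry entry.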

Implementation Reference

  • Main tool handler: exports registerSeparateAudio, which registers the 'gaudio_separate_audio' MCP tool. The handler validates the model and arguments, uploads the file (or reuses an uploadId), creates a job, polls until done, and returns download URLs.
    import { z } from "zod";
    import type { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
    import type { GaudioClient } from "../api/client.js";
    import { getModel } from "../models/registry.js";
    import { pollJob } from "../utils/polling.js";
    
    export function registerSeparateAudio(server: McpServer, client: GaudioClient) {
      server.tool(
        "gaudio_separate_audio",
        "All-in-one audio separation: upload file (or reuse uploadId) → create job → poll until done → return download URLs. For Stem Separation, provide 'type' (e.g. 'vocal', 'vocal,drum'). For DME Separation, no type needed. Supports WAV, FLAC, MP3, M4A, MOV, MP4.",
        {
          filePath: z
            .string()
            .optional()
            .describe("Path to local audio/video file. Either filePath or uploadId is required."),
          uploadId: z
            .string()
            .optional()
            .describe("Existing uploadId to reuse (skips upload). Valid for 72 hours."),
          model: z.string().describe("Model name (e.g. gsep_music_hq_v1, gsep_dme_dtrack_v1)"),
          type: z
            .string()
            .optional()
            .describe("Stem type(s) for Stem Separation models. e.g. 'vocal', 'vocal,drum'"),
          pollInterval: z
            .number()
            .optional()
            .default(10)
            .describe("Polling interval in seconds (default: 10)"),
        },
        async ({ filePath, uploadId, model, type, pollInterval }) => {
          const modelInfo = getModel(model);
          if (!modelInfo) {
            return {
              content: [{ type: "text" as const, text: `Unknown model: ${model}. Use gaudio_list_models to see available models.` }],
              isError: true,
            };
          }
    
          if (!filePath && !uploadId) {
            return {
              content: [{ type: "text" as const, text: "Either filePath or uploadId is required." }],
              isError: true,
            };
          }
    
          if (modelInfo.typeRequired && !type) {
            return {
              content: [
                {
                  type: "text" as const,
                  text: `Model ${model} requires 'type'. Options: ${modelInfo.typeOptions?.join(", ")}`,
                },
              ],
              isError: true,
            };
          }
    
          if (modelInfo.category === "text_sync") {
            return {
              content: [{ type: "text" as const, text: "For Text Sync, use gaudio_sync_lyrics instead." }],
              isError: true,
            };
          }
    
          const messages: string[] = [];
          const log = (msg: string) => messages.push(msg);
    
          // Step 1: Upload if needed
          let resolvedUploadId = uploadId;
          if (!resolvedUploadId) {
            log("업로드 중..."); // "Uploading..."
            const result = await client.uploadFile(filePath!);
            resolvedUploadId = result.uploadId;
            log(`업로드 완료. uploadId: ${resolvedUploadId}`); // "Upload complete. uploadId: ..."
          } else {
            log(`기존 uploadId 재사용: ${resolvedUploadId}`); // "Reusing existing uploadId: ..."
          }
    
          // Step 2: Create job
          const params: Record<string, unknown> = {
            audioUploadId: resolvedUploadId,
          };
          if (type) params.type = type;
    
          const { jobId } = await client.createJob(model, params);
          log(`Job 생성 완료. jobId: ${jobId}`); // "Job created. jobId: ..."
    
          // Step 3: Poll
          const intervalMs = (pollInterval ?? 10) * 1000;
          const result = await pollJob(client, model, jobId, intervalMs, 30, log);
    
          const output: Record<string, unknown> = {
            jobId: result.jobId,
            status: result.status,
            uploadId: resolvedUploadId,
            model,
          };
    
          if (result.downloadUrl) output.downloadUrl = result.downloadUrl;
          if (result.expireAt) output.expireAt = result.expireAt;
          if (result.errorMessage) output.errorMessage = result.errorMessage;
    
          messages.push(JSON.stringify(output, null, 2));
    
          return {
            content: [
              {
                type: "text" as const,
                text: messages.join("\n"),
              },
            ],
          };
        },
      );
    }
  • Input schema using Zod: filePath (optional), uploadId (optional), model (required string), type (optional string for stem types), pollInterval (default 10s).
    {
      filePath: z
        .string()
        .optional()
        .describe("Path to local audio/video file. Either filePath or uploadId is required."),
      uploadId: z
        .string()
        .optional()
        .describe("Existing uploadId to reuse (skips upload). Valid for 72 hours."),
      model: z.string().describe("Model name (e.g. gsep_music_hq_v1, gsep_dme_dtrack_v1)"),
      type: z
        .string()
        .optional()
        .describe("Stem type(s) for Stem Separation models. e.g. 'vocal', 'vocal,drum'"),
      pollInterval: z
        .number()
        .optional()
        .default(10)
        .describe("Polling interval in seconds (default: 10)"),
    },
  • src/index.ts:10-33 (registration)
    Registration: imports registerSeparateAudio from ./tools/separate-audio.js and calls it at line 31 with server and client instances.
    import { registerSeparateAudio } from "./tools/separate-audio.js";
    import { registerSyncLyrics } from "./tools/sync-lyrics.js";
    import { registerGetKeyInfo } from "./tools/get-key-info.js";
    
    const apiKey = process.env.GAUDIO_API_KEY;
    if (!apiKey) {
      console.error("GAUDIO_API_KEY environment variable is required.");
      process.exit(1);
    }
    
    const server = new McpServer({
      name: "com.gaudiolab/mcp-developers",
      version: "1.0.0",
    });
    
    const client = new GaudioClient(apiKey);
    
    registerListModels(server);
    registerUploadFile(server, client);
    registerCreateJob(server, client);
    registerGetJob(server, client);
    registerSeparateAudio(server, client);
    registerSyncLyrics(server, client);
    registerGetKeyInfo(server, client);
  • Polling helper used by the handler: pollJob polls the job status via GaudioClient.getJob up to maxAttempts, returns success with downloadUrl or failure/error.
    export async function pollJob(
      client: GaudioClient,
      model: string,
      jobId: string,
      intervalMs: number = 10_000,
      maxAttempts: number = 30,
      onProgress?: (message: string) => void,
    ): Promise<PollResult> {
      for (let attempt = 0; attempt < maxAttempts; attempt++) {
        let result;
        try {
          result = await client.getJob(model, jobId);
        } catch (err) {
          if (err instanceof GaudioApiError) {
            return {
              jobId,
              status: "failed",
              errorMessage: err.message,
            };
          }
          throw err;
        }
    
        const status = result.resultData?.status as string;
    
        if (status === "success") {
          onProgress?.("처리 완료!"); // "Processing complete!"
          return {
            jobId,
            status: "success",
            downloadUrl: result.resultData?.downloadUrl as Record<string, unknown>,
            expireAt: result.resultData?.expireAt as string,
          };
        }
    
        if (status === "failed") {
          return {
            jobId,
            status: "failed",
            errorMessage: (result.resultData?.errorMessage as string) ?? "Job failed",
          };
        }
    
        if (attempt === 0) {
          onProgress?.("처리 대기 중..."); // "Waiting for processing..."
        } else {
          onProgress?.(`처리 중... (${attempt + 1}/${maxAttempts})`); // "Processing... (n/max)"
        }
    
        await new Promise((resolve) => setTimeout(resolve, intervalMs));
      }
    
      return {
        jobId,
        status: "polling_timeout",
        // "Still incomplete after N polls. Check later with gaudio_get_job. jobId: ..."
        errorMessage: `${maxAttempts}회 폴링 후에도 미완료. gaudio_get_job으로 나중에 확인하세요. jobId: ${jobId}`,
      };
    }
  • Upload helper used by the handler: uploadFile reads local file, performs multipart upload via pre-signed URLs, returns uploadId.
    async uploadFile(filePath: string): Promise<{ uploadId: string }> {
      const stat = statSync(filePath);
      const fileName = basename(filePath);
      const fileSize = stat.size;
      const fileBuffer = readFileSync(filePath);
    
      const ext = fileName.split(".").pop()?.toLowerCase() ?? "";
      const contentTypeMap: Record<string, string> = {
        wav: "audio/wav",
        flac: "audio/flac",
        mp3: "audio/mpeg",
        m4a: "audio/mp4",
        mov: "video/quicktime",
        mp4: "video/mp4",
        txt: "text/plain",
      };
      const contentType = contentTypeMap[ext] ?? "application/octet-stream";
    
      const { uploadId, chunkSize, preSignedUrl } = await this.uploadCreate(
        fileName,
        fileSize,
      );
    
      const parts: { awsETag: string; partNumber: number }[] = [];
    
      for (let i = 0; i < preSignedUrl.length; i++) {
        const start = i * chunkSize;
        const end = Math.min(start + chunkSize, fileSize);
        const chunk = fileBuffer.subarray(start, end);
    
        const etag = await this.uploadChunk(preSignedUrl[i], new Uint8Array(chunk.buffer, chunk.byteOffset, chunk.byteLength), contentType);
        parts.push({ awsETag: etag, partNumber: i + 1 });
      }
    
      parts.sort((a, b) => a.partNumber - b.partNumber);
      await this.uploadComplete(uploadId, parts);
    
      return { uploadId };
    }
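
The chunk arithmetic in the upload helper can be looked at in isolation. The sketch below reproduces only the byte-range calculation from the loop above; `chunkRanges` and `ChunkRange` are hypothetical names, and `partCount` stands in for the number of pre-signed URLs the upload-create call returns.

```typescript
interface ChunkRange {
  partNumber: number; // 1-based, as in the helper above
  start: number;      // inclusive byte offset
  end: number;        // exclusive byte offset
}

// Splits a file of fileSize bytes into partCount ranges of at most chunkSize
// bytes each. The last range is clamped to the end of the file.
function chunkRanges(fileSize: number, chunkSize: number, partCount: number): ChunkRange[] {
  const ranges: ChunkRange[] = [];
  for (let i = 0; i < partCount; i++) {
    const start = i * chunkSize;
    const end = Math.min(start + chunkSize, fileSize);
    ranges.push({ partNumber: i + 1, start, end });
  }
  return ranges;
}

// A 25-byte file with 10-byte chunks and 3 parts: 0-10, 10-20, 20-25.
const exampleRanges = chunkRanges(25, 10, 3);
```

Because every range is computed from the part index alone, the parts could in principle be uploaded in any order; the helper above uploads them sequentially and sorts by part number before completing the upload.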
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose behavior. It explains the multi-step workflow and the polling interval parameter. However, it does not mention potential side effects, rate limits, error handling, or what happens on failure. The polling behavior is partially described but lacks details on completion criteria.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
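
The completion criteria can be read off the implementation reference: the handler calls the polling helper with a cap of 30 attempts, and the helper ends with one of three statuses. The sketch below is a rough summary under that reading; `maxPollingWaitSeconds` and `NEXT_STEP` are hypothetical names, and the wait figure counts only the sleeps between polls, not the time each status request takes.

```typescript
// Statuses the polling helper can return, per the implementation reference.
type PollStatus = "success" | "failed" | "polling_timeout";

// Attempt cap the handler passes to the polling helper.
const MAX_ATTEMPTS = 30;

// Upper bound on time spent sleeping between polls, in seconds.
// Network latency of the individual status requests is not included.
function maxPollingWaitSeconds(pollIntervalSeconds: number = 10): number {
  return pollIntervalSeconds * MAX_ATTEMPTS;
}

// Suggested follow-up for each terminal status (illustrative wording).
const NEXT_STEP: Record<PollStatus, string> = {
  success: "Use the returned downloadUrl before expireAt.",
  failed: "Inspect errorMessage in the result.",
  polling_timeout: "Check the job later with gaudio_get_job, using the returned jobId.",
};
```

With the default interval of 10 seconds this gives a ceiling of roughly 300 seconds of waiting before the tool returns `polling_timeout`. A timeout does not mean the job failed, only that it had not finished within the polling window.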

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with the core purpose, followed by targeted details. No redundant or extraneous information. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the full workflow, input parameters, and output (download URLs). It lists supported file formats and differentiates between modes. Without an output schema, it provides enough context for basic usage, though it could include more detail on result structure or error scenarios.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
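
On the result structure: according to the handler shown in the implementation reference, a non-error response is a single text item containing progress messages, one per line, followed by a pretty-printed JSON summary. A minimal sketch for pulling that summary out of the text, assuming this layout holds, could look like the following. `SeparationSummary` and `parseSeparationResult` are hypothetical names, and the sample values are placeholders.

```typescript
// Fields the handler places in its JSON summary.
interface SeparationSummary {
  jobId: string;
  status: string;
  uploadId: string;
  model: string;
  downloadUrl?: Record<string, unknown>;
  expireAt?: string;
  errorMessage?: string;
}

// The summary is written with JSON.stringify(output, null, 2), so its first
// line is exactly "{". Nested objects open on the same line as their key,
// so the last line equal to "{" marks the start of the summary.
function parseSeparationResult(text: string): SeparationSummary {
  const lines = text.split("\n");
  const start = lines.lastIndexOf("{");
  if (start === -1) {
    throw new Error("No JSON summary found in tool output");
  }
  return JSON.parse(lines.slice(start).join("\n")) as SeparationSummary;
}

// Placeholder output shaped like the handler's response text.
const sampleText = [
  "Uploading...",
  "Job created. jobId: job-123",
  JSON.stringify(
    {
      jobId: "job-123",
      status: "success",
      uploadId: "upload-456",
      model: "gsep_music_hq_v1",
      downloadUrl: { vocal: "https://example.com/vocal.wav" },
    },
    null,
    2,
  ),
].join("\n");

const summary = parseSeparationResult(sampleText);
```

Validation failures (unknown model, missing arguments) are returned as plain text with `isError: true` and carry no JSON summary, so the parser throws for those responses.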

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description adds context: it clarifies that filePath and uploadId are alternatives, explains the conditional use of 'type' for Stem Separation, and notes the default pollInterval. This goes beyond the schema's individual parameter descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool's purpose as an all-in-one audio separation workflow (upload, create job, poll, download). It distinguishes between Stem Separation and DME Separation modes and lists supported file types, making it highly specific and distinct from siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear guidance on when to provide 'type' for Stem Separation and that DME Separation doesn't need it. Also explains upload options (filePath vs uploadId). However, it does not explicitly address when NOT to use this tool or compare it to sibling tools like gaudio_create_job or gaudio_upload_file.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/gaudiolab-jp/gaudio-developers-mcp'
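
The same request can be made from TypeScript with the built-in `fetch` (available in Node 18+ and in browsers). This is a sketch only: `serverInfoUrl` and `fetchServerInfo` are hypothetical helpers, the owner/name URL pattern is generalized from the single curl example above, and the response is left as `unknown` because its shape is not described here.

```typescript
const MCP_API_BASE = "https://glama.ai/api/mcp/v1/servers";

// Builds the server endpoint URL. The owner/name pattern is an assumption
// generalized from the curl example.
function serverInfoUrl(owner: string, name: string): string {
  return `${MCP_API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(name)}`;
}

// Fetches the server record and returns the parsed JSON body as-is.
async function fetchServerInfo(owner: string, name: string): Promise<unknown> {
  const response = await fetch(serverInfoUrl(owner, name));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// Same URL as the curl command above.
const exampleUrl = serverInfoUrl("gaudiolab-jp", "gaudio-developers-mcp");
```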

If you have feedback or need assistance with the MCP directory API, please join our Discord server.