play_sound

Play system sounds, text-to-speech audio, or custom audio files to provide audio feedback in applications.

Instructions

Play various types of sounds with customizable parameters

Input Schema

Name | Required | Description | Default
type | Yes | Type of sound to play | -
name | No | System sound name (required when type is "system") | -
text | No | Text to speak (required when type is "tts") | -
voice | No | Voice name (optional, used with type "tts", uses system default if not specified) | -
path | No | Absolute path to audio file (required when type is "file") | -
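
The conditional requirements in the table (name for "system", text for "tts", path for "file") map naturally onto a discriminated union. The following is a minimal sketch of that idea, not the server's code: `parsePlaySoundArgs` and `isNonEmptyString` are hypothetical helpers, and the argument values in the usage lines are illustrative only.

```typescript
// Hypothetical helper: narrows untyped tool arguments to one of the three
// shapes described in the table. Names are illustrative, not the server's own.
type PlaySoundArgs =
  | { type: 'system'; name: string }
  | { type: 'tts'; text: string; voice?: string }
  | { type: 'file'; path: string };

function isNonEmptyString(value: unknown): value is string {
  return typeof value === 'string' && value.length > 0;
}

function parsePlaySoundArgs(raw: unknown): PlaySoundArgs {
  if (typeof raw !== 'object' || raw === null) {
    throw new Error('Arguments must be an object');
  }
  const { type, name, text, voice, path } = raw as Record<string, unknown>;
  switch (type) {
    case 'system':
      if (!isNonEmptyString(name)) {
        throw new Error('"name" is required when type is "system"');
      }
      return { type: 'system', name };
    case 'tts':
      if (!isNonEmptyString(text)) {
        throw new Error('"text" is required when type is "tts"');
      }
      if (voice !== undefined && typeof voice !== 'string') {
        throw new Error('"voice" must be a string');
      }
      // Omit the voice key entirely when no voice was supplied.
      return voice === undefined ? { type: 'tts', text } : { type: 'tts', text, voice };
    case 'file':
      if (!isNonEmptyString(path)) {
        throw new Error('"path" is required when type is "file"');
      }
      return { type: 'file', path };
    default:
      throw new Error('"type" must be "system", "tts", or "file"');
  }
}

// Usage: a well-formed TTS request passes, a system request without a name throws.
const parsed = parsePlaySoundArgs({ type: 'tts', text: 'Build finished' });
console.log(parsed.type);
```

Returning a freshly built object, rather than casting the input, also drops any extra keys, which is one way to mirror the schema's `additionalProperties: false`.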

Implementation Reference

  • src/index.ts:306-338 (registration)
    Registration of the 'play_sound' tool, including name, description, and input schema definition.
    {
      name: 'play_sound',
      description: 'Play various types of sounds with customizable parameters',
      inputSchema: {
        type: 'object',
        required: ['type'],
        properties: {
          type: {
            type: 'string',
            enum: ['system', 'tts', 'file'],
            description: 'Type of sound to play'
          },
          name: {
            type: 'string',
            enum: ['Basso', 'Blow', 'Bottle', 'Frog', 'Funk', 'Glass', 'Hero', 'Morse', 'Ping', 'Pop', 'Purr', 'Sosumi', 'Submarine', 'Tink'],
            description: 'System sound name (required when type is "system")'
          },
          text: {
            type: 'string',
            description: 'Text to speak (required when type is "tts")'
          },
          voice: {
            type: 'string',
            description: 'Voice name (optional, used with type "tts", uses system default if not specified)'
          },
          path: {
            type: 'string',
            description: 'Absolute path to audio file (required when type is "file")'
          }
        },
        additionalProperties: false
      },
    },
  • Handler logic for the 'play_sound' tool call: input validation and delegation to playCustomSound function.
    case 'play_sound': {
      // Validate args exists
      if (!args || typeof args !== 'object') {
        throw new Error('Invalid arguments provided');
      }
      
      // Validate required parameters based on type
      const { type } = args as { type?: string };
      
      if (!type || !['system', 'tts', 'file'].includes(type)) {
        throw new Error('Invalid or missing type. Must be "system", "tts", or "file"');
      }
      
      if (type === 'system' && !(args as any).name) {
        throw new Error('Parameter "name" is required when type is "system"');
      }
      
      if (type === 'tts' && !(args as any).text) {
        throw new Error('Parameter "text" is required when type is "tts"');
      }
      
      if (type === 'file' && !(args as any).path) {
        throw new Error('Parameter "path" is required when type is "file"');
      }
      
      const soundOptions = args as PlaySoundOptions;
      const result = await playCustomSound(soundOptions);
      return {
        content: [
          {
            type: 'text',
            text: result,
          },
        ],
      };
    }
  • Core implementation of custom sound playback: handles system sounds, TTS, and file playback with validation, throttling, and timeouts.
    async function playCustomSound(options: PlaySoundOptions): Promise<string> {
      const requestId = `${options.type}-${Date.now()}`;
      
      // Throttle same type requests
      if (activeRequests.has(options.type)) {
        throw new Error(`${options.type} sound already playing`);
      }
      
      activeRequests.add(options.type);
      
      try {
        let child: ReturnType<typeof spawn>;
        
        // Validate and spawn process
        switch (options.type) {
          case 'system': {
            const { name: soundName } = options;
            if (!ALLOWED_SYSTEM_SOUNDS.has(soundName)) {
              throw new Error(`Unsupported system sound: ${soundName}`);
            }
            child = spawn('afplay', [`/System/Library/Sounds/${soundName}.aiff`]);
            break;
          }
            
          case 'tts': {
            const { text, voice } = options;
            
            // Validate text length
            if (text.length > MAX_TTS_TEXT_LENGTH) {
              throw new Error(`Text too long (max ${MAX_TTS_TEXT_LENGTH} characters)`);
            }
            
            // Validate voice if provided, gracefully fall back to system default
            let finalVoice = voice;
            if (voice !== undefined && !ALLOWED_TTS_VOICES.has(voice)) {
              // Log warning but continue with system default
              console.warn(`Unsupported voice: ${voice}. Using system default voice.`);
              finalVoice = undefined;
            }
            
            const args = finalVoice ? ['-v', finalVoice, text] : [text];
            child = spawn('say', args);
            break;
          }
            
          case 'file': {
            const { path: filePath } = options;
            if (!isAbsolute(filePath)) {
              throw new Error('File path must be absolute');
            }
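            // Stat first, then check the type outside the try block so the
            // "not a file" error is not masked by the catch below.
            let stats;
            try {
              stats = await getCachedFileStat(filePath);
            } catch (error) {
              throw new Error(`File not found or inaccessible: ${filePath}`);
            }
            if (!stats.isFile()) {
              throw new Error('Path must point to a file');
            }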
            child = spawn('afplay', [filePath]);
            break;
          }
            
          default: {
            // TypeScript ensures this is unreachable, but keep for runtime safety
            const exhaustiveCheck: never = options;
            throw new Error(`Unknown sound type: ${(exhaustiveCheck as any).type}`);
          }
        }
        
        // Wrap child process lifecycle in Promise with timeout.
        // Await it so the finally block releases the throttle only after playback settles.
        return await new Promise<string>((resolve, reject) => {
          const timeout = setTimeout(() => {
            child.kill();
            reject(new Error('Sound playback timed out'));
          }, PROCESS_TIMEOUT_MS);
          
          child.once('close', (code: number) => {
            clearTimeout(timeout);
            if (code === 0) {
              resolve(`${options.type} sound played successfully`);
            } else {
              reject(new Error(`Sound playback failed with code ${code}`));
            }
          });
          
          child.once('error', (error) => {
            clearTimeout(timeout);
            reject(error);
          });
        });
      } finally {
        activeRequests.delete(options.type);
      }
    }
  • TypeScript type definitions for PlaySoundOptions used in the tool implementation.
    type SystemSound = { type: 'system'; name: string };
    type TTSSound = { type: 'tts'; text: string; voice?: string };
    type FileSound = { type: 'file'; path: string };
    
    type PlaySoundOptions = SystemSound | TTSSound | FileSound;
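
One detail of the throttling pattern in playCustomSound is worth isolating: a `finally` block runs as soon as its `try` block returns, so a promise returned without `await` leaves the `try` before the work has settled. The sketch below is an illustration of the pattern under stated assumptions, not the server's code: `withThrottle` is a hypothetical helper, and `fakePlayback` is a timer standing in for a real `afplay` or `say` process.

```typescript
// Hypothetical helper illustrating per-key throttling with try/finally.
const active = new Set<string>();

async function withThrottle<T>(key: string, work: () => Promise<T>): Promise<T> {
  if (active.has(key)) {
    throw new Error(`${key} sound already playing`);
  }
  active.add(key);
  try {
    // `await` keeps execution inside the try block until the work settles,
    // so the finally block releases the key only afterwards.
    return await work();
  } finally {
    active.delete(key);
  }
}

// Stand-in for a child process: resolves after the given delay.
const fakePlayback = (ms: number): Promise<string> =>
  new Promise<string>((resolve) => setTimeout(() => resolve('done'), ms));

async function demo(): Promise<string[]> {
  const first = withThrottle('tts', () => fakePlayback(50));
  // A second request of the same type while the first is pending is rejected.
  const second = await withThrottle('tts', () => fakePlayback(50)).catch(
    (error: Error) => error.message,
  );
  const firstResult = await first;
  // Once the first request has settled, the key is free again.
  const third = await withThrottle('tts', () => fakePlayback(10));
  return [firstResult, second, third];
}
```

Calling `demo()` resolves to the first result, the rejection message for the overlapping request, and the result of the request made after the key was released.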

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'customizable parameters' but fails to explain key traits like whether playback is blocking or non-blocking, error handling (e.g., if a file path is invalid), or system dependencies (e.g., TTS availability). This leaves significant gaps in understanding the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose ('play various types of sounds'). It avoids redundancy but could be more structured by explicitly mentioning the parameter types (e.g., system, TTS, file) to enhance clarity without adding unnecessary length.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of handling multiple sound types (system, TTS, file) with no annotations or output schema, the description is incomplete. It lacks details on behavioral aspects like playback effects, error responses, or how outputs are handled, making it insufficient for safe and effective use by an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, providing clear details for all parameters, including enums and requirements. The description adds minimal value beyond this, as it only vaguely references 'customizable parameters' without elaborating on their semantics or interactions, aligning with the baseline score for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool 'play[s] various types of sounds with customizable parameters', which clarifies the action (play) and resource (sounds) but is vague about scope and differentiation. It does not specify what 'various types' means or how it differs from sibling tools like play_error_sound, leaving room for ambiguity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives, such as the sibling tools (play_error_sound, play_info_sound, play_warning_sound). The description implies general sound playback but offers no context for choosing between this and more specific tools, leading to potential misuse.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
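
As an illustration of the points raised in these reviews, a description could name the three sound types, say that the call returns after playback, list the main failure modes, and point to the sibling tools. The text below is a hypothetical rewrite, not the server's actual description. The behavioural statements are read off the handler code quoted on this page, the guidance about sibling tools is an assumption based only on their names, and the constant name `REVISED_DESCRIPTION` is made up for this sketch.

```typescript
// Hypothetical revised description; not the server's actual text.
// Behavioural claims follow the handler code quoted on this page.
// The sibling-tool guidance is an assumption inferred from the tool names.
const REVISED_DESCRIPTION: string = [
  'Play a sound on the host machine and return once playback has finished.',
  'type "system" plays a built-in sound chosen by "name";',
  'type "tts" speaks "text", optionally with "voice" (unsupported voices fall back to the system default);',
  'type "file" plays the audio file at the absolute "path".',
  'A request may be rejected while a sound of the same type is still playing,',
  'and fails if the file is missing or playback times out.',
  'For routine info, warning, or error cues, consider play_info_sound,',
  'play_warning_sound, or play_error_sound instead.',
].join(' ');

console.log(REVISED_DESCRIPTION);
```

Keeping the text as an array of short sentences makes it easier to trim individual claims if the handler's behaviour changes.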

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/nocoo/mcp-make-sound'
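
The same request can be made from TypeScript. This is a minimal sketch that assumes a runtime with a global `fetch` (for example Node 18 or later) and assumes the endpoint returns JSON. The helper names `serverUrl` and `fetchServerInfo` are hypothetical, and generalising the URL to an owner and repository pair is an assumption drawn from the single curl command above.

```typescript
// Hypothetical helpers; only the URL itself comes from the curl command above.
const MCP_API_BASE = 'https://glama.ai/api/mcp/v1/servers';

// Assumption: other servers follow the same /<owner>/<repo> pattern.
function serverUrl(owner: string, repo: string): string {
  return `${MCP_API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

async function fetchServerInfo(owner: string, repo: string): Promise<unknown> {
  const response = await fetch(serverUrl(owner, repo));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  // The response shape is not documented on this page, so it stays `unknown`.
  return response.json();
}
```

A caller would use `await fetchServerInfo('nocoo', 'mcp-make-sound')` and inspect the result before relying on any particular field.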

If you have feedback or need assistance with the MCP directory API, please join our Discord server.