
Video & Audio Editing MCP Server

by misbahsy

set_video_audio_track_codec

Modify the audio codec of a video's audio track while preserving the video stream. Specify input and output paths alongside the desired audio codec for processing.

Instructions

Sets the audio codec of a video's audio track, attempting to copy the video stream.

Args:
    input_video_path: Path to the source video file.
    output_video_path: Path to save the video with the new audio codec.
    audio_codec: Target audio codec (e.g., 'aac', 'mp3').

Returns:
    A status message indicating success or failure.

Input Schema

Name               Required  Description  Default
audio_codec        Yes
input_video_path   Yes
output_video_path  Yes
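
As a rough illustration of how the three required fields fit together, the sketch below builds an MCP `tools/call` request for this tool. The helper name `build_tool_call`, the file paths, and the request `id` are placeholders chosen for the example and are not part of the server.

```python
import json

# The three fields marked "Required" in the input schema above.
REQUIRED_FIELDS = ("input_video_path", "output_video_path", "audio_codec")


def build_tool_call(arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' request for set_video_audio_track_codec.

    Hypothetical client-side helper: it only checks that the required
    fields are present, not that the paths or codec are valid.
    """
    missing = [name for name in REQUIRED_FIELDS if name not in arguments]
    if missing:
        raise ValueError(f"Missing required arguments: {', '.join(missing)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "set_video_audio_track_codec",
            "arguments": dict(arguments),
        },
    }


# Placeholder paths; 'aac' is one of the codec examples from the docstring.
request = build_tool_call(
    {
        "input_video_path": "input.mp4",
        "output_video_path": "output.mp4",
        "audio_codec": "aac",
    }
)
print(json.dumps(request, indent=2))
```

Checking for missing fields on the client side is a design choice for this sketch; the server's own validation behavior is not described in the listing.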

Implementation Reference

  • The handler function for the 'set_video_audio_track_codec' MCP tool. It uses FFmpeg to change the audio codec of the video's audio track, first trying to copy the video stream, falling back to re-encoding the video if necessary. The schema (input parameters and description) is defined in the function signature and docstring.
    @mcp.tool()
    def set_video_audio_track_codec(input_video_path: str, output_video_path: str, audio_codec: str) -> str:
        """Sets the audio codec of a video's audio track, attempting to copy the video stream.
        Args:
            input_video_path: Path to the source video file.
            output_video_path: Path to save the video with the new audio codec.
            audio_codec: Target audio codec (e.g., 'aac', 'mp3').
        Returns:
            A status message indicating success or failure.
        """
        primary_kwargs = {'acodec': audio_codec, 'vcodec': 'copy'}
        fallback_kwargs = {'acodec': audio_codec} # Re-encode video
        return _run_ffmpeg_with_fallback(input_video_path, output_video_path, primary_kwargs, fallback_kwargs)
  • Supporting helper function called by set_video_audio_track_codec (and similar tools) to execute the FFmpeg command with a primary set of parameters (e.g., copying the video stream) and a fallback (full re-encode). Handles errors and provides status messages.
    def _run_ffmpeg_with_fallback(input_path: str, output_path: str, primary_kwargs: dict, fallback_kwargs: dict) -> str:
        """Helper to run ffmpeg command with primary kwargs, falling back to other kwargs on ffmpeg.Error."""
        try:
            ffmpeg.input(input_path).output(output_path, **primary_kwargs).run(capture_stdout=True, capture_stderr=True)
            return f"Operation successful (primary method) and saved to {output_path}"
        except ffmpeg.Error as e_primary:
            try:
                ffmpeg.input(input_path).output(output_path, **fallback_kwargs).run(capture_stdout=True, capture_stderr=True)
                return f"Operation successful (fallback method) and saved to {output_path}"
            except ffmpeg.Error as e_fallback:
                err_primary_msg = e_primary.stderr.decode('utf8') if e_primary.stderr else str(e_primary)
                err_fallback_msg = e_fallback.stderr.decode('utf8') if e_fallback.stderr else str(e_fallback)
                return f"Error. Primary method failed: {err_primary_msg}. Fallback method also failed: {err_fallback_msg}"
        except FileNotFoundError:
            return f"Error: Input file not found at {input_path}"
        except Exception as e:
            return f"An unexpected error occurred: {str(e)}"
  • server.py:10-10 (registration)
    Initialization of the FastMCP server instance where tools like set_video_audio_track_codec are registered via the @mcp.tool() decorator.
    mcp = FastMCP("VideoAudioServer")
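
The primary-then-fallback pattern described above can be sketched with the standard library alone. This is an approximation under stated assumptions, not the server's code: `build_attempts` and `run_with_fallback` are hypothetical names, the argument lists are hand-written equivalents of the two keyword sets (`-acodec` plus `-vcodec copy`, then `-acodec` alone), and the default runner assumes an `ffmpeg` binary on the PATH.

```python
import subprocess
from typing import Callable, List, Sequence


def build_attempts(input_path: str, output_path: str, audio_codec: str) -> List[List[str]]:
    """Return [primary, fallback] ffmpeg argument lists (hypothetical helper)."""
    base = ["ffmpeg", "-i", input_path, "-acodec", audio_codec]
    primary = base + ["-vcodec", "copy", output_path]  # keep the video stream as-is
    fallback = base + [output_path]                    # let ffmpeg re-encode the video
    return [primary, fallback]


def _default_runner(argv: Sequence[str]) -> None:
    # Assumes ffmpeg is installed; raises CalledProcessError on a non-zero exit.
    subprocess.run(list(argv), check=True, capture_output=True)


def run_with_fallback(
    attempts: List[List[str]],
    runner: Callable[[Sequence[str]], None] = _default_runner,
) -> str:
    """Try each attempt in order and report which one succeeded, if any."""
    errors = []
    for label, argv in zip(("primary", "fallback"), attempts):
        try:
            runner(argv)
            return f"Operation successful ({label} method)"
        except subprocess.CalledProcessError as exc:
            errors.append(f"{label} method failed: {exc}")
    return "Error. " + " ".join(errors)


# Show the two command lines without executing them (placeholder paths).
for argv in build_attempts("input.mp4", "output.mp4", "aac"):
    print(" ".join(argv))
```

Passing the runner as a parameter keeps the fallback logic testable without invoking ffmpeg; a real caller would use the default runner or its own wrapper.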
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'attempting to copy the video stream', which hints at partial mutation behavior, but fails to specify critical details like whether the operation is destructive to the original file, what permissions are needed, error handling, or performance implications. For a tool that modifies media files, this leaves significant gaps in understanding its behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections for Args and Returns, using minimal sentences that directly convey purpose and parameters. It avoids redundancy, though the 'attempting to copy the video stream' phrase could be more precise, and the return value description is somewhat vague.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations, 0% schema coverage, and no output schema, the description does a fair job by covering purpose and parameters. However, for a tool that performs media transformation, it lacks details on supported formats, error conditions, side effects (e.g., file overwriting), and the structure of the return message, making it incomplete for safe and effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
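
Because the description leaves side effects such as file overwriting unstated, a cautious caller might run its own checks before invoking the tool. The sketch below is one possible pre-flight check; `preflight_problems` is a hypothetical helper, and the conditions it flags are assumptions about what a caller may want to rule out, not documented behavior of the server.

```python
from pathlib import Path
from typing import List


def preflight_problems(input_video_path: str, output_video_path: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    src = Path(input_video_path)
    dst = Path(output_video_path)

    if not src.is_file():
        problems.append(f"Input file not found: {src}")
    if dst.exists():
        # The tool description does not say what happens in this case.
        problems.append(f"Output path already exists: {dst}")
    if src.resolve() == dst.resolve():
        problems.append("Input and output paths point to the same file")
    if not dst.parent.exists():
        problems.append(f"Output directory does not exist: {dst.parent}")
    return problems


# Placeholder paths; the result depends on the local filesystem.
for problem in preflight_problems("input.mp4", "output.mp4"):
    print(problem)
```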

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It explicitly lists and explains all three parameters ('input_video_path', 'output_video_path', 'audio_codec') with examples for 'audio_codec' (e.g., 'aac', 'mp3'), adding meaningful context beyond the bare schema. However, it doesn't detail constraints like valid codec formats or path requirements, slightly limiting completeness.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Sets the audio codec of a video's audio track') and resource ('video's audio track'), with the additional detail 'attempting to copy the video stream' that distinguishes it from siblings like 'set_audio_bitrate' or 'convert_video_format'. It precisely defines what the tool does beyond just restating the name.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'set_video_audio_track_bitrate' or 'convert_audio_format'. It lacks context about prerequisites (e.g., file formats supported), exclusions, or typical use cases, offering only basic functional information without comparative usage advice.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/misbahsy/video-audio-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.