All Voice Lab MCP Server

Official
by allvoicelab

remove_subtitle

Remove hardcoded subtitles from MP4 and MOV videos using OCR technology to detect and erase text while preserving original video content.

Instructions

[AllVoiceLab Tool] Remove hardcoded subtitles from videos using OCR technology.

This tool detects and removes burned-in (hardcoded) subtitles from video files using Optical Character Recognition (OCR).
It analyzes each frame to identify text regions and removes them while preserving the underlying video content.
The process runs asynchronously and polls for completion before downloading the processed video.

Args:
    video_file_path: Path to the video file to process. Only MP4 and MOV formats are supported. Maximum file size: 2GB.
    language_code: Language code for subtitle text detection (e.g., 'en', 'zh'). Set to 'auto' for automatic language detection. Default is 'auto'.
    name: Optional project name for identification purposes.
    output_dir: Output directory for the processed video file. Default is user's desktop.
    
Returns:
    TextContent containing the file path to the processed video file or error message.
    If the process takes longer than expected, returns the project ID for later status checking.
    
Limitations:
    - Only MP4 and MOV formats are supported
    - Maximum file size: 2GB
    - Processing may take several minutes depending on video length and complexity
    - Works best with clear, high-contrast subtitles
    - May not completely remove stylized or animated subtitles

Input Schema

Name             Required  Description  Default
video_file_path  Yes       -            -
language_code    No        -            auto
name             No        -            -
output_dir       No        -            -
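
An MCP client passes these four parameters as the `arguments` object of a JSON-RPC `tools/call` request. The sketch below only builds that payload and does not send it; `build_tool_call` is a hypothetical helper and the file path is a placeholder. Optional parameters are omitted when unset so the server applies its own defaults.

```python
import json
from typing import Optional


def build_tool_call(
    request_id: int,
    video_file_path: str,
    language_code: str = "auto",
    name: Optional[str] = None,
    output_dir: Optional[str] = None,
) -> dict:
    """Build a JSON-RPC 2.0 tools/call request for the remove_subtitle tool."""
    arguments = {"video_file_path": video_file_path, "language_code": language_code}
    # Leave optional arguments out entirely rather than sending null values.
    if name is not None:
        arguments["name"] = name
    if output_dir is not None:
        arguments["output_dir"] = output_dir
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "remove_subtitle", "arguments": arguments},
    }


if __name__ == "__main__":
    # Placeholder path, for illustration only.
    print(json.dumps(build_tool_call(1, "/videos/demo.mp4"), indent=2))
```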

Implementation Reference

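  • The handler function that executes the remove_subtitle tool. It validates the video file (path exists, MP4 or MOV format, size no larger than 2GB), starts subtitle removal through the AllVoiceLab API, and polls the task status every 10 seconds for up to 60 attempts (about 10 minutes). On success it downloads the processed MP4 to output_dir and returns the saved path; if the task has not finished in time, it returns the project ID instead.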
    # Imports this excerpt relies on. get_client() and the mcp server object
    # are not shown in this excerpt.
    import logging
    import os
    import time

    import requests
    from mcp.types import TextContent

    def remove_subtitle(
        video_file_path: str,
        language_code: str = "auto",
        name: str = None,
        output_dir: str = None
    ) -> TextContent:
        """
        Remove hardcoded subtitles from videos using OCR technology
        
        Args:
            video_file_path: Path to the video file to process. Only MP4 and MOV formats are supported. Maximum file size: 2GB.
            language_code: Language code for subtitle text detection (e.g., 'en', 'zh'). Set to 'auto' for automatic language detection. Default is 'auto'.
            name: Optional project name for identification purposes.
            output_dir: Output directory for the processed video file. Default is user's desktop.
            
        Returns:
            TextContent: Text content containing the processed video file path or error message.
            If the process takes longer than expected, returns the project ID for later status checking.
        """
        all_voice_lab = get_client()
        output_dir = all_voice_lab.get_output_path(output_dir)
    
        poll_interval = 10
        max_retries = 60
        logging.info("Tool called: remove_subtitle")
        logging.info(f"Video file path: {video_file_path}")
        logging.info(f"Language code: {language_code}")
        logging.info(f"Output directory: {output_dir}")
        logging.info(f"Poll interval: {poll_interval} seconds")
        logging.info(f"Max retries: {max_retries}")
        if name:
            logging.info(f"Project name: {name}")
    
        # Validate parameters
        if not video_file_path:
            logging.warning("Video file path parameter is empty")
            return TextContent(
                type="text",
                text="video_file_path parameter cannot be empty"
            )
    
        # Check if video file exists before processing
        if not os.path.exists(video_file_path):
            logging.warning(f"Video file does not exist: {video_file_path}")
            return TextContent(
                type="text",
                text=f"Video file does not exist: {video_file_path}"
            )
    
        # Check file format, only allow mp4 and mov
        _, file_extension = os.path.splitext(video_file_path)
        file_extension = file_extension.lower()
        if file_extension not in [".mp4", ".mov"]:
            logging.warning(f"Unsupported video file format: {file_extension}")
            return TextContent(
                type="text",
                text=f"Unsupported video file format. Only MP4 and MOV formats are supported."
            )
    
        # Check file size, limit to 2GB
        max_size_bytes = 2 * 1024 * 1024 * 1024  # 2GB in bytes
        file_size = os.path.getsize(video_file_path)
        if file_size > max_size_bytes:
            logging.warning(f"Video file size exceeds limit: {file_size} bytes, max allowed: {max_size_bytes} bytes")
            return TextContent(
                type="text",
                text=f"Video file size exceeds the maximum limit of 2GB. Please use a smaller file."
            )
    
        try:
            logging.info("Starting subtitle removal process")
            project_id = all_voice_lab.subtitle_removal(
                video_file_path=video_file_path,
                language_code=language_code,
                name=name
            )
            logging.info(f"Subtitle removal initiated, project ID: {project_id}")
    
            # Poll for task completion
            logging.info(f"Starting to poll for task completion, interval: {poll_interval}s, max retries: {max_retries}")
    
            # Initialize variables for polling
            retry_count = 0
            task_completed = False
            removal_info = None
    
            # Poll until task is completed or max retries reached
            while retry_count < max_retries and not task_completed:
                try:
                    # Wait for the specified interval
                    time.sleep(poll_interval)
    
                    # Check task status
                    removal_info = all_voice_lab.get_removal_info(project_id)
                    logging.info(f"Poll attempt {retry_count + 1}, status: {removal_info.status}")
    
                    # Check if task is completed
                    if removal_info.status.lower() == "success":
                        task_completed = True
                        logging.info("Subtitle removal task completed successfully")
                        break
                    elif removal_info.status.lower() == "failed":
                        logging.error("Subtitle removal task failed")
                        return TextContent(
                            type="text",
                            text=f"Subtitle removal failed. Please try again later."
                        )
    
                    # Increment retry count
                    retry_count += 1
    
                except Exception as e:
                    logging.error(f"Error checking task status: {str(e)}")
                    retry_count += 1
    
            # Check if task completed successfully
            if not task_completed:
                logging.warning(f"Subtitle removal task did not complete within {max_retries} attempts")
                return TextContent(
                    type="text",
                    text=f"Subtitle removal is still in progress. Your project ID is: {project_id}. You can check the status later."
                )
    
            # Download the processed video
            logging.info("Downloading processed video")
            try:
                # Check if output URL is available
                if not removal_info.removal_result:
                    logging.error("No removal_result URL available in the response")
                    return TextContent(
                        type="text",
                        text=f"Subtitle removal completed but no output file is available. Your project ID is: {project_id}"
                    )
    
                # Prepare HTTP request
                url = removal_info.removal_result
    
                # Set request headers, accept all types of responses
                headers = all_voice_lab._get_headers(content_type="", accept="*/*")
    
                # Send request and get response
                response = requests.get(url, headers=headers, stream=True)
    
                # Check response status
                response.raise_for_status()
    
                # Generate filename based on original file name with suffix
                original_filename = os.path.basename(video_file_path)
                name_without_ext, _ = os.path.splitext(original_filename)
                filename = f"{name_without_ext}_subtitle_removed.mp4"
    
                # Build complete file path
                file_path = os.path.join(output_dir, filename)
    
                # Save response content to file
                with open(file_path, 'wb') as f:
                    for chunk in response.iter_content(chunk_size=8192):
                        if chunk:
                            f.write(chunk)
    
                logging.info(f"Processed video saved to: {file_path}")
                return TextContent(
                    type="text",
                    text=f"Subtitle removal completed successfully. Processed video saved to: {file_path}"
                )
    
            except Exception as e:
                logging.error(f"Failed to download processed video: {str(e)}")
                return TextContent(
                    type="text",
                    text=f"Subtitle removal completed but failed to download the processed video. Your project ID is: {project_id}"
                )
    
        except FileNotFoundError as e:
            logging.error(f"Video file does not exist: {video_file_path}, error: {str(e)}")
            return TextContent(
                type="text",
                text=f"Video file does not exist: {video_file_path}"
            )
        except Exception as e:
            logging.error(f"Subtitle removal failed: {str(e)}")
            return TextContent(
                type="text",
                text=f"Subtitle removal failed, tool temporarily unavailable"
            )
  • Registers the 'remove_subtitle' tool with the FastMCP server, binding the handler from dubbing.py. Because the input schema carries no per-parameter descriptions, the tool description documents the arguments itself (video_file_path, language_code='auto', name=None, output_dir=None).
    mcp.tool(
        name="remove_subtitle",
        description="""[AllVoiceLab Tool] Remove hardcoded subtitles from videos using OCR technology.
        
        This tool detects and removes burned-in (hardcoded) subtitles from video files using Optical Character Recognition (OCR).
        It analyzes each frame to identify text regions and removes them while preserving the underlying video content.
        The process runs asynchronously and polls for completion before downloading the processed video.
        
        Args:
            video_file_path: Path to the video file to process. Only MP4 and MOV formats are supported. Maximum file size: 2GB.
            language_code: Language code for subtitle text detection (e.g., 'en', 'zh'). Set to 'auto' for automatic language detection. Default is 'auto'.
            name: Optional project name for identification purposes.
            output_dir: Output directory for the processed video file. Default is user's desktop.
            
        Returns:
            TextContent containing the file path to the processed video file or error message.
            If the process takes longer than expected, returns the project ID for later status checking.
            
        Limitations:
            - Only MP4 and MOV formats are supported
            - Maximum file size: 2GB
            - Processing may take several minutes depending on video length and complexity
            - Works best with clear, high-contrast subtitles
            - May not completely remove stylized or animated subtitles
        """
    )(remove_subtitle)
  • Supporting function to query status of a remove_subtitle project using project_id, used for long-running tasks.
    def get_removal_info(project_id: str) -> TextContent:
        """
        Retrieve status and details of a subtitle removal task
        
        Args:
            project_id: The unique identifier of the subtitle removal task to check. This ID is returned from the remove_subtitle tool. Required.
            
        Returns:
            TextContent: Text content containing the status (e.g., "pending", "processing", "success", "failed") and other details of the subtitle removal task,
            including the URL to the processed video if the task has completed successfully.
        """
        all_voice_lab = get_client()
        logging.info(f"Tool called: get_removal_info")
        logging.info(f"Project ID: {project_id}")
    
        # Validate parameters
        if not project_id:
            logging.warning("Project ID parameter is empty")
            return TextContent(
                type="text",
                text="project_id parameter cannot be empty"
            )
    
        try:
            logging.info("Getting subtitle removal task information")
            removal_info = all_voice_lab.get_removal_info(project_id)
            logging.info(f"Subtitle removal info retrieved successfully for ID: {project_id}")
    
            # Format the result
            buffer = []
            buffer.append(f"Project ID: {removal_info.project_id}\n")
            buffer.append(f"Status: {removal_info.status}\n")
    
            if removal_info.name:
                buffer.append(f"Project Name: {removal_info.name}\n")
    
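            # Assumes the client returns the same fields the remove_subtitle handler
            # reads: status "success"/"failed" and the result URL in removal_result.
            status = removal_info.status.lower()
            if removal_info.removal_result and status == "success":
                buffer.append(f"Output URL: {removal_info.removal_result}\n")
                buffer.append(
                    "The subtitle removal task has been completed. You can download the processed video from the output URL.\n")
            elif status == "failed":
                buffer.append("The subtitle removal task failed.\n")
            else:
                buffer.append(
                    "The subtitle removal task is still in progress. Please check again later using the project ID.\n")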
    
            # Join the list into a string
            result = "".join(buffer)
            return TextContent(
                type="text",
                text=result
            )
        except Exception as e:
            logging.error(f"Failed to get subtitle removal information: {str(e)}")
            return TextContent(
                type="text",
                text=f"Failed to get subtitle removal information, tool temporarily unavailable"
            )
  • Registers the helper tool 'get_removal_info' for checking remove_subtitle task status.
    mcp.tool(
        name="get_removal_info",
        description="""[AllVoiceLab Tool] Retrieve status and details of a subtitle removal task.
        
        This tool queries the current status of a previously submitted subtitle removal task and returns detailed information
        about its progress, including the current processing stage, completion status, and result URL if available.
        
        Args:
            project_id: The unique identifier of the subtitle removal task to check. This ID is returned from the remove_subtitle tool. Required.
            
        Returns:
            TextContent containing the status (e.g., "pending", "processing", "success", "failed") and other details of the subtitle removal task,
            including the URL to the processed video if the task has completed successfully.
            
        Limitations:
            - The project_id must be valid and properly formatted
            - The task must have been previously submitted to the AllVoiceLab API
        """
    )(get_removal_info)
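
The polling strategy used by the remove_subtitle handler (wait 10 seconds, check the status, give up after 60 attempts, roughly 10 minutes in total) can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the server's code: `poll_until_done` and its `get_status` and `sleep` parameters are hypothetical names, and the status strings are the ones the handler compares against.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    get_status: Callable[[], str],
    poll_interval: float = 10,
    max_retries: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll get_status until it reports a terminal state or attempts run out.

    Returns "success" or "failed", or None if the task is still running
    after max_retries attempts (10 s x 60 = 600 s with the defaults).
    """
    for _ in range(max_retries):
        # Wait first, then check, so a freshly submitted task gets time to start.
        sleep(poll_interval)
        try:
            status = get_status().lower()
        except Exception:
            # A failed status check still consumes one attempt.
            continue
        if status in ("success", "failed"):
            return status
    return None
```

Injecting `sleep` keeps the loop testable without real delays. A caller that receives `None` would report the project ID so the task can be checked later with get_removal_info.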
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure and does so effectively. It explains the asynchronous nature of processing, polling for completion, file format limitations (MP4/MOV), size limits (2GB), processing time expectations, and quality limitations (works best with clear subtitles, may not remove stylized ones). This provides comprehensive behavioral context beyond basic parameter documentation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (purpose, process explanation, args, returns, limitations). While somewhat lengthy, every sentence earns its place by providing essential information. The front-loaded purpose statement immediately clarifies the tool's function.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (asynchronous video processing with OCR), no annotations, and no output schema, the description provides excellent completeness. It covers the full workflow, parameter semantics, return values (both success and error cases), and important limitations. This gives the agent sufficient context to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description fully compensates by providing detailed semantic information for all 4 parameters. It explains what each parameter means, provides examples ('en', 'zh', 'auto'), specifies defaults, and adds important constraints (format support, size limits). This adds substantial value beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('remove hardcoded subtitles from videos') and the technology used ('using OCR technology'). It distinguishes this tool from siblings like 'subtitle_extraction' (which extracts rather than removes) and 'video_translation_dubbing' (which focuses on translation/dubbing rather than subtitle removal).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context about when to use this tool (for removing burned-in subtitles from MP4/MOV videos up to 2GB). It doesn't explicitly mention when NOT to use it or name specific alternatives among sibling tools, but the purpose differentiation is strong enough to imply appropriate usage scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/allvoicelab/AllVoiceLab-MCP'
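
The same request can be made from Python with the standard library. This is a sketch that assumes the endpoint returns a JSON body, which the page does not state; `build_request` and `fetch_server` are hypothetical helper names.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_request(owner: str, server: str) -> urllib.request.Request:
    """Build the GET request for one server entry in the MCP directory."""
    return urllib.request.Request(f"{API_BASE}/{owner}/{server}", method="GET")


def fetch_server(owner: str, server: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the body (assumed here to be JSON)."""
    with urllib.request.urlopen(build_request(owner, server), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Performs a real network call when run as a script.
    print(fetch_server("allvoicelab", "AllVoiceLab-MCP"))
```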

If you have feedback or need assistance with the MCP directory API, please join our Discord server.