Auto-Snap MCP

Overview Schema Related Servers Score Discussions

capture_document_pages

Capture multiple document pages automatically by navigating through them with configurable keys and saving screenshots to a specified directory.

Instructions

Capture multiple pages from a document window with automatic navigation.

Args:
    window_id: Window ID containing the document
    page_count: Number of pages to capture
    output_dir: Directory to save captured pages
    navigation_key: Key to press for navigation (Page_Down, Right, space)
    delay_seconds: Delay between navigation and capture

Returns:
    JSON string with capture results and list of captured files.

Input Schema

TableJSON Schema

Name	Required	Default
`window_id`	Yes
`page_count`	Yes
`output_dir`	No
`navigation_key`	No	Page_Down
`delay_seconds`	No

Output Schema

TableJSON Schema

Name	Required	Description	Default
`result`	Yes

Implementation Reference

server.py:173-256 (handler)

The handler function for the 'capture_document_pages' tool, registered via @mcp.tool(). It captures multiple pages from a document window, using the window manager's multi-page capture if available, or a fallback loop with individual captures and delays. Returns JSON results with captured file paths and metadata.

@mcp.tool()
async def capture_document_pages(
    window_id: str, 
    page_count: int,
    output_dir: Optional[str] = None,
    navigation_key: str = "Page_Down",
    delay_seconds: float = 1.0
) -> str:
    """
    Capture multiple pages from a document window with automatic navigation.
    
    Args:
        window_id: Window ID containing the document
        page_count: Number of pages to capture
        output_dir: Directory to save captured pages
        navigation_key: Key to press for navigation (Page_Down, Right, space)
        delay_seconds: Delay between navigation and capture
    
    Returns:
        JSON string with capture results and list of captured files.
    """
    try:
        # Get configured output directory (with backward compatibility)
        config = get_config()
        if output_dir is None:
            if config.should_use_legacy_mode():
                output_dir = "captures"  # Legacy default
            else:
                actual_output_dir = get_output_directory()
        else:
            actual_output_dir = get_output_directory(output_dir)
        
        # Convert to string for compatibility with existing manager interface
        output_dir_str = str(actual_output_dir)
        
        # For multi-page capture, use the underlying manager if it supports it
        wm = get_window_manager()
        if hasattr(wm.manager, 'capture_multiple_pages'):
            captured_files = wm.manager.capture_multiple_pages(
                window_id=window_id,
                page_count=page_count,
                output_dir=output_dir_str,
                navigation_key=navigation_key,
                delay_seconds=delay_seconds
            )
        else:
            # Fallback: capture individual pages manually
            actual_output_dir.mkdir(parents=True, exist_ok=True)
            
            captured_files = []
            for page_num in range(1, page_count + 1):
                filename = generate_page_filename(page_num)
                output_path = actual_output_dir / filename
                captured_path = wm.capture_window(window_id, str(output_path))
                captured_files.append(captured_path)
                
                # Simple delay between captures (no navigation for Windows apps yet)
                if page_num < page_count:
                    time.sleep(delay_seconds)
        
        import os
        total_size = sum(os.path.getsize(f) for f in captured_files if os.path.exists(f))
        
        result = {
            "status": "success",
            "window_id": window_id,
            "pages_captured": len(captured_files),
            "output_directory": output_dir_str,
            "captured_files": captured_files,
            "total_size_mb": round(total_size / (1024 * 1024), 2)
        }
        
        return json.dumps(result, indent=2)
        
    except Exception as e:
        logger.error(f"Failed to capture document pages: {e}")
        return json.dumps({
            "status": "error",
            "error": str(e),
            "window_id": window_id,
            "page_count": page_count,
            "captured_files": []
        })

Tool Definition Quality

A4.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It describes the automatic navigation behavior and mentions the return format (JSON string with results and file list), which is valuable. However, it doesn't disclose important behavioral traits like whether this requires specific permissions, what happens if navigation fails, whether it modifies the document, or any rate limits. The description adds some context but leaves significant gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly structured and concise: a clear purpose statement followed by well-organized Args and Returns sections. Every sentence earns its place, with no wasted words. The information is front-loaded with the core functionality, followed by parameter details and return information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, automatic navigation behavior) and the presence of an output schema (which handles return values), the description is reasonably complete. It covers the core functionality, all parameters, and mentions the return format. However, for a tool with no annotations and significant behavioral complexity, it could benefit from more disclosure about error conditions, permissions, or what 'automatic navigation' entails beyond key pressing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description must compensate for the lack of parameter documentation in the schema. It provides clear explanations for all 5 parameters in the Args section, adding meaningful context about what each parameter does (e.g., 'Key to press for navigation', 'Delay between navigation and capture'). This significantly enhances understanding beyond the bare schema, though it doesn't cover all possible edge cases or format details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('capture multiple pages', 'automatic navigation') and distinguishes it from siblings like capture_full_screen and capture_window by focusing on document pages with navigation. It explicitly mentions the resource (document window) and the multi-page capture functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool (capturing multiple pages from a document with automatic navigation), but doesn't explicitly state when not to use it or name specific alternatives. It implies usage for document page capture with navigation, which is helpful but lacks explicit exclusions or comparisons to siblings like capture_window or full_document_workflow.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

Lightport: Open-Sourcing Glama's AI Gateway
By punkpeye on April 27, 2026.
open source
OpenAI
Tool Definition Quality Score (TDQS)
By punkpeye on April 3, 2026.
mcp
The Hackers Who Tracked My Sleep Cycle
By punkpeye on March 26, 2026.
security

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/PovedaAqui/auto-snap-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server