Auto-Snap MCP

list_windows

Retrieve available windows for screenshot capture, providing IDs and titles to select targets for automated document processing and PDF conversion.

Instructions

List all available windows for screenshot capture.

Returns:
    JSON string containing list of windows with their IDs, titles, and properties.

Input Schema

Name	Required	Description	Default
No arguments

Output Schema

Name	Required	Description	Default
`result`	Yes

Implementation Reference

server.py:71-102 (handler)

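Primary MCP tool handler and registration for 'list_windows'. Obtains the window manager through get_window_manager(), lists windows, adds environment info, and returns a formatted JSON response. On failure it logs the error and returns an error payload with an empty window list instead of raising.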

@mcp.tool()
async def list_windows() -> str:
    """
    List all available windows for screenshot capture.
    
    Returns:
        JSON string containing list of windows with their IDs, titles, and properties.
    """
    try:
        wm = get_window_manager()
        windows = wm.list_windows()
        env_info = wm.get_environment_info()
        
        result = {
            "status": "success",
            "windows": windows,
            "count": len(windows),
            "environment": env_info
        }
        
        return json.dumps(result, indent=2)
        
    except Exception as e:
        logger.error(f"Failed to list windows: {e}")
        return json.dumps({
            "status": "error",
            "error": str(e),
            "windows": [],
            "count": 0,
            "environment": {"error": "Could not determine environment"}
        })
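
Because the handler reports failures in-band (a "status" field) rather than raising, a caller has to inspect the payload before using it. A minimal client-side sketch follows; the function name parse_list_windows_response and the sample payload are hypothetical and only mirror the shape of the two branches shown above.

```python
import json
from typing import Any, Dict, List


def parse_list_windows_response(raw: str) -> List[Dict[str, Any]]:
    """Return the window list from a list_windows response, or raise on error.

    Hypothetical helper: it assumes only the "status", "error", and
    "windows" keys that the handler above places in its JSON payload.
    """
    payload = json.loads(raw)
    if payload.get("status") != "success":
        # The handler signals failure through the payload, not an exception.
        raise RuntimeError(payload.get("error", "unknown error"))
    return payload.get("windows", [])


# Made-up sample shaped like the success branch of the handler.
sample = json.dumps({
    "status": "success",
    "windows": [{"id": "0x01e00003", "title": "Terminal"}],
    "count": 1,
    "environment": {},
})
windows = parse_list_windows_response(sample)
print(windows[0]["title"])
```

Raising on the error branch keeps an empty "windows" list from being mistaken for "no windows are open".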

capture.py:1375-1390 (helper)

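CrossPlatformWindowManager.list_windows(): Platform-agnostic wrapper that delegates to the platform-specific manager (WindowsWindowManager or WindowCapture), tags each window with the detected environment, and fills in a default 'type' ('x11' on Linux, otherwise the environment name) when the manager did not set one. Any exception is logged and an empty list is returned.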

def list_windows(self) -> List[Dict[str, str]]:
    """List all available windows using the appropriate manager."""
    try:
        windows = self.manager.list_windows()
        
        # Add environment info to each window
        for window in windows:
            window['environment'] = self.environment
            if 'type' not in window:
                window['type'] = 'x11' if self.environment == 'linux' else self.environment
                
        return windows
    except Exception as e:
        logger.error(f"Failed to list windows: {e}")
        return []

capture.py:75-234 (helper)

WindowsWindowManager.list_windows(): Core implementation for Windows. It runs a PowerShell script that calls Win32 APIs to enumerate each process's main window in a capturable state (visible or minimized) and returns the ID, title, process info, and window state (normal, minimized, or maximized) of each one.

    def list_windows(self) -> List[Dict[str, str]]:
        """
        List all Windows applications with visible windows using PowerShell.
        Returns list of window info dictionaries.
        """
        if not self.powershell_available:
            logger.error("PowerShell not available - cannot list Windows applications")
            return []
            
        try:
            # Enhanced PowerShell script to get comprehensive window information
            ps_script = '''
            Add-Type -TypeDefinition @"
                using System;
                using System.Runtime.InteropServices;
                using System.Text;
                
                public class Win32 {
                    [DllImport("user32.dll")]
                    public static extern bool IsWindowVisible(IntPtr hWnd);
                    
                    [DllImport("user32.dll")]
                    public static extern bool IsIconic(IntPtr hWnd);
                    
                    [DllImport("user32.dll")]
                    public static extern bool IsZoomed(IntPtr hWnd);
                    
                    [DllImport("user32.dll")]
                    public static extern int GetWindowText(IntPtr hWnd, StringBuilder lpString, int nMaxCount);
                    
                    [DllImport("user32.dll")]
                    public static extern int GetWindowTextLength(IntPtr hWnd);
                    
                    [DllImport("user32.dll")]
                    public static extern uint GetWindowThreadProcessId(IntPtr hWnd, out uint lpdwProcessId);
                }
"@

            $windows = @()

            # Get processes with capturable windows only
            Get-Process | Where-Object { 
                $_.MainWindowHandle -ne 0 -and $_.ProcessName -notmatch "^(dwm|csrss|winlogon|wininit)$"
            } | ForEach-Object {
                $handle = [IntPtr]$_.MainWindowHandle
                $isVisible = [Win32]::IsWindowVisible($handle)
                $isMinimized = [Win32]::IsIconic($handle)
                $isMaximized = [Win32]::IsZoomed($handle)
                
                # Only include windows that are in capturable states
                # Skip hidden windows that can't be meaningfully captured
                if (-not ($isVisible -or $isMinimized)) {
                    return  # Skip this window
                }
                
                # Get window title using Windows API (more reliable than MainWindowTitle)
                $titleLength = [Win32]::GetWindowTextLength($handle)
                if ($titleLength -gt 0) {
                    $title = New-Object System.Text.StringBuilder($titleLength + 1)
                    [Win32]::GetWindowText($handle, $title, $title.Capacity) | Out-Null
                    $windowTitle = $title.ToString()
                } else {
                    $windowTitle = $_.MainWindowTitle
                }
                
                # Include window even if title is empty, but provide useful info
                if ([string]::IsNullOrEmpty($windowTitle)) {
                    $windowTitle = "[$($_.ProcessName) - $($_.Id)]"
                }
                
                # Determine window state - only capturable states
                $windowState = if ($isMinimized) { 
                    "minimized" 
                } elseif ($isMaximized) { 
                    "maximized" 
                } else { 
                    "normal" 
                }
                
                $windows += @{
                    id = $_.MainWindowHandle.ToString()
                    title = $windowTitle
                    process_name = $_.ProcessName
                    process_id = $_.Id.ToString()
                    window_handle = $_.MainWindowHandle.ToString()
                    is_visible = $isVisible
                    is_minimized = $isMinimized
                    is_maximized = $isMaximized
                    window_state = $windowState
                    type = "windows"
                }
            }

            # Convert to JSON
            $windows | ConvertTo-Json -Compress
            '''
            
            result = subprocess.run(
                ['powershell.exe', '-Command', ps_script],
                capture_output=True,
                text=True,
                check=True,
                timeout=30,  # 30 second timeout for window enumeration
                encoding='utf-8',
                errors='ignore'  # Ignore encoding errors to handle special characters
            )
            
            import json
            if result.stdout.strip():
                try:
                    # Handle both single object and array responses
                    data = json.loads(result.stdout.strip())
                    if isinstance(data, dict):
                        data = [data]  # Convert single result to list
                    
                    windows = []
                    for window_info in data:
                        # Only include windows with valid window handles (> 0)
                        window_handle = str(window_info.get('window_handle', '0'))
                        if window_handle == '0':
                            continue  # Skip processes without actual windows
                        
                        # Use the enhanced window information from the new PowerShell script
                        windows.append({
                            'id': str(window_info.get('id', '')),
                            'title': window_info.get('title', ''),
                            'process_name': window_info.get('process_name', ''),
                            'process_id': str(window_info.get('process_id', '')),
                            'window_handle': window_handle,
                            'is_visible': window_info.get('is_visible', False),
                            'is_minimized': window_info.get('is_minimized', False),
                            'is_maximized': window_info.get('is_maximized', False),
                            'window_state': window_info.get('window_state', 'normal'),  # Default to normal instead of unknown
                            'type': 'windows'
                        })
                    
                    logger.info(f"Found {len(windows)} capturable windows using enhanced detection")
                    if windows:
                        # Log stats about capturable window states for debugging
                        normal_count = sum(1 for w in windows if w['window_state'] == 'normal')
                        minimized_count = sum(1 for w in windows if w['window_state'] == 'minimized')
                        maximized_count = sum(1 for w in windows if w['window_state'] == 'maximized')
                        logger.info(f"Window states: {normal_count} normal, {minimized_count} minimized, {maximized_count} maximized")
                    
                    return windows
                except json.JSONDecodeError as e:
                    logger.error(f"Failed to parse PowerShell JSON output: {e}")
                    logger.error(f"PowerShell output was: {result.stdout[:500]}...")  # Log first 500 chars for debugging
                    return []
            else:
                logger.warning("PowerShell returned empty output - no windows detected")
                return []
            
        except subprocess.TimeoutExpired:
            logger.error("PowerShell window enumeration timed out after 30 seconds")
            return []
        except subprocess.CalledProcessError as e:
            logger.error(f"Failed to list Windows applications: {e}")
            return []
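
The "single object and array" handling above exists because ConvertTo-Json emits a bare JSON object, not a one-element array, when only one item is piped into it. The sketch below isolates that normalization step; normalize_powershell_json is a hypothetical name, not a function from capture.py.

```python
import json
from typing import Any, Dict, List


def normalize_powershell_json(stdout: str) -> List[Dict[str, Any]]:
    """Parse ConvertTo-Json output into a list, whatever shape it arrived in.

    Hypothetical helper illustrating the dict-to-list step in the code above.
    """
    text = stdout.strip()
    if not text:
        # Empty output: nothing was enumerated.
        return []
    data = json.loads(text)
    # One window arrives as a dict; wrap it so callers always see a list.
    return [data] if isinstance(data, dict) else data


one_window = '{"id":"197254","title":"Notepad"}'
two_windows = '[{"id":"197254","title":"Notepad"},{"id":"66310","title":"Calc"}]'
print(len(normalize_powershell_json(one_window)))
print(len(normalize_powershell_json(two_windows)))
```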

capture.py:1439-1479 (helper)

WindowCapture.list_windows(): Linux/X11-specific implementation using 'wmctrl -l' command to list windows with basic properties (ID, title, desktop, machine).

def list_windows(self) -> List[Dict[str, str]]:
    """
    List all available windows using wmctrl command.
    Returns list of window info dictionaries.
    """
    try:
        # Use wmctrl to list windows
        result = subprocess.run(
            ['wmctrl', '-l'],
            capture_output=True,
            text=True,
            check=True,
            timeout=10  # 10 second timeout for window listing
        )
        
        windows = []
        for line in result.stdout.strip().split('\n'):
            if line.strip():
                parts = line.split(None, 3)
                if len(parts) >= 4:
                    window_id = parts[0]
                    desktop = parts[1]
                    machine = parts[2]
                    title = parts[3]
                    
                    windows.append({
                        'id': window_id,
                        'title': title,
                        'desktop': desktop,
                        'machine': machine
                    })
        
        return windows
        
    except subprocess.CalledProcessError as e:
        logger.error(f"Failed to list windows with wmctrl: {e}")
        return []
    except FileNotFoundError:
        logger.error("wmctrl not found. Please install: sudo apt-get install wmctrl")
        return []
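
To make the parsing step concrete, here is a small sketch of how one line of 'wmctrl -l' output is split. The sample line and the helper name parse_wmctrl_line are made up for illustration; the splitting logic mirrors the code above.

```python
from typing import Dict, Optional


def parse_wmctrl_line(line: str) -> Optional[Dict[str, str]]:
    """Split one 'wmctrl -l' line into id, desktop, machine, and title.

    Hypothetical helper: returns None for lines with fewer than four fields,
    which the loop above silently skips.
    """
    # maxsplit=3 keeps spaces inside the title intact.
    parts = line.split(None, 3)
    if len(parts) < 4:
        return None
    window_id, desktop, machine, title = parts
    return {
        "id": window_id,
        "title": title,
        "desktop": desktop,
        "machine": machine,
    }


# Made-up sample line in the column layout that 'wmctrl -l' prints.
sample_line = "0x03a00007  0 myhost Report draft - Text Editor"
parsed = parse_wmctrl_line(sample_line)
print(parsed)
```

One consequence of the four-field check is that a window with an empty title produces only three fields and is therefore left out of the result.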

Tool Definition Quality

Grade A, 3.5/5.0

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses that the tool returns a JSON string with window IDs, titles, and properties, which adds useful context beyond the basic purpose. However, it doesn't mention behavioral traits like whether this is a read-only operation, potential performance impacts, or how it interacts with system permissions for window capture.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded: the first sentence states the core purpose, and the second provides essential return format details. There's no wasted text, and both sentences earn their place by adding value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (0 parameters, simple list operation) and the presence of an output schema (which handles return values), the description is reasonably complete. It covers the purpose and output format adequately. However, it could be more complete by including usage guidelines or behavioral context, especially since no annotations are provided.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has 0 parameters with 100% schema description coverage, so the schema already fully documents the lack of inputs. The description doesn't need to add parameter semantics, but it correctly implies no parameters are required by not mentioning any. This meets the baseline for zero parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with a specific verb ('List') and resource ('all available windows for screenshot capture'), making it immediately understandable. However, it doesn't explicitly distinguish this tool from sibling tools like 'capture_window' or 'debug_window_detection', which might also involve window operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites, context for use, or comparison with sibling tools like 'capture_window' (which might require window selection) or 'debug_window_detection' (which might involve troubleshooting).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/PovedaAqui/auto-snap-mcp'
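
The same request can be issued from Python with the standard library. This is a minimal sketch: the helper names build_request and fetch_server_info are hypothetical, and decoding the body as JSON is an assumption, since the page does not state the response format.

```python
import json
import urllib.request
from typing import Any

API_URL = "https://glama.ai/api/mcp/v1/servers/PovedaAqui/auto-snap-mcp"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Build the GET request equivalent to the curl command above."""
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Accept": "application/json"},
    )


def fetch_server_info(url: str = API_URL, timeout: float = 10.0) -> Any:
    """Send the request and decode the body (assumed, not confirmed, to be JSON)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.load(resp)


request = build_request()
print(request.get_method(), request.full_url)
```

Calling fetch_server_info() performs the network request; build_request() alone does not.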

If you have feedback or need assistance with the MCP directory API, please join our Discord server.