Glama

list_windows

Retrieve available windows for screenshot capture, providing IDs and titles to select targets for automated document processing and PDF conversion.

Instructions

List all available windows for screenshot capture.

Returns:
    JSON string containing list of windows with their IDs, titles, and properties.
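
As a minimal sketch of consuming this return value, the snippet below parses a response and picks a window by title. The sample payload is hypothetical data shaped like the handler's success response (`status`, `windows`, `count`, `environment`), and `find_window_id` is an illustrative helper, not part of the server.

```python
import json
from typing import Optional

# Hypothetical sample shaped like the tool's success response; real IDs,
# titles, and environment details depend on the host system.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "windows": [
        {"id": "0x03400007", "title": "Report.pdf - Document Viewer", "type": "x11"},
        {"id": "0x03600003", "title": "Terminal", "type": "x11"},
    ],
    "count": 2,
    "environment": {"platform": "linux"},  # assumed shape, for illustration only
})


def find_window_id(response_json: str, title_fragment: str) -> Optional[str]:
    """Return the ID of the first window whose title contains title_fragment."""
    response = json.loads(response_json)
    if response.get("status") != "success":
        return None  # The error payload carries an empty "windows" list.
    needle = title_fragment.lower()
    for window in response.get("windows", []):
        if needle in window.get("title", "").lower():
            return window.get("id")
    return None


if __name__ == "__main__":
    print(find_window_id(SAMPLE_RESPONSE, "report"))
```

Checking `status` first matters because the tool reports failures inside the JSON string rather than by raising, so a caller that only reads `windows` would see an empty list and no explanation.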

Input Schema


No arguments

Output Schema

Name      Required    Description    Default
result    Yes         -              -

Implementation Reference

  • Primary MCP tool handler and registration for 'list_windows'. Obtains the window manager (CrossPlatformWindowManager), lists windows, adds environment info, and returns a formatted JSON response. On failure it returns an error payload with an empty window list.
    @mcp.tool()
    async def list_windows() -> str:
        """
        List all available windows for screenshot capture.
        
        Returns:
            JSON string containing list of windows with their IDs, titles, and properties.
        """
        try:
            wm = get_window_manager()
            windows = wm.list_windows()
            env_info = wm.get_environment_info()
            
            result = {
                "status": "success",
                "windows": windows,
                "count": len(windows),
                "environment": env_info
            }
            
            return json.dumps(result, indent=2)
            
        except Exception as e:
            logger.error(f"Failed to list windows: {e}")
            return json.dumps({
                "status": "error",
                "error": str(e),
                "windows": [],
                "count": 0,
                "environment": {"error": "Could not determine environment"}
            })
  • CrossPlatformWindowManager.list_windows(): Platform-agnostic wrapper that delegates to the platform-specific manager (WindowsWindowManager or WindowCapture), then tags each window with the environment and, if missing, a default type.
    def list_windows(self) -> List[Dict[str, str]]:
        """List all available windows using the appropriate manager."""
        try:
            windows = self.manager.list_windows()
            
            # Add environment info to each window
            for window in windows:
                window['environment'] = self.environment
                if 'type' not in window:
                    window['type'] = 'x11' if self.environment == 'linux' else self.environment
                    
            return windows
        except Exception as e:
            logger.error(f"Failed to list windows: {e}")
            return []
  • WindowsWindowManager.list_windows(): Core implementation for Windows. Runs a PowerShell script that calls Win32 APIs to enumerate capturable windows (visible or minimized) and reports each window's ID, title, process name and ID, and state (normal, minimized, or maximized).
        def list_windows(self) -> List[Dict[str, str]]:
            """
            List all Windows applications with visible windows using PowerShell.
            Returns list of window info dictionaries.
            """
            if not self.powershell_available:
                logger.error("PowerShell not available - cannot list Windows applications")
                return []
                
            try:
                # Enhanced PowerShell script to get comprehensive window information
                ps_script = '''
                Add-Type -TypeDefinition @"
                    using System;
                    using System.Runtime.InteropServices;
                    using System.Text;
                    
                    public class Win32 {
                        [DllImport("user32.dll")]
                        public static extern bool IsWindowVisible(IntPtr hWnd);
                        
                        [DllImport("user32.dll")]
                        public static extern bool IsIconic(IntPtr hWnd);
                        
                        [DllImport("user32.dll")]
                        public static extern bool IsZoomed(IntPtr hWnd);
                        
                        [DllImport("user32.dll")]
                        public static extern int GetWindowText(IntPtr hWnd, StringBuilder lpString, int nMaxCount);
                        
                        [DllImport("user32.dll")]
                        public static extern int GetWindowTextLength(IntPtr hWnd);
                        
                        [DllImport("user32.dll")]
                        public static extern uint GetWindowThreadProcessId(IntPtr hWnd, out uint lpdwProcessId);
                    }
    "@
    
                $windows = @()
    
                # Get processes with capturable windows only
                Get-Process | Where-Object { 
                    $_.MainWindowHandle -ne 0 -and $_.ProcessName -notmatch "^(dwm|csrss|winlogon|wininit)$"
                } | ForEach-Object {
                    $handle = [IntPtr]$_.MainWindowHandle
                    $isVisible = [Win32]::IsWindowVisible($handle)
                    $isMinimized = [Win32]::IsIconic($handle)
                    $isMaximized = [Win32]::IsZoomed($handle)
                    
                    # Only include windows that are in capturable states
                    # Skip hidden windows that can't be meaningfully captured
                    if (-not ($isVisible -or $isMinimized)) {
                        return  # Skip this window
                    }
                    
                    # Get window title using Windows API (more reliable than MainWindowTitle)
                    $titleLength = [Win32]::GetWindowTextLength($handle)
                    if ($titleLength -gt 0) {
                        $title = New-Object System.Text.StringBuilder($titleLength + 1)
                        [Win32]::GetWindowText($handle, $title, $title.Capacity) | Out-Null
                        $windowTitle = $title.ToString()
                    } else {
                        $windowTitle = $_.MainWindowTitle
                    }
                    
                    # Include window even if title is empty, but provide useful info
                    if ([string]::IsNullOrEmpty($windowTitle)) {
                        $windowTitle = "[$($_.ProcessName) - $($_.Id)]"
                    }
                    
                    # Determine window state - only capturable states
                    $windowState = if ($isMinimized) { 
                        "minimized" 
                    } elseif ($isMaximized) { 
                        "maximized" 
                    } else { 
                        "normal" 
                    }
                    
                    $windows += @{
                        id = $_.MainWindowHandle.ToString()
                        title = $windowTitle
                        process_name = $_.ProcessName
                        process_id = $_.Id.ToString()
                        window_handle = $_.MainWindowHandle.ToString()
                        is_visible = $isVisible
                        is_minimized = $isMinimized
                        is_maximized = $isMaximized
                        window_state = $windowState
                        type = "windows"
                    }
                }
    
                # Convert to JSON
                $windows | ConvertTo-Json -Compress
                '''
                
                result = subprocess.run(
                    ['powershell.exe', '-Command', ps_script],
                    capture_output=True,
                    text=True,
                    check=True,
                    timeout=30,  # 30 second timeout for window enumeration
                    encoding='utf-8',
                    errors='ignore'  # Ignore encoding errors to handle special characters
                )
                
                import json
                if result.stdout.strip():
                    try:
                        # Handle both single object and array responses
                        data = json.loads(result.stdout.strip())
                        if isinstance(data, dict):
                            data = [data]  # Convert single result to list
                        
                        windows = []
                        for window_info in data:
                            # Only include windows with valid window handles (> 0)
                            window_handle = str(window_info.get('window_handle', '0'))
                            if window_handle == '0':
                                continue  # Skip processes without actual windows
                            
                            # Use the enhanced window information from the new PowerShell script
                            windows.append({
                                'id': str(window_info.get('id', '')),
                                'title': window_info.get('title', ''),
                                'process_name': window_info.get('process_name', ''),
                                'process_id': str(window_info.get('process_id', '')),
                                'window_handle': window_handle,
                                'is_visible': window_info.get('is_visible', False),
                                'is_minimized': window_info.get('is_minimized', False),
                                'is_maximized': window_info.get('is_maximized', False),
                                'window_state': window_info.get('window_state', 'normal'),  # Default to normal instead of unknown
                                'type': 'windows'
                            })
                        
                        logger.info(f"Found {len(windows)} capturable windows using enhanced detection")
                        if windows:
                            # Log stats about capturable window states for debugging
                            normal_count = sum(1 for w in windows if w['window_state'] == 'normal')
                            minimized_count = sum(1 for w in windows if w['window_state'] == 'minimized')
                            maximized_count = sum(1 for w in windows if w['window_state'] == 'maximized')
                            logger.info(f"Window states: {normal_count} normal, {minimized_count} minimized, {maximized_count} maximized")
                        
                        return windows
                    except json.JSONDecodeError as e:
                        logger.error(f"Failed to parse PowerShell JSON output: {e}")
                        logger.error(f"PowerShell output was: {result.stdout[:500]}...")  # Log first 500 chars for debugging
                        return []
                else:
                    logger.warning("PowerShell returned empty output - no windows detected")
                    return []
                
            except subprocess.TimeoutExpired:
                logger.error("PowerShell window enumeration timed out after 30 seconds")
                return []
            except subprocess.CalledProcessError as e:
                logger.error(f"Failed to list Windows applications: {e}")
                return []
  • WindowCapture.list_windows(): Linux/X11-specific implementation that uses the 'wmctrl -l' command to list windows with basic properties (ID, title, desktop, machine).
    def list_windows(self) -> List[Dict[str, str]]:
        """
        List all available windows using wmctrl command.
        Returns list of window info dictionaries.
        """
        try:
            # Use wmctrl to list windows
            result = subprocess.run(
                ['wmctrl', '-l'],
                capture_output=True,
                text=True,
                check=True,
                timeout=10  # 10 second timeout for window listing
            )
            
            windows = []
            for line in result.stdout.strip().split('\n'):
                if line.strip():
                    parts = line.split(None, 3)
                    if len(parts) >= 4:
                        window_id = parts[0]
                        desktop = parts[1]
                        machine = parts[2]
                        title = parts[3]
                        
                        windows.append({
                            'id': window_id,
                            'title': title,
                            'desktop': desktop,
                            'machine': machine
                        })
            
            return windows
            
        except subprocess.CalledProcessError as e:
            logger.error(f"Failed to list windows with wmctrl: {e}")
            return []
        except FileNotFoundError:
            logger.error("wmctrl not found. Please install: sudo apt-get install wmctrl")
            return []
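
Two parsing steps in the implementations above can be exercised without a window system. The sketch below isolates them: splitting `wmctrl -l` lines with a bounded split, and normalizing PowerShell JSON output that may be a single object or an array. The function names and the sample text are hypothetical, introduced only for illustration.

```python
import json
from typing import Dict, List

# Hypothetical `wmctrl -l` output: <window id> <desktop> <machine> <title>.
SAMPLE_WMCTRL = (
    "0x03400007  0 myhost Report.pdf - Document Viewer\n"
    "0x03600003  1 myhost Terminal\n"
)


def parse_wmctrl_output(stdout: str) -> List[Dict[str, str]]:
    """Parse `wmctrl -l` lines into window dictionaries."""
    windows = []
    for line in stdout.strip().split("\n"):
        # maxsplit=3 keeps a title that contains spaces in one piece.
        parts = line.split(None, 3)
        if len(parts) >= 4:
            windows.append({
                "id": parts[0],
                "title": parts[3],
                "desktop": parts[1],
                "machine": parts[2],
            })
    return windows


def normalize_powershell_json(stdout: str) -> List[dict]:
    """Return a list whether the script emitted one object or an array."""
    text = stdout.strip()
    if not text:
        return []  # Empty output: no windows were reported.
    data = json.loads(text)
    return [data] if isinstance(data, dict) else data


if __name__ == "__main__":
    for window in parse_wmctrl_output(SAMPLE_WMCTRL):
        print(window["id"], window["title"])
    print(normalize_powershell_json('{"id": "1", "title": "Notepad"}'))
```

One consequence of the `len(parts) >= 4` check is that a line with no title yields only three fields and is skipped, so untitled X11 windows would not appear in the result.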

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses that the tool returns a JSON string with window IDs, titles, and properties, which adds useful context beyond the basic purpose. However, it doesn't mention behavioral traits like whether this is a read-only operation, potential performance impacts, or how it interacts with system permissions for window capture.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded: the first sentence states the core purpose, and the second provides essential return format details. There's no wasted text, and both sentences earn their place by adding value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (0 parameters, simple list operation) and the presence of an output schema (which handles return values), the description is reasonably complete. It covers the purpose and output format adequately. However, it could be more complete by including usage guidelines or behavioral context, especially since no annotations are provided.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has 0 parameters with 100% schema description coverage, so the schema already fully documents the lack of inputs. The description doesn't need to add parameter semantics, but it correctly implies no parameters are required by not mentioning any. This meets the baseline for zero parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with a specific verb ('List') and resource ('all available windows for screenshot capture'), making it immediately understandable. However, it doesn't explicitly distinguish this tool from sibling tools like 'capture_window' or 'debug_window_detection', which might also involve window operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites, context for use, or comparison with sibling tools like 'capture_window' (which might require window selection) or 'debug_window_detection' (which might involve troubleshooting).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/PovedaAqui/auto-snap-mcp'
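
The same request can be sketched in Python with the standard library. This is a rough equivalent of the curl command, under two assumptions that the page does not confirm: that other servers follow the same `<owner>/<name>` URL pattern, and that the endpoint responds with JSON. `build_server_url` and `fetch_server` are hypothetical helper names.

```python
import json
import urllib.request

# Base path taken from the curl example; the owner/name pattern is assumed.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_server_url(owner: str, name: str) -> str:
    """Build the directory URL for one server."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0):
    """GET the server record and decode it, assuming a JSON response body."""
    request = urllib.request.Request(build_server_url(owner, name), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server("PovedaAqui", "auto-snap-mcp")` issues the same GET as the curl command; network failures surface as `urllib.error.URLError`, which a caller would need to handle.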

If you have feedback or need assistance with the MCP directory API, please join our Discord server.