MCP ACS Screenshot Server
Give AI agents visual superpowers to see, analyze, and document your applications like senior UX designers.
This enterprise-grade MCP server transforms AI from code-only assistants into visual experts capable of UI analysis, accessibility auditing, documentation generation, and responsive design testing.
Repository
This package is now maintained in its own repository: https://github.com/Digital-Defiance/mcp-screenshot
This repository is part of the AI Capabilities Suite on GitHub.
Why Do AI Agents Need Visual Capabilities?
AI agents today are powerful but visually blind:
Can read HTML/CSS but can't see actual layouts
Can suggest UI improvements without seeing the real user experience
Can't detect accessibility issues like poor contrast or spacing
Can't create visual documentation or bug reports
Can't analyze responsive design across different screen sizes
Result: You're stuck manually creating screenshots, documentation, and visual analysis that AI should handle.
Revolutionary Use Cases
"AI, create professional documentation"
"AI, audit this page for accessibility"
"AI, create a detailed bug report"
"AI, compare these design variations"
"AI, test responsive design"
What This Changes
Before: AI worked blind, relying on code descriptions
"The button looks wrong" -> AI guesses the issue
"Create documentation" -> AI writes generic text
"Check accessibility" -> AI only reviews code
"Test responsive design" -> AI can't see actual breakpoints
After: AI sees and analyzes your actual user interface
Visual debugging - AI identifies exact pixel-level issues
Smart documentation - AI creates guides with real screenshots and annotations
Accessibility audits - AI measures actual contrast ratios and spacing
Responsive testing - AI captures and compares different screen sizes
Design analysis - AI evaluates visual hierarchy and user experience
Professional reports - AI creates detailed visual evidence for bugs and improvements
Features
Multi-format Support: PNG, JPEG, WebP, BMP with configurable quality
Flexible Capture: Full screen, specific windows, or custom regions
Privacy Protection: PII masking with OCR-based detection for emails, phone numbers, and credit cards
Security Controls: Path validation, rate limiting, audit logging, and configurable policies
Cross-platform: Linux (X11/Wayland), macOS, Windows with native APIs
Multi-monitor Support: Capture from specific displays in multi-monitor setups
Enterprise Security: Window exclusion, audit logging, rate limiting
AI-Optimized: Structured responses perfect for AI agent workflows
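To make the Privacy Protection feature more concrete, here is a minimal sketch of how text recovered by OCR could be scanned for emails, phone numbers, and credit card numbers. The regular expressions, the Luhn filter, and the names `findPii` and `PiiMatch` are assumptions for illustration; they are not taken from the server's source, which may detect PII differently.

```typescript
// Illustrative patterns only; the server's real detectors are not shown in this README.
type PiiKind = "email" | "phone" | "credit_card";

interface PiiMatch {
  kind: PiiKind;
  text: string;
  index: number;
}

const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
// North American style numbers such as 555-123-4567 or (555) 123-4567.
const PHONE_PATTERN = /(?:\(\d{3}\)\s?|\b\d{3}[-.\s])\d{3}[-.\s]\d{4}\b/g;
// 13 to 19 digits, optionally separated by single spaces or dashes.
const CARD_PATTERN = /\b\d(?:[ -]?\d){12,18}\b/g;

// Luhn checksum, used to discard digit runs that cannot be card numbers.
function passesLuhn(digits: string): boolean {
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48;
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return digits.length > 0 && sum % 10 === 0;
}

// Collect every match of one pattern, using a fresh RegExp so lastIndex is not shared.
function collectMatches(kind: PiiKind, pattern: RegExp, text: string): PiiMatch[] {
  const found: PiiMatch[] = [];
  const re = new RegExp(pattern.source, "g");
  let m: RegExpExecArray | null = re.exec(text);
  while (m !== null) {
    found.push({ kind, text: m[0], index: m.index });
    m = re.exec(text);
  }
  return found;
}

// Return all candidate PII spans in reading order.
function findPii(ocrText: string): PiiMatch[] {
  const cards = collectMatches("credit_card", CARD_PATTERN, ocrText).filter((c) =>
    passesLuhn(c.text.replace(/[ -]/g, ""))
  );
  return collectMatches("email", EMAIL_PATTERN, ocrText)
    .concat(collectMatches("phone", PHONE_PATTERN, ocrText), cards)
    .sort((a, b) => a.index - b.index);
}
```

In an image pipeline, each match would then be mapped back to the OCR bounding box of its text so that the region can be blurred or filled.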
Installation
NPM Installation
System Requirements
Linux:
X11:
imagemagick package (provides the import command)
Wayland:
grim package
macOS:
Built-in screencapture command (no additional dependencies)
Screen Recording permission required (System Preferences > Security & Privacy > Privacy > Screen Recording)
Windows:
No additional dependencies required
MCP Configuration
Add to your MCP settings file (e.g., ~/.kiro/settings/mcp.json or .kiro/settings/mcp.json):
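As a rough sketch of what such a settings entry could contain, the snippet below builds the JSON in TypeScript and prints it. The `mcpServers` layout with `command`, `args`, and `env` is the shape commonly used by MCP clients and is assumed here; the server key `screenshot`, the launch command, and the `<package-name>` placeholder are hypothetical and must be replaced with the values from the package's own install instructions.

```typescript
// Assumed settings shape: mcpServers -> server name -> command/args/env.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface McpSettings {
  mcpServers: Record<string, McpServerEntry>;
}

// Build a settings object for the screenshot server.
// The environment variable names come from the Security Configuration section.
function buildScreenshotSettings(
  command: string,
  args: string[],
  allowedDirs: string[]
): McpSettings {
  return {
    mcpServers: {
      // "screenshot" is a hypothetical server key chosen for this example.
      screenshot: {
        command,
        args,
        env: {
          SCREENSHOT_ALLOWED_DIRS: allowedDirs.join(","),
          SCREENSHOT_MAX_CAPTURES_PER_MIN: "60",
          SCREENSHOT_ENABLE_AUDIT_LOG: "true",
        },
      },
    },
  };
}

// "<package-name>" is a placeholder, not the real package name.
const exampleSettings = buildScreenshotSettings(
  "npx",
  ["-y", "<package-name>"],
  ["/home/user/screenshots"]
);
console.log(JSON.stringify(exampleSettings, null, 2));
```

The printed JSON can be pasted into the settings file once the placeholder has been replaced.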
5 Professional MCP Tools
The server exposes 5 MCP tools, purpose-built for AI agents to capture, analyze, and work with visual information:
1. screenshot_capture_full
Capture full screen or specific display.
Parameters:
display (string, optional): Display ID to capture (defaults to primary display)
format (string, optional): Image format - png, jpeg, webp, or bmp (default: png)
quality (number, optional): Compression quality 1-100 for lossy formats (default: 90)
savePath (string, optional): File path to save screenshot (returns base64 if not provided)
enablePIIMasking (boolean, optional): Enable PII detection and masking (default: false)
Example:
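A minimal sketch of how a client might assemble this call, assuming the standard MCP `tools/call` JSON-RPC envelope. The parameter names and defaults come from the list above; the helper `buildCaptureFullRequest` and the sample display ID and path are hypothetical.

```typescript
type ImageFormat = "png" | "jpeg" | "webp" | "bmp";

interface CaptureFullArgs {
  display?: string;
  format?: ImageFormat;
  quality?: number;
  savePath?: string;
  enablePIIMasking?: boolean;
}

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: CaptureFullArgs };
}

// Apply the documented defaults and reject an out-of-range quality before sending.
function buildCaptureFullRequest(id: number, args: CaptureFullArgs): ToolCallRequest {
  const quality = args.quality === undefined ? 90 : args.quality;
  if (!(quality >= 1 && quality <= 100)) {
    throw new RangeError("quality must be between 1 and 100");
  }
  const toolArgs: CaptureFullArgs = {
    format: args.format === undefined ? "png" : args.format,
    quality,
    enablePIIMasking: args.enablePIIMasking === true,
  };
  // Optional fields are only included when the caller supplied them.
  if (args.display !== undefined) toolArgs.display = args.display;
  if (args.savePath !== undefined) toolArgs.savePath = args.savePath;
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "screenshot_capture_full", arguments: toolArgs },
  };
}

// Sample values below are made up for illustration.
const exampleRequest = buildCaptureFullRequest(1, {
  format: "jpeg",
  quality: 80,
  savePath: "/home/user/screenshots/full.jpg",
});
console.log(JSON.stringify(exampleRequest, null, 2));
```

Omitting `savePath` leaves the image to be returned as base64, as described in the parameter list.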
Response:
2. screenshot_capture_window
Capture specific application window by ID or title pattern.
Parameters:
windowId (string, optional): Window identifier (use windowId or windowTitle)
windowTitle (string, optional): Window title pattern to match (use windowId or windowTitle)
includeFrame (boolean, optional): Include window frame and title bar (default: false)
format (string, optional): Image format (default: png)
quality (number, optional): Compression quality 1-100 (default: 90)
savePath (string, optional): File path to save screenshot
Example:
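Because the window is selected by either `windowId` or `windowTitle`, a client may want to check that one of them is present before calling the tool. The sketch below shows one way to do that. Preferring `windowId` when both are given, and treating the title pattern as a case-insensitive substring, are assumptions; the README does not state how the server resolves these cases.

```typescript
interface CaptureWindowArgs {
  windowId?: string;
  windowTitle?: string;
  includeFrame?: boolean;
  format?: "png" | "jpeg" | "webp" | "bmp";
  quality?: number;
  savePath?: string;
}

type WindowTarget = { by: "id"; value: string } | { by: "title"; value: string };

// Decide which selector the call will use; throws when neither is usable.
// Assumption: windowId wins when both selectors are supplied.
function resolveWindowTarget(args: CaptureWindowArgs): WindowTarget {
  if (args.windowId !== undefined && args.windowId !== "") {
    return { by: "id", value: args.windowId };
  }
  if (args.windowTitle !== undefined && args.windowTitle.trim() !== "") {
    return { by: "title", value: args.windowTitle };
  }
  throw new Error("Provide either windowId or windowTitle");
}

// Assumption: a title pattern matches as a case-insensitive substring.
function titleMatches(pattern: string, title: string): boolean {
  return title.toLowerCase().indexOf(pattern.toLowerCase()) !== -1;
}
```

For example, `resolveWindowTarget({ windowTitle: "Settings" })` selects by title, while an empty argument object is rejected before any request is sent.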
Response:
3. screenshot_capture_region
Capture specific rectangular region of the screen.
Parameters:
x (number, required): X coordinate of top-left corner
y (number, required): Y coordinate of top-left corner
width (number, required): Width of region in pixels
height (number, required): Height of region in pixels
format (string, optional): Image format (default: png)
quality (number, optional): Compression quality 1-100 (default: 90)
savePath (string, optional): File path to save screenshot
Example:
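A small sketch of client-side validation for the region parameters, following the rule given in the error table that coordinates must be non-negative and dimensions positive. The helper names `validateRegion` and `fitsDisplay` are hypothetical, and the display-bounds check is an extra assumption rather than documented server behavior.

```typescript
interface CaptureRegion {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Return a list of problems; an empty list means the region looks valid.
function validateRegion(region: CaptureRegion): string[] {
  const problems: string[] = [];
  const keys: Array<keyof CaptureRegion> = ["x", "y", "width", "height"];
  for (const key of keys) {
    if (!isFinite(region[key])) problems.push(key + " must be a finite number");
  }
  if (region.x < 0) problems.push("x must be non-negative");
  if (region.y < 0) problems.push("y must be non-negative");
  if (region.width <= 0) problems.push("width must be positive");
  if (region.height <= 0) problems.push("height must be positive");
  return problems;
}

// Optional extra check: does the region lie fully inside a display of the given size?
function fitsDisplay(region: CaptureRegion, displayWidth: number, displayHeight: number): boolean {
  return (
    validateRegion(region).length === 0 &&
    region.x + region.width <= displayWidth &&
    region.y + region.height <= displayHeight
  );
}
```

Running these checks before calling `screenshot_capture_region` avoids a round trip that would end in an invalid-region error.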
Response:
4. screenshot_list_displays
List all connected displays with resolution and position information.
Parameters: None
Example:
Response:
5. screenshot_list_windows
List all visible windows with title, process, and position information.
Parameters: None
Example:
Response:
Security Configuration
The server enforces security policies to control screenshot operations. Configure via environment variables or security policy file.
Environment Variables
SCREENSHOT_ALLOWED_DIRS: Comma-separated list of allowed directories for saving screenshots
SCREENSHOT_MAX_CAPTURES_PER_MIN: Maximum captures per minute (default: 60)
SCREENSHOT_ENABLE_AUDIT_LOG: Enable audit logging (default: true)
SCREENSHOT_BLOCKED_WINDOWS: Comma-separated list of window title patterns to exclude
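The following sketch shows one way these variables could be read into a typed configuration. The variable names and defaults are taken from the list above; the parsing rules (trimming list entries, treating only the string "false" as disabling audit logging, falling back to 60 on invalid numbers) and the names `parseScreenshotEnv` and `ScreenshotEnvConfig` are assumptions, not the server's documented behavior.

```typescript
interface ScreenshotEnvConfig {
  allowedDirs: string[];
  maxCapturesPerMinute: number;
  enableAuditLog: boolean;
  blockedWindows: string[];
}

type EnvMap = Record<string, string | undefined>;

// Split a comma-separated value, dropping empty entries and surrounding whitespace.
function splitList(value: string | undefined): string[] {
  if (value === undefined) return [];
  return value
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

function parseScreenshotEnv(env: EnvMap): ScreenshotEnvConfig {
  const rawMax = env["SCREENSHOT_MAX_CAPTURES_PER_MIN"];
  const parsedMax = rawMax === undefined ? NaN : Number(rawMax);
  const rawAudit = env["SCREENSHOT_ENABLE_AUDIT_LOG"];
  return {
    allowedDirs: splitList(env["SCREENSHOT_ALLOWED_DIRS"]),
    // Fall back to the documented default of 60 when unset or not a positive number.
    maxCapturesPerMinute: isFinite(parsedMax) && parsedMax > 0 ? Math.floor(parsedMax) : 60,
    // Documented default is true; assumption: only "false" turns it off.
    enableAuditLog: rawAudit === undefined ? true : rawAudit.trim().toLowerCase() !== "false",
    blockedWindows: splitList(env["SCREENSHOT_BLOCKED_WINDOWS"]),
  };
}
```

In a Node.js process the function would typically be called as `parseScreenshotEnv(process.env)`; taking the map as a parameter keeps it easy to test.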
Security Policy File
Create a security-policy.json file:
Load the policy when starting the server:
Error Handling
All tools return structured error responses with error codes and remediation suggestions.
Error Codes
| Code | Description | Remediation |
| PERMISSION_DENIED | Insufficient permissions to capture | Grant Screen Recording permission (macOS) or check user permissions |
| INVALID_PATH | File path outside allowed directories | Use a path within configured allowed directories |
| | Specified window does not exist | Use screenshot_list_windows to find available windows |
| | Specified display does not exist | Use screenshot_list_displays to find available displays |
| | Requested format not supported | Use png, jpeg, webp, or bmp |
| | Screenshot capture failed | Check permissions and try again |
| RATE_LIMIT_EXCEEDED | Too many captures in time window | Wait before making additional requests |
| | Invalid region coordinates or dimensions | Ensure coordinates are non-negative and dimensions are positive |
| | Insufficient memory for operation | Reduce capture size or close other applications |
| | Image encoding failed | Try different format or reduce quality |
| | File system operation failed | Check permissions and disk space |
Error Response Format
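A hedged sketch of how a client might model and react to a structured error. The field names `code`, `message`, and `remediation` are assumptions based on the description of "error codes and remediation suggestions"; the exact response shape is not shown here. Only the three codes that appear by name in this README are handled explicitly.

```typescript
// Assumed error shape; the real field names may differ.
interface ToolError {
  code: string;
  message: string;
  remediation?: string;
}

type ErrorAction = "retry_later" | "fix_request" | "fix_environment" | "report";

// Runtime check that an unknown value looks like the assumed error shape.
function isToolError(value: unknown): value is ToolError {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as { code?: unknown; message?: unknown };
  return typeof candidate.code === "string" && typeof candidate.message === "string";
}

// Map a code to a coarse client-side action.
function chooseAction(error: ToolError): ErrorAction {
  switch (error.code) {
    case "RATE_LIMIT_EXCEEDED":
      return "retry_later"; // wait, then repeat the same call
    case "INVALID_PATH":
      return "fix_request"; // choose a savePath inside an allowed directory
    case "PERMISSION_DENIED":
      return "fix_environment"; // e.g. grant Screen Recording permission on macOS
    default:
      return "report"; // surface the message and remediation to the user
  }
}
```

An agent can branch on the returned action instead of parsing free-form error text.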
Troubleshooting
Linux Issues
Problem: import: command not found or grim: command not found
Solution: Install required packages:
Problem: Black screen or empty captures
Solution: Check display server environment variables:
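As a sketch of what that check involves, the function below infers the display server from the conventional `XDG_SESSION_TYPE`, `WAYLAND_DISPLAY`, and `DISPLAY` variables and reports which capture command from the System Requirements section applies. The function names and the precedence (Wayland checked first, since X11 compatibility layers often set `DISPLAY` as well) are assumptions.

```typescript
type DisplayServer = "wayland" | "x11" | "unknown";

// Infer the display server from a map of environment variables.
function detectDisplayServer(env: Record<string, string | undefined>): DisplayServer {
  const sessionType = (env["XDG_SESSION_TYPE"] || "").toLowerCase();
  // Wayland is checked first because DISPLAY is often also set under Wayland.
  if (sessionType === "wayland" || env["WAYLAND_DISPLAY"]) return "wayland";
  if (sessionType === "x11" || env["DISPLAY"]) return "x11";
  return "unknown";
}

// Capture command needed per the System Requirements: grim on Wayland, import on X11.
function requiredCaptureCommand(server: DisplayServer): "grim" | "import" | null {
  if (server === "wayland") return "grim";
  if (server === "x11") return "import";
  return null;
}
```

An "unknown" result usually means the process was started without a graphical session, for example from a system service or over SSH, which is a common cause of empty captures.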
macOS Issues
Problem: PERMISSION_DENIED error
Solution: Grant Screen Recording permission:
Open System Preferences > Security & Privacy > Privacy
Select "Screen Recording" from the list
Add your terminal application or Node.js to the allowed list
Restart the application
Problem: Retina display captures are double resolution
Solution: This is expected behavior. Retina displays have 2x pixel density. Use the width and height from metadata to determine actual dimensions.
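The relationship between captured pixels and on-screen size is simple arithmetic. The sketch below converts a pixel size to logical size for a given scale factor; the helper name `toLogicalSize` is hypothetical, and the scale factor of 2 reflects the 2x density mentioned above.

```typescript
interface PixelSize {
  width: number;
  height: number;
}

// Convert captured pixel dimensions to logical (point) dimensions.
function toLogicalSize(pixels: PixelSize, scaleFactor: number): PixelSize {
  if (!(scaleFactor > 0)) {
    throw new RangeError("scaleFactor must be positive");
  }
  return {
    width: Math.round(pixels.width / scaleFactor),
    height: Math.round(pixels.height / scaleFactor),
  };
}

// A 2880x1800 capture at 2x density corresponds to a 1440x900 logical screen.
const logical = toLogicalSize({ width: 2880, height: 1800 }, 2);
console.log(logical.width + "x" + logical.height); // prints "1440x900"
```

The same factor applies in reverse when converting logical coordinates into pixel coordinates for a region capture.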
Windows Issues
Problem: Capture fails with access denied
Solution: Run the application with administrator privileges or check Windows Defender settings.
Problem: Multi-monitor captures show wrong display
Solution: Use screenshot_list_displays to get correct display IDs and positions.
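Once the display list is available, picking the right display for a given screen coordinate can be done with a simple containment test. The `DisplayInfo` shape below (an `id` plus position and resolution) is an assumption modeled on the "resolution and position information" the tool is described as returning; the actual field names may differ.

```typescript
// Assumed shape of one entry returned by screenshot_list_displays.
interface DisplayInfo {
  id: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

// Return the display whose bounds contain the point, or undefined if none does.
function findDisplayAt(displays: DisplayInfo[], px: number, py: number): DisplayInfo | undefined {
  for (const display of displays) {
    const insideX = px >= display.x && px < display.x + display.width;
    const insideY = py >= display.y && py < display.y + display.height;
    if (insideX && insideY) return display;
  }
  return undefined;
}

// Hypothetical two-monitor layout: a 1920x1080 display with a 2560x1440 display to its right.
const sampleDisplays: DisplayInfo[] = [
  { id: "0", x: 0, y: 0, width: 1920, height: 1080 },
  { id: "1", x: 1920, y: 0, width: 2560, height: 1440 },
];
const hit = findDisplayAt(sampleDisplays, 2000, 100);
console.log(hit ? hit.id : "none"); // prints "1"
```

The resulting `id` can then be passed as the `display` parameter of `screenshot_capture_full`.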
General Issues
Problem: RATE_LIMIT_EXCEEDED error
Solution: The server limits captures to prevent abuse. Wait 60 seconds or adjust maxCapturesPerMinute in security policy.
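To illustrate how a per-minute limit like this behaves, here is a minimal sliding-window limiter. It is a generic sketch of the technique, not the server's implementation; the class name and methods are hypothetical, and the server may use a different algorithm such as fixed windows.

```typescript
// Sliding-window limiter: allows at most maxPerWindow events in any windowMs span.
class SlidingWindowRateLimiter {
  private readonly maxPerWindow: number;
  private readonly windowMs: number;
  private timestamps: number[] = [];

  constructor(maxPerWindow: number, windowMs: number = 60000) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
  }

  // Drop timestamps that have aged out of the window ending at nowMs.
  private prune(nowMs: number): void {
    const cutoff = nowMs - this.windowMs;
    while (this.timestamps.length > 0) {
      const oldest = this.timestamps[0];
      if (oldest === undefined || oldest > cutoff) break;
      this.timestamps.shift();
    }
  }

  // Record an event if the limit allows it; returns whether it was accepted.
  tryAcquire(nowMs: number): boolean {
    this.prune(nowMs);
    if (this.timestamps.length >= this.maxPerWindow) return false;
    this.timestamps.push(nowMs);
    return true;
  }

  // Milliseconds until the next event would be accepted (0 if allowed now).
  retryAfterMs(nowMs: number): number {
    this.prune(nowMs);
    if (this.timestamps.length < this.maxPerWindow) return 0;
    const oldest = this.timestamps[0];
    return oldest === undefined ? 0 : oldest + this.windowMs - nowMs;
  }
}
```

A client that batches captures can use the same idea locally to pace its requests, passing `Date.now()` as the timestamp, instead of running into the server's limit.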
Problem: INVALID_PATH error when saving
Solution: Ensure the save path is within allowed directories configured in security policy.
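The sketch below illustrates the kind of check that produces this error: normalize the save path, then test whether it sits under one of the allowed directories. It handles POSIX-style absolute paths only and does not resolve symlinks; the function names are hypothetical and the server's actual validation may be stricter.

```typescript
// Collapse ".", "..", and repeated slashes in an absolute POSIX-style path.
function normalizePosixPath(path: string): string {
  const parts: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      parts.pop();
      continue;
    }
    parts.push(segment);
  }
  return "/" + parts.join("/");
}

// True when savePath, after normalization, is inside one of allowedDirs.
function isWithinAllowedDirs(savePath: string, allowedDirs: string[]): boolean {
  // Relative paths are rejected outright in this sketch.
  if (savePath.charAt(0) !== "/") return false;
  const target = normalizePosixPath(savePath);
  for (const dir of allowedDirs) {
    if (dir.charAt(0) !== "/") continue;
    const base = normalizePosixPath(dir);
    // Compare with a trailing slash so "/shots-evil" does not match "/shots".
    const prefix = base === "/" ? "/" : base + "/";
    if (target === base || target.indexOf(prefix) === 0) return true;
  }
  return false;
}
```

Note that a path such as `/home/user/screenshots/../secrets/a.png` normalizes to a location outside `/home/user/screenshots` and is therefore rejected, which is the traversal case this kind of check exists to catch.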
Problem: PII masking not working
Solution:
Ensure tesseract.js is properly installed
Check that the eng.traineddata language file is available
PII masking requires OCR, which may be slow on large images
Problem: Large file sizes
Solution:
Use JPEG format with lower quality (60-80) for smaller files
Use WebP format for best compression
Reduce capture region size if possible
Problem: Out of memory errors
Solution:
Capture smaller regions instead of full screen
Reduce quality settings
Close other applications to free memory
Use streaming for very large captures
Programmatic Usage
TypeScript/JavaScript
Direct Capture Engine Usage
Development
This package is part of the AI Capabilities Suite monorepo.
Build
Test
Project Structure
Contributing
Contributions are welcome! Please ensure:
All tests pass (npm test)
Code follows TypeScript best practices
New features include tests and documentation
Security considerations are addressed
License
MIT
Support
For issues and questions:
GitHub Issues: Create an issue
Documentation: See TESTING.md for testing guide
Security: Report security issues privately to security@example.com