PuppeteerMCP Server
Developing website UIs with MCP just got a lot easier. PuppeteerMCP is a Model Context Protocol (MCP) server that provides screenshot tools for AI assistants using Puppeteer. It integrates with MCP-compatible hosts like Cursor so that AI agents can capture and analyze web page screenshots, console logs, errors, and warnings.
Overview
PuppeteerMCP implements the Model Context Protocol to bridge AI assistants with web page screenshot capabilities. When working with AI-assisted development, this server allows AI agents to:
- Navigate to any URL via tools
- Capture screenshots at multiple viewport breakpoints
- Return visual feedback with structured metadata
- Support both headless and headful browser modes
This enables more effective AI-assisted development by providing visual context through the standardized Model Context Protocol.
Features
Current Features
- ✅ MCP server implementation with TypeScript SDK
- ✅ Screenshot tools for AI agents with multi-breakpoint capture
- ✅ stdio transport for seamless Cursor integration
- ✅ Multi-breakpoint screenshots (mobile, tablet, desktop)
- ✅ Automatic page height detection for full-page capture
- ✅ Structured tool responses with detailed metadata
- ✅ Error reporting - JavaScript errors, console logs, network issues
- ✅ Performance optimization - JPEG compression and width limiting
- ✅ Page interaction capabilities - Click, type, scroll, hover, form filling, waiting
In Progress
- ✅ Completed - Error reporting and debugging features
Future Developments
🚀 High Priority Features
- 📋 Element-specific screenshots - Target CSS selectors for component-level captures
- 📋 Performance monitoring - Lighthouse scores, Core Web Vitals, bundle analysis
- 📋 Accessibility testing - WCAG violations, color contrast, keyboard navigation
🎯 Advanced Testing & Analysis
- 📋 Visual regression testing - Compare screenshots against baselines
- 📋 Cross-browser testing - Firefox, Safari, Edge screenshot comparison
- 📋 Content extraction - Pull text, links, SEO data for analysis
- 📋 Form validation testing - Auto-fill and validate form behavior
- 📋 Animation capture - Record CSS animations and transitions
- 📋 Multi-step user flows - Test complete user journeys
🛠️ Development Workflow Integration
- 📋 Local development watching - Auto-screenshot on file changes
- 📋 Git integration - Commit screenshots with code changes
- 📋 Hot reload capture - Screenshot after development server updates
- 📋 API-driven testing - Screenshot pages with different data sets
- 📋 Database integration - Test with real/mock data scenarios
📱 Device & Platform Testing
- 📋 Real device emulation - iPhone, Android, tablet testing
- 📋 Mobile-specific features - Touch gestures, device orientation
- 📋 Progressive Web App testing - Offline states, service workers
🤖 AI-Powered Analysis
- 📋 Design review automation - AI analysis of UI/UX patterns
- 📋 Code quality insights - Spot code smells through visual patterns
- 📋 Automated bug detection - Visual anomaly detection
- 📋 Performance recommendations - AI-driven optimization suggestions
Page Interaction Capabilities
✅ NEW: Automated Page Actions
The screenshot tool now supports executing a sequence of page interactions before capturing screenshots, enabling:
- 🎯 Form Testing: Fill forms, select dropdowns, check boxes
- 🖱️ User Interactions: Click buttons, hover elements, scroll to sections
- ⏱️ Wait Conditions: Wait for elements to appear or specific durations
- 🧭 Navigation: Navigate between pages or reload current page
- 📝 Input Management: Type text, clear fields, select options
Available Action Types:
- click - Click an element by CSS selector
- type - Type text into an input field
- clear - Clear an input field's value
- scroll - Scroll to coordinates or element
- hover - Hover over an element
- select - Select option from dropdown
- wait - Wait for specified duration
- waitForElement - Wait for element to appear
- navigate - Navigate to a different URL
Example Usage:
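As a minimal sketch, an action sequence could be modeled as a list of objects tagged by the action types listed above. Only the `type` values come from this README; every other field name (`selector`, `text`, `value`, `duration`, `timeout`, `url`, `x`, `y`) and the `describeAction` helper are assumptions made for illustration, not the tool's confirmed schema.

```typescript
// Hypothetical action shapes: the `type` values come from the README's list,
// but every other field name is an assumption made for illustration.
type PageAction =
  | { type: "click"; selector: string }
  | { type: "type"; selector: string; text: string }
  | { type: "clear"; selector: string }
  | { type: "scroll"; selector?: string; x?: number; y?: number }
  | { type: "hover"; selector: string }
  | { type: "select"; selector: string; value: string }
  | { type: "wait"; duration: number }
  | { type: "waitForElement"; selector: string; timeout?: number }
  | { type: "navigate"; url: string };

// Fill in and submit a login form, then wait for the dashboard to render.
const loginFlow: PageAction[] = [
  { type: "type", selector: "#email", text: "user@example.com" },
  { type: "type", selector: "#password", text: "not-a-real-password" },
  { type: "click", selector: "button[type=submit]" },
  { type: "waitForElement", selector: ".dashboard", timeout: 5000 },
];

// Produce a one-line, human-readable description of an action.
function describeAction(action: PageAction): string {
  switch (action.type) {
    case "click":
    case "clear":
    case "hover":
      return `${action.type} ${action.selector}`;
    case "type":
      return `type "${action.text}" into ${action.selector}`;
    case "scroll":
      return action.selector
        ? `scroll to ${action.selector}`
        : `scroll to (${action.x ?? 0}, ${action.y ?? 0})`;
    case "select":
      return `select "${action.value}" in ${action.selector}`;
    case "wait":
      return `wait ${action.duration}ms`;
    case "waitForElement":
      return `wait for ${action.selector}`;
    case "navigate":
      return `navigate to ${action.url}`;
  }
}
```

Modeling actions as a tagged union lets the compiler check that each action carries the fields its type needs before the sequence is sent to the tool.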
MCP Tool Specification
screenshot
Captures screenshots of web pages at one or more viewport breakpoints using Puppeteer.
Tool Schema
Default Breakpoints
If no breakpoints are specified, the tool uses these standard responsive breakpoints:
- Mobile: 375px width (height auto-detected)
- Tablet: 768px width (height auto-detected)
- Desktop: 1280px width (height auto-detected)
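The fallback described above can be sketched as follows. The three widths come from the list; the `Breakpoint` shape and the `resolveBreakpoints` helper are hypothetical names, not the server's actual code.

```typescript
// Widths come from the README; the `Breakpoint` shape and the
// `resolveBreakpoints` helper are hypothetical.
interface Breakpoint {
  name: string;
  width: number; // CSS pixels; height is detected from the page at capture time
}

const DEFAULT_BREAKPOINTS: Breakpoint[] = [
  { name: "mobile", width: 375 },
  { name: "tablet", width: 768 },
  { name: "desktop", width: 1280 },
];

// Fall back to the defaults when the caller passes nothing (or an empty list).
function resolveBreakpoints(requested?: Breakpoint[]): Breakpoint[] {
  return requested && requested.length > 0 ? requested : DEFAULT_BREAKPOINTS;
}
```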
Tool Response
Error Reporting & Debugging
✅ NEW: Comprehensive Error Monitoring
The screenshot tool now captures and reports page activity alongside each screenshot, which makes it well suited to debugging web applications:
What Gets Captured:
- 🟥 JavaScript Errors: Runtime errors with stack traces, line numbers, and sources
- 🟨 Console Messages: All console.log(), console.warn(), console.error() output
- 🟦 Network Issues: Failed requests (404s, 500s), CORS violations, timeouts
- 🟪 Security Problems: CORS policy violations, blocked requests
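One plausible way to collect this activity, not necessarily how this server does it, is to subscribe to Puppeteer's `console`, `pageerror`, and `requestfailed` page events. The sketch below uses small structural stand-ins (`PageLike`, `ConsoleMessageLike`, `RequestLike`) so it type-checks without the puppeteer package; those names, the `ActivityEntry` shape, and `collectPageActivity` are assumptions for illustration.

```typescript
// Minimal structural stand-ins for the Puppeteer objects used below, so the
// sketch type-checks without the puppeteer package installed.
interface ConsoleMessageLike {
  type(): string;
  text(): string;
}
interface RequestLike {
  url(): string;
  failure(): { errorText: string } | null;
}
interface PageLike {
  on(event: "console", handler: (msg: ConsoleMessageLike) => void): unknown;
  on(event: "pageerror", handler: (err: Error) => void): unknown;
  on(event: "requestfailed", handler: (req: RequestLike) => void): unknown;
}

// Hypothetical report entry; the real tool's response format may differ.
type ActivityEntry =
  | { kind: "console"; level: string; text: string }
  | { kind: "javascript-error"; message: string; stack: string }
  | { kind: "network-error"; url: string; reason: string };

// Register listeners and return the array they append to. Call this before
// navigating so that events fired during page load are not missed.
function collectPageActivity(page: PageLike): ActivityEntry[] {
  const entries: ActivityEntry[] = [];
  page.on("console", (msg) => {
    entries.push({ kind: "console", level: msg.type(), text: msg.text() });
  });
  page.on("pageerror", (err) => {
    entries.push({
      kind: "javascript-error",
      message: err.message,
      stack: err.stack ?? "",
    });
  });
  page.on("requestfailed", (req) => {
    entries.push({
      kind: "network-error",
      url: req.url(),
      reason: req.failure()?.errorText ?? "unknown",
    });
  });
  return entries;
}
```

Note that `requestfailed` only covers network-level failures. Responses with HTTP error statuses such as 404 or 500 complete normally from Puppeteer's point of view, so reporting them would require inspecting response status codes as well.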
Error Types:
Summary Statistics:
- Total count of errors, warnings, and console logs
- Quick flags for JavaScript and network error presence
- Instant overview of page health
How It Appears in Cursor:
When you take a screenshot, Cursor will show:
- Visual Screenshot - The actual page capture
- Activity Summary - "📊 Page Activity Detected: • 2 error(s) • 1 warning(s) • 5 console log(s)"
- Detailed Report - Grouped by error type with full context
This makes the screenshot tool useful for debugging, development, and code review: you can see what is happening on the page while viewing how it looks.
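The summary statistics and the activity line shown in Cursor can be sketched as a small aggregation step. The output string follows the example quoted above; the `PageEvent` and `ActivitySummary` shapes and both function names are assumptions, not the server's actual types.

```typescript
// Hypothetical event shape; the real tool's report format may differ.
interface PageEvent {
  severity: "error" | "warning" | "log";
  source: "javascript" | "network" | "console";
}

interface ActivitySummary {
  errors: number;
  warnings: number;
  logs: number;
  hasJavaScriptErrors: boolean;
  hasNetworkErrors: boolean;
}

// Count events per severity and set the quick flags for error sources.
function summarize(events: PageEvent[]): ActivitySummary {
  const count = (severity: PageEvent["severity"]): number =>
    events.filter((e) => e.severity === severity).length;
  const hasErrorFrom = (source: PageEvent["source"]): boolean =>
    events.some((e) => e.severity === "error" && e.source === source);
  return {
    errors: count("error"),
    warnings: count("warning"),
    logs: count("log"),
    hasJavaScriptErrors: hasErrorFrom("javascript"),
    hasNetworkErrors: hasErrorFrom("network"),
  };
}

// Render the one-line activity summary in the format quoted in the README.
function formatSummary(s: ActivitySummary): string {
  return (
    "📊 Page Activity Detected: " +
    `• ${s.errors} error(s) • ${s.warnings} warning(s) • ${s.logs} console log(s)`
  );
}
```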
Installation
Prerequisites
- Node.js 18+
- npm or yarn
- Chrome/Chromium browser (for Puppeteer)
Setup
Cursor Integration
To use this MCP server with Cursor:
1. Build the Server
2. Configure Cursor
Add the MCP server to your Cursor configuration. The exact location depends on your OS:
macOS: ~/.cursor/mcp.json
Windows: %APPDATA%\Cursor\mcp.json
Linux: ~/.config/cursor/mcp.json
Important: Use the absolute path to your built JavaScript file.
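For reference, a Cursor MCP configuration is a JSON object with an `mcpServers` map whose entries give a `command` and its `args`. The sketch below generates such an entry and rejects relative paths. The server name `puppeteer-mcp`, the placeholder path, and the helper names are assumptions; substitute the absolute path to your own build/index.js.

```typescript
// Shape of an entry in Cursor's mcp.json ("mcpServers" keyed by server name).
interface McpServerEntry {
  command: string;
  args: string[];
}
interface CursorMcpConfig {
  mcpServers: { [name: string]: McpServerEntry };
}

// True for POSIX absolute paths ("/...") and Windows drive paths ("C:\..." or "C:/...").
function isAbsolutePath(p: string): boolean {
  return /^(\/|[A-Za-z]:[\\/])/.test(p);
}

// Build the JSON text for mcp.json, refusing relative entry paths.
function buildCursorConfig(serverName: string, entryPath: string): string {
  if (!isAbsolutePath(entryPath)) {
    throw new Error(`Expected an absolute path, got: ${entryPath}`);
  }
  const config: CursorMcpConfig = {
    mcpServers: { [serverName]: { command: "node", args: [entryPath] } },
  };
  return JSON.stringify(config, null, 2);
}

// "puppeteer-mcp" and the path are placeholders; substitute your own.
const exampleConfig = buildCursorConfig(
  "puppeteer-mcp",
  "/absolute/path/to/build/index.js"
);
```

The string in `exampleConfig` is what would go into the mcp.json file for your platform.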
3. Restart Cursor
Restart Cursor to load the MCP server. You should see the screenshot tool available in Cursor's AI interface.
4. Usage in Cursor
You can now ask Cursor to take screenshots and they will appear as inline images in the chat:
Basic Screenshots:
- "Take a screenshot of https://example.com"
- "Capture mobile and desktop screenshots of this website"
- "Show me how this page looks on different screen sizes"
- "Take a high-quality PNG screenshot of this website"
- "Get optimized JPEG screenshots for faster loading"
✅ NEW - Error Debugging:
- "Take a screenshot of my app and show me any JavaScript errors"
- "Debug this webpage - capture screenshots and check for console errors"
- "Screenshot this site and tell me about any network failures"
- "Show me the page visually and report any CORS issues"
- "Take screenshots and analyze all console output for debugging"
The screenshots appear directly in Cursor's chat interface together with the error report, so multimodal AI models (GPT-4o, Claude 3, Gemini Pro) can analyze them visually and give feedback on both design/layout and technical issues such as JavaScript errors, failed network requests, and console warnings.
Development
Project Structure
Scripts
- npm run build: Build TypeScript to JavaScript
- npm run watch: Build in watch mode during development
- npm run test: Run test suite (when implemented)
- npm run lint: Run ESLint (when configured)
Testing with MCP Inspector
The MCP inspector is the primary tool for testing MCP servers:
This opens a web interface where you can:
- View available tools
- Test tool calls with different parameters
- Inspect tool responses
- Debug server behavior
Architecture
Core Components
- MCP Server: Main server using @modelcontextprotocol/sdk
- stdio Transport: Communication layer for Cursor integration
- Screenshot Tools: Tool implementations using Puppeteer
- Puppeteer Service: Browser automation and screenshot capture
Communication Flow
Key Differences from HTTP APIs
Aspect | HTTP API | MCP Server |
---|---|---|
Communication | HTTP requests/responses | stdio + JSON-RPC 2.0 |
Discovery | Documentation | Tool schema registration |
Integration | Manual API calls | Native MCP protocol support |
AI Usage | Requires custom code | Direct tool calling |
Transport | Network-based | Process-based (subprocess) |
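To make the "stdio + JSON-RPC 2.0" row concrete, here is a minimal sketch of the message a host writes to the server's stdin when it calls a tool. The `tools/call` method and the `name`/`arguments` parameters follow the MCP specification, and the stdio transport delimits messages with newlines. The `url` argument name and both helper functions are assumptions for illustration.

```typescript
// A JSON-RPC 2.0 request asking an MCP server to run one of its tools.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The stdio transport sends one JSON message per line, so the serialized
// message must not contain raw newlines (JSON.stringify without indentation
// escapes any newlines inside string values).
function frameForStdio(message: ToolCallRequest): string {
  return JSON.stringify(message) + "\n";
}

// The `url` argument name is an assumption about the screenshot tool's schema.
const exampleLine = frameForStdio(
  buildToolCall(1, "screenshot", { url: "https://example.com" })
);
```

In practice the host (for example Cursor) and the TypeScript SDK build and parse these messages for you; the sketch only shows what travels over the pipe.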
Configuration
Environment Variables
- PUPPETEER_EXECUTABLE_PATH: Custom Chrome/Chromium path
- NODE_ENV: Environment mode (development/production)
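A minimal sketch of how a custom browser path could be turned into launch settings, assuming the server passes it to Puppeteer's `executablePath` launch option. The `LaunchSettings` shape and `launchSettingsFromEnv` helper are hypothetical, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// Subset of browser launch settings; `executablePath` and `headless` mirror
// Puppeteer launch options, the interface itself is hypothetical.
interface LaunchSettings {
  headless: boolean;
  executablePath?: string;
}

// Use PUPPETEER_EXECUTABLE_PATH when it is set to a non-blank value;
// otherwise leave the path unset so the default browser is used.
function launchSettingsFromEnv(
  env: { [key: string]: string | undefined },
  headless: boolean = true
): LaunchSettings {
  const settings: LaunchSettings = { headless };
  const customPath = env["PUPPETEER_EXECUTABLE_PATH"];
  if (customPath && customPath.trim() !== "") {
    settings.executablePath = customPath;
  }
  return settings;
}
```

In a Node.js process you would call this with `process.env` as the first argument.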
Tool Configuration
Tools can be configured through their input parameters:
- Viewport breakpoints
- Browser mode (headless/headful)
- Navigation timeouts
- Wait conditions
Error Handling
The server uses MCP's structured error handling:
- InvalidParams: Invalid tool parameters
- InternalError: Server-side errors (browser failures, timeouts)
- MethodNotFound: Unknown tool names
All errors include descriptive messages for debugging.
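These three names correspond to standard JSON-RPC 2.0 error codes. The sketch below shows how a failed call might be classified and turned into an error response. The numeric codes are the JSON-RPC ones; the `classifyCall` check, the `url` argument name, and the helper names are assumptions rather than the server's actual logic.

```typescript
// Standard JSON-RPC 2.0 error codes that the three error names map to.
const ErrorCode = {
  MethodNotFound: -32601,
  InvalidParams: -32602,
  InternalError: -32603,
} as const;

type ErrorName = keyof typeof ErrorCode;

interface JsonRpcErrorResponse {
  jsonrpc: "2.0";
  id: number | string | null;
  error: { code: number; message: string };
}

function errorResponse(
  id: number | string | null,
  name: ErrorName,
  message: string
): JsonRpcErrorResponse {
  return { jsonrpc: "2.0", id, error: { code: ErrorCode[name], message } };
}

// Hypothetical pre-flight check: returns the error name to report, or null
// when the call looks valid. The `url` argument name is an assumption.
function classifyCall(toolName: string, args: { url?: unknown }): ErrorName | null {
  if (toolName !== "screenshot") return "MethodNotFound";
  if (typeof args.url !== "string" || args.url.length === 0) return "InvalidParams";
  return null;
}
```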
Security Considerations
- URL validation to prevent malicious requests
- Timeout controls to prevent hanging processes
- Browser sandboxing through Puppeteer
- Input sanitization via JSON schema validation
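As an illustration of the URL validation item, one common approach is to parse the input with the standard `URL` class and allow only `http:` and `https:` schemes, which rejects inputs such as `file:` or `javascript:` URLs. This is a sketch under that assumption; the README does not specify the server's actual rules, and `validateTargetUrl` is a hypothetical name.

```typescript
// Parse the input and allow only web URLs. Throws with a descriptive
// message when the input is malformed or uses another scheme.
function validateTargetUrl(input: string): URL {
  let parsed: URL;
  try {
    parsed = new URL(input);
  } catch {
    throw new Error(`Invalid URL: ${input}`);
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported URL scheme: ${parsed.protocol}`);
  }
  return parsed;
}
```

A stricter deployment might also block private or loopback addresses, but that would get in the way of screenshotting a local development server, so it is left out here.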
Contributing
- Fork the repository
- Create a feature branch
- Make your changes following MCP patterns
- Test with MCP inspector
- Test integration with Cursor
- Submit a pull request
Troubleshooting
Common Issues
Server not appearing in Cursor:
- Check the absolute path in your Cursor configuration
- Ensure the build/ directory exists and contains index.js
- Restart Cursor after configuration changes
Tool calls failing:
- Test the server with MCP inspector first
- Check console output for error messages
- Verify Puppeteer can launch browsers on your system
Browser launch failures:
- Install Chrome/Chromium if not present
- Set PUPPETEER_EXECUTABLE_PATH if using custom browser location
- Check for missing dependencies on Linux systems
Debugging
- Test with MCP Inspector: Primary debugging tool
- Check Console Output: Server logs errors to stderr
- Verify Configuration: Ensure Cursor config uses absolute paths
- Browser Testing: Test Puppeteer separately if needed
License
MIT License - see LICENSE file for details.
Support
For issues and questions:
- Create an issue in the GitHub repository
- Check existing documentation and examples
- Test with MCP inspector before reporting integration issues
Standard Viewport Breakpoints
Name | Width | Description |
---|---|---|
Mobile | 375px | Typical smartphone width |
Tablet | 768px | Standard tablet width |
Desktop | 1280px | Common desktop width |
All screenshots automatically detect page height for full content capture.
Image Optimization
To ensure screenshots work well with Cursor's chat interface and don't exceed token limits:
Automatic Optimization
- Format: JPEG by default (80% quality) for smaller file sizes
- Width Limiting: Images wider than 1280px are automatically clipped
- Full Page Capture: Height is always full page content
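The defaults above can be sketched as a function that resolves a request into capture settings. The JPEG default, the 80% quality, and the 1280px limit come from the list; the `ImageRequest` and `CaptureSettings` shapes and the function name are assumptions. `quality` is left out for PNG because Puppeteer's screenshot option does not accept it for that format.

```typescript
type ImageFormat = "jpeg" | "png";

// Hypothetical request shape; field names are assumptions.
interface ImageRequest {
  format?: ImageFormat;
  quality?: number; // 0-100, JPEG only
  width: number; // requested viewport width in CSS pixels
}

interface CaptureSettings {
  type: ImageFormat;
  quality?: number;
  fullPage: true; // height always covers the full page content
  width: number;
}

const MAX_WIDTH = 1280;
const DEFAULT_JPEG_QUALITY = 80;

// Apply the documented defaults: JPEG at 80% quality, width capped at 1280px.
function resolveCaptureSettings(req: ImageRequest): CaptureSettings {
  const type: ImageFormat = req.format ?? "jpeg";
  const width = Math.min(req.width, MAX_WIDTH);
  if (type === "png") {
    return { type, fullPage: true, width };
  }
  return {
    type,
    quality: req.quality ?? DEFAULT_JPEG_QUALITY,
    fullPage: true,
    width,
  };
}
```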
Custom Options
Size Considerations
- Large base64 images can hit Cursor's 10MB message limit
- JPEG format recommended for most use cases
- PNG only for cases requiring transparency or pixel-perfect quality