Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@CUA MCP Server On my-sandbox, open Chrome and search for 'MCP servers' on GitHub"
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
CUA MCP Server
An agentic Model Context Protocol (MCP) server for CUA Cloud - delegate desktop automation tasks to an autonomous vision-based agent. Images never leave the server; only text summaries are returned.
Production URL: https://cua-mcp-server.vercel.app/mcp
What is CUA?
CUA (Computer Use Agent) provides cloud-based virtual machine sandboxes that AI agents can control. This MCP server exposes CUA's capabilities through a clean task-delegation API:
Create and manage VMs (Linux, Windows, macOS)
Delegate tasks - "Open Chrome and navigate to google.com"
Get text summaries - No images in your context window
Query screen state - Vision-based descriptions without taking action
Architecture
Project Structure
Available Tools (9 total)
Sandbox Management (5 tools)
Tool | Description |
| List all CUA cloud sandboxes with their current status |
| Get details of a specific sandbox including API URLs |
| Start a stopped sandbox |
| Stop a running sandbox |
| Restart a sandbox |
Note: Create and delete sandboxes via the CUA Dashboard - the Cloud API doesn't expose these operations.
Agentic Tools (4 tools)
Tool | Description |
describe_screen | Get a text description of current screen state using vision AI. No actions taken. |
| Execute a computer task autonomously. Returns immediately with task_id for polling. |
| Poll progress of running tasks. Returns current step, last action, and reasoning. |
| Retrieve results of a previously executed task by ID. |
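The task tool returns a task_id immediately, so the caller has to poll for progress. A minimal sketch of that loop follows. The `fetch_progress` callable, the `status` field, and the terminal state names are assumptions made for illustration, not the server's documented response format.

```python
import time
from typing import Callable, Dict

# Hypothetical terminal states; the real server may use different names.
TERMINAL_STATES = {"completed", "failed", "timeout"}


def wait_for_task(
    fetch_progress: Callable[[str], Dict],
    task_id: str,
    interval_s: float = 5.0,
    max_wait_s: float = 750.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll fetch_progress(task_id) until a terminal state or the wait budget is spent."""
    waited = 0.0
    while True:
        progress = fetch_progress(task_id)
        if progress.get("status") in TERMINAL_STATES:
            return progress
        if waited >= max_wait_s:
            raise TimeoutError(f"task {task_id} still running after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop testable without real delays. The 750-second default mirrors the documented default task timeout.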
Quick Start
1. Get a CUA API Key
Go to cua.ai/signin
Navigate to Dashboard > API Keys > New API Key
Copy your API key (starts with sk_cua-api01_...)
2. Configure Claude Code
Add to your ~/.claude.json:
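As a rough sketch, the entry could be generated with a few lines of Python and pasted into the file. The server name `cua`, the `type`/`url`/`headers` layout, and passing the key through the X-CUA-API-Key header are assumptions based on common remote MCP configurations; check them against your Claude Code version.

```python
import json


def claude_mcp_entry(
    api_key: str,
    url: str = "https://cua-mcp-server.vercel.app/mcp",
) -> dict:
    """Build an mcpServers entry for a remote HTTP MCP server (assumed layout)."""
    return {
        "mcpServers": {
            "cua": {  # hypothetical server name; any label should work
                "type": "http",
                "url": url,
                "headers": {"X-CUA-API-Key": api_key},
            }
        }
    }


# Replace the placeholder with your own key before merging into ~/.claude.json.
print(json.dumps(claude_mcp_entry("sk_cua-api01_..."), indent=2))
```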
3. Use with Claude Code
Usage Examples
Automate a Web Task
Check Screen State
Ask Specific Questions
Self-Hosting
Prerequisites
Vercel account with Pro plan (for 800s function timeout)
Vercel Blob storage
Anthropic API key
Deploy Your Own Instance
Environment Variables
Variable | Description | Required |
CUA_API_KEY | Your CUA Cloud API key | Yes |
ANTHROPIC_API_KEY | Anthropic API key for vision processing | Yes |
BLOB_READ_WRITE_TOKEN | Vercel Blob token (auto-added) | Yes |
| Custom API base URL (default: https://api.cua.ai) | No |
| Model to use: | No |
Setting Up Vercel Blob
Go to your Vercel project dashboard
Navigate to Storage → Create → Blob
The BLOB_READ_WRITE_TOKEN will be automatically added
Pass API Key Per-Request
If you don't want to store the CUA API key on the server:
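One way to do this is to send the key in the X-CUA-API-Key header on every call. The sketch below only builds the request object and leaves sending it to the caller. The helper name `build_mcp_request` is hypothetical, and the `Accept` header follows the usual expectation for MCP over HTTP rather than anything this server documents.

```python
import json
import urllib.request

MCP_URL = "https://cua-mcp-server.vercel.app/mcp"


def build_mcp_request(
    payload: dict, cua_api_key: str, url: str = MCP_URL
) -> urllib.request.Request:
    """Prepare a POST to the MCP endpoint with the CUA key attached per request."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # MCP HTTP clients normally accept both plain JSON and SSE responses.
            "Accept": "application/json, text/event-stream",
            "X-CUA-API-Key": cua_api_key,
        },
        method="POST",
    )


# tools/list is a standard MCP method; sending is left to the caller, e.g.
# urllib.request.urlopen(req, timeout=30)
req = build_mcp_request({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}, "sk_cua-api01_...")
```

HTTP header names are case-insensitive, so it is harmless that `urllib` normalizes their capitalization internally.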
API Reference
MCP Endpoint
URL: POST /mcp
Content-Type: application/json
Example: Run Task
Response:
Example: Describe Screen
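MCP tool invocations are JSON-RPC 2.0 messages using the `tools/call` method, so a describe_screen request body might be assembled as in the sketch below. The argument names `sandbox_name` and `focus` are hypothetical; consult the tool's input schema (returned by `tools/list`) for the real ones.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing JSON-RPC ids


def tool_call(name: str, arguments: dict) -> dict:
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 tools/call message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


body = tool_call(
    "describe_screen",
    # Hypothetical argument names, for illustration only.
    {"sandbox_name": "my-sandbox", "focus": "Is a browser window open?"},
)
print(json.dumps(body, indent=2))
```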
Model Support
Model | Env Variable | Tool Version | Features |
Claude Opus 4.5 (default) | | | Zoom support, higher accuracy |
Claude Sonnet 4.5 | | | Faster, lower cost |
Supported Computer Actions
The agent can perform the following actions autonomously:
UI Actions:
screenshot - Capture current screen
left_click, right_click, double_click, triple_click, middle_click - Mouse clicks at coordinates
mouse_move - Move cursor to coordinates
left_click_drag - Click and drag from start to end coordinates
left_mouse_down, left_mouse_up - Press/release mouse button
scroll - Scroll up/down/left/right
wait - Pause execution
zoom - View specific screen region at full resolution (Opus 4.5 only, defaults to center if no coordinate)
Keyboard:
type - Type text
key - Press key or key combination (e.g., "ctrl+c")
hold_key - Hold a modifier key down (auto-releases after next action)
Constraints
Constraint | Value |
Function timeout | 800 seconds (Vercel Pro) |
Max steps per task | 100 |
Default steps | 100 |
Default timeout | 750 seconds |
Task history TTL | 24 hours |
Display resolution | Dynamic (default 1024x768) |
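These limits interact: a task that uses all 100 steps within the 750-second default has 7.5 seconds per step on average. The sketch below clamps requested values to the documented limits and reports that per-step budget. The function name and the choice to cap the timeout at 750 seconds (leaving headroom under the 800-second function limit) are assumptions, not the server's actual validation logic.

```python
from typing import Optional

MAX_STEPS = 100          # documented maximum and default step count
DEFAULT_TIMEOUT_S = 750  # documented default task timeout


def plan_limits(max_steps: Optional[int] = None, timeout_s: Optional[int] = None) -> dict:
    """Clamp requested limits to the documented ones and compute a per-step budget."""
    steps = MAX_STEPS if max_steps is None else max(1, min(int(max_steps), MAX_STEPS))
    # Assumption: never exceed the default, so the task ends before the 800 s function limit.
    timeout = (
        DEFAULT_TIMEOUT_S
        if timeout_s is None
        else max(1, min(int(timeout_s), DEFAULT_TIMEOUT_S))
    )
    return {
        "max_steps": steps,
        "timeout_s": timeout,
        "seconds_per_step": timeout / steps,
    }
```

For example, `plan_limits(25, 300)` leaves 12 seconds per step, which is one way to judge whether a task should be split into smaller subtasks.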
Sandbox Types
OS | Size | CPU | RAM | Use Case |
Linux | small | 2 | 4GB | Development, testing |
Linux | medium | 4 | 8GB | Build tasks, CI/CD |
Linux | large | 8 | 16GB | Heavy workloads |
Windows | small | 2 | 4GB | Basic Windows apps |
Windows | medium | 4 | 8GB | Office, development |
Windows | large | 8 | 16GB | Enterprise apps |
macOS | small | 2 | 4GB | iOS development |
macOS | medium | 4 | 8GB | Xcode builds |
macOS | large | 8 | 16GB | Heavy compilation |
Regions
north-america - US East (lowest latency for US users)
europe - EU West
asia - Asia Pacific
Troubleshooting
"CUA API key required"
Set CUA_API_KEY environment variable in Vercel or pass via X-CUA-API-Key header.
"ANTHROPIC_API_KEY not configured"
The server needs an Anthropic API key for vision processing. Add it to your Vercel environment variables.
Task times out
Default timeout is 750 seconds
Reduce task complexity or break into smaller steps
Check if sandbox is responsive with describe_screen
Task exceeds max steps
Default is 100 steps (max 100)
Break complex tasks into smaller subtasks
Use more specific task descriptions
Resources
License
MIT