ByteBot MCP Server
Production-grade Model Context Protocol (MCP) server for ByteBot's dual-API architecture, providing intelligent hybrid workflow orchestration for autonomous task execution and desktop computer control.
Overview
This MCP server integrates ByteBot's Agent API (task management) and Desktop API (computer control) into a unified interface for AI assistants like Claude. It enables:
Autonomous Task Execution: Create and manage tasks for ByteBot to execute independently
Direct Computer Control: Mouse, keyboard, screen capture, and file operations
Hybrid Workflows: Intelligent orchestration with automatic monitoring and intervention handling
Real-time Updates: Optional WebSocket support for live task status notifications
Features
Agent API Tools (Task Management)
bytebot_create_task - Create new tasks with priority levels
bytebot_list_tasks - List and filter tasks by status/priority
bytebot_get_task - Get detailed task information with message history
bytebot_get_in_progress_task - Check currently running task
bytebot_update_task - Update task status or priority
bytebot_delete_task - Delete tasks
Desktop API Tools (Computer Control)
Mouse Operations:
bytebot_move_mouse - Move cursor to coordinates
bytebot_click - Click with left/right/middle button
bytebot_drag - Drag from one position to another
bytebot_scroll - Scroll in any direction
Keyboard Operations:
bytebot_type_text - Type text strings
bytebot_paste_text - Paste text (for special characters)
bytebot_press_keys - Keyboard shortcuts (Ctrl+C, Alt+Tab, etc.)
Screen Operations:
bytebot_screenshot - Capture screen as base64 PNG
bytebot_cursor_position - Get current cursor position
File I/O:
bytebot_read_file - Read file content (base64)
bytebot_write_file - Write file content (base64)
System:
bytebot_switch_application - Switch to application
bytebot_wait - Wait for specified duration
Hybrid Orchestration Tools (Priority 1)
bytebot_create_and_monitor_task - Create task and wait for completion
bytebot_monitor_task - Monitor existing task until terminal state
bytebot_intervene_in_task - Provide help when task needs intervention
bytebot_execute_workflow - Multi-step workflow with automatic error recovery
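To illustrate what "monitor until terminal state" can look like, here is a minimal polling sketch. It is not the server's actual implementation. The status names come from the Task Lifecycle States list in this README; `monitorTask`, `MonitorOptions`, and the injected `getStatus` and `sleep` callbacks are hypothetical names, and treating only COMPLETED, CANCELLED, and FAILED as terminal is an assumption.

```typescript
// Status names as listed under "Task Lifecycle States" in this README.
type TaskStatus =
  | "PENDING" | "IN_PROGRESS" | "NEEDS_HELP" | "NEEDS_REVIEW"
  | "COMPLETED" | "CANCELLED" | "FAILED";

// Assumption: only these three states end monitoring on their own.
function isTerminal(status: TaskStatus): boolean {
  return status === "COMPLETED" || status === "CANCELLED" || status === "FAILED";
}

// Hypothetical option names, chosen for this sketch only.
interface MonitorOptions {
  pollIntervalMs: number;
  timeoutMs: number;
}

type MonitorResult =
  | { outcome: "terminal"; status: TaskStatus }
  | { outcome: "needs_help" }
  | { outcome: "timeout"; lastStatus: TaskStatus | null };

// Polls getStatus until the task is terminal, needs help, or the timeout elapses.
// getStatus is supplied by the caller (for example, a wrapper around a task lookup).
async function monitorTask(
  getStatus: () => Promise<TaskStatus>,
  opts: MonitorOptions,
  sleep: (ms: number) => Promise<void> =
    (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<MonitorResult> {
  const deadline = Date.now() + opts.timeoutMs;
  let last: TaskStatus | null = null;
  while (Date.now() <= deadline) {
    last = await getStatus();
    if (isTerminal(last)) return { outcome: "terminal", status: last };
    // Hand control back so the caller can intervene.
    if (last === "NEEDS_HELP") return { outcome: "needs_help" };
    await sleep(opts.pollIntervalMs);
  }
  return { outcome: "timeout", lastStatus: last };
}
```

Returning on NEEDS_HELP instead of continuing to poll lets the caller decide whether to intervene and then resume monitoring.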
Prerequisites
Node.js: 20.x or higher
ByteBot Instance: Running and accessible at configured endpoints
Agent API (default: http://localhost:9991)
Desktop API (default: http://localhost:9990)
Installation
Configuration
1. Create Environment File
Copy the example environment file and customize:
2. Edit .env File
3. Remote ByteBot Configuration
If ByteBot is running on a remote server:
MCP Client Setup
Claude Desktop
Add to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Zed Editor
Add to your Zed settings:
Continue.dev
Add to .continue/config.json:
Usage Examples
Example 1: Basic Task Creation
Example 2: Hybrid Workflow (Create → Monitor → Complete)
Example 3: Task Needs Intervention
Example 4: Interactive Desktop Control
Example 5: Multi-Step Workflow
Example 6: File Operations
Troubleshooting
Error: "Cannot connect to ByteBot server"
Cause: ByteBot is not running or endpoint URL is incorrect
Solution:
Verify ByteBot is running:
curl http://localhost:9991/tasks
Check that the .env file has the correct URLs
Ensure no firewall is blocking connections
Error: "Request to ByteBot timed out"
Cause: Task took longer than configured timeout
Solution:
Increase REQUEST_TIMEOUT in .env for Agent API calls
Increase DESKTOP_ACTION_TIMEOUT for Desktop API calls
Use bytebot_create_and_monitor_task with custom timeout: { "description": "Long running task", "timeout": 600000 }
Error: "Task with ID xyz not found"
Cause: Task was deleted or ID is incorrect
Solution:
List all tasks: bytebot_list_tasks
Verify task ID from response
Check if task was accidentally deleted
Warning: "Screenshot size is 8.5MB"
Cause: Screenshot is very large (high resolution display)
Solution:
This is only a warning; the screenshot is still captured
Consider reducing screen resolution if frequently capturing screenshots
Screenshots >5MB will show this warning
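As an illustration of how such a size check could work on a base64 screenshot, the sketch below computes the decoded size from the string length without decoding it. The function names and the reading of "5MB" as 5 × 1024 × 1024 bytes are assumptions for this example, not details confirmed by the server.

```typescript
// Assumption: "5MB" is interpreted as 5 MiB.
const WARN_THRESHOLD_BYTES = 5 * 1024 * 1024;

// Number of bytes a padded base64 string decodes to (length must be a multiple of 4).
function base64DecodedSize(b64: string): number {
  const len = b64.length;
  if (len === 0) return 0;
  let padding = 0;
  if (b64.charAt(len - 1) === "=") padding++;
  if (len > 1 && b64.charAt(len - 2) === "=") padding++;
  return Math.floor((len * 3) / 4) - padding;
}

// Returns a warning message when the screenshot exceeds the threshold, otherwise null.
function screenshotWarning(b64: string): string | null {
  const size = base64DecodedSize(b64);
  if (size <= WARN_THRESHOLD_BYTES) return null;
  return `Screenshot size is ${(size / (1024 * 1024)).toFixed(1)}MB`;
}
```

Base64 encodes every 3 bytes as 4 characters, so the encoded payload is roughly a third larger than the PNG it carries.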
Error: "Task must be in NEEDS_HELP state"
Cause: Attempting to intervene in task that doesn't need help
Solution:
Check task status first: bytebot_get_task
Only use bytebot_intervene_in_task when status is NEEDS_HELP
Use bytebot_update_task to manually change status if needed
WebSocket Connection Failed
Cause: WebSocket URL incorrect or ByteBot doesn't support WebSocket
Solution:
Set ENABLE_WEBSOCKET=false in .env to disable WebSocket
Server will automatically fall back to HTTP polling
WebSocket is optional - all features work without it
Error: "File size exceeds maximum allowed size"
Cause: Trying to upload/read file larger than 10MB
Solution:
Increase MAX_FILE_SIZE in .env (in bytes)
Split large files into smaller chunks
Compress files before uploading
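The "split large files" option can be sketched as follows. This is only an illustration: `splitIntoChunks` and `DEFAULT_MAX_FILE_SIZE` are hypothetical names, the 10MB limit is assumed to mean 10 × 1024 × 1024 bytes, and how the chunks are named, written, and reassembled on the ByteBot side is left to the caller.

```typescript
// Assumption: the 10MB limit is 10 MiB.
const DEFAULT_MAX_FILE_SIZE = 10 * 1024 * 1024;

// Splits data into consecutive chunks of at most maxBytes each.
// The chunks are views onto the original buffer, so no bytes are copied.
function splitIntoChunks(
  data: Uint8Array,
  maxBytes: number = DEFAULT_MAX_FILE_SIZE,
): Uint8Array[] {
  if (maxBytes <= 0 || Math.floor(maxBytes) !== maxBytes) {
    throw new RangeError("maxBytes must be a positive integer");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < data.length; offset += maxBytes) {
    // subarray clamps the end index, so the last chunk may be shorter.
    chunks.push(data.subarray(offset, offset + maxBytes));
  }
  return chunks;
}
```

Each chunk could then be base64-encoded and written as its own file, keeping every individual write under the limit.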
API Reference
Task Priority Levels
LOW - Background tasks, non-urgent
MEDIUM - Default priority (recommended)
HIGH - Important tasks, process soon
URGENT - Critical tasks, process immediately
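On the client side, these levels can be modeled as a small union type. The sketch below is illustrative only: `parsePriority` and `byPriorityDesc` are hypothetical helpers, and accepting lowercase input is a convenience of this example, not documented server behavior. The only detail taken from the list above is that MEDIUM is the default.

```typescript
type TaskPriority = "LOW" | "MEDIUM" | "HIGH" | "URGENT";

// Higher rank means the task should be processed sooner.
const PRIORITY_RANK: Record<TaskPriority, number> = {
  LOW: 0,
  MEDIUM: 1,
  HIGH: 2,
  URGENT: 3,
};

// Validates user input, falling back to the documented default (MEDIUM).
function parsePriority(value?: string): TaskPriority {
  if (value === undefined) return "MEDIUM";
  const upper = value.toUpperCase();
  if (upper === "LOW" || upper === "MEDIUM" || upper === "HIGH" || upper === "URGENT") {
    return upper;
  }
  throw new Error(`Unknown priority: ${value}`);
}

// Comparator for sorting a task list with the most urgent entries first.
function byPriorityDesc(a: TaskPriority, b: TaskPriority): number {
  return PRIORITY_RANK[b] - PRIORITY_RANK[a];
}
```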
Task Lifecycle States
PENDING - Task created, waiting to start
IN_PROGRESS - Task currently executing
NEEDS_HELP - Task blocked, requires intervention
NEEDS_REVIEW - Task complete but needs verification
COMPLETED - Task finished successfully
CANCELLED - Task cancelled by user
FAILED - Task failed with error
Mouse Buttons
left - Primary button (default)
right - Context menu button
middle - Scroll wheel click
Scroll Directions
up - Scroll up
down - Scroll down
left - Scroll left
right - Scroll right
Common Applications
firefox - Mozilla Firefox
chrome - Google Chrome
safari - Safari (macOS)
terminal - Terminal/Command Prompt
vscode - Visual Studio Code
Architecture
Development
Build
Type Check
Watch Mode
Environment Variables Reference
| Variable | Default | Description |
|---|---|---|
| | http://localhost:9991 | ByteBot Agent API endpoint |
| | http://localhost:9990 | ByteBot Desktop API endpoint |
| | | WebSocket endpoint for real-time updates |
| ENABLE_WEBSOCKET | | Enable WebSocket connections |
| | | Server identifier |
| REQUEST_TIMEOUT | | HTTP request timeout (ms) |
| DESKTOP_ACTION_TIMEOUT | | Desktop action timeout (ms) |
| | | Maximum retry attempts for failed requests |
| | | Initial retry delay (ms) |
| | | Task status polling interval (ms) |
| | | Maximum task monitoring duration (ms) |
| MAX_FILE_SIZE | | Maximum file size in bytes (10MB) |
| | | Logging level (debug/info/warn/error) |
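The retry settings above (a maximum number of attempts plus an initial delay) suggest a backoff scheme. The sketch below shows one common way such settings are used. It assumes the delay doubles after each failure and that the maximum counts retries after the first attempt; neither detail is confirmed by this README, and `withRetry` and its parameter names are hypothetical.

```typescript
// Runs fn, retrying on failure up to maxRetries times.
// Assumption: exponential backoff, with the delay doubling after each failed attempt.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries: number,
  initialDelayMs: number,
  sleep: (ms: number) => Promise<void> =
    (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let delay = initialDelayMs;
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Out of retries: surface the last error to the caller.
      if (attempt >= maxRetries) throw err;
      await sleep(delay);
      delay *= 2;
    }
  }
}
```

With `maxRetries` set to 3 and an initial delay of 1000 ms, a request that keeps failing would be attempted four times in total, waiting 1000, 2000, and 4000 ms between attempts.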
License
MIT
Support
For issues and questions:
ByteBot Documentation: https://docs.bytebot.ai
MCP Specification: https://modelcontextprotocol.io
Report issues: Create an issue in this repository
Version History
1.0.0 (2024-01-15)
Initial release
Agent API integration (task management)
Desktop API integration (computer control)
Hybrid orchestration tools
WebSocket support for real-time updates
Comprehensive error handling and retry logic
Full TypeScript implementation with strict typing