# ByteBot MCP Server

Production-grade Model Context Protocol (MCP) server for ByteBot's dual-API architecture, providing intelligent hybrid workflow orchestration for autonomous task execution and desktop computer control.

## Overview

This MCP server integrates ByteBot's Agent API (task management) and Desktop API (computer control) into a unified interface for AI assistants like Claude. It enables:

- **Autonomous Task Execution**: Create and manage tasks for ByteBot to execute independently
- **Direct Computer Control**: Mouse, keyboard, screen capture, and file operations
- **Hybrid Workflows**: Intelligent orchestration with automatic monitoring and intervention handling
- **Real-time Updates**: Optional WebSocket support for live task status notifications

## Features

### Agent API Tools (Task Management)

- `bytebot_create_task` - Create new tasks with priority levels
- `bytebot_list_tasks` - List and filter tasks by status/priority
- `bytebot_get_task` - Get detailed task information with message history
- `bytebot_get_in_progress_task` - Check currently running task
- `bytebot_update_task` - Update task status or priority
- `bytebot_delete_task` - Delete tasks

### Desktop API Tools (Computer Control)

**Mouse Operations:**

- `bytebot_move_mouse` - Move cursor to coordinates
- `bytebot_click` - Click with left/right/middle button
- `bytebot_drag` - Drag from one position to another
- `bytebot_scroll` - Scroll in any direction

**Keyboard Operations:**

- `bytebot_type_text` - Type text strings
- `bytebot_paste_text` - Paste text (for special characters)
- `bytebot_press_keys` - Keyboard shortcuts (Ctrl+C, Alt+Tab, etc.)

**Screen Operations:**

- `bytebot_screenshot` - Capture screen as base64 PNG
- `bytebot_cursor_position` - Get current cursor position

**File I/O:**

- `bytebot_read_file` - Read file content (base64)
- `bytebot_write_file` - Write file content (base64)

**System:**

- `bytebot_switch_application` - Switch to application
- `bytebot_wait` - Wait for specified duration

### Hybrid Orchestration Tools (Priority 1)

- `bytebot_create_and_monitor_task` - Create task and wait for completion
- `bytebot_monitor_task` - Monitor existing task until terminal state
- `bytebot_intervene_in_task` - Provide help when task needs intervention
- `bytebot_execute_workflow` - Multi-step workflow with automatic error recovery

## Prerequisites

- **Node.js**: 20.x or higher
- **ByteBot Instance**: Running and accessible at configured endpoints
  - Agent API (default: `http://localhost:9991`)
  - Desktop API (default: `http://localhost:9990`)

## Installation

```bash
# Clone or download this repository
cd bytebot-mcp-server

# Install dependencies
npm install

# Build TypeScript code
npm run build
```

## Configuration

### 1. Create Environment File

Copy the example environment file and customize:

```bash
cp .env.example .env
```

### 2. Edit `.env` File

```env
# ByteBot Agent API (Task Management)
BYTEBOT_AGENT_URL=http://localhost:9991

# ByteBot Desktop API (Computer Control)
BYTEBOT_DESKTOP_URL=http://localhost:9990

# WebSocket Configuration (Optional)
BYTEBOT_WS_URL=ws://localhost:9991
ENABLE_WEBSOCKET=false

# Server Configuration
MCP_SERVER_NAME=bytebot-mcp

# Timeouts (milliseconds)
REQUEST_TIMEOUT=30000
DESKTOP_ACTION_TIMEOUT=10000

# Retry Configuration
MAX_RETRIES=3
RETRY_DELAY=1000

# Monitoring Configuration
TASK_POLL_INTERVAL=2000
TASK_MONITOR_TIMEOUT=300000

# File Configuration
MAX_FILE_SIZE=10485760

# Logging
LOG_LEVEL=info
```

### 3. Remote ByteBot Configuration

If ByteBot is running on a remote server:

```env
BYTEBOT_AGENT_URL=http://your-server.com:9991
BYTEBOT_DESKTOP_URL=http://your-server.com:9990
BYTEBOT_WS_URL=ws://your-server.com:9991
```

## MCP Client Setup

### Claude Desktop

Add to your Claude Desktop configuration file:

**macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`

**Windows**: `%APPDATA%\Claude\claude_desktop_config.json`

```json
{
  "mcpServers": {
    "bytebot": {
      "command": "node",
      "args": ["/absolute/path/to/bytebot-mcp-server/dist/index.js"],
      "env": {
        "BYTEBOT_AGENT_URL": "http://localhost:9991",
        "BYTEBOT_DESKTOP_URL": "http://localhost:9990"
      }
    }
  }
}
```

### Zed Editor

Add to your Zed settings:

```json
{
  "context_servers": {
    "bytebot": {
      "command": {
        "path": "node",
        "args": ["/absolute/path/to/bytebot-mcp-server/dist/index.js"]
      },
      "env": {
        "BYTEBOT_AGENT_URL": "http://localhost:9991",
        "BYTEBOT_DESKTOP_URL": "http://localhost:9990"
      }
    }
  }
}
```

### Continue.dev

Add to `.continue/config.json`:

```json
{
  "mcpServers": [
    {
      "name": "bytebot",
      "command": "node",
      "args": ["/absolute/path/to/bytebot-mcp-server/dist/index.js"],
      "env": {
        "BYTEBOT_AGENT_URL": "http://localhost:9991",
        "BYTEBOT_DESKTOP_URL": "http://localhost:9990"
      }
    }
  ]
}
```

## Usage Examples

### Example 1: Basic Task Creation

```
User: Create a task for ByteBot to search Wikipedia for "quantum computing"

Claude uses: bytebot_create_task
{
  "description": "Go to wikipedia.org and search for 'quantum computing'",
  "priority": "MEDIUM"
}

Response:
{
  "id": "task-123",
  "status": "PENDING",
  "priority": "MEDIUM",
  "createdAt": "2024-01-15T10:30:00Z"
}
```

### Example 2: Hybrid Workflow (Create → Monitor → Complete)

```
User: Create a task to log into example.com and wait for it to complete

Claude uses: bytebot_create_and_monitor_task
{
  "description": "Navigate to example.com and log in with credentials from keychain",
  "timeout": 60000,
  "pollInterval": 2000
}

Response:
{
  "taskId": "task-456",
  "finalStatus": "COMPLETED",
  "completedAt": "2024-01-15T10:31:45Z",
  "messagesCount": 12,
  "task": { ... full task details ... }
}
```

### Example 3: Task Needs Intervention

```
User: Create a task to fill out a complex form

Claude uses: bytebot_create_and_monitor_task
{
  "description": "Fill out the registration form at example.com/register"
}

Response (after monitoring):
{
  "taskId": "task-789",
  "finalStatus": "NEEDS_HELP",
  "task": {
    "id": "task-789",
    "status": "NEEDS_HELP",
    "messages": [
      {
        "role": "assistant",
        "content": "I need the user's phone number to complete this form"
      }
    ]
  }
}

User: My phone number is 555-1234

Claude uses: bytebot_intervene_in_task
{
  "taskId": "task-789",
  "message": "User's phone number is 555-1234",
  "action": "resume",
  "continueMonitoring": true
}

Response:
{
  "taskId": "task-789",
  "status": "COMPLETED",
  "intervention": "applied"
}
```

### Example 4: Interactive Desktop Control

```
User: Take a screenshot and click at position (500, 300)

Claude uses: bytebot_screenshot

Response:
{
  "screenshot": "iVBORw0KG..."
}

Claude uses: bytebot_click
{
  "x": 500,
  "y": 300,
  "button": "left"
}

Response: ✓ bytebot_click completed successfully
```

### Example 5: Multi-Step Workflow

```
User: Execute a workflow to open Firefox, navigate to GitHub, and take a screenshot

Claude uses: bytebot_execute_workflow
{
  "steps": [
    {
      "name": "Open Firefox",
      "description": "Switch to Firefox browser application"
    },
    {
      "name": "Navigate to GitHub",
      "description": "Navigate to github.com in the browser"
    },
    {
      "name": "Take Screenshot",
      "description": "Capture a screenshot of the GitHub homepage"
    }
  ],
  "priority": "HIGH"
}

Response:
{
  "steps": [
    { "name": "Open Firefox", "taskId": "task-001", "status": "COMPLETED" },
    { "name": "Navigate to GitHub", "taskId": "task-002", "status": "COMPLETED" },
    { "name": "Take Screenshot", "taskId": "task-003", "status": "COMPLETED" }
  ],
  "overallStatus": "completed",
  "totalInterventions": 0
}
```

### Example 6: File Operations

```
User: Read the contents of /home/user/data.txt

Claude uses: bytebot_read_file
{
  "path": "/home/user/data.txt"
}

Response:
{ "content": "SGVsbG8gV29ybGQh..." }  // Base64 encoded
```

## Troubleshooting

### Error: "Cannot connect to ByteBot server"

**Cause**: ByteBot is not running or endpoint URL is incorrect

**Solution**:

1. Verify ByteBot is running: `curl http://localhost:9991/tasks`
2. Check `.env` file has correct URLs
3. Ensure no firewall blocking connections

### Error: "Request to ByteBot timed out"

**Cause**: Task took longer than configured timeout

**Solution**:

1. Increase `REQUEST_TIMEOUT` in `.env` for Agent API calls
2. Increase `DESKTOP_ACTION_TIMEOUT` for Desktop API calls
3. Use `bytebot_create_and_monitor_task` with custom timeout:

   ```json
   {
     "description": "Long running task",
     "timeout": 600000
   }
   ```

### Error: "Task with ID xyz not found"

**Cause**: Task was deleted or ID is incorrect

**Solution**:

1. List all tasks: `bytebot_list_tasks`
2. Verify task ID from response
3. Check if task was accidentally deleted

### Warning: "Screenshot size is 8.5MB"

**Cause**: Screenshot is very large (high resolution display)

**Solution**:

1. This is just a warning, screenshot still works
2. Consider reducing screen resolution if frequently capturing screenshots
3. Screenshots >5MB will show this warning

### Error: "Task must be in NEEDS_HELP state"

**Cause**: Attempting to intervene in task that doesn't need help

**Solution**:

1. Check task status first: `bytebot_get_task`
2. Only use `bytebot_intervene_in_task` when status is `NEEDS_HELP`
3. Use `bytebot_update_task` to manually change status if needed

### WebSocket Connection Failed

**Cause**: WebSocket URL incorrect or ByteBot doesn't support WebSocket

**Solution**:

1. Set `ENABLE_WEBSOCKET=false` in `.env` to disable WebSocket
2. Server will automatically fall back to HTTP polling
3. WebSocket is optional - all features work without it

### Error: "File size exceeds maximum allowed size"

**Cause**: Trying to upload/read file larger than 10MB

**Solution**:

1. Increase `MAX_FILE_SIZE` in `.env` (in bytes)
2. Split large files into smaller chunks
3. Compress files before uploading

## API Reference

### Task Priority Levels

- `LOW` - Background tasks, non-urgent
- `MEDIUM` - Default priority (recommended)
- `HIGH` - Important tasks, process soon
- `URGENT` - Critical tasks, process immediately

### Task Lifecycle States

1. `PENDING` - Task created, waiting to start
2. `IN_PROGRESS` - Task currently executing
3. `NEEDS_HELP` - Task blocked, requires intervention
4. `NEEDS_REVIEW` - Task complete but needs verification
5. `COMPLETED` - Task finished successfully
6. `CANCELLED` - Task cancelled by user
7. `FAILED` - Task failed with error

### Mouse Buttons

- `left` - Primary button (default)
- `right` - Context menu button
- `middle` - Scroll wheel click

### Scroll Directions

- `up` - Scroll up
- `down` - Scroll down
- `left` - Scroll left
- `right` - Scroll right

### Common Applications

- `firefox` - Mozilla Firefox
- `chrome` - Google Chrome
- `safari` - Safari (macOS)
- `terminal` - Terminal/Command Prompt
- `vscode` - Visual Studio Code

## Architecture

```
┌─────────────────────────────────────────────┐
│             MCP Client (Claude)             │
└─────────────────┬───────────────────────────┘
                  │ stdio transport
┌─────────────────▼───────────────────────────┐
│             ByteBot MCP Server              │
│ ┌─────────────────────────────────────────┐ │
│ │    Agent Tools     │   Desktop Tools    │ │
│ │           Hybrid Orchestrator           │ │
│ └────────────┬──────────────┬─────────────┘ │
└──────────────┼──────────────┼───────────────┘
               │              │
    ┌──────────▼──┐    ┌──────▼──────┐
    │  Agent API  │    │ Desktop API │
    │ (port 9991) │    │ (port 9990) │
    └─────────────┘    └─────────────┘
           │                   │
    ┌──────▼───────────────────▼──────┐
    │        ByteBot Instance         │
    └─────────────────────────────────┘
```

## Development

### Build

```bash
npm run build
```

### Type Check

```bash
npm run type-check
```

### Watch Mode

```bash
npm run dev
```

## Environment Variables Reference

| Variable | Default | Description |
|----------|---------|-------------|
| `BYTEBOT_AGENT_URL` | `http://localhost:9991` | ByteBot Agent API endpoint |
| `BYTEBOT_DESKTOP_URL` | `http://localhost:9990` | ByteBot Desktop API endpoint |
| `BYTEBOT_WS_URL` | `ws://localhost:9991` | WebSocket endpoint for real-time updates |
| `ENABLE_WEBSOCKET` | `false` | Enable WebSocket connections |
| `MCP_SERVER_NAME` | `bytebot-mcp` | Server identifier |
| `REQUEST_TIMEOUT` | `30000` | HTTP request timeout (ms) |
| `DESKTOP_ACTION_TIMEOUT` | `10000` | Desktop action timeout (ms) |
| `MAX_RETRIES` | `3` | Maximum retry attempts for failed requests |
| `RETRY_DELAY` | `1000` | Initial retry delay (ms) |
| `TASK_POLL_INTERVAL` | `2000` | Task status polling interval (ms) |
| `TASK_MONITOR_TIMEOUT` | `300000` | Maximum task monitoring duration (ms) |
| `MAX_FILE_SIZE` | `10485760` | Maximum file size in bytes (10MB) |
| `LOG_LEVEL` | `info` | Logging level (debug/info/warn/error) |

## License

MIT

## Support

For issues and questions:

- ByteBot Documentation: https://docs.bytebot.ai
- MCP Specification: https://modelcontextprotocol.io
- Report issues: Create an issue in this repository

## Version History

### 1.0.0 (2024-01-15)

- Initial release
- Agent API integration (task management)
- Desktop API integration (computer control)
- Hybrid orchestration tools
- WebSocket support for real-time updates
- Comprehensive error handling and retry logic
- Full TypeScript implementation with strict typing
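
## Appendix: Monitoring Decision Sketch

The Task Lifecycle States, together with the `TASK_POLL_INTERVAL` and `TASK_MONITOR_TIMEOUT` settings, suggest how a monitoring loop might decide when to stop polling. The TypeScript sketch below is one possible reading and not the server's actual implementation. The names `MonitorDecision`, `decideNextStep`, and `simulateMonitoring` are hypothetical, and the grouping of states is an assumption: Example 3 shows monitoring returning on `NEEDS_HELP`, but this README does not say how `NEEDS_REVIEW` is treated.

```typescript
// Status values copied from the "Task Lifecycle States" list in this README.
type TaskStatus =
  | "PENDING"
  | "IN_PROGRESS"
  | "NEEDS_HELP"
  | "NEEDS_REVIEW"
  | "COMPLETED"
  | "CANCELLED"
  | "FAILED";

// Hypothetical result type for illustration; not part of the ByteBot API.
type MonitorDecision =
  | "keep_polling"
  | "finished"
  | "needs_intervention"
  | "timed_out";

// Decide what a monitor should do after observing one status.
// Assumption: COMPLETED, CANCELLED and FAILED end monitoring, while
// NEEDS_HELP and NEEDS_REVIEW hand control back to the caller.
function decideNextStep(
  status: TaskStatus,
  elapsedMs: number,
  timeoutMs: number = 300000 // mirrors the TASK_MONITOR_TIMEOUT default
): MonitorDecision {
  switch (status) {
    case "COMPLETED":
    case "CANCELLED":
    case "FAILED":
      return "finished";
    case "NEEDS_HELP":
    case "NEEDS_REVIEW":
      return "needs_intervention";
    default:
      // PENDING or IN_PROGRESS: keep going unless the time budget is spent.
      return elapsedMs >= timeoutMs ? "timed_out" : "keep_polling";
  }
}

// Replay a list of observed statuses as if one arrived per poll.
// Elapsed time is simulated as (poll index * pollIntervalMs); no network I/O.
function simulateMonitoring(
  observed: TaskStatus[],
  pollIntervalMs: number = 2000, // mirrors the TASK_POLL_INTERVAL default
  timeoutMs: number = 300000
): { decision: MonitorDecision; polls: number } {
  for (let i = 0; i < observed.length; i++) {
    const decision = decideNextStep(observed[i], i * pollIntervalMs, timeoutMs);
    if (decision !== "keep_polling") {
      return { decision, polls: i + 1 };
    }
  }
  // The observations ran out before any stopping condition was met.
  return { decision: "keep_polling", polls: observed.length };
}
```

For instance, `simulateMonitoring(["PENDING", "IN_PROGRESS", "COMPLETED"])` stops on the third poll with the decision `"finished"`. Keeping the decision in a pure function separates it from HTTP or WebSocket transport, so the same rule could back either polling or push-based updates.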
