# ByteBot MCP Server - Usage Examples
This document provides practical examples of using the ByteBot MCP Server with AI assistants.
## Table of Contents
1. [Basic Task Creation](#basic-task-creation)
2. [Task Monitoring and Intervention](#task-monitoring-and-intervention)
3. [Desktop Control Operations](#desktop-control-operations)
4. [Multi-Step Workflows](#multi-step-workflows)
5. [File Operations](#file-operations)
6. [Advanced Scenarios](#advanced-scenarios)
---
## Basic Task Creation
### Example 1: Simple Web Search
**User**: "Create a task to search Google for 'TypeScript MCP servers'"
**Tool Call**: `bytebot_create_task`
```json
{
"description": "Navigate to Google and search for 'TypeScript MCP servers'",
"priority": "MEDIUM"
}
```
**Expected Response**:
```json
{
"id": "task_abc123",
"status": "PENDING",
"priority": "MEDIUM",
"description": "Navigate to Google and search for 'TypeScript MCP servers'",
"createdAt": "2024-01-15T10:00:00Z"
}
```
### Example 2: Task with Priority
**User**: "Urgently create a task to close all browser tabs"
**Tool Call**: `bytebot_create_task`
```json
{
"description": "Close all open browser tabs in the active browser window",
"priority": "URGENT"
}
```
---
## Task Monitoring and Intervention
### Example 3: Create and Wait for Completion
**User**: "Create a task to log into Gmail and wait until it's done"
**Tool Call**: `bytebot_create_and_monitor_task`
```json
{
"description": "Navigate to gmail.com and log in using saved credentials",
"timeout": 120000,
"pollInterval": 2000
}
```
**Expected Response** (after task completes):
```json
{
"taskId": "task_def456",
"finalStatus": "COMPLETED",
"completedAt": "2024-01-15T10:02:15Z",
"messagesCount": 8,
"task": {
"id": "task_def456",
"status": "COMPLETED",
"messages": [
{
"role": "assistant",
"content": "Navigating to gmail.com"
},
{
"role": "assistant",
"content": "Login form detected, entering credentials"
},
{
"role": "assistant",
"content": "Successfully logged in"
}
]
}
}
```
### Example 4: Task Needs Help - Provide Intervention
**Scenario**: Task encounters a CAPTCHA and needs help
**Step 1 - Create and Monitor**:
**Tool Call**: `bytebot_create_and_monitor_task`
```json
{
"description": "Sign up for a new account on example.com"
}
```
**Response**:
```json
{
"taskId": "task_ghi789",
"finalStatus": "NEEDS_HELP",
"task": {
"status": "NEEDS_HELP",
"messages": [
{
"role": "assistant",
"content": "Encountered CAPTCHA verification. I cannot solve this automatically. Please help."
}
]
}
}
```
**Step 2 - User Solves CAPTCHA Manually, Then**:
**Tool Call**: `bytebot_intervene_in_task`
```json
{
"taskId": "task_ghi789",
"message": "CAPTCHA has been solved manually. Please continue with the signup process.",
"action": "resume",
"continueMonitoring": true
}
```
**Expected Response**:
```json
{
"taskId": "task_ghi789",
"status": "COMPLETED",
"intervention": "applied",
"task": {
"status": "COMPLETED",
"messages": [...]
}
}
```
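The hand-off from Step 1 to Step 2 can be sketched in Python. This is a minimal illustration, not part of the ByteBot MCP Server: `build_intervention` is a hypothetical helper that only assembles the arguments shown above when the monitored task reports `NEEDS_HELP`.

```python
from typing import Optional


def build_intervention(result: dict, message: str) -> Optional[dict]:
    """Return arguments for bytebot_intervene_in_task, or None if no help is needed.

    `result` is assumed to have the shape of the Step 1 response above
    (keys `taskId` and `finalStatus`).
    """
    if result.get("finalStatus") != "NEEDS_HELP":
        return None
    return {
        "taskId": result["taskId"],
        "message": message,
        "action": "resume",
        "continueMonitoring": True,  # keep waiting until the task finishes
    }


step1 = {"taskId": "task_ghi789", "finalStatus": "NEEDS_HELP"}
args = build_intervention(step1, "CAPTCHA has been solved manually. Please continue.")
```

Returning `None` for any other status keeps the caller from intervening in a task that has already completed.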
---
## Desktop Control Operations
### Example 5: Take Screenshot and Analyze
**User**: "Take a screenshot of my screen"
**Tool Call**: `bytebot_screenshot`
```json
{}
```
**Expected Response**:
```json
{
"success": true,
"screenshot": "iVBORw0KGgoAAAANSUhEUgAAA...",
"action": "screenshot",
"executedAt": "2024-01-15T10:05:00Z"
}
```
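The sample `screenshot` value starts with the base64 form of the PNG signature, so the field is presumably a base64-encoded PNG. A minimal Python sketch under that assumption; `decode_screenshot` and `save_screenshot` are hypothetical client-side helpers, not ByteBot functions:

```python
import base64
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def decode_screenshot(response: dict) -> bytes:
    """Return raw image bytes from a bytebot_screenshot-style response."""
    if not response.get("success"):
        raise ValueError("screenshot call did not succeed")
    data = base64.b64decode(response["screenshot"])
    # Assumption: the server returns PNG data; reject anything else early.
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not look like a PNG")
    return data


def save_screenshot(response: dict, path: str) -> int:
    """Write the decoded image to `path` and return the number of bytes written."""
    return Path(path).write_bytes(decode_screenshot(response))
```

Checking the signature before writing the file catches truncated or mis-encoded payloads before they reach an image viewer.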
### Example 6: Click at Specific Coordinates
**User**: "Click at position (800, 400) with a double-click"
**Tool Call**: `bytebot_click`
```json
{
"x": 800,
"y": 400,
"button": "left",
"count": 2
}
```
### Example 7: Type Text in Active Window
**User**: "Type 'Hello, World!' in the active text field"
**Tool Call**: `bytebot_type_text`
```json
{
"text": "Hello, World!",
"delay": 50
}
```
### Example 8: Keyboard Shortcut
**User**: "Press Ctrl+C to copy"
**Tool Call**: `bytebot_press_keys`
```json
{
"keys": ["ctrl", "c"]
}
```
### Example 9: Complex Mouse Operation
**User**: "Drag from (100, 200) to (500, 600)"
**Tool Call**: `bytebot_drag`
```json
{
"from_x": 100,
"from_y": 200,
"to_x": 500,
"to_y": 600
}
```
---
## Multi-Step Workflows
### Example 10: Open Browser and Navigate
**User**: "Execute a workflow to open Firefox, go to GitHub, and take a screenshot"
**Tool Call**: `bytebot_execute_workflow`
```json
{
"steps": [
{
"name": "Launch Firefox",
"description": "Switch to or launch Firefox browser application",
"timeout": 30000
},
{
"name": "Navigate to GitHub",
"description": "Navigate to github.com in the browser address bar",
"timeout": 30000
},
{
"name": "Wait for Page Load",
"description": "Wait for the GitHub homepage to fully load",
"timeout": 15000
},
{
"name": "Capture Screenshot",
"description": "Take a screenshot of the GitHub homepage",
"timeout": 10000
}
],
"priority": "HIGH",
"stopOnFailure": true
}
```
**Expected Response**:
```json
{
"steps": [
{
"name": "Launch Firefox",
"taskId": "task_001",
"status": "COMPLETED",
"completedAt": "2024-01-15T10:10:05Z"
},
{
"name": "Navigate to GitHub",
"taskId": "task_002",
"status": "COMPLETED",
"completedAt": "2024-01-15T10:10:15Z"
},
{
"name": "Wait for Page Load",
"taskId": "task_003",
"status": "COMPLETED",
"completedAt": "2024-01-15T10:10:20Z"
},
{
"name": "Capture Screenshot",
"taskId": "task_004",
"status": "COMPLETED",
"completedAt": "2024-01-15T10:10:22Z"
}
],
"overallStatus": "completed",
"totalInterventions": 0
}
```
### Example 11: Workflow with Error Recovery
**User**: "Create a workflow with retry on failure"
**Tool Call**: `bytebot_execute_workflow`
```json
{
"steps": [
{
"name": "Download File",
"description": "Download report.pdf from company server",
"timeout": 60000,
"retryOnFailure": true,
"maxRetries": 3
},
{
"name": "Open File",
"description": "Open the downloaded report.pdf",
"timeout": 30000
}
],
"stopOnFailure": false
}
```
---
## File Operations
### Example 12: Read File Content
**User**: "Read the contents of /home/user/notes.txt"
**Tool Call**: `bytebot_read_file`
```json
{
"path": "/home/user/notes.txt"
}
```
**Expected Response**:
```json
{
"success": true,
"content": "VGhpcyBpcyBteSBub3RlIGZpbGUuCkxpbmUgMg==",
"action": "read_file",
"executedAt": "2024-01-15T10:15:00Z"
}
```
**Decoded Content**: "This is my note file.\nLine 2"
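The `content` field in this example is base64-encoded. A short Python sketch of decoding it on the client side, assuming the file is UTF-8 text; `decode_file_content` is a hypothetical helper name:

```python
import base64


def decode_file_content(response: dict, encoding: str = "utf-8") -> str:
    """Decode the base64 `content` field of a bytebot_read_file-style response."""
    if not response.get("success"):
        raise ValueError("read_file call did not succeed")
    return base64.b64decode(response["content"]).decode(encoding)


response = {
    "success": True,
    "content": "VGhpcyBpcyBteSBub3RlIGZpbGUuCkxpbmUgMg==",
}
text = decode_file_content(response)
print(text)
```

For binary files, skip the `.decode(encoding)` step and work with the bytes directly.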
### Example 13: Write File Content
**User**: "Write 'Hello World' to /tmp/test.txt"
**Tool Call**: `bytebot_write_file`
```json
{
"path": "/tmp/test.txt",
"content": "SGVsbG8gV29ybGQ="
}
```
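The `content` value above is the base64 encoding of `Hello World`. A small Python sketch for building such arguments, assuming the tool expects base64 as in this example; `build_write_file_args` is a hypothetical helper:

```python
import base64


def build_write_file_args(path: str, text: str, encoding: str = "utf-8") -> dict:
    """Build bytebot_write_file arguments with base64-encoded content."""
    encoded = base64.b64encode(text.encode(encoding)).decode("ascii")
    return {"path": path, "content": encoded}


write_args = build_write_file_args("/tmp/test.txt", "Hello World")
```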
---
## Advanced Scenarios
### Example 14: List and Monitor All Tasks
**Step 1 - List Tasks**:
**Tool Call**: `bytebot_list_tasks`
```json
{
"status": "IN_PROGRESS"
}
```
**Response**:
```json
{
"count": 2,
"tasks": [
{
"id": "task_123",
"status": "IN_PROGRESS",
"description": "Download large file"
},
{
"id": "task_456",
"status": "IN_PROGRESS",
"description": "Process data"
}
]
}
```
**Step 2 - Monitor Specific Task**:
**Tool Call**: `bytebot_monitor_task`
```json
{
"taskId": "task_123",
"timeout": 300000,
"pollInterval": 5000
}
```
### Example 15: Cancel a Running Task
**Tool Call**: `bytebot_update_task`
```json
{
"taskId": "task_789",
"status": "CANCELLED",
"message": "User requested cancellation"
}
```
### Example 16: Get Current Cursor Position
**Tool Call**: `bytebot_cursor_position`
```json
{}
```
**Expected Response**:
```json
{
"success": true,
"position": {
"x": 1024,
"y": 768
},
"action": "cursor_position",
"executedAt": "2024-01-15T10:20:00Z"
}
```
### Example 17: Complex Desktop Interaction
**Scenario**: Fill out a web form
```
1. Switch to browser
2. Click on name field (300, 200)
3. Type name
4. Click on email field (300, 250)
5. Type email
6. Click submit button (400, 350)
```
**Implementation** (arguments for `bytebot_execute_workflow`):
```json
{
"steps": [
{
"name": "Switch to Browser",
"description": "Switch to Firefox browser application"
},
{
"name": "Enter Name",
"description": "Click at coordinates (300, 200) and type 'John Doe'"
},
{
"name": "Enter Email",
"description": "Click at coordinates (300, 250) and type 'john@example.com'"
},
{
"name": "Submit Form",
"description": "Click the submit button at coordinates (400, 350)"
}
]
}
```
---
## Tips and Best Practices
### 1. Task Descriptions
- Be specific and clear in task descriptions
- Include all necessary context
- Mention expected outcomes
**Good**: "Navigate to amazon.com, search for 'wireless mouse', and filter by 4+ star ratings"
**Bad**: "Search Amazon"
### 2. Timeouts
- Use longer timeouts for complex tasks (>60s)
- Use shorter timeouts for simple actions (<10s)
- Default timeout is 5 minutes for monitoring
### 3. Priority Levels
- Use `URGENT` sparingly for critical operations
- Default to `MEDIUM` for most tasks
- Use `LOW` for background/cleanup tasks
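A client can validate the priority before sending a request. The sketch below is illustrative Python: the four levels are the ones used in this document, and `build_create_task_args` is a hypothetical helper rather than part of the server.

```python
# Priority levels that appear in this document, lowest to highest.
PRIORITIES = ("LOW", "MEDIUM", "HIGH", "URGENT")


def build_create_task_args(description: str, priority: str = "MEDIUM") -> dict:
    """Build bytebot_create_task arguments, defaulting to MEDIUM priority."""
    description = description.strip()
    if not description:
        raise ValueError("description must not be empty")
    priority = priority.upper()
    if priority not in PRIORITIES:
        raise ValueError(f"priority must be one of {PRIORITIES}")
    return {"description": description, "priority": priority}


task_args = build_create_task_args("Close all open browser tabs", "urgent")
```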
### 4. Intervention Handling
- Always monitor tasks that may need intervention
- Provide clear, actionable guidance in intervention messages
- Use `continueMonitoring: true` to wait for completion after intervention
### 5. Workflows
- Break complex operations into discrete steps
- Use `retryOnFailure` for unreliable operations (network requests)
- Set `stopOnFailure: false` for non-critical step sequences
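How `retryOnFailure`, `maxRetries`, and `stopOnFailure` might interact can be sketched in Python. This is one plausible reading of those options, not the server's actual implementation; `run_step` is a hypothetical callback that executes one step and returns `True` on success, and the `FAILED` status name is an assumption.

```python
from typing import Callable, List


def run_workflow(
    steps: List[dict],
    run_step: Callable[[dict], bool],
    stop_on_failure: bool = True,
) -> List[dict]:
    """Run steps in order, retrying and stopping according to the step options."""
    results = []
    for step in steps:
        # One attempt by default; extra attempts only when retryOnFailure is set.
        retries = step.get("maxRetries", 0) if step.get("retryOnFailure") else 0
        attempts, ok = 0, False
        while attempts <= retries and not ok:
            attempts += 1
            ok = run_step(step)
        results.append({
            "name": step["name"],
            "status": "COMPLETED" if ok else "FAILED",  # FAILED is assumed
            "attempts": attempts,
        })
        if not ok and stop_on_failure:
            break  # remaining steps are skipped
    return results
```

With `stop_on_failure=False`, later steps still run after a failure, which is only appropriate when they do not depend on the failed step.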
### 6. Desktop Control
- Take screenshots first to identify coordinates
- Use `bytebot_wait` between rapid actions to allow UI updates
- Test coordinate positions with `bytebot_move_mouse` before clicking
---
## Error Scenarios and Handling
### Timeout Error
```json
{
"error": "Task monitoring timeout after 300000ms. Task may still be running.",
"details": "Consider increasing timeout or checking task manually"
}
```
**Solution**: Increase timeout or check task status separately
### Task Not Found
```json
{
"error": "Task with ID task_xyz not found"
}
```
**Solution**: Verify task ID with `bytebot_list_tasks`
### ByteBot Unreachable
```json
{
"error": "Cannot connect to ByteBot server. Please ensure ByteBot is running and the endpoint URL is correct.",
"details": {
"endpoint": "http://localhost:9991"
}
}
```
**Solution**: Start ByteBot or check network configuration
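A client could map these error payloads to the suggested remedies. The Python sketch below matches on the message text shown above, which is brittle and assumes the wording stays stable; `suggest_fix` is a hypothetical helper.

```python
def suggest_fix(payload: dict) -> str:
    """Return the remedy from this section that matches an error payload."""
    error = str(payload.get("error", "")).lower()
    if "timeout" in error:
        return "Increase timeout or check task status separately"
    if "not found" in error:
        return "Verify task ID with bytebot_list_tasks"
    if "cannot connect" in error:
        return "Start ByteBot or check network configuration"
    return "Unrecognized error; inspect the payload"


hint = suggest_fix({"error": "Task with ID task_xyz not found"})
```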
---
## Performance Optimization
### Caching
- Task data is cached for 5 seconds to reduce API calls
- Set `useCache: false` in `bytebot_get_task` for fresh data
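The caching behavior described above can be approximated with a small time-to-live cache. This Python sketch is an illustration of the idea, not the server's code; `TaskCache` and its `fetch` callback are hypothetical names.

```python
import time
from typing import Callable, Dict, Tuple


class TaskCache:
    """Cache task lookups for a short time-to-live (5 seconds by default)."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get_task(self, task_id: str, use_cache: bool = True) -> dict:
        now = self._clock()
        cached = self._entries.get(task_id)
        # Serve from cache only if allowed and the entry is still fresh.
        if use_cache and cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        task = self._fetch(task_id)
        self._entries[task_id] = (now, task)
        return task
```

Passing `use_cache=False` mirrors `useCache: false`: the lookup always goes to the backend, and the cache is refreshed with the new result.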
### Polling Intervals
- Default: 2 seconds
- Adjust based on task complexity:
  - Fast tasks: 1000ms
  - Slow tasks: 5000ms
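The interplay of `timeout` and `pollInterval` can be sketched as a simple polling loop in Python. This is an illustration rather than the server's implementation: `fetch_status` is a hypothetical callback that returns the current task status, and treating `FAILED` as a terminal status is an assumption (the other statuses appear in this document).

```python
import time
from typing import Callable

# Statuses that end monitoring. FAILED is an assumption.
TERMINAL_STATUSES = {"COMPLETED", "NEEDS_HELP", "CANCELLED", "FAILED"}


def monitor_task(
    fetch_status: Callable[[], str],
    timeout_ms: int = 300_000,      # 5-minute monitoring default
    poll_interval_ms: int = 2_000,  # 2-second polling default
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the task reaches a terminal status or the timeout expires."""
    deadline = clock() + timeout_ms / 1000
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"Task monitoring timeout after {timeout_ms}ms. Task may still be running."
            )
        sleep(poll_interval_ms / 1000)
```

A shorter interval detects completion sooner at the cost of more requests. The `sleep` and `clock` parameters exist only so the loop can be tested without real waiting.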
### WebSocket vs Polling
- Enable WebSocket for real-time updates (no waiting for the next poll)
- Polling works fine for most use cases
- WebSocket reduces server load with many concurrent tasks
---
## Integration Examples
### Example: Automated Testing Workflow
```json
{
"steps": [
{
"name": "Open Test Environment",
"description": "Navigate to http://localhost:3000 in Firefox"
},
{
"name": "Run Login Test",
"description": "Fill login form with test credentials and submit"
},
{
"name": "Verify Dashboard",
"description": "Check that dashboard loads correctly"
},
{
"name": "Take Evidence Screenshot",
"description": "Capture screenshot of successful login"
},
{
"name": "Logout",
"description": "Click logout button"
}
],
"priority": "HIGH"
}
```
### Example: Data Entry Automation
```json
{
"description": "Open CRM system, navigate to contacts page, and enter new contact details from contacts.csv file",
"priority": "MEDIUM"
}
```
Monitor for `NEEDS_HELP` if field validation fails, then intervene with corrections.
---
## Conclusion
These examples demonstrate the full range of ByteBot MCP Server capabilities. Start with simple task creation, then progress to hybrid workflows and complex multi-step automation. Always monitor long-running tasks and be prepared to provide intervention when needed.
For more information, see the main [README.md](../README.md).