# ByteBot MCP Server - Usage Examples

This document provides practical examples of using the ByteBot MCP Server with AI assistants.

## Table of Contents

1. [Basic Task Creation](#basic-task-creation)
2. [Task Monitoring and Intervention](#task-monitoring-and-intervention)
3. [Desktop Control Operations](#desktop-control-operations)
4. [Multi-Step Workflows](#multi-step-workflows)
5. [File Operations](#file-operations)
6. [Advanced Scenarios](#advanced-scenarios)

---

## Basic Task Creation

### Example 1: Simple Web Search

**User**: "Create a task to search Google for 'TypeScript MCP servers'"

**Tool Call**: `bytebot_create_task`

```json
{
  "description": "Navigate to Google and search for 'TypeScript MCP servers'",
  "priority": "MEDIUM"
}
```

**Expected Response**:

```json
{
  "id": "task_abc123",
  "status": "PENDING",
  "priority": "MEDIUM",
  "description": "Navigate to Google and search for 'TypeScript MCP servers'",
  "createdAt": "2024-01-15T10:00:00Z"
}
```

### Example 2: Task with Priority

**User**: "Urgently create a task to close all browser tabs"

**Tool Call**: `bytebot_create_task`

```json
{
  "description": "Close all open browser tabs in the active browser window",
  "priority": "URGENT"
}
```

---

## Task Monitoring and Intervention

### Example 3: Create and Wait for Completion

**User**: "Create a task to log into Gmail and wait until it's done"

**Tool Call**: `bytebot_create_and_monitor_task`

```json
{
  "description": "Navigate to gmail.com and log in using saved credentials",
  "timeout": 120000,
  "pollInterval": 2000
}
```

**Expected Response** (after task completes):

```json
{
  "taskId": "task_def456",
  "finalStatus": "COMPLETED",
  "completedAt": "2024-01-15T10:02:15Z",
  "messagesCount": 8,
  "task": {
    "id": "task_def456",
    "status": "COMPLETED",
    "messages": [
      { "role": "assistant", "content": "Navigating to gmail.com" },
      { "role": "assistant", "content": "Login form detected, entering credentials" },
      { "role": "assistant", "content": "Successfully logged in" }
    ]
  }
}
```

### Example 4: Task Needs Help - Provide Intervention

**Scenario**: Task encounters a CAPTCHA and needs help

**Step 1 - Create and Monitor**:

```json
{
  "description": "Sign up for a new account on example.com"
}
```

**Response**:

```json
{
  "taskId": "task_ghi789",
  "finalStatus": "NEEDS_HELP",
  "task": {
    "status": "NEEDS_HELP",
    "messages": [
      {
        "role": "assistant",
        "content": "Encountered CAPTCHA verification. I cannot solve this automatically. Please help."
      }
    ]
  }
}
```

**Step 2 - User Solves CAPTCHA Manually, Then**:

**Tool Call**: `bytebot_intervene_in_task`

```json
{
  "taskId": "task_ghi789",
  "message": "CAPTCHA has been solved manually. Please continue with the signup process.",
  "action": "resume",
  "continueMonitoring": true
}
```

**Expected Response**:

```json
{
  "taskId": "task_ghi789",
  "status": "COMPLETED",
  "intervention": "applied",
  "task": {
    "status": "COMPLETED",
    "messages": [...]
  }
}
```

---

## Desktop Control Operations

### Example 5: Take Screenshot and Analyze

**User**: "Take a screenshot of my screen"

**Tool Call**: `bytebot_screenshot`

```json
{}
```

**Expected Response**:

```json
{
  "success": true,
  "screenshot": "iVBORw0KGgoAAAANSUhEUgAAA...",
  "action": "screenshot",
  "executedAt": "2024-01-15T10:05:00Z"
}
```

### Example 6: Click at Specific Coordinates

**User**: "Click at position (800, 400) with a double-click"

**Tool Call**: `bytebot_click`

```json
{
  "x": 800,
  "y": 400,
  "button": "left",
  "count": 2
}
```

### Example 7: Type Text in Active Window

**User**: "Type 'Hello, World!' in the active text field"

**Tool Call**: `bytebot_type_text`

```json
{
  "text": "Hello, World!",
  "delay": 50
}
```

### Example 8: Keyboard Shortcut

**User**: "Press Ctrl+C to copy"

**Tool Call**: `bytebot_press_keys`

```json
{
  "keys": ["ctrl", "c"]
}
```

### Example 9: Complex Mouse Operation

**User**: "Drag from (100, 200) to (500, 600)"

**Tool Call**: `bytebot_drag`

```json
{
  "from_x": 100,
  "from_y": 200,
  "to_x": 500,
  "to_y": 600
}
```

---

## Multi-Step Workflows

### Example 10: Open Browser and Navigate

**User**: "Execute a workflow to open Firefox, go to GitHub, and take a screenshot"

**Tool Call**: `bytebot_execute_workflow`

```json
{
  "steps": [
    {
      "name": "Launch Firefox",
      "description": "Switch to or launch Firefox browser application",
      "timeout": 30000
    },
    {
      "name": "Navigate to GitHub",
      "description": "Navigate to github.com in the browser address bar",
      "timeout": 30000
    },
    {
      "name": "Wait for Page Load",
      "description": "Wait for the GitHub homepage to fully load",
      "timeout": 15000
    },
    {
      "name": "Capture Screenshot",
      "description": "Take a screenshot of the GitHub homepage",
      "timeout": 10000
    }
  ],
  "priority": "HIGH",
  "stopOnFailure": true
}
```

**Expected Response**:

```json
{
  "steps": [
    {
      "name": "Launch Firefox",
      "taskId": "task_001",
      "status": "COMPLETED",
      "completedAt": "2024-01-15T10:10:05Z"
    },
    {
      "name": "Navigate to GitHub",
      "taskId": "task_002",
      "status": "COMPLETED",
      "completedAt": "2024-01-15T10:10:15Z"
    },
    {
      "name": "Wait for Page Load",
      "taskId": "task_003",
      "status": "COMPLETED",
      "completedAt": "2024-01-15T10:10:20Z"
    },
    {
      "name": "Capture Screenshot",
      "taskId": "task_004",
      "status": "COMPLETED",
      "completedAt": "2024-01-15T10:10:22Z"
    }
  ],
  "overallStatus": "completed",
  "totalInterventions": 0
}
```

### Example 11: Workflow with Error Recovery

**User**: "Create a workflow with retry on failure"

**Tool Call**: `bytebot_execute_workflow`

```json
{
  "steps": [
    {
      "name": "Download File",
      "description": "Download report.pdf from company server",
      "timeout": 60000,
      "retryOnFailure": true,
      "maxRetries": 3
    },
    {
      "name": "Open File",
      "description": "Open the downloaded report.pdf",
      "timeout": 30000
    }
  ],
  "stopOnFailure": false
}
```

---

## File Operations

### Example 12: Read File Content

**User**: "Read the contents of /home/user/notes.txt"

**Tool Call**: `bytebot_read_file`

```json
{
  "path": "/home/user/notes.txt"
}
```

**Expected Response**:

```json
{
  "success": true,
  "content": "VGhpcyBpcyBteSBub3RlIGZpbGUuCkxpbmUgMg==",
  "action": "read_file",
  "executedAt": "2024-01-15T10:15:00Z"
}
```

**Decoded Content**: "This is my note file.\nLine 2"

### Example 13: Write File Content

**User**: "Write 'Hello World' to /tmp/test.txt"

**Tool Call**: `bytebot_write_file`

```json
{
  "path": "/tmp/test.txt",
  "content": "SGVsbG8gV29ybGQ="
}
```

---

## Advanced Scenarios

### Example 14: List and Monitor All Tasks

**Step 1 - List Tasks**:

**Tool Call**: `bytebot_list_tasks`

```json
{
  "status": "IN_PROGRESS"
}
```

**Response**:

```json
{
  "count": 2,
  "tasks": [
    {
      "id": "task_123",
      "status": "IN_PROGRESS",
      "description": "Download large file"
    },
    {
      "id": "task_456",
      "status": "IN_PROGRESS",
      "description": "Process data"
    }
  ]
}
```

**Step 2 - Monitor Specific Task**:

**Tool Call**: `bytebot_monitor_task`

```json
{
  "taskId": "task_123",
  "timeout": 300000,
  "pollInterval": 5000
}
```

### Example 15: Cancel a Running Task

**Tool Call**: `bytebot_update_task`

```json
{
  "taskId": "task_789",
  "status": "CANCELLED",
  "message": "User requested cancellation"
}
```

### Example 16: Get Current Cursor Position

**Tool Call**: `bytebot_cursor_position`

```json
{}
```

**Expected Response**:

```json
{
  "success": true,
  "position": { "x": 1024, "y": 768 },
  "action": "cursor_position",
  "executedAt": "2024-01-15T10:20:00Z"
}
```

### Example 17: Complex Desktop Interaction

**Scenario**: Fill out a web form

```
1. Switch to browser
2. Click on name field (300, 200)
3. Type name
4. Click on email field (300, 250)
5. Type email
6. Click submit button (400, 350)
```

**Implementation**:

```json
{
  "steps": [
    {
      "name": "Switch to Browser",
      "description": "Switch to Firefox browser application"
    },
    {
      "name": "Enter Name",
      "description": "Click at coordinates (300, 200) and type 'John Doe'"
    },
    {
      "name": "Enter Email",
      "description": "Click at coordinates (300, 250) and type 'john@example.com'"
    },
    {
      "name": "Submit Form",
      "description": "Click the submit button at coordinates (400, 350)"
    }
  ]
}
```

---

## Tips and Best Practices

### 1. Task Descriptions

- Be specific and clear in task descriptions
- Include all necessary context
- Mention expected outcomes

**Good**: "Navigate to amazon.com, search for 'wireless mouse', and filter by 4+ star ratings"

**Bad**: "Search Amazon"

### 2. Timeouts

- Use longer timeouts for complex tasks (>60s)
- Use shorter timeouts for simple actions (<10s)
- Default timeout is 5 minutes for monitoring

### 3. Priority Levels

- Use `URGENT` sparingly for critical operations
- Default to `MEDIUM` for most tasks
- Use `LOW` for background/cleanup tasks

### 4. Intervention Handling

- Always monitor tasks that may need intervention
- Provide clear, actionable guidance in intervention messages
- Use `continueMonitoring: true` to wait for completion after intervention

### 5. Workflows

- Break complex operations into discrete steps
- Use `retryOnFailure` for unreliable operations (network requests)
- Set `stopOnFailure: false` for non-critical step sequences

### 6. Desktop Control

- Take screenshots first to identify coordinates
- Use `bytebot_wait` between rapid actions to allow UI updates
- Test coordinate positions with `bytebot_move_mouse` before clicking

---

## Error Scenarios and Handling

### Timeout Error

```json
{
  "error": "Task monitoring timeout after 300000ms. Task may still be running.",
  "details": "Consider increasing timeout or checking task manually"
}
```

**Solution**: Increase timeout or check task status separately

### Task Not Found

```json
{
  "error": "Task with ID task_xyz not found"
}
```

**Solution**: Verify task ID with `bytebot_list_tasks`

### ByteBot Unreachable

```json
{
  "error": "Cannot connect to ByteBot server. Please ensure ByteBot is running and the endpoint URL is correct.",
  "details": {
    "endpoint": "http://localhost:9991"
  }
}
```

**Solution**: Start ByteBot or check network configuration

---

## Performance Optimization

### Caching

- Task data is cached for 5 seconds to reduce API calls
- Set `useCache: false` in `bytebot_get_task` for fresh data

### Polling Intervals

- Default: 2 seconds
- Adjust based on task complexity:
  - Fast tasks: 1000ms
  - Slow tasks: 5000ms

### WebSocket vs Polling

- Enable WebSocket for real-time updates (no waiting for the next poll)
- Polling works fine for most use cases
- WebSocket reduces server load with many concurrent tasks

---

## Integration Examples

### Example: Automated Testing Workflow

```json
{
  "steps": [
    {
      "name": "Open Test Environment",
      "description": "Navigate to http://localhost:3000 in Firefox"
    },
    {
      "name": "Run Login Test",
      "description": "Fill login form with test credentials and submit"
    },
    {
      "name": "Verify Dashboard",
      "description": "Check that dashboard loads correctly"
    },
    {
      "name": "Take Evidence Screenshot",
      "description": "Capture screenshot of successful login"
    },
    {
      "name": "Logout",
      "description": "Click logout button"
    }
  ],
  "priority": "HIGH"
}
```

### Example: Data Entry Automation

```json
{
  "description": "Open CRM system, navigate to contacts page, and enter new contact details from contacts.csv file",
  "priority": "MEDIUM"
}
```

Monitor for `NEEDS_HELP` if field validation fails, then intervene with corrections.

---

## Conclusion

These examples demonstrate the full range of ByteBot MCP Server capabilities. Start with simple task creation, then progress to hybrid workflows and complex multi-step automation. Always monitor long-running tasks and be prepared to provide intervention when needed.

For more information, see the main [README.md](../README.md).
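
As a supplement to the file operation examples (Examples 12 and 13), the `content` values shown there appear to be base64-encoded text. The following Python sketch shows one way a client might prepare and decode those values. It assumes UTF-8 text and standard base64, which matches the sample strings in this document; the helper names (`encode_file_content`, `decode_file_content`, `build_write_file_args`) are hypothetical and not part of the ByteBot MCP Server.

```python
import base64
import json


def encode_file_content(text: str) -> str:
    """Encode text as base64, the form used for "content" in Example 13."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_file_content(encoded: str) -> str:
    """Decode a base64 "content" value, as returned in Example 12."""
    return base64.b64decode(encoded).decode("utf-8")


def build_write_file_args(path: str, text: str) -> dict:
    """Build an argument object shaped like the bytebot_write_file example.

    Only the "path" and "content" fields shown in this document are used;
    any other fields the tool may accept are unknown here.
    """
    return {"path": path, "content": encode_file_content(text)}


if __name__ == "__main__":
    # Arguments matching Example 13.
    print(json.dumps(build_write_file_args("/tmp/test.txt", "Hello World")))
    # Decode the sample response content from Example 12.
    print(decode_file_content("VGhpcyBpcyBteSBub3RlIGZpbGUuCkxpbmUgMg=="))
```

Keeping encoding and decoding in small helpers makes it easier to swap in a different text encoding if the files involved are not UTF-8.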
