Gemini Web Automation MCP

MIT License

Overview InspectNew Endpoints Schema Related Servers Reviews Score

computer-use-mcp

README.md•28.8 kB

# Gemini Web Automation MCP [![PyPI](https://img.shields.io/pypi/v/computer-use-mcp.svg)](https://pypi.org/project/computer-use-mcp/) [![Python](https://img.shields.io/badge/python-3.10+-blue.svg)](https://python.org) [![Tests](https://github.com/vincenthopf/computer-use-mcp/actions/workflows/test.yml/badge.svg)](https://github.com/vincenthopf/computer-use-mcp/actions/workflows/test.yml) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE) [![MCP](https://img.shields.io/badge/MCP-2025-green.svg)](https://modelcontextprotocol.io) Production-ready Model Context Protocol (MCP) server providing AI-powered web browsing automation using Google's Gemini 2.5 Computer Use API. Built with FastMCP and optimized for 4-5x faster performance than baseline implementations. ## Table of Contents - [Overview](#overview) - [Features](#features) - [Quick Start](#quick-start) - [Installation](#installation) - [Configuration](#configuration) - [Usage](#usage) - [Architecture](#architecture) - [Browser Actions](#browser-actions) - [Performance](#performance) - [Best Practices](#best-practices) - [Troubleshooting](#troubleshooting) - [Development](#development) - [Deployment](#deployment) - [Roadmap](#roadmap) - [Changelog](#changelog) - [Security](#security) - [License](#license) - [Support](#support) ## Overview Gemini Web Automation MCP enables Claude Desktop and other MCP clients to perform intelligent web browsing automation. The AI agent navigates websites, clicks buttons, fills forms, searches for information, and interacts with web pages like a human user. **Key Statistics:** - 7 MCP tools for comprehensive browser control - 13 browser actions supported (click, type, scroll, navigate, etc.) - 4-5x faster than naive implementations - 90% context reduction with compact progress mode - 1440x900 optimized resolution (Gemini recommended) ## Features ### Core Capabilities **7 Production-Ready MCP Tools:** - `browse_web` - Synchronous web browsing with immediate completion - `start_web_task` - Start long-running tasks in background - `check_web_task` - Monitor progress with compact/full modes - `wait` - Intelligent rate limiting (1-60 seconds) - `stop_web_task` - Cancel running background tasks - `list_web_tasks` - View all active and completed tasks - `get_web_screenshots` - Retrieve session screenshots for verification **Advanced Features:** - Real-time progress tracking with timestamped events - Automatic screenshot capture at each step - Safety decision framework (Gemini safety controls) - Context-aware polling with recommended delays - Background task management with status tracking - Session-based screenshot organization ### Technical Highlights - **MCP Protocol Compliant:** Follows 2025 Model Context Protocol best practices - **Performance Optimized:** Conditional wait states (0.3-3s), fast page loads, single screenshot per turn - **Safety-First:** Implements Gemini's safety decision acknowledgment framework - **Production Ready:** Comprehensive error handling, logging, and validation - **User-Friendly:** Action-oriented tool names, clear descriptions, helpful examples ## Quick Start ### Prerequisites - Python 3.10 or higher - UV package manager ([Install UV](https://github.com/astral-sh/uv)) - Gemini API key ([Get one here](https://ai.google.dev/)) ### Installation (Super Simple!) **Option A: One-Line Install (PyPI)** ```json // Add to Claude Desktop config { "mcpServers": { "computer-use": { "command": "uvx", "args": ["computer-use-mcp"], "env": {"GEMINI_API_KEY": "your_key_here"} } } } ``` Then: `playwright install chromium` and restart Claude Desktop. Done! **Option B: Local Development** ```bash git clone https://github.com/vincenthopf/computer-use-mcp.git cd computer-use-mcp uv sync playwright install chromium cp .env.sample .env # Edit .env with your GEMINI_API_KEY uv run mcp dev server.py ``` The MCP Inspector will open at `http://localhost:6274` where you can test all tools interactively. ### Verification Run validation tests to ensure everything is set up correctly: ```bash uv run python3 test_server.py ``` All tests should pass with ✓ markers. ## Installation ### Method 1: UVX (Recommended - One Command) The simplest way to use this MCP server. Add to your Claude Desktop config (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS): ```json { "mcpServers": { "computer-use": { "command": "uvx", "args": ["computer-use-mcp"], "env": { "GEMINI_API_KEY": "your_api_key_here" } } } } ``` **That's it!** `uvx` automatically downloads and installs the package from PyPI. Restart Claude Desktop and you're ready to go. **Note:** You'll also need to install Playwright browsers once: ```bash playwright install chromium ``` ### Method 2: Local Development For development or contributing: ```bash git clone https://github.com/vincenthopf/computer-use-mcp.git cd computer-use-mcp uv sync playwright install chromium cp .env.sample .env # Edit .env with your GEMINI_API_KEY uv run mcp dev server.py ``` ### Method 3: Direct Git Install Install directly from GitHub without cloning: ```json { "mcpServers": { "computer-use": { "command": "uvx", "args": ["--from", "git+https://github.com/vincenthopf/computer-use-mcp.git", "computer-use-mcp"], "env": { "GEMINI_API_KEY": "your_api_key_here" } } } } ``` ## Configuration Create a `.env` file in the project root with the following variables: ### Required Configuration ```env # Gemini API Configuration GEMINI_API_KEY=your_api_key_here GEMINI_MODEL=gemini-2.5-computer-use-preview-10-2025 ``` ### Optional Configuration ```env # Browser Configuration SCREEN_WIDTH=1440 # Recommended by Google (don't change) SCREEN_HEIGHT=900 # Recommended by Google (don't change) HEADLESS=false # Set to 'true' for faster headless mode # Output Configuration (optional - defaults to system temp directory) # SCREENSHOT_OUTPUT_DIR=/custom/path/to/screenshots ``` **Note:** Screen resolution of 1440x900 is optimized for Gemini's Computer Use model. Other resolutions may degrade performance. ## Usage ### Synchronous Workflow For quick tasks that complete in under 10 seconds, use the synchronous `browse_web` tool: ```python # Example: Quick product search result = browse_web( task="Go to Amazon and find the top 3 gaming laptops with prices", url="https://www.amazon.com" ) # Response includes full results and progress history { "ok": true, "data": "Found top 3 gaming laptops...", "session_id": "20251017_143022_abc123", "screenshot_dir": "output_screenshots/20251017_143022_abc123", "progress": [ {"timestamp": "...", "type": "info", "message": "Started browser automation"}, {"timestamp": "...", "type": "turn", "message": "Turn 1/30"}, {"timestamp": "...", "type": "function_call", "message": "Action: navigate"} ] } ``` ### Asynchronous Workflow For long-running tasks (15+ seconds), use the async workflow to monitor progress: **Step 1: Start the task** ```python result = start_web_task( task="Research top 10 AI companies and compile details", url="https://www.google.com" ) # Returns immediately with task_id # {"task_id": "abc-123-def", "status": "running"} ``` **Step 2: Check progress** ```python # Wait 5 seconds wait(5) # Check status (compact mode - recommended) status = check_web_task(task_id="abc-123-def", compact=true) # Response shows recent progress summary { "ok": true, "task_id": "abc-123-def", "status": "running", "progress_summary": { "total_steps": 12, "recent_actions": [ "Turn 8/30", "Action: type_text_at", "Action: click_at" ] }, "recommended_poll_after": "2025-01-17T14:30:15Z" } ``` **Step 3: Get final result** ```python # Continue polling until status is "completed" final = check_web_task(task_id="abc-123-def") # When completed, result field contains data { "ok": true, "status": "completed", "result": { "ok": true, "data": "Compiled research on 10 AI companies...", "session_id": "..." } } ``` ### All 7 MCP Tools #### 1. browse_web Synchronous web browsing that waits for task completion. **Parameters:** - `task` (string, required): Natural language description of what to accomplish - `url` (string, optional): Starting URL (defaults to Google) **Returns:** - `ok` (boolean): Success status - `data` (string): Task completion message with results - `session_id` (string): Unique session identifier - `screenshot_dir` (string): Path to saved screenshots - `progress` (array): Full progress history with timestamps - `error` (string): Error message if task failed **Example:** ```python browse_web( task="Go to example.com and click the login button", url="https://example.com" ) ``` #### 2. start_web_task Start a long-running web task in the background. **Parameters:** - `task` (string, required): Natural language task description - `url` (string, optional): Starting URL **Returns:** - `ok` (boolean): Task started successfully - `task_id` (string): Unique ID for checking progress - `status` (string): Always "running" initially - `message` (string): Instructions for checking progress **Example:** ```python start_web_task( task="Research and compare prices for iPhone 15 on 5 different sites" ) ``` #### 3. check_web_task Monitor progress of a background task. **Parameters:** - `task_id` (string, required): Task ID from start_web_task - `compact` (boolean, optional): Return summary only (default: true) **Returns (compact mode):** - `ok` (boolean): Success status - `task_id` (string): Task identifier - `status` (string): "pending" | "running" | "completed" | "failed" | "cancelled" - `progress_summary` (object): Recent actions and total steps - `result` (object): Task results (when completed) - `error` (string): Error message (when failed) - `recommended_poll_after` (string): ISO timestamp for next check **Example:** ```python check_web_task(task_id="abc-123", compact=true) ``` #### 4. wait Pause execution for rate limiting and polling delays. **Parameters:** - `seconds` (integer, required): Wait time in seconds (1-60) **Returns:** - `ok` (boolean): Success status - `waited_seconds` (integer): Actual wait time - `message` (string): Confirmation message **Example:** ```python wait(5) # Wait 5 seconds between status checks ``` #### 5. stop_web_task Cancel a running background task. **Parameters:** - `task_id` (string, required): Task ID to cancel **Returns:** - `ok` (boolean): Cancellation success - `message` (string): Confirmation message - `task_id` (string): Cancelled task ID - `error` (string): Error if task not found **Example:** ```python stop_web_task(task_id="abc-123") ``` #### 6. list_web_tasks View all tasks (active and completed). **Parameters:** None **Returns:** - `ok` (boolean): Success status - `tasks` (array): List of task status objects - `count` (integer): Total number of tasks - `active_count` (integer): Number of running tasks **Example:** ```python list_web_tasks() ``` #### 7. get_web_screenshots Retrieve screenshots from a completed session. **Parameters:** - `session_id` (string, required): Session ID from browse_web or check_web_task **Returns:** - `ok` (boolean): Success status - `screenshots` (array): List of screenshot file paths - `session_id` (string): Session identifier - `count` (integer): Number of screenshots - `error` (string): Error if session not found **Example:** ```python get_web_screenshots(session_id="20251017_143022_abc123") ``` ## Architecture ``` ┌─────────────────────────────────────────┐ │ MCP Client (Claude Desktop, etc.) │ └─────────────────┬───────────────────────┘ │ MCP Protocol (JSON-RPC) ┌─────────────────▼───────────────────────┐ │ FastMCP Server │ │ 7 Tools: browse_web, start_web_task, │ │ check_web_task, wait, stop_web_task, │ │ list_web_tasks, get_web_screenshots │ └─────────────────┬───────────────────────┘ │ ┌─────────────────▼───────────────────────┐ │ BrowserTaskManager │ │ Manages async task queue, status │ │ tracking, and background threads │ └─────────────────┬───────────────────────┘ │ ┌─────────────────▼───────────────────────┐ │ GeminiBrowserAgent │ │ Executes browser automation loop: │ │ Screenshot → Gemini → Actions → Loop │ └─────────────────┬───────────────────────┘ │ ┌─────────┴─────────┐ │ │ ┌───────▼────────┐ ┌──────▼────────┐ │ Gemini API │ │ Playwright │ │ (Vision+AI) │ │ (Chromium) │ └────────────────┘ └───────────────┘ ``` ### How It Works 1. **Task Submission:** MCP client sends natural language task + optional URL 2. **Browser Launch:** Playwright launches Chromium (1440x900 viewport) 3. **Gemini Loop:** Screenshot → Gemini vision analysis → Browser actions → Screenshot (repeat) 4. **Completion:** Gemini returns text response when task is complete 5. **Cleanup:** Browser closed, screenshots saved to session directory ## Browser Actions The Gemini Computer Use API supports 13 browser automation actions: | Action | Description | Example Use | |--------|-------------|-------------| | `navigate` | Go to a URL | Navigate to https://example.com | | `click_at` | Click at normalized coordinates (0-999) | Click button at position (500, 300) | | `hover_at` | Hover at normalized coordinates | Hover over menu item | | `type_text_at` | Type text at coordinates | Type search query in input field | | `key_combination` | Press keyboard combinations | Press Enter, Ctrl+A, etc. | | `scroll_document` | Page-level scrolling | Scroll down one page | | `scroll_at` | Scroll at specific coordinates | Scroll within specific element | | `drag_and_drop` | Drag from source to destination | Reorder list items | | `go_back` | Navigate backward in history | Return to previous page | | `go_forward` | Navigate forward in history | Go to next page | | `search` | Navigate to Google search | Start a Google search | | `wait_5_seconds` | Wait 5 seconds | Wait for dynamic content | | `open_web_browser` | No-op (browser already open) | - | **Coordinate System:** Gemini uses a normalized 1000x1000 grid. The agent automatically converts to actual pixels based on your screen resolution. ## Performance ### Benchmark Results | Task Type | Duration | Turns | Description | |-----------|----------|-------|-------------| | Simple | 5-8 seconds | 1-2 | Navigate to a page, verify content | | Medium | 20-40 seconds | 5-8 | Search, click, extract information | | Complex | 60-120 seconds | 15-30 | Multi-step workflows, data compilation | ### Optimization Strategies **Before Optimization:** - 6 seconds wait after every action - `networkidle` page loads (waits for all network requests) - 3 screenshots captured per turn - Sequential action execution **After Optimization:** - 0.3-3 seconds conditional waits (navigation only) - `domcontentloaded` page loads (DOM ready) - 1 screenshot per turn (reused for all responses) - Parallel function execution when possible **Performance Improvement:** 4-5x faster execution ### Latency Breakdown (Per Turn) - Gemini API call: 2-4 seconds (vision processing) - Browser action: 0.3-3 seconds (optimized waits) - Screenshot capture: <0.5 seconds - **Total per turn:** ~3-8 seconds ## Best Practices ### Task Design **1. Be Specific and Clear** ```python # ❌ Bad: Vague task "Search for stuff" # ✅ Good: Specific task "Go to Amazon, search for 'wireless headphones', and find the top 3 results with prices" ``` **2. Choose the Right Tool** ```python # ❌ Bad: Async for simple tasks start_web_task("Navigate to google.com") # Requires 3+ tool calls # ✅ Good: Sync for simple tasks browse_web("Navigate to google.com") # Single tool call ``` **3. Use Appropriate Polling** ```python # ❌ Bad: Poll too frequently check_web_task(task_id) # Called every 0.5 seconds # ✅ Good: Poll every 3-5 seconds wait(5) check_web_task(task_id) ``` ### Error Handling **4. Handle All Status States** ```python status = check_web_task(task_id) if status["status"] == "completed": process_result(status["result"]) elif status["status"] == "failed": handle_error(status["error"]) elif status["status"] == "running": continue_polling() ``` **5. Use Compact Mode** ```python # ✅ Recommended: Compact mode (90% smaller) check_web_task(task_id, compact=true) # ⚠️ Only when needed: Full progress check_web_task(task_id, compact=false) ``` ### Security **6. Review Screenshots** Always check saved screenshots to verify agent behavior, especially for sensitive operations. **7. Environment Variables** Never commit API keys. Always use environment variables or secure vaults. **8. Rate Limiting** Use the `wait` tool to respect rate limits and avoid overwhelming services. **9. Domain Validation** Be cautious with user-provided URLs. Consider implementing domain allowlists for production. **10. Logging and Audit** All actions are logged with timestamps. Review logs for debugging and compliance. ## Troubleshooting ### Common Issues **Error: 400 INVALID_ARGUMENT (safety decision)** - **Solution:** This is fixed in v1.0.0. Update to latest version. **Context window filling up too fast** - **Solution:** Use `compact=true` in `check_web_task` (default behavior). **Tasks timing out at 30 turns** - **Solution:** Break complex tasks into smaller subtasks or increase `max_turns` in `browser_agent.py`. **Browser not visible during execution** - **Solution:** Set `HEADLESS=false` in `.env` to see the browser window. **"No module named 'mcp'" error** - **Solution:** Activate virtual environment and run `uv sync`. **"GEMINI_API_KEY environment variable not set"** - **Solution:** Create `.env` file with your API key or set it in Claude Desktop config. **"Executable doesn't exist" (Playwright)** - **Solution:** Run `playwright install chromium`. **MCP server not showing in Claude Desktop** - **Solution:** Verify absolute paths in config, ensure `uv` is in PATH, restart Claude Desktop completely. ### FAQ **Q: Can I use both synchronous and asynchronous workflows?** A: Yes! Use `browse_web` for quick tasks (<10s) and `start_web_task` for long tasks (>15s). **Q: What happens to old completed tasks?** A: Tasks auto-cleanup after 24 hours to free memory. **Q: Can I check progress of a synchronous task?** A: No, but the response includes full progress history after completion. **Q: How many tasks can run simultaneously?** A: Unlimited. Each task runs in its own browser instance and thread. **Q: Does Claude automatically poll async tasks?** A: No, Claude must manually call `check_web_task()` multiple times with `wait()` between calls. **Q: Can I cancel a task mid-execution?** A: Yes, use `stop_web_task(task_id)` to cancel any running task. ## Development ### Setup for Contributors **1. Fork and clone:** ```bash git clone https://github.com/YOUR_USERNAME/gemini-web-automation-mcp.git cd gemini-web-automation-mcp ``` **2. Install dependencies:** ```bash uv sync playwright install chromium ``` **3. Create `.env`:** ```bash cp .env.sample .env # Add your GEMINI_API_KEY ``` **4. Test your setup:** ```bash uv run mcp dev server.py ``` ### Testing **Manual Testing:** ```bash # Start MCP Inspector uv run mcp dev server.py # Opens web interface at http://localhost:6274 # Test all tools interactively ``` **Validation Tests:** ```bash # Run automated validation uv run python3 test_server.py ``` ### Coding Standards - Follow PEP 8 style guide - Use type hints for all function signatures - Write docstrings for public functions/classes - Keep functions focused and small - Use meaningful variable names **Example Function:** ```python async def check_web_task(task_id: str, compact: bool = True) -> dict[str, Any]: """ Check progress of a background web browsing task. Args: task_id: Task ID from start_web_task() compact: Return summary only (default: True) Returns: Dictionary containing task status and progress Raises: ValueError: If task_id is invalid """ # Implementation pass ``` ### Contributing **How to Contribute:** 1. Fork the repository 2. Create a feature branch: `git checkout -b feature/amazing-feature` 3. Make your changes following coding standards 4. Test your changes with MCP Inspector 5. Update documentation if needed 6. Commit with clear message: `git commit -m "Add: Brief description"` 7. Push to your fork: `git push origin feature/amazing-feature` 8. Open a Pull Request **Commit Message Prefixes:** - `Add:` - New features - `Fix:` - Bug fixes - `Update:` - Updates to existing features - `Refactor:` - Code refactoring - `Docs:` - Documentation changes - `Test:` - Adding or updating tests **Pull Request Checklist:** - All tests pass - Documentation updated - Code follows style guidelines - No breaking changes (or clearly documented) - Commit messages are clear ### Tool Design Principles When adding or modifying MCP tools: 1. **User-focused naming:** `browse_web` not `execute_browser_automation` 2. **Clear descriptions:** Explain what users accomplish, not technical details 3. **Action-oriented:** Use verbs (check, start, stop, wait) 4. **Proper validation:** Validate inputs and provide helpful error messages 5. **Consistent responses:** Always return `{"ok": bool, ...}` format ## Deployment ### Pre-Deployment Checklist **Project Files:** - [x] README.md with comprehensive documentation - [x] LICENSE (MIT) - [x] CHANGELOG.md - [x] pyproject.toml with proper metadata - [x] .gitignore with comprehensive rules - [x] .env.sample template - [x] Core files (server.py, browser_agent.py, task_manager.py) **Code Quality:** - [x] All tests passing - [x] Safety decision bug fixed - [x] Compact progress mode implemented - [x] MCP best practices followed - [x] Performance optimizations applied ### GitHub Release Process **1. Create GitHub Repository:** ```bash # On GitHub, create new repository "gemini-web-automation-mcp" # Do NOT initialize with README (we have it) ``` **2. Push to GitHub:** ```bash git remote add origin https://github.com/YOUR_USERNAME/gemini-web-automation-mcp.git git branch -M main git push -u origin main ``` **3. Create Release Tag:** ```bash git tag -a v1.0.0 -m "Version 1.0.0 - Initial production release" git push origin v1.0.0 ``` **4. Create GitHub Release:** - Go to Releases → Create a new release - Choose tag: `v1.0.0` - Release title: `v1.0.0 - Initial Production Release` - Description: Copy from CHANGELOG.md - Publish release ### Repository Settings **Configure:** - Description: Production-ready MCP server for AI-powered web automation - Topics: `mcp`, `gemini`, `browser-automation`, `claude-desktop`, `ai-agents`, `playwright` - Enable Issues for bug reports - Enable Discussions for community support ### Distribution Methods **Method 1: Direct Git Clone (Recommended)** ```bash git clone https://github.com/yourusername/gemini-web-automation-mcp.git cd gemini-web-automation-mcp uv sync playwright install chromium ``` **Method 2: UVX (Future - when published to PyPI)** ```bash uvx gemini-web-automation-mcp ``` ## Roadmap ### Planned Features **Priority 1: Security Enhancements** - Human-in-the-loop confirmation UI for safety decisions - Domain allowlist/blocklist for navigation - Input sanitization and validation - Container-based sandboxing for production **Priority 2: Reliability** - Retry logic with exponential backoff - Better error recovery mechanisms - Network resilience improvements - Connection timeout handling **Priority 3: Functionality** - Cookie and session management - Form auto-fill templates - Multi-tab support - Mobile viewport emulation **Priority 4: Developer Experience** - Proxy support for corporate environments - Custom wait conditions - Advanced screenshot comparison - Performance profiling tools ### Known Limitations - Maximum 30 turns per task (configurable but not recommended to increase) - Browser automation only (no desktop OS-level control) - Single browser instance per task (no tab switching) - Limited to Chromium (Firefox/WebKit not supported) - Safety confirmations not yet implemented (future enhancement) ## Changelog ### [1.0.0] - 2025-01-17 **Initial production-ready release** **Added:** - 7 MCP tools for comprehensive browser automation - Real-time progress tracking with compact mode (90% size reduction) - Safety decision framework (fixes 400 INVALID_ARGUMENT error) - Context-aware polling with recommended delay timestamps - Performance optimizations (4-5x faster than baseline) - Automatic screenshot capture at each step - Background task management with status tracking - Comprehensive documentation and examples **Performance:** - Conditional wait states (0.3-3s vs 6s per action) - Fast page loads (`domcontentloaded` vs `networkidle`) - Single screenshot per turn (eliminates duplicates) - Parallel function execution for batch operations **Security:** - Safety decision acknowledgment implementation - Environment-based API key configuration - Comprehensive logging for audit trails - Screenshot-based verification capability [Full changelog →](CHANGELOG.md) ## Security ### Safety Decision Framework This MCP server implements Gemini's safety decision framework to prevent: - Financial transactions without confirmation - Sensitive data access without review - System-level changes without approval - CAPTCHA bypassing attempts - Potentially harmful actions ### Best Practices 1. **Sandboxing:** Run in containerized environment for production deployment 2. **API Key Security:** Use environment variables, never commit keys to version control 3. **Rate Limiting:** Respect built-in polling delays (recommended 5 seconds) 4. **Human-in-Loop:** Review outputs before taking actions based on results 5. **Logging:** All actions are logged with timestamps for audit trail 6. **Screenshot Review:** Check saved screenshots to verify agent behavior ### Reporting Vulnerabilities For security vulnerabilities, please email security@example.com with: - Description of the vulnerability - Steps to reproduce - Potential impact - Suggested fix (if available) **Do not** open public issues for security vulnerabilities. ## License This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. ## Acknowledgments Built with excellence using: - **Google Gemini Team** - Gemini 2.5 Computer Use API - **Anthropic** - Model Context Protocol specification - **FastMCP** - Excellent MCP server framework - **Playwright** - Robust browser automation library ## Support ### Get Help - **Documentation:** You're reading it! - **GitHub Issues:** [Report bugs or request features](https://github.com/yourusername/gemini-web-automation-mcp/issues) - **GitHub Discussions:** [Ask questions and share ideas](https://github.com/yourusername/gemini-web-automation-mcp/discussions) ### Useful Links - [Gemini API Documentation](https://ai.google.dev/gemini-api/docs/computer-use) - [Model Context Protocol Docs](https://modelcontextprotocol.io) - [FastMCP Documentation](https://gofastmcp.com) - [Playwright Documentation](https://playwright.dev) ### Community - Share your use cases and automations - Contribute improvements and bug fixes - Help others in GitHub Discussions - Star the repository if you find it useful --- **Built with Gemini 2.5 Computer Use API** MCP Protocol 2025 | Production Ready | Open Source

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/vincenthopf/computer-use-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server