AutoProbeMCP

by Wladastic

AutoProbeMCP - a browser for your Agent

A Model Context Protocol (MCP) server that provides browser automation capabilities using Playwright. This server enables AI assistants to interact with web pages through a standardized interface.

Perfect for web automation, testing, and debugging workflows with AI assistants including:

  • Chat.fans agents - Empower AI agents with web interaction capabilities in VS Code
  • GitHub Copilot Chat - Enhance your development workflow with browser automation
  • Any MCP-compatible AI assistant - Universal browser automation for AI tools

Features

  • Multi-browser support: Chromium, Firefox, and WebKit
  • Comprehensive automation: Navigate, click, type, screenshot, and more
  • JavaScript execution: Run custom scripts in the browser context
  • Element interaction: Wait for elements, get text content, and interact with forms
  • Screenshot capabilities: Capture full pages or viewport screenshots
  • Type-safe: Built with TypeScript and runtime validation using Zod

Installation

npm install
npm run build

Make sure Playwright browsers are installed:

npx playwright install

For system dependencies (Linux):

sudo npx playwright install-deps

Usage

VS Code Integration

Configure the MCP server in VS Code by adding to your settings.json or workspace configuration:

"mcp": {
  "servers": {
    "browser-automation": {
      "command": "node",
      "args": [
        "/home/yourUserName/mcp-browser-server/build/index.js"
      ],
      "env": {}
    }
  }
}

Once configured, Chat.fans agents and GitHub Copilot Chat can use browser automation tools for web testing, scraping, and automation tasks.

Available VS Code Tasks

  • Build: Ctrl+Shift+P → "Tasks: Run Task" → "build"
  • Development Mode: Ctrl+Shift+P → "Tasks: Run Task" → "dev"
  • Test MCP Server: Ctrl+Shift+P → "Tasks: Run Task" → "test-mcp-server"

Available Tools

  1. launch_browser - Start a new browser instance
  2. navigate - Go to a specific URL
  3. click_element - Click on page elements
  4. type_text - Enter text into form fields
  5. screenshot - Capture page screenshots
  6. get_element_text - Extract text from elements
  7. wait_for_element - Wait for elements to appear/disappear
  8. evaluate_javascript - Run custom JavaScript
  9. get_console_logs - Get browser console logs (log, info, warn, error, debug)
  10. analyze_screenshot - AI-powered screenshot analysis using Gemma3 (requires Ollama)
  11. get_page_info - Get current page information
  12. close_browser - Close the browser instance
  13. scroll - Scroll the page in the specified direction (up/down/left/right)
  14. check_scrollability - Check if the page is scrollable in specific directions
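
An MCP client invokes these tools by sending JSON-RPC 2.0 requests with the `tools/call` method. The following is a minimal sketch of what such a message might look like for the `navigate` tool. The helper `buildToolCall` and its types are hypothetical and not part of this project; the `url` argument is taken from the testing example in this README.

```typescript
// Shape of a JSON-RPC 2.0 request for the MCP "tools/call" method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper (not part of this project): builds the message a
// client would send to invoke one of the tools listed above.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {}
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Example: ask the server to open the login page.
const navigateCall = buildToolCall(1, "navigate", { url: "http://localhost:3000/login" });
console.log(JSON.stringify(navigateCall, null, 2));
```

In practice an MCP-compatible assistant builds and sends these messages for you; the sketch only illustrates what travels over the wire.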

Example: Web Application Testing

// Launch browser in headed mode for visual debugging
await launch_browser({ browser: "chromium", headless: false });

// Navigate to login page
await navigate({ url: "http://localhost:3000/login" });

// Fill in credentials
await type_text({ selector: "input[type='email']", text: "user@example.com" });
await type_text({ selector: "input[type='password']", text: "password123" });

// Submit form
await click_element({ selector: "button[type='submit']" });

// Wait for successful login
await wait_for_element({ selector: ".dashboard", timeout: 10000 });

// Check for any console errors during login
await get_console_logs({ level: "error" });

// Take screenshot of dashboard
await screenshot({ fullPage: true, path: "dashboard.png" });

// Get all console logs for debugging
await get_console_logs();

// Scroll down to see more content
await scroll({ direction: "down", pixels: 500, behavior: "smooth" });

// Check if page can be scrolled vertically
await check_scrollability({ direction: "vertical" });

// Scroll back to top
await scroll({ direction: "up", pixels: 500 });

Page Scrolling and Navigation

The MCP Browser Server includes comprehensive scrolling tools for navigating long pages and checking scroll capabilities:

Scroll Tool

The scroll tool allows you to scroll the page in any direction with fine-grained control:

// Scroll down by default amount (100px)
await scroll();

// Scroll in specific directions with custom distances
await scroll({ direction: "down", pixels: 300, behavior: "smooth" });
await scroll({ direction: "up", pixels: 200, behavior: "auto" });
await scroll({ direction: "left", pixels: 150 });
await scroll({ direction: "right", pixels: 150 });

// Smooth scrolling for better user experience
await scroll({ direction: "down", pixels: 500, behavior: "smooth" });

Parameters:

  • direction: "up", "down", "left", "right" (default: "down")
  • pixels: Number of pixels to scroll (default: 100)
  • behavior: "auto" or "smooth" (default: "auto")
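
To make the defaults concrete, here is a minimal sketch of how these parameters could be translated into the options object accepted by the browser's `window.scrollBy`. The helper `toScrollByOptions` and its type names are hypothetical; the server's actual implementation may differ.

```typescript
type ToolScrollDirection = "up" | "down" | "left" | "right";
type ToolScrollBehavior = "auto" | "smooth";

interface ScrollToolOptions {
  direction?: ToolScrollDirection;
  pixels?: number;
  behavior?: ToolScrollBehavior;
}

// Hypothetical helper: applies the documented defaults and converts the
// direction into signed horizontal/vertical offsets for window.scrollBy.
function toScrollByOptions(
  options: ScrollToolOptions = {}
): { left: number; top: number; behavior: ToolScrollBehavior } {
  const { direction = "down", pixels = 100, behavior = "auto" } = options;
  const left = direction === "left" ? -pixels : direction === "right" ? pixels : 0;
  const top = direction === "up" ? -pixels : direction === "down" ? pixels : 0;
  return { left, top, behavior };
}

// With no arguments, the documented defaults apply (down, 100px, auto).
console.log(toScrollByOptions());
console.log(toScrollByOptions({ direction: "left", pixels: 150 }));
```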

Scrollability Check Tool

The check_scrollability tool determines whether a page can be scrolled in specific directions:

// Check both vertical and horizontal scrollability
await check_scrollability({ direction: "both" });

// Check only vertical scrolling
await check_scrollability({ direction: "vertical" });

// Check only horizontal scrolling
await check_scrollability({ direction: "horizontal" });

Response includes:

  • Current scroll position
  • Maximum scroll distance
  • Whether scrolling is possible in each direction
  • Detailed position information
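
The exact response format is not documented here. As a rough illustration, the items above could be derived from standard DOM scroll metrics as in the sketch below. The function `describeScrollability` and all field names are assumptions chosen for this example, not the tool's real output.

```typescript
// Standard DOM scroll metrics, e.g. read from document.documentElement.
interface PageScrollMetrics {
  scrollTop: number;
  scrollLeft: number;
  scrollHeight: number;
  scrollWidth: number;
  clientHeight: number;
  clientWidth: number;
}

// Hypothetical helper: derives position, maximum scroll distance, and
// per-direction scrollability from the metrics above.
function describeScrollability(m: PageScrollMetrics) {
  const maxY = Math.max(0, m.scrollHeight - m.clientHeight);
  const maxX = Math.max(0, m.scrollWidth - m.clientWidth);
  return {
    position: { x: m.scrollLeft, y: m.scrollTop },
    max: { x: maxX, y: maxY },
    canScrollUp: m.scrollTop > 0,
    canScrollDown: m.scrollTop < maxY,
    canScrollLeft: m.scrollLeft > 0,
    canScrollRight: m.scrollLeft < maxX,
  };
}

// A 2000px-tall page in an 800px-tall viewport, scrolled to the top.
console.log(
  describeScrollability({
    scrollTop: 0, scrollLeft: 0,
    scrollHeight: 2000, scrollWidth: 1280,
    clientHeight: 800, clientWidth: 1280,
  })
);
```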

AI-Powered Screenshot Analysis

The analyze_screenshot tool provides AI-powered analysis of web pages using local Gemma3 models via Ollama. This feature can describe what's visible on a page, analyze page structure, and look for specific elements based on context.

Prerequisites

  1. Install Ollama: Download from ollama.ai
  2. Install Gemma3 model:
    ollama pull gemma3:4b
  3. Start Ollama service:
    ollama serve

Usage Examples

Basic Screenshot Analysis

// Take and analyze a screenshot with AI
await analyze_screenshot({ fullPage: true, model: "gemma3:4b" });

Detailed Structural Analysis

// Get detailed analysis of page structure
await analyze_screenshot({ detailed: true, pretext: "Focus on navigation elements and form fields" });

Context-Specific Analysis

// Look for specific elements or issues
await analyze_screenshot({
  pretext: "Check if there are any error messages or broken layouts",
  path: "error-check.png"
});

Parameters

  • fullPage (boolean): Capture entire scrollable page vs viewport only
  • path (string): Optional file path to save the screenshot
  • pretext (string): Additional context or specific instructions for the AI
  • model (string): AI model to use (default: "gemma3:4b")
  • detailed (boolean): Request detailed structural analysis

Supported Models

  • gemma3:4b (default, good balance of speed and quality)
  • Any other vision-capable model available in your Ollama installation
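
For orientation, the sketch below shows how the `model`, `pretext`, and `detailed` parameters could be combined into a request body for Ollama's `/api/generate` endpoint, which accepts base64-encoded images in an `images` array. The helper `buildOllamaRequest` and the prompt wording are assumptions made for illustration; the server's real prompts and request handling are not shown in this README.

```typescript
interface AnalyzeOptions {
  pretext?: string;
  model?: string;
  detailed?: boolean;
}

// Request body fields accepted by Ollama's /api/generate endpoint.
interface OllamaGenerateRequest {
  model: string;
  prompt: string;
  images: string[]; // base64-encoded image data
  stream: boolean;
}

// Hypothetical helper: the prompt text below is invented for this sketch.
function buildOllamaRequest(
  screenshotBase64: string,
  options: AnalyzeOptions = {}
): OllamaGenerateRequest {
  const { pretext, model = "gemma3:4b", detailed = false } = options;
  const parts = [
    detailed
      ? "Describe the structure of this web page in detail."
      : "Describe what is visible on this web page.",
  ];
  if (pretext) parts.push(pretext);
  return { model, prompt: parts.join(" "), images: [screenshotBase64], stream: false };
}

// The resulting object could be POSTed as JSON to a local Ollama instance,
// which listens on http://localhost:11434 by default.
const body = buildOllamaRequest("<base64 screenshot data>", {
  pretext: "Focus on navigation elements and form fields",
  detailed: true,
});
console.log(body.prompt);
```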

Development & Testing

Quick Setup

# One-command setup (installs dependencies, browsers, and builds)
npm run setup

# Or step by step:
npm install
npx playwright install
npm run build

Development Commands

# Build the project
npm run build

# Run in development mode
npm run dev

# Start the server
npm run start

# Development helper (shows all available commands)
npm run dev-helper help

Testing

The project includes comprehensive tests in the tests/ directory:

# Run basic communication test
npm run test

# Run browser automation demo
npm run test:demo

# Run AI analysis test (requires Ollama)
npm run test:ai-simple

# Check system status
npm run test:status

# Run all tests
npm run test:all

Development Helper

Use the development helper for common tasks:

# Show all available commands
npm run dev-helper help

# Quick setup from scratch
npm run dev-helper setup

# Run comprehensive tests
npm run dev-helper test

# Clean generated files
npm run dev-helper clean

For more details about testing, see tests/README.md.

Project Structure

mcp-browser-server/
├── src/                 # TypeScript source code
│   └── index.ts         # Main MCP server implementation
├── build/               # Compiled JavaScript output
├── tests/               # Test scripts and documentation
│   ├── README.md        # Testing documentation
│   ├── simple-test.mjs  # Basic communication test
│   ├── demo-test.mjs    # Browser automation demo
│   └── *.mjs            # Additional test files
├── screenshots/         # Generated screenshots from tests
├── package.json         # Project configuration
└── README.md            # This file

License

Dual License:

  • Personal Use: Free for personal, educational, and non-commercial use
  • Commercial Use: Requires a separate commercial license

See LICENSE for full terms. For commercial licensing inquiries, please contact us.
