Web-LLM MCP Server

An MCP server that uses Playwright to load and interact with an HTML page containing @mlc-ai/web-llm for local LLM inference.

Features

  • Browser-based LLM: Uses @mlc-ai/web-llm running in a Chromium browser instance

  • Playwright Integration: Automates browser interactions for seamless LLM operations

  • Multiple Tools: Generate text, chat, check status, change models, and take screenshots

  • Model Management: Support for various Web-LLM models with dynamic switching

Available Tools

playwright_llm_generate

Generate text using Web-LLM through the browser interface.

Parameters:

  • prompt (string): The prompt to generate text from

  • systemPrompt (string, optional): System prompt to set context

  • maxTokens (number, optional): Maximum tokens to generate

  • temperature (number, optional): Temperature for generation (0-1)

  • model (string, optional): Model to use (will reinitialize if different)
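As a rough illustration of how a client might invoke this tool, the sketch below builds a JSON-RPC 2.0 "tools/call" request, which is the message shape MCP uses for tool invocation. The helper name buildToolCall and all argument values are illustrative assumptions, not part of this server.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request (the MCP message for invoking a tool).
// buildToolCall is a hypothetical helper; the argument names mirror the
// parameter list above and the values are made up for illustration.
let nextId = 1;

function buildToolCall(name, args) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = buildToolCall("playwright_llm_generate", {
  prompt: "Explain WebGPU in one sentence.",
  systemPrompt: "You are a concise technical assistant.",
  maxTokens: 64,
  temperature: 0.7,
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library usually builds and sends this message for you; the raw shape is shown only to make the parameters concrete.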

playwright_llm_chat

Send a message in an interactive chat session and return the model's response. Earlier messages are kept as context unless clearHistory is set.

Parameters:

  • message (string): Message to send in the chat

  • clearHistory (boolean, optional): Clear chat history before sending
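The effect of clearHistory can be pictured with a small in-memory model. This is a sketch under assumptions: the ChatHistory class is hypothetical, and the server's actual history handling is not documented here. The role/content message shape follows the OpenAI-style chat format that @mlc-ai/web-llm accepts.

```javascript
// Hypothetical model of chat history handling; the real server may differ.
class ChatHistory {
  constructor() {
    this.messages = [];
  }

  // Returns a copy of the messages that would be sent to the model for this turn.
  prepareTurn(message, clearHistory = false) {
    if (clearHistory) this.messages = []; // drop all earlier context
    this.messages.push({ role: "user", content: message });
    return [...this.messages];
  }

  // Store the model's reply so the next turn can see it.
  recordReply(content) {
    this.messages.push({ role: "assistant", content });
  }
}

const history = new ChatHistory();
history.prepareTurn("What is WebGPU?");
history.recordReply("A browser API for GPU compute and graphics.");
const followUp = history.prepareTurn("Does Web-LLM use it?");
const fresh = history.prepareTurn("New topic", true);
console.log(followUp.length, fresh.length);
```

The follow-up turn carries the earlier exchange as context, while the turn sent with clearHistory starts from a single message.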

playwright_llm_status

Get the current status of the Web-LLM Playwright interface.

playwright_llm_set_model

Change the current Web-LLM model.

Parameters:

  • model (string): Model ID to switch to

playwright_llm_screenshot

Take a screenshot of the Web-LLM interface.

Parameters:

  • path (string, optional): Path to save screenshot
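On the Playwright side, a screenshot tool like this one could be a thin wrapper around page.screenshot(). The sketch below is an assumption about how that might look, not the server's code; takeScreenshot and its return shape are hypothetical, while the path and fullPage options are standard Playwright options.

```javascript
// Hypothetical wrapper around Playwright's page.screenshot().
// `page` is a Playwright Page; `path` is optional, like the tool parameter.
async function takeScreenshot(page, path) {
  const options = { fullPage: true };
  if (path) options.path = path; // Playwright writes the file only when a path is given
  const buffer = await page.screenshot(options); // resolves to a Buffer with the image
  return { saved: path ?? null, bytes: buffer.length };
}
```

When no path is supplied, the image is still available as a buffer, so a caller could return it inline instead of writing a file.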

Supported Models

  • Llama-3.2-1B-Instruct-q4f32_1-MLC (default)

  • Llama-3.2-3B-Instruct-q4f32_1-MLC

  • Phi-3.5-mini-instruct-q4f16_1-MLC

  • gemma-2-2b-it-q4f32_1-MLC

  • Mistral-7B-Instruct-v0.3-q4f16_1-MLC

  • Qwen2.5-1.5B-Instruct-q4f32_1-MLC
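A caller may want to check a model ID before requesting a switch, since switching triggers a reinitialization. The sketch below assumes the list above is exhaustive, which the server itself may not enforce; resolveModel is a hypothetical helper.

```javascript
// Model IDs copied from the list above; the first entry is the documented default.
const SUPPORTED_MODELS = [
  "Llama-3.2-1B-Instruct-q4f32_1-MLC",
  "Llama-3.2-3B-Instruct-q4f32_1-MLC",
  "Phi-3.5-mini-instruct-q4f16_1-MLC",
  "gemma-2-2b-it-q4f32_1-MLC",
  "Mistral-7B-Instruct-v0.3-q4f16_1-MLC",
  "Qwen2.5-1.5B-Instruct-q4f32_1-MLC",
];
const DEFAULT_MODEL = SUPPORTED_MODELS[0];

// Hypothetical helper: pick the model to use and report whether a reload is needed.
function resolveModel(requested, current = DEFAULT_MODEL) {
  const model = requested ?? current;
  if (!SUPPORTED_MODELS.includes(model)) {
    throw new Error(`Unsupported model: ${model}`);
  }
  return { model, needsReinit: model !== current };
}

console.log(resolveModel("Qwen2.5-1.5B-Instruct-q4f32_1-MLC"));
```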

Installation

  1. Install dependencies:

pnpm install

  2. Install Playwright browsers:

npx playwright install chromium

Usage

Start the MCP server:

node index.js

Or run the test:

node test.js

Technical Details

The server works by:

  1. Launching a headless Chromium browser using Playwright

  2. Loading the index.html file which contains the Web-LLM interface

  3. Waiting for the Web-LLM model to initialize

  4. Exposing browser functions through the window.webllmInterface object

  5. Providing MCP tools that call these browser functions
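The steps above can be sketched with Playwright's public API. This is a minimal illustration, not the server's actual code: generateOnce is a hypothetical function, and isReady() and generate() are assumed method names on window.webllmInterface, so check index.html for the real ones. The browser type is passed in as a parameter to keep the sketch free of imports.

```javascript
// Sketch of the launch, load, wait, call flow described above.
// `chromium` is Playwright's browser type: const { chromium } = require("playwright")
// ASSUMPTION: isReady() and generate() are hypothetical names on
// window.webllmInterface; the real interface may expose different methods.
async function generateOnce(chromium, pageUrl, prompt) {
  const browser = await chromium.launch({ headless: true }); // step 1
  try {
    const page = await browser.newPage();
    await page.goto(pageUrl); // step 2: load the HTML interface

    // Step 3: poll inside the page until the model reports ready. The timeout
    // is generous because the first run downloads the model weights.
    await page.waitForFunction(
      () => window.webllmInterface?.isReady?.() === true,
      null,
      { timeout: 10 * 60 * 1000 },
    );

    // Steps 4 and 5: call a browser-side function and return its result to Node.
    return await page.evaluate(
      (p) => window.webllmInterface.generate(p),
      prompt,
    );
  } finally {
    await browser.close(); // always release the browser, even on errors
  }
}
```

A call might look like generateOnce(chromium, pathToFileURL(path.resolve("index.html")).href, "Hello"), using pathToFileURL from node:url. A long-running server would more likely keep one browser and page open across tool calls instead of relaunching for each request.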

The HTML interface provides a complete Web-LLM implementation with:

  • Model initialization and loading progress

  • Chat interface for testing

  • JavaScript API for programmatic access

  • Error handling and status reporting

Notes

  • The first run is slower because the model must be downloaded and initialized

  • The browser runs in headless mode by default

  • Screenshots can be taken for debugging the interface

  • Switching models requires reinitialization, which takes time

  • The interface is fully self-contained in the HTML file
