# Web-LLM MCP Server
An MCP server that uses Playwright to load and drive an HTML page running @mlc-ai/web-llm for local, in-browser LLM inference.
## Features
- **Browser-based LLM**: Uses @mlc-ai/web-llm running in a Chromium browser instance
- **Playwright Integration**: Automates browser interactions for seamless LLM operations
- **Multiple Tools**: Generate text, chat, check status, change models, and take screenshots
- **Model Management**: Support for various Web-LLM models with dynamic switching
## Available Tools
### `playwright_llm_generate`
Generate text using Web-LLM through the browser interface.
**Parameters:**
- `prompt` (string): The prompt to generate text from
- `systemPrompt` (string, optional): System prompt to set context
- `maxTokens` (number, optional): Maximum tokens to generate
- `temperature` (number, optional): Temperature for generation (0-1)
- `model` (string, optional): Model to use (will reinitialize if different)
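For example, a client might invoke this tool with arguments like the sketch below (assumes an already-connected MCP client `client`; see the Usage section for the full setup):
```javascript
// Sketch: one generate call through a connected MCP client.
const result = await client.callTool({
  name: "playwright_llm_generate",
  arguments: {
    prompt: "Summarize what WebGPU is in two sentences.",
    systemPrompt: "You are a concise technical writer.", // optional
    maxTokens: 128,                                      // optional
    temperature: 0.7,                                    // optional, 0-1
  },
});
console.log(result.content);
```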
### `playwright_llm_chat`
Send a message in the ongoing chat session and return the response. Chat history persists between calls unless cleared.
**Parameters:**
- `message` (string): Message to send in the chat
- `clearHistory` (boolean, optional): Clear chat history before sending
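A chat call follows the same pattern (same assumed `client` as above):
```javascript
// Sketch: send one chat turn, clearing any earlier history first.
const reply = await client.callTool({
  name: "playwright_llm_chat",
  arguments: { message: "Hello! Which model are you running?", clearHistory: true },
});
```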
### `playwright_llm_status`
Get the current status of the Web-LLM Playwright interface.
### `playwright_llm_set_model`
Change the current Web-LLM model.
**Parameters:**
- `model` (string): Model ID to switch to
### `playwright_llm_screenshot`
Take a screenshot of the Web-LLM interface.
**Parameters:**
- `path` (string, optional): Path to save screenshot
## Supported Models
- `Llama-3.2-1B-Instruct-q4f32_1-MLC` (default)
- `Llama-3.2-3B-Instruct-q4f32_1-MLC`
- `Phi-3.5-mini-instruct-q4f16_1-MLC`
- `gemma-2-2b-it-q4f32_1-MLC`
- `Mistral-7B-Instruct-v0.3-q4f16_1-MLC`
- `Qwen2.5-1.5B-Instruct-q4f32_1-MLC`
## Installation
1. Install dependencies:
```bash
pnpm install
```
2. Install Playwright browsers:
```bash
npx playwright install chromium
```
## Usage
Start the MCP server:
```bash
node index.js
```
Or run the test:
```bash
node test.js
```
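To call the server from your own code, here is a minimal client sketch using the official `@modelcontextprotocol/sdk`; the client name and prompts are illustrative, while the tool names match the list above:
```javascript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn the server as a child process and connect over stdio.
const transport = new StdioClientTransport({ command: "node", args: ["index.js"] });
const client = new Client({ name: "web-llm-example-client", version: "1.0.0" });
await client.connect(transport);

// Check that the browser and model are ready, then generate text.
const status = await client.callTool({ name: "playwright_llm_status", arguments: {} });
console.log(status.content);

const result = await client.callTool({
  name: "playwright_llm_generate",
  arguments: { prompt: "Write a haiku about headless browsers." },
});
console.log(result.content);

await client.close();
```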
## Technical Details
The server works by:
1. Launching a headless Chromium browser using Playwright
2. Loading the `index.html` file which contains the Web-LLM interface
3. Waiting for the Web-LLM model to initialize
4. Exposing browser functions through the `window.webllmInterface` object
5. Providing MCP tools that call these browser functions
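Steps 4 and 5 boil down to calling page-side functions from Node with Playwright's `page.evaluate`. A minimal sketch of that bridge follows; the method names on `window.webllmInterface` are assumptions for illustration:
```javascript
// Sketch of the Node-to-browser bridge (interface method names are assumed).
import { chromium } from "playwright";

const browser = await chromium.launch({ headless: true });
const page = await browser.newPage();
await page.goto(new URL("./index.html", import.meta.url).href);

// Wait until the page reports that the model has finished loading.
await page.waitForFunction(() => window.webllmInterface?.ready === true, null, {
  timeout: 300_000, // first load can take minutes while weights download
});

// Call into the page; arguments are serialized across the boundary.
const text = await page.evaluate(
  (args) => window.webllmInterface.generate(args),
  { prompt: "Hello from Playwright!" }
);
console.log(text);
await browser.close();
```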
The HTML interface provides a complete Web-LLM implementation with:
- Model initialization and loading progress
- Chat interface for testing
- JavaScript API for programmatic access
- Error handling and status reporting
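On the page side, that JavaScript API amounts to creating a Web-LLM engine and exposing wrappers on `window`. A minimal sketch using the public @mlc-ai/web-llm API (the `webllmInterface` shape is an assumption mirroring the bridge above):
```javascript
// Inside index.html, in a <script type="module"> block.
import { CreateMLCEngine } from "https://esm.run/@mlc-ai/web-llm";

const engine = await CreateMLCEngine("Llama-3.2-1B-Instruct-q4f32_1-MLC", {
  initProgressCallback: (p) => console.log(p.text), // loading progress
});

// Expose a small API for Playwright's page.evaluate to call.
window.webllmInterface = {
  ready: true,
  async generate({ prompt, systemPrompt, maxTokens, temperature }) {
    const messages = [];
    if (systemPrompt) messages.push({ role: "system", content: systemPrompt });
    messages.push({ role: "user", content: prompt });
    const reply = await engine.chat.completions.create({
      messages,
      max_tokens: maxTokens,
      temperature,
    });
    return reply.choices[0].message.content;
  },
};
```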
## Notes
- The first run is slower because the model weights must be downloaded and initialized
- The browser runs in headless mode by default
- Screenshots can be taken for debugging the interface
- Switching models requires reinitialization, which takes time
- The interface is fully self-contained in the HTML file