Web-LLM MCP Server

by ragingwind
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is a Model Context Protocol (MCP) server that integrates Web-LLM with Playwright for browser-based local LLM inference. The server runs a headless Chromium browser that loads an HTML interface containing @mlc-ai/web-llm, and it provides MCP tools for text generation, chat, model management, and debugging.

## Key Commands

### Development

```bash
# Install dependencies
pnpm install

# Install Playwright browsers (required for first setup)
npx playwright install chromium

# TypeScript development
npm run build       # Compile TypeScript to dist/
npm run typecheck   # Check TypeScript types without emitting
npm start           # Start server in development mode (tsx)
npm run start:prod  # Start compiled server from dist/ in production

# Testing
npm test            # Run the test suite

# Linting and formatting (using Biome)
npm run lint        # Check for linting issues
npm run lint:fix    # Fix linting issues automatically
npm run format      # Format code
npm run check       # Run both linting and formatting checks
npm run check:fix   # Fix both linting and formatting issues
```

## Architecture

### Core Components

- **index.ts**: Main MCP server implementation with embedded HTML content and CDN-based Web-LLM loading (TypeScript)
- **test.ts**: Integration test that validates MCP protocol communication (TypeScript)
- **tsconfig.json**: TypeScript configuration with strict type checking
- **biome.json**: Code formatting and linting configuration

### Browser Integration Pattern

The server follows a unique architecture (sketched after this list):

1. TypeScript compiles server code to `dist/index.js`
2. Playwright launches a headless Chromium browser
3. The server sets HTML content programmatically via `page.setContent()`
4. Web-LLM is loaded from CDN via a module script in the HTML
5. Browser interface code is injected directly via `page.addScriptTag()`
6. The loaded code exposes a fully typed `window.webllmInterface` API
7. MCP tools communicate with the browser via `page.evaluate()` calls
8. All LLM operations happen in the browser context using CDN-loaded Web-LLM
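A minimal TypeScript sketch of steps 2-5, using Playwright's standard API. The constant `HTML_CONTENT` and the injected interface stub are illustrative placeholders, not the actual contents of index.ts:

```typescript
import { chromium, type Browser, type Page } from "playwright";

// Illustrative placeholder; the real HTML (with the esm.run module script
// that imports @mlc-ai/web-llm) is embedded in index.ts.
const HTML_CONTENT = `<!DOCTYPE html><html><body>
<script type="module">
  import * as webllm from "https://esm.run/@mlc-ai/web-llm";
  window.webllm = webllm;
</script>
</body></html>`;

let browser: Browser | null = null;
let page: Page | null = null;

async function initializeBrowser(): Promise<Page> {
  if (page) return page; // already initialized

  browser = await chromium.launch({ headless: true });
  page = await browser.newPage();

  // Forward all browser console output to stderr with a [browser] prefix.
  page.on("console", (msg) => console.error(`[browser] ${msg.text()}`));

  // Step 3: set the HTML content programmatically (no external file).
  await page.setContent(HTML_CONTENT, { waitUntil: "networkidle" });

  // Step 5: inject the interface code that exposes window.webllmInterface.
  await page.addScriptTag({ content: "/* interface code goes here */" });

  return page;
}
```

Once this page is live, the MCP tool handlers run everything through `page.evaluate()` calls against `window.webllmInterface`, per step 7.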
### MCP Tools Structure

All tools follow the same pattern (a handler sketch appears at the end of these notes):

- Input validation using Zod schemas
- Browser initialization check (`initializeBrowser()`)
- Browser function calls via `page.evaluate()`
- Error handling with console logging to stderr
- Structured response format with content arrays

Available tools:

- `playwright_llm_generate`: Text generation with optional parameters
- `playwright_llm_chat`: Interactive chat with history management
- `playwright_llm_status`: Status and model information
- `playwright_llm_set_model`: Dynamic model switching
- `playwright_llm_screenshot`: Interface debugging

### Supported Models

Default: `Llama-3.2-1B-Instruct-q4f32_1-MLC`

Also supports: Llama-3.2-3B, Phi-3.5-mini, gemma-2-2b, Mistral-7B, Qwen2.5-1.5B

## Important Notes

- Run `npm run build` before first use to compile TypeScript to `dist/`
- HTML content and interface code are embedded/loaded programmatically in the main server
- Web-LLM is loaded from CDN (esm.run) via a module script in the HTML
- No external HTML file dependencies; everything is self-contained
- Server logic is fully TypeScript with comprehensive type safety
- The first run downloads and initializes the LLM model (slow)
- Model switching triggers reinitialization
- The browser runs headless by default but can take screenshots for debugging
- All console output from the browser is logged to stderr with a `[browser]` prefix
- The server uses cleanup handlers for graceful browser termination
- Web-LLM is loaded externally from CDN, not bundled locally
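To make the shared tool pattern above concrete, here is a hedged sketch of one handler. The input schema shape and the `webllmInterface.generate()` call are assumptions for illustration; the actual signatures live in index.ts:

```typescript
import { z } from "zod";

// Hypothetical input schema for playwright_llm_generate; the real schema
// in index.ts may differ.
const GenerateInput = z.object({
  prompt: z.string(),
  maxTokens: z.number().int().positive().optional(),
});

async function handleGenerate(rawArgs: unknown) {
  const args = GenerateInput.parse(rawArgs); // 1. Zod input validation
  const page = await initializeBrowser();    // 2. Browser initialization check
  try {
    // 3. Browser function call via page.evaluate(), against the injected
    //    window.webllmInterface API.
    const text = await page.evaluate(
      ({ prompt, maxTokens }) =>
        (window as any).webllmInterface.generate(prompt, { maxTokens }),
      args,
    );
    // 5. Structured MCP response with a content array.
    return { content: [{ type: "text" as const, text }] };
  } catch (err) {
    console.error(`playwright_llm_generate failed: ${err}`); // 4. Errors to stderr
    throw err;
  }
}
```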

## MCP directory API

We provide all the information about MCP servers via our MCP API. For example:

```bash
curl -X GET 'https://glama.ai/api/mcp/v1/servers/ragingwind/web-llm-mcp-server'
```
