Low Cost Browsing MCP Server
An MCP server aggregator for browser automation, parsing, and optional LLM-based data cleaning, built on Playwright.
Features
- Browser Automation: Control real browsers with JavaScript execution, login, clicks, typing
- Content Extraction: Extract text, HTML, tables, attributes, and screenshots
- Session Management: Persistent browser sessions with authentication flows
- LLM Integration: Transform and clean extracted data using various LLM providers
- Multiple Providers: Support for OpenAI, Anthropic, Ollama, and JAN AI
- IDE Integration: Works with Claude Desktop and Cursor IDE
Installation
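The original install snippet was not preserved here; a typical setup for a Node.js/Playwright project looks like the following (the repository URL is a placeholder, not the actual URL):

```bash
# Clone the repository (URL is a placeholder)
git clone <repository-url>
cd lc-browser-mcp

# Install Node.js dependencies
npm install

# Install the Playwright browser binaries
npx playwright install chromium
```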
Configuration
Create a `config/default.yaml` file, or point the `CONFIG_PATH` environment variable to your config file:
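A minimal sketch of what `config/default.yaml` might contain; the key names below are illustrative assumptions, not the project's confirmed schema:

```yaml
# Hypothetical configuration sketch; adjust keys to the project's actual schema
llm:
  provider: ollama        # ollama | jan | openai | anthropic
  host: localhost
  port: 11434             # Ollama's default API port
  janPort: 1337           # JAN AI's default local API port
  model: llama3
browser:
  headless: true
```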
Environment Variables
For Local LLMs (Recommended)
Ollama (free, no API keys required)
JAN AI (free, with optional API key)
For JAN, also configure environment variable if required:
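For example, a hypothetical `.env` entry (the variable name is an assumption; check the project's config loader):

```bash
JAN_API_KEY=your-jan-api-key
```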
For External LLM Providers (Optional)
Create a `.env` file only if you want to use external APIs:
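A sketch of such a `.env` file; the variable names are conventional for these SDKs but should be checked against the project's code:

```bash
# Only needed for paid external providers
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
```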
How to get API keys:
OpenAI:
- Go to https://platform.openai.com/api-keys
- Create a new API key
- Copy the key (format: `sk-...`)
Anthropic:
- Go to https://console.anthropic.com/
- Navigate to API Keys section
- Create a new key (format: `sk-ant-...`)
JAN AI:
- Download and install JAN from https://jan.ai/
- Launch JAN and load a model
- If API key is required, configure it in JAN Settings
For Ollama and JAN (local models): API keys are usually not required; just configure `host`, `port`, and `janPort` in the configuration.
Usage
Quick Start
- Install dependencies:
- Configure LLM (choose one option):
Option A - Ollama (recommended, free)
Configure in `config/default.yaml`:
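A sketch of the relevant section, assuming Ollama's default port (key names are illustrative):

```yaml
llm:
  provider: ollama
  host: localhost
  port: 11434        # Ollama's default API port
  model: llama3      # any model you have pulled locally
```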
Option B - JAN AI (free, graphical interface)
Configure in `config/default.yaml`:
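A sketch of the relevant section, assuming JAN's default local API port (key names are illustrative):

```yaml
llm:
  provider: jan
  host: localhost
  janPort: 1337      # JAN's default local API port
  model: <model-loaded-in-jan>
```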
If JAN requires an API key, add it to `.env`:
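For example (the variable name is an assumption):

```bash
JAN_API_KEY=your-jan-api-key
```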
Option C - External APIs (paid)
- Build project:
- Start server:
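For a typical TypeScript project these two steps would look like the following; `npm run build` is referenced later in this README, while the start script name is a conventional assumption:

```bash
npm run build   # compile TypeScript
npm start       # launch the MCP server over stdio
```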
Configuration for Claude Desktop
- Find Claude Desktop configuration file:
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
- Add MCP server configuration:
Important: Replace `/path/to/your` with the actual absolute path to your project.
To find the full path, run in project root:
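On macOS/Linux you can print the absolute path of the project directory with:

```shell
# Print the absolute path of the current directory
pwd
```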
Claude Desktop Configuration Examples:
For Ollama (no API keys):
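A sketch of the Claude Desktop entry; the server name and the `dist/index.js` build output path are assumptions, so adjust them to the project's actual layout:

```json
{
  "mcpServers": {
    "lc-browser-mcp": {
      "command": "node",
      "args": ["/path/to/your/dist/index.js"]
    }
  }
}
```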
For JAN AI (with API key):
For external APIs (OpenAI/Anthropic):
Combined configuration (all providers):
- Restart Claude Desktop
Configuration for Cursor IDE
- Find Cursor configuration file:
- macOS: `~/Library/Application Support/Cursor/User/settings.json`
- Windows: `%APPDATA%\Cursor\User\settings.json`
- Linux: `~/.config/Cursor/User/settings.json`

Or use the ready-made `cursor-mcp-config.json` file from the project.
- Find the full project path:
- Add MCP server to settings.json (replace paths with yours):
For Ollama (no API keys):
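A sketch of the Cursor entry; the exact settings key varies between Cursor versions, and this assumes an `mcpServers`-style block with a hypothetical build path:

```json
{
  "mcpServers": {
    "lc-browser-mcp": {
      "command": "node",
      "args": ["/path/to/your/dist/index.js"]
    }
  }
}
```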
For JAN AI (with API key):
For external APIs:
- Restart Cursor
- Activate MCP in chat:
- Open AI chat in Cursor (`Cmd/Ctrl + L`)
- Use `@lc-browser-mcp` to access the browsing tools
Testing
After setup, new tools will appear in Claude Desktop or Cursor. You can test them:
In Claude Desktop:
In Cursor IDE:
AI should respond something like:
Of course! I'll open example.com and extract the page title.
It will then execute the `navigate.open` and `extract.content` tools.
Available Tools
- navigate.open - Open URL and create page context
- navigate.goto - Navigate to URL in existing context
- interact.click - Click elements by CSS/text/role
- interact.type - Type text into input fields
- interact.wait - Wait for conditions
- extract.content - Extract page content (text/html/markdown)
- extract.table - Extract tables as JSON
- extract.attributes - Extract element attributes
- extract.screenshot - Take screenshots
- session.auth - Perform authentication sequences
- llm.transform - Transform data using LLM with custom instructions, JSON schema validation and optional preprocessing
Example: Extract Table from Website
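A hypothetical sequence of tool calls for this scenario; the parameter names are illustrative, not the server's confirmed schema:

```json
[
  { "tool": "navigate.open",  "args": { "url": "https://example.com/prices" } },
  { "tool": "extract.table",  "args": { "pageId": "page-1", "selector": "table" } },
  { "tool": "llm.transform",  "args": { "instruction": "clean and standardize the price column" } }
]
```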
Automatic Preprocessing
What is it?
Automatic preprocessing analyzes incoming data and cleans it before the main LLM processing step. It is a two-stage process:
- Preprocessing stage (automatic) — local LLM cleans and prepares data
- Main processing stage — target LLM processes already cleaned data
Why is it needed?
🎯 Token and cost savings — expensive APIs (OpenAI, Anthropic) receive already cleaned data
📊 Better quality results — LLM works with clean, structured data
⚡ Automation — no need to manually plan data cleaning
🔧 Smart adaptation — system understands what needs to be cleaned based on data type and task
How does it work?
The system automatically determines when preprocessing is needed:
Automatically enabled for:
- HTML content > 5000 characters
- Text > 3000 characters
- JSON arrays > 10 elements
- JSON objects > 20 fields
- Instructions with keywords: "clean", "extract", "parse", "standardize", "normalize"
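The trigger rules above can be sketched as a small function. This is an illustrative reimplementation of the documented thresholds, not the server's actual code, and the function name is hypothetical:

```typescript
// Illustrative sketch of the auto-preprocessing trigger rules.
// Thresholds come from the list above; the function name is hypothetical.
function shouldPreprocess(
  data: string | unknown[] | Record<string, unknown>,
  instruction: string = ""
): boolean {
  // Keyword triggers from the task instruction
  const keywords = ["clean", "extract", "parse", "standardize", "normalize"];
  if (keywords.some((k) => instruction.toLowerCase().includes(k))) return true;

  if (typeof data === "string") {
    // Crude HTML detection: any tag-like token
    const looksLikeHtml = /<[a-z][^>]*>/i.test(data);
    return looksLikeHtml ? data.length > 5000 : data.length > 3000;
  }
  if (Array.isArray(data)) return data.length > 10; // JSON arrays > 10 elements
  return Object.keys(data).length > 20;             // JSON objects > 20 fields
}
```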
Examples of automatic processing:
📄 HTML content — system removes:
- Navigation menus and sidebars
- Advertisement blocks and banners
- JavaScript code and CSS styles
- Comments and service information
- Focuses on main article/product content
📝 Text data — system fixes:
- Typos and grammar errors
- Multiple spaces and line breaks
- Duplicate sentences
- Illogical paragraph arrangement
📊 JSON data — system standardizes:
- Removes null and empty values
- Brings field names to unified style
- Converts dates to YYYY-MM-DD format
- Normalizes numeric values and currencies
- Merges duplicate records
Smart task adaptation:
The system analyzes your instruction and adapts preprocessing:
- "extract table" → preserves table structures
- "find products" → focuses on product cards
- "get article" → preserves main article text
- "structure data" → normalizes formats
Configuring automatic preprocessing:
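A sketch of what a preprocessing section in `config/default.yaml` might look like; the thresholds match the trigger rules above, but all key names are assumptions:

```yaml
preprocessing:
  enabled: auto            # auto | always | never
  provider: ollama         # local LLM used for the cleaning stage
  thresholds:
    htmlChars: 5000
    textChars: 3000
    jsonArrayItems: 10
    jsonObjectFields: 20
```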
Comparison: with and without preprocessing
❌ Without preprocessing:
✅ With automatic preprocessing:
Cost savings example:
- Processing 50KB HTML through GPT-4: ~$0.50
- With preprocessing: ~$0.05 (local cleaning) + ~$0.05 (GPT-4 for 5KB) = ~$0.10
- Savings: 80% + better result quality!
Instructions for Cursor IDE
Simple request (automatic preprocessing):
With explicit preprocessing:
For table extraction and cleaning:
Preprocessing Usage Examples
HTML cleaning before analysis:
Text normalization before structuring:
Table data cleaning:
Practical Scenarios for Cursor
Scenario 1: E-commerce product analysis
Scenario 2: News parsing
Scenario 3: Legal document processing
Development
Error Codes
- `nav_timeout` - Navigation timeout
- `selector_not_found` - Element not found
- `captcha_required` - CAPTCHA detected
- `dom_too_large` - Content exceeds size limits
- `llm_failed` - LLM processing error
- `page_not_found` - Invalid page ID
- `internal_error` - General server error
Contributing
We welcome contributions to the Low Cost Browsing MCP Server! Here's how you can help:
🚀 How to Contribute
- Fork the Repository
- Clone Your Fork
- Create a Feature Branch
- Make Your Changes
- Write clean, well-documented code
- Follow the existing code style
- Add tests for new functionality
- Update documentation as needed
- Test Your Changes
- Commit Your Changes
- Push to Your Fork
- Create a Pull Request
- Go to the original repository on GitHub
- Click "New Pull Request"
- Select your fork and branch
- Fill out the PR template with:
- Clear description of changes
- Link to any related issues
- Screenshots if applicable
- Testing instructions
📋 Pull Request Guidelines
Before submitting:
- ✅ Code builds without errors (`npm run build`)
- ✅ All tests pass (`npm test`)
- ✅ Docker tests work (`make test-unit`)
) - ✅ Code follows project conventions
- ✅ Documentation is updated
- ✅ Commit messages are descriptive
PR Requirements:
- Clear, descriptive title
- Detailed description of changes
- Reference to related issues (`Fixes #123`)
- Add reviewers if you know who should review
- Use labels: `bug`, `feature`, `documentation`, etc.
Review Process:
- Automated tests run via GitHub Actions
- Code review by maintainers
- Address any requested changes
- Final approval and merge
🐛 Reporting Issues
Found a bug? Please create an issue with:
- Clear title describing the problem
- Steps to reproduce the issue
- Expected behavior vs actual behavior
- Environment details (OS, Node.js version, etc.)
- Screenshots if applicable
- Error logs if available
💡 Feature Requests
Have an idea? Create an issue with:
- Clear description of the feature
- Use case - why is this needed?
- Proposed solution if you have one
- Alternative solutions you've considered
🏗️ Development Setup
- Prerequisites
- Install Dependencies
- Environment Setup
- Run Development Server
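A typical sequence for the steps above; the `.env.example` file and the `dev` script name are conventional assumptions, not confirmed by this README:

```bash
# Prerequisites: Node.js and npm
node --version

# Install dependencies
npm install

# Environment setup: copy the example env file if the repo provides one
cp .env.example .env   # hypothetical file name

# Run the development server
npm run dev            # script name assumed
```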
🧪 Testing
📖 Documentation
Help improve our documentation:
- Fix typos and grammar
- Add missing examples
- Improve API documentation
- Translate to other languages
- Add tutorials and guides
🤝 Code of Conduct
- Be respectful and inclusive
- Help others learn and grow
- Focus on constructive feedback
- Follow GitHub's community guidelines
📞 Getting Help
- 📖 Documentation: Check existing docs first
- 🐛 Issues: Search existing issues
- 💬 Discussions: Use GitHub Discussions for questions
- 📧 Contact: Reach out to maintainers
Thank you for contributing to Low Cost Browsing MCP Server! 🎉
License
MIT