Docker: Automatically manages SearXNG container deployment and lifecycle, including auto-start, configuration mounting, and health checks
DuckDuckGo: Primary search provider for web search queries, with automatic fallback to other providers
Mozilla Readability: Intelligent content extraction and parsing from web pages
Ollama (optional): Progressive summarization of web content using local LLM models, with automatic fallback to truncation if unavailable
SearXNG: Secondary search provider with automatic Docker container management for local deployment, used as fallback when DuckDuckGo fails
SQLite cache: Persistent caching of scraped content to avoid redundant requests, with a 1-hour TTL for deferred content fetching
ISIS MCP
An open-source MCP (Model Context Protocol) server for local web scraping with RAG capabilities. Provides a free, API-key-free alternative to Apify RAG Web Browser.
Features
RAG Tool: Intelligent web search with content extraction, combining multi-provider fallback (DuckDuckGo → SearXNG → ScraperAPI), Mozilla Readability parsing, and Markdown conversion
Scrape Tool: Extract content from specific URLs with optional CSS selectors
Screenshot Tool: Capture visual snapshots of web pages
SQLite Caching: Persistent cache to avoid redundant requests
Parallel Processing: Efficiently handle multiple page extractions
No API Keys Required: Self-contained, privacy-focused approach
Installation
Step 1: Install Globally
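For example, assuming the package is published as isis-mcp (the name used by the npm list check under Troubleshooting):
npm install -g isis-mcp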
Step 2: Register with Claude Code
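Using the same command the Troubleshooting section suggests re-running:
claude mcp install isis-mcp -s user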
This registers the MCP in user scope (available across all projects).
Important: Restart Claude Code after installation.
Step 3: Search Providers (Auto-configured)
ISIS-MCP uses an automatic fallback chain - no configuration needed:
Priority | Provider | Config Required | Notes |
1 | DuckDuckGo | None | Primary, always available |
2 | SearXNG Local | Docker installed | Auto-starts container on first use |
3 | ScraperAPI | SCRAPER_API_KEY env var | Optional paid fallback |
4 | Public SearXNG | None | Free but slower/unreliable |
Option A: Docker SearXNG (Recommended)
Just have Docker installed - ISIS-MCP handles the rest:
Manual commands:
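A sketch of the equivalent manual Docker commands, assuming the upstream searxng/searxng image and the default port 8080 used elsewhere in this README:
# Run SearXNG locally on port 8080
docker run -d --name searxng -p 8080:8080 searxng/searxng
# Stop / restart the container later
docker stop searxng
docker start searxng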
Option B: ScraperAPI (Optional - Paid Fallback)
Create account at ScraperAPI
Set environment variable:
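For example (SCRAPER_API_KEY is the variable checked in the Troubleshooting section below):
export SCRAPER_API_KEY="your_api_key_here"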
Make it permanent (add to ~/.zshrc or ~/.bashrc):
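For example, for zsh:
echo 'export SCRAPER_API_KEY="your_api_key_here"' >> ~/.zshrc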
Alternative: Via Claude Code CLI (Legacy)
If you prefer npx-based installation:
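A plausible invocation, assuming Claude Code's standard claude mcp add syntax (not taken verbatim from this project):
claude mcp add isis-mcp -- npx -y isis-mcp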
For user-level global installation:
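The same registration with user scope:
claude mcp add isis-mcp -s user -- npx -y isis-mcp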
Manual Configuration
Add the following to your claude_desktop_config.json:
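A minimal entry, assuming the npx-based launch shown above (adjust command and args if you installed globally):
{
  "mcpServers": {
    "isis-mcp": {
      "command": "npx",
      "args": ["-y", "isis-mcp"]
    }
  }
}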
Troubleshooting Installation
"All search providers failed"
Cause: No provider configured or available.
Solution:
Configure SearXNG Local (Option A) OR ScraperAPI (Option B)
Verify service is running:
curl "http://localhost:8080/search?q=test&format=json"
If using ScraperAPI, confirm the env var is set:
echo $SCRAPER_API_KEY
Slow Performance
Global vs npx comparison:
Method | Startup | Cache | Re-download | Recommended |
npx | ~1-3s | NPX cache | Yes (3-7 days) | ❌ |
Global install | ~240ms | Persistent | Never | ✅ |
If still slow:
Is SearXNG Local running?
Is ScraperAPI key configured?
Are public instances overloaded?
Claude Code Not Detecting MCP
Verify installation:
npm list -g isis-mcp
Restart Claude Code completely
Check MCP status:
claude mcp list (if available)
Re-run:
claude mcp install isis-mcp -s user
Available Tools
rag (Primary Tool)
Web search with intelligent content extraction. Works like Apify RAG Web Browser:
Search via multi-provider fallback (DuckDuckGo → SearXNG → ScraperAPI → Public instances)
Extract content from discovered pages in parallel
Convert to Markdown using Mozilla Readability
Return structured result with caching
Parameters:
query (required): Search term
maxResults (optional): Maximum number of pages to retrieve (1-10, default: 5)
outputFormat (optional): markdown | text | html (default: markdown)
useJavascript (optional): Render JavaScript with Playwright (default: false)
Example:
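An illustrative call (values are arbitrary):
{
  "query": "model context protocol server examples",
  "maxResults": 3,
  "outputFormat": "markdown"
}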
scrape
Extract content from a specific URL.
Parameters:
url (required): Page URL
selector (optional): CSS selector for specific element
javascript (optional): Render JavaScript before extraction
Example:
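An illustrative call (URL and selector are placeholders):
{
  "url": "https://example.com/blog/post",
  "selector": "article"
}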
screenshot
Capture a screenshot of a web page.
Parameters:
url (required): Page URL
fullPage (optional): Capture entire page (default: false)
width (optional): Viewport width in pixels (default: 1920)
height (optional): Viewport height in pixels (default: 1080)
Example:
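An illustrative call:
{
  "url": "https://example.com",
  "fullPage": true,
  "width": 1280,
  "height": 720
}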
Architecture
The server uses a modular architecture where each component can be extended independently:
Search Module: Multi-provider fallback chain (DuckDuckGo → SearXNG → ScraperAPI → Public instances)
Docker Integration: Automatic SearXNG container management on port 8080
Extraction Module: Uses Mozilla Readability for intelligent content parsing and Turndown for HTML-to-Markdown conversion
Cache Layer: SQLite-based persistent cache to minimize redundant requests
Processing Pipeline: Parallel extraction of multiple pages for improved performance
Requirements
Node.js 20+ - Required
Docker (recommended) - For local SearXNG. Auto-starts on first use. Fallback providers work without Docker.
Playwright Chromium - Installed automatically
Search Fallback Chain
ISIS-MCP automatically tries providers in order until one succeeds:
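DuckDuckGo → SearXNG Local (Docker) → ScraperAPI → Public SearXNG instances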
Features:
Exponential backoff on rate limits
User-agent rotation for reliability
Automatic Docker container management
Graceful degradation to public instances
Token Optimization Features
The RAG tool has been enhanced with progressive token optimization to handle large content efficiently.
Phase 1: Content Modes
Control how much content is returned per result:
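A sketch of a preview-mode call, where contentMode is a hypothetical parameter name used only for illustration, not this project's documented schema:
{
  "query": "rust async runtimes",
  "contentMode": "preview"
}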
Benefits:
preview: Fast, compact results (~6k tokens vs ~20k)
full: Complete content (original behavior)
summary: Intelligent 150-200 word summaries via LLM
Phase 2: Deferred Content Fetching
Fetch full content after preview using content handles:
Benefits:
Lazy loading: Only fetch what you need
Cache reuse: No re-scraping required
Deterministic handles: Same URL = same handle
Phase 3: Progressive Summarization
Intelligent content summarization using local Ollama LLM.
Setup (Optional - Zero Config)
Install Ollama (if not already):
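Ollama's standard install script for Linux (macOS users can use the installer from ollama.com or Homebrew):
curl -fsSL https://ollama.com/install.sh | sh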
Pull a model (recommended):
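With the model name left as a placeholder (see the Recommended Models table below for size and quality trade-offs):
ollama pull <model-name>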
Start Ollama (if not running):
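The Ollama server can be started with:
ollama serve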
Usage
Basic summarization (auto-detection):
Custom model:
Configuration via environment variables:
Fallback Behavior
✅ Ollama unavailable → Automatic fallback to truncation
✅ Model doesn't exist → Try default, then truncate
✅ Timeout → Fallback to truncation
✅ Zero configuration required - works out of the box
Recommended Models
Model | Size | Speed | Quality | Use Case |
| 1.3GB | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ✅ Recommended (default) |
| 400MB | ⭐⭐⭐⭐⭐ | ⭐⭐ | Ultra-fast, lighter quality |
| 4GB | ⭐⭐⭐ | ⭐⭐⭐⭐ | Premium quality |
Performance Comparison
Mode | Avg Tokens | Latency | Use Case |
full | ~20,000 | 3-5s | Complete research |
preview | ~6,000 | 3-5s | Quick scanning |
summary | ~1,500 | 4-8s* | Intelligent digests |
* With Ollama. Falls back to preview performance if unavailable.
Examples
Research workflow:
Troubleshooting:
Q: Summarization seems slow?
Q: Getting truncated results instead of summaries?
Local Development
Clone and Setup
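A typical local setup, with the repository URL left as a placeholder:
git clone <repository-url>
cd isis-mcp   # adjust if the clone directory differs
npm install
npm run build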
Testing
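Assuming the package defines the standard npm test script (not confirmed here):
npm test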
Build Output
Compiled code is output to the build/ directory. Make sure to run npm run build after making changes to the source.
License
Licensed under the Apache License, Version 2.0. See the LICENSE file for full details.
You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.