Search Fusion MCP Server
A High-Availability Multi-Engine Search Aggregation MCP Server providing intelligent failover, unified API, and LLM-optimized content processing. Search Fusion integrates multiple search engines with smart priority-based routing and automatic failover mechanisms.
What's New in v3.0.0: Major concurrency upgrade! Enhanced multi-threading support with thread-safe operations, intelligent connection pooling, and semaphore-based request limiting. Now supports 50+ concurrent searches without race conditions or data corruption!
Features
Multi-Engine Integration
Google Search - Premium performance with API key
Serper Search - Google search alternative with advanced features
Jina AI Search - AI-powered search with intelligent content processing
DuckDuckGo - Free search, no API key required
Exa Search - AI-powered semantic search
Bing Search - Microsoft search API
Baidu Search - Chinese search engine
Advanced Features
Intelligent Failover - Automatic engine switching on failures or rate limits
Priority-Based Routing - Smart engine selection based on availability and performance
Unified Response Format - Consistent JSON structure across all engines
Rate Limiting Protection - Built-in cooldown mechanisms
High Concurrency Support - Thread-safe operations with connection pooling
Performance Optimization - Async operations with semaphore-based concurrency control
LLM-Optimized Content - Advanced web content fetching with pagination support
Wikipedia Integration - Dedicated Wikipedia search tool
Wayback Machine - Historical webpage archive search
Environment Variable Configuration - Pure MCP configuration without config files
Enhanced Proxy Auto-Detection - Intelligent proxy detection with zero configuration
Monitoring & Analytics
Real-time engine status monitoring
Success rate tracking
Error handling and recovery
Performance metrics
Concurrency & Performance
Thread-Safe Operations - All engine statistics and state updates are protected by async locks
Connection Pooling - Shared HTTP client with configurable connection limits (max 100 connections)
Semaphore Control - Concurrent request limiting (max 30 simultaneous searches)
Timeout Protection - 60-second search timeout prevents request accumulation
Resource Management - Efficient memory usage with automatic connection cleanup
Race Condition Prevention - Double-checked locking for SearchManager initialization
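The locking, semaphore, and timeout items above can be sketched with plain asyncio. This is a minimal illustration, not the server's actual code: `EngineStats`, `limited_search`, and `run_batch` are hypothetical names, and the search function is passed in by the caller. The defaults of 30 concurrent searches and a 60-second timeout come from the list above.

```python
import asyncio


class EngineStats:
    """Request counters guarded by an asyncio lock (hypothetical helper)."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self.requests = 0
        self.errors = 0

    async def record(self, ok: bool) -> None:
        # The lock keeps both counters consistent with each other.
        async with self._lock:
            self.requests += 1
            if not ok:
                self.errors += 1


async def limited_search(query, search_fn, semaphore, stats, timeout=60.0):
    """Run one search under the semaphore and the timeout."""
    async with semaphore:  # at most N searches run at the same time
        try:
            result = await asyncio.wait_for(search_fn(query), timeout=timeout)
        except Exception:
            await stats.record(ok=False)
            raise
        await stats.record(ok=True)
        return result


async def run_batch(queries, search_fn, max_concurrent=30):
    """Run all queries concurrently; failures are returned, not raised."""
    semaphore = asyncio.Semaphore(max_concurrent)
    stats = EngineStats()
    results = await asyncio.gather(
        *(limited_search(q, search_fn, semaphore, stats) for q in queries),
        return_exceptions=True,
    )
    return results, stats
```

Calling `asyncio.run(run_batch(queries, my_search))` with any coroutine function `my_search(query)` returns the per-query results (or exceptions) together with the counters.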
Architecture
Quick Start
Installation
Option 1: Install from PyPI (Recommended)
Option 2: Install from Source
Enhanced Proxy Auto-Detection (New in v2.0!)
Search Fusion now features intelligent proxy auto-detection inspired by concurrent-browser-mcp, providing seamless proxy support with zero configuration!
Three-Layer Detection Strategy
Environment Variables - Highest priority; checks HTTP_PROXY, HTTPS_PROXY, ALL_PROXY
Port Scanning - Scans common proxy ports using socket connection testing
System Proxy - Detects OS-level proxy settings (macOS supported)
Supported Proxy Ports (Priority Order)
7890 - Clash default port
1087 - V2Ray common port
8080 - Generic HTTP proxy port
3128 - Squid proxy default port
8888 - Other proxy software port
10809 - V2Ray SOCKS port
20171 - Additional proxy port
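The first two detection layers and the port order above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `detect_proxy` and `port_is_open` are hypothetical names, the probe targets 127.0.0.1, every hit is reported with an `http://` scheme (a simplification, since 10809 is listed as a SOCKS port), and the system-proxy layer is left out.

```python
import os
import socket
from typing import Callable, Mapping, Optional, Sequence

# Ports in the priority order listed above.
PROXY_PORTS = (7890, 1087, 8080, 3128, 8888, 10809, 20171)
ENV_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY")


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def detect_proxy(
    env: Optional[Mapping[str, str]] = None,
    ports: Sequence[int] = PROXY_PORTS,
    probe: Callable[[str, int], bool] = port_is_open,
) -> Optional[str]:
    """Return a proxy URL, or None when nothing is found."""
    env = os.environ if env is None else env
    # Layer 1: environment variables take priority.
    for name in ENV_VARS:
        value = env.get(name) or env.get(name.lower())
        if value:
            return value
    # Layer 2: probe common local proxy ports in order.
    for port in ports:
        if probe("127.0.0.1", port):
            return f"http://127.0.0.1:{port}"  # scheme is an assumption
    # Layer 3 (OS-level proxy settings) is omitted from this sketch.
    return None
```

Passing `env` and `probe` explicitly keeps the function testable without touching the real environment or network.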
Zero Configuration Usage
Just run directly - proxy will be auto-detected:
Manual override (if needed):
Detection Process
Comparison with concurrent-browser-mcp
Feature | Search-Fusion | concurrent-browser-mcp |
Detection Method | Env vars → Port scan → System proxy | Same strategy |
Port List | 7 common ports | 7 common ports |
Connection Test | Socket testing | Socket testing |
Timeout | 3 seconds | 3 seconds |
macOS Support | networksetup | networksetup |
Language | Python | TypeScript |
MCP Integration
Environment Variable Configuration
Search Fusion uses pure MCP environment variable configuration without requiring config files.
MCP Client Configuration (PyPI Installation):
MCP Client Configuration (Source Installation):
Supported Environment Variables
Search Engine | Environment Variable | Required | Description | Get API Key |
Google | | Both needed | Google Custom Search API | |
Serper | | API key | Serper Google Search API | |
Jina AI | | API key | Jina AI Search API | |
Bing | | API key | Microsoft Bing Search API | |
Baidu | | Both needed | Baidu Search API | |
Exa | | API key | Exa AI Search API | |
DuckDuckGo | None required | - | Free search, no API key needed | - |
Alternative Variable Names:
Engine Priority
Search engines are prioritized automatically:
Google Search (Priority 1) - Premium performance with API key
Serper Search (Priority 1) - Google alternative with advanced features
Jina AI Search (Priority 1.5) - AI-powered search with optional API key for advanced features
DuckDuckGo (Priority 2) - Free, no API key required
Exa Search (Priority 2) - AI-powered search with API key
Bing Search (Priority 3) - Microsoft search API
Baidu Search (Priority 3) - Chinese search engine
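A rough sketch of how priority ordering and failover could fit together, using the priority values listed above. The `Engine` dataclass and `search_with_failover` are hypothetical names for illustration; the real server's selection logic may also weigh availability and performance in ways not shown here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Engine:
    name: str
    priority: float  # lower value is tried first
    search: Callable[[str], list]
    available: bool = True


def search_with_failover(
    query: str, engines: List[Engine], preferred: Optional[str] = None
):
    """Try engines in priority order, moving on when one fails."""
    # The preferred engine (if any) sorts first, then ascending priority.
    ordered = sorted(
        (e for e in engines if e.available),
        key=lambda e: (e.name != preferred, e.priority),
    )
    errors = {}
    for engine in ordered:
        try:
            return engine.name, engine.search(query)
        except Exception as exc:  # e.g. rate limit or network error
            errors[engine.name] = str(exc)
    raise RuntimeError(f"all engines failed: {errors}")
```

The function returns the name of the engine that answered alongside its results, so a caller can see when a fallback was used.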
MCP Tools
1. search
Perform web searches with intelligent engine selection and failover.
Parameters:
query (required): Search query terms
num_results (default: 10): Number of results to return
engine (default: "auto"): Engine preference
  "auto": Automatic engine selection (recommended)
  "google": Prefer Google Search
  "serper": Prefer Serper Search
  "jina": Prefer Jina AI Search
  "duckduckgo": Prefer DuckDuckGo
  "exa": Prefer Exa Search
  "bing": Prefer Bing Search
  "baidu": Prefer Baidu Search
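MCP clients invoke tools through a JSON-RPC `tools/call` request. The sketch below builds such a request for the `search` tool from the parameters above. `build_search_call` is a hypothetical helper; most MCP clients construct this message for you, and the transport (stdio or HTTP) is not shown.

```python
import json

# Engine values accepted by the search tool, as listed above.
ENGINES = {"auto", "google", "serper", "jina", "duckduckgo", "exa", "bing", "baidu"}


def build_search_call(
    query: str, num_results: int = 10, engine: str = "auto", request_id: int = 1
) -> str:
    """Return a JSON-RPC 2.0 tools/call message for the search tool."""
    if not query:
        raise ValueError("query is required")
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine}")
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "search",
            "arguments": {
                "query": query,
                "num_results": num_results,
                "engine": engine,
            },
        },
    }
    return json.dumps(request)
```

For example, `build_search_call("mcp servers", num_results=5)` produces a message that leaves engine selection on "auto".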
2. fetch_url
Fetch and process web content with intelligent pagination and multi-method fallback.
Parameters:
url (required): Web URL to fetch
use_jina (default: true): Whether to prioritize Jina Reader for LLM-optimized content
with_image_alt (default: false): Whether to generate alt text for images
max_length (default: 50000): Maximum content length per page (auto-paginate if exceeded)
page_number (default: 1): Retrieve specific page from previously fetched content
Features:
Intelligent Multi-Method Fallback: Tries Jina Reader → Serper Scrape → Direct HTTP
Automatic Pagination: Splits large content into manageable pages
Concurrent-Safe Caching: Unique page IDs prevent conflicts in high-concurrency scenarios
LLM-Optimized Content: Clean markdown format optimized for AI processing
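The pagination and page-ID ideas above might look roughly like this. It is a sketch only: `paginate`, `store_pages`, and `get_page` are hypothetical names, the cache is a plain in-memory dict, and pages are cut at fixed character offsets, whereas a real implementation would likely prefer paragraph boundaries.

```python
import uuid
from typing import Dict, List

# In-memory cache keyed by a unique page ID (assumption for this sketch).
_page_cache: Dict[str, List[str]] = {}


def paginate(content: str, max_length: int = 50000) -> List[str]:
    """Split content into chunks of at most max_length characters."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    pages = [content[i:i + max_length] for i in range(0, len(content), max_length)]
    return pages or [""]  # empty content still yields one page


def store_pages(content: str, max_length: int = 50000) -> str:
    """Cache the pages under a fresh ID so concurrent fetches never collide."""
    page_id = uuid.uuid4().hex
    _page_cache[page_id] = paginate(content, max_length)
    return page_id


def get_page(page_id: str, page_number: int = 1) -> str:
    """Return one page; page_number is 1-based."""
    pages = _page_cache[page_id]
    if not 1 <= page_number <= len(pages):
        raise IndexError(f"page {page_number} out of range (1..{len(pages)})")
    return pages[page_number - 1]
```

Because each fetch gets its own random ID, two requests for the same URL with different `max_length` values do not overwrite each other's pages.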
3. get_available_engines
Get current status and availability of all search engines.
4. search_wikipedia
Search Wikipedia articles for entities, people, places, concepts, etc.
Parameters:
entity (required): Entity to search for
first_sentences (default: 10): Number of sentences to return (0 for full content)
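To illustrate the `first_sentences` behavior (a count of sentences, with 0 meaning the full text), here is a small sketch. The function `take_first_sentences` is hypothetical, and the regex split on sentence-ending punctuation is a simplification that will mis-split abbreviations such as "Dr.".

```python
import re


def take_first_sentences(text: str, count: int = 10) -> str:
    """Return the first `count` sentences, or the full text when count is 0."""
    if count <= 0:
        return text
    # Split on whitespace that follows '.', '!' or '?'.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:count])
```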
5. search_archived_webpage
Search archived versions of websites using Wayback Machine.
Parameters:
url (required): Website URL to search
year (optional): Target year
month (optional): Target month
day (optional): Target day
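One way the optional year, month, and day could map onto a Wayback Machine lookup is through the Internet Archive's public availability endpoint, which accepts a `timestamp` prefix in YYYYMMDD form. Whether this server uses that endpoint is an assumption, and `wayback_query_url` is a hypothetical helper that only builds the URL without sending a request.

```python
from typing import Optional
from urllib.parse import urlencode


def wayback_query_url(
    url: str,
    year: Optional[int] = None,
    month: Optional[int] = None,
    day: Optional[int] = None,
) -> str:
    """Build an availability-API URL for the snapshot closest to the date."""
    params = {"url": url}
    if year is not None:
        timestamp = f"{year:04d}"
        # Month and day only make sense when the coarser part is given.
        if month is not None:
            timestamp += f"{month:02d}"
            if day is not None:
                timestamp += f"{day:02d}"
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)
```

Omitting the date parts leaves the timestamp out entirely, which asks for the most recent snapshot.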
API Examples
Basic Search
Advanced Web Fetching
Wikipedia Search
Development
Development Setup
Configuration Guide
For detailed configuration instructions, see MCP_CONFIG_GUIDE.md.
Performance
Latency: Sub-second response times with caching
Availability: 99.9% uptime with intelligent failover
Throughput: Handles concurrent requests efficiently
Scalability: Efficient resource utilization and concurrent processing
Concurrency Benchmarks
Tested Performance (v3.0.0+):
50+ concurrent searches - No race conditions or data corruption
Thread-safe statistics - Accurate request counting and error tracking
Connection pooling - Efficient HTTP resource management
Timeout protection - 60s per request prevents system overload
Real-time monitoring - Live engine status during high load
Recommended Limits:
Concurrent searches: 10 (configurable via semaphore)
Connection pool: 100 max connections, 20 keep-alive
Request timeout: 60 seconds
Memory usage: ~50MB baseline + ~2MB per concurrent request
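As a quick worked example of the memory figure above, the estimate is linear in the number of concurrent requests. The helper name is hypothetical, and the result is only as accurate as the approximate figures quoted.

```python
def estimated_memory_mb(
    concurrent_requests: int, baseline_mb: float = 50.0, per_request_mb: float = 2.0
) -> float:
    """Rough memory estimate: baseline plus a fixed cost per request."""
    return baseline_mb + per_request_mb * concurrent_requests
```

At 10 concurrent searches this gives about 70 MB, and at 50 about 150 MB.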
Contributing
Fork the repository
Create a feature branch
Make your changes
Add tests for new functionality
Submit a pull request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Rate Limiting & Best Practices
Google Search: 100 queries/day (free tier)
Serper API: Varies by plan
Jina AI: Rate limits apply based on subscription
DuckDuckGo: No official limits, but use responsibly
Other engines: Check respective API documentation
Always implement appropriate delays and respect rate limits to ensure sustainable usage.
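A per-engine cooldown is one simple way to respect these limits: after a rate-limit error, the engine is skipped until a delay has passed. The `Cooldown` class below is a hypothetical sketch; the server's built-in cooldown mechanism and its durations are not documented here.

```python
import time
from typing import Callable, Dict


class Cooldown:
    """Track which engines are temporarily blocked after a rate limit."""

    def __init__(self, seconds: float, clock: Callable[[], float] = time.monotonic):
        self.seconds = seconds
        self._clock = clock  # injectable so tests need not sleep
        self._blocked_until: Dict[str, float] = {}

    def trip(self, engine: str) -> None:
        """Block the engine for the configured number of seconds."""
        self._blocked_until[engine] = self._clock() + self.seconds

    def is_available(self, engine: str) -> bool:
        """True once the engine's cooldown has elapsed (or was never set)."""
        return self._clock() >= self._blocked_until.get(engine, float("-inf"))
```

A caller would check `is_available(name)` before each request and call `trip(name)` when the engine reports a rate limit.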
Support
Documentation
Issue Tracker
Discussions
Made with ❤️ for the MCP community