# MCP SearXNG Enhanced Server
A Model Context Protocol (MCP) server for category-aware web search, website scraping, and date/time tools. Designed for seamless integration with SearXNG and modern MCP clients.
## Features
- 🔍 SearXNG-powered web search with category support (general, images, videos, files, map, social media)
- 📄 Website content scraping with citation metadata and automatic Reddit URL conversion
- 💾 In-memory caching with automatic freshness validation
- 🚦 Domain-based rate limiting to prevent service abuse
- 🕒 Timezone-aware date/time tool
- ⚠️ Robust error handling with custom exception types
- 🐳 Dockerized and configurable via environment variables
- ⚙️ Configuration persistence between container restarts
## Quick Start

### Prerequisites
- Docker installed on your system
- A running SearXNG instance (self-hosted or accessible endpoint)
### Installation & Usage
Build the Docker image:
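For example, from the repository root (the image tag is an assumption; any tag works as long as your run command and MCP client configuration reference the same name):

```shell
docker build -t overtlids/mcp-searxng-enhanced:latest .
```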
Run with your SearXNG instance (Manual Docker Run):
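A minimal invocation might look like the following (the image tag is an assumption; `-i` keeps stdin open for the MCP stdio transport):

```shell
docker run --rm -i \
  -e SEARXNG_ENGINE_API_BASE_URL="http://host.docker.internal:8080/search" \
  -e DESIRED_TIMEZONE="America/New_York" \
  overtlids/mcp-searxng-enhanced:latest
```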
In this example, `SEARXNG_ENGINE_API_BASE_URL` is explicitly set. `DESIRED_TIMEZONE` is also explicitly set to `America/New_York`, which matches its default value. If an environment variable is not provided with an `-e` flag during the `docker run` command, the server automatically uses the default value defined in its `Dockerfile` (refer to the Environment Variables table below). Thus, if you intend to use the default for `DESIRED_TIMEZONE`, you could omit the `-e DESIRED_TIMEZONE="America/New_York"` flag. However, `SEARXNG_ENGINE_API_BASE_URL` is critical and usually needs to be set to match your specific SearXNG instance's address if the Dockerfile default (`http://host.docker.internal:8080/search`) is not appropriate.
**Note on Manual Docker Run:** This command runs the Docker container independently. If you are using an MCP client (like Cline in VS Code) to manage this server, the client will start its own instance of the container using the settings defined in its own configuration. For the MCP client to use specific environment variables, they must be configured within the client's settings for this server (see below).
Configure your MCP client (e.g., Cline in VS Code):
For your MCP client to correctly manage and run this server, you must define all necessary environment variables within the client's settings for the `overtlids/mcp-searxng-enhanced` server. The MCP client will use these settings to construct the `docker run` command.

The following is the recommended default configuration for this server within your MCP client's JSON settings (e.g., `cline_mcp_settings.json`). This example explicitly lists all environment variables set to their default values as defined in the `Dockerfile`. You can copy and paste this directly and then customize any values as needed.
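A sketch of such a configuration (the exact settings schema and image tag depend on your client and build; adapt as needed):

```json
{
  "mcpServers": {
    "overtlids/mcp-searxng-enhanced": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "SEARXNG_ENGINE_API_BASE_URL=http://host.docker.internal:8080/search",
        "-e", "DESIRED_TIMEZONE=America/New_York",
        "-e", "ODS_CONFIG_PATH=/config/ods_config.json",
        "-e", "RETURNED_SCRAPPED_PAGES_NO=3",
        "-e", "SCRAPPED_PAGES_NO=5",
        "-e", "PAGE_CONTENT_WORDS_LIMIT=5000",
        "-e", "CITATION_LINKS=True",
        "-e", "MAX_IMAGE_RESULTS=10",
        "-e", "MAX_VIDEO_RESULTS=10",
        "-e", "MAX_FILE_RESULTS=5",
        "-e", "MAX_MAP_RESULTS=5",
        "-e", "MAX_SOCIAL_RESULTS=5",
        "-e", "TRAFILATURA_TIMEOUT=15",
        "-e", "SCRAPING_TIMEOUT=20",
        "-e", "CACHE_MAXSIZE=100",
        "-e", "CACHE_TTL_MINUTES=5",
        "-e", "CACHE_MAX_AGE_MINUTES=30",
        "-e", "RATE_LIMIT_REQUESTS_PER_MINUTE=10",
        "-e", "RATE_LIMIT_TIMEOUT_SECONDS=60",
        "-e", "IGNORED_WEBSITES=",
        "overtlids/mcp-searxng-enhanced:latest"
      ]
    }
  }
}
```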
Key Points for MCP Client Configuration:
- The example above provides a complete set of arguments to run the Docker container with all environment variables set to their default values.
- To customize any setting, simply modify the value of the corresponding `-e "VARIABLE_NAME=value"` line within the `args` array in your MCP client's configuration. For instance, to change `SEARXNG_ENGINE_API_BASE_URL` and `DESIRED_TIMEZONE`, you would adjust their respective lines.
- Refer to the "Environment Variables" table below for a detailed description of each variable and its default.
- The server's behavior is primarily controlled by these environment variables. While an `ods_config.json` file can also influence settings (see Configuration Management), environment variables passed by the MCP client take precedence.
## Running Natively (Without Docker)
If you prefer to run the server directly using Python without Docker, follow these steps:
1. Python Installation:
- This server requires Python 3.9 or newer. Python 3.11 (as used in the Docker image) is recommended.
- You can download Python from python.org.
2. Clone the Repository:
- Get the code from GitHub:
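For example (the repository URL is inferred from the server name `overtlids/mcp-searxng-enhanced`; verify it against the project page):

```shell
git clone https://github.com/overtlids/mcp-searxng-enhanced.git
cd mcp-searxng-enhanced
```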
3. Create and Activate a Virtual Environment (Recommended):
- Using a virtual environment helps manage dependencies and avoid conflicts with other Python projects.
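For example:

```shell
python3 -m venv .venv
source .venv/bin/activate    # on Windows: .venv\Scripts\activate
```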
4. Install Dependencies:
   - Install the required Python packages. Key dependencies include `httpx`, `BeautifulSoup4`, `pydantic`, `trafilatura`, `python-dateutil`, `cachetools`, and `zoneinfo`.
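If the repository ships a `requirements.txt`, `pip install -r requirements.txt` is the usual route; otherwise the key dependencies listed above can be installed directly (note that `zoneinfo` is part of the standard library on Python 3.9+):

```shell
pip install httpx beautifulsoup4 pydantic trafilatura python-dateutil cachetools
```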
5. Ensure SearXNG is Accessible:
   - You still need a running SearXNG instance. Make sure you have its API base URL (e.g., `http://127.0.0.1:8080/search`).
6. Set Environment Variables:
   - The server is configured via environment variables. At a minimum, you'll likely need to set `SEARXNG_ENGINE_API_BASE_URL`.
   - Linux/macOS (bash/zsh), Windows Command Prompt, and Windows PowerShell each use their own syntax for setting environment variables.
   - Refer to the "Environment Variables" table below for all available options. If not set, defaults from the script or an `ods_config.json` file (if present in the root directory or at `ODS_CONFIG_PATH`) will be used.
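For example (illustrative values; point the URL at your own SearXNG instance):

```shell
# Linux/macOS (bash/zsh)
export SEARXNG_ENGINE_API_BASE_URL="http://127.0.0.1:8080/search"
export DESIRED_TIMEZONE="America/New_York"

# Windows (Command Prompt):
#   set SEARXNG_ENGINE_API_BASE_URL=http://127.0.0.1:8080/search
# Windows (PowerShell):
#   $env:SEARXNG_ENGINE_API_BASE_URL = "http://127.0.0.1:8080/search"
```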
7. Run the Server:
- Execute the Python script:
- The server will start and listen for MCP client connections via stdin/stdout.
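The run command is simply the interpreter plus the server's entry-point script; the script name below is a placeholder, so substitute the actual filename from the repository:

```shell
python mcp_server.py    # replace mcp_server.py with the repository's entry-point script
```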
8. Configuration File (`ods_config.json`):
   - Alternatively, or in combination with environment variables, you can create an `ods_config.json` file in the project's root directory (or the path specified by the `ODS_CONFIG_PATH` environment variable). Environment variables will always take precedence over values in this file. Example:

     ```json
     {
       "searxng_engine_api_base_url": "http://127.0.0.1:8080/search",
       "desired_timezone": "America/New_York"
     }
     ```
## Environment Variables
The following environment variables control the server's behavior. You can set them in your MCP client's configuration (recommended for client-managed servers) or when running Docker manually.
| Variable | Description | Default (from Dockerfile) | Notes |
|---|---|---|---|
| `SEARXNG_ENGINE_API_BASE_URL` | SearXNG search endpoint | `http://host.docker.internal:8080/search` | Crucial for server operation |
| `DESIRED_TIMEZONE` | Timezone for date/time tool | `America/New_York` | E.g., `America/Los_Angeles`. See the [list of tz database time zones](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones) |
| `ODS_CONFIG_PATH` | Path to persistent configuration file | `/config/ods_config.json` | Typically left as default within the container |
| `RETURNED_SCRAPPED_PAGES_NO` | Max pages to return per search | `3` | |
| `SCRAPPED_PAGES_NO` | Max pages to attempt scraping | `5` | |
| `PAGE_CONTENT_WORDS_LIMIT` | Max words per scraped page | `5000` | |
| `CITATION_LINKS` | Enable/disable citation events | `True` | `True` or `False` |
| `MAX_IMAGE_RESULTS` | Maximum image results to return | `10` | |
| `MAX_VIDEO_RESULTS` | Maximum video results to return | `10` | |
| `MAX_FILE_RESULTS` | Maximum file results to return | `5` | |
| `MAX_MAP_RESULTS` | Maximum map results to return | `5` | |
| `MAX_SOCIAL_RESULTS` | Maximum social media results to return | `5` | |
| `TRAFILATURA_TIMEOUT` | Content extraction timeout (seconds) | `15` | |
| `SCRAPING_TIMEOUT` | HTTP request timeout (seconds) | `20` | |
| `CACHE_MAXSIZE` | Maximum number of cached websites | `100` | |
| `CACHE_TTL_MINUTES` | Cache time-to-live (minutes) | `5` | |
| `CACHE_MAX_AGE_MINUTES` | Maximum age for cached content (minutes) | `30` | |
| `RATE_LIMIT_REQUESTS_PER_MINUTE` | Max requests per domain per minute | `10` | |
| `RATE_LIMIT_TIMEOUT_SECONDS` | Rate limit tracking window (seconds) | `60` | |
| `IGNORED_WEBSITES` | Comma-separated list of sites to ignore | `""` (empty) | E.g., `"example.com,another.org"` |
## Configuration Management
The server uses a three-tier configuration approach:
- Script defaults (hardcoded in Python)
- Config file (loaded from `ODS_CONFIG_PATH`, defaults to `/config/ods_config.json`)
- Environment variables (highest precedence)
The config file is only updated when:
- The file doesn't exist yet (first-time initialization)
- Environment variables are explicitly provided for the current run
This ensures that user configurations are preserved between container restarts when no new environment variables are set.
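The precedence order can be sketched as follows. This is an illustrative reconstruction, not the server's actual code; the defaults shown are taken from the table above:

```python
import json
import os

# Tier 1: script defaults (a subset, for illustration)
DEFAULTS = {
    "searxng_engine_api_base_url": "http://host.docker.internal:8080/search",
    "desired_timezone": "America/New_York",
}

def load_config(config_path="ods_config.json"):
    config = dict(DEFAULTS)                  # 1. script defaults
    if os.path.exists(config_path):          # 2. config file overrides defaults
        with open(config_path) as f:
            config.update(json.load(f))
    for key in config:                       # 3. env vars override everything
        env_val = os.environ.get(key.upper())
        if env_val is not None:
            config[key] = env_val
    return config
```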
## Tools & Aliases
| Tool Name | Purpose | Aliases |
|---|---|---|
| `search_web` | Web search via SearXNG | `search`, `web_search`, `find`, `lookup_web`, `search_online`, `access_internet`, `lookup`* |
| `get_website` | Scrape website content | `fetch_url`, `scrape_page`, `get`, `load_website`, `lookup`* |
| `get_current_datetime` | Current date/time | `current_time`, `get_time`, `current_date` |
\* `lookup` is context-sensitive:
- If called with a `url` argument, it maps to `get_website`
- Otherwise, it maps to `search_web`
## Example: Calling Tools
### Web Search
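An illustrative request payload (the exact call envelope depends on your MCP client):

```json
{ "name": "search_web", "arguments": { "query": "latest developments in fusion energy" } }
```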
or using an alias:
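For instance, the same query via the `search` alias (illustrative):

```json
{ "name": "search", "arguments": { "query": "latest developments in fusion energy" } }
```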
### Category-Specific Search
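A sketch of a category search, assuming the category is passed as a `category` argument (verify the parameter name against the tool's schema):

```json
{ "name": "search_web", "arguments": { "query": "aurora borealis", "category": "images" } }
```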
### Website Scraping
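An illustrative payload:

```json
{ "name": "get_website", "arguments": { "url": "https://example.com/article" } }
```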
or using an alias:
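For instance, via the `fetch_url` alias (illustrative):

```json
{ "name": "fetch_url", "arguments": { "url": "https://example.com/article" } }
```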
### Current Date/Time
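An illustrative payload (no arguments required):

```json
{ "name": "get_current_datetime", "arguments": {} }
```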
or:
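Via the `current_time` alias (illustrative):

```json
{ "name": "current_time", "arguments": {} }
```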
## Advanced Features

### Category-Specific Search
The `search_web` tool supports different categories with tailored outputs:
- images: Returns image URLs, titles, and source pages with optional Markdown embedding
- videos: Returns video information including titles, source, and embed URLs
- files: Returns downloadable file information including format and size
- map: Returns location data including coordinates and addresses
- social media: Returns posts and profiles from social platforms
- general: Default category that scrapes and returns full webpage content
### Reddit URL Conversion
When scraping Reddit content, URLs are automatically converted to use the old.reddit.com domain for better content extraction.
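The rewrite described above can be sketched as follows (an assumed implementation, not the server's exact code):

```python
from urllib.parse import urlparse, urlunparse

def to_old_reddit(url: str) -> str:
    """Rewrite reddit.com URLs to old.reddit.com for easier text extraction."""
    parts = urlparse(url)
    if parts.netloc in ("www.reddit.com", "reddit.com"):
        parts = parts._replace(netloc="old.reddit.com")
    return urlunparse(parts)
```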
### Rate Limiting
Domain-based rate limiting caps the number of requests to the same domain within a time window. This avoids overwhelming target websites and reduces the risk of IP blocking.
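A minimal sketch of such a limiter, assuming a sliding window keyed by domain (the defaults mirror `RATE_LIMIT_REQUESTS_PER_MINUTE` and `RATE_LIMIT_TIMEOUT_SECONDS`; this is not the server's actual implementation):

```python
import time
from collections import defaultdict, deque
from urllib.parse import urlparse

class DomainRateLimiter:
    def __init__(self, requests_per_minute=10, window_seconds=60):
        self.limit = requests_per_minute
        self.window = window_seconds
        self.history = defaultdict(deque)  # domain -> timestamps of recent requests

    def allow(self, url: str) -> bool:
        domain = urlparse(url).netloc
        now = time.monotonic()
        q = self.history[domain]
        # Drop timestamps that fell out of the sliding window
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False                   # over the per-domain budget
        q.append(now)
        return True
```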
### Cache Validation
Cached website content is automatically validated for freshness based on age. Stale content is refreshed automatically while valid cached content is served quickly.
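The freshness check can be sketched like this (assumed behavior, with the age bound mirroring `CACHE_MAX_AGE_MINUTES`; the server itself uses `cachetools`-style TTL caching):

```python
import time

class WebsiteCache:
    def __init__(self, max_age_minutes=30):
        self.max_age = max_age_minutes * 60
        self._store = {}  # url -> (fetched_at, content)

    def get(self, url):
        entry = self._store.get(url)
        if entry is None:
            return None
        fetched_at, content = entry
        if time.monotonic() - fetched_at > self.max_age:
            del self._store[url]          # stale: caller must re-fetch
            return None
        return content

    def put(self, url, content):
        self._store[url] = (time.monotonic(), content)
```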
## Error Handling
The server implements a robust error handling system with these exception types:
- `MCPServerError`: Base exception class for all server errors
- `ConfigurationError`: Raised when configuration values are invalid
- `SearXNGConnectionError`: Raised when connection to SearXNG fails
- `WebScrapingError`: Raised when web scraping fails
- `RateLimitExceededError`: Raised when the rate limit for a domain is exceeded
Errors are properly propagated to the client with informative messages.
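The hierarchy implied by the list above can be sketched as follows (class names are from the docs; the exact definitions in the server may differ):

```python
class MCPServerError(Exception):
    """Base exception class for all server errors."""

class ConfigurationError(MCPServerError):
    """Raised when configuration values are invalid."""

class SearXNGConnectionError(MCPServerError):
    """Raised when connection to SearXNG fails."""

class WebScrapingError(MCPServerError):
    """Raised when web scraping fails."""

class RateLimitExceededError(MCPServerError):
    """Raised when the rate limit for a domain is exceeded."""
```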
## Troubleshooting
- **Cannot connect to SearXNG**: Ensure your SearXNG instance is running and the `SEARXNG_ENGINE_API_BASE_URL` environment variable points to the correct endpoint.
- **Rate limit errors**: Adjust `RATE_LIMIT_REQUESTS_PER_MINUTE` if you're experiencing too many rate limit errors.
- **Slow content extraction**: Increase `TRAFILATURA_TIMEOUT` to allow more time for content processing on complex pages.
- **Docker networking issues**: If using Docker Desktop on Windows/Mac, `host.docker.internal` should resolve to the host machine. On Linux, you may need to use the host's IP address instead.
## Acknowledgements
Inspired by:
- SearXNG - Privacy-respecting metasearch engine
- Trafilatura - Web scraping tool for text extraction
- ihor-sokoliuk/mcp-searxng - Original MCP server for SearXNG
- nnaoycurt (Better Web Search Tool)
- @bwoodruff2021 (GetTimeDate Tool)
## License
MIT License © 2025 OvertliDS