MCP SearXNG Enhanced

by OvertliDS

Integrations

  • Allows containerized deployment with configurable environment variables and persistent configuration between container restarts.

  • Provides optional Markdown embedding for image results when using the images search category.

  • Uses Pydantic for data validation and settings management in the server implementation.

MCP SearXNG Enhanced Server

A Model Context Protocol (MCP) server for category-aware web search, website scraping, and date/time tools. Designed for seamless integration with SearXNG and modern MCP clients.

Features

  • 🔍 SearXNG-powered web search with category support (general, images, videos, files, map, social media)
  • 📄 Website content scraping with citation metadata and automatic Reddit URL conversion
  • 💾 In-memory caching with automatic freshness validation
  • 🚦 Domain-based rate limiting to prevent service abuse
  • 🕒 Timezone-aware date/time tool
  • ⚠️ Robust error handling with custom exception types
  • 🐳 Dockerized and configurable via environment variables
  • ⚙️ Configuration persistence between container restarts

Quick Start

Prerequisites

  • Docker installed on your system
  • A running SearXNG instance (self-hosted or accessible endpoint)

Installation & Usage

Build the Docker image:

docker build -t overtlids/mcp-searxng-enhanced:latest .

Run with your SearXNG instance (Manual Docker Run):

docker run -i --rm --network=host \
  -e SEARXNG_ENGINE_API_BASE_URL="http://127.0.0.1:8080/search" \
  -e DESIRED_TIMEZONE="America/New_York" \
  overtlids/mcp-searxng-enhanced:latest

This example sets both SEARXNG_ENGINE_API_BASE_URL and DESIRED_TIMEZONE explicitly. Any variable not passed with an -e flag falls back to the default defined in the Dockerfile (see the Environment Variables table below). Because America/New_York is already the default for DESIRED_TIMEZONE, that flag could be omitted here. SEARXNG_ENGINE_API_BASE_URL, by contrast, is critical: set it to your SearXNG instance's address whenever the Dockerfile default (http://host.docker.internal:8080/search) does not apply.

Note on Manual Docker Run: This command runs the Docker container independently. If you are using an MCP client (like Cline in VS Code) to manage this server, the client will start its own instance of the container using the settings defined in its own configuration. For the MCP client to use specific environment variables, they must be configured within the client's settings for this server (see below).

Configure your MCP client (e.g., Cline in VS Code):

For your MCP client to correctly manage and run this server, you must define all necessary environment variables within the client's settings for the overtlids/mcp-searxng-enhanced server. The MCP client will use these settings to construct the docker run command.

The following is the recommended default configuration for this server within your MCP client's JSON settings (e.g., cline_mcp_settings.json). This example explicitly lists all environment variables set to their default values as defined in the Dockerfile. You can copy and paste this directly and then customize any values as needed.

{
  "mcpServers": {
    "overtlids/mcp-searxng-enhanced": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm", "--network=host",
        "-e", "SEARXNG_ENGINE_API_BASE_URL=http://host.docker.internal:8080/search",
        "-e", "DESIRED_TIMEZONE=America/New_York",
        "-e", "ODS_CONFIG_PATH=/config/ods_config.json",
        "-e", "RETURNED_SCRAPPED_PAGES_NO=3",
        "-e", "SCRAPPED_PAGES_NO=5",
        "-e", "PAGE_CONTENT_WORDS_LIMIT=5000",
        "-e", "CITATION_LINKS=True",
        "-e", "MAX_IMAGE_RESULTS=10",
        "-e", "MAX_VIDEO_RESULTS=10",
        "-e", "MAX_FILE_RESULTS=5",
        "-e", "MAX_MAP_RESULTS=5",
        "-e", "MAX_SOCIAL_RESULTS=5",
        "-e", "TRAFILATURA_TIMEOUT=15",
        "-e", "SCRAPING_TIMEOUT=20",
        "-e", "CACHE_MAXSIZE=100",
        "-e", "CACHE_TTL_MINUTES=5",
        "-e", "CACHE_MAX_AGE_MINUTES=30",
        "-e", "RATE_LIMIT_REQUESTS_PER_MINUTE=10",
        "-e", "RATE_LIMIT_TIMEOUT_SECONDS=60",
        "-e", "IGNORED_WEBSITES=",
        "overtlids/mcp-searxng-enhanced:latest"
      ],
      "timeout": 60
    }
  }
}

Key Points for MCP Client Configuration:

  • The example above provides a complete set of arguments to run the Docker container with all environment variables set to their default values.
  • To customize any setting, simply modify the value for the corresponding -e "VARIABLE_NAME=value" line within the args array in your MCP client's configuration. For instance, to change SEARXNG_ENGINE_API_BASE_URL and DESIRED_TIMEZONE, you would adjust their respective lines.
  • Refer to the "Environment Variables" table below for a detailed description of each variable and its default.
  • The server's behavior is primarily controlled by these environment variables. While an ods_config.json file can also influence settings (see Configuration Management), environment variables passed by the MCP client take precedence.
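As a minimal sketch of how the args array above is put together, the following Python snippet builds a client entry from a dictionary of overrides. The helper name build_docker_args and the override values are hypothetical and only illustrate the "-e", "NAME=value" pairing; it is not part of the server.

```python
import json


def build_docker_args(image, env, network="host"):
    """Return the args list for a `docker run` started by an MCP client."""
    args = ["run", "-i", "--rm", f"--network={network}"]
    for name, value in env.items():
        # Each variable becomes two list items: "-e" and "NAME=value".
        args += ["-e", f"{name}={value}"]
    args.append(image)  # The image name must come after all flags.
    return args


# Hypothetical overrides; any variable left out keeps its Dockerfile default.
overrides = {
    "SEARXNG_ENGINE_API_BASE_URL": "http://127.0.0.1:8080/search",
    "DESIRED_TIMEZONE": "America/Los_Angeles",
}
server_entry = {
    "command": "docker",
    "args": build_docker_args("overtlids/mcp-searxng-enhanced:latest", overrides),
    "timeout": 60,
}
settings = {"mcpServers": {"overtlids/mcp-searxng-enhanced": server_entry}}
print(json.dumps(settings, indent=2))
```

The printed JSON can be pasted into the client's settings file in place of the full default listing.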

Running Natively (Without Docker)

If you prefer to run the server directly using Python without Docker, follow these steps:

1. Python Installation:

  • This server requires Python 3.9 or newer. Python 3.11 (as used in the Docker image) is recommended.
  • You can download Python from python.org.

2. Clone the Repository:

  • Get the code from GitHub:
    git clone https://github.com/OvertliDS/mcp-searxng-enhanced.git
    cd mcp-searxng-enhanced

3. Create and Activate a Virtual Environment (Recommended):

  • Using a virtual environment helps manage dependencies and avoid conflicts with other Python projects.
    # For Linux/macOS
    python3 -m venv .venv
    source .venv/bin/activate

    # For Windows (Command Prompt)
    python -m venv .venv
    .\.venv\Scripts\activate.bat

    # For Windows (PowerShell)
    python -m venv .venv
    .\.venv\Scripts\Activate.ps1

4. Install Dependencies:

  • Install the required Python packages:
    pip install -r requirements.txt
    Key dependencies include httpx, BeautifulSoup4, pydantic, trafilatura, python-dateutil, and cachetools. The server also uses zoneinfo, which is part of the Python standard library from version 3.9 onward.

5. Ensure SearXNG is Accessible:

  • You still need a running SearXNG instance. Make sure you have its API base URL (e.g., http://127.0.0.1:8080/search).

6. Set Environment Variables:

  • The server is configured via environment variables. At a minimum, you'll likely need to set SEARXNG_ENGINE_API_BASE_URL.
  • Linux/macOS (bash/zsh):
    export SEARXNG_ENGINE_API_BASE_URL="http://your-searxng-instance:port/search"
    export DESIRED_TIMEZONE="America/Los_Angeles"
  • Windows (Command Prompt; quote the whole assignment so the quotes do not become part of the value):
    set "SEARXNG_ENGINE_API_BASE_URL=http://your-searxng-instance:port/search"
    set "DESIRED_TIMEZONE=America/Los_Angeles"
  • Windows (PowerShell):
    $env:SEARXNG_ENGINE_API_BASE_URL="http://your-searxng-instance:port/search"
    $env:DESIRED_TIMEZONE="America/Los_Angeles"
  • Refer to the "Environment Variables" table below for all available options. If not set, defaults from the script or an ods_config.json file (if present in the root directory or at ODS_CONFIG_PATH) will be used.

7. Run the Server:

  • Execute the Python script:
    python mcp_server.py
  • The server will start and listen for MCP client connections via stdin/stdout.

8. Configuration File (ods_config.json):

  • Alternatively, or in combination with environment variables, you can create an ods_config.json file in the project's root directory (or at the path specified by the ODS_CONFIG_PATH environment variable). Environment variables always take precedence over values in this file. Example:
    {
      "searxng_engine_api_base_url": "http://127.0.0.1:8080/search",
      "desired_timezone": "America/New_York"
    }

Environment Variables

The following environment variables control the server's behavior. You can set them in your MCP client's configuration (recommended for client-managed servers) or when running Docker manually.

| Variable | Description | Default (from Dockerfile) | Notes |
|---|---|---|---|
| SEARXNG_ENGINE_API_BASE_URL | SearXNG search endpoint | http://host.docker.internal:8080/search | Crucial for server operation |
| DESIRED_TIMEZONE | Timezone for date/time tool | America/New_York | E.g., America/Los_Angeles. List of tz database time zones: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones |
| ODS_CONFIG_PATH | Path to persistent configuration file | /config/ods_config.json | Typically left as default within the container. |
| RETURNED_SCRAPPED_PAGES_NO | Max pages to return per search | 3 | |
| SCRAPPED_PAGES_NO | Max pages to attempt scraping | 5 | |
| PAGE_CONTENT_WORDS_LIMIT | Max words per scraped page | 5000 | |
| CITATION_LINKS | Enable/disable citation events | True | True or False |
| MAX_IMAGE_RESULTS | Maximum image results to return | 10 | |
| MAX_VIDEO_RESULTS | Maximum video results to return | 10 | |
| MAX_FILE_RESULTS | Maximum file results to return | 5 | |
| MAX_MAP_RESULTS | Maximum map results to return | 5 | |
| MAX_SOCIAL_RESULTS | Maximum social media results to return | 5 | |
| TRAFILATURA_TIMEOUT | Content extraction timeout (seconds) | 15 | |
| SCRAPING_TIMEOUT | HTTP request timeout (seconds) | 20 | |
| CACHE_MAXSIZE | Maximum number of cached websites | 100 | |
| CACHE_TTL_MINUTES | Cache time-to-live (minutes) | 5 | |
| CACHE_MAX_AGE_MINUTES | Maximum age for cached content (minutes) | 30 | |
| RATE_LIMIT_REQUESTS_PER_MINUTE | Max requests per domain per minute | 10 | |
| RATE_LIMIT_TIMEOUT_SECONDS | Rate limit tracking window (seconds) | 60 | |
| IGNORED_WEBSITES | Comma-separated list of sites to ignore | "" (empty) | E.g., "example.com,another.org" |
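
Since every value arrives as a string, the server has to convert it to the right type. The sketch below shows one way to parse the three kinds of value in the table (integers, True/False flags, and comma-separated lists). The helper names env_int, env_bool, and env_list are hypothetical, and the accepted spellings of "true" are an assumption rather than the server's documented behavior.

```python
import os


def env_int(name, default, environ=os.environ):
    """Read an integer setting such as SCRAPING_TIMEOUT."""
    raw = environ.get(name, "").strip()
    return int(raw) if raw else default


def env_bool(name, default, environ=os.environ):
    """Read a True/False setting such as CITATION_LINKS."""
    raw = environ.get(name)
    if raw is None:
        return default
    # Assumption: case-insensitive "true", "1", and "yes" count as enabled.
    return raw.strip().lower() in {"true", "1", "yes"}


def env_list(name, environ=os.environ):
    """Read a comma-separated setting such as IGNORED_WEBSITES."""
    raw = environ.get(name, "")
    return [item.strip() for item in raw.split(",") if item.strip()]


# A plain dict stands in for the process environment in this example.
example_env = {
    "SCRAPING_TIMEOUT": "30",
    "CITATION_LINKS": "False",
    "IGNORED_WEBSITES": "example.com, another.org",
}
print(env_int("SCRAPING_TIMEOUT", 20, example_env))
print(env_bool("CITATION_LINKS", True, example_env))
print(env_list("IGNORED_WEBSITES", example_env))
```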

Configuration Management

The server uses a three-tier configuration approach, listed here from lowest to highest precedence:

  1. Script defaults (hardcoded in Python)
  2. Config file (loaded from ODS_CONFIG_PATH, defaults to /config/ods_config.json)
  3. Environment variables (highest precedence)

The config file is only updated when:

  • The file doesn't exist yet (first-time initialization)
  • Environment variables are explicitly provided for the current run

This ensures that user configurations are preserved between container restarts when no new environment variables are set.
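
The precedence and persistence rules above can be sketched as follows. This is an illustration under stated assumptions, not the server's actual code: the function name load_config is hypothetical, only two settings are shown, and config-file keys are assumed to be the lowercase form of the environment variable names (as in the ods_config.json example earlier).

```python
import json
import os
from pathlib import Path

# Tier 1: script defaults (only two settings shown for brevity).
DEFAULTS = {
    "searxng_engine_api_base_url": "http://host.docker.internal:8080/search",
    "desired_timezone": "America/New_York",
}


def load_config(config_path, environ=os.environ):
    """Merge defaults, config file, and environment, then persist if needed."""
    config = dict(DEFAULTS)
    path = Path(config_path)
    file_exists = path.is_file()
    if file_exists:
        # Tier 2: values saved by an earlier run override the defaults.
        config.update(json.loads(path.read_text(encoding="utf-8")))
    # Tier 3: environment variables override everything else.
    env_overrides = {
        key: environ[key.upper()] for key in DEFAULTS if key.upper() in environ
    }
    config.update(env_overrides)
    # Write the file only on first run or when the environment supplied values.
    if not file_exists or env_overrides:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

With this logic, a timezone passed once through the environment is written to the file and is still in effect on a later run that sets no environment variables.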

Tools & Aliases

| Tool Name | Purpose | Aliases |
|---|---|---|
| search_web | Web search via SearXNG | search, web_search, find, lookup_web, search_online, access_internet, lookup* |
| get_website | Scrape website content | fetch_url, scrape_page, get, load_website, lookup* |
| get_current_datetime | Current date/time | current_time, get_time, current_date |

*lookup is context-sensitive:

  • If called with a url argument, it maps to get_website
  • Otherwise, it maps to search_web
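
A minimal sketch of this alias resolution, using the alias names from the table above. The function name resolve_tool and the choice to pass unknown names through unchanged are assumptions made for illustration.

```python
SEARCH_ALIASES = {
    "search", "web_search", "find", "lookup_web", "search_online", "access_internet",
}
WEBSITE_ALIASES = {"fetch_url", "scrape_page", "get", "load_website"}
DATETIME_ALIASES = {"current_time", "get_time", "current_date"}


def resolve_tool(name, arguments):
    """Map an alias to its canonical tool name."""
    if name == "lookup":
        # Context-sensitive alias: a url argument selects the scraper.
        return "get_website" if "url" in arguments else "search_web"
    if name in SEARCH_ALIASES:
        return "search_web"
    if name in WEBSITE_ALIASES:
        return "get_website"
    if name in DATETIME_ALIASES:
        return "get_current_datetime"
    return name  # Assumption: canonical or unknown names pass through as-is.


print(resolve_tool("lookup", {"url": "example.com"}))
print(resolve_tool("lookup", {"query": "open source ai"}))
```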

Example: Calling Tools

Web Search

{ "name": "search_web", "arguments": { "query": "open source ai" } }

or using an alias:

{ "name": "search", "arguments": { "query": "open source ai" } }
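
MCP clients normally build the transport message for you. For reference, the payloads shown here are the params of a JSON-RPC 2.0 tools/call request, which a stdio client writes to the server as one line of JSON. The sketch below builds such a message; the helper name tools_call_request and the request id are illustrative.

```python
import json


def tools_call_request(request_id, name, arguments):
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


message = tools_call_request(1, "search_web", {"query": "open source ai"})
# Over stdio, each message is serialized as a single line of JSON.
print(json.dumps(message))
```

A real session also requires the MCP initialization handshake before any tool call, which this sketch leaves out.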

Category-Specific Search

{ "name": "search_web", "arguments": { "query": "landscapes", "category": "images" } }

Website Scraping

{ "name": "get_website", "arguments": { "url": "example.com" } }

or using an alias:

{ "name": "lookup", "arguments": { "url": "example.com" } }

Current Date/Time

{ "name": "get_current_datetime", "arguments": {} }

or:

{ "name": "current_time", "arguments": {} }
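
As an illustration of what a timezone-aware date/time lookup involves, here is a minimal sketch using the standard-library zoneinfo module. The function name current_datetime and the fallback to UTC for unknown zone names are assumptions; the server's actual output format is not shown in this document.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def current_datetime(tz_name="America/New_York"):
    """Return the current time as an aware datetime in the given tz database zone."""
    try:
        tz = ZoneInfo(tz_name)
    except (ZoneInfoNotFoundError, ValueError):
        # Assumption: fall back to UTC when the zone name is unknown or malformed.
        tz = timezone.utc
    return datetime.now(tz)


now = current_datetime("America/Los_Angeles")
print(now.isoformat())
```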

Advanced Features

Category-Specific Search

The search_web tool supports different categories with tailored outputs:

  • images: Returns image URLs, titles, and source pages with optional Markdown embedding
  • videos: Returns video information including titles, source, and embed URLs
  • files: Returns downloadable file information including format and size
  • map: Returns location data including coordinates and addresses
  • social media: Returns posts and profiles from social platforms
  • general: Default category that scrapes and returns full webpage content
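
To illustrate the optional Markdown embedding for the images category, the sketch below formats one image result as text. The function name format_image_result and the result keys (title, img_src, url) are assumptions modeled on typical SearXNG image results, not a confirmed schema of this server.

```python
def format_image_result(result, embed_markdown=True):
    """Render one image result as plain text, optionally with a Markdown image."""
    title = result.get("title") or "Untitled"
    image_url = result["img_src"]
    source_url = result.get("url", "")
    lines = [
        f"Title: {title}",
        f"Image URL: {image_url}",
        f"Source page: {source_url}",
    ]
    if embed_markdown:
        # Square brackets in the title would break the Markdown alt text.
        safe_title = title.replace("[", "(").replace("]", ")")
        lines.append(f"![{safe_title}]({image_url})")
    return "\n".join(lines)


sample = {
    "title": "Mountain lake [HD]",
    "img_src": "https://example.com/lake.jpg",
    "url": "https://example.com/gallery",
}
print(format_image_result(sample))
```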

Reddit URL Conversion

When scraping Reddit content, URLs are automatically converted to use the old.reddit.com domain for better content extraction.
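
A minimal sketch of such a rewrite with the standard library. The function name to_old_reddit and the exact set of hostnames treated as Reddit are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these are the hostnames that should be redirected.
REDDIT_HOSTS = {"reddit.com", "www.reddit.com", "new.reddit.com"}


def to_old_reddit(url):
    """Point Reddit URLs at old.reddit.com and leave other URLs untouched."""
    parts = urlsplit(url)
    if parts.hostname in REDDIT_HOSTS:
        # Only the host changes; scheme, path, and query stay the same.
        parts = parts._replace(netloc="old.reddit.com")
    return urlunsplit(parts)


print(to_old_reddit("https://www.reddit.com/r/python/comments/abc/"))
```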

Rate Limiting

Domain-based rate limiting prevents excessive requests to the same domain within a time window. This prevents overwhelming target websites and potential IP blocking.
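
One common way to implement this is a sliding window per domain, sketched below. The class name DomainRateLimiter is hypothetical, and the sliding-window technique is an assumption; the defaults mirror RATE_LIMIT_REQUESTS_PER_MINUTE and RATE_LIMIT_TIMEOUT_SECONDS.

```python
import time
from collections import defaultdict, deque


class DomainRateLimiter:
    """Allow at most max_requests per domain within a sliding time window."""

    def __init__(self, max_requests=10, window_seconds=60, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # Injectable so the limiter can be tested without sleeping.
        self._hits = defaultdict(deque)

    def allow(self, domain):
        """Record a request and return True, or return False if over the limit."""
        now = self.clock()
        hits = self._hits[domain]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = DomainRateLimiter(max_requests=2, window_seconds=60)
print([limiter.allow("example.com") for _ in range(3)])
```

A caller would raise RateLimitExceededError (see Error Handling) when allow returns False.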

Cache Validation

Cached website content is automatically validated for freshness based on age. Stale content is refreshed automatically while valid cached content is served quickly.
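
A minimal sketch of age-based validation follows. The class name FreshnessCache is hypothetical, and only a single maximum age is modeled; how CACHE_TTL_MINUTES, CACHE_MAX_AGE_MINUTES, and CACHE_MAXSIZE interact in the real server is not described here.

```python
import time


class FreshnessCache:
    """Store page content with a timestamp and reject entries that are too old."""

    def __init__(self, max_age_seconds=30 * 60, clock=time.monotonic):
        self.max_age_seconds = max_age_seconds
        self.clock = clock  # Injectable so expiry can be tested without waiting.
        self._entries = {}

    def put(self, url, content):
        self._entries[url] = (self.clock(), content)

    def get(self, url):
        """Return cached content, or None if it is missing or stale."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, content = entry
        if self.clock() - stored_at > self.max_age_seconds:
            # Stale: drop it so the caller fetches a fresh copy.
            del self._entries[url]
            return None
        return content


cache = FreshnessCache()
cache.put("https://example.com", "page text")
print(cache.get("https://example.com"))
```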

Error Handling

The server implements a robust error handling system with these exception types:

  • MCPServerError: Base exception class for all server errors
  • ConfigurationError: Raised when configuration values are invalid
  • SearXNGConnectionError: Raised when connection to SearXNG fails
  • WebScrapingError: Raised when web scraping fails
  • RateLimitExceededError: Raised when rate limit for a domain is exceeded

Errors are properly propagated to the client with informative messages.
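
The hierarchy above can be sketched as follows. The class names come from the list; the run_tool wrapper, the scrape stand-in, and the shape of the returned dictionary are hypothetical and only show how a shared base class lets one handler report every server error.

```python
class MCPServerError(Exception):
    """Base exception class for all server errors."""


class ConfigurationError(MCPServerError):
    """Raised when configuration values are invalid."""


class SearXNGConnectionError(MCPServerError):
    """Raised when connection to SearXNG fails."""


class WebScrapingError(MCPServerError):
    """Raised when web scraping fails."""


class RateLimitExceededError(MCPServerError):
    """Raised when the rate limit for a domain is exceeded."""


def run_tool(handler, *args):
    """Call a tool handler and turn server errors into a reportable result."""
    try:
        return {"ok": True, "result": handler(*args)}
    except MCPServerError as exc:
        # One except clause covers every subclass of the base exception.
        return {"ok": False, "error": type(exc).__name__, "message": str(exc)}


def scrape(url):
    # Stand-in handler that always hits the rate limit.
    raise RateLimitExceededError(f"Rate limit exceeded for {url}")


print(run_tool(scrape, "example.com"))
```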

Troubleshooting

  • Cannot connect to SearXNG: Ensure your SearXNG instance is running and the SEARXNG_ENGINE_API_BASE_URL environment variable points to the correct endpoint.
  • Rate limit errors: Adjust RATE_LIMIT_REQUESTS_PER_MINUTE if you're experiencing too many rate limit errors.
  • Slow content extraction: Increase TRAFILATURA_TIMEOUT to allow more time for content processing on complex pages.
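  • Docker networking issues: If using Docker Desktop on Windows/Mac, host.docker.internal should resolve to the host machine. On Linux, you may need to use the host's IP address instead; when the container runs with --network=host, as in the examples above, http://127.0.0.1:8080/search reaches a SearXNG instance listening on the host.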
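
When diagnosing connection problems, it can help to query SearXNG directly, outside the MCP server. The sketch below builds a search URL from the base URL. The function name build_search_url is hypothetical; the q, categories, and format parameters follow the SearXNG search API, but whether this server sends exactly these parameters is an assumption.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, category="general"):
    """Build a SearXNG search URL that asks for JSON output."""
    params = {"q": query, "categories": category, "format": "json"}
    return f"{base_url}?{urlencode(params)}"


url = build_search_url("http://127.0.0.1:8080/search", "open source ai")
print(url)
```

Opening the printed URL in a browser or with curl shows whether the instance is reachable. If the request is rejected, check that JSON output is enabled in your SearXNG settings.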

Acknowledgements

Inspired by:

License

MIT License © 2025 OvertliDS
