MCP Web Search Server

by undici77

A privacy‑focused web, social media, and archive search server that exposes its tools via the Model Context Protocol (MCP), giving clients controlled access to external search capabilities.


🎯 Features

  • Parallel search across multiple privacy‑focused web engines.

  • Social media lookup for public content on major platforms.

  • Archive retrieval from Wayback Machine, archive.today, Google Cache and others.

  • Dynamic listing of supported engines and archive services.

  • Result caching with LRU eviction to speed up repeated queries.
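
As an illustration of the caching feature, here is a minimal sketch of an LRU cache built on `collections.OrderedDict`. The class name `LRUCache`, the `max_entries` parameter and the tuple cache keys are assumptions made for this example; the README does not describe how the server implements its cache.

```python
from collections import OrderedDict
from typing import Hashable, Optional


class LRUCache:
    """Tiny least-recently-used cache (illustrative, not the server's own class)."""

    def __init__(self, max_entries: int = 128) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max_entries = max_entries
        self._entries: "OrderedDict[Hashable, str]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[str]:
        """Return the cached value, or None on a miss."""
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key: Hashable, value: str) -> None:
        """Store a value and evict the least recently used entry if over capacity."""
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # oldest entry sits at the front

    def clear(self) -> None:
        """Drop every cached entry."""
        self._entries.clear()

    def __len__(self) -> int:
        return len(self._entries)
```

A repeated query would then hit `get()` before any engine is contacted, and a miss would be followed by `put()` once results arrive.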


📦 Installation & Quick Start

# Clone the repository (if applicable)
git clone https://github.com/undici77/MCPWebSearch.git
cd MCPWebSearch

# Run the startup script (adjust name if different)
./run.sh -d /path/to/working/directory

The startup script performs three steps:

1️⃣ Create & activate a Python virtual environment (.venv).
2️⃣ Install all required dependencies from requirements.txt.
3️⃣ Launch the MCP Search Server (main.py) which listens on stdin/stdout for JSON‑RPC messages.

📌 Ensure the startup script is executable: chmod +x run.sh


⚙️ Command‑Line Options

| Option | Description |
|---|---|
| -d, --directory | Path to the working directory (default: current process dir). |

The server itself does not require additional CLI flags; all configuration is performed via JSON‑RPC.


🤝 Integration with LM Studio

Add an entry to your mcp.json so LM Studio can start the server automatically:

{
  "mcpServers": {
    "web-search": {
      "command": "/absolute/path/to/run.sh",
      "args": ["-d", "/absolute/path/to/working/directory"],
      "env": { "WORKING_DIR": "." }
    }
  }
}

📌 Make the script executable (chmod +x /absolute/path/to/run.sh) and run ./run.sh once to install the virtual environment before launching LM Studio.


📡 MCP API Overview

All communication follows JSON‑RPC 2.0 over stdin/stdout.

initialize

{ "jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {} }

Response: protocol version (2024-11-05), server capabilities (tool enumeration) and basic server info (name, version).

tools/list

{ "jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {} }

Response: an array of tool definitions (name, description, input schema).

tools/call

{ "jsonrpc": "2.0", "id": 3, "method": "tools/call", "params": { "name": "<tool_name>", "arguments": { … } } }

Note: The tool identifier key is name, not tool.
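
The three request shapes above can be produced with a small client-side helper. This is a sketch only: the function names `build_request`, `build_tool_call` and `parse_response` are hypothetical, and the one-JSON-object-per-line framing is an assumption based on the usual MCP stdio transport rather than something this README states.

```python
import itertools
import json
from typing import Any, Dict, Optional

# Monotonically increasing request ids, as JSON-RPC 2.0 expects a unique id per call.
_request_ids = itertools.count(1)


def build_request(method: str, params: Optional[Dict[str, Any]] = None) -> str:
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params or {},
    }
    return json.dumps(message) + "\n"


def build_tool_call(tool_name: str, arguments: Optional[Dict[str, Any]] = None) -> str:
    """Build a tools/call request; the tool identifier key is "name", not "tool"."""
    return build_request("tools/call", {"name": tool_name, "arguments": arguments or {}})


def parse_response(line: str) -> Any:
    """Return the "result" member of a response line, raising on a JSON-RPC error."""
    message = json.loads(line)
    if message.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if "error" in message:
        error = message["error"]
        raise RuntimeError(f"JSON-RPC error {error.get('code')}: {error.get('message')}")
    return message.get("result")
```

A client would write each line returned by `build_request()` to the server's stdin and pass each line read from its stdout to `parse_response()`.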


🛠️ Available Tools

web_search

Search the web using multiple privacy‑focused engines in parallel.

| Name | Type | Required | Description |
|---|---|---|---|
| query | string | ✅ | Search query (max 500 characters). |
| engine | string | ❌ (default all) | Engine to use (duckduckgo, brave, startpage, ecosia, mojeek, yandex or all). |
| max_results | integer | ❌ (default 20) | Max results per engine (1‑50). |

Example

{
  "jsonrpc": "2.0",
  "id": 10,
  "method": "tools/call",
  "params": {
    "name": "web_search",
    "arguments": {
      "query": "privacy focused search engines",
      "engine": "duckduckgo",
      "max_results": 15
    }
  }
}

The server returns a formatted text block containing titles, URLs and snippets from each selected engine.
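
To illustrate how a parallel, multi-engine search with a concurrency cap might look, here is a minimal asyncio sketch. It is not the server's code: `fake_engine_search` is a hypothetical stand-in for a real HTTP request, `search_all` is an invented name, and the value 3 for `MAX_CONCURRENT_SEARCHES` is assumed (the README names the constant but not its value).

```python
import asyncio
from typing import Dict, List

MAX_CONCURRENT_SEARCHES = 3  # assumed value for illustration
ENGINES = ["duckduckgo", "brave", "startpage", "ecosia", "mojeek", "yandex"]


async def fake_engine_search(engine: str, query: str, max_results: int) -> List[str]:
    """Hypothetical stand-in for one engine's network request."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [f"{engine}: result {i} for {query!r}" for i in range(1, max_results + 1)]


async def search_all(query: str, engine: str = "all", max_results: int = 20) -> Dict[str, List[str]]:
    """Query the selected engines concurrently, capped by a semaphore."""
    if engine != "all" and engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine}")
    if not 1 <= max_results <= 50:
        raise ValueError("max_results must be between 1 and 50")

    selected = ENGINES if engine == "all" else [engine]
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_SEARCHES)

    async def limited(name: str) -> List[str]:
        # At most MAX_CONCURRENT_SEARCHES coroutines run this body at once.
        async with semaphore:
            return await fake_engine_search(name, query, max_results)

    # gather() preserves input order, so results line up with `selected`.
    results = await asyncio.gather(*(limited(name) for name in selected))
    return dict(zip(selected, results))
```

Calling `asyncio.run(search_all("privacy focused search engines", engine="duckduckgo", max_results=15))` mirrors the arguments of the JSON example above.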


social_search

Search public content on major social‑media platforms.

| Name | Type | Required | Description |
|---|---|---|---|
| query | string | ✅ | Search query (max 500 characters). |
| platform | string | ❌ (default all) | Platform to search (twitter, reddit, youtube, github, stackoverflow, medium, pinterest, tiktok, instagram, facebook, linkedin or all). |

Example

{
  "jsonrpc": "2.0",
  "id": 11,
  "method": "tools/call",
  "params": {
    "name": "social_search",
    "arguments": {
      "query": "AI ethics research",
      "platform": "reddit"
    }
  }
}

The response contains direct URLs that can be opened in a browser.


archives_search

Find archived versions of a URL across multiple web‑archive services.

| Name | Type | Required | Description |
|---|---|---|---|
| url | string | ✅ | Complete URL (must include http:// or https://). |
| service | string | ❌ (default all) | Archive service (wayback, archive_today, google_cache, bing_cache, yandex_cache, cachedview, ghostarchive or all). |
| check_availability | boolean | ❌ (default false) | When true, the server queries the Wayback Machine API for snapshot statistics. |

Example

{
  "jsonrpc": "2.0",
  "id": 12,
  "method": "tools/call",
  "params": {
    "name": "archives_search",
    "arguments": {
      "url": "https://example.com",
      "service": "wayback",
      "check_availability": true
    }
  }
}

The response lists archive URLs and, if requested, snapshot counts and timestamps.
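
The README does not say which Wayback Machine endpoint `check_availability` uses. As a rough illustration, the sketch below targets the Internet Archive's public availability endpoint, which reports only the closest snapshot; snapshot counts would need a different API. The helper names `availability_url` and `closest_snapshot` are hypothetical, and no network request is made here.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlencode

# Public Internet Archive endpoint; whether the server uses it is an assumption.
WAYBACK_AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url: str) -> str:
    """Build the availability query URL for a page, percent-encoding the target."""
    return f"{WAYBACK_AVAILABILITY_ENDPOINT}?{urlencode({'url': page_url})}"


def closest_snapshot(payload: Dict[str, Any]) -> Optional[Dict[str, str]]:
    """Extract the closest snapshot's URL and timestamp from a decoded response."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None  # no snapshot recorded for this page
    return {"url": closest["url"], "timestamp": closest["timestamp"]}
```

The JSON body returned by the endpoint would be decoded with `json.loads()` and passed to `closest_snapshot()`; an empty `archived_snapshots` object means no capture was found.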


list_engines

List all available privacy‑focused search engines.

This tool takes no parameters.

Example

{ "jsonrpc": "2.0", "id": 13, "method": "tools/call", "params": { "name": "list_engines", "arguments": {} } }

The server returns a markdown‑formatted overview of each engine and usage notes.


list_archives_services

List all supported web‑archive services.

This tool takes no parameters.

Example

{ "jsonrpc": "2.0", "id": 14, "method": "tools/call", "params": { "name": "list_archives_services", "arguments": {} } }

The response includes a description of each service, its ID and key features.


clear_cache

Clear the internal search‑result cache.

This tool takes no parameters.

Example

{ "jsonrpc": "2.0", "id": 15, "method": "tools/call", "params": { "name": "clear_cache", "arguments": {} } }

The server replies with a confirmation message.


🔐 Security Features

  • Query sanitisation – strips control characters, removes HTML tags and enforces MAX_QUERY_LENGTH (500).

  • Strict URL validation – accepts only http:// or https:// schemes with a valid domain.

  • Blocked patterns – regexes prevent <script> injection, javascript: URIs and event‑handler attributes.

  • Input schema enforcement – each tool validates required fields via the JSON‑RPC inputSchema.

  • Rate limiting – an asyncio semaphore caps concurrent external requests (MAX_CONCURRENT_SEARCHES).
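
The first three checks in the list above could be sketched as follows. Everything here is an approximation under stated assumptions: the regexes are illustrative rather than the server's actual patterns, over-long queries are rejected (the README does not say whether they are rejected or truncated), and "valid domain" is taken to mean a dotted hostname with an alphabetic top-level label. The function names are hypothetical.

```python
import re
from urllib.parse import urlparse

MAX_QUERY_LENGTH = 500  # limit stated in the README

_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")
_HTML_TAGS = re.compile(r"<[^>]*>")
# Illustrative patterns only; the event-handler one is deliberately broad.
_BLOCKED_PATTERNS = [
    re.compile(r"<\s*script", re.IGNORECASE),    # <script> injection
    re.compile(r"javascript\s*:", re.IGNORECASE),  # javascript: URIs
    re.compile(r"\bon\w+\s*=", re.IGNORECASE),   # onclick=, onload=, ...
]
_DOMAIN = re.compile(
    r"(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}", re.IGNORECASE
)


def sanitize_query(raw: str) -> str:
    """Reject blocked patterns, strip control characters and tags, enforce length."""
    if any(pattern.search(raw) for pattern in _BLOCKED_PATTERNS):
        raise ValueError("query contains a blocked pattern")
    cleaned = _CONTROL_CHARS.sub(" ", raw)
    cleaned = _HTML_TAGS.sub("", cleaned)
    cleaned = " ".join(cleaned.split())  # collapse runs of whitespace
    if not cleaned:
        raise ValueError("query is empty after sanitisation")
    if len(cleaned) > MAX_QUERY_LENGTH:
        raise ValueError(f"query exceeds {MAX_QUERY_LENGTH} characters")
    return cleaned


def validate_url(url: str) -> str:
    """Accept only http(s) URLs whose host looks like a domain name."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("URL must start with http:// or https://")
    if not _DOMAIN.fullmatch(parsed.hostname or ""):
        raise ValueError("URL does not contain a valid domain")
    return url
```

Rejecting on a blocked pattern before stripping tags is a design choice for this sketch: it keeps a `<script>` payload from being silently rewritten into an innocuous-looking query.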


© 2025 Undici77 – All rights reserved.
