by anuragsindhu

MCP Websearch Aggregator

Overview

mcp-websearch-aggregator is a Python microservice that exposes a JSON-RPC /mcp endpoint and aggregates search results from DuckDuckGo, Wikidata, and ArXiv into a single markdown response.

This repository follows an enterprise-friendly src/ layout and contains automated tests, configuration via environment variables, and Prometheus-compatible metrics.
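As a rough illustration of the aggregation described above, the sketch below queries three sources concurrently and renders one markdown section per source. The three fetch functions are hypothetical stand-ins that return canned data; the real service modules, their names, and their signatures may differ.

```python
import asyncio


# Hypothetical stand-ins for the real service clients. Each returns a list of
# result lines for the query; the real modules would call the remote APIs.
async def search_duckduckgo(query: str) -> list[str]:
    return [f"DuckDuckGo result for {query}"]


async def search_wikidata(query: str) -> list[str]:
    return [f"Wikidata result for {query}"]


async def search_arxiv(query: str) -> list[str]:
    return [f"ArXiv result for {query}"]


def format_markdown(sections: dict[str, list[str]]) -> str:
    """Render one '## <source>' heading per source, followed by its results."""
    blocks = []
    for source, lines in sections.items():
        body = "\n".join(f"- {line}" for line in lines) or "_No results._"
        blocks.append(f"## {source}\n{body}")
    return "\n\n".join(blocks)


async def aggregate(query: str) -> str:
    # Query all three sources concurrently; gather preserves argument order.
    ddg, wikidata, arxiv = await asyncio.gather(
        search_duckduckgo(query),
        search_wikidata(query),
        search_arxiv(query),
    )
    return format_markdown({"DuckDuckGo": ddg, "Wikidata": wikidata, "ArXiv": arxiv})


if __name__ == "__main__":
    print(asyncio.run(aggregate("machine learning")))
```

Running the sources concurrently keeps total latency close to that of the slowest source rather than the sum of all three.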

Quick Start

git clone <repo-url>
cd mcp-websearch-aggregator
uv venv
source .venv/bin/activate
uv sync
python main.py

The server listens on http://127.0.0.1:3000. Send JSON-RPC requests as HTTP POST to /mcp.

Prerequisites

macOS

This project targets Python 3.14. On macOS, install the required runtime and tooling with Homebrew.

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install python@3.14
brew install uv

Verify your installations, then create the virtual environment and install dependencies:

python3.14 --version
uv --version
uv venv
source .venv/bin/activate
uv sync

Repository Layout

.
├── README.md
├── pyproject.toml
├── main.py
├── src/
│   └── mcp_websearch_aggregator/
│       ├── config.py
│       ├── logging.py
│       ├── main.py
│       ├── middleware.py
│       ├── metrics.py
│       ├── server.py
│       ├── services/
│       │   ├── arxiv.py
│       │   ├── duckduckgo.py
│       │   └── wikidata.py
│       └── utils/
│           └── formatter.py
└── tests/
    ├── integration/
    └── unit/

Configuration

The service is configured using environment variables. Defaults are provided in src/mcp_websearch_aggregator/config.py.

| Variable | Default | Description |
| --- | --- | --- |
| ENVIRONMENT | dev | Runtime environment identifier (dev, uat, prod) |
| SERVER_HOST | 127.0.0.1 | Host interface for the HTTP server |
| SERVER_PORT | 3000 | Port for the HTTP server |
| LOG_LEVEL | INFO | Logging verbosity |
| REQUEST_TIMEOUT | 10 | HTTP request timeout in seconds |
| MAX_RESULTS | 1 | Maximum results fetched per third-party service |
| DUCKDUCKGO_API | https://api.duckduckgo.com | DuckDuckGo service endpoint |
| WIKIDATA_API | https://www.wikidata.org/w/api.php | Wikidata service endpoint |
| ARXIV_API | http://export.arxiv.org/api/query | ArXiv service endpoint |
| METRICS_ENABLED | true | Enable Prometheus metrics export |
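The following is a minimal sketch of how these variables could be read with their documented defaults. The Settings class and load_settings function are hypothetical names chosen for illustration; the actual contents of config.py are not shown in this README and may be structured differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    # Defaults mirror the configuration table above.
    environment: str = "dev"
    server_host: str = "127.0.0.1"
    server_port: int = 3000
    log_level: str = "INFO"
    request_timeout: float = 10.0
    max_results: int = 1
    duckduckgo_api: str = "https://api.duckduckgo.com"
    wikidata_api: str = "https://www.wikidata.org/w/api.php"
    arxiv_api: str = "http://export.arxiv.org/api/query"
    metrics_enabled: bool = True


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, falling back to defaults."""
    d = Settings()
    return Settings(
        environment=env.get("ENVIRONMENT", d.environment),
        server_host=env.get("SERVER_HOST", d.server_host),
        server_port=int(env.get("SERVER_PORT", d.server_port)),
        log_level=env.get("LOG_LEVEL", d.log_level).upper(),
        request_timeout=float(env.get("REQUEST_TIMEOUT", d.request_timeout)),
        max_results=int(env.get("MAX_RESULTS", d.max_results)),
        duckduckgo_api=env.get("DUCKDUCKGO_API", d.duckduckgo_api),
        wikidata_api=env.get("WIKIDATA_API", d.wikidata_api),
        arxiv_api=env.get("ARXIV_API", d.arxiv_api),
        metrics_enabled=_as_bool(env.get("METRICS_ENABLED", str(d.metrics_enabled))),
    )
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in unit tests without touching the process environment.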

Example environment variables (defaults, with LOG_LEVEL raised to DEBUG for local development)

export ENVIRONMENT=dev
export SERVER_HOST=127.0.0.1
export SERVER_PORT=3000
export LOG_LEVEL=DEBUG
export REQUEST_TIMEOUT=10
export MAX_RESULTS=1
export METRICS_ENABLED=true

Use a dedicated .env/ folder to organize environment-specific configuration files. This keeps configs organized and scalable as the project grows.

| Profile | Path (from repo root) | Purpose | Key Settings |
| --- | --- | --- | --- |
| DEV | .env/dev | Local development | SERVER_HOST=127.0.0.1, LOG_LEVEL=DEBUG, REQUEST_TIMEOUT=10 |
| UAT | .env/uat | User acceptance testing | SERVER_HOST=0.0.0.0, LOG_LEVEL=INFO, REQUEST_TIMEOUT=15 |
| PROD | .env/prod | Production deployment | SERVER_HOST=0.0.0.0, LOG_LEVEL=INFO, REQUEST_TIMEOUT=20 |

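Create the folder and one profile file per environment (the profiles are files, not directories, because they are loaded with source):

mkdir -p .env
touch .env/dev .env/uat .env/prod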

Each file should define the same variables (shown in the Configuration section above). Load a profile before running the app:

source .env/dev
python main.py

For UAT and PROD, keep these profile files secure and out of public source control. Alternatively, inject environment variables through your CI/CD or container runtime instead of using local files.

Sourcing a profile loads its values into the current shell session, so no additional export commands are needed afterwards. This only works if each line in the profile file uses the export VAR=value form; plain VAR=value assignments stay local to the shell and are not passed on to the python process.
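If a profile needs to be loaded from Python rather than from the shell (for example in a test fixture), a small parser is enough for simple files. This is a sketch under the assumption that profile files contain only export NAME=value or NAME=value lines and comments; parse_profile is a hypothetical helper, not part of the package, and it does not perform shell expansion.

```python
import shlex


def parse_profile(text: str) -> dict[str, str]:
    """Parse lines of the form 'export NAME=value' or 'NAME=value'.

    Blank lines and '#' comments are skipped. Quotes are handled by shlex,
    but shell expansion such as $HOME or $(...) is not performed.
    """
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        tokens = shlex.split(line, comments=True)
        if tokens and tokens[0] == "export":
            tokens = tokens[1:]
        for token in tokens:
            name, sep, value = token.partition("=")
            if sep and name.isidentifier():
                values[name] = value
    return values
```

A caller could apply the result with os.environ.update(parse_profile(open(".env/dev").read())) before starting the application.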

Environment Profiles

This repository supports explicit environment profiles via the ENVIRONMENT variable. Each profile is a preset of environment variables tailored for a specific deployment context.

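| Setting | DEV | UAT | PROD |
| --- | --- | --- | --- |
| ENVIRONMENT | dev | uat | prod |
| SERVER_HOST | 127.0.0.1 | 0.0.0.0 | 0.0.0.0 |
| SERVER_PORT | 3000 | 3000 | 3000 |
| LOG_LEVEL | DEBUG | INFO | INFO |
| REQUEST_TIMEOUT | 10 | 15 | 20 |
| MAX_RESULTS | 1 | 1 | 1 |
| METRICS_ENABLED | true | true | true |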
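The presets in the table can be expressed as a shared base plus per-profile differences. The sketch below is one possible way to resolve a profile by name and let explicitly supplied values win; resolve_profile and the two dictionaries are hypothetical and only mirror the table, not the project's actual code.

```python
from typing import Optional

# Values shared by every profile, taken from the table above.
SHARED = {"SERVER_PORT": "3000", "MAX_RESULTS": "1", "METRICS_ENABLED": "true"}

# Settings that differ between profiles.
PROFILE_OVERRIDES = {
    "dev": {"SERVER_HOST": "127.0.0.1", "LOG_LEVEL": "DEBUG", "REQUEST_TIMEOUT": "10"},
    "uat": {"SERVER_HOST": "0.0.0.0", "LOG_LEVEL": "INFO", "REQUEST_TIMEOUT": "15"},
    "prod": {"SERVER_HOST": "0.0.0.0", "LOG_LEVEL": "INFO", "REQUEST_TIMEOUT": "20"},
}


def resolve_profile(environment: str, explicit: Optional[dict[str, str]] = None) -> dict[str, str]:
    """Merge shared values, the named profile, and explicit overrides (last wins)."""
    name = environment.lower()
    if name not in PROFILE_OVERRIDES:
        raise ValueError(f"unknown ENVIRONMENT: {environment!r}")
    return {"ENVIRONMENT": name, **SHARED, **PROFILE_OVERRIDES[name], **(explicit or {})}
```

For example, resolve_profile("uat", {"SERVER_PORT": "8080"}) keeps the UAT timeout of 15 seconds while changing only the port.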

Development

Install dependencies

uv sync

This reads pyproject.toml and installs all runtime and development dependencies into the active virtual environment.

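The full development workflow, in order:

  1. Create a virtual environment: uv venv

  2. Activate the environment: source .venv/bin/activate

  3. Install dependencies: uv sync

  4. Run the application locally: python main.py

  5. Execute tests: uv run pytest -q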

Running the app locally

Use the root launcher:

python main.py

Or run the ASGI app directly with Uvicorn:

uvicorn mcp_websearch_aggregator.server:app --host 127.0.0.1 --port 3000 --reload

Debugging

  • Inspect service logs on startup and request handling

  • Use the /metrics endpoint for runtime metrics

  • Increase logging verbosity with LOG_LEVEL=DEBUG
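When inspecting /metrics while debugging, it can help to turn the text output into a dictionary. The parser below is a simplified sketch that assumes samples carry no trailing timestamp; the metric names in the usage note are hypothetical, since this README does not list the metrics the service actually exports.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Map each 'name{labels}' series to its value.

    Lines starting with '#' (HELP/TYPE comments) and blank lines are skipped.
    Assumes each sample line is '<series> <value>' with no timestamp.
    """
    samples: dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last space so label values containing spaces survive.
        series, _, value = line.rpartition(" ")
        if series:
            samples[series] = float(value)
    return samples
```

Given the body of a /metrics response, parse_metrics(body).get("some_counter_total") would return that counter's current value, or None if the service does not export a metric with that name.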

Testing

Run all tests with Pytest:

uv run pytest -q

Run a specific test file:

uv run pytest tests/unit/test_duckduckgo.py -q

Code Coverage

Measure code coverage with pytest-cov:

uv run pytest --cov=src/mcp_websearch_aggregator --cov-report=term-missing

Generate an HTML coverage report:

uv run pytest --cov=src/mcp_websearch_aggregator --cov-report=html

View the report in your browser:

open htmlcov/index.html

Coverage configuration is defined in .coveragerc. The project enforces a minimum 80% code coverage threshold. Excluded lines are marked with # pragma: no cover.

Coverage metrics:

  • Line coverage: % of lines executed

  • Branch coverage: % of conditional branches exercised

  • Missing lines: source lines not covered by tests
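The line coverage figure and the 80% threshold reduce to simple arithmetic, sketched here with hypothetical helper names. The coverage tool computes this itself; the snippet only makes the calculation explicit.

```python
def coverage_percent(covered: int, total: int) -> float:
    """Percentage of items (lines or branches) exercised by the tests."""
    # An empty module has nothing left uncovered, so treat it as fully covered.
    return 100.0 if total == 0 else 100.0 * covered / total


def meets_threshold(covered: int, total: int, minimum: float = 80.0) -> bool:
    """True when coverage reaches the project's minimum percentage."""
    return coverage_percent(covered, total) >= minimum
```

With 170 of 200 lines executed, coverage is 85% and the check passes; with 159 of 200 it is 79.5% and the check fails.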

Packaging

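Build a source distribution and wheel. Environments created with uv venv do not include pip unless --seed is passed, so either seed the environment or use uv build and uv pip install in place of the pip commands below: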

python -m pip install --upgrade build
python -m build

Install the package locally for validation:

python -m pip install dist/mcp_websearch_aggregator-0.1.0-py3-none-any.whl

Running

The service exposes two endpoints:

  • POST /mcp — JSON-RPC search endpoint

  • GET /metrics — Prometheus metrics endpoint

Example run command:

uvicorn mcp_websearch_aggregator.server:app --host 0.0.0.0 --port 3000

Deployment

For production, use an ASGI server with process management and explicit environment configuration.

Deploy to UAT

export ENVIRONMENT=uat
export SERVER_HOST=0.0.0.0
export SERVER_PORT=3000
export LOG_LEVEL=INFO
export REQUEST_TIMEOUT=15
export MAX_RESULTS=1
export METRICS_ENABLED=true

gunicorn -k uvicorn.workers.UvicornWorker mcp_websearch_aggregator.server:app --bind 0.0.0.0:3000

Deploy to PROD

export ENVIRONMENT=prod
export SERVER_HOST=0.0.0.0
export SERVER_PORT=3000
export LOG_LEVEL=INFO
export REQUEST_TIMEOUT=20
export MAX_RESULTS=1
export METRICS_ENABLED=true

gunicorn -k uvicorn.workers.UvicornWorker mcp_websearch_aggregator.server:app --bind 0.0.0.0:3000

Docker / containerized deployment

For container deployment, pass environment variables through your container runtime or orchestration platform.

docker run -e ENVIRONMENT=prod -e SERVER_HOST=0.0.0.0 -e SERVER_PORT=3000 \
  -e LOG_LEVEL=INFO -e REQUEST_TIMEOUT=20 -e MAX_RESULTS=1 -e METRICS_ENABLED=true \
  your-image-name

API Contract

OpenAPI Documentation

FastAPI automatically generates interactive API documentation from the service's Pydantic models. Access it at:

  • Swagger UI: http://127.0.0.1:3000/docs — interactive API explorer

  • ReDoc: http://127.0.0.1:3000/redoc — alternative documentation view

The /docs endpoint allows you to test the /mcp endpoint directly in your browser.

JSON-RPC Endpoint: /mcp

The service implements the JSON-RPC 2.0 specification. It accepts POST requests and returns JSON-RPC responses.

Request format

{
  "jsonrpc": "2.0",
  "method": "search",
  "params": {"query": "machine learning"},
  "id": 1
}

Fields:

  • jsonrpc: always "2.0"

  • method: "search" (any other method returns a "Method not found" error)

  • params: object with query field (search term)

  • id: unique request identifier for correlation
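A request with these fields can be sent from Python using only the standard library. The sketch below assumes the server is running locally on the default host and port; build_search_request and call_mcp are hypothetical helper names, not part of the package.

```python
import json
import urllib.request


def build_search_request(query: str, request_id: int = 1) -> dict:
    """Build the JSON-RPC 2.0 envelope for the search method."""
    return {
        "jsonrpc": "2.0",
        "method": "search",
        "params": {"query": query},
        "id": request_id,
    }


def call_mcp(query: str, base_url: str = "http://127.0.0.1:3000", timeout: float = 10.0) -> dict:
    """POST the request to /mcp and return the decoded JSON-RPC response."""
    payload = json.dumps(build_search_request(query)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/mcp",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, call_mcp("machine learning") returns the response dictionary described in the following sections.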

Success response

{
  "jsonrpc": "2.0",
  "result": "## DuckDuckGo\n...\n\n## Wikidata\n...\n\n## ArXiv\n...",
  "id": 1
}

Fields:

  • result: markdown-formatted aggregated results from all sources
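Because the result is a single markdown string, a client that wants per-source output has to split it again. The sketch below assumes each source appears under a level-two heading ("## Name"), as in the example response; split_sections is a hypothetical client-side helper.

```python
def split_sections(markdown: str) -> dict[str, str]:
    """Split a result string into {heading: body} using '## ' headings."""
    sections: dict[str, str] = {}
    current = None
    lines: list[str] = []
    for line in markdown.splitlines():
        if line.startswith("## "):
            # Close the previous section before starting a new one.
            if current is not None:
                sections[current] = "\n".join(lines).strip()
            current = line[3:].strip()
            lines = []
        elif current is not None:
            lines.append(line)
    if current is not None:
        sections[current] = "\n".join(lines).strip()
    return sections
```

Text that appears before the first heading is ignored, which keeps the helper tolerant of any preamble the formatter might add.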

Error response

{
  "jsonrpc": "2.0",
  "error": {
    "code": -32600,
    "message": "Invalid Request"
  },
  "id": null
}

Common error codes:

  • -32700: Parse error (malformed JSON)

  • -32600: Invalid Request (missing/invalid fields)

  • -32601: Method not found (unsupported method)

  • -32603: Internal error (server failure)
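On the client side, these codes can be surfaced as exceptions so that callers do not have to inspect the envelope by hand. This is a sketch with hypothetical names (JSONRPCCallError, unwrap); it covers only the four codes listed above.

```python
# Standard JSON-RPC 2.0 error codes listed above.
ERROR_NAMES = {
    -32700: "Parse error",
    -32600: "Invalid Request",
    -32601: "Method not found",
    -32603: "Internal error",
}


class JSONRPCCallError(Exception):
    """Raised when a response carries an 'error' member."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def unwrap(response: dict) -> str:
    """Return the markdown result, or raise if the server reported an error."""
    error = response.get("error")
    if error is not None:
        code = error.get("code", -32603)
        message = error.get("message") or ERROR_NAMES.get(code, "Unknown error")
        raise JSONRPCCallError(code, message)
    return response["result"]
```

A caller can then catch JSONRPCCallError and branch on its code attribute, for example retrying on -32603 but not on -32601.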

Pydantic Models

The service uses Pydantic for request/response validation and schema documentation:

  • SearchParams: query string parameter definition

  • JSONRPCRequest: complete request envelope

  • JSONRPCResponse: success response with markdown result

  • JSONRPCError: error object structure

  • JSONRPCErrorResponse: error response envelope

These models are automatically documented in the OpenAPI schema and visible in the Swagger UI.

Troubleshooting

  • ModuleNotFoundError: ensure src/ is on PYTHONPATH or run via python main.py

  • ConnectionError: verify outbound internet access to DuckDuckGo, Wikidata, and ArXiv

  • TimeoutError: increase REQUEST_TIMEOUT

Notes

  • The repository uses src/ package layout for clean import boundaries.

  • The only supported JSON-RPC method is search; any other method returns a Method not found error.

  • Metrics are exposed at /metrics in the Prometheus plain-text format for scraping.
