mcp-websearch-aggregator
Aggregates academic paper search results from ArXiv, providing access to scientific literature.
Aggregates search results from DuckDuckGo, providing web search capabilities within the MCP server.
Aggregates structured data queries from Wikidata, enabling knowledge graph searches.
MCP Websearch Aggregator
Overview
mcp-websearch-aggregator is a Python microservice that exposes a JSON-RPC /mcp endpoint and aggregates search results from DuckDuckGo, Wikidata, and ArXiv into a single markdown response.
This repository follows an enterprise-friendly src/ layout and contains automated tests, configuration via environment variables, and Prometheus-compatible metrics.
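The aggregation step described in the overview can be pictured with a minimal sketch. It assumes each source returns a list of plain-text result lines; the `format_results` helper and the "no results" fallback are hypothetical and are not taken from `utils/formatter.py`. The `## <source>` heading layout follows the success response documented under API Contract.

```python
from typing import Dict, List


def format_results(results: Dict[str, List[str]]) -> str:
    """Join per-source results into one markdown string (hypothetical helper)."""
    sections = []
    for source, items in results.items():
        # Assumption: an empty source is reported rather than silently dropped.
        body = "\n".join(f"- {item}" for item in items) or "_No results._"
        sections.append(f"## {source}\n{body}")
    # A blank line separates sections, matching the "\n\n## " pattern in the example response.
    return "\n\n".join(sections)


sample = {
    "DuckDuckGo": ["Example web result"],
    "Wikidata": [],
    "ArXiv": ["Example paper title"],
}
markdown = format_results(sample)
```

Keeping the formatter a pure function of already-fetched results makes it easy to unit test without network access.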
Quick Start
git clone <repo-url>
cd mcp-websearch-aggregator
uv venv
source .venv/bin/activate
uv sync
python main.py
The server listens on http://127.0.0.1:3000; send JSON-RPC POST requests to /mcp.
Prerequisites
macOS
This project targets Python 3.14. On macOS, install the required runtime and tooling with Homebrew.
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install python@3.14
brew install uv
Verify your installations:
python3.14 --version
uv --version
Recommended local environment
uv venv
source .venv/bin/activate
uv sync
Repository Layout
.
├── README.md
├── pyproject.toml
├── main.py
├── src/
│   └── mcp_websearch_aggregator/
│       ├── config.py
│       ├── logging.py
│       ├── main.py
│       ├── middleware.py
│       ├── metrics.py
│       ├── server.py
│       ├── services/
│       │   ├── arxiv.py
│       │   ├── duckduckgo.py
│       │   └── wikidata.py
│       └── utils/
│           └── formatter.py
└── tests/
    ├── integration/
    └── unit/
Configuration
The service is configured using environment variables. Defaults are provided in src/mcp_websearch_aggregator/config.py.
Variable | Default | Description |
ENVIRONMENT | dev | Runtime environment identifier (dev, uat, or prod) |
SERVER_HOST | 127.0.0.1 | Host interface for the HTTP server |
SERVER_PORT | 3000 | Port for the HTTP server |
LOG_LEVEL | DEBUG | Logging verbosity |
REQUEST_TIMEOUT | 10 | HTTP request timeout in seconds |
MAX_RESULTS | 1 | Maximum results fetched per third-party service |
 | | DuckDuckGo service endpoint |
 | | Wikidata service endpoint |
 | | ArXiv service endpoint |
METRICS_ENABLED | true | Enable Prometheus metrics export |
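To illustrate how environment-variable configuration with defaults typically works, here is a minimal sketch. The `Settings` dataclass and `load_settings` function are hypothetical and may differ from what `src/mcp_websearch_aggregator/config.py` actually contains; the fallback values mirror the "Default environment variables" listing in this section, and the accepted boolean spellings are an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    environment: str
    server_host: str
    server_port: int
    log_level: str
    request_timeout: float
    max_results: int
    metrics_enabled: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment, falling back to the documented defaults."""
    return Settings(
        environment=env.get("ENVIRONMENT", "dev"),
        server_host=env.get("SERVER_HOST", "127.0.0.1"),
        server_port=int(env.get("SERVER_PORT", "3000")),
        log_level=env.get("LOG_LEVEL", "DEBUG").upper(),
        request_timeout=float(env.get("REQUEST_TIMEOUT", "10")),
        max_results=int(env.get("MAX_RESULTS", "1")),
        # Assumption: "true", "1", and "yes" (any case) enable metrics.
        metrics_enabled=env.get("METRICS_ENABLED", "true").lower() in {"true", "1", "yes"},
    )


# Passing a plain dict instead of os.environ keeps the example deterministic.
settings = load_settings({"SERVER_PORT": "8080", "METRICS_ENABLED": "false"})
```

Converting values to typed fields at startup means a malformed setting such as `SERVER_PORT=abc` fails immediately with a `ValueError` instead of surfacing later during a request.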
Default environment variables
export ENVIRONMENT=dev
export SERVER_HOST=127.0.0.1
export SERVER_PORT=3000
export LOG_LEVEL=DEBUG
export REQUEST_TIMEOUT=10
export MAX_RESULTS=1
export METRICS_ENABLED=true
Recommended config files
Use a dedicated .env/ folder to organize environment-specific configuration files. This keeps configs organized and scalable as the project grows.
Profile | Path (from repo root) | Purpose | Key Settings |
DEV | .env/dev | Local development | |
UAT | .env/uat | User acceptance testing | |
PROD | .env/prod | Production deployment | |
Create the folder and the profile files:
mkdir -p .env
touch .env/dev .env/uat .env/prod
Each file should define the same variables (shown in the Configuration section above). Load a profile before running the app:
source .env/dev
python main.py
For UAT and PROD, keep these profile files secure and out of public source control. Alternatively, inject environment variables through your CI/CD or container runtime instead of using local files.
Because profile files already define environment variables, you do not need to run additional export commands after sourcing them. The shell will load the values into the current session automatically.
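When a tool cannot `source` a shell file (an IDE run configuration or a test fixture, for instance), the same profile can be read from Python. This is a minimal sketch that assumes profile files contain only simple `export KEY=value` lines with no shell expansion; `parse_profile` is a hypothetical helper, not part of the repository.

```python
from typing import Dict


def parse_profile(text: str) -> Dict[str, str]:
    """Parse `export KEY=value` lines (assumed profile format) into a dict."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that are not assignments
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


sample_profile = 'export ENVIRONMENT=uat\n# comment\nexport LOG_LEVEL="INFO"\nREQUEST_TIMEOUT=15\n'
profile = parse_profile(sample_profile)
```

The resulting dict can be merged into `os.environ` before the application starts, which has the same effect as sourcing the file in a shell.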
Environment Profiles
This repository supports explicit environment profiles via the ENVIRONMENT variable. Each profile is a preset of environment variables tailored for a specific deployment context.
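Setting | DEV | UAT | PROD |
ENVIRONMENT | dev | uat | prod |
SERVER_HOST | 127.0.0.1 | 0.0.0.0 | 0.0.0.0 |
SERVER_PORT | 3000 | 3000 | 3000 |
LOG_LEVEL | DEBUG | INFO | INFO |
REQUEST_TIMEOUT | 10 | 15 | 20 |
MAX_RESULTS | 1 | 1 | 1 |
METRICS_ENABLED | true | true | true |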
Development
Install dependencies
uv sync
This reads pyproject.toml and installs all runtime and development dependencies into the active virtual environment.
Recommended workflow
Create a virtual environment
Activate the environment
Install dependencies
Run the application locally
Execute tests
Running the app locally
Use the root launcher:
python main.py
Or run the ASGI app directly with Uvicorn:
uvicorn mcp_websearch_aggregator.server:app --host 127.0.0.1 --port 3000 --reload
Debugging
Inspect service logs on startup and request handling
Use the /metrics endpoint for runtime metrics
Increase logging verbosity with LOG_LEVEL=DEBUG
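When debugging, it can help to read the `/metrics` output programmatically instead of by eye. The sketch below parses the Prometheus text exposition format under the assumption that samples carry no trailing timestamps; the metric names in the sample payload are illustrative and are not the names this service actually exports.

```python
from typing import Dict


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse simple Prometheus text-format samples into {series: value}."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated token; timestamps are assumed absent.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hypothetical payload for illustration only.
payload = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{path="/mcp"} 42
process_open_fds 12
"""
metrics = parse_metrics(payload)
```

In practice the payload would come from a GET request to `http://127.0.0.1:3000/metrics` while the service is running with `METRICS_ENABLED=true`.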
Testing
Run all tests with Pytest:
uv run pytest -q
Run a specific test file:
uv run pytest tests/unit/test_duckduckgo.py -q
Code Coverage
Measure code coverage with pytest-cov:
uv run pytest --cov=src/mcp_websearch_aggregator --cov-report=term-missing
Generate an HTML coverage report:
uv run pytest --cov=src/mcp_websearch_aggregator --cov-report=html
View the report in your browser:
open htmlcov/index.html
Coverage configuration is defined in .coveragerc. The project enforces a minimum 80% code coverage threshold. Excluded lines are marked with # pragma: no cover.
Coverage metrics:
Line coverage: % of lines executed
Branch coverage: % of conditional branches exercised
Missing lines: source lines not covered by tests
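As a worked example of these metrics, the sketch below computes line and branch coverage for one module and compares the line figure with the 80% threshold. The counts are hypothetical, and applying the threshold to line coverage alone is an assumption; the actual rule is whatever `.coveragerc` specifies.

```python
def coverage_percent(covered: int, total: int) -> float:
    """Return coverage as a percentage; an empty module counts as fully covered."""
    return 100.0 if total == 0 else 100.0 * covered / total


# Hypothetical counts: 172 of 200 lines and 30 of 40 branches exercised.
line_cov = coverage_percent(172, 200)
branch_cov = coverage_percent(30, 40)
meets_threshold = line_cov >= 80.0
```

Here line coverage is 86% while branch coverage is 75%, which shows why the two figures are reported separately: a module can pass on lines while still leaving conditional paths untested.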
Packaging
Build a source distribution and wheel:
python -m pip install --upgrade build
python -m build
Install the package locally for validation:
python -m pip install dist/mcp_websearch_aggregator-0.1.0-py3-none-any.whl
Running
The service exposes two endpoints:
POST /mcp: JSON-RPC search endpoint
GET /metrics: Prometheus metrics endpoint
Example run command:
uvicorn mcp_websearch_aggregator.server:app --host 0.0.0.0 --port 3000
Deployment
For production, use an ASGI server with process management and explicit environment configuration.
Deploy to UAT
export ENVIRONMENT=uat
export SERVER_HOST=0.0.0.0
export SERVER_PORT=3000
export LOG_LEVEL=INFO
export REQUEST_TIMEOUT=15
export MAX_RESULTS=1
export METRICS_ENABLED=true
gunicorn -k uvicorn.workers.UvicornWorker mcp_websearch_aggregator.server:app --bind 0.0.0.0:3000
Deploy to PROD
export ENVIRONMENT=prod
export SERVER_HOST=0.0.0.0
export SERVER_PORT=3000
export LOG_LEVEL=INFO
export REQUEST_TIMEOUT=20
export MAX_RESULTS=1
export METRICS_ENABLED=true
gunicorn -k uvicorn.workers.UvicornWorker mcp_websearch_aggregator.server:app --bind 0.0.0.0:3000
Docker / containerized deployment
For container deployment, pass environment variables through your container runtime or orchestration platform.
docker run -e ENVIRONMENT=prod -e SERVER_HOST=0.0.0.0 -e SERVER_PORT=3000 \
-e LOG_LEVEL=INFO -e REQUEST_TIMEOUT=20 -e MAX_RESULTS=1 -e METRICS_ENABLED=true \
  your-image-name
API Contract
OpenAPI Documentation
FastAPI automatically generates interactive API documentation with Pydantic models. Access it at:
Swagger UI: http://127.0.0.1:3000/docs (interactive API explorer)
ReDoc: http://127.0.0.1:3000/redoc (alternative documentation view)
The /docs endpoint allows you to test the /mcp endpoint directly in your browser.
JSON-RPC Endpoint: /mcp
The service implements the JSON-RPC 2.0 specification. It accepts POST requests and returns JSON-RPC responses.
Request format
{
  "jsonrpc": "2.0",
  "method": "search",
  "params": {"query": "machine learning"},
  "id": 1
}
Fields:
jsonrpc: always "2.0"
method: "search" (other methods return a "Method not found" error)
params: object with a query field (the search term)
id: unique request identifier for correlation
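The request format above can be exercised from Python with only the standard library. This is a minimal client sketch: `build_search_request` and `call_mcp` are hypothetical helper names, and the URL and timeout defaults assume the local development settings described earlier.

```python
import json
import urllib.request
from typing import Any, Dict


def build_search_request(query: str, request_id: int = 1) -> Dict[str, Any]:
    """Build the JSON-RPC 2.0 envelope for the `search` method."""
    return {
        "jsonrpc": "2.0",
        "method": "search",
        "params": {"query": query},
        "id": request_id,
    }


def call_mcp(
    query: str,
    url: str = "http://127.0.0.1:3000/mcp",
    timeout: float = 10.0,
) -> Dict[str, Any]:
    """POST a search request and return the decoded JSON-RPC response."""
    body = json.dumps(build_search_request(query)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the payload needs no running server.
payload = build_search_request("machine learning")
```

With the server running locally, `call_mcp("machine learning")` should return a dict whose `id` matches the request and whose `result` or `error` member carries the outcome.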
Success response
{
  "jsonrpc": "2.0",
  "result": "## DuckDuckGo\n...\n\n## Wikidata\n...\n\n## ArXiv\n...",
  "id": 1
}
Fields:
result: markdown-formatted aggregated results from all sources
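A client that wants per-source text can split the `result` string on its level-2 headings. The sketch below assumes every section starts with a `## ` heading at the beginning of a line and that result bodies never contain such a line themselves; `split_sections` is a hypothetical helper.

```python
import re
from typing import Dict


def split_sections(markdown: str) -> Dict[str, str]:
    """Split the aggregated markdown into {source heading: body}."""
    sections: Dict[str, str] = {}
    # With one capture group, re.split yields [preamble, title1, body1, title2, body2, ...].
    parts = re.split(r"^## +(.+)$", markdown, flags=re.MULTILINE)
    for title, body in zip(parts[1::2], parts[2::2]):
        sections[title.strip()] = body.strip()
    return sections


result = "## DuckDuckGo\nweb hit\n\n## Wikidata\nentity\n\n## ArXiv\npaper"
sections = split_sections(result)
```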
Error response
{
  "jsonrpc": "2.0",
  "error": {
    "code": -32600,
    "message": "Invalid Request"
  },
  "id": null
}
Common error codes:
-32700: Parse error (malformed JSON)
-32600: Invalid Request (missing/invalid fields)
-32601: Method not found (unsupported method)
-32603: Internal error (server failure)
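On the client side, the error codes listed above can be turned into exceptions so callers do not have to inspect raw dicts. This is a sketch only: `JSONRPCCallError` and `unwrap` are hypothetical client-side names and are unrelated to the server's own Pydantic models.

```python
from typing import Any, Dict

# Standard JSON-RPC 2.0 error codes used by the service.
ERROR_NAMES = {
    -32700: "Parse error",
    -32600: "Invalid Request",
    -32601: "Method not found",
    -32603: "Internal error",
}


class JSONRPCCallError(Exception):
    """Raised when a response carries an `error` member (hypothetical type)."""

    def __init__(self, code: int, message: str) -> None:
        name = ERROR_NAMES.get(code, "Unknown error")
        super().__init__(f"{name} ({code}): {message}")
        self.code = code


def unwrap(response: Dict[str, Any]) -> Any:
    """Return `result` from a JSON-RPC response, or raise if it holds an `error`."""
    if "error" in response:
        err = response["error"]
        raise JSONRPCCallError(err["code"], err["message"])
    return response["result"]
```

A caller can then branch on `exc.code`, for example retrying on -32603 while treating -32600 and -32601 as programming errors.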
Pydantic Models
The service uses Pydantic for request/response validation and schema documentation:
SearchParams: query string parameter definition
JSONRPCRequest: complete request envelope
JSONRPCResponse: success response with markdown result
JSONRPCError: error object structure
JSONRPCErrorResponse: error response envelope
These models are automatically documented in the OpenAPI schema and visible in the Swagger UI.
Troubleshooting
ModuleNotFoundError: ensure src/ is on PYTHONPATH or run via python main.py
ConnectionError: verify outbound internet access to DuckDuckGo, Wikidata, and ArXiv
TimeoutError: increase REQUEST_TIMEOUT
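The ConnectionError and TimeoutError entries both stem from upstream calls. One common way to keep a single slow or unreachable source from failing the whole request is to guard each call individually. The sketch below shows that pattern with `asyncio`; it is not a description of how this service is implemented, and `fetch_all` plus the stand-in fetchers are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Fetcher = Callable[[], Awaitable[str]]


async def fetch_all(fetchers: Dict[str, Fetcher], timeout: float) -> Dict[str, str]:
    """Run every fetcher concurrently; a slow or failing source yields a placeholder."""

    async def guarded(name: str, fetch: Fetcher) -> str:
        try:
            return await asyncio.wait_for(fetch(), timeout=timeout)
        except asyncio.TimeoutError:
            return f"{name} timed out after {timeout}s"
        except ConnectionError as exc:
            return f"{name} unavailable: {exc}"

    names = list(fetchers)
    results = await asyncio.gather(*(guarded(n, fetchers[n]) for n in names))
    return dict(zip(names, results))


# Stand-in fetchers so the sketch runs offline; real ones would call the upstream services.
async def fast() -> str:
    return "ok"


async def slow() -> str:
    await asyncio.sleep(1)
    return "late"


outcome = asyncio.run(fetch_all({"ArXiv": fast, "Wikidata": slow}, timeout=0.05))
```

In this run the fast source returns normally while the slow one is replaced by a timeout message, so the caller still receives a partial answer. The `timeout` argument plays the role of `REQUEST_TIMEOUT`.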
Notes
The repository uses a src/ package layout for clean import boundaries.
The only supported JSON-RPC method is search; other methods return Method not found.
Metrics are exported in plaintext for Prometheus scraping.