Web Researcher MCP
Server Configuration
Environment variables that configure the server. Only the two Google Custom Search variables are marked as required; the rest are optional.
| Name | Required | Description | Default |
|---|---|---|---|
| PORT | No | Enable HTTP/SSE mode | |
| SEARXNG_URL | No | SearXNG instance URL | |
| BRAVE_API_KEY | No | Brave Search API key | |
| OAUTH_AUDIENCE | No | Expected JWT audience claim | |
| SEARCH_ROUTING | No | Multi-provider routing with automatic fallback (e.g. brave,google,serper) | |
| SERPER_API_KEY | No | Serper.dev API key | |
| SEARCH_PROVIDER | No | Backend: google, brave, serper, searxng, or searchapi | |
| OAUTH_ISSUER_URL | No | JWT issuer URL for token validation | |
| SEARCHAPI_API_KEY | No | SearchAPI.io API key | |
| GOOGLE_CUSTOM_SEARCH_ID | Yes | Programmable Search Engine ID | |
| GOOGLE_CUSTOM_SEARCH_API_KEY | Yes | Google API key from Google Cloud Console | |
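As a rough illustration of how a launcher script might sanity-check this configuration before starting the server, here is a minimal sketch. The function names are hypothetical, and the parsing rules (comma-separated `SEARCH_ROUTING`, same provider names as `SEARCH_PROVIDER`, order meaning fallback priority) are assumptions inferred from the table, not the server's documented behavior.

```python
from typing import Mapping

# Variable names marked "Yes" in the Required column of the table above.
REQUIRED = ("GOOGLE_CUSTOM_SEARCH_ID", "GOOGLE_CUSTOM_SEARCH_API_KEY")

# Backends listed in the SEARCH_PROVIDER row.
KNOWN_PROVIDERS = {"google", "brave", "serper", "searxng", "searchapi"}


def missing_required(env: Mapping[str, str]) -> list[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED if not env.get(name, "").strip()]


def parse_routing(value: str) -> list[str]:
    """Split a SEARCH_ROUTING value such as 'brave,google,serper' into a list.

    Assumption: entries are comma-separated and their order expresses
    fallback priority. Unknown names are rejected rather than ignored.
    """
    providers = [p.strip().lower() for p in value.split(",") if p.strip()]
    unknown = [p for p in providers if p not in KNOWN_PROVIDERS]
    if unknown:
        raise ValueError(f"unknown provider(s): {', '.join(unknown)}")
    return providers
```

Calling `missing_required(os.environ)` before launch would report which required variables still need to be set, and `parse_routing(os.environ.get("SEARCH_ROUTING", ""))` would give the fallback order, or an empty list when routing is not configured.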
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | { "listChanged": true } |
| logging | {} |
| prompts | { "listChanged": true } |
| resources | { "listChanged": true } |
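A `listChanged` flag means the server may notify the client when the corresponding list changes, so the client should re-fetch it. The sketch below maps the capabilities in the table to the notification methods defined by the MCP specification; the helper name `notifications_to_handle` is hypothetical and the code is only an illustration of how a client might read these flags.

```python
# Notification method names as defined by the MCP specification.
LIST_CHANGED_METHODS = {
    "tools": "notifications/tools/list_changed",
    "prompts": "notifications/prompts/list_changed",
    "resources": "notifications/resources/list_changed",
}

# Capabilities exactly as listed in the table above.
SERVER_CAPABILITIES = {
    "tools": {"listChanged": True},
    "logging": {},
    "prompts": {"listChanged": True},
    "resources": {"listChanged": True},
}


def notifications_to_handle(capabilities: dict) -> list[str]:
    """Return the list-changed notifications a client should expect."""
    return [
        method
        for feature, method in LIST_CHANGED_METHODS.items()
        if capabilities.get(feature, {}).get("listChanged") is True
    ]
```

For the capabilities above, this yields all three list-changed notifications; `logging` contributes none because it declares no `listChanged` flag.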
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| academic_search | Search peer-reviewed papers and scholarly literature using plain natural language; no special syntax is needed. Each result includes the paper's title, authors, journal, year, abstract, citation count, and a PDF link when one is available (pair with scrape_page to read the full text). Reach for this for literature reviews, prior-art research, and finding citations; use web_search for non-academic content or news_search for current events. Results can be narrowed by year, source, or access type. Returns structured JSON, with recovery hints when nothing matches. Results stay fresh for 1 hour. |
| get_research_session | Recover a sequential_search research session after context loss. Returns the session summary, step index, and most recent steps. Use stepId to retrieve full details of a specific earlier step. Sessions persist for 4 hours from last activity and survive server restarts. |
| image_search | Find images on the web matching your description. Filter by size, type (photo, clipart, line art, etc.), dominant color, or file format. Returns up to 10 image links per search. Best for finding visual references or assets; use web_search if you need text content from pages that contain images. Results stay fresh for 30 minutes. |
| news_search | Find recent news articles on any topic, returning each article's headline, source, publish time, and snippet. Defaults to the past week, but the freshness window is tunable for breaking news or for looking further back, and results can be limited to a single outlet. Reach for this when recency matters; use web_search for general content, academic_search for research papers, or search_and_scrape when you need the full article text. Errors come back as structured JSON. Results refresh every 15 minutes. |
| patent_search | Search patents for prior art, competitive landscape mapping, or to look up a specific patent. Query by patent number (e.g. 'US11234567'), an invention description, a company, or an inventor; company name variations are matched automatically. Each result carries the patent's bibliographic details (title, number, abstract, assignee, inventor, dates, status). Reach for this when the question is about inventions or IP; use academic_search for research papers or web_search for general technical content. Zero-result and error responses come back as structured JSON with recovery hints. Results stay fresh for 24 hours. |
| scrape_page | Read a single URL and get back its content: web pages (including JavaScript-heavy sites), PDFs, Word/PowerPoint files, and YouTube transcripts. The best extraction method is picked automatically. Returns readable text plus a ready-to-use citation. Reach for this when you already have a URL and want what's on the page; use search_and_scrape to find and read in one step, or web_search when you only need links. Modes: full (default, cleaned text), preview (a fast first look), and raw (verbatim page bytes with no sanitization; only for inspecting source like JSON or HTML, and the bytes are untrusted, so never execute or render them). Blocked pages and other failures return structured JSON (kind, retryable, suggestedAction). Results stay fresh for 1 hour. |
| search_and_scrape | Search the web and read the full content from the top results, all in one step. Combines content from multiple sources, removes duplicates, and scores each source for quality and relevance. Returns a status field (complete/partial/failed) and per-source quality scores. If some pages fail, scrapeFailures lists each with kind, retryable, and suggestedAction. Use web_search if you only need links, or scrape_page to read one specific URL you already have. |
| sequential_search | Keep track of a multi-step research project. Use this alongside web_search or search_and_scrape to record what you've found at each step, note unanswered questions, and explore alternative angles (branching). Start a new session with stepNumber=1, then pass the returned sessionId for each follow-up step. Mark the session complete by setting nextStepNeeded=false. Sessions stay active for 4 hours between steps and persist across restarts. Use get_research_session to recover a session after context loss. |
| web_search | Search the web and get a list of relevant pages with titles and snippets, without reading the full page content. Narrow results to one domain with the site parameter, or apply a search lens to restrict to trusted sites in a field (see the lens parameter for the full list). Use search_and_scrape if you need full page text, news_search for current events, or academic_search for research papers. Results stay fresh for 30 minutes; use time_range to get more recent results. |
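The sequential_search session flow described in the table can be sketched as a series of MCP `tools/call` requests. The JSON-RPC envelope follows the MCP specification, and `stepNumber`, `sessionId`, and `nextStepNeeded` come from the tool description; the helper names are hypothetical, and any other argument a step needs must be taken from the tool's own input schema, which is not reproduced here.

```python
from itertools import count
from typing import Any, Optional

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def tool_call(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Build an MCP `tools/call` JSON-RPC request for the given tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def research_step(
    step_number: int,
    session_id: Optional[str] = None,
    last: bool = False,
    **fields: Any,
) -> dict[str, Any]:
    """Build one sequential_search request.

    `fields` carries whatever step content the tool's input schema defines;
    no field names are assumed here.
    """
    if step_number > 1 and session_id is None:
        # Follow-up steps must reuse the sessionId returned by step 1.
        raise ValueError("follow-up steps need the sessionId from step 1")
    arguments: dict[str, Any] = {
        "stepNumber": step_number,
        "nextStepNeeded": not last,  # False marks the session complete
        **fields,
    }
    if session_id is not None:
        arguments["sessionId"] = session_id
    return tool_call("sequential_search", arguments)
```

A session would start with `research_step(1)`, continue with `research_step(2, session_id=...)` using the id from the first response, and end with `last=True` on the final step. If the context is lost in between, get_research_session can be called through the same `tool_call` helper.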
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| competitive-analysis | Research competitors in a given market |
| comprehensive-research | Guide an AI assistant through a multi-step research process |
| fact-check | Verify a claim using multiple independent sources |
| literature-review | Systematic review of academic literature on a topic |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| Configured Providers | Search, patent, and academic providers currently configured and available for use |
| Rate Limit Status | How many requests you can make and how many you have left today. Only applies when connecting over the network (not in local mode). |
| Active Sessions | Count of active research sessions |
| Tool Statistics | Usage stats for each tool — how many times it's been called, how fast it responded, and how often errors occurred |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/zoharbabin/web-researcher-mcp'
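The same request can be made from Python with only the standard library. This is a minimal sketch: the function names are hypothetical, and decoding the body as JSON is an assumption, since the response format is not described here.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_request(owner: str, name: str) -> urllib.request.Request:
    """Build the GET request for one server's directory entry."""
    url = f"{API_BASE}/{quote(owner, safe='')}/{quote(name, safe='')}"
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server(owner: str, name: str, timeout: float = 10.0):
    """Send the request and decode the body as JSON (assumed format)."""
    with urllib.request.urlopen(server_request(owner, name), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_server("zoharbabin", "web-researcher-mcp")` performs the same GET as the curl command; `server_request` is kept separate so the URL can be inspected without touching the network.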