rsi-ai-platform

rsi-search-pro-mcp

Official

Server Configuration

Describes the environment variables the server reads. All of them are optional.

Name                      Required  Description                                                         Default
PORT                      No        Port for HTTP transports                                            7863
MCP_HOST                  No        Bind host for HTTP transports                                       0.0.0.0
ALLOWED_HOSTS             No        Comma-separated list of allowed hosts for DNS rebinding protection
MCP_TRANSPORT             No        Transport type: stdio, sse, or streamable-http                      stdio
FORWARDED_ALLOW_IPS       No        Trust proxy headers from these IPs                                  *
BROWSER_RESEARCH_URL      No        Override the upstream browser-research-mcp URL
AUTHORITY_WEB_SEARCH_URL  No        Override the upstream authority-web-search-mcp URL
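
As an illustration of how these variables and their defaults fit together, here is a minimal sketch of a config loader. The `ServerConfig` class and `load_config` function are hypothetical names, not part of the server; only the variable names, allowed transport values, and defaults come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

VALID_TRANSPORTS = {"stdio", "sse", "streamable-http"}


@dataclass(frozen=True)
class ServerConfig:  # hypothetical container, for illustration only
    port: int
    host: str
    allowed_hosts: tuple
    transport: str
    forwarded_allow_ips: str
    browser_research_url: Optional[str]
    authority_web_search_url: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read the documented variables, falling back to the documented defaults."""
    transport = env.get("MCP_TRANSPORT", "stdio")
    if transport not in VALID_TRANSPORTS:
        raise ValueError(f"unsupported MCP_TRANSPORT: {transport!r}")
    # ALLOWED_HOSTS is comma-separated; blank entries are dropped.
    hosts = tuple(
        h.strip() for h in env.get("ALLOWED_HOSTS", "").split(",") if h.strip()
    )
    return ServerConfig(
        port=int(env.get("PORT", "7863")),
        host=env.get("MCP_HOST", "0.0.0.0"),
        allowed_hosts=hosts,
        transport=transport,
        forwarded_allow_ips=env.get("FORWARDED_ALLOW_IPS", "*"),
        browser_research_url=env.get("BROWSER_RESEARCH_URL") or None,
        authority_web_search_url=env.get("AUTHORITY_WEB_SEARCH_URL") or None,
    )
```

Passing a plain dict instead of `os.environ` makes the defaults easy to inspect without touching the real environment.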

Capabilities

Features and capabilities supported by this server

Capability    Details
tools         {"listChanged": false}
prompts       {"listChanged": false}
resources     {"subscribe": false, "listChanged": false}
experimental  {}

Tools

Functions exposed to the LLM to take actions

Name    Description
today

Return the SERVER'S CURRENT DATE (UTC). Call this FIRST whenever the user mentions a temporal phrase like "latest", "current", "today", "yesterday", "this quarter", "this year" — your training-data cutoff is NOT a reliable anchor for what 'today' actually is. Use the returned iso_date (YYYY-MM-DD) and year to construct concrete queries.

Returns:
    {iso_date, iso_datetime, year, month, day, weekday, quarter,
     fiscal_year_in: "FY26", note}
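
A client usually reaches a tool such as `today` through the MCP `tools/call` JSON-RPC method. The sketch below builds that request and then uses a returned payload to replace relative phrases in a query. `build_tool_call` and `anchor_query` are hypothetical helper names, and the sample date in the usage note is made up for illustration.

```python
import re
from typing import Any, Dict, Optional


def build_tool_call(
    name: str, arguments: Optional[Dict[str, Any]] = None, request_id: int = 1
) -> Dict[str, Any]:
    """Build an MCP `tools/call` JSON-RPC 2.0 request body."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


def anchor_query(query: str, today: Dict[str, Any]) -> str:
    """Swap relative phrases for concrete values from a `today` result.

    Only "this year" and "today" are handled here; a real client would
    likely cover more phrases (an assumption, not documented behaviour).
    """
    query = re.sub(r"\bthis year\b", str(today["year"]), query, flags=re.IGNORECASE)
    query = re.sub(r"\btoday\b", today["iso_date"], query, flags=re.IGNORECASE)
    return query
```

For example, with a hypothetical result `{"iso_date": "2026-05-14", "year": 2026}`, the query "RBI repo rate this year" becomes "RBI repo rate 2026".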
pick_authority_domains

Decide which AUTHORITY domains a query should be restricted to.

Call this BEFORE any web search when the query has an authoritative
answer — official regulators, statistical agencies, market exchanges,
industry SROs. Returns the ranked list keyed by `primary` and `secondary`.

Args:
    query: The user question (free text).
    indicators: Optional indicator hints (e.g. ["repo_rate", "cpi_inflation"]).
    jurisdiction: ISO code: "IN", "US", "UK", "EU". Drives jurisdiction defaults.
    topic_hint: One of "regulator", "market", "news", "company_ir",
                "statistics", "academic", "any".
    additional_authority_sources: If you've already resolved authority
        sources for the indicators (e.g. ["RBI", "MOSPI"]), pass them
        here — they will be expanded to domains via the AUTHORITY_DOMAINS
        registry and used as the primary set.

Returns:
    {primary, secondary, primary_sources, secondary_sources,
     hints, landing_pages, rationale, query, current_date}

    `hints` (when non-empty) are DOMAIN-SPECIFIC INSTRUCTIONS you MUST
    follow for this query — e.g. "for GST collections, prefer the Excel
    files at gst.gov.in/download/gststatistics over PDF press releases".
    `landing_pages` are URLs you should fetch DIRECTLY (web_fetch_structured)
    before broadening the search.
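
The ordering described above (follow `hints`, fetch `landing_pages` directly, then search the primary domains) can be sketched as a small planner. `plan_next_calls` is a hypothetical helper, and the sketch assumes `landing_pages`, `primary`, and `hints` are plain lists of strings, which the description does not confirm.

```python
from typing import Any, Dict, List


def plan_next_calls(result: Dict[str, Any]) -> Dict[str, Any]:
    """Turn a pick_authority_domains result into an ordered list of tool calls."""
    query = result.get("query", "")
    calls: List[Dict[str, Any]] = []
    # Landing pages come first: fetch them directly before any search.
    for url in result.get("landing_pages", []):
        calls.append(
            {"tool": "web_fetch_structured", "arguments": {"url": url, "focus": query}}
        )
    # Then a search restricted to the primary authority domains.
    primary = result.get("primary", [])
    if primary:
        calls.append(
            {
                "tool": "web_search_authoritative",
                "arguments": {"query": query, "include_domains": primary},
            }
        )
    # Hints are returned separately so the caller can surface them as instructions.
    return {"hints": list(result.get("hints", [])), "calls": calls}
```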
web_search_authoritative

Three-pass authoritative web search.

Pass 1: primary authority domains (catalog rules + curated registry).
Pass 2: secondary authority + Tier-1 business press.
Pass 3: open web (toggle via `allow_open_web_fallback`).

Prefer this over plain web search when the query has an authoritative
answer — automates the include-domains discipline and the fallback ladder.

Args:
    query: Search query (free text).
    indicators: Indicator hints; piped to pick_authority_domains.
    jurisdiction: ISO code: "IN", "US", "UK", "EU".
    topic_hint: As in pick_authority_domains.
    include_domains: Manual override; skips pick_authority_domains.
    additional_authority_sources: See pick_authority_domains.
    max_results: Per-pass cap (default 6).
    topic: "general" or "news". When "news", set days for recency window.
    days: Days back for news topic (e.g. 7 = last week).
    allow_open_web_fallback: If False, refuses pass 3.

Returns:
    Tavily result shape plus `pass`, `domains_used`, `authority_score`
    (1.0 primary, 0.6 secondary, 0.3 open web), and `rationale`.
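
The fallback ladder can be pictured roughly as follows. This is a sketch under assumptions, not the server's code: it assumes the search stops at the first pass that returns any results, and the `search` callable is a hypothetical stand-in for the underlying search backend. The authority scores are the ones listed above.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence

# (query, domains or None for open web, max_results) -> list of hits
SearchFn = Callable[[str, Optional[List[str]], int], List[dict]]

AUTHORITY_SCORE = {1: 1.0, 2: 0.6, 3: 0.3}


def three_pass_search(
    query: str,
    primary: Sequence[str],
    secondary: Sequence[str],
    search: SearchFn,
    max_results: int = 6,
    allow_open_web_fallback: bool = True,
) -> Dict[str, Any]:
    """Try primary domains, then secondary, then (optionally) the open web."""
    ladder: List[tuple] = [(1, list(primary)), (2, list(secondary))]
    if allow_open_web_fallback:
        ladder.append((3, None))  # None means no domain restriction
    for pass_no, domains in ladder:
        if domains == []:
            continue  # nothing to restrict to on this rung; skip it
        hits = search(query, domains, max_results)[:max_results]
        if hits:
            return {
                "pass": pass_no,
                "domains_used": domains or [],
                "authority_score": AUTHORITY_SCORE[pass_no],
                "results": hits,
            }
    return {"pass": None, "domains_used": [], "authority_score": 0.0, "results": []}
```

Injecting `search` keeps the ladder logic testable without network access.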
web_fetch_structured

Fetch a URL and extract STRUCTURED data via a focused LLM pass.

Better than plain fetch when you need SPECIFIC numbers from a long
press release / annual report / regulatory document. The extraction is
LLM-mediated, so it understands context and is meant to report only
values that actually appear on the page.

Args:
    url: The page URL.
    focus: What to extract, e.g. "CPI YoY April 2025, food inflation,
           core CPI". The LLM uses this to bias its extraction.

Returns:
    {title, dateline, summary, key_facts[], numeric_values[],
     dates[], tables_summary[]}

Requires ANTHROPIC_API_KEY env var. Without it, returns raw text only.
web_compare_across_sources

Issue THE SAME query across N authority domains in one call. Returns a per-domain top hit plus an agreement matrix.

Use to cross-validate a load-bearing number (e.g. "India CPI April 2025")
across MoSPI / RBI / IMF / press without writing N separate web_search
calls.

Args:
    claim_or_query: The claim or query to test.
    domains: Authority domains to compare (2-6 sweet spot).
    max_results_per_domain: 1-5.

Returns:
    {per_domain[], domains_covered, domains_total,
     agreement: "agree"|"partial"|"conflict"|"insufficient", summary}
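
How the `agreement` label is computed is not documented. One plausible reading, shown purely as an assumption, is a tolerance check over the numeric value found on each domain. `classify_agreement` and its `rel_tol` parameter are hypothetical.

```python
from statistics import median
from typing import Dict, Optional


def classify_agreement(values: Dict[str, Optional[float]], rel_tol: float = 0.01) -> str:
    """Label per-domain values as agree / partial / conflict / insufficient.

    `values` maps a domain to the number found there, or None if nothing
    was found. The thresholds are illustrative, not the server's rules.
    """
    found = [v for v in values.values() if v is not None]
    if len(found) < 2:
        return "insufficient"

    def close(a: float, b: float) -> bool:
        return abs(a - b) <= rel_tol * max(abs(a), abs(b), 1e-12)

    # If the extremes are within tolerance, every pair is.
    if close(min(found), max(found)):
        return "agree"
    # Otherwise check whether a strict majority clusters around the median.
    mid = median(found)
    agreeing = sum(1 for v in found if close(v, mid))
    return "partial" if agreeing > len(found) / 2 else "conflict"
```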
web_sitemap_walk

Locate the canonical landing page for a topic on an authority domain via /sitemap.xml or /robots.txt Sitemap: declarations. Falls back to Tavily site-restricted search if the domain doesn't expose a sitemap.

Args:
    domain: e.g. "rbi.org.in".
    topic: The topic to score sitemap entries against, e.g. "press releases".
    max_candidates: 1-20.

Returns:
    {domain, sitemap_urls[], candidates[], method, notes[]}
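
The sitemap lookup can be sketched with the standard library alone: read `Sitemap:` lines from robots.txt, then score `<loc>` entries against the topic. The keyword-overlap scoring here is an assumption chosen for illustration; the tool's real scoring is not described. Both function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemaps_from_robots(robots_txt: str) -> List[str]:
    """Collect the URLs declared on `Sitemap:` lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def rank_sitemap_entries(sitemap_xml: str, topic: str, max_candidates: int = 5) -> List[str]:
    """Rank sitemap <loc> URLs by how many topic words appear in them."""
    root = ET.fromstring(sitemap_xml.strip())
    terms = topic.lower().split()
    scored = []
    for loc in root.iter(SITEMAP_NS + "loc"):
        url = (loc.text or "").strip()
        score = sum(1 for term in terms if term in url.lower())
        if score:
            scored.append((score, url))
    # Highest score first; shorter URLs win ties as likelier landing pages.
    scored.sort(key=lambda pair: (-pair[0], len(pair[1])))
    return [url for _, url in scored[:max_candidates]]
```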
web_search

Plain Tavily search — escape hatch for free-form exploration.

Prefer `web_search_authoritative` when the query has an authoritative
answer.
web_fetch

Plain Tavily extract — returns clean text from a URL.

Prefer `web_fetch_structured` when you need typed key_facts /
numeric_values rather than prose. For PDFs, use `pdf_fetch` instead —
Tavily's Extract often returns "binary / not extractable" for them.
pdf_discover

List every PDF link on an HTML landing page, with its anchor text.

Use this on HUB pages — PPAC consumption / production / imports, RBI
bulletin month index, MoSPI press-release listings, MoRTH notification
indexes, MCA filing pages — where the actual data lives in attached
PDFs and the page often has Year/Month/Product dropdowns that are
really just client-side filters over the same anchor set. Returns
each PDF's absolute URL and the human-readable anchor text so you can
pick by name (e.g. "Domestic Consumption of Petroleum Products-2026-27",
"Flash Report May 26").
Workflow: pdf_discover → pick by anchor text → pdf_fetch_structured.

Args:
    url: The HTML landing page URL.
    link_text_filter: Optional case-insensitive substring; only anchors
        whose text contains it are returned. E.g. "2026-27", "Flash".
    max_links: Cap on links returned (default 40).

Returns:
    {url, domain, pdfs: [{href, text, label_hint}], count,
     page_title, fetched_at}
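
To make the discovery step concrete, here is a minimal standard-library sketch that lists PDF anchors with their text and applies `link_text_filter` and `max_links` as described. It is an illustration only; `discover_pdfs` is a hypothetical name, and the real tool also returns fields such as `label_hint` that are not reproduced here.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional
from urllib.parse import urljoin, urlparse


class _PdfLinkParser(HTMLParser):
    """Collect <a> elements whose href path ends in .pdf, with anchor text."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[Dict[str, str]] = []
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and urlparse(href).path.lower().endswith(".pdf"):
                self._href = urljoin(self.base_url, href)  # make absolute
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())  # collapse whitespace
            self.links.append({"href": self._href, "text": text})
            self._href = None


def discover_pdfs(
    html: str, base_url: str, link_text_filter: Optional[str] = None, max_links: int = 40
) -> List[Dict[str, str]]:
    parser = _PdfLinkParser(base_url)
    parser.feed(html)
    links = parser.links
    if link_text_filter:
        needle = link_text_filter.lower()  # case-insensitive substring match
        links = [link for link in links if needle in link["text"].lower()]
    return links[:max_links]
```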
http_post_form

POST a form (application/x-www-form-urlencoded) and return the JSON.

The escape hatch for Year/Month/Product dropdowns on government dashboards
that don't change the page URL — the dropdown triggers an AJAX POST and
only renders the result client-side, so pdf_discover and web_fetch can't
see it. Use this when a landing page's dropdown isn't a `<select>` whose
value becomes a query param.

Example — PPAC prior-year (FY2025-26) petroleum consumption:
  url = "https://ppac.gov.in/AjaxController/getConsumptionPetroleumProductsChartData"
  form = {"financialYear": "2025-2026", "reportBy": "1", "pageId": "43"}
  referer = "https://ppac.gov.in/consumption/products-wise"
Returns the full FY2025-26 monthly JSON (April 2025 → March 2026).

Args:
    url: The POST endpoint (usually `/AjaxController/...` on gov sites).
    form: Form fields to submit.
    referer: Optional Referer header — many gov AJAX endpoints reject
             requests without one.
    parse: "json" / "text" / "auto" (default — try JSON, fall back to text).

Returns:
    {url, status, content_type, json (when parseable), text, fetched_at}.
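
For readers who want to see what such a request looks like on the wire, the sketch below builds (but does not send) the form POST with Python's `urllib`, and mirrors the `parse` modes. `build_form_post` and `parse_body` are hypothetical helpers; how the tool itself issues the request is not documented.

```python
import json
from typing import Any, Dict, Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_form_post(url: str, form: Dict[str, str], referer: Optional[str] = None) -> Request:
    """Prepare an application/x-www-form-urlencoded POST, optionally with a Referer."""
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    if referer:
        headers["Referer"] = referer
    body = urlencode(form).encode("ascii")
    return Request(url, data=body, headers=headers, method="POST")


def parse_body(text: str, parse: str = "auto") -> Dict[str, Any]:
    """Mirror the parse modes: "json", "text", or "auto" (JSON, else text)."""
    if parse == "text":
        return {"text": text}
    try:
        return {"json": json.loads(text), "text": text}
    except json.JSONDecodeError:
        if parse == "json":
            raise
        return {"text": text}
```

The prepared `Request` could be passed to `urllib.request.urlopen` to actually send it; that step is left out so the sketch stays offline.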
pdf_fetch

Download a PDF directly and extract its text with pypdf.

Use this WHENEVER a `web_fetch` or `web_fetch_structured` call comes
back saying the content was "binary" or "not extractable" — that's
almost always a Tavily limitation on PDFs that are actually text-based
and perfectly extractable with a proper PDF library. Common cases:
PPAC monthly reports, RBI bulletins, MoSPI press release PDFs, PIB
statements, regulator circulars.

Args:
    url: The PDF URL (.pdf in path, or a server that returns
         Content-Type: application/pdf).
    pages: Optional 1-indexed list of pages to extract (e.g. [1, 2, 5]).
           If omitted, the first `max_pages` are extracted.
    max_pages: Cap on auto-extracted pages when `pages` is omitted.

Returns:
    {url, domain, content, fetched_at, page_count, pages_extracted,
     content_truncated, kind: "pdf"}.
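
The interaction between `pages` and `max_pages` can be sketched as a small selection function. `select_pages` is a hypothetical name, the default of 10 for `max_pages` is an assumption (the description gives no default), and rejecting out-of-range pages with an error is one possible policy, not confirmed behaviour.

```python
from typing import List, Optional, Sequence


def select_pages(
    page_count: int, pages: Optional[Sequence[int]] = None, max_pages: int = 10
) -> List[int]:
    """Return 0-based page indices to extract from a PDF with `page_count` pages.

    `pages` is 1-indexed, matching the tool's argument. When it is omitted,
    the first `max_pages` pages are selected.
    """
    if pages is None:
        return list(range(min(max_pages, page_count)))
    indices = []
    for page in pages:
        if not 1 <= page <= page_count:
            raise ValueError(f"page {page} is outside 1..{page_count}")
        indices.append(page - 1)  # convert to the 0-based index PDF libraries use
    return indices
```

With pypdf, each returned index `i` would typically be read via `reader.pages[i].extract_text()`.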
pdf_fetch_structured

Direct PDF download + pypdf extraction → focused LLM pass → structured JSON.

Same returned shape as `web_fetch_structured` (title, dateline,
key_facts[], numeric_values[], dates[], tables_summary[]) but goes
through the PDF path. Use when you have a PDF URL and want the values
extracted into a structured shape rather than just raw text.
visit

Open a URL with a real Chromium and return its rendered state.

Use when the cheaper fetch tools (web_fetch, pdf_fetch, http_post_form)
fail because the page is a SPA, JS-rendered chart, login-walled, or has
a dropdown that's not a separate URL.

Args:
    url: The page URL.
    wait_for_selector: Optional CSS selector to await before reading the
        DOM. Use when data appears only after an AJAX call returns —
        e.g. ".chart svg", "table#monthly tbody tr".
    wait_extra_ms: Extra settle time after the wait fires (default 1500).
    timeout_ms: Hard navigation timeout in milliseconds (default 45000, i.e. 45 s).
    screenshot: Whether to capture a PNG INTERNALLY (default True). Adds
        ~200ms; the bytes are used by extract()/act() for Sonnet vision.
    full_page_screenshot: Scroll-stitch the whole page (default False).
    text_cap: Cap on extracted text length (default 30000).
    return_screenshot_b64: Whether to ECHO the base64 PNG back in the
        response. DEFAULT False — typical screenshots are 700KB-1MB and
        accumulating them across an agent's tool-call history blows the
        1M-token context window in ~3 calls. Only opt in when the caller
        actually consumes the bytes (e.g. a browser-canvas UI).

Returns:
    {url, title, domain, text, screenshot_bytes, screenshot_b64 (opt-in),
     fetched_at, current_date}
act

Drive a real Chromium through a sequence of steps, then run Sonnet structured extraction on the final state.

Use this when the data is BEHIND an interaction — a Year/Month dropdown
that fires AJAX inline, a tab to click, a "Load more" button, a form
to submit. `visit` and `extract` only read the page as it loaded;
`act` clicks/types/selects first.

Steps are a list of single-key dicts:
    {"click":  "css-selector"}
    {"fill":   {"selector": "#q", "value": "x"}}
    {"select": {"selector": "#year", "value": "2024-2025"}}
    {"press":  {"selector": "#q", "key": "Enter"}}
    {"scroll": {"to": "bottom"|"top"|<int px>}}
    {"wait_for_selector": "css-selector"}
    {"wait_for_load_state": "networkidle"|"load"}
    {"wait_ms": 1500}
    {"goto":   "https://…"}     // mid-flow navigation
    {"screenshot": {"name": "after-select"}}    // logged, not returned

Example — pull PPAC FY2024-25 monthly consumption (a flow that needs
the year dropdown change to fire an AJAX request):
    act(
      url="https://ppac.gov.in/consumption/products-wise",
      steps=[
        {"wait_for_selector": "#financialYear"},
        {"select": {"selector": "#financialYear", "value": "2024-2025"}},
        {"wait_for_load_state": "networkidle"},
        {"wait_ms": 2000},
      ],
      focus="FY2024-25 monthly LPG, MS, HSD, ATF consumption",
    )

Returns the same shape as `extract` PLUS `step_results` (per-step
timing + ok/error) and `final_url`.

Args:
    url: Starting page URL.
    steps: Ordered list of action dicts (vocabulary above).
    focus: Extraction focus passed to Sonnet.
    timeout_ms: Per-step navigation / wait timeout.
    full_page_screenshot: Whether the final screenshot is full-page.

Returns:
    {url, domain, title, dateline, summary, key_facts[],
     numeric_values[], dates[], tables_summary[], step_results[],
     final_url, kind: "browser"}.
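
Because each step must be a single-key dict drawn from the vocabulary above, a client may want to check a step list before sending it. The sketch below is one way to do that; `validate_steps` is a hypothetical helper and checks only the shape and the verb, not the step's value.

```python
from typing import Any, List

STEP_VERBS = {
    "click", "fill", "select", "press", "scroll",
    "wait_for_selector", "wait_for_load_state", "wait_ms",
    "goto", "screenshot",
}


def validate_steps(steps: List[Any]) -> List[str]:
    """Return a list of problems found in `steps` (empty when all look valid)."""
    errors = []
    for index, step in enumerate(steps):
        if not isinstance(step, dict) or len(step) != 1:
            errors.append(f"step {index}: expected a single-key dict")
            continue
        (verb,) = step  # the only key in the dict
        if verb not in STEP_VERBS:
            errors.append(f"step {index}: unknown verb {verb!r}")
    return errors
```

Run against the PPAC example steps above, this returns an empty list.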
extract

Visit a URL → focused Sonnet structured extraction.

Sends BOTH rendered text AND a screenshot to Sonnet — so numbers drawn
via canvas / SVG (chart values on PPAC, RBI, NSE dashboards) that don't
appear in the DOM still get extracted. Same returned shape as
pdf_fetch_structured / web_fetch_structured on authority-web-search-mcp.

Args:
    url: The page URL.
    focus: What to extract, e.g. "monthly LPG, MS, HSD consumption for
           FY2024-25" or "Q4 FY26 EBITDA margin and revenue".
    wait_for_selector: Optional CSS selector to await (see visit).
    full_page_screenshot: Default True so charts below the fold are seen.

Returns:
    {url, domain, title, dateline, summary, key_facts[], numeric_values[],
     dates[], tables_summary[], kind: "browser"}.

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/rsi-ai-platform/rsi-search-pro-mcp'