ARSR MCP Server

Adaptive Retrieval-Augmented Self-Refinement — a closed-loop MCP server that lets LLMs iteratively verify and correct their own claims using uncertainty-guided retrieval.

What it does

Unlike one-shot RAG (retrieve → generate), ARSR runs a refinement loop:

Generate draft → Decompose claims → Score uncertainty
       ↑                                    ↓
   Decide stop ← Revise with evidence ← Retrieve for low-confidence claims

The key insight: retrieval is guided by uncertainty. Only claims the model is unsure about trigger evidence fetching, and the queries are adversarial — designed to disprove the claim, not just confirm it.
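As a concrete sketch of that insight, the hypothetical helper below selects only uncertain claims and phrases their search queries to look for disconfirming evidence. The field names (`text`, `confidence`) and the 0.85 threshold follow the usage example later in this README; the query phrasing itself is an illustration, not the server's actual logic:

```javascript
// Hypothetical sketch of uncertainty-guided, adversarial query selection.
function adversarialQueries(scoredClaims, threshold = 0.85) {
  return scoredClaims
    .filter((c) => c.confidence < threshold) // only uncertain claims retrieve
    .map((c) => ({
      claim: c.text,
      // phrased to surface disconfirming evidence, not confirmation
      query: `evidence that "${c.text}" is false or disputed`,
    }));
}

adversarialQueries([
  { text: "Tesla was founded in 2003", confidence: 0.92 },
  { text: "Tesla's first car shipped in 2006", confidence: 0.55 },
]);
// → only the low-confidence 2006 claim produces a search query
```

Confident claims never hit the web, which is what keeps the loop's cost proportional to how much the model doubts itself.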

Architecture

The server exposes 6 MCP tools. The outer LLM (Claude, GPT, etc.) orchestrates the loop by calling them in sequence:

| # | Tool | Purpose |
|---|------|---------|
| 1 | `arsr_draft_response` | Generate initial candidate answer (returns `is_refusal` flag) |
| 2 | `arsr_decompose_claims` | Split into atomic, verifiable claims |
| 3 | `arsr_score_uncertainty` | Estimate confidence via semantic entropy |
| 4 | `arsr_retrieve_evidence` | Web search for low-confidence claims |
| 5 | `arsr_revise_response` | Rewrite draft with evidence |
| 6 | `arsr_should_continue` | Decide: iterate or finalize |
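Tool 3's semantic-entropy idea can be sketched as follows. This is illustrative only — the server's real sampling and equivalence checks live in the inner LLM; the function below just shows the arithmetic: sample several answers, cluster meaning-equivalent ones, and take the entropy of the cluster distribution.

```javascript
// Illustrative semantic-entropy scorer over sampled answers.
function semanticEntropy(samples, areEquivalent) {
  const clusters = [];
  for (const s of samples) {
    const match = clusters.find((cluster) => areEquivalent(cluster[0], s));
    if (match) match.push(s);
    else clusters.push([s]);
  }
  let entropy = 0;
  for (const cluster of clusters) {
    const p = cluster.length / samples.length;
    entropy -= p * Math.log2(p);
  }
  return entropy; // 0 means all samples agree → high confidence
}

// All three rephrasings agree → entropy 0 → confident claim
semanticEntropy(["2003", "2003", "2003"], (a, b) => a === b); // → 0
```

Low entropy (every rephrasing lands in one cluster) maps to high confidence, so the claim skips retrieval; high entropy flags it for evidence fetching.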

Inner LLM: Tools 1-5 use Claude Haiku internally for the intelligent sub-tasks (query generation, claim extraction, evidence evaluation). This keeps costs low while the outer model handles orchestration.

Refusal detection: arsr_draft_response returns a structured is_refusal flag (classified by the inner LLM) indicating whether the draft is a non-answer. When is_refusal is true, downstream tools (decompose, revise) pivot to extracting claims from the original query and building an answer from retrieved evidence instead of trying to refine a refusal.
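The pivot can be pictured with a small hypothetical helper; the `source` field here is illustrative and not part of the actual tool schema:

```javascript
// Hypothetical illustration of the refusal pivot described above: when the
// draft is a refusal, claim extraction targets the original query instead
// of the (empty) draft, so the loop builds an answer from evidence.
function decompositionTarget(draft, originalQuery) {
  return draft.is_refusal
    ? { source: "query", text: originalQuery } // build an answer from evidence
    : { source: "draft", text: draft.draft };  // refine the existing draft
}
```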

Web Search: arsr_retrieve_evidence uses the Anthropic API's built-in web search tool — no external search API keys needed.

Setup

Prerequisites

  • Node.js 18+

  • An Anthropic API key

Install & Build

```bash
cd arsr-mcp-server
npm install
npm run build
```

Environment

```bash
export ANTHROPIC_API_KEY="sk-ant-..."
```

Run

stdio mode (for Claude Desktop, Cursor, etc.):

```bash
npm start
```

HTTP mode (for remote access):

```bash
TRANSPORT=http PORT=3001 npm start
```

Claude Desktop Configuration

Add to your claude_desktop_config.json:

Npm:

```json
{
  "mcpServers": {
    "arsr": {
      "command": "npx",
      "args": ["@jayarrowz/mcp-arsr"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "ARSR_MAX_ITERATIONS": "3",
        "ARSR_ENTROPY_SAMPLES": "3",
        "ARSR_RETRIEVAL_STRATEGY": "adversarial",
        "ARSR_INNER_MODEL": "claude-haiku-4-5-20251001"
      }
    }
  }
}
```

Local build:

```json
{
  "mcpServers": {
    "arsr": {
      "command": "node",
      "args": ["/path/to/arsr-mcp-server/dist/src/index.js"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "ARSR_MAX_ITERATIONS": "3",
        "ARSR_ENTROPY_SAMPLES": "3",
        "ARSR_RETRIEVAL_STRATEGY": "adversarial",
        "ARSR_INNER_MODEL": "claude-haiku-4-5-20251001"
      }
    }
  }
}
```

How the outer LLM uses it

The orchestrating LLM calls the tools in sequence:

```javascript
// 1. Draft an answer (draft.is_refusal flags a non-answer)
draft = arsr_draft_response({ query: "When was Tesla founded?" })

// 2. Decompose into atomic claims
claims = arsr_decompose_claims({ draft: draft.draft, original_query: "When was Tesla founded?", is_refusal: draft.is_refusal })

// 3. Score each claim's confidence
scored = arsr_score_uncertainty({ claims: claims.claims })

// 4. Keep only low-confidence claims
low = scored.scored.filter(c => c.confidence < 0.85)

// 5. Retrieve evidence for them
evidence = arsr_retrieve_evidence({ claims_to_check: low })

// 6. Rewrite the draft with the evidence
revised = arsr_revise_response({ draft: draft.draft, evidence: evidence.evidence, scored: scored.scored, original_query: "When was Tesla founded?", is_refusal: draft.is_refusal })

// 7. Decide (revised_scores comes from re-running steps 2-3 on the revised text)
decision = arsr_should_continue({ iteration: 1, scored: revised_scores })
// "continue" → go back to step 2 with the revised text
// "stop"     → return revised.revised to the user
```
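The sequence above can be wired into a single driver. The sketch below is runnable against a synchronous stub for the tool caller; the response shapes follow the example above, while `decision.action` and the re-drafting step are assumptions about the server's schemas:

```javascript
// Runnable sketch of the orchestration loop with a pluggable tool caller.
function refine(callTool, query, { maxIterations = 3, threshold = 0.85 } = {}) {
  let draft = callTool("arsr_draft_response", { query });
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const claims = callTool("arsr_decompose_claims", {
      draft: draft.draft, original_query: query, is_refusal: draft.is_refusal,
    });
    const scored = callTool("arsr_score_uncertainty", { claims: claims.claims });
    const low = scored.scored.filter((c) => c.confidence < threshold);
    const evidence = callTool("arsr_retrieve_evidence", { claims_to_check: low });
    const revised = callTool("arsr_revise_response", {
      draft: draft.draft, evidence: evidence.evidence, scored: scored.scored,
      original_query: query, is_refusal: draft.is_refusal,
    });
    // A production loop would re-score the revised claims before deciding.
    const decision = callTool("arsr_should_continue", { iteration, scored: scored.scored });
    if (decision.action === "stop" || iteration === maxIterations) {
      return revised.revised;
    }
    draft = { draft: revised.revised, is_refusal: false }; // iterate on revised text
  }
}
```

In practice `callTool` would be an MCP client invocation (and therefore async); the stub form just makes the control flow easy to see and test.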

Configuration

All settings can be overridden via environment variables, falling back to defaults if unset:

| Setting | Env var | Default | Description |
|---------|---------|---------|-------------|
| `max_iterations` | `ARSR_MAX_ITERATIONS` | `3` | Budget limit for refinement loops |
| `confidence_threshold` | `ARSR_CONFIDENCE_THRESHOLD` | `0.85` | Claims above this skip retrieval |
| `entropy_samples` | `ARSR_ENTROPY_SAMPLES` | `3` | Rephrasings sampled for semantic entropy |
| `retrieval_strategy` | `ARSR_RETRIEVAL_STRATEGY` | `adversarial` | `adversarial`, `confirmatory`, or `balanced` |
| `inner_model` | `ARSR_INNER_MODEL` | `claude-haiku-4-5-20251001` | Model for internal intelligence |
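A loader for these overrides might look like the sketch below (variable names and defaults are taken from the table; the server's actual parsing and validation may differ):

```javascript
// Sketch: resolve ARSR settings from the environment, falling back to the
// documented defaults when a variable is unset.
function loadConfig(env = process.env) {
  return {
    max_iterations: parseInt(env.ARSR_MAX_ITERATIONS ?? "3", 10),
    confidence_threshold: parseFloat(env.ARSR_CONFIDENCE_THRESHOLD ?? "0.85"),
    entropy_samples: parseInt(env.ARSR_ENTROPY_SAMPLES ?? "3", 10),
    retrieval_strategy: env.ARSR_RETRIEVAL_STRATEGY ?? "adversarial",
    inner_model: env.ARSR_INNER_MODEL ?? "claude-haiku-4-5-20251001",
  };
}
```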

Cost estimate

Per refinement loop iteration (assuming ~5 claims, 3 low-confidence):

  • Inner LLM calls: ~6-10 Haiku calls ≈ $0.002-0.005

  • Web searches: 6-9 queries, billed through the Anthropic API (no separate search provider)

  • Typical total for 2 iterations: < $0.02

Images

Before/after comparison screenshots (not reproduced in this text version).

License

MIT

