com.luneresearch/lune

Gather evidence and judge sufficiency

gather_evidence

Read-only

Checks gathered evidence sufficiency for a research task, identifies missing evidence, and suggests next search queries using verbatim quotes from paper searches.

Instructions

Use for a multi-part research task when you need to know whether your gathered evidence is SUFFICIENT, what is still MISSING, and what to search next, without the tool writing the answer. Pass the goal in task and your first search angles in queries; the server runs one corpus search per angle, decomposes the task into evidence requirements (or use your own via requirements), and returns each requirement as covered / partial / missing with the exact evidence_spans (verbatim quotes) that support it, plus next_queries for the gaps. Default max_iterations=1 is a one-shot assessment billed len(queries); set max_iterations>1 AND max_total_queries>len(queries) to authorize bounded server-side follow-up searches (billed max_total_queries, capped at 25). Optionally pass a draft to get per-sentence support checks against the gathered spans. Every covered requirement and supported draft sentence carries a verbatim quote verified server-side, so you can cite it directly. You write the answer; cite papers by title, authors, and venue, not by paper_id.
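
The billing rule above can be sketched as a small client-side estimate. This is only an illustration, not the server's code: the function name `estimate_units_charged` and the constant `BILLING_CAP` are hypothetical, and the sketch assumes that when follow-up searches are not authorized the charge falls back to len(queries). Whether the cap of 25 also limits a one-shot call is not stated in the description, so the sketch leaves that case uncapped.

```python
from typing import Optional, Sequence

BILLING_CAP = 25  # cap stated in the description for max_total_queries


def estimate_units_charged(
    queries: Sequence[str],
    max_iterations: int = 1,
    max_total_queries: Optional[int] = None,
) -> int:
    """Rough client-side estimate of the billed ceiling (hypothetical helper)."""
    n = len(queries)
    # Follow-up searches are authorized only when BOTH conditions hold.
    follow_ups_authorized = (
        max_iterations > 1
        and max_total_queries is not None
        and max_total_queries > n
    )
    if follow_ups_authorized:
        return min(max_total_queries, BILLING_CAP)
    # Assumption: otherwise the call is billed as a one-shot assessment.
    return n
```

Running an estimate like this before a call makes the cost of raising max_iterations visible up front.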

Input Schema


Name	Required	Description
`task`	Yes	The research goal in prose: what you are trying to establish. Drives requirement decomposition and the sufficiency judgment.
`year`	No
`draft`	No	Optional current draft. Each sentence is checked for support against the gathered spans (no extra searches). The tool never rewrites your draft.
`venues`	No	Restrict to these conference short names.
`queries`	Yes	Your initial search angles (full natural-language questions). One corpus search runs per angle; they are billed like search_papers_many.
`year_max`	No
`year_min`	No
`conference`	No	Filter to this conference short name, e.g. "NeurIPS".
`requirements`	No	Optional explicit evidence slots; omit to let the server derive them from `task`.
`max_iterations`	Yes	Sufficiency rounds. Default 1 is a one-shot advisor. Set >1 (with max_total_queries>len(queries)) to authorize bounded server-side follow-up searches.
`max_total_queries`	No	Total search budget across all iterations (the billed ceiling). Defaults to len(queries). Must exceed len(queries) only when max_iterations>1.
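
As a rough illustration of how the fields in this table fit together, the sketch below assembles an arguments object and checks the one cross-field constraint the table states. The helper `build_arguments` is hypothetical, the value types (strings, integers, lists) are assumptions because the table does not give them, and the check runs on the client only; how the server reacts to an invalid combination is not documented here.

```python
from typing import Any, Dict, List

# Field names copied from the input schema table.
KNOWN_FIELDS = {
    "task", "year", "draft", "venues", "queries", "year_max", "year_min",
    "conference", "requirements", "max_iterations", "max_total_queries",
}


def build_arguments(
    task: str, queries: List[str], max_iterations: int = 1, **optional: Any
) -> Dict[str, Any]:
    """Assemble a gather_evidence arguments dict (hypothetical helper)."""
    args: Dict[str, Any] = {
        "task": task,
        "queries": list(queries),
        "max_iterations": max_iterations,
    }
    # Drop optional fields that were left unset.
    args.update({k: v for k, v in optional.items() if v is not None})
    unknown = set(args) - KNOWN_FIELDS
    if unknown:
        raise ValueError(f"fields not in the schema table: {sorted(unknown)}")
    budget = args.get("max_total_queries")
    # Per the table: the budget must exceed len(queries) when max_iterations > 1.
    if max_iterations > 1 and (budget is None or budget <= len(queries)):
        raise ValueError("max_total_queries must exceed len(queries)")
    return args
```

Rejecting unknown field names early is a design choice of this sketch; it catches typos such as a misspelled filter before any billed search runs.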

Output Schema


Name	Required	Description
`queries_run`	Yes	Actual searches run (<= units_charged).
`stop_reason`	Yes
`next_queries`	Yes	Suggested follow-up search angles for partial / missing requirements.
`requirements`	Yes	One coverage row per requirement: covered / partial / missing.
`draft_support`	Yes	Per-sentence support for a supplied draft, or null when no draft was sent.
`units_charged`	Yes	Billed ceiling (max_total_queries, default len(queries), cap 25).
`evidence_spans`	Yes	The spans the judge evaluated; every supporting_span_id points here.
`iterations_run`	Yes
`queries_failed`	Yes	Per-query failures (non-CircuitBreaker); a systemic outage 503s instead.
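
One way to consume this output is to group requirements by coverage status and attach the quote each one points to. The sketch below assumes a result shape that the table does not spell out: the inner keys `id`, `quote`, `requirement`, `status`, and `supporting_span_ids` are hypothetical, as is the helper `summarize_coverage`. Only the top-level names `requirements` and `evidence_spans` come from the table.

```python
from typing import Any, Dict, List


def summarize_coverage(result: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group requirement labels by status, with the first supporting quote."""
    # Index spans by id so each supporting span id can be resolved.
    spans = {s["id"]: s["quote"] for s in result["evidence_spans"]}
    summary: Dict[str, List[str]] = {"covered": [], "partial": [], "missing": []}
    for row in result["requirements"]:
        ids = row.get("supporting_span_ids", [])
        quotes = [spans[i] for i in ids if i in spans]
        label = row["requirement"]
        if quotes:
            label += ' ("' + quotes[0] + '")'
        summary.setdefault(row["status"], []).append(label)
    return summary


# Placeholder data in the assumed shape, for illustration only.
example = {
    "evidence_spans": [{"id": "s1", "quote": "quote text A"}],
    "requirements": [
        {"requirement": "R1", "status": "covered", "supporting_span_ids": ["s1"]},
        {"requirement": "R2", "status": "missing", "supporting_span_ids": []},
    ],
}
coverage = summarize_coverage(example)
```

Anything that lands under "partial" or "missing" is a candidate for the angles listed in next_queries.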

Tool Definition Quality

A (4.6/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses behavior beyond the annotations: the tool runs one corpus search per angle, decomposes the task, returns coverage status, evidence spans, and next queries, and supports draft checks. It explains billing (len(queries) for one-shot, max_total_queries for iterative) and the related constraints. The readOnlyHint: true annotation is consistent with 'without the tool writing the answer'. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is somewhat lengthy but well-structured: opening with purpose, then parameter explanations, then additional notes on billing and output. It front-loads the core use. Some redundancy (billing mentioned twice) but generally efficient for the complexity (11 parameters). Score 4.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity, the description covers all necessary aspects: what it does, how it works, parameter usage, billing, optional features (requirements, draft), and what it returns (coverage status, evidence spans, next queries). The output schema exists, so return values need not be fully detailed, but the description still gives a good overview. Complete and actionable for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 73% schema coverage, the description adds value by explaining parameter semantics (e.g., task drives decomposition, queries each run a search, max_iterations default and behavior, draft usage). It provides context beyond schema descriptions, but the schema already includes basic descriptions. The description enhances understanding, justifying a 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: to assess sufficiency of gathered evidence for a multi-part research task, distinguishing it from siblings like search_papers or extract_from_papers. It specifies the action (gather evidence and judge sufficiency) and the resource (evidence), meeting the criteria for a 5.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit when-to-use guidance: use the tool when you need to know whether evidence is sufficient, what is missing, and what to search next. It makes clear that the tool does not write the answer; the agent does. It also describes the typical workflow and optional parameters. However, it could be more explicit about when not to use the tool (e.g., for a simple search). Clear but not exhaustive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/RetrogradeLabs/lune-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.