
discovery_estimate

Read-only

Estimate cost, time, and credit requirements before running a data analysis to determine if you have sufficient resources and whether a free public alternative exists.

Instructions

Estimate cost, time, and credit requirements before running an analysis.

Returns credit cost, estimated duration in seconds, whether you have
sufficient credits, and whether a free public alternative exists. Always call
this before discovery_analyze for private runs.

Args:
    file_size_mb: Size of the dataset in megabytes.
    num_columns: Number of columns in the dataset.
    num_rows: Number of rows (optional, improves time estimate).
    analysis_depth: Search depth (1=fast, higher=deeper). Default 1.
    visibility: "public" (free, results published) or "private" (costs credits).
    use_llms: Slower and more expensive, but you get smarter pre-processing, a summary page, literature context, and pattern-novelty assessment. Only applies to private runs; public runs always use LLMs. Default false.
    api_key: Disco API key (disco_...). Optional if DISCOVERY_API_KEY env var is set.
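The pre-check workflow the description prescribes (estimate first, then decide whether to run a private analysis) can be sketched as below. This is a hypothetical illustration: the argument values are made up, and the result field names (`sufficient_credits`, `free_public_alternative`) are assumptions, since the output schema only exposes an opaque `result`.

```python
# Hypothetical gate for the estimate-before-analyze pattern.
# Field names in `estimate` are assumptions, not the documented schema.

def should_run_private_analysis(estimate: dict) -> bool:
    """Decide whether a private discovery_analyze run is worthwhile."""
    if estimate.get("free_public_alternative"):
        return False  # prefer the free public run
    return bool(estimate.get("sufficient_credits"))

# Arguments an agent might send to discovery_estimate (values illustrative):
args = {
    "file_size_mb": 12.5,
    "num_columns": 40,
    "num_rows": 100_000,     # optional; improves the time estimate
    "analysis_depth": 1,     # 1 = fast, higher = deeper
    "visibility": "private", # "public" is free, results published
    "use_llms": False,       # private-only flag; public always uses LLMs
}
```

An agent would send `args` to `discovery_estimate`, feed the returned estimate to `should_run_private_analysis`, and only then call `discovery_analyze`.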

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| file_size_mb | Yes | | |
| num_columns | Yes | | |
| num_rows | No | | |
| analysis_depth | No | | |
| visibility | No | | public |
| use_llms | No | | |
| api_key | No | | |

Output Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| result | Yes | | |
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable behavioral context beyond the readOnlyHint annotation. It explains that this is a pre-check tool to avoid unexpected costs, describes the different visibility modes (public vs private), clarifies the LLM behavior difference between public and private runs, and mentions the API key fallback to environment variable. While the annotation covers safety, the description provides important operational context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and appropriately sized. It starts with the core purpose, then lists outputs, provides critical usage guidance, and details each parameter with meaningful explanations. While comprehensive, every sentence earns its place by adding necessary information for tool selection and invocation.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, cost estimation function) and the presence of an output schema, the description is complete. It explains the tool's role in the workflow, distinguishes it from siblings, provides parameter semantics that the schema lacks, and gives operational context. Because an output schema exists, the description doesn't need to detail return values.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description fully compensates by providing detailed semantic explanations for all 7 parameters. It explains what each parameter means (e.g., 'analysis_depth: Search depth (1=fast, higher=deeper)', 'use_llms: Slower and more expensive, but you get smarter pre-processing...'), specifies defaults, and clarifies optional vs required parameters with practical implications.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: to estimate cost, time, and credit requirements before running an analysis. It specifies the exact outputs (credit cost, duration, credit sufficiency, free alternative existence) and distinguishes it from sibling tools by explicitly mentioning its relationship to discovery_analyze for private runs.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance: 'Always call this before discovery_analyze for private runs.' It also distinguishes between public (free) and private (costs credits) runs, and clarifies that the use_llms flag only applies to private runs, since public runs always use LLMs. This gives clear when-to-use and when-not-to-use criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/leap-laboratories/discovery-engine'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.