Skip to main content
Glama

Valuein — SEC EDGAR Fundamentals & Smart-Money Data

Server Details

Point-in-time, survivorship-free SEC EDGAR fundamentals + smart-money signals for AI agents.

Status
Healthy
Last Tested
Transport
Streamable HTTP
URL
Repository
valuein/valuein
GitHub Stars
0

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client
Glama
MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool DescriptionsA

Average 4.6/5 across 68 of 68 tools scored. Lowest: 3.6/5.

Server CoherenceA
Disambiguation4/5

Tool descriptions are detailed and each tool has a clearly defined purpose. However, the sheer number of tools (68) means that some tools like get_company_fundamentals, get_financial_ratios, and get_valuation_metrics could be confused by an agent if it doesn't read descriptions carefully. Overall, most tools are unambiguous.

Naming Consistency5/5

All tool names follow the snake_case verb_noun pattern consistently, making it easy to predict function. Examples: get_*, create_*, delete_*, search_*, etc. There are no irregularities.

Tool Count2/5

With 68 tools, this server far exceeds the typical well-scoped range of 3-15. While each tool serves a specific purpose, the large number creates a steep learning curve and increases the chance of agent mis-selection.

Completeness5/5

The tool set covers the full lifecycle for fundamental data, insider and institutional analytics, reports, theses, claims, alerts, watchlists, and workflows. It includes CRUD operations, generation of Excel/DOCX, and SEC EDGAR searches. No obvious gaps within the stated domain.

Available Tools

68 tools
compare_periodsCompare Financial PeriodsA
Read-onlyIdempotent
Inspect

Compare a company's core financial metrics across two fiscal periods side-by-side. Shows absolute and percentage changes with significance classification (minor < 5%, notable 5–15%, significant > 15%). The response includes a material_changes count: this is the number of metrics whose significance ∈ {notable, significant} (i.e. absolute percentage change > 5%). Use it as a quick scalar to triage filings — anything > ~3 typically signals a material event worth deeper review. Use period format: 'FY2024' for annual, 'Q1-2024' for quarterly. Pass period_a as the EARLIER period and period_b as the LATER one — if you invert them the server auto-swaps and sets swapped: true in the response so deltas always carry the correct sign (rather than silently flipping). Point-in-time safe via as_of_date. Available on all plans.

ParametersJSON Schema
NameRequiredDescriptionDefault
tickerYesStock ticker symbol, e.g. AAPL, MSFT, BRK.B
period_aYesEarlier fiscal period. Format: 'FY2023' for annual or 'Q1-2023' for quarterly.
period_bYesLater fiscal period. Format: 'FY2024' for annual or 'Q1-2024' for quarterly.
as_of_dateNoPoint-in-time date (YYYY-MM-DD). Only returns facts with accepted_at on or before this date — eliminates look-ahead bias for backtesting.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
tickerYes
changesYesPer-metric deltas: metric, label, period_a, period_b, delta, delta_pct, significance
swappedYesTrue when inputs were reordered so period_b is the more recent period
period_aYesEarlier period descriptor: label, fiscal_year, fiscal_period, period_end, filing_date
period_bYesLater period descriptor, same shape as period_a
as_of_dateNo
company_nameNo
total_metricsYesCount of metrics compared across the two periods
material_changesYesCount of metrics flagged as a material change
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly, idempotent, non-destructive. Description adds auto-swapping with swapped flag, significance thresholds (minor <5%, notable 5-15%, significant >15%), material_changes count as triage scalar, and point-in-time semantics. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences that are front-loaded with main purpose, each sentence adds distinct information (significance, material_changes, period format, auto-swap, point-in-time). No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Complete for a comparison tool: covers input parameters, output semantics (significance, material_changes), edge cases (auto-swap on inversion), and safety feature (point-in-time). Output schema exists but description sufficiently explains key return elements.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage for all 4 parameters. Description adds valuable context: format examples for period_a/b, explanation of auto-swap inversion logic, significance classification for output, and practical use of as_of_date for backtesting. Significantly enhances parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool compares core financial metrics across two periods side-by-side, with verb 'compare' and specific resource 'financial metrics across fiscal periods'. It distinguishes from sibling tools like get_company_fundamentals or get_financial_ratios by focusing on period comparison.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear usage context: period format (FY2024, Q1-2024), order (period_a earlier, period_b later), auto-swap behavior, and point-in-time safety via as_of_date. Does not explicitly state when not to use or list alternatives, but guides correct usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compute_dcfCompute Forward DCFA
Read-onlyIdempotent
Inspect

Forward discounted-cash-flow valuation (two-stage Gordon-growth model): caller provides growth + WACC + terminal assumptions, returns per-share intrinsic value (value_per_share_cents, cents USD) + 5×5 sensitivity grid. Pulls FCF base + net debt + shares from R2; caller can override any field. Definitions (consistent with get_financial_ratios / get_capital_allocation_profile): FCF base = operating_cash_flow − capex (absolute USD); net_debt = total_debt − (cash + short-term investments). Shares resolve via a fallback chain (valuation row → fact CommonSharesOutstanding → net_income/eps_diluted), reported as result.shares_source. The pulled inputs are echoed in result.inputs_echo with their source lineage so the valuation is reproducible and traceable. A null value_per_share_cents means the model is degenerate (e.g. WACC ≤ terminal growth, or FCF base ≤ 0) or a required input was unavailable — it is NOT a zero valuation; the reason field explains. Use the returned figures exactly. Use this when you want to drive the assumptions yourself; for the pipeline's pre-computed DCF/DDM value and inputs (no assumptions needed) use get_valuation_metrics instead. Does NOT persist a report — use create_report (report_type:'reverse_dcf') for that. Tier: sp500+.

ParametersJSON Schema
NameRequiredDescriptionDefault
waccNoDiscount rate. Default 0.09.
tickerYesStock ticker symbol of the company to value, e.g. AAPL, MSFT, BRK.B.
as_of_dateNoPoint-in-time cutoff (YYYY-MM-DD) for the auto-pulled inputs. Fundamentals are filtered by SEC accepted_at (strict PIT); valuation.parquet inputs are best-effort PIT (filtered by created_at, its accepted_at proxy — no SEC acceptance timestamp exists for pipeline-computed valuations). Omit to use the latest knowable inputs.
stage1_yearsNoNumber of explicit high-growth projection years before the terminal stage (3–15). Defaults to 5.
shares_overrideNoOverride shares outstanding. Leave unset to use R2-derived.
fcf_base_overrideNoOverride the auto-pulled FCF base (USD). Leave unset to use R2-derived.
stage1_growth_rateYesStage-1 FCF growth rate (e.g. 0.12 = 12%/yr).
terminal_growth_rateNoLong-run growth. Default 0.025.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
resultYes
tickerYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, destructiveHint; description adds model details, input sources, fallback logic for shares, meaning of null value, traceability via inputs_echo, and non-persistence.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is comprehensive and well-structured, but slightly lengthy; however, every sentence adds value with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 parameters, output schema exists), the description covers model type, inputs, outputs, special cases, and alternatives, making it fully adequate for an agent to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage with descriptions; description adds meaning about model assumptions, relationship between parameters (e.g., WACC vs terminal growth), fallback for shares, and override behavior.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it computes a forward DCF valuation using a two-stage Gordon growth model, returns per-share intrinsic value and sensitivity grid, and distinguishes itself from sibling tools like get_valuation_metrics and create_report.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says when to use (when caller wants to drive assumptions) and when not (use get_valuation_metrics for pre-computed), explains null value meaning, and notes it does not persist a report.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_alertCreate AlertAInspect

Persist an alert and register it with the firing pipeline. Three condition shapes:

  • filing_event — fire when a ticker files a chosen form type (8-K, 10-K, etc.).

  • ratio_threshold — fire when a ticker's financial ratio crosses a threshold (e.g. interest_coverage < 1.5).

  • watchlist_change — fire on any filing on any ticker in a named watchlist.

Delivery channels: email (transactional via Resend) or webhook (HMAC-SHA256-signed POST). The cron evaluator runs every 5 minutes. Use test_alert to verify your channel is wired correctly before relying on the cron.

ParametersJSON Schema
NameRequiredDescriptionDefault
nameYesHuman-readable label.
channelYesDelivery channel for a match — `email` (Resend transactional email), `webhook` (HMAC-SHA256-signed POST to your URL), or `dashboard` (in-app inbox, readable via list_alert_inbox).
conditionYesCondition evaluated each cron tick — a discriminated union of `filing_event` (a watched ticker files a new form), `ratio_threshold` (a financial ratio crosses a comparator/threshold), or `watchlist_change` (a named watchlist's membership changes).

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
alertYes
cron_indexedYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses that alerts are persisted and registered, cron runs every 5 minutes, and delivery channels behave as described (email via Resend, webhook signed with HMAC-SHA256). Adds behavioral context beyond annotations (which only indicate write operation).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise and well-structured: begins with core purpose, then condition shapes, delivery channels, cron cadence, and sibling guidance. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Comprehensively covers condition types, channel types, and cron schedule. References test_alert for verification. Output schema exists for return value, so that gap is acceptable. Missing details about response format but overall complete for a complex tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers all parameters (100%), but description adds concrete examples and explanations (e.g., 'email (transactional via Resend)', 'HMAC-SHA256-signed POST'). This enriches the schema descriptions with practical context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Persist an alert and register it with the firing pipeline' and enumerates three condition shapes and delivery channels. Distinguishes from sibling test_alert by referencing it for verification.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides context for when to use (e.g., create alerts with specific conditions) and explicitly advises using test_alert before relying on the cron. Lacks explicit when-not-to-use but gives sufficient guidance for agent choice.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_reportCreate Research ReportA
Idempotent
Inspect

Synchronously generate a research report and persist it under the caller's authorship. Two subtypes:

reverse_dcf — solves the stage-1 free-cash-flow growth rate the market price implies, with a 5×5 sensitivity grid across WACC × terminal-growth assumptions. Returns full markdown + structured JSON + every numerical claim's citation chain to the originating SEC accession.

thesis — snapshot a saved thesis (via save_thesis) as a frozen narrative report with at-a-glance table, author notes, anchor fundamentals (latest annual), and lineage to the source filing. Later edits to the thesis do NOT propagate — generate a new report to capture new state.

Tier: sample tier rejected — reports are per-author state.

ParametersJSON Schema
NameRequiredDescriptionDefault
titleNoOptional human-supplied title; auto-generated when omitted.
paramsNoReverse-DCF parameters — required for report_type=reverse_dcf.
tickerNoUS-listed ticker — required for report_type=reverse_dcf. Case-insensitive.
thesis_idNoId of a saved thesis owned by the caller — required for report_type=thesis.
report_typeYesSubtype. `reverse_dcf` requires ticker + params; `thesis` requires thesis_id (from save_thesis / list_theses).
idempotency_keyNoOptional key for at-most-once semantics. Same key from the same user always yields the same report id.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
reportYes
markdownYes
sectionsYes
citationsYes
structuredYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotentHint=true and destructiveHint=false. The description adds context: synchronous generation, persisting under caller's authorship, per-author state, and the snapshot behavior of thesis reports (no propagation). Adds meaningful behavior beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with bullet points, front-loaded purpose, and subtype details. Slightly long but justified by complexity; no fluff. Minor improvement possible in brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 6 parameters, nested objects, output schema, and complexity, the description is thorough. It covers all necessary context: subtypes, required fields, behavioral nuances (non-propagating), and output format hints. Complete for agent invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% parameter descriptions. The description groups parameters by subtype (e.g., 'Reverse-DCF parameters'), clarifies requirements, and explains idempotency key semantics. Adds context not in schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates a research report and persists it under the caller's authorship, and distinguishes two subtypes (reverse_dcf and thesis) with specific details. This effectively differentiates it from siblings like compute_dcf or save_thesis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear guidance on when to use each subtype: reverse_dcf requires ticker+params, thesis requires thesis_id. It also notes that thesis edits do not propagate, implying when to regenerate. Could be more explicit about when not to use, but adequate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_alertDelete AlertA
DestructiveIdempotent
Inspect

Soft-delete an alert by its id (from create_alert/list_alerts): status flips to deleted and it is removed from the cron evaluator index so it stops firing. Alerts are immutable — to change one, delete then create_alert. Idempotent. Tier: sp500+ (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
alert_idYesIdentifier of the alert to soft-delete, as returned by create_alert or list_alerts.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
statusYes
alert_idYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate destructiveHint=true and idempotentHint=true. The description adds context: soft-delete behavior, status flip, removal from cron evaluator, and immutability. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences with no fluff. The first sentence front-loads the main action and effect. Every sentence adds essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With one parameter, good annotations, an output schema (not shown but present), and a description covering soft-delete, immutability, and tier, the tool's behavior is fully documented for agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter alert_id is fully described in the input schema. The description reinforces that the ID comes from create_alert/list_alerts and adds behavioral context (soft-delete effect) that goes beyond the schema's parameter description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs a soft-delete on an alert by its ID, flipping status to 'deleted' and removing it from the cron evaluator. It distinguishes itself from create_alert and list_alerts by specifying the source of the ID and the effect.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides guidance on when to use the tool (to delete an alert) and indirectly explains that to change an alert, one must delete and then create a new one due to immutability. It also notes the tier restriction (sp500+ sample rejected), indicating usage scope.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_citation_overrideDelete Citation OverrideA
DestructiveIdempotent
Inspect

Remove a user-authored citation correction by fact_id. Idempotent — deleting a missing override returns deleted=false without error. Once deleted, reports that previously rendered the corrected value revert to the canonical fact value on next regeneration. Tier: paid + free (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
fact_idYesFact identifier whose citation override should be removed, as returned by save_citation_override or list_citation_overrides.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
deletedYes
fact_idYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds significant behavioral details beyond annotations: idempotency behavior (deleting missing override returns deleted=false) and side effect (reports revert to canonical fact value on regeneration). This enriches the destructiveHint annotation with concrete consequences.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three efficient sentences, front-loaded with the action, and includes key notes on idempotency and tier without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool with full schema coverage, output schema, and annotations, the description covers effects on reports and idempotency, making it complete for an agent's decision-making.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the schema already describes the fact_id parameter as returned by related tools. The description adds no additional parameter meaning beyond the schema, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool removes a user-authored citation correction by fact_id, differentiating it from save and list tools. The action and resource are specific and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use (to delete an override) but does not explicitly state when not to use or name alternatives like save_citation_override. It includes tier restrictions but lacks direct comparative guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_claimDelete ClaimA
DestructiveIdempotent
Inspect

Soft-delete a claim by id. The row and its score history are preserved for audit (archived, not erased); the claim drops out of default list_claims results. Idempotent — deleting an already-archived claim succeeds.

Tier: all paid + free tiers (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
claim_idYesId of the claim to archive.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
archivedYes
claim_idYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate destructiveHint=true and idempotentHint=true. The description adds context by specifying soft-delete behavior (archived, not erased), audit preservation, and impact on list_claims results. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (two sentences plus tier info), front-loaded with the main action, and every sentence adds value. No redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool is simple with one parameter, and the explanation of behavior (soft-delete, idempotent, tier constraints) is complete. An output schema exists, so return values need not be described. The annotations cover safety aspects, making the description fully adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema covers 100% of the single parameter with a clear description. The tool description does not add additional meaning beyond what the schema provides. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (soft-delete), the resource (claim), and the identifier (by id). It distinguishes soft-delete from hard delete by explaining that the row and score history are preserved for audit, and the claim drops out of default list_claims results.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides tier restrictions (all paid + free tiers, sample rejected) and notes idempotency, implying use cases where idempotency is desired. However, it does not explicitly mention when not to use or provide alternatives among sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_reportDelete (Soft) Research ReportA
DestructiveIdempotent
Inspect

Soft-delete a report owned by the caller: status flips to delisted, visibility to private — not a hard delete, the row and R2 artifact are preserved (90-day audit window). Idempotent (deleting an already-delisted report succeeds). Sample tier rejected.

ParametersJSON Schema
NameRequiredDescriptionDefault
report_idYesId from `create_report` or `list_my_reports`.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
statusYes
report_idYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description adds significant context beyond annotations: soft-delete specifics (delisted status, private visibility, 90-day audit window), idempotency confirmation, and sample tier rejection. Consistent with destructiveHint=true and idempotentHint=true annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences front-loaded with core action, followed by idempotency and tier note. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers key aspects: soft-delete effect, idempotency, and tier restriction. Missing explanation of sample tier rejection and ownership enforcement, but output schema exists and tool is simple.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 100% of parameter description, including source of report_id. Description does not add extra parameter meaning beyond schema, achieving baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states soft-delete for reports owned by caller, specifying effect (status flips to delisted, visibility private) and distinguishing from hard delete. Title reinforces 'Soft' delete, differentiating from sibling delete tools for other resources.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage for deleting own reports and mentions idempotency, but does not explicitly state when to use vs. alternatives like update_report or other delete tools. No exclusions or alternative tool names provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_thesisArchive Saved ThesisA
DestructiveIdempotent
Inspect

Soft-delete a saved thesis: status flips to archived (the row stays for audit / re-scoring). Idempotent — archiving an already-archived thesis succeeds. Hard-delete is not supported by design; future versions may expire archived theses after N years. This does not delete the claims linked to the thesis — use delete_claim for those. Tier: paid + free (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
thesis_idYesId returned by `save_thesis` or `list_theses`.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
statusYes
thesis_idYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds context beyond annotations: explains soft-delete behavior (row stays for audit/re-scoring), idempotence, no effect on linked claims, and future expiration plans, complementing the destructiveHint and idempotentHint annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with no fluff, front-loaded with key info (soft-delete, idempotent), each earning its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given single parameter, good annotations, and presence of output schema, description covers all necessary context: purpose, behavior, idempotence, relationship to other tools, and tier.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage with description for thesis_id, but description adds value by specifying where the ID comes from (save_thesis or list_theses), aiding correct usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states soft-delete with status change to 'archived', distinguishes from hard-delete and delete_claim for claims, providing a specific verb-resource pair with clear scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says when to use (archiving theses) and when not (delete claims), mentions idempotence and future behavior, and includes tier info for access control.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_watchlistArchive WatchlistA
DestructiveIdempotent
Inspect

Soft-delete a watchlist by its name (not id): status flips to archived (still readable via list_watchlists status=all/archived). The name is freed for reuse by a new save_watchlist. Idempotent. Tier: sp500+ (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
nameYesWatchlist name to soft-delete (case-insensitive, 1–80 chars); frees the name for reuse.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
statusYes
watchlist_idYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations, describes soft-delete vs hard, idempotency, name reuse, and access restrictions. Adds value over annotations' destructiveHint and idempotentHint.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences with no fluff, front-loaded with main action, secondary behavioral notes follow.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool with output schema and good annotations, description covers behavior, parameter, and post-conditions completely.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description covers parameter name with case-insensitivity, length constraints, and purpose. Description adds 'by name (not id)' context. 100% schema coverage already strong.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it soft-deletes a watchlist by name, flips status to archived, frees name for reuse. Verb+resource precise, distinguishes from siblings like delete_alert.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explains that archived watchlists are still readable via list_watchlists with status=all/archived, and name is freed for save_watchlist. Provides access tier info. Could explicitly state when not to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

describe_schemaDescribe Data SchemaA
Read-onlyIdempotent
Inspect

Returns the Parquet schema for all tables in the Valuein SEC data warehouse. Includes table descriptions, column names, types, primary keys, and foreign-key references. Use this tool to understand the data model before querying with other tools. No data reads required — schema is embedded in the manifest. Available on all plans.

ParametersJSON Schema
NameRequiredDescriptionDefault
tableNoFilter to a single table name (e.g. 'fact', 'entity', 'references'). Omit to return the full schema for all tables.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
tableNoSingle-table mode: the requested table name
tablesNoFull-schema mode: map of table name → { description, column_count, columns }
columnsNoSingle-table mode: map of column name → definition
projectNoFull-schema mode: source project name
descriptionNoSingle-table mode: the table's description
schema_versionYesParquet schema version from the active R2 manifest
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, idempotentHint, destructiveHint. The description adds that no data reads are required and the schema is embedded in the manifest, clarifying performance and cost implications. Also states availability on all plans.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences cover purpose and usage with no unnecessary words. Information is front-loaded and each sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema, the description need not detail return values. It covers purpose, usage, behavioral traits, and parameter effects adequately for a read-only schema tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds concrete examples ('fact', 'entity', 'references') and explicitly states behavior when the parameter is omitted. This adds useful context beyond the formal schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns the Parquet schema for all tables in the data warehouse, specifying the resource and action. It is distinct from sibling tools like 'search_filing_text' or 'get_sec_filing_links'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit guidance to use this tool to understand the data model before querying. While no explicit alternatives are listed, the context implies it is the primary schema exploration tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

dismiss_inbox_itemDismiss Inbox ItemA
DestructiveIdempotent
Inspect

Soft-delete a single inbox item by its id (from list_alert_inbox) — not an alert id; sets dismissed_at. The row stays queryable via list_alert_inbox(include_dismissed=true) for audit. Idempotent. Tier: sp500+ (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
inbox_idYesIdentifier of the inbox item to dismiss (soft-delete), as returned by list_alert_inbox.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
inbox_idYes
dismissedYes
unread_countYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide idempotentHint and destructiveHint. The description adds meaningful behavioral context: soft-delete behavior, setting dismissed_at, remaining queryable via list_alert_inbox(include_dismissed=true) for audit, and idempotency. This significantly enhances understanding beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is highly concise: two sentences covering all key points. Key information is front-loaded, and every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one parameter, no nested objects, annotations present), the description covers soft-delete behavior, auditability, idempotency, and tier. However, it does not mention the relationship to the sibling mark_inbox_read, which could provide additional context about alternate actions.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter inbox_id is fully described in the schema (100% coverage), but the description adds value by specifying the source (from list_alert_inbox) and clarifying it's not an alert id. This provides useful context beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool soft-deletes a single inbox item by its id from list_alert_inbox, not an alert id. It specifies the action (soft-delete) and resource (inbox item), and distinguishes from sibling tools by clarifying it's not an alert id. This is specific and differentiates effectively.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for soft-deleting an inbox item and mentions idempotency and tier constraints. It does not explicitly contrast with siblings like mark_inbox_read, but clarifies the input type (not alert id), providing clear context for appropriate use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

forensic_auditForensic Audit (Beneish + Sloan + Solvency)A
Read-onlyIdempotent
Inspect

Deterministic forensic-accounting scores for a single ticker: partial Beneish M-Score, Sloan accruals, and a solvency snapshot. Returns a red-flag narrative ranked by severity, with citations to source filings. Used by the forensic_earnings_brief SOP.

Note: full Beneish needs AR / current assets / PPE / SGA / current liabilities, which aren't in our fundamentals model. We compute the recoverable subset (SGI + TATA + LVGI) and flag partial=true. Tier: sp500+.

ParametersJSON Schema
NameRequiredDescriptionDefault
tickerYesStock ticker symbol of the company to audit, e.g. AAPL, MSFT, BRK.B.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
resultYes
tickerYes
sec_urlYes
period_endYes
source_filingYes
prior_period_endYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate read-only, idempotent, non-destructive. The description adds that it is deterministic, returns a ranked narrative with citations, and flags partial=true due to missing data. This goes beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose and adds details about limitations and usage. It is concise but could be slightly tighter; each sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool complexity (forensic accounting), existence of output schema, and single parameter, the description fully covers what the tool does, its limitations (partial flag), and its use case. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter ticker is well-described in the schema (length constraints, example). The description does not add additional semantic meaning for the parameter beyond the schema, but schema coverage is 100%, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it computes deterministic forensic-accounting scores (partial Beneish M-Score, Sloan accruals, solvency snapshot) for a single ticker, and returns a red-flag narrative. This is specific and distinguishes from sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions it is used by the forensic_earnings_brief SOP, providing context. It does not explicitly list alternatives or when not to use, but the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_comps_xlsxGenerate Peer Comparables Workbook (xlsx)AInspect

Render a peer comparables table into an Excel workbook. The Comps sheet is formatted as a named Excel Table (ValueinPeerComps) so the user gets one-click Insert Chart on any column — the cleanest workaround for not embedding chart objects server-side. Subject-row highlight makes side-by-side comparison instant. A Summary sheet adds subject vs peer-median deltas.

Pair with get_peer_comparables for a typical flow.

Tier: pro+.

ParametersJSON Schema
NameRequiredDescriptionDefault
notesNoOptional free-text note (≤500 chars) rendered on the Summary sheet.
peersYesPeer companies to tabulate against the subject (1–50 rows); each row carries the peer's ticker, name, and comparable ratio values.
subject_tickerYesStock ticker symbol of the subject company the comps sheet is built around, e.g. AAPL.
subject_company_nameNoOptional display name for the subject company; falls back to the ticker if omitted.

Output Schema

ParametersJSON Schema
NameRequiredDescription
urlYes
_metaYesProvenance envelope — data lineage for every MCP response
r2_keyYes
filenameYes
expires_atYes
size_bytesYes
content_typeYes
expires_in_secondsYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate it is not readOnly nor destructive. The description adds value by detailing what the tool produces (Excel workbook with specific formatting) and how it achieves clean chart insertion. It does not contradict annotations and provides useful behavioral context beyond the structural fields.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with three focused sentences covering purpose, technical highlight, pairing suggestion, and tier. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema, the description adequately covers what the tool produces and key formatting features. It does not explain error handling or data format expectations, but for a generation tool it is sufficiently complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with all parameters described. The description does not add specific parameter details beyond the schema, aligning with the baseline of 3 for high coverage. It hints at the overall structure but does not enhance parameter understanding further.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it renders a peer comparables table into an Excel workbook with specific technical details (named Excel Table, subject-row highlight, Summary sheet with deltas). It effectively communicates the tool's core function and distinguishes it from siblings by mentioning pairing with get_peer_comparables for a typical flow.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly suggests pairing with get_peer_comparables for a typical flow, providing clear context on when to use it. However, it does not explicitly state when not to use this tool or list alternative approaches.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_dcf_xlsxGenerate DCF Workbook (xlsx)AInspect

Render a forward DCF result into a professional Excel workbook (Summary + 5×5 Sensitivity heatmap + Inputs sheet). Native conditional formatting — no chart images needed. Returns a 15-minute presigned R2 download URL.

Pair with compute_dcf for a typical analyst flow: agent calls compute_dcf({ticker, ...}), then passes the structured result straight to generate_dcf_xlsx({ticker, dcf_result, ...}) to materialise a shareable file.

Tier: pro+.

ParametersJSON Schema
NameRequiredDescriptionDefault
tickerYesStock ticker symbol of the company the DCF workbook is built for, e.g. AAPL.
dcf_resultYesStructured DCF result — typically the `result` field returned by `compute_dcf`.
company_nameNoOptional — surfaces on the cover row. Falls back to ticker only.

Output Schema

ParametersJSON Schema
NameRequiredDescription
urlYes
_metaYesProvenance envelope — data lineage for every MCP response
r2_keyYes
filenameYes
expires_atYes
size_bytesYes
content_typeYes
expires_in_secondsYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses key behaviors: native formatting, no chart images, returns a 15-minute presigned R2 URL. Annotations indicate it is not read-only or destructive. No contradiction, but could mention potential side effects like temporary storage or permissions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise two-paragraph structure: first paragraph states core function and features, second paragraph gives usage flow. No wasted words, front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers main aspects: tool purpose, output format, typical pairing. With output schema present, no need to detail return values. Could mention error handling or prerequisites for dcf_result validity, but overall sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and description adds workflow context by noting dcf_result typically comes from compute_dcf and explaining the optional company_name. Enhances understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it renders a DCF result into a professional Excel workbook with specific sheets (Summary, Sensitivity heatmap, Inputs). It mentions native conditional formatting and returns a presigned URL, distinguishing it from sibling tools like compute_dcf and generate_comps_xlsx.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly pairs with compute_dcf and outlines a typical analyst flow, providing clear use context. Does not include negative guidance or alternatives for similar tasks, but the practical pairing is well explained.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_research_brief_docxGenerate Research Brief (docx)AInspect

Render a structured research brief into a professionally-styled Word document — cover, abstract, optional snapshot table, body sections, and a citations table with clickable SEC EDGAR links. No embedded charts in v1; pair with generate_dcf_xlsx / generate_comps_xlsx for visuals the analyst pastes in.

Consumes the same sections + citations shape create_report emits, so the typical flow is two tool calls: create_reportgenerate_research_brief_docx.

Tier: pro+.

ParametersJSON Schema
NameRequiredDescriptionDefault
titleYesDocument title rendered on the cover page (1–200 chars).
tickerYesStock ticker symbol the brief covers, e.g. AAPL, MSFT, BRK.B.
abstractNoOptional executive-summary paragraph (≤2000 chars) shown after the cover page.
sectionsYesOrdered body sections of the brief (1–20); each has a heading and body text.
snapshotNoOptional at-a-glance metric rows (≤20) rendered as the snapshot table.
citationsNoOptional source citations (≤60) rendered as a table with clickable SEC EDGAR hyperlinks.
company_nameNoOptional display name shown on the cover page; falls back to the ticker if omitted.

Output Schema

ParametersJSON Schema
NameRequiredDescription
urlYes
_metaYesProvenance envelope — data lineage for every MCP response
r2_keyYes
filenameYes
expires_atYes
size_bytesYes
content_typeYes
expires_in_secondsYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds value beyond annotations by disclosing limitations (no charts, v1) and providing context about the output format (clickable SEC EDGAR links). Annotations already indicate non-destructive, non-idempotent behavior without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise, front-loaded sentences that cover purpose, key constraints, and typical usage. Every sentence adds value; no fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 7 parameters and an output schema, the description provides sufficient context about the tool's role in a workflow, its limitations, and what outputs to expect. It is complete for an agent to understand when and how to use it.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds meaning by explaining how parameters map to document sections (cover, abstract, snapshot, body, citations) and notes the compatibility with create_report output. This reduces cognitive load for agents.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it renders a structured research brief into a Word document with specific components (cover, abstract, snapshot, body, citations). Distinguishes from siblings by mentioning pairing with generate_dcf_xlsx/generate_comps_xlsx for visuals.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly describes the typical flow (create_report → generate_research_brief_docx) and when to use alternatives (for visuals). Also notes limitations ('No embedded charts in v1'), guiding appropriate tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_blockholdersBlockholders (SC 13D / 13G)A
Read-onlyIdempotent
Inspect

Returns SC 13D / SC 13G blockholder disclosures (5%+ stakes) for a US public company. Each row carries percent_owned, sole/shared voting + dispositive split, schedule_type, and the first-class going_active flag — TRUE when the same filer flipped 13G → 13D within the lookback window (the single most actionable activist signal in this dataset). Use latest_only=true (default) to dedupe to the most recent filing per filer. Use collapse_groups=true to fold multi-person filings into one row. Institutional tier only.

ParametersJSON Schema
NameRequiredDescriptionDefault
tickerYesStock ticker symbol of the issuer.
as_of_dateNoPIT filter on accepted_at — only filings on or before this date.
latest_onlyNoWhen true (default), keep only the most recent filing per (filer, schedule prefix) — typically what analysts want. Set false to see the full filing history.
lookback_daysNoWindow for the going_active (13G → 13D) detection. Default 365 days.
lineage_detailNoPer-row provenance envelope.compact
collapse_groupsNoWhen true, fold multi-reporting-person filings into a single row, with secondary persons in the ``persons[]`` field. Default false: each person stays as its own row.
schedule_filterNoWhich schedule(s) to return. '13D' = activist (intent to influence). '13G' = passive. 'both' = no filter.both

Output Schema

ParametersJSON Schema
NameRequiredDescription
cikYes
rowsYes
_metaYesProvenance envelope — data lineage for every MCP response
tickerYes
company_nameYes
data_age_daysYes
staleness_warningYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true. The description adds substantial behavioral detail: explains the going_active flag as an activist signal, mentions dedup logic for latest_only, and clarifies group handling with collapse_groups. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, dense paragraph of 4-5 sentences. It front-loads the purpose, then efficiently covers key parameters and the most notable feature (going_active). Every sentence adds value with no filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 7 parameters, 100% schema coverage, and an existing output schema, the description is complete. It covers the data source, key output fields, the most important flag (going_active), and parameter usage. No gaps identified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, but the description adds significant business context beyond the schema. It explains the importance of the 'going_active' detection, defines 'schedule_filter' values meaningfully (13D=activist, 13G=passive), and clarifies the effect of 'collapse_groups'. Exceeds baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The title and first sentence clearly state it returns SC 13D/13G blockholder disclosures for US public companies. It distinguishes from sibling tools like get_insider_transactions or get_top_holders by specifying the 5%+ stake reporting and the unique 'going_active' flag.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises 'Institutional tier only' and explains when to use 'latest_only' and 'collapse_groups' parameters. Does not explicitly state when not to use this tool versus alternatives, but the sibling context and parameter descriptions provide implicit guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_capital_allocation_profileCapital Allocation ProfileA
Read-onlyIdempotent
Inspect

Get a multi-year capital allocation breakdown for a US public company. Shows how management deploys cash across all six categories — capex, R&D, M&A, dividends, buybacks, and debt — plus pre-computed deployment ratios (% of operating cash flow) and over-distribution flags. Use this tool when the user asks: how does a company allocate capital, what's the buyback-vs-dividend mix, is the company over-distributing, is growth funded by R&D or M&A, what's the cash return ratio trend, or any 'where does the money go' question. Also use for owner-earnings analysis (Buffett-style) and reinvestment-rate analysis (Damodaran-style). Data sourced from annual 10-K filings (SEC EDGAR) — income statement (R&D), investing section (capex, M&A), financing section (dividends, buybacks, debt). All figures are point-in-time safe via the as_of_date parameter — no look-ahead bias. R&D semantics: R&D is included as a deployment category despite being an income-statement expense, because for knowledge-economy businesses (tech, pharma, industrials with heavy engineering) it represents the primary growth reinvestment vehicle and often dwarfs capex. R&D is already deducted before reaching operating cash flow, so rd_pct_ocf is INFORMATIONAL — it does not reduce OCF a second time. The total_deployment_pct_ocf field excludes R&D from its sum to preserve the cash-flow identity (OCF = capex + M&A + dividends + buybacks + debt repayment + change in cash). Flags object: pre-computed booleans for common analytical questions. Use buybacks_exceed_fcf to identify years a company returned more to shareholders via repurchases than it generated in free cash flow. Use total_returns_exceed_fcf for the stricter test (buybacks + dividends > FCF). Use debt_funded_distribution to distinguish over-distribution funded by leverage (typical industrials) from over-distribution funded by cash hoard (Apple 2018-2019 post-tax-reform repatriation). NOT yet included (separate roadmap items): buyback_yield_implied requires a price × shares market-cap series; equity-method investments and intangibles are excluded from acquisitions_net to keep M&A semantics tight (request other_investing_outflows if needed). Available on all plans.

ParametersJSON Schema
NameRequiredDescriptionDefault
tickerYesStock ticker symbol, e.g. AAPL, MSFT
as_of_dateNoPoint-in-time date (YYYY-MM-DD). Only returns facts with accepted_at on or before this date — eliminates look-ahead bias.
lookback_yearsNoNumber of fiscal years to look back from the most recent filing (1–20). Defaults to 5 years for a full capital allocation cycle.

Output Schema

ParametersJSON Schema
NameRequiredDescription
dataYesPer-period capital-allocation rows: capex, R&D, M&A, dividends, buybacks, debt, and deployment-mix flags
noteNo
_metaYesProvenance envelope — data lineage for every MCP response
tickerYes
as_of_dateNo
lookback_yearsYesNumber of fiscal years summarized
periods_returnedYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and idempotentHint. The description adds crucial context: point-in-time safety via as_of_date, R&D semantics (informational, not double-counted), explanation of total_deployment_pct_ocf exclusion, and flag interpretations. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the purpose, then systematically covers use cases, source, parameter semantics, R&D details, flags, and exclusions. While lengthy, every sentence adds value given the tool's complexity. Could be slightly tighter but appropriate.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers use cases, data sources, parameter semantics, behavioral notes, flag interpretations, and limitations. With an output schema present, the description provides sufficient context for an agent to select and invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description adds value by explaining the purpose of as_of_date (eliminates look-ahead bias) and lookback_years default (full capital allocation cycle). It does not repeat schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Get a multi-year capital allocation breakdown for a US public company' and lists specific categories (capex, R&D, M&A, etc.) and ratios. It explicitly distinguishes from siblings by detailing its unique scope and use cases.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit when-to-use scenarios: 'how does a company allocate capital', 'buyback-vs-dividend mix', 'owner-earnings analysis', etc. It also mentions what is not yet included and suggests alternatives (e.g., 'request other_investing_outflows if needed').

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_claimGet ClaimA
Read-onlyIdempotent
Inspect

Fetch a single claim by id, plus the ids of theses it supports/refutes and its full append-only score history. Use this to inspect a claim's evidence, current status, and how its outcome has evolved.

Tier: all paid + free tiers (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
claim_idYesId returned by save_claim or list_claims.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
claimYes
score_eventsYes
linked_thesis_idsYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly and idempotent. The description adds valuable behavioral context about the returned data (theses IDs, append-only score history) and tier availability, without contradicting annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two focused sentences plus tier info. The first sentence captures the core functionality, the second explains usage. No extraneous content; every sentence serves a purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (single parameter, output schema present, robust annotations), the description covers the essential aspects: what it fetches, what additional data it includes, and when to use it. No obvious gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the schema description for claim_id is adequate ('Id returned by save_claim or list_claims'). The description adds no additional parameter details, so baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool fetches a single claim by ID and specifies the additional data returned (theses IDs and score history), distinguishing it from sibling tools like list_claims and get_thesis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a clear use case: 'inspect a claim's evidence, current status, and how its outcome has evolved.' It does not explicitly mention alternatives but the context is sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_company_fundamentalsCompany FundamentalsA
Read-onlyIdempotent
Inspect

Retrieve standardized SEC EDGAR fundamental financial metrics for a US public company. Returns revenue, gross profit, operating income, net income, EPS (diluted), total assets, total liabilities, stockholders' equity, cash & equivalents, total debt, operating cash flow, and capital expenditures for one or more fiscal periods. Data sourced from 10-K (annual) and 10-Q (quarterly) filings. Point-in-time: no look-ahead bias — pass as_of_date (YYYY-MM-DD) to reconstruct exactly the information set known on that date. This returns the raw as-reported line items; for pipeline-computed ratios (margins, ROE, ROIC, leverage, per-share) use get_financial_ratios, and for margins combined with DCF/DDM model inputs use get_valuation_metrics — both derive from the figures this tool returns.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMaximum number of periods to return (1–40). Defaults to 5.
periodNoFiling period granularity. Annual uses 10-K; quarterly uses 10-Q.annual
strictNoWhen true, fail with PLAN_LIMIT_EXCEEDED if the plan cannot satisfy the requested limit. Default false: return what's available and explain the gap in _meta.truncation.
tickerYesStock ticker symbol, e.g. AAPL, MSFT, BRK.B
as_of_dateNoPoint-in-time date (YYYY-MM-DD). Only returns facts with accepted_at on or before this date — eliminates look-ahead bias for backtesting. Omit for the full dataset.
fiscal_yearNoFiscal year (YYYY). Omit to return the most recent available years.
lineage_detailNoPer-period provenance envelope + per-metric availability/provenance sidecars. 'compact' (default) returns source_filing + source_url + restated flag, plus lean per-metric availability + fact_id + source_filing. 'full' adds first_filed_at + accepted_at + per-metric source_url + computed inputs[]. 'off' omits all provenance.compact
response_formatNoOutput shape. 'flat' (default) returns the legacy `metrics` object plus the additive `metrics_availability`/`metrics_provenance` sidecars. 'envelope' additionally attaches `metric_envelopes` — one canonical {metric,value,unit,scale,period,availability,provenance} object per metric.flat

Output Schema

ParametersJSON Schema
NameRequiredDescription
dataYes
_metaYesProvenance envelope — data lineage for every MCP response
periodYes
tickerYes
as_of_dateYes
company_nameYes
years_returnedYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and destructiveHint. The description adds value by specifying data source (SEC EDGAR 10-K/10-Q), point-in-time behavior, and that it returns raw line items. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four concise sentences front-loaded with purpose, followed by metrics list, data source, and usage guidance. No wasted words; every sentence is informative.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given complexity (8 params, output schema exists), the description covers purpose, data source, point-in-time, and sibling differentiation. Lacks mention of pagination or error cases, but overall sufficient for effective tool selection.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description mentions as_of_date briefly but does not add new parameter information beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it retrieves standardized SEC EDGAR fundamental financial metrics for a US public company, lists specific metrics returned, and distinguishes itself from sibling tools like get_financial_ratios and get_valuation_metrics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly guides when to use this tool versus alternatives: raw as-reported line items vs. ratios vs. valuation metrics. Also explains the point-in-time feature with as_of_date for backtesting, providing clear context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_compute_ready_streamCompute-Ready StreamA
Read-onlyIdempotent
Inspect

STATUS: pending — direct R2 Parquet access is in private beta (ETA 2026-Q3). Calls return 501 FEATURE_NOT_AVAILABLE today. When live: returns a pre-signed Cloudflare R2 URL for bulk Parquet access that can be piped into Python/DuckDB/Polars for high-throughput computation that exceeds the MCP context window. Datasets: fact (per-entity partition — requires ticker), ratio (all computed ratios), valuation (DCF inputs), filing (SEC filing metadata), references (company universe), index_membership (historical index composition). URL would expire in 15 minutes. TODAY use the Python SDK (pip install valuein-sdk) for the same data via DuckDB.

ParametersJSON Schema
NameRequiredDescriptionDefault
tickerNoRequired when dataset_type is 'fact'. Resolves to the per-entity fact/{CIK}.parquet partition for that company.
dataset_typeYesDataset to access. 'fact' requires ticker (per-entity partition). All others are full-universe tables.

Output Schema

ParametersJSON Schema
NameRequiredDescription
urlNoPresigned, time-limited GET URL for the Parquet object
_metaYesProvenance envelope — data lineage for every MCP response
scopeNoWhat the presigned URL is scoped to (method, object_key_only, etc.)
usageNoReady-to-run DuckDB / Polars snippets
bucketNo
formatNo
tickerNo
url_hashNo
expires_atNo
object_keyNo
dataset_typeYes
expires_in_secondsNo
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Provides extensive behavioral context beyond annotations: current 501 response, private beta status, 15-minute URL expiry, and dataset-specific ticker requirements. Annotations already indicate read-only and idempotent, and the description adds significant operational detail.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is somewhat lengthy but well-structured: status, future behavior, dataset list, expiry, and alternative. The critical information (current unavailability) is front-loaded, and every sentence serves a purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's current pending status and simple parameters, the description thoroughly covers behavior, dataset semantics, and a concrete workaround. The output schema exists, so return value details are appropriately delegated.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with informative parameter descriptions. The description adds value by elaborating on each dataset type's content (e.g., 'ratio (all computed ratios)'), complementing the schema's enum definitions without contradicting them.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns a pre-signed Cloudflare R2 URL for bulk Parquet access, emphasizing high-throughput computation beyond MCP context window. It distinguishes from sibling get_* tools by highlighting bulk dataset access and explicit dataset types.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states the current unavailability ('Calls return 501 FEATURE_NOT_AVAILABLE today') and provides a concrete alternative ('use the Python SDK...'). For future use, the description implicitly guides when to use this tool for bulk data processing.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_earnings_signalsEarnings SignalsA
Read-onlyIdempotent
Inspect

Reported earnings results and a model-derived earnings-trend signal for a company, by fiscal period: actual reported EPS, a trailing-trend EPS estimate (eps_trend_est), the deviation of actual vs that trend (eps_surprise_pct), reported revenue, and year-over-year revenue growth. IMPORTANT: eps_trend_est is NOT Wall Street analyst consensus — Valuein is sourced purely from SEC EDGAR and carries no consensus feed. It is a deterministic estimate computed from the company's own prior reported EPS, so eps_surprise_pct measures how far the print landed from its own trailing trend, not whether it 'beat the Street'. Use it to track earnings/revenue trajectory and momentum, not to claim a consensus beat or miss. Point-in-time safe — pass as_of_date to filter by SEC acceptance (accepted_at) for look-ahead-free backtests. Available on all plans.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMaximum number of periods to return (1–40), most recent first. Defaults to 8 — covers 2 years of quarterly signals plus their TTM equivalents. earnings_signals.parquet currently emits one row per (entity, period_end); older rows surface here as more historical periods are published.
tickerYesStock ticker symbol, e.g. AAPL, MSFT
as_of_dateNoPoint-in-time filter: only return signals with accepted_at on or before this date. Use for backtesting to avoid look-ahead bias.

Output Schema

ParametersJSON Schema
NameRequiredDescription
dataYes
noteYes
planYes
_metaYesProvenance envelope — data lineage for every MCP response
tickerYes
as_of_dateNo
estimate_basisYes
periods_returnedYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotent and readOnly, which the description supports. The description adds critical context about the estimate's source and meaning, and confirms safety for backtesting. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is moderately long but front-loaded with key information. It uses 'IMPORTANT' to highlight a critical distinction. Every sentence adds value, though it could be slightly more concise without losing clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (3 parameters, output schema exists), the description covers return fields, usage guidance, backtesting, and important caveats. It is complete enough for an agent to select and invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are already well-described. The description adds context about the output fields, but not much per-parameter detail beyond what's in the schema. A score of 3 reflects adequate but not exceptional added value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool reports earnings results and a model-derived earnings-trend signal for a company by fiscal period. It distinguishes itself from siblings by focusing on earnings-specific data with a unique estimate.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description clarifies that eps_trend_est is not analyst consensus but a model estimate from EDGAR, and advises using the tool to track trajectory/momentum rather than claim consensus beats. It also mentions point-in-time backtesting with as_of_date. However, it doesn't explicitly state when to avoid using this tool versus specific siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_financial_ratiosFinancial RatiosA
Read-onlyIdempotent
Inspect

Get pipeline-computed financial ratios from ratio.parquet. Served categories: profitability (margins, ROE, ROA, ROIC), liquidity (current ratio, quick ratio), leverage (D/E, interest coverage, net debt/EBITDA), efficiency (asset turnover, inventory days), per_share (EPS, BVPS, FCF/share), owner_earnings (Buffett FCF, owner yield), and the pipeline-emitted forensic, growth, and rank (cross-sectional *_sector_pctile) categories. NOT every category exists for every ticker — the exact set is data-driven; omit categories to get whatever this ticker has, or read the available_categories list returned in the CATEGORY_NOT_AVAILABLE envelope. valuation (EV/EBITDA, P/E, P/FCF) is DECLARED BUT NOT YET POPULATED for any ticker (price feed pending) — requesting it returns a CATEGORY_NOT_AVAILABLE envelope, never data. Includes TTM (trailing twelve months) rows alongside annual periods. Each row carries is_calendar_aligned (boolean) — TRUE when period_end is actually on the entity's fiscal year boundary (±7 days), FALSE when the pipeline emitted a calendar-quarter row tagged fiscal_period='FY' for a non-December fiscal-year filer. Filter to is_calendar_aligned=TRUE if you're joining ratios with fact-table fundamentals on (entity, fiscal_year). Ratios are pipeline-derived — they're recomputed on each pipeline run with ON CONFLICT DO UPDATE. For historical cuts use as_of_date (the canonical cross-tool name; alias period_end_before). PIT semantics are data-driven: when the ratio data carries an SEC accepted_at timestamp, as_of_date filters point-in-time by accepted_at (zero look-ahead, latest-knowable per period) and _meta.pit_safe=true; when it does not (today's data), the cut is by ratio.period_end and _meta.pit_safe=false. For guaranteed accepted_at PIT regardless, use get_company_fundamentals (which carries fact.accepted_at). Use this instead of get_valuation_metrics when you only need ratios (no DCF wiring); use get_valuation_metrics when you also need DCF/DDM. Each ratio is a {value, unit, category, reason} entry; a response-level lineage (DerivedLineage) envelope marks the values pipeline-derived and points to get_company_fundamentals / verify_fact_lineage for filing-level provenance. Use the returned value exactly — do not recompute it; a null value carries a reason (e.g. INPUT_MISSING, DENOMINATOR_NEGATIVE) so missing is never confused with a real zero. Available on all plans.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoNumber of distinct period_end dates to return (1–20). Defaults to 5. Within each period, all matching ratio_names are included.
tickerYesStock ticker symbol, e.g. AAPL, MSFT
as_of_dateNoHistorical cutoff (canonical cross-tool date param — same name as get_company_fundamentals / get_valuation_metrics). When the ratio data carries an SEC accepted_at timestamp, this filters point-in-time by accepted_at (returns the latest value knowable on or before the date, zero look-ahead, _meta.pit_safe=true). When it does not (current data), it filters by ratio.period_end and is not strict point-in-time (_meta.pit_safe=false). For guaranteed accepted_at-based PIT, use get_company_fundamentals.
categoriesNoRatio categories to include. Omit to return all categories this ticker has. Served: profitability, liquidity, leverage, efficiency, per_share, owner_earnings, forensic, growth, rank. valuation is accepted but NOT yet populated for any ticker (price feed pending) — it returns a CATEGORY_NOT_AVAILABLE envelope, not data. Availability is per-ticker; the envelope lists this ticker's available_categories.
fiscal_periodNoFilter to a specific fiscal period type. Use 'TTM' for trailing twelve months. Omit to return both annual (FY) and TTM rows.
period_end_beforeNoAlias of as_of_date — either works; as_of_date is preferred (it is the canonical name used across every time-series tool). Returns only ratios with period_end on or before this date.

Output Schema

ParametersJSON Schema
NameRequiredDescription
dataYes
noteYes
planYes
_metaYesProvenance envelope — data lineage for every MCP response
tickerYes
lineageNoProvenance for pipeline-derived values (ratio.parquet / factor_scores.parquet): source table + pipeline computed_at, plus a pointer to the tools that return filing-level lineage. NOT point-in-time (recomputed on each pipeline run).
periods_returnedYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, and idempotentHint=true. The description goes far beyond by detailing point-in-time semantics (as_of_date behavior with accepted_at vs. period_end, _meta.pit_safe flags), pipeline recomputation (ON CONFLICT DO UPDATE), null value handling (reason field), and the CATEGORY_NOT_AVAILABLE envelope for missing categories. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is long but well-organized: it starts with purpose, lists categories, adds caveats, then discusses filters, PIT, and sibling comparisons. Every sentence adds value, but some redundancy (e.g., repeated mention of 'valuation' not populated) could be trimmed. Front-loaded with the primary action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, output schema exists, multiple edge cases), the description covers everything: available categories, missing category handling (CATEGORY_NOT_AVAILABLE), calendar alignment, point-in-time semantics, null reasons, lineage, and plan availability. No gaps identified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, but the description adds substantial extra meaning: it explains that `categories` must exclude 'valuation' because it is not yet populated, clarifies `as_of_date` PIT semantics, and identifies `period_end_before` as an alias. This enhances the agent's understanding beyond the schema definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with 'Get pipeline-computed financial ratios from ratio.parquet,' clearly stating the verb and resource. It enumerates specific categories and distinguishes itself from siblings like 'get_valuation_metrics' (which includes DCF/DDM) and 'get_company_fundamentals' (for guaranteed accepted_at PIT). This provides unambiguous purpose and differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly tells the AI when to use this tool vs. alternatives: 'Use this *instead of* `get_valuation_metrics` when you only need ratios (no DCF wiring); use `get_valuation_metrics` when you also need DCF/DDM.' It also advises using `get_company_fundamentals` for guaranteed accepted_at-based PIT. Additionally, it provides guidance on filtering `is_calendar_aligned` for joining with fact tables.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_insider_sentimentInsider Sentiment (composite)A
Read-onlyIdempotent
Inspect

Role-weighted insider sentiment score on a fixed [-100, +100] scale for a single issuer over a lookback window. Role weights: CEO/CFO = 3.0 (via officer_title pattern), other NEO Officer = 2.0, 10%-Owner = 1.5, Director = 1.0. P = +1, S = -1; option exercises, grants, and tax withholdings are neutralised. Cluster flag = TRUE when ≥3 distinct insiders transacted within any 30-day window inside the lookback. Institutional tier only.

ParametersJSON Schema
NameRequiredDescriptionDefault
tickerYesIssuer ticker symbol.
lookback_daysNoDays back from today to scan transactions for. Default 180.
cluster_window_daysNoSliding window for the cluster_flag detection. Default 30 days.

Output Schema

ParametersJSON Schema
NameRequiredDescription
cikYes
_metaYesProvenance envelope — data lineage for every MCP response
tickerYes
buy_countYes
sell_countYes
cluster_flagYes
company_nameYes
lookback_daysYes
total_buy_usdYes
total_sell_usdYes
sentiment_scoreYes
top_contributorsYes
total_buy_sharesYes
total_sell_sharesYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and idempotentHint=true. The description adds behavioral details: role weights, transaction neutralization, and cluster flag logic, which are beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a concise paragraph that front-loads the main purpose (score on a fixed scale) and uses each subsequent sentence to add necessary detail without verbosity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists, the description sufficiently covers the tool's behavior: scale, weights, neutralization, and cluster flag. No major gaps remain for its intended use as a composite sentiment tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, but the description adds meaning by explaining how lookback_days and cluster_window_days are used in the sentiment calculation (e.g., lookback window for scanning, sliding window for cluster flag).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool computes a role-weighted insider sentiment score on a fixed [-100, +100] scale for a single issuer over a lookback window. It distinguishes itself from siblings like get_insider_transactions by being a composite score rather than raw transactions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description specifies 'Institutional tier only,' which is a usage constraint. It implies use for sentiment analysis, but lacks explicit when-not-to-use or alternative tools beyond what sibling names suggest.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_insider_transactionsInsider TransactionsA
Read-onlyIdempotent
Inspect

Form 3 / 4 / 5 / 144 line items for a US public company. Returns each transaction (or initial holding / proposed sale) with the insider's name, role, transaction code, share count, price, and notional. Filters by lookback window, transaction code (P=purchase, S=sale, A=grant, M=option exercise, F=tax withholding, etc.), insider role, and minimum share threshold. Institutional tier only — sample / sp500 / pro return ENTITLEMENT_DENIED with an upgrade link.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMaximum rows to return. Default 100, max 500.
tickerYesStock ticker symbol, e.g. AAPL, MSFT, BRK.B
roles_inNoInsider roles to keep. Omit to include any role.
as_of_dateNoPoint-in-time date (YYYY-MM-DD). Only returns transactions with accepted_at <= this date — eliminates look-ahead bias. When set, lookback_days is ignored.
min_sharesNoMinimum |shares| per transaction. Omit for no floor.
lookback_daysNoHow many days back from today to scan transactions for. Ignored when as_of_date is set.
lineage_detailNoPer-row provenance envelope. 'compact' (default) returns source_filing + source_url. 'full' adds accepted_at. 'off' omits lineage.compact
transaction_codesNoSEC transaction codes to keep (uppercase, single-letter): P=purchase, S=sale, A=grant, M=option exercise, F=tax withholding, G=gift, J=other. Omit to include all codes.

Output Schema

ParametersJSON Schema
NameRequiredDescription
cikYes
rowsYes
_metaYesProvenance envelope — data lineage for every MCP response
tickerYes
company_nameYes
data_age_daysYes
staleness_warningYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate read-only, idempotent, non-destructive. The description adds context about tier limitations and the effect of lineage_detail parameter, going beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (3 sentences) with key information front-loaded (what the tool does, returns, and filters). No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 8 parameters, 100% schema coverage, and an output schema, the description fully covers necessary context including tier restrictions and filter behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with good descriptions. The tool description repeats some code explanations but does not add new meaning beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns Form 3/4/5/144 line items for US companies, specifying the included fields (insider name, role, transaction code, etc.). It distinguishes from siblings like get_insider_sentiment by focusing on raw transaction data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly notes the institutional tier restriction and that lower tiers get ENTITLEMENT_DENIED. However, it does not directly tell when to use this tool versus related tools like get_insider_sentiment.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_institutional_holdingsInstitutional Holdings (by issuer)A
Read-onlyIdempotent
Inspect

Returns top-N institutional holders of a US public company at a specific period_end (latest by default), with aggregate institutional shares, total market value, holder count, and HHI concentration (sum of squared share-of-total percentages). Sourced from Form 13F-HR via the by-issuer partition. Institutional tier only. 13F filings carry a ~45-day reporting lag — staleness_warning fires when latest data is older than 90 days.

ParametersJSON Schema
NameRequiredDescriptionDefault
top_nNoMaximum holders to return, ranked by market_value_usd. Default 25, max 200.
tickerYesStock ticker symbol of the issuer.
as_of_dateNoPoint-in-time cutoff (YYYY-MM-DD): only 13F filings ACCEPTED by SEC on or before this date are considered, applied BEFORE the latest period is resolved. A 13F/A amendment or late filing accepted after this date is excluded (zero look-ahead) — use this for survivorship-free backtests. Omit for the latest knowable book.
period_endNoQuarter-end of the 13F reporting period (YYYY-MM-DD). Omit to use the latest period available. This is a REPORTING period, NOT a point-in-time cutoff — use as_of_date for that.
lineage_detailNoPer-row provenance envelope. compact / full / off.compact

Output Schema

ParametersJSON Schema
NameRequiredDescription
cikYes
rowsYes
_metaYesProvenance envelope — data lineage for every MCP response
tickerYes
aggregateYes
as_of_dateYesThe point-in-time cutoff actually applied (echo of the as_of_date input). Null when no PIT cut was requested — never confuse this with the reporting period_end.
period_endYesThe 13F REPORTING period the rows belong to — NOT a point-in-time cutoff.
company_nameYes
data_age_daysYes
holders_countYes
hhi_concentrationYes
staleness_warningYes
total_market_value_usdYes
options_positions_countYesOption positions (put_call set) excluded from totals/HHI/rows. rows[] are common-stock 13F holdings only.
total_institutional_sharesYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds significant behavioral context beyond annotations: staleness_warning fires when data is older than 90 days, source (Form 13F-HR), ~45-day reporting lag, and partition details. Annotations already declare readOnlyHint, idempotentHint, and destructiveHint, so the description enriches transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: three sentences that front-load the core purpose, then add data source and staleness information. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (5 parameters, output schema exists), the description covers all necessary context: purpose, data source, parameter nuances, staleness warning, and reporting lag. It is sufficient for an agent to understand when and how to use the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, providing a baseline of 3. The description adds meaningful context for parameters: it explains HHI concentration, the distinction between as_of_date and period_end, and the usage of as_of_date for backtests. This exceeds the baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns top-N institutional holders of a US public company with aggregate metrics, sourced from Form 13F-HR via the by-issuer partition. It distinguishes from siblings like get_blockholders and get_top_holders by specifying 'institutional tier only' and the data source.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context on when to use this tool (e.g., for institutional holdings data, with staleness_warning for old data) and explains the reporting lag. However, it does not explicitly state when not to use it or mention alternatives, though sibling names provide some guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_manager_portfolioManager Portfolio (13F by filer)A
Read-onlyIdempotent
Inspect

Returns a 13F filer's full portfolio at a specific period_end (latest by default), with QoQ deltas vs the prior quarter (new / increased / decreased / exited / unchanged). Specify the filer either by filer_cik (preferred) or filer_name (fuzzy match against entity.name; multiple matches raise an ambiguity error so you can disambiguate by CIK). Institutional tier only.

ParametersJSON Schema
NameRequiredDescriptionDefault
top_nNoMaximum positions to return, ranked by market_value_usd. Default 25.
filer_cikNoCIK of the 13F filer (1-10 digits; will be zero-padded to 10). Preferred over filer_name when known.
as_of_dateNoPoint-in-time cutoff (YYYY-MM-DD): only 13F filings ACCEPTED by SEC on or before this date are considered, applied BEFORE the latest + prior periods (and the QoQ basis) are resolved. A 13F/A amendment or late filing accepted after this date is excluded (zero look-ahead). Omit for the latest knowable portfolio.
filer_nameNoFiler name to fuzzy-match against entity.name. Case-insensitive substring match. Multiple matches raise INVALID_ARGUMENT — use filer_cik in that case.
period_endNoQuarter-end (YYYY-MM-DD). Omit to use latest available. This is a REPORTING period, NOT a point-in-time cutoff — use as_of_date for that.
lineage_detailNoPer-row provenance envelope.compact

Output Schema

ParametersJSON Schema
NameRequiredDescription
rowsYes
_metaYesProvenance envelope — data lineage for every MCP response
aggregateYes
filer_cikYes
as_of_dateYesThe point-in-time cutoff actually applied (echo of the as_of_date input). Null when no PIT cut was requested — never confuse this with the reporting period_end.
filer_nameYes
period_endYesThe 13F REPORTING period the positions belong to — NOT a point-in-time cutoff.
data_age_daysYes
positions_countYes
prior_period_endYes
staleness_warningYes
total_market_value_usdYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true, idempotentHint=true, destructiveHint=false, and the description adds detailed behavioral context: QoQ deltas, as_of_date as a point-in-time cutoff excluding amendments after that date, and error handling for ambiguous filer_name. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences with clear structure: first sentence states core functionality, second provides parameter details. Every sentence adds value, no wasted words. Front-loaded with key purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, QoQ deltas, two date semantics, output schema exists), the description covers all necessary context: identifier options, date behaviors, error cases, and access restrictions. It is complete for an AI agent to understand when and how to invoke it.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, and the description adds meaning beyond schema by explaining preferred identifier (filer_cik), fuzzy matching behavior of filer_name, and clarifying the role of as_of_date vs period_end. The description does not redundantly list parameter defaults.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns a 13F filer's portfolio at a specific period_end with QoQ deltas, specifying the resource, verb, and scope. It distinguishes from siblings like get_institutional_holdings by focusing on a single manager's portfolio with quarter-over-quarter changes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides usage guidance: prefers filer_cik over filer_name, explains that multiple filer_name matches raise an error, and notes the 'Institutional tier only' access restriction. However, it does not explicitly state when to use this tool versus alternatives like get_institutional_holdings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_peer_comparablesPeer ComparablesA
Read-onlyIdempotent
Inspect

Get ratio-based peer comparison for a company and its closest competitors. Peers are selected by matching 2-digit SIC industry code. Returns pipeline-computed ratios from up to 10 peers alongside the subject company for direct benchmarking. Ratio categories: profitability, liquidity, leverage, efficiency, per_share, owner_earnings, valuation. TTM (trailing twelve months) ratios are used when available for the most current view. Use as_of_date to compare peers at a specific historical date. PIT semantics for the figure leg are data-driven: when the ratio data carries an SEC accepted_at timestamp, as_of_date filters point-in-time by accepted_at (zero look-ahead, _meta.pit_safe=true); when it does not (today's data), the cut is by ratio.period_end (_meta.pit_safe=false). NOTE: peer SELECTION still uses CURRENT S&P 500 membership as a size/relevance ranking proxy regardless of as_of_date (W3-G2). Available on every plan — sample returns the subset covered by the sample bucket.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMaximum number of peers to return alongside the subject company (1–10). Defaults to 5.
tickerYesSubject company ticker, e.g. AAPL. Peers are auto-selected by SIC code.
as_of_dateNoHistorical cutoff (canonical cross-tool date param — same name as get_company_fundamentals / get_valuation_metrics) for the figure leg. When the ratio data carries an SEC accepted_at timestamp, this filters the figures point-in-time by accepted_at (latest-knowable per ratio, zero look-ahead, _meta.pit_safe=true). When it does not (current data), it filters by ratio.period_end (_meta.pit_safe=false). Peer SELECTION still uses current S&P 500 membership as a ranking proxy regardless of as_of_date (W3-G2).
categoriesNoRatio categories to include in the comparison. Defaults to profitability, valuation, and leverage.
period_end_beforeNoAlias of as_of_date — either works; as_of_date is preferred (canonical name used across every time-series tool). Only include ratios with period_end on or before this date.

Output Schema

ParametersJSON Schema
NameRequiredDescription
dataYesOne row per company (subject + peers): ticker, cik, name, sector, industry, is_subject, ratios
noteNo
_metaYesProvenance envelope — data lineage for every MCP response
lineageNoProvenance for pipeline-derived values (ratio.parquet / factor_scores.parquet): source table + pipeline computed_at, plus a pointer to the tools that return filing-level lineage. NOT point-in-time (recomputed on each pipeline run).
subjectYesSubject ticker the peer set is built around
as_of_dateNo
categoriesYesRatio categories included in each peer panel
peers_returnedYes
subject_ratiosNoThe subject company's ratio panel
period_end_beforeNo
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Despite strong annotations (readOnlyHint, idempotentHint, etc.), the description adds significant behavioral context: PIT semantics for as_of_date, the _meta.pit_safe flag explanation, and the note about peer selection being current regardless of date. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively long but well-structured, front-loading the core purpose and then detailing method, categories, date behavior, and caveats. Every sentence adds value, though some details could be more concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (peer selection logic, PIT semantics, multiple categories), the description covers all necessary aspects. An output schema exists, so return values are not needed. The description is comprehensive and leaves no obvious gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds cross-tool context (as_of_date is canonical across time-series tools) and clarifies the period_end_before alias, enhancing understanding beyond the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves ratio-based peer comparison for a company and its closest competitors, selected by 2-digit SIC code and S&P 500 membership. This distinguishes it from sibling tools like get_financial_ratios (single company) or get_company_fundamentals (fundamental data).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use this tool (benchmarking against peers) and includes limitations (peer selection uses current S&P 500 membership, not historical). It implies when not to use it (e.g., single-company ratios) but does not explicitly list alternatives, though context from sibling tool names suffices.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_pit_universePoint-in-Time UniverseA
Read-onlyIdempotent
Inspect

Use this tool to answer questions about historical index membership — e.g. "Was Company X in the S&P 500 on date Y?" or "Which companies were in the Russell 2000 on 2010-01-01?" Use this INSTEAD OF search_companies when the question involves a specific historical date or asks whether a company was an index member at a point in the past. search_companies only returns current membership snapshots and cannot answer historical membership questions.

Returns a survivorship-free universe of companies valid on a specific as_of_date: only companies that existed and were index members on that exact date — no hindsight contamination. Supports SP500, RUSSELL1000, RUSSELL2000, RUSSELL3000 via index_membership.parquet (accurate join/leave dates with [) interval semantics). To check a single company's membership, pass its ticker and the target date; if the company appears in the response it was a member, if absent it was not.

Returns per company: CIK, ticker, name, sector, industry, SIC code, plus per-row membership confidence (high/medium/low). Check _meta.pit_safe: true only when every matched row is high-confidence; medium/low rows downgrade it to false — treat low-confidence rows with caution for backtest use.

NOTE: sector is SIC-derived (GICS-aligned labels via sic_to_sector.csv), not licensed GICS — industrial conglomerates may map differently. Treat as a screening bucket, not an authoritative GICS label.

Use as the first step of a quantitative backtest before calling get_compute_ready_stream to pull Parquet data for the universe. Returns empty array (with error detail) if the date is out of range or the index_membership data has no coverage for that date. Available on every plan — sample tier returns the subset covered by the sample bucket.

ParametersJSON Schema
NameRequiredDescriptionDefault
indexNoIndex filter. 'sp500' (~500 large caps), 'russell1000' (~1000 large/mid), 'russell2000' (~2000 small caps), 'russell3000' (~3000 broad market). Omit for no index filter (sector-only or full universe queries).
limitNoMaximum number of companies to return (1–3500). Defaults to 100. Sized to fit Russell 3000 (~3000 members) plus headroom for delisted/historical members. Universe is deduped to one row per CIK by default, so set near the index size (SP500 ~505, Russell 1000 ~1010, Russell 3000 ~3050) — no need for share-class headroom.
sectorNoSector filter (case-insensitive substring match) over the SIC-derived, GICS-aligned sector label. E.g. 'Technology', 'Health Care', 'Energy'. Not licensed GICS — see tool description for caveat.
is_activeNoFilter to active (currently trading) companies only. Omit to include all. WARNING: setting this to true on a HISTORICAL query reintroduces survivorship bias — companies that were active on as_of_date but later went bankrupt or got acquired will be filtered out. Leave unset for true PIT backtests.
as_of_dateNoHistorical date (YYYY-MM-DD) for survivorship-free universe construction. For an index: uses index_membership join/leave dates — companies that entered after this date are excluded; companies later removed are kept. For sector: uses security valid_from/valid_to. Omit for the current universe.
as_of_basisNoWhich date column to filter on for historical universe construction. 'effective' (default) — match against effective_date / removal_date (first trading day as a member; passive index replication). 'announcement' — match against announcement_date / removal_announcement_date (the day S&P publicly announced the change; use for inclusion-arb backtests). 'announcement' rows with NULL announcement_date are silently skipped — pre-2015 changes generally lack press-release coverage.effective
include_share_classesNoWhen false (default), the universe is collapsed to one row per CIK — matches how index providers count constituents (BRK is one S&P 500 member, not two for BRK-A + BRK-B). When true, every share-class row is returned (GOOG and GOOGL emit separate rows, etc.). Use only for security-level (not company-level) analysis.

Output Schema

ParametersJSON Schema
NameRequiredDescription
noteYes
_metaYesProvenance envelope — data lineage for every MCP response
indexYes
sectorYes
companiesYes
as_of_dateYes
as_of_basisYes
coverage_gapYes
universe_sizeYes
survivorship_freeYes
confidence_summaryYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Disclosures include survivorship-free construction, confidence levels, pit_safe meta, sector mapping caveats, and empty array on error. These add significant context beyond the annotations, which only declare readOnly/idempotent/destructive hints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is thorough and well-structured, with front-loaded usage and detailed caveats. While not excessively long, there is minor redundancy (e.g., repeating 'survivorship-free'), but every sentence serves a purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 7 optional parameters, no required params, full schema coverage, output schema, and rich annotations, the description fully covers edge cases (date out of range, low confidence, sample tier limits) and integrates with related tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Adds meaning beyond the 100% covered schema: explains index values in context, as_of_date semantics (join/leave intervals), as_of_basis (effective vs announcement with NULL handling), and include_share_classes (CIK dedup vs per-share-class). No parameter lacks explanation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: answering historical index membership questions, with specific examples like 'Was Company X in the S&P 500 on date Y?'. It distinguishes itself from the sibling tool 'search_companies' by noting that the latter only returns current snapshots.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit guidance is provided: 'Use this INSTEAD OF `search_companies` when the question involves a specific historical date'. It also advises using it as the first step before `get_compute_ready_stream` and warns against setting `is_active=true` on historical queries due to survivorship bias.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_reportGet Research ReportA
Read-onlyIdempotent
Inspect

Fetch the current HEAD of a report by id. format=markdown returns the rendered body, format=json returns the full structured payload (sections + citations + report-type-specific data), format=preview returns abstract-only. Authors see any of their own reports; non-authors only get preview of listed reports and need the report's required tier for full bodies. Sample-tier non-authors are downgraded to preview regardless of input. For an archived prior version use get_report_version, not this tool.

ParametersJSON Schema
NameRequiredDescriptionDefault
formatNoResponse shape. Defaults to markdown.markdown
report_idYesId from `create_report` or `list_my_reports`.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
formatYes
reportYes
markdownYes
sectionsYes
citationsYes
structuredYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnly, idempotent), description details format behaviors (markdown, json, preview) and access restrictions, adding significant context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with core purpose and format details, then access rules and disambiguation. No redundant words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all aspects: purpose, parameter formats, access control, disambiguation, and behavior. Output schema exists, but description still adds value on return shapes.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage, but description adds meaning for format parameter by explaining exact return for each enum value, going beyond the schema's brief description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description starts with 'Fetch the current HEAD of a report by id,' which is a specific verb and resource. It clearly distinguishes itself from sibling get_report_version for archived versions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use alternative tool ('For an archived prior version use get_report_version') and describes access control rules for authors vs non-authors and sample-tier downgrade.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_report_versionGet Report VersionA
Read-onlyIdempotent
Inspect

Author-only fetch of a specific archived version of one of your reports, by positive-integer version. Returns metadata + the full payload (sections, citations, structured, markdown) — enough to render a diff against the current HEAD in the workspace editor. Use after list_report_versions identifies the version number you want; for the current HEAD use get_report instead.

ParametersJSON Schema
NameRequiredDescriptionDefault
versionYesVersion number to fetch (from list_report_versions).
report_idYesIdentifier of the report whose archived version to fetch, as returned by create_report or list_my_reports.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
payloadYes
versionYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds beyond annotations by specifying 'Author-only' auth requirement and that returns metadata plus full payload for rendering diffs. Annotations already indicate readOnly=true and destructive=false, so the description complements well.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences, no filler, front-loaded with the core action. Every sentence adds essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema is present, the description covers usage context, auth, return content, and sequential relationship to sibling tools. Fully complete for this tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description does not add substantial new parameter details beyond what schema provides, though it reinforces the integer constraint and linkage to list_report_versions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool fetches a specific archived version of a report, using a positive-integer version number. It distinguishes from siblings by noting it is for archived versions only, and contrasts with get_report for the current HEAD.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly instructs to use after list_report_versions to get the version number, and to use get_report instead for the current HEAD. This provides clear when-to-use and when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_smart_money_flowSmart Money Flow (composite)A
Read-onlyIdempotent
Inspect

Composite flow score on [-100, +100] aggregating insider transactions, 13F institutional Δ-shares vs the prior quarter, and SC 13D/13G blockholder changes over a lookback window. Each component normalised independently, then combined with configurable weights (default: institutional 0.4, blockholder 0.4, insider 0.2). Returns per-component attribution so an agent can see WHY the score is what it is — not just the headline number. NOTE: the institutional component is a QoQ share-change signal computed over the top-5 13F filers on a MATCHED current-vs-prior basis (a filer only counts when its prior-quarter book is observable), NOT the issuer's complete institutional book — treat the score as a directional signal, not an exact flow. coverage.coverage_confidence (0–1) reports how much of that basis had a real prior quarter; when it is 0 the institutional component is forced to 0 so a 13F ingestion gap can never surface as a false max-conviction buy. See the coverage block for holder coverage + staleness. The score is a unitless composite, not a dollar figure. Institutional tier only.

ParametersJSON Schema
NameRequiredDescriptionDefault
tickerYesIssuer ticker symbol.
as_of_dateNoPoint-in-time cutoff (YYYY-MM-DD) applied to all three legs (institutional, insider, blockholder) via SEC accepted_at — filings accepted after this date are excluded so the composite is computed with zero look-ahead. Omit for the latest knowable signal.
lookback_daysNoLookback window for insider + blockholder components. Default 90.
weight_insiderNoWeight applied to the insider component (0–1).
weight_blockholderNoWeight applied to the blockholder component (0–1).
weight_institutionalNoWeight applied to the institutional component (0–1).

Output Schema

ParametersJSON Schema
NameRequiredDescription
cikYes
_metaYesProvenance envelope — data lineage for every MCP response
tickerYes
weightsYes
coverageYesHonesty block: the institutional signal is computed from a top-N 13F slice with a top-5-filer matched basis. Surfaces holder coverage + staleness so the composite is never read as the issuer's complete book.
as_of_dateYesThe point-in-time cutoff actually applied (echo of the as_of_date input). Null when no PIT cut was requested — never the reporting period_end fabricated as a cutoff.
componentsYes
period_endYesThe institutional 13F REPORTING period — NOT a point-in-time cutoff.
company_nameYes
composite_scoreYes
insider_componentYes
blockholder_componentYes
institutional_componentYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Despite annotations already indicating read-only and non-destructive behavior, the description adds significant behavioral detail: composite score range, normalization and weighting per component, per-component attribution, the institutional component's matched top-5 basis, coverage confidence impact, and that the score is unitless. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is thorough but efficient, with the main purpose front-loaded. Every sentence adds necessary context (methodology, limitations, coverage). Could be slightly more structured, but it avoids verbosity given the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the composite nature, configurable weights, coverage metrics, and presence of an output schema, the description covers all essential aspects: score range, components, weighting, attribution, limitations, and coverage confidence. An agent can reliably understand the tool's behavior and constraints.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (baseline 3). The description adds value by explaining the default weights, the lookback window's scope (insider+blockholder), and the as_of_date's purpose (point-in-time cutoff). It also clarifies the institutional component's behavior beyond schema types.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it computes a composite flow score on [-100, +100] aggregating insider transactions, institutional changes, and blockholder changes. It distinguishes itself from siblings like get_insider_transactions, get_institutional_holdings, and get_blockholders by being a composite. The verb 'get' and resource 'smart money flow' are specific.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use (to get a composite smart money score) and provides important caveats (directional signal, not exact flow, institutional limitation). However, it does not explicitly state when not to use or list alternatives for individual components, though it implies them by naming the sub-signals.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_thesisGet Saved ThesisA
Read-onlyIdempotent
Inspect

Fetch a single saved thesis by its id. Returns the full record including outcome (if scored). Returns NOT_FOUND if the id is unknown or belongs to another user. For the claims composing a thesis use list_claims_for_thesis; for an individual claim use get_claim. Tier: paid + free (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
thesis_idYesId returned by `save_thesis` or `list_theses`.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
thesisYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate read-only, idempotent, non-destructive. Description adds specific error behavior (NOT_FOUND if id unknown or belongs to another user) and tier constraint, which is valuable beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, each serving a distinct purpose: function, sibling guidance, and tier info. No extraneous words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple one-parameter read-only tool with an output schema, the description covers purpose, error handling, and related tools. It is self-contained and sufficient for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a clear description for thesis_id. The tool description adds context by specifying that the id is returned by save_thesis or list_theses, which helps the agent understand how to obtain the parameter value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it fetches a single saved thesis by id, specifies the return content (full record with outcome if scored) and error case (NOT_FOUND). It also distinguishes from siblings like list_claims_for_thesis and get_claim.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly provides when to use this tool versus alternatives: 'For the claims composing a thesis use list_claims_for_thesis; for an individual claim use get_claim.' Also mentions tier restrictions, guiding the agent on accessibility.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_top_holdersTop Holders (composite, classified)A
Read-onlyIdempotent
Inspect

Classification-aware UNION across insider transactions (latest post_transaction_shares per insider), 13F institutional holdings, and SC 13D / 13G blockholder filings for one issuer. Each row carries holder_class ∈ {insider, institutional, blockholder_13D, blockholder_13G}. Dedupes overlapping filers by precedence (13D > 13G > institutional > insider). One call, classified cap table — Bloomberg charges separately for INSIDER, OWNER, and HDS; this consolidates them.

ParametersJSON Schema
NameRequiredDescriptionDefault
top_nNoMaximum holders to return, ranked by shares. Default 25.
tickerYesIssuer ticker symbol.
as_of_dateNoPoint-in-time cutoff (YYYY-MM-DD): only filings ACCEPTED by SEC on or before this date are considered across all three sources (institutional via accepted_at, insider via accepted_at, blockholders via accepted_at). Excludes amendments/late filings accepted after this date (zero look-ahead). Omit for the latest knowable cap table.
period_endNo13F REPORTING period_end. Omit for latest. NOT a point-in-time cutoff — use as_of_date.

Output Schema

ParametersJSON Schema
NameRequiredDescription
cikYes
rowsYes
_metaYesProvenance envelope — data lineage for every MCP response
tickerYes
stalenessYesEach source has its own as-of date and lag (13F ~45-day lag; 13D/G snapshots can be years old). Percentages from different-dated denominators are NOT directly comparable.
as_of_dateYesThe point-in-time cutoff actually applied (echo of the as_of_date input). Null when no PIT cut was requested. NEVER equal to period_end unless explicitly supplied — a reporting period is not a knowable-as-of date.
period_endYesThe institutional 13F REPORTING period — NOT a point-in-time cutoff.
company_nameYes
sources_breakdownYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds significant behavioral context beyond the annotations (readOnlyHint, idempotentHint, destructiveHint). It explains deduplication precedence (13D > 13G > institutional > insider), point-in-time cutoff semantics for as_of_date, and distinguishes period_end as a 13F reporting period parameter.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with the core purpose. The first sentence defines the composite nature, the second adds deduplication and comparison to Bloomberg. Every sentence earns its place with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the rich annotations (readOnlyHint, idempotentHint), 100% schema coverage, and presence of an output schema, the description is complete. It covers the tool's purpose, parameters, and key behavioral nuances, enabling an AI agent to use it correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining the as_of_date parameter's point-in-time behavior across all sources, that omitting it gives the latest knowable cap table, and the distinction between as_of_date and period_end. This enriches the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool returns a 'classification-aware UNION across insider transactions, 13F institutional holdings, and SC 13D / 13G blockholder filings for one issuer.' It uses specific verbs and resources, and distinguishes itself from siblings by noting it consolidates data that Bloomberg charges separately for.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use this tool (when a consolidated cap table with holder classifications is needed) and contrasts it with Bloomberg's separate screens. However, it does not explicitly mention alternatives among sibling tools like get_blockholders or get_insider_transactions, nor does it state when not to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_valuation_metricsValuation MetricsA
Read-onlyIdempotent
Inspect

Get comprehensive valuation and profitability metrics for a US public company. Returns per-period data combining computed ratios (gross_margin, operating_margin, net_margin, ROE, ROA, ROIC, debt_to_equity, FCF, FCF margin) with optional pre-computed DCF model inputs (WACC, fcf_base_per_share, stage1_growth_rate, terminal_growth_rate, dcf_value_per_share, ddm_value_per_share). Profitability + cash flow + leverage fields are sourced from fact.parquet (PIT-safe via accepted_at). DCF/DDM fields are sourced from valuation.parquet (pipeline-computed; recomputed each pipeline run, so NOT strictly PIT-safe). DCF/DDM fields are commonly null — for newer tickers, transition periods, or before the valuation pipeline has run. Each null carries a null_reasons[field] entry with one of: VALUATION_NOT_COMPUTED, INPUT_MISSING, INPUT_NEGATIVE, PIPELINE_PENDING. ALWAYS check null_reasons before assuming a field is zero — null != 0. For agents needing strict PIT DCF reasoning, use the SDK or compute DCF inputs from get_company_fundamentals directly. Use this instead of get_financial_ratios when DCF/intrinsic value matters; use get_financial_ratios when you only need the ratio table without DCF wiring. Available on all plans.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMaximum number of periods to return (1–40). Defaults to 5.
periodNoFiling period granularity. Annual uses 10-K; quarterly uses 10-Q.annual
tickerYesStock ticker symbol, e.g. AAPL, MSFT
as_of_dateNoPoint-in-time date (YYYY-MM-DD). Only returns data with accepted_at on or before this date. Eliminates look-ahead bias for backtesting.
fiscal_yearNoFiscal year (YYYY). Omit to return most recent periods.

Output Schema

ParametersJSON Schema
NameRequiredDescription
dataYes
_metaYesProvenance envelope — data lineage for every MCP response
periodYes
tickerYes
as_of_dateYes
periods_returnedYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint, idempotentHint, and destructiveHint, but the description goes beyond by explaining data sources (fact.parquet vs valuation.parquet), null handling with null_reasons, and the non-PIT-safe nature of DCF fields. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded: main purpose, data sources, null handling, and usage alternatives. Every sentence adds value without redundancy. Length is appropriate for the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity, the presence of an output schema, and annotations, the description is complete. It covers data source reliability, null value interpretation, and provides fallback guidance. No gaps for an agent to misuse the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description does not add significant meaning beyond the schema's parameter descriptions. While it mentions that DCF fields are often null, this is behavioral context rather than parameter-specific; overall, the schema already adequately documents parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description starts with 'Get comprehensive valuation and profitability metrics for a US public company,' clearly stating the verb (get), resource (metrics), and scope (US public company). It explicitly differentiates from sibling tool get_financial_ratios by specifying when to use each.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance: use this tool instead of get_financial_ratios when DCF/intrinsic value matters, and use get_financial_ratios for ratio-only needs. It also advises using the SDK or get_company_fundamentals for strict PIT DCF reasoning, offering clear alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_watchlistGet WatchlistA
Read-onlyIdempotent
Inspect

Fetch a single watchlist (full ticker set + criteria) by its name, not an id (case-insensitive). NOT_FOUND if the name is unknown to this user. Tier: sp500+ (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
nameYesWatchlist name to fetch (case-insensitive, 1–80 chars).

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
watchlistYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false. Description adds that it returns full ticker set+criteria, specifies NOT_FOUND error, and tier restriction. Provides additional behavioral context: case-insensitive lookup and name-based retrieval. Could also mention if it returns anything else.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no filler. Front-loaded with purpose, then error and tier info. Each clause adds essential detail.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read tool with one parameter, description covers purpose, method, error behavior, and access restrictions. Could mention if output schema is paginated or has limits, but not necessary. Annotations and schema cover the rest. Nearly complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers parameter fully with description. Description in tool description also reiterates 'by its name (case-insensitive)'. No new semantic info beyond schema, but consistent. Baseline 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'Fetch a single watchlist' by name (not id), case-insensitive, with error behavior and tier restriction. Differentiates from list_watchlists and save_watchlist implicitly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Gives explicit usage by name (case-insensitive), error behavior, and tier restriction. Lacks explicit mention of alternatives like list_watchlists but implicitly distinguishes. Could be improved with direct reference to list_watchlists for name discovery.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_alert_inboxList Alert InboxA
Read-onlyIdempotent
Inspect

Newest-first listing of the caller's in-app alert inbox. Each item is a single FIRE of an alert with a dashboard channel — written by the cron evaluator (or test_alert); use list_alerts instead for the alert definitions themselves. By default dismissed items are hidden and read items are included. Cursor-paginated by fired_at. Sample tier rejected — alerts are a paid-tier feature (sp500+).

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMaximum number of inbox items to return (1–100). Defaults to 20.
cursorNoPagination cursor — the `fired_at` of the last item on the previous page.
unread_onlyNoWhen true, return only items where read_at IS NULL.
include_dismissedNoWhen true, also return items the caller previously dismissed.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
itemsYes
next_cursorYes
unread_countYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly, idempotent, non-destructive. The description adds transparency about cursor-pagination by fired_at, default filter behavior, and paid-tier restriction. No contradictions. Adds value beyond annotations, though could mention rate limits or auth scope.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences: first defines core purpose and scope, second adds differentiation and pagination details, third notes tier restriction. No wasted words, efficiently structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers purpose, usage, parameters, pagination, default filters, sibling comparison, and tier restriction. Output schema exists, so return values are handled. Complete for a read-only paginated list tool with good annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with all parameters described. The description does not add meaning beyond the schema except implicitly confirming cursor relates to fired_at. Baseline 3 is appropriate as schema already fully defines parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists the caller's in-app alert inbox in newest-first order, specifies each item is a single FIRE of an alert with a 'dashboard' channel, and distinguishes from list_alerts which returns alert definitions. This provides a specific verb and resource, effectively differentiating it from siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly tells when to use list_alerts instead (for alert definitions), notes default behavior (dismissed hidden, read included), mentions cursor-pagination by fired_at, and that alerts are a paid-tier feature. Provides clear usage context and alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_alertsList AlertsA
Read-onlyIdempotent
Inspect

Paginated newest-first listing of the caller's alerts (id, condition, channel, status, trigger_count, evaluator health). Filter by status (active/paused/deleted/all). Use the returned alert id with delete_alert or test_alert. Tier: sp500+ (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMaximum number of alerts to return (1–100). Defaults to 20.
cursorNoOpaque pagination cursor from a previous response; omit for the first page.
statusNoFilter by lifecycle state; defaults to `active`. Use `all` to include paused and soft-deleted alerts.active

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
alertsYes
next_cursorYes
total_countYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses pagination, newest-first ordering, filterable status, and returned fields. Annotations already indicate read-only, idempotent, non-destructive; description adds no contradictions and complements well.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences front-loading key information. No redundant filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given full input schema and output schema present, description covers pagination, filtering, and downstream usage. No missing context for correct usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 100% of parameters. Description adds value by explaining how cursor ties to pagination and that 'all' includes soft-deleted. Reinforces status filter but doesn't add beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it lists the caller's alerts with pagination, newest-first, and specific fields. Distinguishes from sibling tools like delete_alert, test_alert by mentioning returned id usage.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit filter by status and directs to use returned id with delete/severity operations. Mentions tier restriction (sp500+) but lacks explicit when-not-to-use scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_citation_overridesList Citation OverridesA
Read-onlyIdempotent
Inspect

Author-only newest-first listing of the caller's citation corrections. Filterable by ticker (e.g. all AAPL corrections) or by a single fact_id (returns 0 or 1 row). Pair with save_citation_override and delete_citation_override. Sample tier rejected.

Agent use: call with ticker to introspect what corrections the user has previously applied on that ticker — useful for system prompts that respect prior corrections during regeneration.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMaximum number of citation overrides to return (1–100). Defaults to 20.
cursorNoCursor from the previous response's `next_cursor` — the updated_at of the last row on that page. Omit for first page.
tickerNoOptional ticker filter, case-insensitive. Uppercased internally.
fact_idNoOptional fact_id filter — returns at most one row.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
overridesYes
next_cursorYes
total_countYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, destructiveHint. Description adds that listing is author-only and newest-first, and that fact_id returns 0 or 1 row, providing useful context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two paragraphs with front-loaded purpose and filter info. The phrase 'Sample tier rejected' is cryptic but not misleading. Overall efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given output schema exists and annotations cover safety, the description covers listing, filters, pagination (via cursor), pairing with other tools, and a practical agent use case. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, baseline 3. Description adds meaning by noting ordering (newest-first), case-insensitive ticker filter, and fact_id returning at most one row, plus a concrete use case for the ticker parameter.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Author-only newest-first listing of the caller's citation corrections' with filter options by ticker or fact_id. It distinguishes from siblings by mentioning pairing with save and delete overrides.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description provides a specific use case: 'call with ticker to introspect what corrections the user has previously applied on that ticker'. It also pairs with save/delete tools but doesn't explicitly exclude other scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_claimsList ClaimsA
Read-onlyIdempotent
Inspect

List the caller's saved claims, most-recent-first, with AND-composed filters and cursor pagination. Filter by ticker, claim_type (assertion/prediction/judgment), tag, or lifecycle status (open/confirmed/refuted/expired/stale/needs_review). Archived claims are excluded unless include_archived is set.

Tier: all paid + free tiers (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
tagNoFilter to claims carrying this topical tag.
limitNoPage size (max 100).
cursorNoPagination cursor from a previous page's next_cursor.
statusNoFilter by lifecycle status, or 'all'.all
tickerNoFilter to claims referencing this ticker.
claim_typeNoFilter by epistemic type.
include_archivedNoInclude soft-deleted claims.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
claimsYes
next_cursorYes
total_countYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnly, idempotent, non-destructive), description discloses critical behavior: sorting order, AND-composed filters, cursor pagination, archived exclusion logic, and tier access. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two dense sentences with no redundant words; front-loads the core action and filters. Tier note is succinctly appended.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 7 parameters, output schema exists, and annotations cover safety, the description sufficiently covers purpose, filtering, pagination, and tier constraints. No gaps for typical usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema already describes all parameters (100% coverage). Description adds overarching context about AND-composition and the default exclusion of archived claims, providing slight extra value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states action (list), resource (caller's saved claims), and key features (most-recent-first, AND-composed filters, cursor pagination). Distinguishes from siblings like list_claims_for_thesis and list_public_claims_by_user by specifying ownership.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Describes filtering capabilities and tier restrictions, which guide when it's applicable. However, lacks explicit 'when to use vs not use' statements or alternatives, leaving the agent to infer from context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_claims_for_thesisList Claims for ThesisA
Read-onlyIdempotent
Inspect

List the claims composing a thesis, each with its role (supports/refutes/context). This is how you read a thesis as the structured argument it is — its supporting and disconfirming claims with their current statuses. Archived claims are omitted. Tier: paid + free (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
thesis_idYesId of the thesis whose claims to list.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
itemsYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint=true, destructiveHint=false), the description adds that archived claims are omitted and mentions a tier restriction. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences plus a tier note. Front-loaded with the main action and key details. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the existence of an output schema and the tool's simplicity (one param, read-only), the description covers all essential aspects: what is listed, role/status inclusion, omission of archived, and tier restriction. Complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% for the single parameter. The tool description does not add additional meaning beyond what the schema already provides, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'List the claims composing a thesis' with specific verb and resource. It distinguishes from siblings like list_claims (generic) and get_claim (single) by specifying thesis context and output fields.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The phrase 'This is how you read a thesis as the structured argument it is' gives strong usage guidance. While no explicit when-not or alternatives are listed, the context is clear enough for an agent to infer appropriate use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_my_reportsList My Research ReportsA
Read-onlyIdempotent
Inspect

Cursor-paginated newest-first listing of the caller's own reports (owner-scoped). Filters compose with AND; status defaults to 'ready' so pass status='draft' or 'all' to see drafts. Use cursor from the previous response's next_cursor to fetch the next page (limit max 100). Sample tier rejected (no per-author state).

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoPage size.
cursorNoCursor from previous `next_cursor`.
statusNoFilter by status. Default 'ready' (excludes drafts + delisted).ready
tickerNoFilter to a single ticker (case-insensitive).
report_typeNoFilter by report type.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
reportsYes
next_cursorYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, idempotentHint, destructiveHint. Description adds pagination behavior (newest-first, cursor usage, max limit 100), filter composition (AND), default status behavior, and a limitation (sample tier rejected). No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, each adding distinct information: purpose, filtering, pagination. No redundant words. The sample tier mention is a bit abrupt but still efficient. Slight improvement could integrate the limitation more naturally.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema and annotations, the description covers all essential aspects: purpose, scope, ordering, pagination, default filters, cursor usage, limit, and a notable limitation. Nothing seems missing for effective tool usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, baseline 3. Description adds value by explaining filter composition ('AND'), the implication of status default ('excludes drafts'), and how to use cursor with limit. This goes beyond the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it's a 'cursor-paginated newest-first listing of the caller's own reports (owner-scoped)'. The verb 'listing' and resource 'reports' are specific, and 'owner-scoped' distinguishes it from sibling tools that list other users' reports or all reports.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description explains default status filter and how to see drafts, implying when to override. It mentions pagination with cursor and limit, guiding usage. However, it does not explicitly state when not to use this tool or name alternatives, which keeps it from a 5.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_public_claims_by_userList Public Claims by UserA
Read-onlyIdempotent
Inspect

Return the PUBLIC claims + claim-accuracy reputation for a user identified by Stripe customer_id. Used by the /[handle] profile to render an analyst's claim-level track record — a separate signal from thesis-outcome accuracy. Only visibility='public' claims surface; private state never leaks. Accuracy is confirmed/(confirmed+refuted) over resolved claims; null when n < 5. Sample tier rejected; sp500+ only.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMax public claims to return. Defaults to 20.
customer_idYesTarget user's Stripe customer_id (resolved by the frontend from the handle).

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
claimsYes
reputationYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint, idempotentHint, destructiveHint), the description reveals critical behaviors: only visibility='public' claims surface, accuracy formula (confirmed/(confirmed+refuted)), null condition (n < 5), and user tier restriction ('sp500+ only'). No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is succinct (two sentences plus a note) and front-loads the purpose. Every sentence adds essential information without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema (not shown but indicated as true), the description covers all necessary behaviors: data scope, accuracy calculation, null handling, and tier restriction. No gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage, baseline is 3. The description adds value by clarifying that customer_id is 'resolved by the frontend from the handle', which enriches understanding beyond the schema description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies the action ('Return'), the resource ('public claims + claim-accuracy reputation'), and the context (Stripe customer_id, analyst's claim-level track record). It distinguishes itself from siblings like 'list_claims' by being user-specific and public-only.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states it is used by the /[handle] profile and contrasts with thesis-outcome accuracy, implying the context. It does not explicitly list alternatives or when-not-to-use, but the purpose is clear enough for an agent to infer usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_public_theses_by_userList Public Theses by UserA
Read-onlyIdempotent
Inspect

Return the PUBLIC theses + reputation aggregate for a user identified by Stripe customer_id. Used by the /[handle] profile page to render an analyst's track record. Only entries with visibility='public' are surfaced — private theses never leak. Reputation is correct/(correct+wrong) over graded theses; null when n < 5 (sample too small). Sample tier rejected; sp500+ only.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMax public theses to return. Defaults to 20.
customer_idYesTarget user's Stripe customer_id (resolved by the frontend from the handle).

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
thesesYes
reputationYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and idempotentHint=true. Description adds critical behavioral details: only public theses are surfaced (visibility='public'), reputation formula explicitly given, null condition for small samples (n<5), and note about sample tier (rejected; sp500+ only). These go beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four concise sentences, each adding unique value. No redundant information. Front-loaded with purpose, followed by use case, visibility constraint, and reputation explanation.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With annotations and an output schema present, the description adequately covers behavior, constraints, and edge cases (sample tier, null condition). No gaps remain for this type of tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and both parameters (customer_id, limit) are well-described in the schema. Description does not add significant new meaning beyond the schema, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'Return the PUBLIC theses + reputation aggregate for a user identified by Stripe customer_id.' Distinguishes from sibling tools like list_theses and list_public_claims_by_user by specifying it returns both theses and reputation for a specific user.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states 'Used by the /[handle] profile page to render an analyst's track record.' Provides context but does not mention when not to use or alternative tools. However, the purpose is clear enough to guide appropriate use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_report_versionsList Report VersionsA
Read-onlyIdempotent
Inspect

Author-only newest-first listing of a report's archived version history. Each entry summarises what changed (sections edited, etc.) so the workspace UI can render a clickable history without loading every artifact. Pair with get_report_version to fetch a specific version's content for diffing against HEAD.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMaximum number of archived versions to return (1–100). Defaults to 20.
cursorNoCursor from the previous response's `next_cursor` — the smallest version number on the previous page. Omit for the first page.
report_idYesIdentifier of the report whose version history to list, as returned by create_report or list_my_reports.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
versionsYes
next_cursorYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses author-only access, newest-first ordering, and that entries summarize changes. Adds context beyond annotations (which only indicate idempotence, read-only, non-destructive). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no extraneous text. Front-loaded with the essential action. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Output schema exists so return value details are covered. Annotations are present. Description covers behavior, usage pairing, and data structure (summaries). Fully complete for a list tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with comprehensive parameter descriptions. Description does not add additional parameter meaning beyond what schema provides. Baseline score appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it lists archived version history of a report, with author-only and newest-first ordering. Distinguishes from sibling get_report_version by noting it provides summaries rather than full content.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly recommends pairing with get_report_version for diffing. Implies use case for workspace UI history rendering. Could be stronger by stating when not to use (e.g., for fetching content).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_thesesList Saved ThesesA
Read-onlyIdempotent
Inspect

Return the caller's saved theses, newest-first. Filters: ticker (exact), view, status. Cursor-based pagination — pass next_cursor from the previous response to fetch the next page. Sample tier rejected (no per-user state).

ParametersJSON Schema
NameRequiredDescriptionDefault
viewNoFilter to a single view.
limitNoPage size, 1–100. Defaults to 20.
cursorNoPagination cursor returned by the previous `list_theses` call's `next_cursor`.
statusNo'active' (default) hides archived theses; pass 'all' to include them.active
tickerNoFilter to theses on this ticker (case-insensitive).

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
thesesYes
next_cursorYes
total_countYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, idempotent, non-destructive behavior. The description adds value by explaining cursor-based pagination and that the sample tier is rejected. It does not contradict any annotations. The additional behavioral details (e.g., 'newest-first') enhance transparency beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with three sentences that cover purpose, filters, pagination, and a usage note. Every sentence adds value, and the key information is front-loaded. No filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema and the tool's simplicity (list with filters and pagination), the description fully covers what the tool does, including the rejection in the sample tier. It is complete for an agent to understand how to use it effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description mentions filters (ticker, view, status) and pagination cursor, but these are already well-described in the schema. The description does not add significant new meaning to parameters; it mostly reiterates what is in the schema. The note about sample tier is not parameter-related.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns the caller's saved theses ordered newest-first, with filters. The verb 'return' and resource 'saved theses' are specific. The title 'List Saved Theses' also reinforces this. It is distinct from sibling tools which deal with other entities like claims or alerts.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions that the sample tier rejects the tool due to no per-user state, which is a usage restriction. However, it does not provide explicit guidance on when to use this versus other listing tools or when not to use it, leaving the agent with only implied context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_watchlistsList WatchlistsA
Read-onlyIdempotent
Inspect

Paginated newest-first listing of the caller's watchlists (id, name, tickers, status, counts). Filter by status (active/archived/all). Returns metadata only — use get_watchlist for one list's full ticker set, or watchlist_diff for new filings across a list. Tier: sp500+ (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMaximum number of watchlists to return (1–100). Defaults to 20.
cursorNoOpaque pagination cursor from a previous response; omit for the first page.
statusNoFilter by state; defaults to `active`. Use `all` to include archived watchlists.active

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
watchlistsYes
next_cursorYes
total_countYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations include readOnlyHint, idempotentHint, destructiveHint. The description adds pagination order (newest-first), filtering by status, and that it returns only metadata. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with purpose. Every sentence provides value: listing fields, filtering, tier restriction, and sibling distinctions.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all relevant aspects: what is returned, filtering, pagination, tier restriction, and how it relates to sibling tools. Output schema exists, so return values are documented elsewhere.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, with clear parameter descriptions. The description adds context about the 'newest-first' ordering not in the schema. However, it does not significantly extend beyond the schema for pagination parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists the caller's watchlists with specific fields (id, name, tickers, status, counts) and distinguishes from siblings (get_watchlist, watchlist_diff). The verb 'list' and resource 'watchlists' are precise.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly tells when to use this tool vs alternatives: 'Returns metadata only — use get_watchlist for one list's full ticker set, or watchlist_diff for new filings across a list.' Also mentions tier restriction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

mark_inbox_readMark Inbox Item ReadA
Idempotent
Inspect

Set read_at on a single inbox item by its id (from list_alert_inbox or the alerts feed resource) — not an alert id. Idempotent — re-marking does NOT reset the first-read timestamp; there is no unmark. Returns the new unread_count so the agent/UI can update its badge without a follow-up call. Tier: sp500+ (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
inbox_idYesIdentifier of the inbox item to mark read, as returned by list_alert_inbox or the alerts feed resource.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
inbox_idYes
marked_readYes
unread_countYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses idempotent behavior with no unmark, and that the first-read timestamp is not reset. Also specifies return value for badge update. Adds significant context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences efficiently describe purpose, constraints, behavior, and return value. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and output schema, the description covers all necessary aspects: purpose, usage, behavior, and return value.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter inbox_id has full schema description. The description adds explicit caution against using alert id, which complements the schema but does not introduce new semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool sets `read_at` on a single inbox item, distinguishes it from alert id, and specifies idempotent behavior. It differentiates from sibling tools like dismiss_inbox_item by specifying the action and return value.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: by inbox id from specific sources, not alert id. Implicitly excludes other contexts, but does not directly compare with sibling tools like dismiss_inbox_item.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

render_reportRender Report Download URLA
Idempotent
Inspect

Return a 15-minute presigned download URL for a report in the requested binary format.

format=md presigns the cached markdown — instant, no compute. format=docx returns a branded Word document with a cover page (logo, title, ticker, tier badge), the report body (abstract, sections, citations table with clickable SEC EDGAR links), and a back page (methodology, sources, disclaimer). The DOCX is cached in R2 alongside the markdown after first build so repeat downloads are instant; pass force_regenerate: true to bust the cache (e.g. right after update_report).

Tier gate mirrors get_report: authors always see their own reports; non-authors below the report's required tier get an upgrade prompt.

ParametersJSON Schema
NameRequiredDescriptionDefault
formatYesmd = raw markdown (the same body the editor renders). docx = branded Word document with cover + back page.
report_idYesId from create_report or list_my_reports.
force_regenerateNoIf true, ignore the cached DOCX and re-render. No effect on md (markdown is canonical).

Output Schema

ParametersJSON Schema
NameRequiredDescription
urlYes
_metaYesProvenance envelope — data lineage for every MCP response
formatYes
filenameYes
expires_atYes
from_cacheYes
size_bytesYes
content_typeYes
expires_in_secondsYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description goes beyond annotations by detailing that the URL is presigned for 15 minutes, caching behavior (docx cached after first build, md always instant), and that force_regenerate busts the cache. Annotations only indicate idempotent and non-destructive; the description adds depth on side effects and limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is about 150 words across four paragraphs, front-loaded with the core purpose. Each paragraph adds value without redundancy. Slightly verbose in docx details but remains efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 3 parameters, an output schema, and complex caching/format behavior, the description covers all agent needs: what it returns, format differences, cache invalidation, and access control. No gaps are evident.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds meaningful context: format enum values are explained with content specifics (cover page, ticker, citations), report_id uses authoritative sources (from create_report or list_my_reports), and force_regenerate clarifies cache behavior and its lack of effect on md.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns a 15-minute presigned download URL for a report in a chosen binary format. It distinguishes between 'md' (cached markdown, instant) and 'docx' (branded Word document), differentiating it from siblings like 'get_report' which likely returns content directly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on when to use each format (md vs docx) and when to use force_regenerate (e.g., after update_report). It mentions tier gate mirrors get_report, implying similar access constraints, but does not explicitly exclude alternatives or state when not to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

run_workflowRun Saved WorkflowA
Read-onlyIdempotent
Inspect

Resolve a saved workflow by id and return a structured execution plan for a single ticker. Each plan entry names a real MCP tool or SOP plus its ticker-substituted arguments; the calling agent invokes them in order, applying any skip_if predicate against the previous step's output.

This tool does NOT execute the steps server-side. It plans; the agent runs. Iterate through plan[] in order, call the named tool/SOP with args, accumulate outputs, and apply each step's skip_if (skip the step when the previous output's path equals equals).

Workflows are private state owned by the calling user. Sample-tier callers are rejected. Pair with list_workflows (frontend) to discover available workflow_ids.

ParametersJSON Schema
NameRequiredDescriptionDefault
tickerNoUS-listed ticker the workflow should run against, e.g. 'AAPL'. Pass either `ticker` (single) or `tickers` (batch up to 50). Exactly one is required.
tickersNoBatch mode — array of US-listed tickers, up to 50. When provided, the response has `plans[]` (one plan per ticker) instead of `plan`. Parity with the frontend batch-runner so agents can request 'plan over my watchlist' in a single call.
workflow_idYesWorkflow id returned by the frontend workflow builder.

Output Schema

ParametersJSON Schema
NameRequiredDescription
metaYesProvenance envelope — data lineage for every MCP response
planYes
plansYes
tickerYes
workflowYes
instructionsYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and destructiveHint=false. Description reinforces read-only nature and adds important behavioral context: private state per user, sample-tier rejection, and the execution model (plan nor run). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear front-loaded purpose. Each sentence adds value, though the description is slightly longer than minimal. Still effective and focused.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (plan generation, batch, skip_if logic), the description is thorough. It explains the plan structure, iteration, and complementary tool. Output schema exists, so return values are automatically documented.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All 3 parameters have schema descriptions (100% coverage). Description adds value by explaining batch mode (tickers vs ticker leading to plans[]), and origin of workflow_id (frontend). Well beyond schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it resolves a saved workflow by id and returns a structured execution plan for one or more tickers. Distinguishes itself from siblings by emphasizing it plans but does not execute steps server-side.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states the tool does NOT execute steps; it plans and the agent runs. Provides detailed instructions on iterating through plan[], applying skip_if, and mentions pairing with list_workflows. Also notes sample-tier rejection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

save_citation_overrideSave Citation OverrideA
Idempotent
Inspect

Persist a correction of a citation value. The correction is keyed on the canonical fact_id (a stable hash of CIK + accession + concept + period) so it applies to every report that references that same fact — including agent-regenerated reports. Re-saving the same fact_id replaces the prior correction in place (no duplicate row).

The fact_id is VERIFIED against live SEC data (scoped to ticker) before the correction is stored — a fact_id that doesn't resolve to a real fact is rejected with FACT_NOT_FOUND and nothing is persisted. You therefore must supply the ticker the fact belongs to.

Use this when the user notices an inaccuracy in an AI-generated report and wants the fix to persist. Provide notes for the rationale (≤500 chars) and source_report_id for provenance. Tier caps: sp500=500, pro=5000, full=50000.

ParametersJSON Schema
NameRequiredDescriptionDefault
notesNoOptional free-form rationale, ≤500 chars.
tickerYesTicker the fact belongs to (REQUIRED) — scopes fact_id resolution against live SEC data and denormalises the row for fast filtering (the workspace UI's 'my corrections on AAPL' view).
fact_idYesCanonical fact identifier — usually returned in a citation's `fact_ids` array by get_report or any compute tool. Stable across report regenerations. Verified against live SEC data (scoped to `ticker`) before persistence — a fabricated or unresolvable fact_id is rejected.
corrected_valueYesUser-corrected value, stringified. The frontend interprets it based on the fact's known datatype (number, string, ISO date).
source_report_idNoOptional report id the user was viewing when they applied the correction (provenance).

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
createdYes
capacityYes
overrideYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds significant behavioral context beyond annotations: verification against SEC data, rejection with FACT_NOT_FOUND, idempotent replacement, applicability across reports, tier caps. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured and efficient. Every sentence serves a purpose: purpose, behavior, verification, usage, notes on parameters, tier caps. No fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema (implied), the description covers all necessary behavioral aspects, side effects, and constraints. No obvious gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds value by explaining fact_id verification, ticker scoping, and corrected_value stringification. Provides more context than the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool persists a correction of a citation value, with specific verb ('Persist a correction'), resource ('citation value'), and keyed on fact_id. It distinguishes from sibling tools like delete_citation_override by explaining the save/replace behavior.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use this when the user notices an inaccuracy in an AI-generated report and wants the fix to persist.' Provides clear context and prerequisites (fact_id, ticker). No explicit exclusions or alternatives, but the use case is well-defined.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

save_claimSave ClaimA
Idempotent
Inspect

Persist a single falsifiable, evidence-backed CLAIM — the atomic unit of the research graph. Use this for each discrete assertion an analysis produces (e.g. 'NVDA gross margin stays above 70% through FY2026'), then compose claims into a thesis with link_claim_to_thesis. Claims are scored independently of theses, so claim accuracy is tracked as its own track record.

Pick claim_type by HOW it's judged, not what it's about: assertion = true now, checked against data; prediction = resolves at horizon_days via verifiable_condition; judgment = qualitative, not auto-scored. Use tags for the topic (financial, valuation, macro, …). Set eval_mode: 'auto' + a verifiable_condition for deterministic grading, else 'agent'/'manual'.

Tier: all paid + free tiers (sample rejected — guest has no customerId). Verifiable claims must cite evidence.

ParametersJSON Schema
NameRequiredDescriptionDefault
tagsNoTopical labels (controlled vocab). Multi-valued; drives filtering + learning segmentation, not scoring.
tickersYesEntities referenced (uppercased). 1 for most claims; 2+ for a comparison claim.
evidenceNoEvidence grounding the claim.
directionYesDirectional polarity of the claim.
eval_modeNoHow the outcome is resolved: auto (deterministic grade vs data via verifiable_condition), agent (an LLM judges at resolution), manual (a human marks it).agent
statementYesThe atomic, falsifiable statement. One claim, not a paragraph.
antecedentNoScenario precondition — the claim only resolves when this holds. Null = unconditional.
claim_typeYesEpistemic type — drives scoring. assertion=true now (verified vs data); prediction=future (resolves at horizon via verifiable_condition); judgment=qualitative (not auto-scored).
confidenceYesAuthor confidence in [0,1]. Used as the Brier/log-loss weight when scored.
visibilityNo'private' (default) owner-only; 'unlisted' visible at a direct URL; 'public' surfaces on the author's profile and contributes to the claim-accuracy reputation.private
horizon_daysNoResolution horizon in days (predictions). Null for assertions/judgments.
idempotency_keyNoOptional client key for at-most-once semantics from a retrying agent.
source_report_idNoOptional id of a report that contains the supporting analysis.
verifiable_conditionNoMachine-evaluable condition for eval_mode='auto'. Null otherwise.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
claimYes
capacityYes
deduplicatedYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint=false, idempotentHint=true, destructiveHint=false), the description adds significant behavioral context: it reveals that claims are 'falsifiable' and 'evidence-backed', that evidence is verified against live SEC data at save time and rejected if unresolvable, and that claims are scored independently. It also explains the evaluation modes and visibility settings, providing a clear picture of the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections and front-loaded purpose, but it is somewhat verbose. It could be more concise without losing essential information, as some parameter guidance repeats or extends schema descriptions. Nonetheless, every sentence adds value, and the overall structure is logical.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (14 parameters, nested objects, output schema exists), the description is complete. It covers purpose, usage, parameter semantics, constraints (tier, evidence verification), and ties to sibling tools. The existence of an output schema means return values need no explanation. No gaps are apparent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 100% schema coverage (baseline 3), the description adds substantial meaning beyond parameter names and schema descriptions. For example, it explains that claim_type is determined 'by HOW it's judged, not what it's about' and clarifies tags as 'Topical labels (controlled vocab)... drives filtering + learning segmentation, not scoring.' It also provides rationale for eval_mode choices and the structure of verifiable_condition, which are not fully covered in schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Persist a single falsifiable, evidence-backed CLAIM — the atomic unit of the research graph.' It specifies the resource (claim) and action (save), and distinguishes itself from sibling tools like link_claim_to_thesis by noting that claims are composed later. The verb 'Persist' is specific and matches the create operation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance: 'Use this for each discrete assertion an analysis produces... then compose claims into a thesis with link_claim_to_thesis.' It explains when to use this tool vs. alternatives (composing), and gives detailed instructions on picking claim_type, tags, and eval_mode. Also mentions tier restrictions ('Tier: all paid + free tiers (sample rejected — guest has no customerId)').

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

save_thesisSave Investment ThesisA
Idempotent
Inspect

Persist a directional investment thesis (bull / bear / neutral) on a ticker. The thesis becomes part of the caller's private research diary; pair with list_theses + score_thesis_outcome to track conviction-vs-outcome over time. Pass idempotency_key for at-most-once semantics from a retrying agent.

Use this AFTER the agent has finished its analysis, not before — the thesis records the conclusion, not the question. Pair with source_report_id to link the thesis back to a published report so the buyer's thesis-tracking carries provenance.

Tier: all paid + free tiers (sample tier rejected — sample is guest access with no customerId binding). Per-tier cap on # of stored theses: sp500=100, pro=500, full=10,000.

ParametersJSON Schema
NameRequiredDescriptionDefault
viewYesDirectional view: bull (expect outperformance), bear (under), neutral (mean-revert).
notesNoFree-form rationale, ≤4000 chars. Stored verbatim; trim before submitting.
tickerYesUS-listed ticker. Case-insensitive — normalised to upper. E.g. 'AAPL'.
convictionYes1 = low conviction (gut feel) → 5 = high conviction (deep analysis).
visibilityNoPhase 3: 'private' (default) is owner-only; 'unlisted' is visible at a known direct URL; 'public' surfaces on the author's /[handle] profile and contributes to their reputation score.private
horizon_daysYesInvestment horizon in days. 1 day–5 years (1825d). The grader uses this to pick the as-of period.
idempotency_keyNoOptional client-supplied key. If a previous `save_thesis` from the same user used this key, the existing thesis is returned instead of creating a duplicate.
source_report_idNoOptional id of a report (from `create_report` / `publish_report`) that contains the supporting analysis.
thesis_at_price_centsNoOptional snapshot of the ticker's market price (integer cents) at thesis creation. Used by future versions of the grader that mix in price returns; null for now is fine.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
thesisYes
capacityYes
deduplicatedYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds behavioral context beyond annotations: it explains persistence into a private research diary, idempotency semantics via idempotency_key, tier and cap constraints, and the option to link to a source report. Annotations already indicate idempotentHint=true and destructiveHint=false, and the description aligns well without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with the core purpose. Each sentence adds value, though it could be slightly more concise (e.g., the pairing mentions are repeated). Overall, it earns its length.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity with 9 parameters and 4 required, the description covers purpose, usage guidelines, tier restrictions, and pairing suggestions. An output schema exists (not shown), so return values are covered. The description is complete and leaves no major gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% parameter description coverage, so the baseline is 3. The description adds context for certain parameters, such as the 'visibility' param mentioning phases and the 'thesis_at_price_cents' optionality. It also explains the idempotency_key usage. This extra context justifies a 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'persist' and the resource 'directional investment thesis', and specifies the types (bull/bear/neutral). It differentiates from sibling tools like 'list_theses' and 'delete_thesis' by providing context on when to use each.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly guides when to use the tool: 'Use this AFTER the agent has finished its analysis, not before'. It also recommends pairing with 'list_theses' and 'score_thesis_outcome' and explains the idempotency key for retries. Tier restrictions and caps are also provided, making usage clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

save_watchlistSave WatchlistA
Idempotent
Inspect

Upsert a named watchlist with a list of tickers. Replace semantics — the full ticker list is the source of truth for that name. Use this for both creation AND modification (delete + recreate is not required for edits). 500-ticker cap per list. Names are case-insensitive uniqueness.

ParametersJSON Schema
NameRequiredDescriptionDefault
nameYesUnique-per-user display name.
tickersYesUS tickers. Normalised to uppercase, deduped.
criteriaNoOptional free-form screening criteria description.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
createdYes
watchlistYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate non-read-only, idempotent, and non-destructive. The description adds valuable behavioral details: 'Replace semantics' clarifies that the ticker list overwrites any existing list, '500-ticker cap' and 'Names are case-insensitive uniqueness' provide constraints not fully captured by schema alone. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: two sentences and a fragment, no wasted words. It front-loads the core purpose and then provides key behavioral details. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has a moderate complexity (3 parameters, output schema present), the description covers purpose, usage scenarios, behavioral constraints, and limits. It does not need to explain return values because an output schema exists. The description is complete for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with good parameter descriptions. The tool description adds context like 'Replace semantics' for tickers and mentions the ticker cap, but these are already implicit in the schema (maxItems). Overall, the description adds marginal value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verbs ('Upsert', 'creation AND modification') and clearly identifies the resource ('named watchlist with a list of tickers'). It distinguishes from siblings like delete_watchlist, get_watchlist, and list_watchlists by stating the replace semantics and the fact that both creation and modification are covered.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool (for creation and modification, with replace semantics, no need to delete/recreate). It does not explicitly mention when not to use it, such as for reading or deleting watchlists, but the sibling tools cover those cases. The guidance is clear but could be slightly more directive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

score_claimScore ClaimA
Idempotent
Inspect

Resolve a claim's outcome. By default auto-grades an auto claim by evaluating its verifiable_condition against SEC fundamentals (confirmed/refuted), or marks it needs_review when it can't be resolved deterministically (judgment, antecedent, or missing data). To record a human/agent judgment instead, pass manual_status (+ optional score/reason). Idempotent — re-scoring the same resolution is a no-op.

Tier: sp500+ (sample rejected).

ParametersJSON Schema
NameRequiredDescriptionDefault
as_ofNoSnapshot date for the fundamentals window (auto mode). Defaults to today UTC.
claim_idYesId of the claim to resolve.
manual_scoreNoOutcome score in [-1,1] for a manual resolution. Null for non-scored statuses.
manual_reasonNoExplanation for a manual resolution.
manual_statusNoProvide to record a human/agent outcome instead of auto-grading.

Output Schema

ParametersJSON Schema
NameRequiredDescription
modeYes
_metaYesProvenance envelope — data lineage for every MCP response
basisYes
claimYes
scoreYes
reasonYes
deduplicatedYes
resolved_statusYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (idempotentHint=true), the description fully discloses auto-grading logic (evaluates verifiable_condition, outcomes confirmed/refuted/needs_review), manual mode, and tier restriction. No contradictions with annotations; idempotency is explicitly stated.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences plus one line, front-loaded with the main action. Every sentence adds essential information: purpose, auto mode, manual mode, idempotency, and tier restriction. No redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists (context: has output schema), the description covers input modes, idempotency, and behavioral constraints. It is complete for a tool that resolves claims, with no missing critical aspects.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds context: explains how 'manual_status' triggers manual mode, and that score/reason are optional. It clarifies the default auto-grading logic, which goes beyond the schema's descriptive fields.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Resolve a claim's outcome.' It differentiates between auto-grading and manual modes, and contrasts with siblings like 'score_due_claims' (batch) and 'score_thesis_outcome' (thesis-level) by focusing on a single claim. The verb 'resolve' and resource 'claim outcome' are specific.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use auto vs manual modes and mentions idempotency. However, it does not explicitly compare with sibling tools like 'score_due_claims' or 'save_claim', leaving some ambiguity. The 'Tier: sp500+' note adds a usage restriction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

score_due_claimsScore Due Claims (bulk auto-grader)AInspect

Find every auto-gradable claim that is due (assertions in open/needs_review/stale; predictions whose horizon has passed) and resolve each against fundamentals. Operates on the caller's own claims by default; can target any user when called with customer_id from a full-tier (admin) token. Returns a summary + per-claim results. Idempotent — re-calling only re-resolves what changed.

ParametersJSON Schema
NameRequiredDescriptionDefault
maxNoSoft cap on claims scored per call (default 100).
as_ofNoSnapshot date for the fundamentals window. Defaults to today UTC.
customer_idNoTarget user's Stripe customer_id. Defaults to caller; requires `full` tier to target others.

Output Schema

ParametersJSON Schema
NameRequiredDescription
dueYes
_metaYesProvenance envelope — data lineage for every MCP response
errorsYes
scoredYesResolved to confirmed/refuted.
resultsYes
scannedYes
skippedYes
needs_reviewYesCould not be auto-resolved; flagged for review.
target_customer_idYes
Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description claims idempotency, but annotations have idempotentHint=false, creating a direct contradiction. This severely misleads the agent about repeat-call safety.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences with no redundant information. Key details are front-loaded: purpose, scope, token requirement, idempotency, and return summary.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers main behavior, targeting, and idempotency. With an output schema present, return details are sufficient. Minor omission: no mention of potential rate limits or error scenarios.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All three parameters are described in the schema (100% coverage), and the description adds useful context like default max, meaning of as_of, and admin requirement for customer_id.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly defines the tool: finds and resolves auto-gradable due claims, defaulting to caller and optionally targeting others with admin token. Differentiates from sibling score_claim and score_due_theses.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explains default behavior and condition for targeting other users (admin token). Mentions idempotency, but lacks explicit when-not-to-use or comparisons to alternative tools beyond sibling list.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

score_due_thesesScore Due Theses (bulk auto-grader)AInspect

Find every thesis past its horizon with no outcome yet, and grade each via score_thesis_outcome. Operates on the caller's own theses by default; can target any user when called with customer_id from a full-tier (admin) token. Returns a summary + per-thesis results. Idempotent — a re-call only re-grades anything not already graded.

ParametersJSON Schema
NameRequiredDescriptionDefault
maxNoSoft cap on theses scored per call. Defaults to 100. The frontend cron walks users serially so a low cap per user keeps each MCP request bounded.
as_ofNoSnapshot date for the 'current' fundamentals window. Defaults to today UTC.
customer_idNoStripe customer_id of the target user. Defaults to the caller's own; required to be the caller's own unless the caller's plan is `full` (admin / internal cron).

Output Schema

ParametersJSON Schema
NameRequiredDescription
dueYesSubset that were past their horizon AND ungraded.
_metaYesProvenance envelope — data lineage for every MCP response
errorsYesPer-thesis errors caught + logged.
scoredYesSuccessfully scored + persisted.
resultsYes
scannedYesTotal active theses inspected.
skippedYesSkipped because already graded or not yet due.
target_customer_idYes
Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description claims idempotency, but the annotation idempotentHint is false, creating a direct contradiction. Per scoring rules, a contradiction warrants a score of 1. The description adds some behavioral context otherwise.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with the main action, no unnecessary words. Efficiently structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the existence of an output schema, the description adequately covers the tool's operation, scope, and parameters. It lacks only an explicit contrast with score_thesis_outcome, but is otherwise complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions. The description adds minor context (e.g., 'soft cap', 'snapshot date'), but does not significantly enhance meaning beyond the schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's action: find overdue theses without outcomes and grade them using score_thesis_outcome. It specifies the scope (caller's own by default, any user with admin token) and distinguishes it from the sibling score_thesis_outcome tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context on default scope and admin override, explains the soft cap for frontend use, and implies when to use this tool versus the single-grading sibling. It lacks explicit alternatives but is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

score_thesis_outcomeScore Thesis OutcomeAInspect

Grade a saved thesis against fundamental momentum since its creation. Pulls revenue / operating-margin / EPS / OCF deltas and aggregates into a score in [-1, +1]. Bull theses are graded by directional alignment, bear by inverse, neutral by closeness-to-flat. The grade is persisted back to the thesis row; re-call to refresh once new fundamentals land.

Note (PR 2): scoring is fundamental-only — does NOT yet include market-price returns. Phase 2 will mix in price data via a partner feed; the response shape is stable.

ParametersJSON Schema
NameRequiredDescriptionDefault
as_ofNoSnapshot date for the 'current' fundamentals window. Defaults to today UTC. The scorer picks the fiscal period closest to this date.
thesis_idYesId returned by `save_thesis` or `list_theses`.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
thesisYes
outcomeYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses that the grade is persisted, aligning with readOnlyHint=false. It explains the scoring process (pulls deltas, aggregates) and notes limitations (fundamental-only, stable response shape). No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, front-loaded with purpose, and includes necessary details without fluff. It uses a note for future developments, keeping the main text focused.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema, the description covers input, processing, persistence, score range, and future updates. It is complete for a mutation tool with this complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear parameter descriptions. The tool description adds context about the overall workflow but does not significantly enhance parameter semantics beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool grades a saved thesis against fundamental momentum, listing specific metrics (revenue, operating-margin, EPS, OCF) and explaining scoring direction for bull/bear/neutral. It is distinct from sibling scoring tools like score_claim or score_due_theses by targeting theses and persisting the grade.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage after saving a thesis and mentions re-calling to refresh with new fundamentals. It does not explicitly compare to alternatives like score_claim, but the context makes it clear when to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

screen_universeScreen Universe by Factor ScoresA
Read-onlyIdempotent
Inspect

Rank companies by cross-sectional factor scores from factor_scores.parquet. Returns the underlying factors (roe, gross_margin, operating_margin, net_profit_margin, revenue_growth_yoy, fcf_to_assets, debt_to_equity, asset_turnover, current_ratio, piotroski_f_score) plus their percentile ranks (1.0 = best in universe; 0.0 = worst). composite_rank is a composite percentile rank across the factor set — a one-number screening shortcut. For single-factor sorts, use the specific rank column (e.g. sort_by=roe_rank); for a balanced multi-factor view, use sort_by=composite_rank (default). Two modes: full-universe (omit ticker) or single-entity (ticker set — useful for spot-checking ONE company's factor profile without scrolling the whole leaderboard). Sector filter is SIC-derived (GICS-aligned labels, not licensed GICS — see get_pit_universe description for the caveat). Use this instead of get_financial_ratios when you want CROSS-SECTIONAL comparison (rank vs peers); use get_financial_ratios when you want one company's ratios over time. Supports survivorship-free POINT-IN-TIME screening: pass as_of_date (YYYY-MM-DD) to reconstruct the cross-section as it was knowable on that date — factor_scores carries the SEC accepted_at of each period, so the screen filters to scores whose underlying filing had been accepted by the as-of date (zero look-ahead) and ranks each entity at its latest-KNOWABLE period. Omit as_of_date for the latest available snapshot. When supplied, the response carries as_of_date and pit_safe=true. Available on every plan — sample returns the subset covered by the sample bucket.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoNumber of results to return (1-100). Defaults to 25.
sectorNoFilter to a specific sector (case-insensitive partial match). E.g. 'Technology', 'Healthcare'.
tickerNoIf provided, show only this ticker's factor scores (single-entity mode). Omit to screen the full universe.
sort_byNoWhich factor rank to sort by. Options: roe_rank, gross_margin_rank, operating_margin_rank, net_profit_margin_rank, revenue_growth_yoy_rank, fcf_to_assets_rank, debt_to_equity_rank, asset_turnover_rank, current_ratio_rank, piotroski_f_score_rank, composite_rank. Defaults to composite_rank.composite_rank
as_of_dateNoPoint-in-time cutoff (YYYY-MM-DD). When supplied, the screen is reconstructed as of this date: factor scores are filtered to those whose underlying SEC filing had been accepted on or before the date (via factor_scores.accepted_at), and each entity is ranked at its latest-knowable period — zero look-ahead, survivorship-free. Omit for the latest available snapshot.

Output Schema

ParametersJSON Schema
NameRequiredDescription
dataYesRanked factor-score rows for the screened universe
noteNo
planYesCaller's data plan used to scope the screen
_metaYesProvenance envelope — data lineage for every MCP response
tickerNoPresent only when a single-ticker lookup was requested
lineageNoProvenance for pipeline-derived values (ratio.parquet / factor_scores.parquet): source table + pipeline computed_at, plus a pointer to the tools that return filing-level lineage. NOT point-in-time (recomputed on each pipeline run).
sort_byYes
pit_safeNoPresent (and true) only when as_of_date was supplied — the screen was filtered by factor_scores.accepted_at with zero look-ahead
as_of_dateNoPresent only when a point-in-time as_of_date was supplied
sector_filterNoPresent only when a sector filter was applied
results_returnedYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, so the description adds value by explaining survivorship-free point-in-time screening, composite rank, zero look-ahead, and plan availability. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is long but well-structured, front-loading the main purpose and adding detailed explanations for features. Every sentence adds value, though could be slightly more concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters, 100% schema coverage, existing output schema, and annotations, the description fully covers usage modes, point-in-time, composite_rank, sector caveat, and plan availability.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds significant context for parameters like as_of_date (point-in-time, zero look-ahead), sort_by (composite_rank default), ticker (single-entity mode), and sector (SIC-derived, GICS-aligned).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool ranks companies by cross-sectional factor scores from factor_scores.parquet, and distinguishes it from siblings like get_financial_ratios and get_pit_universe.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises when to use this tool versus get_financial_ratios, describes two modes (full-universe vs single-entity), and explains the point-in-time screening with as_of_date.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_companiesSearch CompaniesA
Read-onlyIdempotent
Inspect

Search for US public companies by name, ticker symbol, CIK (SEC identifier), or SIC industry code. Returns ticker, company name, sector, industry, exchange, and current S&P 500 membership status. Use this tool to resolve a company name to ticker/CIK before calling get_company_fundamentals, get_valuation_metrics, or other tools that require a ticker — they do not fuzzy-match company names.

Use this tool — NOT get_pit_universe — when the user asks about CURRENT S&P 500 members. To list current S&P 500 members, call this tool with no search parameters and filter results by sp500_member=true. This returns the live snapshot as of query time. Example: "List 5 current S&P 500 members" → call search_companies with no parameters, then filter by sp500_member=true and return the first 5.

Use get_pit_universe ONLY when the user explicitly needs a survivorship-free historical universe as of a specific past date (e.g. "S&P 500 members as of March 2018"). If the user says "current," "today," "now," or gives no date, use search_companies instead.

Data details: sic_code is the 4-digit SIC; industry is the human-readable label. sector is SIC-derived with GICS-style labels — NOT licensed GICS, so industrial conglomerates may map differently from official GICS (e.g. 3M → 'Health Care' by SIC vs Industrials by GICS). S&P 500 membership is sourced from index_membership.parquet (current SP500 = index_name='SP500' AND removal_date IS NULL). Available on all plans.

ParametersJSON Schema
NameRequiredDescriptionDefault
cikNoSEC CIK identifier (exact match). E.g. '0000320193' for Apple.
limitNoMaximum number of results to return (1–50). Defaults to 25.
queryNoFree-text search over company name and ticker. Case-insensitive. E.g. 'Apple', 'AAPL', 'Microsoft', 'semiconductor'.
is_sp500NoFilter to current S&P 500 members only.
sic_codeNo4-digit SIC industry code. E.g. '7372' for Prepackaged Software.
is_activeNoFilter to active (currently trading) companies only.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
queryYes
companiesYes
results_returnedYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, idempotentHint, destructiveHint. The description adds value by detailing data sources (SIC-derived sectors, S&P 500 membership from index_membership.parquet), the fact that it does not fuzzy-match, and the exact logic for S&P 500 filtering. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is somewhat long but well-structured with clear sections. It front-loads the main purpose, then provides usage guidelines, then data details. Every sentence adds value, though there is slight repetition about S&P 500 filtering.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 optional parameters, output schema exists), the description is comprehensive. It explains return fields, data sources, and nuances like SIC vs GICS sector labels. It fully prepares the agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with detailed descriptions for all 6 parameters. The description adds context about use cases (e.g., resolving names to tickers), but the schema already explains each parameter well. Baseline 3 is appropriate as schema does most of the work.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool searches for US public companies by name, ticker, CIK, or SIC code. It distinguishes itself from sibling tools like `get_pit_universe` by specifying its purpose for current S&P 500 membership, providing a specific verb and resource with scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly tells when to use this tool vs alternatives, especially `get_pit_universe`. It provides clear directives: use this for current S&P 500 members, and only use `get_pit_universe` for historical survivorship-free universes. Includes examples.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_filing_textSearch Filing TextA
Read-onlyIdempotent
Inspect

Semantic search over SEC 10-K / 10-Q narrative sections — Risk Factors, MD&A, Business, Legal Proceedings, Controls & Procedures, Footnotes. Phrase the query as a statement rather than a question for best recall (e.g. "companies with rising supply chain concentration risk" not "what companies have supply chain risk?"). Returns narrative passages (text, not numeric facts) ranked by semantic similarity, each with the source filing accession id + URL for citation; score is a [0,1] similarity, not a financial figure. For dollar/ratio figures use get_company_fundamentals / get_financial_ratios. Cached at the per-plan tier for 10 min.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMax passages to return (1-20). Defaults to 5. Pro plan capped at 10.
queryYesNatural-language query. Statement form beats question form for recall.
tickerNoRestrict to one company (case-sensitive uppercase). Omit for cross-company search.
sectionNoRestrict to a single section label. Omit to search all sections.
form_typeNoRestrict to one form type.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
queryYes
passagesYes
results_returnedYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds valuable behavioral context: 10-minute caching per plan tier, returns text passages not numeric facts, and explains the score is a similarity measure. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences with no wasted words. Front-loaded with the core action and key constraints, followed by guidance and alternative references. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers usage scenarios, output format (passages with accession ID/URL), similarity scoring, caching behavior, and alternatives. Lacks explicit error conditions or empty-result handling, but output schema likely covers return structure. Solid overall completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions. The description enriches parameters by advising statement-form queries for better recall and noting plan-dependent limits (Pro cap at 10) beyond what schema provides, adding meaningful usage context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states it performs semantic search over SEC filing narrative sections (Risk Factors, MD&A, etc.), clearly distinguishing it from numeric-focused sibling tools like get_company_fundamentals and get_financial_ratios.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance on when to use this tool vs alternatives ('For dollar/ratio figures use get_company_fundamentals / get_financial_ratios') and offers query phrasing tips for optimal recall, giving clear context for tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

test_alertTest Alert (synthetic fire)AInspect

Fire a synthetic notification through the alert's configured channel. Use this immediately after create_alert to verify the channel (email address valid / webhook URL reachable + HMAC verification on the receiver). The synthetic fire is logged as attempt=1 channel='test' so it doesn't affect the real fire counter — the next genuine match still fires normally.

ParametersJSON Schema
NameRequiredDescriptionDefault
alert_idYesIdentifier of the alert to fire a synthetic test notification through, as returned by create_alert or list_alerts.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
outcomeYes
alert_idYes
status_codeYes
channel_typeYes
error_messageYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds important behavioral details beyond annotations: synthetic fire is logged with 'attempt=1 channel=test' and doesn't affect real fire counter. Annotations already indicate it's a non-destructive write, and description aligns without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with purpose, no fluff. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Sufficient for a single-parameter test tool with output schema present. Could briefly mention the return value (e.g., success status) but not critical given simplicity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% for alert_id, but description adds 'as returned by create_alert or list_alerts,' providing useful source context beyond the schema definition.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Fire a synthetic notification through the alert's configured channel,' with specific verb and resource. It distinguishes from sibling tools like create_alert (creation) and list_alerts (listing).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly recommends immediate use after create_alert to verify channel. Provides clear context but lacks explicit 'when not to use' guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_reportUpdate Report SectionsA
Idempotent
Inspect

Replace one or more sections of an existing report owned by the caller. Useful for authoring workflows where the agent's first draft (create_report) is refined by additional analysis before publishing. Bumps version. Re-embeds for Vectorize if the report is currently listed. Does NOT change price / tier / visibility — use publish_report for those.

ParametersJSON Schema
NameRequiredDescriptionDefault
titleNoOptional new title.
abstractNoOptional new abstract.
sectionsYesSections to replace. Sections not listed are preserved. Section ids must match the existing payload.
report_idYesIdentifier of the report to update, as returned by create_report or list_my_reports.
expected_versionNoOptimistic concurrency check. If supplied and the current HEAD version is different, the call returns a `version_conflict` error WITHOUT writing. Pass the version you loaded so a concurrent agent edit produces a 'conflict — review' UX instead of silently overwriting (eng review A3A).

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
reportYes
versionYes
archivedYes
previous_versionYes
vectorize_reindexedYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint=false, destructiveHint=false, idempotentHint=true), the description adds critical behavioral details: it bumps the version, re-embeds for Vectorize if listed, and supports optimistic concurrency via the expected_version parameter. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (two sentences plus a short bullet-like list) and front-loaded with the core purpose. Every sentence adds value, and the structure is easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity, the presence of an output schema, and high schema coverage, the description adequately covers ownership constraints, versioning behavior, and boundaries of operation. It leaves no significant gaps for an agent to misuse the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, so the schema already documents each parameter. The main description adds little beyond the schema for parameters, but the concurrency check detail in the expected_version description provides some extra value. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool replaces one or more sections of an existing report owned by the caller. It uses specific verbs ('replace') and resources ('sections of an existing report'), and distinguishes itself from sibling tools like create_report and publish_report.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool ('authoring workflows...first draft...refined by additional analysis before publishing') and explicitly notes what it does not change ('Does NOT change price / tier / visibility — use publish_report for those'). This effectively differentiates from alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verify_fact_lineageVerify Fact LineageA
Read-onlyIdempotent
Inspect

Use this tool when the user asks BOTH what a financial figure is AND which filing reported it — for example "What was Apple's most recently reported revenue, and which 10-Q filed it?" or "Show me the accession ID for Tesla's latest net income" or "Which filing form reported Amazon's Q3 operating cash flow?" This tool returns a single fact plus its complete filing provenance: entity, concept, period, value, accession ID, filing URL, and form type (10-K, 10-Q, etc.).

Use this INSTEAD OF search_companies when the user already names a company and wants a financial figure with its source filing — search_companies only resolves company identifiers and returns no financial data. Use this INSTEAD OF get_company_fundamentals when the user explicitly wants to know which filing or form type reported a number, or needs the accession ID — get_company_fundamentals returns metrics across multiple periods but omits filing provenance.

Two lookup modes: (1) by fact_id (SHA-256 hash of entity_id|accession_id|concept|period_end|unit) for deterministic identity; or (2) by concept name (e.g., TotalRevenue, NetIncome, EPSDiluted, TotalAssets, OperatingCashFlow) plus a ticker to retrieve the most recently reported fact. Optionally pin a point-in-time cutoff via as_of_date (YYYY-MM-DD) — returns the latest filing accepted by SEC on or before that date, eliminating look-ahead bias. Check _meta.pit_safe in the response to confirm PIT correctness.

DURATION: income-statement flow concepts (NetIncome, TotalRevenue, etc.) are reported over a window, and a single 10-K tags BOTH a 12-month figure and a 3-month Q4 stub at the same fiscal-year-end period_end. On a tie this tool returns the longer (headline) window, and every result carries period_type (instant | quarterly | half_year | nine_month | annual | duration) and period_span_days so you always know whether a number is a quarter or a full year — never present a 3-month stub as the annual figure.

Provide either fact_id or concept (required). Returns empty result with error_code FACT_NOT_FOUND if no matching fact exists for the given concept and ticker. Available on all plans.

ParametersJSON Schema
NameRequiredDescriptionDefault
tickerYesStock ticker symbol, e.g. AAPL, MSFT, BRK.B
conceptNoStandard concept name to look up the most recently known fact for. Use this when you don't have a fact_id. Covers the full fundamentals + capital-allocation set: income statement (TotalRevenue, CostOfRevenue, GrossProfit, OperatingIncome, OperatingExpenses, ResearchAndDevelopment, NetIncome, EPSDiluted), balance sheet (TotalAssets, TotalLiabilities, StockholdersEquity, StockholdersEquityIncludingNCI, CashAndEquivalents, TotalDebt), and cash flow (OperatingCashFlow, CAPEX, Dividends, ShareBuyback, DebtIssuance, DebtRepayment, Acquisitions, Divestitures). Provide either concept OR fact_id.
fact_idNoDeterministic fact identity hash: SHA-256(entity_id|accession_id|concept|period_end|unit). 64-char lowercase hex. Use this when you already have the hash from a previous query. Provide either fact_id OR concept (not both required, but at least one must be set).
as_of_dateNoPoint-in-time cutoff (YYYY-MM-DD) used with `concept`. Returns the most recently known fact whose 10-K / 10-Q filing was accepted by SEC on or before this date — true PIT, no lookahead bias. Any calendar date works (not limited to fiscal-period closes); the latest preceding filing is returned. This is the canonical name shared by every other PIT tool in the suite; supersedes the legacy `period_end` parameter on this tool.
period_endNo[DEPRECATED — pass `as_of_date` instead.] Filing-acceptance cutoff (YYYY-MM-DD) used with `concept`. The name is misleading — it actually filters on filing accepted_at, not on the period_end of the returned fact (the returned fact's period_end is whatever the SEC filed, e.g. a Q3 cumulative figure). Kept for one release for backwards compat; new callers MUST use `as_of_date`.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
lineageNoFull provenance: fact_id, concept, value, unit, period_end, source accession, SEC EDGAR URL, form_type, accepted_at, plus duration context — period_start, period_span_days, and period_type (instant | quarterly | half_year | nine_month | annual | duration) so a 3-month stub is never mistaken for the 12-month figure.
verifiedYesTrue when the fact was located and its provenance resolved
lookup_byYesHow the fact was located: 'fact_id' or 'concept'
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, idempotentHint, destructiveHint. Description adds behavioral details: DURATION section explains tie-breaking for income-statement concepts, mention of period_type and period_span_days, error handling (FACT_NOT_FOUND), and point-in-time correctness via _meta.pit_safe. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear sections (purpose, usage, lookup modes, DURATION). Front-loaded with essential information. Every sentence adds value; no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given tool complexity (5 params, output schema), description covers purpose, usage, parameters, behavior, error handling, and plan availability. Points to response fields like _meta.pit_safe, period_type, period_span_days. Fully complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds meaning beyond schema: explains two lookup modes (fact_id vs concept), as_of_date as true PIT, deprecation of period_end, and fact_id formation. Provides context for when to use each parameter.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: returns a single fact with its filing provenance. It uses specific verbs and resources, and distinguishes from sibling tools like search_companies and get_company_fundamentals.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use this tool (user asks for both figure and filing) and when to use alternatives. Includes 'Use this INSTEAD OF' statements and explains two lookup modes and point-in-time behavior.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

watchlist_diffWatchlist DiffA
Read-onlyIdempotent
Inspect

Return new SEC filings across the caller's watchlist tickers since a given date. Reads filing.parquet — does not call insider/ratio surfaces (use those tools separately if you need them). Concurrency-bounded; max 50 tickers per call.

ParametersJSON Schema
NameRequiredDescriptionDefault
nameYesWatchlist name.
sinceYesCutoff date (YYYY-MM-DD); the diff returns SEC filings accepted on or after this date across the watchlist's tickers.
form_typesNoFiling forms to include. Defaults to 10-K + 10-Q + 8-K.

Output Schema

ParametersJSON Schema
NameRequiredDescription
_metaYesProvenance envelope — data lineage for every MCP response
sinceYes
filingsYes
watchlist_nameYes
tickers_scannedYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly, idempotent, and non-destructive. The description adds that it reads a specific data source (filing.parquet) and has concurrency limits, providing context beyond annotations without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, each with distinct value: purpose, exclusions/alternatives, and constraints. No redundancy, efficient and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With output schema present, the description covers the return type, data source, exclusions, and limits. It lacks details on pagination or error cases, but given the complexity, it is substantially complete for correct selection and invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description restates parameter meanings (e.g., 'since' as cutoff date) and adds default behavior for form_types, but does not add significant new semantics beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description specifies 'Return new SEC filings across the caller's watchlist tickers since a given date,' which is a specific verb-resource combination. It distinguishes from siblings by explicitly noting it does not call insider/ratio surfaces.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description tells when to use (for watchlist filing diffs) and when not (not for insider/ratio data, with explicit mention of separate tools). It also provides constraints like concurrency-bounded and max 50 tickers, though lacks exhaustive alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Discussions

No comments yet. Be the first to start the discussion!

Try in Browser

Your Connectors

Sign in to create a connector for this server.