Valuein — SEC EDGAR Fundamentals

Name: Valuein — SEC EDGAR Fundamentals
Author: valuein

by io.github.valuein

Server Details

Point-in-time, survivorship-bias-free SEC EDGAR fundamentals for AI agents. 1994 to present.

Status: Healthy
Last Tested: 2026-05-25 12:33
Transport: Streamable HTTP
URL
Repository: valuein/valuein
GitHub Stars: 0

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client

Glama

MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

A3.9/5.0

Tool DescriptionsA

Average 4.2/5 across 57 of 57 tools scored. Lowest: 3.4/5.

Server CoherenceA

Disambiguation3/5

Many tools target overlapping domains (e.g., multiple tools for financial data: get_company_fundamentals, get_financial_ratios, get_valuation_metrics; multiple for insider/holder data). Descriptions are thorough and help distinguish, but boundaries remain blurred for an agent.

Naming Consistency4/5

Most tools use a consistent verb_noun pattern (get_, create_, delete_, save_, list_), but there are deviations like generate_* tools and compute_dcf vs compute_ready_stream. Overall pattern is recognizable.

Tool Count3/5

57 tools is high for an MCP server. While the domain is broad (fundamentals, reports, alerts, watchlists, theses, etc.), some tools could be consolidated. The count feels heavy but not unreasonable given the feature set.

Completeness5/5

The tool set covers a comprehensive range of SEC EDGAR fundamental analysis: data retrieval, ratios, valuation, insider/blockholder tracking, reports, alerts, watchlists, theses, and workflow planning. No significant gaps for the stated purpose.

Available Tools

58 tools

compare_periodsCompare Financial PeriodsA

Read-onlyIdempotent

Inspect

Compare a company's core financial metrics across two fiscal periods side-by-side. Shows absolute and percentage changes with significance classification (minor < 5%, notable 5–15%, significant > 15%). The response includes a material_changes count: this is the number of metrics whose significance ∈ {notable, significant} (i.e. absolute percentage change > 5%). Use it as a quick scalar to triage filings — anything > ~3 typically signals a material event worth deeper review. Use period format: 'FY2024' for annual, 'Q1-2024' for quarterly. Pass period_a as the EARLIER period and period_b as the LATER one — if you invert them the server auto-swaps and sets swapped: true in the response so deltas always carry the correct sign (rather than silently flipping). Point-in-time safe via as_of_date. Available on all plans.

ParametersJSON Schema

Name	Required	Description
`ticker`	Yes	Stock ticker symbol, e.g. AAPL, MSFT, BRK.B
`period_a`	Yes	Earlier fiscal period. Format: 'FY2023' for annual or 'Q1-2023' for quarterly.
`period_b`	Yes	Later fiscal period. Format: 'FY2024' for annual or 'Q1-2024' for quarterly.
`as_of_date`	No	Point-in-time date (YYYY-MM-DD). Only returns facts with accepted_at on or before this date — eliminates look-ahead bias for backtesting.

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Disclosure of significance classification thresholds (minor, notable, significant) adds behavioral detail beyond readOnlyHint and idempotentHint annotations. Could elaborate on default sort or metric scope but sufficient for safe invocation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences packed with essential info: purpose, output details, usage format, safety, and availability. No filler. Front-loaded with action and scope.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 4 simple parameters, no output schema needed (tool clearly returns comparison metrics), and strong annotations (readOnlyHint, idempotentHint), the description fully covers what an agent needs: what it does, how to use it, and behavioral constraints.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description does not add new parameter info beyond schema, but the period format examples and point-in-time explanation are consistent with schema descriptions. No contradiction.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states verb 'Compare', specific resource 'core financial metrics across two fiscal periods', and unique differentiation: side-by-side with absolute/percentage changes and significance classification. Distinguishes from sibling tools like get_company_fundamentals or get_financial_ratios.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly describes when to use (comparing financial metrics across periods), provides period format examples, and implicitly distinguishes from sibling tools (e.g., get_financial_ratios doesn't compare periods). Also notes point-in-time safety and plan availability.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compute_dcfCompute Forward DCFA

Read-onlyIdempotent

Inspect

Forward discounted-cash-flow valuation: caller provides growth + WACC + terminal assumptions, returns per-share intrinsic value + 5×5 sensitivity grid. Pulls FCF base + net debt + shares from R2; caller can override any field. Does NOT persist a report — use create_report for that. Tier: sp500+.

ParametersJSON Schema

Name	Required	Description
`wacc`	No	Discount rate. Default 0.09.
`ticker`	Yes
`stage1_years`	No
`shares_override`	No	Override shares outstanding. Leave unset to use R2-derived.
`fcf_base_override`	No	Override the auto-pulled FCF base (USD). Leave unset to use R2-derived.
`stage1_growth_rate`	Yes	Stage-1 FCF growth rate (e.g. 0.12 = 12%/yr).
`terminal_growth_rate`	No	Long-run growth. Default 0.025.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`result`	Yes
`ticker`	Yes

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Describes data pulling from R2 (FCF base, net debt, shares) and override capability. Discloses non-persistence, aligning with annotations (readOnlyHint, idempotentHint). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with purpose and output, then usage guidance and disambiguation. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists, the description adequately covers return values (intrinsic value + sensitivity grid). Combined with annotations, it provides sufficient context for tool selection and invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 71%, and description adds meaning by explaining the source of default values (R2) and purpose of override parameters. Without describing each param in detail, it provides enough context for parameter selection.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it computes a forward DCF valuation, specifying inputs (growth, WACC, terminal assumptions) and outputs (per-share intrinsic value + sensitivity grid). Mentions it pulls data from R2 and allows overrides, distinguishing it from create_report.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says when to use (caller provides assumptions) and what not to do (does not persist, use create_report instead). Mentions Tier: sp500+, but lacks explicit when-not usage or alternatives beyond the sibling tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_alertCreate AlertAInspect

Persist an alert and register it with the firing pipeline. Three condition shapes:

filing_event — fire when a ticker files a chosen form type (8-K, 10-K, etc.).
ratio_threshold — fire when a ticker's financial ratio crosses a threshold (e.g. interest_coverage < 1.5).
watchlist_change — fire on any filing on any ticker in a named watchlist.

Delivery channels: email (transactional via Resend) or webhook (HMAC-SHA256-signed POST). The cron evaluator runs every 5 minutes. Use test_alert to verify your channel is wired correctly before relying on the cron.

ParametersJSON Schema

Name	Required	Description
`name`	Yes	Human-readable label.
`channel`	Yes
`condition`	Yes

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`alert`	Yes
`cron_indexed`	Yes

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (which indicate non-readOnly, non-idempotent), the description adds valuable context: alerts are stored durably now, the firing pipeline is future (Phase 2), and webhook deliveries can be HMAC-signed. This discloses persistence and security traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief and well-structured: a short first paragraph covering the three condition shapes, and a second paragraph adding persistence and HMAC details. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given an existing output schema, the description adequately covers creation behavior, condition shapes, and future pipeline. It could mention what the API returns (e.g., alert ID) but the output schema likely covers that.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 33% schema description coverage, the description's high-level explanation of condition shapes partially compensates, but it does not detail parameters like name, channel, or nested fields beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Persist an alert' and enumerates three distinct condition shapes (filing_event, ratio_threshold, watchlist_change), differentiating it from sibling tools like list_alerts and delete_alert.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use each condition shape but lacks explicit guidance on when not to use this tool or alternatives such as delete_alert or list_alerts. No exclusions or comparative context are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_reportCreate Research ReportA

Idempotent

Inspect

Synchronously generate a research report and persist it under the caller's authorship. Two subtypes:

• reverse_dcf — solves the stage-1 free-cash-flow growth rate the market price implies, with a 5×5 sensitivity grid across WACC × terminal-growth assumptions. Returns full markdown + structured JSON + every numerical claim's citation chain to the originating SEC accession.

• thesis — snapshot a saved thesis (via save_thesis) as a frozen narrative report with at-a-glance table, author notes, anchor fundamentals (latest annual), and lineage to the source filing. Later edits to the thesis do NOT propagate — generate a new report to capture new state.

Tier: sample tier rejected — reports are per-author state.

ParametersJSON Schema

Name	Required	Description
`title`	No	Optional human-supplied title; auto-generated when omitted.
`params`	No	Reverse-DCF parameters — required for report_type=reverse_dcf.
`ticker`	No	US-listed ticker — required for report_type=reverse_dcf. Case-insensitive.
`thesis_id`	No	Id of a saved thesis owned by the caller — required for report_type=thesis.
`report_type`	Yes	Subtype. `reverse_dcf` requires ticker + params; `thesis` requires thesis_id (from save_thesis / list_theses).
`idempotency_key`	No	Optional key for at-most-once semantics. Same key from the same user always yields the same report id.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`report`	Yes
`markdown`	Yes
`sections`	Yes
`citations`	Yes
`structured`	Yes

Tool Definition Quality

A4.3/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide idempotentHint and no destructive/readOnly flags. The description adds key behavioral traits: synchronous execution, statefulness (persist under caller's authorship), output contents (markdown, JSON, citation chains), and the clear division of labor (server provides numbers, LLM provides prose). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph of ~5 sentences, front-loaded with the main action. However, the final line about 'Tier: sample tier rejected' is cryptic and may cause confusion without adding value, slightly reducing conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the sibling count and tool complexity, the description covers purpose, output format, and current limitations. It does not explain idempotency or error handling, but the idempotent hint and schema key address that. Output schema exists, so return value explanation is adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema descriptions cover 80% of parameters (title, ticker, report_type, params subfields). The description adds context about the sensitivity grid and the report_type constraint but does not significantly enhance parameter meaning beyond the schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'Synchronously generate a research report and persist it under the caller's authorship' and specifies the only supported type (reverse_dcf). It differentiates from siblings by noting the tool's role (structured numbers + lineage, not text) and contrasts with compute_dcf or other report types implicitly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states the tool is for reverse_dcf only and clarifies its synchronous nature. However, it does not explicitly guide when to use this tool versus alternatives like compute_dcf, get_report, or other report creation tools, leaving some ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_alertDelete AlertA

DestructiveIdempotent

Inspect

Soft-delete: status flips to deleted. ALSO removes the alert from the cron evaluator index so it stops firing. Idempotent.

ParametersJSON Schema

Name	Required	Description	Default
`alert_id`	Yes

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`status`	Yes
`alert_id`	Yes

Tool Definition Quality

A3.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds value beyond annotations by revealing it is a soft-delete (status change) and idempotent. Annotations provide destructiveHint=true and idempotentHint=true, but the description clarifies the mechanism. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with two sentences, no redundant information, and effectively communicates the core behavior.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter, the description covers the essential behavior (soft-delete, idempotency). The existence of an output schema reduces the need to explain return values, so it is adequately complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description does not mention the parameter 'alert_id' or explain its role. With 0% schema description coverage, the burden on the description is high, yet it provides no additional meaning beyond the schema's type and length constraints.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Soft-delete: status flips to deleted' which specifies the action (delete) and the exact behavior (soft-delete, status change). This distinguishes it from other delete tools that might perform hard deletes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus other delete tools (e.g., delete_report, delete_thesis). The description only states idempotent behavior but does not explain prerequisites or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_citation_overrideDelete Citation OverrideA

DestructiveIdempotent

Inspect

Remove a user-authored citation correction by fact_id. Idempotent — deleting a missing override returns deleted=false without error. Once deleted, reports that previously rendered the corrected value revert to the canonical fact value on next regeneration.

ParametersJSON Schema

Name	Required	Description	Default
`fact_id`	Yes

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`deleted`	Yes
`fact_id`	Yes

Tool Definition Quality

A3.9/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide idempotentHint=true and destructiveHint=true. The description adds value by explaining the exact idempotent behavior (deleting a missing override returns without error) and the side effect of report reversion to canonical values. This context is beyond what annotations convey.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences with no unnecessary words. The first sentence covers purpose and idempotency, the second explains consequences. It is front-loaded and efficient, though the second sentence could potentially be integrated more concisely.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has only one parameter and an output schema exists (not shown but present), the description explains the key behavioral aspects: idempotency, error handling for missing overrides, and the impact on reports. This is largely complete for a delete operation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has one parameter, fact_id, with no description beyond type and length constraints. Schema description coverage is 0%. The description uses fact_id in the context of removal but does not explicitly define what a fact_id represents. While somewhat clear from the tool name and context, it could provide more explicit semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Remove a user-authored citation correction by fact_id.' This provides a specific verb (Remove) and resource (citation correction), and identifies the key identifier (fact_id). It distinguishes the tool from siblings like save_citation_override and list_citation_overrides.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains idempotency and the effect on reports ('revert to canonical fact value on next regeneration'), which helps in understanding when to use it. However, it does not explicitly state when not to use this tool or suggest alternatives, such as using save_citation_override to restore an override.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_reportDelete (Soft) Research ReportA

DestructiveIdempotent

Inspect

Soft-delete a report owned by the caller: status flips to delisted, visibility to private. Idempotent. R2 artifact preserved (90-day audit window). Sample tier rejected.

ParametersJSON Schema

Name	Required	Description	Default
`report_id`	Yes	Id from `create_report` or `list_my_reports`.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`status`	Yes
`report_id`	Yes

Tool Definition Quality

A4.1/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds behavioral details beyond annotations: status transition, artifact preservation with 90-day audit window, and rejection for sample tier. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with key information, no superfluous content. Highly concise and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all essential aspects: soft-delete behavior, idempotency, audit window, and restrictions. Completeness is adequate for a simple deletion tool with output schema present.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear parameter description. The tool description does not add new parameter details beyond what schema already provides, achieving baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'soft-delete' a report owned by caller, specifying exact status and visibility changes. Differentiates from other delete tools via title and behavior description.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides some context like idempotency and sample tier rejection but does not explicitly guide when to use this tool over alternatives like delete_alert or delete_thesis.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_thesisArchive Saved ThesisA

DestructiveIdempotent

Inspect

Soft-delete a saved thesis: status flips to archived (the row stays for audit / re-scoring). Idempotent — archiving an already-archived thesis succeeds. Hard-delete is not supported by design; future versions may expire archived theses after N years.

ParametersJSON Schema

Name	Required	Description	Default
`thesis_id`	Yes	Id returned by `save_thesis` or `list_theses`.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`status`	Yes
`thesis_id`	Yes

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate destructiveHint=true and idempotentHint=true. The description adds valuable context: it is a soft-delete (row stays for audit/re-scoring), idempotent for already-archived theses, and no hard-delete is supported. This goes beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise: two sentences covering the core action, idempotency, and policy on hard-delete. No fluff, front-loaded with the most important information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple single-parameter tool with comprehensive annotations and an output schema, the description covers all necessary aspects: behavior, idempotency, persistence for audit, and future expiry. It is complete and leaves no obvious gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The only parameter 'thesis_id' is fully described in the schema (100% coverage). The description does not add parameter-specific information, but it also does not need to. Baseline score of 3 is appropriate as the schema already handles semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool performs a soft-delete, flipping status to 'archived', and distinguishes this from a hard-delete. It also mentions idempotency and future archival expiry, making the purpose very specific and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use the tool (to soft-delete a saved thesis), confirms idempotency, and clarifies that hard-delete is not supported. It does not explicitly compare to sibling tools like 'delete_alert', but the context of soft-delete vs. hard-delete is clear enough for correct selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_watchlistArchive WatchlistA

DestructiveIdempotent

Inspect

Soft-delete: status flips to archived. The name is freed (a new save_watchlist can reuse it). Idempotent.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`status`	Yes
`watchlist_id`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds significant context beyond annotations: explains soft-delete nature (status flip), name reuse, and idempotency. Annotations only provide idempotentHint and destructiveHint; description clarifies what 'destructive' means in this context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, concise, front-loaded with key information (soft-delete, status change, name reuse, idempotency). No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers essential behavior for a soft-delete tool with output schema. Lacks mention of prerequisites (e.g., watchlist must exist) but idempotency may mitigate. Sufficient for the action type.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, but description mentions 'name is freed' implying the parameter identifies the watchlist. Does not explicitly describe the parameter meaning or constraints beyond the schema, but provides useful context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states soft-delete action (archive watchlist), specifies behavior (status to archived, name freed), and distinguishes from sibling tools like save_watchlist. Uses specific verb 'delete' with context 'soft-delete'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage for archiving watchlists without explicit when-to-use vs alternatives. However, explains side effects (name freed) which helps decide. Does not mention when not to use, but context is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

describe_schemaDescribe Data SchemaA

Read-onlyIdempotent

Inspect

Returns the Parquet schema for all tables in the Valuein SEC data warehouse. Includes table descriptions, column names, types, primary keys, and foreign-key references. Use this tool to understand the data model before querying with other tools. No data reads required — schema is embedded in the manifest. Available on all plans.

ParametersJSON Schema

Name	Required	Description	Default
`table`	No	Filter to a single table name (e.g. 'fact', 'entity', 'references'). Omit to return the full schema for all tables.

Tool Definition Quality

A4.7/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds that the schema is embedded in the manifest and requires no data reads, which is consistent. No contradictions. Could further mention that it's fast or returns a large result for all tables, but it's sufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: purpose, usage guidance, and availability. No wasted words. Key information is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple tool (one optional parameter, no output schema, rich annotations), the description covers purpose, usage, and behavioral aspects completely. No gaps identified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has one optional parameter 'table' with description covering 100% coverage. The description adds context that omitting the parameter returns the full schema for all tables, which is not in the schema description. This adds value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns the Parquet schema for all tables in the Valuein SEC data warehouse, listing the included details (table descriptions, column names, types, primary keys, foreign-key references). This distinguishes it from sibling tools that perform data queries or analysis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states 'Use this tool to understand the data model before querying with other tools', providing clear guidance on when to use it. It also notes that no data reads are required and it's available on all plans, implying no restrictions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

dismiss_inbox_itemDismiss Inbox ItemA

DestructiveIdempotent

Inspect

Soft-delete an inbox item — sets dismissed_at. The row stays queryable via list_alert_inbox(include_dismissed=true) for audit. Idempotent.

ParametersJSON Schema

Name	Required	Description	Default
`inbox_id`	Yes

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`inbox_id`	Yes
`dismissed`	Yes
`unread_count`	Yes

Tool Definition Quality

A4.1/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate destructiveHint=true and idempotentHint=true. The description adds context by clarifying it's a 'soft-delete' and explains the row remains accessible for audit, going beyond annotations. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences plus a word, each providing essential information: action, effect, and auditability. No wasted words; front-loaded with the primary action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers purpose, behavior, and audit trail. Output schema exists, so not explaining return values is acceptable. Could mention idempotency again (already in annotations) but overall complete for a simple tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, meaning parameter descriptions are absent. The tool description does not elaborate on inbox_id beyond its name. For a simple string parameter, more context (e.g., where to find it) would be helpful.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs a soft-delete on an inbox item by setting dismissed_at. It distinguishes itself from hard-delete tools like delete_alert by specifying the row remains queryable for audit.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description notes the item stays queryable via list_alert_inbox with include_dismissed=true, implying use when audit is needed. However, it does not explicitly state when to avoid this tool or mention alternatives like delete_alert.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

forensic_auditForensic Audit (Beneish + Sloan + Solvency)A

Read-onlyIdempotent

Inspect

Deterministic forensic-accounting scores for a single ticker: partial Beneish M-Score, Sloan accruals, and a solvency snapshot. Returns a red-flag narrative ranked by severity, with citations to source filings. Used by the forensic_earnings_brief SOP.

Note: full Beneish needs AR / current assets / PPE / SGA / current liabilities, which aren't in our fundamentals model. We compute the recoverable subset (SGI + TATA + LVGI) and flag partial=true. Tier: sp500+.

ParametersJSON Schema

Name	Required	Description	Default
`ticker`	Yes

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`result`	Yes
`ticker`	Yes
`sec_url`	Yes
`period_end`	Yes
`source_filing`	Yes
`prior_period_end`	Yes

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds value beyond annotations by stating it is deterministic, returns a red-flag narrative ranked by severity with citations, and that the result includes a partial flag. It does not contradict annotations (readOnly, idempotent, non-destructive). The behavioral traits are clearly disclosed, though more on rate limits or data freshness could be added.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two paragraphs with no fluff. The first paragraph covers purpose, output, and use context. The second adds important limitations and tier. Every sentence adds value, making it efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (partial forensic scores) and the presence of an output schema, the description covers input (single ticker), output (narrative with severity and citations), limitations (partial), and use context (SOP, sp500+ tier). It is comprehensive for an agent to understand when and how to invoke it.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The only parameter, ticker, is explained in the description as being for a single ticker. The schema has no descriptions (0% coverage), but the description compensates by providing context. For a simple string parameter, this is adequate, though examples or format hints could enhance clarity.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it computes deterministic forensic-accounting scores (Beneish, Sloan, solvency) for a single ticker, returning a narrative with severity and citations. It distinguishes itself by its specialized focus and mentions it's part of the forensic_earnings_brief SOP, setting it apart from sibling tools like get_earnings_signals or get_financial_ratios.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use the tool (for forensic accounting of a ticker) and notes its limitations (partial Beneish due to missing data, flagged as partial=true). It implies usage in the context of the forensic_earnings_brief SOP. However, it does not explicitly list when not to use it or compare to alternatives, which keeps it from a 5.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_comps_xlsxGenerate Peer Comparables Workbook (xlsx)AInspect

Render a peer comparables table into an Excel workbook. The Comps sheet is formatted as a named Excel Table (ValueinPeerComps) so the user gets one-click Insert Chart on any column — the cleanest workaround for not embedding chart objects server-side. Subject-row highlight makes side-by-side comparison instant. A Summary sheet adds subject vs peer-median deltas.

Pair with get_peer_comparables for a typical flow.

Tier: pro+.

ParametersJSON Schema

Name	Required	Description	Default
`notes`	No
`peers`	Yes
`subject_ticker`	Yes
`subject_company_name`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`url`	Yes
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`r2_key`	Yes
`filename`	Yes
`expires_at`	Yes
`size_bytes`	Yes
`content_type`	Yes
`expires_in_seconds`	Yes

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description adds details beyond annotations (readOnlyHint false, destructiveHint false) by explaining the Excel formatting features (table, highlight, summary). It does not mention side effects like file generation or size limits, but the output schema (present) may cover return type.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: four sentences front-loaded with purpose and key details (table format, summary sheet, usage tip). No redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers output format and usage pairing but omits parameter semantics and file size/timeout limits. Given complexity (nested peers array, 4 params) and presence of output schema, it is minimally complete but missing actionable details about input.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 0% description coverage, and the tool description does not explain parameters (subject_ticker, peers, etc.) beyond implicit references. It fails to compensate for the lack of schema descriptions, leaving agents to guess parameter meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool renders a peer comparables table into an Excel workbook, with specific formatting features like named Excel Table and summary sheet. It effectively distinguishes from sibling tools (e.g., get_peer_comparables retrieves data, while this generates the workbook).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly suggests pairing with get_peer_comparables for a typical flow, providing context. However, lacks guidance on when not to use this tool or alternative approaches.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_dcf_xlsxGenerate DCF Workbook (xlsx)AInspect

Render a forward DCF result into a professional Excel workbook (Summary + 5×5 Sensitivity heatmap + Inputs sheet). Native conditional formatting — no chart images needed. Returns a 15-minute presigned R2 download URL.

Pair with compute_dcf for a typical analyst flow: agent calls compute_dcf({ticker, ...}), then passes the structured result straight to generate_dcf_xlsx({ticker, dcf_result, ...}) to materialise a shareable file.

Tier: pro+.

ParametersJSON Schema

Name	Required	Description
`ticker`	Yes
`dcf_result`	Yes	Structured DCF result — typically the `result` field returned by `compute_dcf`.
`company_name`	No	Optional — surfaces on the cover row. Falls back to ticker only.

Output Schema

ParametersJSON Schema

Name	Required	Description
`url`	Yes
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`r2_key`	Yes
`filename`	Yes
`expires_at`	Yes
`size_bytes`	Yes
`content_type`	Yes
`expires_in_seconds`	Yes

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations are minimal (no readOnly, destructive, etc.), so the description carries the full burden. It adds valuable details: returns a 15-minute presigned R2 download URL, uses native conditional formatting, no chart images needed. Discloses behavioral aspects beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, front-loaded with purpose, no redundancy. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers what the tool does, what it returns (presigned URL), and how to use it in context. Despite nested parameters, the description and schema together provide sufficient information. Output schema existence reduces the need to explain return values.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 67% with descriptions for dcf_result and company_name. Description adds little beyond schema, except the pairing hint. ticker has no extra explanation. Baseline 3 is appropriate as schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Render'), resource ('forward DCF result into a professional Excel workbook'), and includes specific components (Summary, 5x5 Sensitivity heatmap, Inputs sheet). It distinguishes from siblings like compute_dcf and generate_comps_xlsx.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly pairs with compute_dcf, describing a typical analyst workflow. It also mentions 'Tier: pro+' indicating licensing. However, it does not explicitly state when not to use this tool versus alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_research_brief_docxGenerate Research Brief (docx)AInspect

Render a structured research brief into a professionally-styled Word document — cover, abstract, optional snapshot table, body sections, and a citations table with clickable SEC EDGAR links. No embedded charts in v1; pair with generate_dcf_xlsx / generate_comps_xlsx for visuals the analyst pastes in.

Consumes the same sections + citations shape create_report emits, so the typical flow is two tool calls: create_report → generate_research_brief_docx.

Tier: pro+.

ParametersJSON Schema

Name	Required	Description	Default
`title`	Yes
`ticker`	Yes
`abstract`	No
`sections`	Yes
`snapshot`	No
`citations`	No
`company_name`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`url`	Yes
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`r2_key`	Yes
`filename`	Yes
`expires_at`	Yes
`size_bytes`	Yes
`content_type`	Yes
`expires_in_seconds`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations (readOnlyHint=false, etc.) are present. The description adds value by explaining it consumes the same shape as `create_report`, the output is a Word doc with no embedded charts, and it notes v1 limitations. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with two well-structured paragraphs. Key information is front-loaded (purpose, components) and the second paragraph provides workflow context. No unnecessary sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 7 parameters, an output schema exists (relieving the need to describe return values), and the complexity is moderate, the description covers the core purpose and typical flow with sibling tools. It could briefly mention the output format (docx file) more explicitly, but overall it's adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It mentions that the tool consumes `sections` and `citations` shapes (as emitted by `create_report`), but does not explain individual parameters like `ticker`, `title`, `abstract`, `snapshot`, or `company_name`. Some compensation, but not thorough.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it renders a structured research brief into a Word document with specific components (cover, abstract, snapshot table, body, citations). It also distinguishes from sibling tools like `generate_dcf_xlsx` and `generate_comps_xlsx` by explicitly noting the lack of embedded charts and suggesting pairing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides good context on when to use: it's for generating the docx after `create_report`, and notes that it should be paired with other tools for visuals. However, it could be more explicit about when not to use or list alternatives directly.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_blockholdersBlockholders (SC 13D / 13G)A

Read-onlyIdempotent

Inspect

Returns SC 13D / SC 13G blockholder disclosures (5%+ stakes) for a US public company. Each row carries percent_owned, sole/shared voting + dispositive split, schedule_type, and the first-class going_active flag — TRUE when the same filer flipped 13G → 13D within the lookback window (the single most actionable activist signal in this dataset). Use latest_only=true (default) to dedupe to the most recent filing per filer. Use collapse_groups=true to fold multi-person filings into one row. Institutional tier only.

ParametersJSON Schema

Name	Required	Description	Default
`ticker`	Yes	Stock ticker symbol of the issuer.
`as_of_date`	No	PIT filter on accepted_at — only filings on or before this date.
`latest_only`	No	When true (default), keep only the most recent filing per (filer, schedule prefix) — typically what analysts want. Set false to see the full filing history.
`lookback_days`	No	Window for the going_active (13G → 13D) detection. Default 365 days.
`lineage_detail`	No	Per-row provenance envelope.	compact
`collapse_groups`	No	When true, fold multi-reporting-person filings into a single row, with secondary persons in the ``persons[]`` field. Default false: each person stays as its own row.
`schedule_filter`	No	Which schedule(s) to return. '13D' = activist (intent to influence). '13G' = passive. 'both' = no filter.	both

Output Schema

ParametersJSON Schema

Name	Required	Description
`cik`	Yes
`rows`	Yes
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`ticker`	Yes
`company_name`	Yes
`data_age_days`	Yes
`staleness_warning`	Yes

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, idempotent, and non-destructive behavior. The description adds significant value by highlighting the 'going_active' flag as an actionable signal and explaining parameter effects (e.g., latest_only deduping). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with a clear first sentence defining the purpose, followed by key fields and parameter explanations. It is slightly lengthy but every sentence adds value, earning a score just below perfect.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema (not shown but indicated), the description adequately covers purpose, key outputs, and parameter usage. It also includes important context like 'Enterprise tier only' and the 'going_active' signal, making it complete for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with detailed parameter descriptions. The description adds extra context (e.g., 'typically what analysts want' for latest_only, 'single most actionable activist signal' for going_active) that enhances understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns SC 13D/13G blockholder disclosures for US public companies, specifies key fields like percent_owned and the 'going_active' flag, and distinguishes it from siblings by emphasizing a unique activist signal.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on when to use the tool (for blockholder disclosures) and includes an 'Enterprise tier only' note. However, it does not explicitly contrast with siblings like get_top_holders or get_institutional_holdings, leaving some ambiguity for an agent.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_capital_allocation_profileCapital Allocation ProfileA

Read-onlyIdempotent

Inspect

Get a multi-year capital allocation breakdown for a US public company. Shows how management deploys cash across all six categories — capex, R&D, M&A, dividends, buybacks, and debt — plus pre-computed deployment ratios (% of operating cash flow) and over-distribution flags. Use this tool when the user asks: how does a company allocate capital, what's the buyback-vs-dividend mix, is the company over-distributing, is growth funded by R&D or M&A, what's the cash return ratio trend, or any 'where does the money go' question. Also use for owner-earnings analysis (Buffett-style) and reinvestment-rate analysis (Damodaran-style). Data sourced from annual 10-K filings (SEC EDGAR) — income statement (R&D), investing section (capex, M&A), financing section (dividends, buybacks, debt). All figures are point-in-time safe via the as_of_date parameter — no look-ahead bias. R&D semantics: R&D is included as a deployment category despite being an income-statement expense, because for knowledge-economy businesses (tech, pharma, industrials with heavy engineering) it represents the primary growth reinvestment vehicle and often dwarfs capex. R&D is already deducted before reaching operating cash flow, so rd_pct_ocf is INFORMATIONAL — it does not reduce OCF a second time. The total_deployment_pct_ocf field excludes R&D from its sum to preserve the cash-flow identity (OCF = capex + M&A + dividends + buybacks + debt repayment + change in cash). Flags object: pre-computed booleans for common analytical questions. Use buybacks_exceed_fcf to identify years a company returned more to shareholders via repurchases than it generated in free cash flow. Use total_returns_exceed_fcf for the stricter test (buybacks + dividends > FCF). Use debt_funded_distribution to distinguish over-distribution funded by leverage (typical industrials) from over-distribution funded by cash hoard (Apple 2018-2019 post-tax-reform repatriation). NOT yet included (separate roadmap items): buyback_yield_implied requires a price × shares market-cap series; equity-method investments and intangibles are excluded from acquisitions_net to keep M&A semantics tight (request other_investing_outflows if needed). Available on all plans.

ParametersJSON Schema

Name	Required	Description
`ticker`	Yes	Stock ticker symbol, e.g. AAPL, MSFT
`as_of_date`	No	Point-in-time date (YYYY-MM-DD). Only returns facts with accepted_at on or before this date — eliminates look-ahead bias.
`lookback_years`	No	Number of fiscal years to look back from the most recent filing (1–20). Defaults to 5 years for a full capital allocation cycle.

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, openWorldHint, idempotentHint, and destructiveHint=false. The description adds value by clarifying point-in-time safety via as_of_date and specifying the data source (SEC EDGAR), which are not in annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and well-structured, with a clear first sentence stating the core purpose, followed by details on data source and included metrics, ending with key usage notes. No unnecessary sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description lists the specific metrics included (operating cash flow, capex, FCF, dividends, etc.), making it complete for an analyst to understand what data is returned. The tool is not complex, and the description covers all necessary context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are fully documented. The description adds context for lookback_years (default 5 years for a full cycle) and as_of_date (point-in-time, eliminates look-ahead bias), but these are already covered in the schema's descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool retrieves a multi-year capital allocation breakdown for US public companies, specifying the resource (cash flow deployment data) and action (get), and distinguishes it from sibling tools like get_company_fundamentals and get_financial_ratios by focusing on capital allocation inputs.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description clearly states the tool is for US public companies and includes data sourced from 10-K filings, providing context for when to use it. However, it does not explicitly state when not to use it or mention alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_company_fundamentalsCompany FundamentalsA

Read-onlyIdempotent

Inspect

Retrieve standardized SEC EDGAR fundamental financial metrics for a US public company. Returns revenue, gross profit, operating income, net income, EPS (diluted), total assets, total liabilities, stockholders' equity, cash & equivalents, total debt, operating cash flow, and capital expenditures for one or more fiscal periods. Data sourced from 10-K (annual) and 10-Q (quarterly) filings. Point-in-time: no look-ahead bias.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No	Maximum number of periods to return (1–40). Defaults to 5.
`period`	No	Filing period granularity. Annual uses 10-K; quarterly uses 10-Q.	annual
`strict`	No	When true, fail with PLAN_LIMIT_EXCEEDED if the plan cannot satisfy the requested limit. Default false: return what's available and explain the gap in _meta.truncation.
`ticker`	Yes	Stock ticker symbol, e.g. AAPL, MSFT, BRK.B
`as_of_date`	No	Point-in-time date (YYYY-MM-DD). Only returns facts with accepted_at on or before this date — eliminates look-ahead bias for backtesting. Omit for the full dataset.
`fiscal_year`	No	Fiscal year (YYYY). Omit to return the most recent available years.
`lineage_detail`	No	Per-period provenance envelope. 'compact' (default) returns source_filing + source_url + restated flag for one-click SEC verification. 'full' adds first_filed_at + accepted_at for Bloomberg-Option-C restatement reasoning. 'off' omits lineage entirely.	compact

Output Schema

ParametersJSON Schema

Name	Required	Description
`data`	Yes
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`period`	Yes
`ticker`	Yes
`as_of_date`	Yes
`company_name`	Yes
`years_returned`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, openWorldHint, idempotentHint, and destructiveHint. Description adds useful context: point-in-time behavior, no look-ahead bias, and data source. Does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, each adding value: first states purpose and metrics, second specifies source, third clarifies point-in-time property. No fluff, front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters with 100% schema coverage, output schema exists, and annotations present, description covers key aspects. Could mention return value shape or how to handle missing data, but not necessary.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description adds some context beyond schema (e.g., 'point-in-time date eliminates look-ahead bias') but doesn't detail each parameter's interaction. Adequate given schema completeness.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it retrieves standardized SEC EDGAR fundamental metrics for US public companies, listing specific metrics (revenue, net income, EPS, etc.) and source filings. This distinguishes it from siblings like get_financial_ratios or get_valuation_metrics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description mentions data source (10-K/10-Q), point-in-time capability, and lack of look-ahead bias. However, it doesn't explicitly state when not to use this tool versus siblings like get_financial_ratios or get_earnings_signals.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_compute_ready_streamCompute-Ready StreamA

Read-onlyIdempotent

Inspect

STATUS: pending — direct R2 Parquet access is in private beta (ETA 2026-Q3). Calls return 501 FEATURE_NOT_AVAILABLE today. When live: returns a pre-signed Cloudflare R2 URL for bulk Parquet access that can be piped into Python/DuckDB/Polars for high-throughput computation that exceeds the MCP context window. Datasets: fact (per-entity partition — requires ticker), ratio (all computed ratios), valuation (DCF inputs), filing (SEC filing metadata), references (company universe), index_membership (historical index composition). URL would expire in 15 minutes. TODAY use the Python SDK (pip install valuein-sdk) for the same data via DuckDB.

ParametersJSON Schema

Name	Required	Description	Default
`ticker`	No	Required when dataset_type is 'fact'. Resolves to the per-entity fact/{CIK}.parquet partition for that company.
`dataset_type`	Yes	Dataset to access. 'fact' requires ticker (per-entity partition). All others are full-universe tables.

Tool Definition Quality

A4.1/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint=false, covering safety. The description adds value by stating URL expiry (1 hour) and plan availability ('Available on every plan — sample returns a URL scoped to the sample bucket'), which are not in annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is concise (3 sentences) and front-loaded with the core purpose. It covers use case, datasets, expiry, and example without redundancy. Could be slightly more structured (e.g., bullet points for datasets) but currently efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (2 params, no output schema), the description is complete: it explains the return value (pre-signed URL), use case, datasets, ticker requirement, expiry, and plan info. No output schema exists, so description must cover return; it does adequately.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description does not add parameter details beyond what schema provides (e.g., ticker required for 'fact', dataset_type enum). It lists dataset types and briefly explains 'fact' requiring ticker, but this largely mirrors the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns a pre-signed Cloudflare R2 URL for direct Parquet file access, with specific verb ('Get') and resource ('pre-signed Cloudflare R2 URL'). It distinguishes from siblings by explaining the use case (bulk data exceeding MCP context window) and listing datasets, which no sibling tool does.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says when to use this tool ('when the query requires bulk data that exceeds the MCP context window') and provides an example usage pattern. It does not explicitly state when not to use or name alternatives, but the sibling tools list and context imply alternatives exist for smaller data.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_earnings_signalsEarnings SignalsA

Read-onlyIdempotent

Inspect

Reported earnings results and earnings-surprise metrics for a company, by fiscal period: actual EPS, an expected-EPS estimate, the surprise versus that estimate (as a %), reported revenue, and year-over-year revenue growth. Use it to see whether a company beat or missed on earnings and by how much, and to track revenue trajectory. Point-in-time safe — pass as_of_date to filter by SEC acceptance (accepted_at) for look-ahead-free backtests. Available on all plans.

ParametersJSON Schema

Name	Required	Description
`limit`	No	Maximum number of periods to return (1–40), most recent first. Defaults to 8 — covers 2 years of quarterly signals plus their TTM equivalents. earnings_signals.parquet currently emits one row per (entity, period_end); older rows surface here when the pipeline emits historical trend snapshots.
`ticker`	Yes	Stock ticker symbol, e.g. AAPL, MSFT
`as_of_date`	No	Point-in-time filter: only return signals with accepted_at on or before this date. Use for backtesting to avoid look-ahead bias.

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, idempotentHint, and non-destructive behavior. The description adds context about trend calculation (trailing 4-quarter average) and plan availability, which is helpful but not extensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences with no fluff; each sentence adds distinct value (definition, usage guidance, availability).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description explains the key metrics (EPS trend, surprise_pct). However, it could mention expected output format or additional fields. Still adequate for a simple data retrieval tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents parameters well. The description adds brief context about as_of_date for backtesting but does not add significant new meaning beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Trend-based earnings expectations and surprise metrics' and explains EPS trend and surprise_pct, which distinguishes it from sibling tools like get_company_fundamentals or get_financial_ratios that focus on different metrics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly mentions point-in-time safety for backtesting with as_of_date, but does not explicitly say when NOT to use it or compare to alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_financial_ratiosFinancial RatiosA

Read-onlyIdempotent

Inspect

Get pipeline-computed financial ratios from ratio.parquet. Covers 7 categories: profitability (margins, ROE, ROA, ROIC), liquidity (current ratio, quick ratio), leverage (D/E, interest coverage, net debt/EBITDA), efficiency (asset turnover, inventory days), per_share (EPS, BVPS, FCF/share), owner_earnings (Buffett FCF, owner yield), and valuation (EV/EBITDA, P/E, P/FCF). Includes TTM (trailing twelve months) rows alongside annual periods. Each row carries is_calendar_aligned (boolean) — TRUE when period_end is actually on the entity's fiscal year boundary (±7 days), FALSE when the pipeline emitted a calendar-quarter row tagged fiscal_period='FY' for a non-December fiscal-year filer. Filter to is_calendar_aligned=TRUE if you're joining ratios with fact-table fundamentals on (entity, fiscal_year). Ratios are pipeline-derived — they're recomputed on each pipeline run with ON CONFLICT DO UPDATE. There is NO accepted_at column; for historical cuts use period_end_before (filters by ratio.period_end), not as_of_date. _meta.pit_safe is structurally false for this tool — for true PIT analysis, use get_company_fundamentals (which carries fact.accepted_at). Use this instead of get_valuation_metrics when you only need ratios (no DCF wiring); use get_valuation_metrics when you also need DCF/DDM. Available on all plans.

ParametersJSON Schema

Name	Required	Description
`limit`	No	Number of distinct period_end dates to return (1–20). Defaults to 5. Within each period, all matching ratio_names are included.
`ticker`	Yes	Stock ticker symbol, e.g. AAPL, MSFT
`categories`	No	Ratio categories to include. Omit to return all categories. Options: profitability, liquidity, leverage, efficiency, per_share, owner_earnings, valuation.
`fiscal_period`	No	Filter to a specific fiscal period type. Use 'TTM' for trailing twelve months. Omit to return both annual (FY) and TTM rows.
`period_end_before`	No	Historical cutoff: only return ratios with period_end on or before this date. Use this instead of as_of_date — ratio data has no PIT accepted_at.

Output Schema

ParametersJSON Schema

Name	Required	Description
`data`	Yes
`note`	Yes
`plan`	Yes
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`ticker`	Yes
`periods_returned`	Yes

Tool Definition Quality

A4.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false, so the tool's non-destructive, read-only nature is clear. Description adds value by noting ratios are pipeline-derived and that knowledge_at does not apply, but no additional behavioral traits beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is concise, front-loading the main action and then listing categories. The note about period_end vs knowledge_at is important but could be slightly shorter. No wasted sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the rich input schema and annotations, the description covers all necessary context: data source, categories, special behavior (TTM, period_end usage), and plan availability. Output schema exists so return values need not be explained.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so each parameter is already well-documented in the schema. The description adds no further parameter details, but baseline is 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves pipeline-computed financial ratios from the ratio table, covering 7 specific categories. It distinguishes itself from siblings like get_valuation_metrics by focusing on comprehensive ratio categories including TTM rows.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises to filter by period_end for historical analysis, not knowledge_at, clarifying correct usage. Implicitly suggests not using as_of_date. Provides guidance on when to omit categories or fiscal_period.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_insider_sentimentInsider Sentiment (composite)A

Read-onlyIdempotent

Inspect

Role-weighted insider sentiment score on a fixed [-100, +100] scale for a single issuer over a lookback window. Role weights: CEO/CFO = 3.0 (via officer_title pattern), other NEO Officer = 2.0, 10%-Owner = 1.5, Director = 1.0. P = +1, S = -1; option exercises, grants, and tax withholdings are neutralised. Cluster flag = TRUE when ≥3 distinct insiders transacted within any 30-day window inside the lookback. Institutional tier only.

ParametersJSON Schema

Name	Required	Description
`ticker`	Yes	Issuer ticker symbol.
`lookback_days`	No	Days back from today to scan transactions for. Default 180.
`cluster_window_days`	No	Sliding window for the cluster_flag detection. Default 30 days.

Output Schema

ParametersJSON Schema

Name	Required	Description
`cik`	Yes
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`ticker`	Yes
`buy_count`	Yes
`sell_count`	Yes
`cluster_flag`	Yes
`company_name`	Yes
`lookback_days`	Yes
`total_buy_usd`	Yes
`total_sell_usd`	Yes
`sentiment_score`	Yes
`top_contributors`	Yes
`total_buy_shares`	Yes
`total_sell_shares`	Yes

Tool Definition Quality

A4.5/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly, openWorld, idempotent, and non-destructive. The description adds rich behavioral details: specific role weights (CEO/CFO=3.0, etc.), transaction type mapping (P=+1, S=-1), neutralization of options/exercises/grants, and cluster flag logic. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, packing a lot of information into a single paragraph. It is front-loaded with the core purpose. However, it could be slightly more structured (e.g., bullet points for role weights) to aid quick scanning.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The output schema exists, so the description doesn't need to detail return format. It covers the score range, cluster flag, and enterprise restriction. For a single-issuer query tool, this is nearly complete. Missing perhaps a note that it returns a single score per request.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (all three parameters have descriptions). The description adds context: 'lookback_days' is linked to 'scan transactions for', and 'cluster_window_days' is explained as 'sliding window for the cluster_flag detection'. This goes beyond the schema's minimal descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the purpose: 'Role-weighted insider sentiment score on a fixed [-100, +100] scale for a single issuer over a lookback window.' It specifies the resource (insider sentiment), the scale, and scope (single issuer, lookback). This distinguishes it from sibling tools like get_insider_transactions or get_smart_money_flow.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes 'Enterprise tier only,' indicating when the tool is available. It also provides detailed role weights and transaction handling, which helps agents decide when to use this aggregated sentiment score versus raw transaction data. However, it lacks explicit 'when not to use' or direct comparison to alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_insider_transactionsInsider TransactionsA

Read-onlyIdempotent

Inspect

Form 3 / 4 / 5 / 144 line items for a US public company. Returns each transaction (or initial holding / proposed sale) with the insider's name, role, transaction code, share count, price, and notional. Filters by lookback window, transaction code (P=purchase, S=sale, A=grant, M=option exercise, F=tax withholding, etc.), insider role, and minimum share threshold. Institutional tier only — sample / sp500 / pro return ENTITLEMENT_DENIED with an upgrade link.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No	Maximum rows to return. Default 100, max 500.
`ticker`	Yes	Stock ticker symbol, e.g. AAPL, MSFT, BRK.B
`roles_in`	No	Insider roles to keep. Omit to include any role.
`as_of_date`	No	Point-in-time date (YYYY-MM-DD). Only returns transactions with accepted_at <= this date — eliminates look-ahead bias. When set, lookback_days is ignored.
`min_shares`	No	Minimum \|shares\| per transaction. Omit for no floor.
`lookback_days`	No	How many days back from today to scan transactions for. Ignored when as_of_date is set.
`lineage_detail`	No	Per-row provenance envelope. 'compact' (default) returns source_filing + source_url. 'full' adds accepted_at. 'off' omits lineage.	compact
`transaction_codes`	No	SEC transaction codes to keep (uppercase, single-letter): P=purchase, S=sale, A=grant, M=option exercise, F=tax withholding, G=gift, J=other. Omit to include all codes.

Output Schema

ParametersJSON Schema

Name	Required	Description
`cik`	Yes
`rows`	Yes
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`ticker`	Yes
`company_name`	Yes
`data_age_days`	Yes
`staleness_warning`	Yes

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond the annotations (readOnlyHint, idempotentHint), the description adds key behavioral info: the tier-gated response, the point-in-time date behavior (as_of_date eliminates look-ahead bias), and the lineage_detail parameter options. This provides comprehensive transparency without contradicting annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short paragraphs, front-loading the core purpose and immediately listing filters and constraints. Every sentence adds value, with no redundancy or fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the 8 parameters, 100% schema description coverage, and an existing output schema, the description covers all essential aspects: what data is returned, available filters, tier restrictions, and behavioral nuances. No missing information is critical for the agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds extra meaning by explaining that as_of_date overrides lookback_days and listing transaction code mappings (P, S, A, etc.) that are not fully detailed in the schema. This adds moderate value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns Form 3/4/5/144 line items for a US public company, specifying the exact data fields (insider's name, role, transaction code, etc.). It distinguishes itself from siblings like get_insider_sentiment by focusing on raw transaction data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly notes that the tool is 'Enterprise tier only' and that lower tiers receive ENTITLEMENT_DENIED, providing clear usage constraints. It does not explicitly mention alternative tools, but given the specificity of the data, the guidance is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_institutional_holdingsInstitutional Holdings (by issuer)A

Read-onlyIdempotent

Inspect

Returns top-N institutional holders of a US public company at a specific period_end (latest by default), with aggregate institutional shares, total market value, holder count, and HHI concentration (sum of squared share-of-total percentages). Sourced from Form 13F-HR via the by-issuer partition. Institutional tier only. 13F filings carry a ~45-day reporting lag — staleness_warning fires when latest data is older than 90 days.

ParametersJSON Schema

Name	Required	Description	Default
`top_n`	No	Maximum holders to return, ranked by market_value_usd. Default 25, max 200.
`ticker`	Yes	Stock ticker symbol of the issuer.
`period_end`	No	Quarter-end of the 13F reporting period (YYYY-MM-DD). Omit to use the latest period available.
`lineage_detail`	No	Per-row provenance envelope. compact / full / off.	compact

Output Schema

ParametersJSON Schema

Name	Required	Description
`cik`	Yes
`rows`	Yes
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`ticker`	Yes
`period_end`	Yes
`company_name`	Yes
`data_age_days`	Yes
`holders_count`	Yes
`hhi_concentration`	Yes
`staleness_warning`	Yes
`total_market_value_usd`	Yes
`total_institutional_shares`	Yes

Tool Definition Quality

A3.9/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds valuable behavioral context: a ~45-day reporting lag and a staleness_warning when data is older than 90 days. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no fluff, front-loaded with the main function. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 4 parameters and an output schema, the description covers the essential aspects: what data is returned, source, lag, staleness warning, and enterprise restriction. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description adds context like 'latest by default' for period_end and mentions top-N, but most parameter details are already in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns top-N institutional holders with aggregate metrics and HHI, sourced from Form 13F-HR. However, it does not explicitly differentiate from sibling tools like get_top_holders or get_blockholders.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for institutional holdings data but lacks explicit guidance on when to use this tool versus alternatives. It mentions 'Enterprise tier only' but no when-to-use or when-not-to-use conditions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_manager_portfolioManager Portfolio (13F by filer)A

Read-onlyIdempotent

Inspect

Returns a 13F filer's full portfolio at a specific period_end (latest by default), with QoQ deltas vs the prior quarter (new / increased / decreased / exited / unchanged). Specify the filer either by filer_cik (preferred) or filer_name (fuzzy match against entity.name; multiple matches raise an ambiguity error so you can disambiguate by CIK). Institutional tier only.

ParametersJSON Schema

Name	Required	Description	Default
`top_n`	No	Maximum positions to return, ranked by market_value_usd. Default 25.
`filer_cik`	No	CIK of the 13F filer (1-10 digits; will be zero-padded to 10). Preferred over filer_name when known.
`filer_name`	No	Filer name to fuzzy-match against entity.name. Case-insensitive substring match. Multiple matches raise INVALID_ARGUMENT — use filer_cik in that case.
`period_end`	No	Quarter-end (YYYY-MM-DD). Omit to use latest available.
`lineage_detail`	No	Per-row provenance envelope.	compact

Output Schema

ParametersJSON Schema

Name	Required	Description
`rows`	Yes
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`filer_cik`	Yes
`filer_name`	Yes
`period_end`	Yes
`data_age_days`	Yes
`positions_count`	Yes
`prior_period_end`	Yes
`staleness_warning`	Yes
`total_market_value_usd`	Yes

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, idempotent, non-destructive behavior. The description adds value by detailing the QoQ delta categories (new/increased/decreased/exited/unchanged) and the ambiguity error for multiple name matches. This context goes beyond annotations without contradicting them.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences: main function, identifier usage, and tier restriction. Front-loaded with the primary action. No redundant or unnecessary words. Every sentence adds essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description mentions 'full portfolio' but the top_n parameter limits positions (default 25, max 200), which is a notable discrepancy. The description should clarify that results are a ranked subset. Otherwise, it covers identifier usage, defaults, and tier. Output schema exists, so return format details are not required here.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema covers all parameters with descriptions (100% coverage). The description reinforces key points (prefer filer_cik, default period_end, top_n default) and explains the fuzzy match logic. This adds usability guidance beyond the schema descriptions, earning above baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns a 13F filer's full portfolio at a specific period_end with QoQ deltas. The verb 'Returns', resource '13F filer's full portfolio', and additional details (QoQ deltas, default period) make the purpose unambiguous. It distinguishes from siblings like get_institutional_holdings by focusing on a single filer's portfolio with quarterly changes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance on specifying the filer via filer_cik (preferred) or filer_name (fuzzy match, ambiguity error). Mentions the enterprise tier restriction. However, does not discuss when not to use this tool compared to siblings or alternative approaches for different needs.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_peer_comparablesPeer ComparablesA

Read-onlyIdempotent

Inspect

Get ratio-based peer comparison for a company and its closest competitors. Peers are selected by matching 2-digit SIC industry code. Returns pipeline-computed ratios from up to 10 peers alongside the subject company for direct benchmarking. Ratio categories: profitability, liquidity, leverage, efficiency, per_share, owner_earnings, valuation. TTM (trailing twelve months) ratios are used when available for the most current view. Use period_end_before to compare peers at a specific historical date. Available on every plan — sample returns the subset covered by the sample bucket.

ParametersJSON Schema

Name	Required	Description
`limit`	No	Maximum number of peers to return alongside the subject company (1–10). Defaults to 5.
`ticker`	Yes	Subject company ticker, e.g. AAPL. Peers are auto-selected by SIC code.
`categories`	No	Ratio categories to include in the comparison. Defaults to profitability, valuation, and leverage.
`period_end_before`	No	Historical cutoff: only include ratios with period_end on or before this date. Use this for historical peer benchmarking snapshots.

Tool Definition Quality

A3.9/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, openWorldHint, idempotentHint, and non-destructive behavior. The description adds context about data sources (pipeline-computed ratios, TTM ratios) and plan availability. It does not contradict annotations; annotation_contradiction is false.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is moderately concise at 4 sentences, covering purpose, peer selection, ratio categories, TTM usage, historical comparison, and plan availability. It is front-loaded with the primary purpose. Minor redundancy could be trimmed, but it is well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description explains what is returned (ratios from up to 10 peers alongside subject company) and mentions categories. It also covers plan availability and sampling behavior. It could add details about return format (e.g., data structure), but completeness is high for a tool with strong annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description adds value by explaining that peers are auto-selected by SIC code and that TTM ratios are used, but it does not add new meaning beyond the schema for each parameter. For example, 'limit' and 'categories' are well-described in schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it retrieves ratio-based peer comparisons for a company and its closest competitors, selecting peers by 2-digit SIC code. It distinguishes from sibling tools by mentioning peer selection and specific ratio categories, making it unique among sibling tools like get_financial_ratios or get_valuation_metrics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly mentions when to use it (for peer benchmarking) and provides guidance on historical comparisons via period_end_before. However, it does not explicitly state when not to use it or mention alternatives for peer selection (e.g., if user wants custom peers), which would elevate it to a 5.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_pit_universePoint-in-Time UniverseA

Read-onlyIdempotent

Inspect

Use this tool to answer questions about historical index membership — e.g. "Was Company X in the S&P 500 on date Y?" or "Which companies were in the Russell 2000 on 2010-01-01?" Use this INSTEAD OF search_companies when the question involves a specific historical date or asks whether a company was an index member at a point in the past. search_companies only returns current membership snapshots and cannot answer historical membership questions.

Returns a survivorship-free universe of companies valid on a specific as_of_date: only companies that existed and were index members on that exact date — no hindsight contamination. Supports SP500, RUSSELL1000, RUSSELL2000, RUSSELL3000 via index_membership.parquet (accurate join/leave dates with [) interval semantics). To check a single company's membership, pass its ticker and the target date; if the company appears in the response it was a member, if absent it was not.

Returns per company: CIK, ticker, name, sector, industry, SIC code, plus per-row membership confidence (high/medium/low). Check _meta.pit_safe: true only when every matched row is high-confidence; medium/low rows downgrade it to false — treat low-confidence rows with caution for backtest use.

NOTE: sector is SIC-derived (GICS-aligned labels via sic_to_sector.csv), not licensed GICS — industrial conglomerates may map differently. Treat as a screening bucket, not an authoritative GICS label.

Use as the first step of a quantitative backtest before calling get_compute_ready_stream to pull Parquet data for the universe. Returns empty array (with error detail) if the date is out of range or the index_membership data has no coverage for that date. Available on every plan — sample tier returns the subset covered by the sample bucket.

ParametersJSON Schema

Name	Required	Description	Default
`index`	No	Index filter. 'sp500' (~500 large caps), 'russell1000' (~1000 large/mid), 'russell2000' (~2000 small caps), 'russell3000' (~3000 broad market). Omit for no index filter (sector-only or full universe queries).
`limit`	No	Maximum number of companies to return (1–3500). Defaults to 100. Sized to fit Russell 3000 (~3000 members) plus headroom for delisted/historical members. Universe is deduped to one row per CIK by default, so set near the index size (SP500 ~505, Russell 1000 ~1010, Russell 3000 ~3050) — no need for share-class headroom.
`sector`	No	Sector filter (case-insensitive substring match) over the SIC-derived, GICS-aligned sector label. E.g. 'Technology', 'Health Care', 'Energy'. Not licensed GICS — see tool description for caveat.
`is_active`	No	Filter to active (currently trading) companies only. Omit to include all. WARNING: setting this to true on a HISTORICAL query reintroduces survivorship bias — companies that were active on as_of_date but later went bankrupt or got acquired will be filtered out. Leave unset for true PIT backtests.
`as_of_date`	No	Historical date (YYYY-MM-DD) for survivorship-free universe construction. For an index: uses index_membership join/leave dates — companies that entered after this date are excluded; companies later removed are kept. For sector: uses security valid_from/valid_to. Omit for the current universe.
`as_of_basis`	No	Which date column to filter on for historical universe construction. 'effective' (default) — match against effective_date / removal_date (first trading day as a member; passive index replication). 'announcement' — match against announcement_date / removal_announcement_date (the day S&P publicly announced the change; use for inclusion-arb backtests). 'announcement' rows with NULL announcement_date are silently skipped — pre-2015 changes generally lack press-release coverage.	effective
`include_share_classes`	No	When false (default), the universe is collapsed to one row per CIK — matches how index providers count constituents (BRK is one S&P 500 member, not two for BRK-A + BRK-B). When true, every share-class row is returned (GOOG and GOOGL emit separate rows, etc.). Use only for security-level (not company-level) analysis.

Tool Definition Quality

A4.7/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, openWorldHint=true, idempotentHint=true, destructiveHint=false, so the safety profile is clear. Description adds valuable context: explains how historical data is constructed (using index_membership.parquet with accurate join/leave dates), which is beyond annotations. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is concise yet comprehensive: 6 sentences covering purpose, usage, key implementation detail, output fields, recommended workflow, and availability. Front-loaded with the most critical information. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 5 parameters with 100% schema coverage, no output schema, and annotations present, the description fully addresses what the agent needs: purpose, when to use, how it works internally, output fields, and relationship to sibling tools. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. Description adds meaning by explaining the purpose of 'as_of_date' (historical date for survivorship-free universe) and 'index' (returns S&P 500 constituents). However, it doesn't detail all parameter behaviors (e.g., sector substring match). Slight additional value over schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Get a survivorship-free universe of companies valid on a specific historical date.' It specifies the verb (get), resource (universe of companies), and key property (survivorship-free, point-in-time). Differentiates from siblings like 'screen_universe' and 'get_compute_ready_stream' by highlighting its role as the first step in a backtest and its historical validity focus.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly tells when to use: 'Use as the first step of a quantitative backtest before calling get_compute_ready_stream.' Also states what not to expect: no survivorship bias, uses historical data, not current snapshots. Provides context for bias-free research and mentions plan availability.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_reportGet Research ReportA

Read-onlyIdempotent

Inspect

Fetch a report by id. format=markdown returns the rendered body, format=json returns the full structured payload (sections + citations + report-type-specific data), format=preview returns abstract-only. Sample tier is downgraded to preview regardless of input.

ParametersJSON Schema

Name	Required	Description	Default
`format`	No	Response shape. Defaults to markdown.	markdown
`report_id`	Yes	Id from `create_report` or `list_my_reports`.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`format`	Yes
`report`	Yes
`markdown`	Yes
`sections`	Yes
`citations`	Yes
`structured`	Yes

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, idempotentHint, destructiveHint, so the safety profile is clear. Description adds the important behavioral trait that sample tier always returns preview regardless of input format, which is not in annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with the verb, each sentence adds essential information. No redundant or unclear phrases.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity of the tool (2 params, read-only, output schema exists) and the comprehensive annotation coverage, the description adequately covers the behavior, including format details and tier constraint.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions, but the description adds significant meaning by explaining what each format returns (e.g., 'rendered body' for markdown, 'full structured payload' for json). It also clarifies the tier-related preview override.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Fetch a report by id' with specific verb and resource, and distinguishes the three format options. It is well-differentiated from sibling tools like create_report, delete_report, etc.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description provides context on when to use each format (e.g., 'rendered body' for markdown, 'structured payload' for json) and notes the tier-dependent preview behavior. However, it does not explicitly exclude alternatives or give when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_report_versionGet Report VersionA

Read-onlyIdempotent

Inspect

Author-only fetch of a specific archived version of one of your reports. Returns metadata + the full payload (sections, citations, structured, markdown) — enough to render a diff against the current HEAD in the workspace editor. Use after list_report_versions identifies the version number you want.

ParametersJSON Schema

Name	Required	Description	Default
`version`	Yes	Version number to fetch (from list_report_versions).
`report_id`	Yes

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`payload`	Yes
`version`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint, idempotentHint, destructiveHint. Description adds value by detailing return payload (metadata + full payload) and use case (render diff against HEAD). No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences. First sentence states purpose and scope, second explains usage and return value. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given presence of output schema, return values are documented elsewhere. Description covers purpose, usage guidelines, and behavioral context. Adequate for a fetch tool with output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 50%. Description clarifies the 'version' parameter (from list_report_versions) but offers minimal insight for 'report_id' beyond the tool's context. Baseline 3 with partial compensation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description specifies verb 'fetch', resource 'archived version of report', and scope 'author-only'. Clearly distinguishes from siblings like 'get_report' and 'list_report_versions'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states 'Use after list_report_versions identifies the version number you want', providing clear context. Does not explicitly exclude alternatives like 'get_report' for the current version, but the instruction is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_sec_filing_linksSEC Filing LinksA

Read-onlyIdempotent

Inspect

Get direct links to original SEC EDGAR filings for any US public company. Returns two per-filing deep links: sec_url (the EDGAR filing-index page listing every document) and viewer_url (the SEC iXBRL inline-viewer for the specific accession). Supported form_types (enum): 10-K, 10-Q, 8-K, 20-F, 40-F, 10-K/A, 10-Q/A, 20-F/A, 40-F/A. Other forms (6-K, DEF 14A, Form 4, 13F) are NOT yet exposed by this tool — use describe_schema to confirm the parquet has them, then read raw via the SDK. 8-K item codes are filterable via event_types (e.g. ['2.02'] for earnings, ['1.01'] for material agreements, ['5.02'] for officer changes). PIT-safe — filings are filtered by accepted_at, never by report_date alone. Use this instead of verify_fact_lineage when you want a list of filings; use verify_fact_lineage when you want one specific fact-to-filing trace. Available on all plans.

ParametersJSON Schema

Name	Required	Description
`limit`	No	Maximum number of filings to return (1–50). Defaults to 10.
`ticker`	Yes	Stock ticker symbol, e.g. AAPL, MSFT
`end_date`	No	Inclusive upper bound on filing_date (YYYY-MM-DD). E.g. '2023-12-31'.
`form_types`	No	Filing form types to include. Defaults to 10-K and 10-Q.
`start_date`	No	Inclusive lower bound on filing_date (YYYY-MM-DD). E.g. '2023-01-01'.
`event_types`	No	8-K item codes to filter by. E.g. ['1.01'] for material agreements, ['2.01'] for asset acquisitions, ['5.02'] for director/officer changes. Only relevant when form_types includes '8-K'.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`ticker`	Yes
`filings`	Yes
`filings_returned`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and non-destructive nature. The description adds transparency by specifying the exact return content (EDGAR interactive viewer URL and raw filing index URL) and availability on all plans, complementing the annotations without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at three sentences, front-loaded with the core purpose, and every sentence adds value: purpose, output details, and availability. No waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists, return values need no further explanation. The description covers the tool's scope, constraints (US public companies, EDGAR links), and availability. However, it could briefly mention that the tool is limited to EDGAR filings, which is already implied, making it largely complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents all 6 parameters. The description adds some context (e.g., form types listed, 8-K event codes explained) but does not significantly enhance beyond schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns direct links to original SEC EDGAR filings for US public companies, specifying form types (10-K, 10-Q, etc.) and output details (interactive viewer URL and raw index URL). This differentiates it from siblings like search_filing_text (which searches content) and get_company_fundamentals (which provides metrics).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implicitly indicates when to use this tool (when needing links to filings) and mentions availability on all plans, but does not explicitly state when not to use it or compare to alternatives. For example, it doesn't contrast with search_filing_text for content search or get_company_fundamentals for financial data.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_smart_money_flowSmart Money Flow (composite)A

Read-onlyIdempotent

Inspect

Composite flow score on [-100, +100] aggregating insider transactions, 13F institutional Δ-shares vs the prior quarter, and SC 13D/13G blockholder changes over a lookback window. Each component normalised independently, then combined with configurable weights (default: institutional 0.4, blockholder 0.4, insider 0.2). Returns per-component attribution so an agent can see WHY the score is what it is — not just the headline number. Institutional tier only.

ParametersJSON Schema

Name	Required	Description
`ticker`	Yes	Issuer ticker symbol.
`lookback_days`	No	Lookback window for insider + blockholder components. Default 90.
`weight_insider`	No	Weight applied to the insider component (0–1).
`weight_blockholder`	No	Weight applied to the blockholder component (0–1).
`weight_institutional`	No	Weight applied to the institutional component (0–1).

Output Schema

ParametersJSON Schema

Name	Required	Description
`cik`	Yes
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`ticker`	Yes
`weights`	Yes
`components`	Yes
`period_end`	Yes
`company_name`	Yes
`composite_score`	Yes
`insider_component`	Yes
`blockholder_component`	Yes
`institutional_component`	Yes

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description adds significant behavioral context beyond annotations: composite nature, normalization, configurable weights, per-component attribution, and access restriction. Annotations already indicate read-only, idempotent, and non-destructive behavior, which the description does not contradict.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is compact (3 sentences) and front-loaded with key information. Each sentence adds unique value: definition, mechanics, and output feature with access note. No redundant or superfluous content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity and the presence of an output schema, the description covers all essential aspects: components, normalization, weights, attribution, and access. It is fully sufficient for an agent to understand and invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description provides overarching context for weight parameters and lookback window, clarifying their roles in the composite score. It adds value by explaining the default weights and how they affect the output.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly defines the tool as a composite flow score aggregating insider, institutional, and blockholder changes, and distinguishes it from sibling tools that provide individual components. It states the score range and the three data sources.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description implies usage for a holistic smart money flow view and mentions enterprise tier requirement, but lacks explicit guidance on when to use this composite versus the individual component tools or alternatives. It provides context but no exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_thesisGet Saved ThesisA

Read-onlyIdempotent

Inspect

Fetch a single saved thesis by its id. Returns the full record including outcome (if scored). Returns NOT_FOUND if the id is unknown or belongs to another user.

ParametersJSON Schema

Name	Required	Description	Default
`thesis_id`	Yes	Id returned by `save_thesis` or `list_theses`.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`thesis`	Yes

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already mark it as read-only and idempotent. The description adds that it returns the full record including outcome and a NOT_FOUND error for unknown or unauthorized ids, providing useful behavioral context beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two compact sentences that each add value. No redundant or irrelevant information. Properly front-loaded with the action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read-by-id tool with one parameter and existing annotations, the description covers the key points: what it does, what it returns, and error conditions. Output schema exists (not shown but referenced), so return details are covered.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The thesis_id parameter description adds context: 'Id returned by save_thesis or list_theses', which explains its origin beyond the schema's type and length constraints. Since schema coverage is 100% and the parameter is well-documented, this scores high.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses 'Fetch a single saved thesis by its id' which is a specific verb-resource combination. It clearly distinguishes itself from sibling tools like list_theses and save_thesis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description indicates when to use this tool (by thesis id) and mentions the NOT_FOUND error. While it doesn't explicitly state when not to use it, the use case is straightforward given its simplicity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_top_holdersTop Holders (composite, classified)A

Read-onlyIdempotent

Inspect

Classification-aware UNION across insider transactions (latest post_transaction_shares per insider), 13F institutional holdings, and SC 13D / 13G blockholder filings for one issuer. Each row carries holder_class ∈ {insider, institutional, blockholder_13D, blockholder_13G}. Dedupes overlapping filers by precedence (13D > 13G > institutional > insider). One call, classified cap table — Bloomberg charges separately for INSIDER, OWNER, and HDS; this consolidates them.

ParametersJSON Schema

Name	Required	Description
`top_n`	No	Maximum holders to return, ranked by shares. Default 25.
`ticker`	Yes	Issuer ticker symbol.
`period_end`	No	13F period_end. Omit for latest.

Output Schema

ParametersJSON Schema

Name	Required	Description
`cik`	Yes
`rows`	Yes
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`ticker`	Yes
`period_end`	Yes
`company_name`	Yes
`sources_breakdown`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only and idempotent behavior. The description adds value by detailing deduplication precedence (13D > 13G > institutional > insider) and classification schema, which are not in annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (one paragraph, ~90 words) and front-loaded with the core function. Every sentence adds value, from the union concept to the deduping rule and competitive positioning.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the composite nature across multiple filings, the description explains the classification and deduping logic. An output schema exists, so return values are covered. Minor omission: no mention of error handling or performance, but overall sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for all three parameters. The description does not add parameter-specific detail beyond what the schema provides, so baseline score of 3 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly defines the tool as a composite that merges insider, 13F, and 13D/13G data with classification. It distinguishes itself from siblings like get_insider_transactions and get_institutional_holdings by being a unified view.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies the tool is for consolidated cap table needs, contrasting with Bloomberg's separate tools. However, it does not explicitly state when to avoid it in favor of sibling tools for single source queries.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_valuation_metricsValuation MetricsA

Read-onlyIdempotent

Inspect

Get comprehensive valuation and profitability metrics for a US public company. Returns per-period data combining computed ratios (gross_margin, operating_margin, net_margin, ROE, ROA, ROIC, debt_to_equity, FCF, FCF margin) with optional pre-computed DCF model inputs (WACC, fcf_base, stage1_growth_rate, terminal_growth_rate, dcf_value_per_share, ddm_value_per_share). Profitability + cash flow + leverage fields are sourced from fact.parquet (PIT-safe via accepted_at). DCF/DDM fields are sourced from valuation.parquet (pipeline-computed; recomputed each pipeline run, so NOT strictly PIT-safe). DCF/DDM fields are commonly null — for newer tickers, transition periods, or before the valuation pipeline has run. Each null carries a null_reasons[field] entry with one of: VALUATION_NOT_COMPUTED, INPUT_MISSING, INPUT_NEGATIVE, PIPELINE_PENDING. ALWAYS check null_reasons before assuming a field is zero — null != 0. For agents needing strict PIT DCF reasoning, use the SDK or compute DCF inputs from get_company_fundamentals directly. Use this instead of get_financial_ratios when DCF/intrinsic value matters; use get_financial_ratios when you only need the ratio table without DCF wiring. Available on all plans.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No	Maximum number of periods to return (1–40). Defaults to 5.
`period`	No	Filing period granularity. Annual uses 10-K; quarterly uses 10-Q.	annual
`ticker`	Yes	Stock ticker symbol, e.g. AAPL, MSFT
`as_of_date`	No	Point-in-time date (YYYY-MM-DD). Only returns data with accepted_at on or before this date. Eliminates look-ahead bias for backtesting.
`fiscal_year`	No	Fiscal year (YYYY). Omit to return most recent periods.

Output Schema

ParametersJSON Schema

Name	Required	Description
`data`	Yes
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`period`	Yes
`ticker`	Yes
`as_of_date`	Yes
`periods_returned`	Yes

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true, idempotentHint=true, destructiveHint=false, so the description's behavioral burden is lower. The description adds value by disclosing point-in-time safety with as_of_date and that data is 'per-period' and includes both computed ratios and pre-computed DCF inputs. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured paragraph. The first sentence front-loads the core purpose. Every sentence adds unique information: metrics list, as_of_date behavior, plan availability. No redundancy or fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (5 params, output schema present), the description is complete. It explains what data is returned (computed ratios + DCF inputs), the point-in-time feature, and plan availability. With an output schema, return value details are not needed. The description covers all necessary aspects for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description adds value by explaining the purpose of as_of_date ('eliminates look-ahead bias for backtesting') and clarifying that limit controls 'maximum number of periods'. However, it does not elaborate on all parameters beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns 'comprehensive valuation and profitability metrics for a US public company' and lists specific metrics (gross margin, ROE, DCF, etc.). The verb 'Get' plus the resource 'valuation metrics' precisely defines purpose. It distinguishes itself from siblings like get_company_fundamentals and get_financial_ratios by being comprehensive and including DCF/DDM values.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly mentions that as_of_date enables 'historical analysis and backtesting without look-ahead bias', guiding when to use that parameter. It also says 'Available on all plans', clarifying no plan restrictions. However, it does not explicitly contrast with sibling tools or state when not to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_watchlistGet WatchlistA

Read-onlyIdempotent

Inspect

Fetch a single watchlist by name. NOT_FOUND if the name is unknown to this user.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`watchlist`	Yes

Tool Definition Quality

A3.7/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, destructiveHint, and idempotentHint. The description adds the NOT_FOUND behavior, which provides some context but does not go beyond annotations significantly.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with the key action. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple fetch operation with an output schema, the description covers the main behavior and error case. Could mention more about what the output contains, but output schema exists.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It mentions 'by name' but does not elaborate on constraints like maxLength=80 or minLength=1 from the schema. The user-scoping hint ('unknown to this user') adds some value but insufficient.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Fetch a single watchlist by name', which is a specific verb and resource. It distinguishes from sibling tools like list_watchlists (list all) and delete_watchlist (delete).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when you have a specific watchlist name but provides no explicit guidance on when to use this tool over siblings or what prerequisites exist.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_alert_inboxList Alert InboxA

Read-onlyIdempotent

Inspect

Newest-first listing of the caller's in-app alert inbox. Each item is a single fire of an alert with a dashboard channel — written by the cron evaluator (or test_alert). By default dismissed items are hidden and read items are included. Cursor-paginated by fired_at. Sample tier rejected — alerts are a paid-tier feature.

ParametersJSON Schema

Name	Required	Description
`limit`	No
`cursor`	No	Pagination cursor — the `fired_at` of the last item on the previous page.
`unread_only`	No	When true, return only items where read_at IS NULL.
`include_dismissed`	No	When true, also return items the caller previously dismissed.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`items`	Yes
`next_cursor`	Yes
`unread_count`	Yes

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds substantial behavioral detail beyond annotations: pagination by fired_at, default filters for dismissed/read, and tier restriction. Annotations already indicate read-only and idempotent; description enriches understanding without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: first states purpose and ordering, second details defaults, third mentions tier restriction. No wasted words, front-loaded, well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that an output schema exists (noted in context signals), the description does not need to explain return values. It covers ordering, pagination, defaults, and tier restriction appropriately for a read-only listing tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema already documents 3 of 4 parameters with descriptions and defaults. The tool description restates default behavior but does not add new parameter semantics beyond summarizing defaults, which is helpful but not additive.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists the caller's in-app alert inbox in newest-first order. It specifies the scope ('caller's') and the channel ('dashboard'), distinguishing it from other tools like list_alerts or dismiss_inbox_item.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains default behavior (dismissed hidden, read included) and notes that alerts are a paid-tier feature, setting usage context. However, it does not explicitly mention when to use alternatives like list_alerts.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_alertsList AlertsB

Read-onlyIdempotent

Inspect

Paginated newest-first listing of the caller's alerts. Sample tier rejected.

ParametersJSON Schema

Name	Required	Default
`limit`	No
`cursor`	No
`status`	No	active

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`alerts`	Yes
`next_cursor`	Yes
`total_count`	Yes

Tool Definition Quality

B3.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly, idempotent, and non-destructive behavior. The description adds useful behavioral traits: pagination and newest-first ordering, complementing the annotations without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The first sentence is front-loaded with core purpose. The second sentence 'Sample tier rejected' is irrelevant and wastes space, but otherwise the description is concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists, return values need not be detailed. However, the description misses key filtering by status enum, which is present in the schema but not mentioned. It adequately covers ordering and pagination but not all parameter-driven behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description should compensate by explaining parameter meanings. It mentions 'paginated' vaguely but does not detail limit, cursor, or status parameters, relying on self-explanatory names which may not suffice (e.g., cursor).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists the caller's alerts with pagination and newest-first ordering, which distinguishes it from create/delete alert tools. However, the phrase 'Sample tier rejected' is confusing and detracts from clarity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description does not explicitly state when to use this tool versus alternatives like list_my_reports or create_alert. Usage is implied by name and context, but no explicit guidance is provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_citation_overridesList Citation OverridesA

Read-onlyIdempotent

Inspect

Author-only newest-first listing of the caller's citation corrections. Filterable by ticker (e.g. all AAPL corrections) or by a single fact_id (returns 0 or 1 row). Pair with save_citation_override and delete_citation_override. Sample tier rejected.

Agent use: call with ticker to introspect what corrections the user has previously applied on that ticker — useful for system prompts that respect prior corrections during regeneration.

ParametersJSON Schema

Name	Required	Description
`limit`	No
`cursor`	No	Cursor from the previous response's `next_cursor` — the updated_at of the last row on that page. Omit for first page.
`ticker`	No	Optional ticker filter, case-insensitive. Uppercased internally.
`fact_id`	No	Optional fact_id filter — returns at most one row.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`overrides`	Yes
`next_cursor`	Yes
`total_count`	Yes

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description adds behavioral context beyond annotations: author-only access, newest-first ordering, 0-or-1 row for fact_id, and a cryptic 'Sample tier rejected' note. Annotations already cover read-only and idempotence, so description adds value.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two short paragraphs, front-loaded with main purpose. Every sentence adds unique information (filters, siblings, agent use). No redundancy or fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given output schema exists and annotations are present, the description is complete. It explains ordering, filtering, caller scope, and use case. No gaps that hinder agent invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 3 of 4 parameters with descriptions. Description adds filter examples ('all AAPL corrections') and clarifies fact_id returns at most one row, providing extra meaning beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool lists citation overrides for the caller, with 'Author-only newest-first' ordering and filters by ticker or fact_id. It differentiates by naming sibling tools save and delete citation overrides, making purpose distinct.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides a concrete agent use case: call with ticker to introspect prior corrections for system prompts. It implies when to use (for checking user corrections) but lacks explicit when-not-to-use or comparisons with other listing tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_my_reportsList My Research ReportsA

Read-onlyIdempotent

Inspect

Cursor-paginated newest-first listing of the caller's own reports. Filters compose with AND. Use cursor from the previous response's next_cursor to fetch the next page. Sample tier rejected (no per-author state).

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No	Page size.
`cursor`	No	Cursor from previous `next_cursor`.
`status`	No	Filter by status. Default 'ready' (excludes drafts + delisted).	ready
`ticker`	No	Filter to a single ticker (case-insensitive).
`report_type`	No	Filter by report type.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`reports`	Yes
`next_cursor`	Yes

Tool Definition Quality

A4.1/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, idempotent, non-destructive. Description adds pagination mechanism, ordering, filter logic, and authorization note, enhancing behavioral understanding.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no redundancy, front-loaded with purpose and key behavior. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given full schema coverage and presence of output schema, description covers pagination, ordering, filtering, and access context. No missing behavioral aspects.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers all parameters with descriptions (100% coverage). Description adds value by stating filters compose with AND, but does not elaborate on each parameter beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it lists the caller's own reports, newest-first, cursor-paginated, with AND filters. Distinguishes from siblings by specifying 'caller's own reports' and noting sample tier rejection.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides pagination and filter composition instructions but does not explicitly compare with alternatives like search_reports. Lacks when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_public_theses_by_userList Public Theses by UserA

Read-onlyIdempotent

Inspect

Return the PUBLIC theses + reputation aggregate for a user identified by Stripe customer_id. Used by the /[handle] profile page to render an analyst's track record. Only entries with visibility='public' are surfaced — private theses never leak. Reputation is correct/(correct+wrong) over graded theses; null when n < 5 (sample too small). Sample tier rejected; sp500+ only.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No	Max public theses to return. Defaults to 20.
`customer_id`	Yes	Target user's Stripe customer_id (resolved by the frontend from the handle).

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`theses`	Yes
`reputation`	Yes

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnly, idempotent, non-destructive), the description reveals key behavioral traits: visibility filter (public only), reputation calculation formula, null for small samples (n<5), and sample tier rejection (sp500+ only). No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with core purpose, no redundant information. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (theses, reputation, privacy) and the presence of an output schema, the description covers privacy, reputation edge cases, and sample restrictions adequately.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds meaning by linking customer_id to Stripe and the profile page usage, and explaining the limit is implicit. It provides context beyond schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns public theses and reputation aggregate for a user, distinguishing it from sibling tools like list_theses (likely general) and get_thesis (single). The specific use case (profile page) further clarifies its role.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description specifies the context: used by the /[handle] profile page and that only public entries are surfaced. It implicitly guides when to use (for public theses) but does not explicitly state when not to use or name alternatives like list_theses.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_report_versionsList Report VersionsA

Read-onlyIdempotent

Inspect

Author-only newest-first listing of a report's archived version history. Each entry summarises what changed (sections edited, etc.) so the workspace UI can render a clickable history without loading every artifact. Pair with get_report_version to fetch a specific version's content for diffing against HEAD.

ParametersJSON Schema

Name	Required	Description
`limit`	No
`cursor`	No	Cursor from the previous response's `next_cursor` — the smallest version number on the previous page. Omit for the first page.
`report_id`	Yes

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`versions`	Yes
`next_cursor`	Yes

Tool Definition Quality

A3.9/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate safe read operation (readOnlyHint, idempotent, not destructive). Description adds that it is author-scoped and returns change summaries, but no additional behavioral traits beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two front-loaded sentences with no filler. Concise and to the point.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Output schema exists, so return values need not be explained. Description covers purpose, ordering, and pairing. Could add brief pagination note but overall sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 33% (cursor documented). Description does not explain `limit` and `cursor` pagination or how `report_id` is used, leaving agents to infer from schema defaults.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it lists report version history with newest-first ordering and per-entry change summaries. Explicitly distinguishes from sibling `get_report_version` by mentioning pairing usage.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Specifies 'Author-only' and 'newest-first' usage, and recommends pairing with `get_report_version` for diffing. Does not explicitly state when not to use, but context from siblings is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_thesesList Saved ThesesA

Read-onlyIdempotent

Inspect

Return the caller's saved theses, newest-first. Filters: ticker (exact), view, status. Cursor-based pagination — pass next_cursor from the previous response to fetch the next page. Sample tier rejected (no per-user state).

ParametersJSON Schema

Name	Required	Description	Default
`view`	No	Filter to a single view.
`limit`	No	Page size, 1–100. Defaults to 20.
`cursor`	No	Pagination cursor returned by the previous `list_theses` call's `next_cursor`.
`status`	No	'active' (default) hides archived theses; pass 'all' to include them.	active
`ticker`	No	Filter to theses on this ticker (case-insensitive).

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`theses`	Yes
`next_cursor`	Yes
`total_count`	Yes

Tool Definition Quality

A4.5/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds ordering (newest-first), cursor-based pagination, filter specifics, and a behavioral constraint (sample tier rejected), providing full transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with purpose, covering ordering, filters, pagination, and constraint. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists and annotations cover safety, the description sufficiently covers purpose, filtering, pagination, ordering, and a key constraint. It is complete for a read-only list operation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so each parameter is fully documented. The description summarizes filters and pagination but adds no new meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns the caller's saved theses, newest-first, with filters. This distinguishes it from siblings like get_thesis (single thesis) or save_thesis (write operation).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions filters and pagination, guiding when to use the tool. It also notes the sample tier restriction. However, it does not explicitly contrast with get_thesis for individual theses or other list operations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_watchlistsList WatchlistsA

Read-onlyIdempotent

Inspect

Paginated newest-first listing of the caller's watchlists. Filter by status. Sample tier rejected.

ParametersJSON Schema

Name	Required	Default
`limit`	No
`cursor`	No
`status`	No	active

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`watchlists`	Yes
`next_cursor`	Yes
`total_count`	Yes

Tool Definition Quality

A3.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint=true and destructiveHint=false. The description adds pagination, ordering, and filtering behavior. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise: one sentence plus a note. Every word adds value. Front-loaded with key details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Has output schema, so return values are covered. Mentions pagination, ordering, filtering, and scope. However, 'Sample tier rejected' is unclear and no sibling comparison is provided.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%. The description hints at filtering by status but does not explain limit or cursor parameters. Does not compensate for low coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists the caller's watchlists with pagination, newest-first order, and filtering by status. It distinguishes from siblings like get_watchlist (single) and create/delete watchlists.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives like get_watchlist or list_alerts. No exclusions or prerequisites are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

mark_inbox_readMark Inbox Item ReadA

Idempotent

Inspect

Set read_at on a single inbox item. Idempotent — re-marking does NOT reset the first-read timestamp. Returns the new unread_count so the agent/UI can update its badge without a follow-up call.

ParametersJSON Schema

Name	Required	Description	Default
`inbox_id`	Yes

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`inbox_id`	Yes
`marked_read`	Yes
`unread_count`	Yes

Tool Definition Quality

A3.7/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide idempotentHint=true and destructiveHint=false. Description adds that re-marking does NOT reset the first-read timestamp and that it returns the unread_count for badge updates. This adds value beyond annotations. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, each adding value. Front-loaded with the core action, followed by key behavioral note and return value. No redundant words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one parameter, idempotent, returns unread_count), the description is nearly complete. It explains the return value and idempotency. Minor gap: no mention of error conditions (e.g., invalid inbox_id) but output schema may cover that.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% on the single parameter 'inbox_id'. Description does not explain the parameter's meaning, format, or constraints beyond what the schema provides (string, length constraints). For a single required param, minimal compensation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the verb 'Set read_at' on a single inbox item, which distinguishes it from sibling tools like 'dismiss_inbox_item' that perform a different action. The purpose is specific and actionable.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool vs alternatives like dismiss_inbox_item. Only mentions idempotency, which is behavioral, not usage context. The description does not clarify prerequisites or scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

render_reportRender Report Download URLA

Idempotent

Inspect

Return a 15-minute presigned download URL for a report in the requested binary format.

format=md presigns the cached markdown — instant, no compute. format=docx returns a branded Word document with a cover page (logo, title, ticker, tier badge), the report body (abstract, sections, citations table with clickable SEC EDGAR links), and a back page (methodology, sources, disclaimer). The DOCX is cached in R2 alongside the markdown after first build so repeat downloads are instant; pass force_regenerate: true to bust the cache (e.g. right after update_report).

Tier gate mirrors get_report: authors always see their own reports; non-authors below the report's required tier get an upgrade prompt.

ParametersJSON Schema

Name	Required	Description
`format`	Yes	md = raw markdown (the same body the editor renders). docx = branded Word document with cover + back page.
`report_id`	Yes	Id from create_report or list_my_reports.
`force_regenerate`	No	If true, ignore the cached DOCX and re-render. No effect on md (markdown is canonical).

Output Schema

ParametersJSON Schema

Name	Required	Description
`url`	Yes
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`format`	Yes
`filename`	Yes
`expires_at`	Yes
`from_cache`	Yes
`size_bytes`	Yes
`content_type`	Yes
`expires_in_seconds`	Yes

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description goes beyond annotations by detailing that the URL is presigned for 15 minutes, caching behavior (docx cached after first build, md always instant), and that force_regenerate busts the cache. Annotations only indicate idempotent and non-destructive; the description adds depth on side effects and limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is about 150 words across four paragraphs, front-loaded with the core purpose. Each paragraph adds value without redundancy. Slightly verbose in docx details but remains efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 3 parameters, an output schema, and complex caching/format behavior, the description covers all agent needs: what it returns, format differences, cache invalidation, and access control. No gaps are evident.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds meaningful context: format enum values are explained with content specifics (cover page, ticker, citations), report_id uses authoritative sources (from create_report or list_my_reports), and force_regenerate clarifies cache behavior and its lack of effect on md.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns a 15-minute presigned download URL for a report in a chosen binary format. It distinguishes between 'md' (cached markdown, instant) and 'docx' (branded Word document), differentiating it from siblings like 'get_report' which likely returns content directly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on when to use each format (md vs docx) and when to use force_regenerate (e.g., after update_report). It mentions tier gate mirrors get_report, implying similar access constraints, but does not explicitly exclude alternatives or state when not to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

run_workflowRun Saved WorkflowA

Read-onlyIdempotent

Inspect

Resolve a saved workflow by id and return a structured execution plan for a single ticker. Each plan entry names a real MCP tool or SOP plus its ticker-substituted arguments; the calling agent invokes them in order, applying any skip_if predicate against the previous step's output.

This tool does NOT execute the steps server-side. It plans; the agent runs. Iterate through plan[] in order, call the named tool/SOP with args, accumulate outputs, and apply each step's skip_if (skip the step when the previous output's path equals equals).

Workflows are private state owned by the calling user. Sample-tier callers are rejected. Pair with list_workflows (frontend) to discover available workflow_ids.

ParametersJSON Schema

Name	Required	Description
`ticker`	No	US-listed ticker the workflow should run against, e.g. 'AAPL'. Pass either `ticker` (single) or `tickers` (batch up to 50). Exactly one is required.
`tickers`	No	Batch mode — array of US-listed tickers, up to 50. When provided, the response has `plans[]` (one plan per ticker) instead of `plan`. Parity with the frontend batch-runner so agents can request 'plan over my watchlist' in a single call.
`workflow_id`	Yes	Workflow id returned by the frontend workflow builder.

Output Schema

ParametersJSON Schema

Name	Required	Description
`meta`	Yes	Provenance envelope — data lineage for every MCP response
`plan`	Yes
`plans`	Yes
`ticker`	Yes
`workflow`	Yes
`instructions`	Yes

Tool Definition Quality

A4.1/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint, idempotentHint, destructiveHint), the description discloses critical behavioral traits: no server-side execution, planning-only nature, and details of the plan structure including skip_if logic. This adds value beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at four sentences, front-loading the core purpose and key behavioral note. Every sentence adds distinct value, avoiding redundancy. It is well-structured for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema and clear annotations, the description covers essential usage context: return format (plan entries with tools/args and skip_if), user-private state, and tier restrictions. It provides sufficient completeness for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the input schema provides concise descriptions for both parameters (ticker and workflow_id). The tool description reiterates these with minor additions (e.g., 'US-listed ticker', 'id returned by frontend') but does not significantly expand beyond schema semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: to resolve a saved workflow by ID and return a structured execution plan for a single ticker. It explicitly distinguishes itself from siblings by noting it does NOT execute steps server-side, serving as a planner rather than executor.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides context on when to use (with workflow_id and ticker), mentions pairing with list_workflows for discovery, and notes tier restrictions. However, it does not explicitly compare to alternative tools for similar tasks, though the unique planning role is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

save_citation_overrideSave Citation OverrideA

Idempotent

Inspect

Persist a correction of a citation value. The correction is keyed on the canonical fact_id (a stable hash of CIK + accession + concept + period) so it applies to every report that references that same fact — including agent-regenerated reports. Re-saving the same fact_id replaces the prior correction in place (no duplicate row).

Use this when the user notices an inaccuracy in an AI-generated report and wants the fix to persist. Provide notes for the rationale (≤500 chars) and source_report_id for provenance. Tier caps: sp500=500, pro=5000, full=50000.

ParametersJSON Schema

Name	Required	Description
`notes`	No	Optional free-form rationale, ≤500 chars.
`ticker`	No	Optional ticker — denormalised for fast filtering. Required for the workspace UI's 'my corrections on AAPL' view.
`fact_id`	Yes	Canonical fact identifier — usually returned in a citation's `fact_ids` array by get_report or any compute tool. Stable across report regenerations.
`corrected_value`	Yes	User-corrected value, stringified. The frontend interprets it based on the fact's known datatype (number, string, ISO date).
`source_report_id`	No	Optional report id the user was viewing when they applied the correction (provenance).

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`created`	Yes
`capacity`	Yes
`override`	Yes

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already set idempotentHint=true and destructiveHint=false. The description adds crucial detail: re-saving same fact_id replaces prior correction in place, no duplicate rows. It also explains the stable hash keying and cross-report applicability. While it omits explicit auth requirements, the readOnlyHint=false implies write access, and the added context earns a 4.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is a single, well-structured paragraph. Key information (purpose, behavioral nuance, usage guidance) is front-loaded. Every sentence adds value with no redundancy or filler. Efficient and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of a write tool that applies overrides globally, the description covers purpose, behavior, usage, parameter hints, and tier caps. It doesn't discuss error handling or validation failure outcomes, but the presence of an output schema partially compensates. Fine for most use cases.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with each parameter well-documented. The description reinforces the role of notes and source_report_id for provenance and mentions tier caps. However, it adds minimal new semantic value beyond the schema, so a baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The title 'Save Citation Override' and description 'Persist a correction of a citation value' clearly state the tool's action and resource. It distinguishes from siblings like delete_citation_override and list_citation_overrides by focusing on saving/persisting corrections, with specifics like keying on fact_id.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: 'Use this when the user notices an inaccuracy in an AI-generated report and wants the fix to persist.' Also provides tier caps for volumes (sp500=500, etc.) and hints about notes and source_report_id for provenance. This fully guides an agent on context and constraints.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

save_thesisSave Investment ThesisA

Idempotent

Inspect

Persist a directional investment thesis (bull / bear / neutral) on a ticker. The thesis becomes part of the caller's private research diary; pair with list_theses + score_thesis_outcome to track conviction-vs-outcome over time. Pass idempotency_key for at-most-once semantics from a retrying agent.

Use this AFTER the agent has finished its analysis, not before — the thesis records the conclusion, not the question. Pair with source_report_id to link the thesis back to a published report so the buyer's thesis-tracking carries provenance.

Tier: all paid + free tiers (sample tier rejected — sample is guest access with no customerId binding). Per-tier cap on # of stored theses: sp500=100, pro=500, full=10,000.

ParametersJSON Schema

Name	Required	Description	Default
`view`	Yes	Directional view: bull (expect outperformance), bear (under), neutral (mean-revert).
`notes`	No	Free-form rationale, ≤4000 chars. Stored verbatim; trim before submitting.
`ticker`	Yes	US-listed ticker. Case-insensitive — normalised to upper. E.g. 'AAPL'.
`conviction`	Yes	1 = low conviction (gut feel) → 5 = high conviction (deep analysis).
`visibility`	No	Phase 3: 'private' (default) is owner-only; 'unlisted' is visible at a known direct URL; 'public' surfaces on the author's /[handle] profile and contributes to their reputation score.	private
`horizon_days`	Yes	Investment horizon in days. 1 day–5 years (1825d). The grader uses this to pick the as-of period.
`idempotency_key`	No	Optional client-supplied key. If a previous `save_thesis` from the same user used this key, the existing thesis is returned instead of creating a duplicate.
`source_report_id`	No	Optional id of a report (from `create_report` / `publish_report`) that contains the supporting analysis.
`thesis_at_price_cents`	No	Optional snapshot of the ticker's market price (integer cents) at thesis creation. Used by future versions of the grader that mix in price returns; null for now is fine.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`thesis`	Yes
`capacity`	Yes
`deduplicated`	Yes

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses that the thesis persists in a private research diary, that `idempotency_key` ensures at-most-once semantics, and that tier caps limit stored theses. No contradictions with annotations; the description adds useful behavioral context beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, front-loaded with the core purpose, and structured with clear sections for usage guidelines and additional details. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite having an output schema (not shown), the description covers the tool's purpose, usage guidelines, tier constraints, idempotency, and optional features. It is fully adequate for an agent to understand when and how to invoke the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema covers all parameters with descriptions (100% coverage), so baseline is 3. The description adds value by explaining the purpose of `idempotency_key` for retrying agents and `source_report_id` for linking to reports, moving the score to 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool persists a directional investment thesis (bull/bear/neutral) on a ticker, and distinguishes it from siblings like `delete_thesis`, `list_theses`, and `score_thesis_outcome` by specifying it is used to record conclusions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly instructs to use AFTER analysis is complete, not before, and mentions pairing with `list_theses` and `score_thesis_outcome` for tracking. Also provides tier limitations and rejects sample tier, offering clear when-to-use and when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

save_watchlistSave WatchlistA

Idempotent

Inspect

Upsert a named watchlist with a list of tickers. Replace semantics — the full ticker list is the source of truth for that name. Use this for both creation AND modification (delete + recreate is not required for edits). 500-ticker cap per list. Names are case-insensitive uniqueness.

ParametersJSON Schema

Name	Required	Description
`name`	Yes	Unique-per-user display name.
`tickers`	Yes	US tickers. Normalised to uppercase, deduped.
`criteria`	No	Optional free-form screening criteria description.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`created`	Yes
`watchlist`	Yes

Tool Definition Quality

A4.5/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate non-read-only, idempotent, and non-destructive. The description adds valuable behavioral details: 'Replace semantics' clarifies that the ticker list overwrites any existing list, '500-ticker cap' and 'Names are case-insensitive uniqueness' provide constraints not fully captured by schema alone. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: two sentences and a fragment, no wasted words. It front-loads the core purpose and then provides key behavioral details. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has a moderate complexity (3 parameters, output schema present), the description covers purpose, usage scenarios, behavioral constraints, and limits. It does not need to explain return values because an output schema exists. The description is complete for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with good parameter descriptions. The tool description adds context like 'Replace semantics' for tickers and mentions the ticker cap, but these are already implicit in the schema (maxItems). Overall, the description adds marginal value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verbs ('Upsert', 'creation AND modification') and clearly identifies the resource ('named watchlist with a list of tickers'). It distinguishes from siblings like delete_watchlist, get_watchlist, and list_watchlists by stating the replace semantics and the fact that both creation and modification are covered.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool (for creation and modification, with replace semantics, no need to delete/recreate). It does not explicitly mention when not to use it, such as for reading or deleting watchlists, but the sibling tools cover those cases. The guidance is clear but could be slightly more directive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

score_due_thesesScore Due Theses (bulk auto-grader)AInspect

Find every thesis past its horizon with no outcome yet, and grade each via score_thesis_outcome. Operates on the caller's own theses by default; can target any user when called with customer_id from a full-tier (admin) token. Returns a summary + per-thesis results. Idempotent — a re-call only re-grades anything not already graded.

ParametersJSON Schema

Name	Required	Description
`max`	No	Soft cap on theses scored per call. Defaults to 100. The frontend cron walks users serially so a low cap per user keeps each MCP request bounded.
`as_of`	No	Snapshot date for the 'current' fundamentals window. Defaults to today UTC.
`customer_id`	No	Stripe customer_id of the target user. Defaults to the caller's own; required to be the caller's own unless the caller's plan is `full` (admin / internal cron).

Output Schema

ParametersJSON Schema

Name	Required	Description
`due`	Yes	Subset that were past their horizon AND ungraded.
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`errors`	Yes	Per-thesis errors caught + logged.
`scored`	Yes	Successfully scored + persisted.
`results`	Yes
`scanned`	Yes	Total active theses inspected.
`skipped`	Yes	Skipped because already graded or not yet due.
`target_customer_id`	Yes

Tool Definition Quality

A3.6/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description claims idempotency, but the annotation idempotentHint is false, creating a direct contradiction. Per scoring rules, a contradiction warrants a score of 1. The description adds some behavioral context otherwise.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with the main action, no unnecessary words. Efficiently structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the existence of an output schema, the description adequately covers the tool's operation, scope, and parameters. It lacks only an explicit contrast with score_thesis_outcome, but is otherwise complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions. The description adds minor context (e.g., 'soft cap', 'snapshot date'), but does not significantly enhance meaning beyond the schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's action: find overdue theses without outcomes and grade them using score_thesis_outcome. It specifies the scope (caller's own by default, any user with admin token) and distinguishes it from the sibling score_thesis_outcome tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context on default scope and admin override, explains the soft cap for frontend use, and implies when to use this tool versus the single-grading sibling. It lacks explicit alternatives but is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

score_thesis_outcomeScore Thesis OutcomeAInspect

Grade a saved thesis against fundamental momentum since its creation. Pulls revenue / operating-margin / EPS / OCF deltas and aggregates into a score in [-1, +1]. Bull theses are graded by directional alignment, bear by inverse, neutral by closeness-to-flat. The grade is persisted back to the thesis row; re-call to refresh once new fundamentals land.

Note (PR 2): scoring is fundamental-only — does NOT yet include market-price returns. Phase 2 will mix in price data via a partner feed; the response shape is stable.

ParametersJSON Schema

Name	Required	Description	Default
`as_of`	No	Snapshot date for the 'current' fundamentals window. Defaults to today UTC. The scorer picks the fiscal period closest to this date.
`thesis_id`	Yes	Id returned by `save_thesis` or `list_theses`.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`thesis`	Yes
`outcome`	Yes

Tool Definition Quality

A4.4/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds value beyond annotations by disclosing that the grade is persisted, and re-calling refreshes it. It also notes limitations (fundamental-only, future price data phase). There is no contradiction with annotations (readOnlyHint=false, destructiveHint=false).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: three sentences and a note. It is well-structured, starting with purpose, followed by input and scoring logic, then persistence behavior, and a future note. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity and presence of an output schema, the description covers purpose, inputs, scoring method, persistence, and refresh behavior. It could mention that the thesis must exist, but this is implied. Overall, it is fairly complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with both parameters well-described. The tool description adds some context (e.g., thesis_id from save_thesis or list_theses, as_of defaults to today) but does not significantly extend beyond the schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Grade a saved thesis against fundamental momentum since its creation', using a specific verb and resource. It distinguishes itself from sibling tools like save_thesis, list_theses, and delete_thesis by focusing on scoring, not CRUD operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use: after saving a thesis and when fundamentals update. It mentions re-calling to refresh. However, it lacks explicit when-not-to-use or alternatives among siblings, but the context is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

screen_universeScreen Universe by Factor ScoresA

Read-onlyIdempotent

Inspect

Rank companies by cross-sectional factor scores from factor_scores.parquet. Returns the underlying factors (roe, gross_margin, operating_margin, net_profit_margin, revenue_growth_yoy, fcf_to_assets, debt_to_equity, asset_turnover, current_ratio, piotroski_f_score) plus their percentile ranks (1.0 = best in universe; 0.0 = worst). composite_rank is a proprietary blend of the factor ranks — a one-number screening shortcut. For single-factor sorts, use the specific rank column (e.g. sort_by=roe_rank); for a balanced multi-factor view, use sort_by=composite_rank (default). Two modes: full-universe (omit ticker) or single-entity (ticker set — useful for spot-checking ONE company's factor profile without scrolling the whole leaderboard). Sector filter is SIC-derived (GICS-aligned labels, not licensed GICS — see get_pit_universe description for the caveat). Use this instead of get_financial_ratios when you want CROSS-SECTIONAL comparison (rank vs peers); use get_financial_ratios when you want one company's ratios over time. Available on every plan — sample returns the subset covered by the sample bucket.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No	Number of results to return (1-100). Defaults to 25.
`sector`	No	Filter to a specific sector (case-insensitive partial match). E.g. 'Technology', 'Healthcare'.
`ticker`	No	If provided, show only this ticker's factor scores (single-entity mode). Omit to screen the full universe.
`sort_by`	No	Which factor rank to sort by. Options: roe_rank, gross_margin_rank, operating_margin_rank, net_profit_margin_rank, revenue_growth_yoy_rank, fcf_to_assets_rank, debt_to_equity_rank, asset_turnover_rank, current_ratio_rank, piotroski_f_score_rank, composite_rank. Defaults to composite_rank.	composite_rank

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint false, so the agent knows it's a safe, idempotent read operation. The description adds value by stating the return format ('percentile ranks where 1.0 = best') and sample bucket behavior, which goes beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loading the core purpose, then adding key details. Every sentence adds value: first sentence defines the tool, second sentence adds key features and availability. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description compensates by specifying return values (percentile ranks) and key features (filtering, sorting, single-entity mode). With 4 parameters all fully described in the schema, the description is complete for an agent to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema documents all parameters thoroughly. The description adds context about 'single-entity mode' for ticker and mentions sorting by 'any factor rank column,' which aligns with the sort_by parameter's listed options. However, it doesn't elaborate on each parameter beyond what the schema provides, but this is acceptable given full coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool ranks companies by cross-sectional factor scores, with a specific verb ('rank') and resource ('companies by factor scores'). It distinguishes itself from siblings like 'get_pit_universe' or 'get_company_fundamentals' by focusing on factor-based screening and percentile ranks.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly mentions filtering by sector, sorting by factor rank, and single-entity mode via ticker. It also states 'Available on every plan — sample returns the subset covered by the sample bucket,' which provides context on availability. However, it doesn't explicitly say when not to use this tool or compare to alternatives like search_companies.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_companiesSearch CompaniesA

Read-onlyIdempotent

Inspect

Search for US public companies by name, ticker symbol, CIK (SEC identifier), or SIC industry code. Returns ticker, company name, sector, industry, exchange, and current S&P 500 membership status. Use this tool to resolve a company name to ticker/CIK before calling get_company_fundamentals, get_valuation_metrics, or other tools that require a ticker — they do not fuzzy-match company names.

Use this tool — NOT get_pit_universe — when the user asks about CURRENT S&P 500 members. To list current S&P 500 members, call this tool with no search parameters and filter results by sp500_member=true. This returns the live snapshot as of query time. Example: "List 5 current S&P 500 members" → call search_companies with no parameters, then filter by sp500_member=true and return the first 5.

Use get_pit_universe ONLY when the user explicitly needs a survivorship-free historical universe as of a specific past date (e.g. "S&P 500 members as of March 2018"). If the user says "current," "today," "now," or gives no date, use search_companies instead.

Data details: sic_code is the 4-digit SIC; industry is the human-readable label. sector is SIC-derived with GICS-style labels — NOT licensed GICS, so industrial conglomerates may map differently from official GICS (e.g. 3M → 'Health Care' by SIC vs Industrials by GICS). S&P 500 membership is sourced from index_membership.parquet (current SP500 = index_name='SP500' AND removal_date IS NULL). Available on all plans.

ParametersJSON Schema

Name	Required	Description
`cik`	No	SEC CIK identifier (exact match). E.g. '0000320193' for Apple.
`limit`	No	Maximum number of results to return (1–50). Defaults to 25.
`query`	No	Free-text search over company name and ticker. Case-insensitive. E.g. 'Apple', 'AAPL', 'Microsoft', 'semiconductor'.
`is_sp500`	No	Filter to current S&P 500 members only.
`sic_code`	No	4-digit SIC industry code. E.g. '7372' for Prepackaged Software.
`is_active`	No	Filter to active (currently trading) companies only.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`query`	Yes
`companies`	Yes
`results_returned`	Yes

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, openWorldHint, idempotentHint, non-destructive. The description adds context about plan availability ('Available on all plans') and search capabilities, but doesn't need to repeat annotations. Minor gap: no mention of search behavior (e.g., partial matching) beyond case-insensitivity.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, all substantive: what it does, what it returns, when to use it, availability. No fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (search with filters, no nested objects), the description fully covers the purpose, usage, and outputs. Output schema exists, so return details need not be in description.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and each parameter already has a description. The tool description adds no additional parameter meaning beyond what's in the schema, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it searches US public companies by multiple identifiers (name, ticker, CIK, SIC) and lists returned fields (ticker, company name, sector, etc.). It distinguishes itself from sibling tools by being a prerequisite lookup tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states 'Use this tool before other tools to resolve a company name to its ticker or CIK,' providing clear guidance on when to use it. No explicit when-not, but the purpose is well-scoped.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_filing_textSearch Filing TextA

Read-onlyIdempotent

Inspect

Semantic search over SEC 10-K / 10-Q narrative sections — Risk Factors, MD&A, Business, Legal Proceedings, Controls & Procedures, Footnotes. Phrase the query as a statement rather than a question for best recall (e.g. "companies with rising supply chain concentration risk" not "what companies have supply chain risk?"). Returns passages ranked by semantic similarity with the source filing accession id + URL for citation. Cached at the per-plan tier for 10 min.

ParametersJSON Schema

Name	Required	Description
`limit`	No	Max passages to return (1-20). Defaults to 5. Pro plan capped at 10.
`query`	Yes	Natural-language query. Statement form beats question form for recall.
`ticker`	No	Restrict to one company (case-sensitive uppercase). Omit for cross-company search.
`section`	No	Restrict to a single section label. Omit to search all sections.
`form_type`	No	Restrict to one form type.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`query`	Yes
`passages`	Yes
`results_returned`	Yes

Tool Definition Quality

A3.7/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=true. The description adds context: it returns top passages by semantic similarity, includes links to source filings, and notes that the feature launches in Q3 2026 (currently unavailable). This goes beyond annotations. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: two sentences. The first sentence clearly states purpose and scope. The second adds a time constraint (coming soon). No fluff. Could be slightly more structured by separating usage guidance, but overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters (100% schema coverage), good annotations, and no output schema, the description covers the essential: what it searches, what it returns, and a critical caveat (launch date). It omits details like pagination or error behavior, but these are less critical with a read-only tool. A dedicated user might want more, but it's reasonably complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description adds no extra parameter details beyond what the schema provides (e.g., query phrasing hint is in schema). It does mention returning 'top passages' and 'links', which is output behavior, not parameter semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs semantic search over specific SEC filing sections (Risk Factors, MD&A, etc.) and returns top passages with links. It differentiates from siblings like 'get_sec_filing_links' (which retrieves links, not text) and 'screen_universe' (which screens, not searches). However, it doesn't explicitly distinguish from 'search_companies', though the latter searches companies, not filing text.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use when a user wants to find passages in SEC filings relevant to a query. It does not explicitly say when to use this tool vs. alternatives like 'get_sec_filing_links' or 'search_companies'. No exclusion criteria or prerequisites are given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

test_alertTest Alert (synthetic fire)AInspect

Fire a synthetic notification through the alert's configured channel. Use this immediately after create_alert to verify the channel (email address valid / webhook URL reachable + HMAC verification on the receiver). The synthetic fire is logged as attempt=1 channel='test' so it doesn't affect the real fire counter — the next genuine match still fires normally.

ParametersJSON Schema

Name	Required	Description	Default
`alert_id`	Yes

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`outcome`	Yes
`alert_id`	Yes
`status_code`	Yes
`channel_type`	Yes
`error_message`	Yes

Tool Definition Quality

A4.1/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds details beyond annotations: synthetic fire is logged as `attempt=1 channel='test'`, does not affect real fire counter, and verifies channel/HMAC. Annotations indicate non-read-only, non-destructive, non-idempotent, open world, and description aligns.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with no superfluous information. Front-loaded with purpose, then usage guidance, then behavioral detail. Well-structured and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers purpose, usage timing, effect on counters, and verification details. Return values are likely covered by the output schema. Slightly incomplete regarding parameter description, but overall sufficient for a simple test tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The only parameter `alert_id` is not described in the schema (0% coverage) and the description does not explicitly explain it, though its role is inferable from context. A brief mention of what `alert_id` should be would improve clarity.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool fires a synthetic notification through the configured alert channel. It distinguishes from real firing and ties directly to `create_alert`, differentiating it from sibling tools like `create_alert` or `list_alerts`.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says to use immediately after `create_alert` to verify channel reachability and HMAC. Provides clear context, though it does not explicitly list when not to use it or alternative tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_reportUpdate Report SectionsA

Idempotent

Inspect

Replace one or more sections of an existing report owned by the caller. Useful for authoring workflows where the agent's first draft (create_report) is refined by additional analysis before publishing. Bumps version. Re-embeds for Vectorize if the report is currently listed. Does NOT change price / tier / visibility — use publish_report for those.

ParametersJSON Schema

Name	Required	Description
`title`	No	Optional new title.
`abstract`	No	Optional new abstract.
`sections`	Yes	Sections to replace. Sections not listed are preserved. Section ids must match the existing payload.
`report_id`	Yes
`expected_version`	No	Optimistic concurrency check. If supplied and the current HEAD version is different, the call returns a `version_conflict` error WITHOUT writing. Pass the version you loaded so a concurrent agent edit produces a 'conflict — review' UX instead of silently overwriting (eng review A3A).

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`report`	Yes
`version`	Yes
`archived`	Yes
`previous_version`	Yes
`vectorize_reindexed`	Yes

Tool Definition Quality

A3.8/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description states version bump on each call, contradicting the 'idempotentHint: true' annotation which implies multiple identical calls produce the same result. This is a clear annotation contradiction. Additionally, description discloses useful behavioral details (re-embed for Vectorize), but contradiction forces score to 1.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences: first states core functionality, second provides usage context, third lists important side effects and exclusions. No redundant or wasted text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers key aspects: ownership requirement, version bump, re-embedding, and exclusion of price/tier/visibility changes. Output schema exists, so return value details are not needed. Minor gap: no mention of error conditions (e.g., report not found or not owned).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 75% with descriptions for three of four parameters. Description adds no additional parameter-level information beyond the schema. Baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description states specific action: replace sections of an existing report owned by the caller. Clearly distinguishes from siblings like 'create_report' (first draft) and 'publish_report' (visibility changes).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly indicates use in authoring workflows for refining drafts before publishing, and notes what it does NOT change (price/tier/visibility), directing to 'publish_report'. Provides clear when-to-use and alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verify_fact_lineageVerify Fact LineageA

Read-onlyIdempotent

Inspect

Use this tool when the user asks BOTH what a financial figure is AND which filing reported it — for example "What was Apple's most recently reported revenue, and which 10-Q filed it?" or "Show me the accession ID for Tesla's latest net income" or "Which filing form reported Amazon's Q3 operating cash flow?" This tool returns a single fact plus its complete filing provenance: entity, concept, period, value, accession ID, filing URL, and form type (10-K, 10-Q, etc.).

Use this INSTEAD OF search_companies when the user already names a company and wants a financial figure with its source filing — search_companies only resolves company identifiers and returns no financial data. Use this INSTEAD OF get_company_fundamentals when the user explicitly wants to know which filing or form type reported a number, or needs the accession ID — get_company_fundamentals returns metrics across multiple periods but omits filing provenance.

Two lookup modes: (1) by fact_id (SHA-256 hash of entity_id|accession_id|concept|period_end|unit) for deterministic identity; or (2) by concept name (e.g., TotalRevenue, NetIncome, EPSDiluted, TotalAssets, OperatingCashFlow) plus a ticker to retrieve the most recently reported fact. Optionally pin a point-in-time cutoff via period_end (YYYY-MM-DD) — returns the latest filing accepted by SEC on or before that date, eliminating look-ahead bias. Check _meta.pit_safe in the response to confirm PIT correctness.

Provide either fact_id or concept (required). Returns empty result with error_code FACT_NOT_FOUND if no matching fact exists for the given concept and ticker. Available on all plans.

ParametersJSON Schema

Name	Required	Description
`ticker`	Yes	Stock ticker symbol, e.g. AAPL, MSFT, BRK.B
`concept`	No	Standard concept name to look up the most recently known fact for. Use this when you don't have a fact_id. Allowed: TotalRevenue, NetIncome, EPSDiluted, TotalAssets, TotalLiabilities, OperatingCashFlow, CAPEX. Provide either concept OR fact_id.
`fact_id`	No	Deterministic fact identity hash: SHA-256(entity_id\|accession_id\|concept\|period_end\|unit). 64-char lowercase hex. Use this when you already have the hash from a previous query. Provide either fact_id OR concept (not both required, but at least one must be set).
`as_of_date`	No	Point-in-time cutoff (YYYY-MM-DD) used with `concept`. Returns the most recently known fact whose 10-K / 10-Q filing was accepted by SEC on or before this date — true PIT, no lookahead bias. Any calendar date works (not limited to fiscal-period closes); the latest preceding filing is returned. This is the canonical name shared by every other PIT tool in the suite; supersedes the legacy `period_end` parameter on this tool.
`period_end`	No	[DEPRECATED — pass `as_of_date` instead.] Filing-acceptance cutoff (YYYY-MM-DD) used with `concept`. The name is misleading — it actually filters on filing accepted_at, not on the period_end of the returned fact (the returned fact's period_end is whatever the SEC filed, e.g. a Q3 cumulative figure). Kept for one release for backwards compat; new callers MUST use `as_of_date`.

Tool Definition Quality

A4.1/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint as false, providing a strong safety profile. The description adds that the fact_id is deterministic and describes the hashing formula, plus confirms no destructive actions. Since annotations already cover the key traits, the description adds value by explaining the deterministic nature and the return format, but does not repeat annotation information.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, each adding value: first defines the tool's action, second lists return values, third provides the hash formula and a use case. It is front-loaded with the purpose. The only minor issue is that the sentence about availability ('Available on all plans') is not essential for tool invocation but does not harm.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has a clear deterministic operation, no output schema, and sibling tools that are all different (no overlap), the description covers the key aspects: what it does, what it returns, and requirements. It could potentially mention that it does not modify data (but annotations already handle that). Overall, it is complete for a read-only verification tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema already fully describes both parameters (ticker and fact_id) with patterns, lengths, and descriptions. The description does not add significant meaning beyond the schema: it mentions the fact_id hash formula and that ticker is required, but these are already in the schema. Therefore, baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool verifies provenance of a financial data point using a deterministic fact_id hash. It specifies the return values (lineage chain with entity, concept, period, value, SEC filing accession ID, and URL), which distinguishes it from sibling tools that operate on financial data but serve different purposes like searching or comparing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use the tool: to confirm a data point traces back to a specific SEC EDGAR filing. It also states a requirement (ticker and fact_id needed) and mentions availability on all plans. However, it does not explicitly exclude alternatives or provide when-not-to-use guidance, though the specificity of the tool makes this less critical.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

watchlist_diffWatchlist DiffA

Read-onlyIdempotent

Inspect

Return new SEC filings across the caller's watchlist tickers since a given date. Reads filing.parquet — does not call insider/ratio surfaces (use those tools separately if you need them). Concurrency-bounded; max 50 tickers per call.

ParametersJSON Schema

Name	Required	Description
`name`	Yes	Watchlist name.
`since`	Yes
`form_types`	No	Filing forms to include. Defaults to 10-K + 10-Q + 8-K.

Output Schema

ParametersJSON Schema

Name	Required	Description
`_meta`	Yes	Provenance envelope — data lineage for every MCP response
`since`	Yes
`filings`	Yes
`watchlist_name`	Yes
`tickers_scanned`	Yes

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, destructiveHint. Description adds concurrency bounding, source data (filing.parquet), and clarifies it does not call other surfaces, providing extra behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: first states purpose, second provides data source and tool differentiation, third gives concurrency limit. Every sentence adds value; no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With rich annotations and an output schema present, description covers purpose, data source, limitation, and relation to siblings. No major gaps given complexity and structured metadata.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 67% with descriptions for name and form_types. Description mentions 'since a given date' which adds context to the since parameter, but does not elaborate on form_types defaults or other semantics. Adequate but not enhanced beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Return new SEC filings across the caller's watchlist tickers since a given date.' It specifies the action (return), the resource (watchlist filings), and contrasts with sibling tools by noting it does not call insider/ratio surfaces.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly tells when to use: for new filings since a date. Provides exclusions: 'does not call insider/ratio surfaces (use those tools separately).' Also states concurrency limit of max 50 tickers per call.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.

Server Details

Available Tools

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Discussions

Your Connectors

Resources