
Server Details

Calibrated world model for AI agents. 40 tools: world state, markets, trading. Kalshi + Polymarket.

Status: Healthy
Last Tested:
Transport: Streamable HTTP
URL:
Repository: spfunctions/simplefunctions-cli
GitHub Stars: 4
Server Listing
SimpleFunctions

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: Grade B

Average 3.4/5 across 56 of 56 tools scored. Lowest: 2.5/5.

Server Coherence: Grade B
Disambiguation: 2/5

Many tools have overlapping or ambiguous purposes, causing potential confusion. For example, 'get_trade_ideas', 'get_edges', 'scan_markets', and 'screen_markets' all relate to finding trading opportunities but with unclear distinctions. Similarly, 'get_world_state' and 'get_world_delta' serve similar functions with incremental vs. full updates, and 'enrich_content' vs. 'monitor_the_situation' both cross-reference content with markets but differ in scope and authentication. The descriptions help somewhat, but the high tool count exacerbates the overlap.

Naming Consistency: 4/5

Tool names generally follow a consistent verb_noun pattern (e.g., 'create_intent', 'list_theses', 'update_position'), with clear actions like get, create, update, list. However, there are minor deviations such as 'augment_tree' (verb_noun but less common), 'browse_public_skills' (browse vs. list), 'enrich_content' (enrich vs. get), and 'what_if' (non-standard phrasing). Overall, the naming is mostly predictable and readable, with only a few inconsistencies.

Tool Count: 2/5

With 56 tools, the count is excessive for a single server, making it overwhelming and difficult to navigate. While the domain (prediction market trading and analysis) is broad, many tools could be consolidated or grouped (e.g., multiple 'get_' tools for market data, advanced vs. basic functions). This high number suggests poor scoping and likely includes redundant or overly granular operations that hinder usability.

Completeness: 5/5

The tool set provides comprehensive coverage for prediction market trading and analysis, with no obvious gaps. It includes full lifecycle management (create, update, list, delete for theses, intents, positions), data retrieval (market prices, edges, context, forecasts), execution tools (intents, strategies), and auxiliary functions (forum, skills, legislation). The surface supports end-to-end workflows from discovery to execution and monitoring, ensuring agents can perform all necessary operations without dead ends.

Available Tools

56 tools
add_position (Grade B)

Record a new position in a thesis for tracking. Use after an intent fills or a manual trade.

Parameters (JSON Schema)
- size (optional): Position size (contracts)
- venue (required): Exchange venue
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- thesisId (required): Thesis ID
- direction (required): Position direction
- rationale (optional): Why this position
- entryPrice (required): Entry price in cents
- marketTitle (required): Human-readable market name
- externalMarketId (required): Market ticker
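The required/optional split above can be illustrated as a call payload. This is a hypothetical sketch in Python: the API key, IDs, price, and market names are placeholders, not values from the server.

```python
# Hypothetical arguments for an add_position tool call.
# All values are illustrative placeholders.
add_position_args = {
    "apiKey": "sf_live_PLACEHOLDER",   # from https://simplefunctions.dev/dashboard/keys
    "thesisId": "thesis_123",          # placeholder thesis ID
    "venue": "kalshi",                 # exchange venue
    "direction": "yes",                # position direction
    "entryPrice": 42,                  # entry price in cents
    "marketTitle": "Example market",   # human-readable market name
    "externalMarketId": "EXAMPLE-TICKER",  # market ticker (placeholder)
    # Optional fields:
    "size": 10,                        # position size (contracts)
    "rationale": "Recorded after an intent filled",
}

# Client-side sanity check: the seven required fields from the table above.
REQUIRED = {"apiKey", "thesisId", "venue", "direction",
            "entryPrice", "marketTitle", "externalMarketId"}
missing = REQUIRED - add_position_args.keys()
assert not missing, f"missing required fields: {missing}"
```

A pre-flight check like this catches a forgotten required field before the call is sent, which matters for a mutation tool whose error behavior is undocumented.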
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. While it implies a write operation ('Record a new position'), it lacks details on permissions, side effects, error handling, or response format. For a mutation tool with zero annotation coverage, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded, consisting of just two sentences that directly state the purpose and usage context. Every word earns its place, with no redundant or unnecessary information, making it efficient and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (a write operation with 9 parameters, 7 required), no annotations, and no output schema, the description is incomplete. It doesn't address behavioral aspects like what happens on success/failure, authentication needs (implied by 'apiKey' but not explained), or how it integrates with sibling tools, leaving significant gaps for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the input schema already documents all parameters thoroughly. The description doesn't add any additional meaning or context beyond what the schema provides (e.g., it doesn't explain relationships between parameters like 'thesisId' and 'externalMarketId'). Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with a specific verb ('Record') and resource ('new position in a thesis for tracking'), making it understandable. However, it doesn't explicitly differentiate from sibling tools like 'update_position' or 'close_position', which would require more specific context about when to use each.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides implied usage guidelines by stating 'Use after an intent fills or a manual trade,' which gives some context for when to invoke this tool. However, it doesn't explicitly mention alternatives (e.g., when to use 'update_position' instead) or exclusions, leaving room for ambiguity compared to sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

augment_tree (Grade A)

Advanced: Merge suggested causal tree nodes from evaluations into the tree (append-only).

Parameters (JSON Schema)
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- dryRun (optional): Preview without applying (default: false)
- thesisId (required): Thesis ID
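Because augment_tree is append-only and labeled "Advanced", a cautious client might preview with dryRun before applying. A hypothetical sketch; the key and thesis ID are placeholders.

```python
# Preview-then-apply pattern for augment_tree (placeholder values).
base_args = {
    "apiKey": "sf_live_PLACEHOLDER",
    "thesisId": "thesis_123",
}

preview_args = {**base_args, "dryRun": True}   # preview: nothing is merged
apply_args = {**base_args, "dryRun": False}    # apply: merge for real (the default)

assert preview_args["dryRun"] and not apply_args["dryRun"]
```

Since the merge is append-only and presumably not reversible through this tool, inspecting the dry-run result first is the safer workflow.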
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses the 'append-only' behavioral trait, which is valuable context beyond the basic 'merge' action. However, it doesn't mention permissions, side effects, rate limits, or what happens on failure. The 'Advanced' label hints at complexity but lacks specifics.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence with zero waste. It's front-loaded with the core action and includes a qualifier ('append-only') that adds necessary context without verbosity. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a mutation tool with no annotations and no output schema, the description is minimal but covers the core purpose. It lacks details on permissions, error handling, or return values, which are important for an 'Advanced' operation. The 100% schema coverage helps, but behavioral context is incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are well-documented in the schema. The description doesn't add any parameter-specific semantics beyond what the schema provides (e.g., it doesn't explain how 'dryRun' interacts with merging). Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('merge suggested causal tree nodes from evaluations') and resource ('into the tree'), with the qualifier 'append-only' indicating a specific type of merge. It distinguishes from siblings like 'update_nodes' by focusing on merging from evaluations rather than general updates, though it doesn't explicitly name alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when there are 'suggested causal tree nodes from evaluations' to be merged, but doesn't specify when to use this versus alternatives like 'update_nodes' or 'update_thesis'. No explicit when-not or prerequisite guidance is provided, leaving usage context somewhat implied.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

browse_public_skills (Grade A)

Browse public skills from the community. No auth required. Filter by category, search, or sort by popularity.

Parameters (JSON Schema)
- q (optional): Search skills by name/description
- sort (optional): Sort order: popular or new (default: new)
- category (optional): Filter by category: custom, trading, research, monitoring
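Since every parameter is optional and no API key is needed, the simplest call takes no arguments at all. A hypothetical sketch; the search term is a placeholder, while the sort and category values come from the enums in the table above.

```python
# browse_public_skills requires no auth; all parameters are optional.
default_browse = {}  # no filters: per the schema, sort defaults to "new"

filtered_browse = {
    "q": "arbitrage",        # placeholder search term (name/description match)
    "sort": "popular",       # documented values: "popular" or "new"
    "category": "trading",   # documented values: custom, trading, research, monitoring
}

assert filtered_browse["category"] in {"custom", "trading", "research", "monitoring"}
```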
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It adds valuable context: 'No auth required' clarifies access requirements, and 'Filter by category, search, or sort by popularity' hints at functionality beyond basic listing. It doesn't cover rate limits or response format, but provides useful operational details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences with zero waste: the first states the core purpose and auth requirement, the second lists capabilities. It is appropriately sized and front-loaded, with every sentence earning its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only tool with 3 parameters and 100% schema coverage but no output schema, the description is reasonably complete. It covers purpose, auth, and capabilities, though it doesn't explain return values or pagination. Given the tool's moderate complexity, it provides sufficient context for an agent to use it effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents all three parameters (q, sort, category). The description mentions filtering and sorting but adds no syntax or format details beyond what the schema provides. Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('browse') and resource ('public skills from the community'), making the purpose specific and unambiguous. It distinguishes this tool from siblings like 'list_skills' or 'explore_public' by emphasizing the public/community aspect and browsing functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('browse public skills from the community') and mentions filtering/sorting capabilities. However, it does not explicitly state when not to use it or name alternatives among siblings (e.g., 'list_skills' might be for private skills), leaving some ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

cancel_intent (Grade B)

Cancel an active intent. Stops trigger evaluation and prevents execution.

Parameters (JSON Schema)
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- intentId (required): Intent ID to cancel
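The full payload is just two fields. A hypothetical sketch with placeholder values; note the description's implied prerequisite that the intent must still be active.

```python
# Minimal cancel_intent payload (placeholder values).
# Prerequisite implied by the description: the intent must still be active,
# i.e. not yet executed or expired.
cancel_args = {
    "apiKey": "sf_live_PLACEHOLDER",  # SimpleFunctions API key
    "intentId": "intent_456",         # placeholder ID of the intent to cancel
}

assert set(cancel_args) == {"apiKey", "intentId"}
```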
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions 'Stops trigger evaluation and prevents execution', which adds some behavioral context about halting processes. However, it lacks details on permissions, reversibility, side effects, or response format, leaving significant gaps for a mutation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with two short sentences that directly state the tool's purpose and effect. Every word contributes meaning, with no wasted text, making it front-loaded and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a mutation tool with no annotations and no output schema, the description is incomplete. It lacks information on success/failure outcomes, error conditions, or broader system impact, which are critical for an agent to use this tool effectively in context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters ('apiKey', 'intentId') well-documented in the schema. The description adds no additional parameter details beyond what the schema provides, so it meets the baseline of 3 without compensating for any gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Cancel') and resource ('an active intent'), specifying what the tool does. It distinguishes from siblings like 'create_intent' or 'list_intents' by focusing on termination rather than creation or listing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It does not mention prerequisites (e.g., needing an active intent) or compare to other tools like 'close_position' or 'update_strategy', leaving usage context unclear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

close_position (Grade C)

Delete a position record from a thesis.

Parameters (JSON Schema)
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- thesisId (required): Thesis ID
- positionId (required): Position ID
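Given that the description implies a destructive delete with no disclosed reversibility, a client can wrap the call in an explicit confirmation gate. A hypothetical sketch; the guard function and all IDs are illustrative, not part of the server's API.

```python
# close_position deletes a record; guard the destructive call client-side.
# All IDs are placeholders.
close_args = {
    "apiKey": "sf_live_PLACEHOLDER",
    "thesisId": "thesis_123",
    "positionId": "pos_789",
}

def guarded_close(args, confirmed=False):
    """Refuse to send a delete unless explicitly confirmed (hypothetical helper)."""
    if not confirmed:
        raise RuntimeError("refusing to delete without confirmation")
    return args  # a real client would invoke the close_position tool here

try:
    guarded_close(close_args)          # blocked: confirmation not given
except RuntimeError:
    pass
result = guarded_close(close_args, confirmed=True)  # proceeds
```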
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It states 'Delete,' implying a destructive mutation, but doesn't disclose behavioral traits like whether deletion is permanent, requires specific permissions, has side effects on related data, or returns confirmation. This leaves significant gaps in understanding the tool's impact.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, direct sentence with zero wasted words, efficiently conveying the core action and target. It's appropriately sized and front-loaded, making it easy for an agent to parse quickly without unnecessary detail.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity as a destructive operation with no annotations and no output schema, the description is incomplete. It fails to address critical aspects like return values, error conditions, or confirmation of deletion, leaving the agent with insufficient information for reliable invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear parameter descriptions in the input schema (e.g., 'Thesis ID,' 'Position ID,' 'SimpleFunctions API key'). The description adds no additional semantic context beyond what the schema provides, so it meets the baseline for adequate but not enhanced parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Delete') and target resource ('a position record from a thesis'), making the purpose unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'update_position' or 'add_position', which would require more specific context about what distinguishes deletion from modification or creation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like 'update_position' or other position-related operations. The description lacks context about prerequisites, such as ensuring the position exists or isn't actively in use, leaving the agent without usage direction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

configure_heartbeat (Grade A)

Configure the 24/7 heartbeat engine: news scan interval, X scan interval, LLM model tier, monthly budget, pause/resume. Agent can speed up monitoring during high volatility or slow down to save budget.

Parameters (JSON Schema)
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- paused (optional): Pause/resume heartbeat
- thesisId (required): Thesis ID
- xIntervalMin (optional): X/social scan interval in minutes (60-1440, default 240)
- evalModelTier (optional): LLM model for evaluations
- newsIntervalMin (optional): News scan interval in minutes (15-1440, default 60)
- monthlyBudgetUsd (optional): Monthly budget cap in USD (0 = unlimited)
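The interval ranges documented in the schema (15-1440 for news, 60-1440 for X) can be enforced client-side before sending. A hypothetical sketch; the clamp helper and all values are illustrative.

```python
# Clamp heartbeat intervals to the ranges documented in the schema.
def clamp(value, lo, hi):
    """Force value into [lo, hi] (hypothetical client-side helper)."""
    return max(lo, min(hi, value))

heartbeat_args = {
    "apiKey": "sf_live_PLACEHOLDER",
    "thesisId": "thesis_123",
    "newsIntervalMin": clamp(5, 15, 1440),   # 5 is below the 15-1440 range, clamps to 15
    "xIntervalMin": clamp(240, 60, 1440),    # already in range, stays 240
    "monthlyBudgetUsd": 0,                   # 0 = unlimited per the schema
    "paused": False,                         # keep the heartbeat running
}

assert heartbeat_args["newsIntervalMin"] == 15
```

Whether the server rejects or silently clamps out-of-range values is undocumented, which is exactly why normalizing before the call is the safer assumption.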
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It mentions the tool configures settings but doesn't disclose behavioral traits like whether changes are immediate, reversible, require specific permissions, or have side effects. The description adds some context about volatility/budget scenarios but lacks detail about the mutation's impact, error conditions, or rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized with two sentences. The first sentence front-loads the purpose and lists key parameters efficiently. The second sentence provides usage guidance. There's minimal waste, though the parameter listing could be slightly more structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a configuration/mutation tool with 7 parameters, no annotations, and no output schema, the description is adequate but has gaps. It covers what the tool does and when to use it but lacks details about the mutation's behavior, response format, error handling, or prerequisites. The schema provides parameter documentation, but the description doesn't compensate for missing behavioral context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 7 parameters thoroughly. The description lists parameter categories (news scan interval, X scan interval, etc.) but doesn't add meaningful semantic context beyond what's in the schema descriptions. Baseline 3 is appropriate when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool configures the '24/7 heartbeat engine' with specific configuration parameters (news scan interval, X scan interval, LLM model tier, monthly budget, pause/resume). It specifies the verb 'configure' and resource 'heartbeat engine' but doesn't explicitly differentiate from sibling tools like 'get_heartbeat_status' or 'monitor_the_situation'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear usage context: 'Agent can speed up monitoring during high volatility or slow down to save budget.' This gives practical guidance on when to adjust settings. However, it doesn't explicitly state when NOT to use this tool or mention alternatives among the many sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_intent (Grade A)

Declare an execution intent: "buy X when condition Y, expire at Z." Intents are the single gateway for all order execution. The local runtime daemon evaluates triggers and executes via user's Kalshi/Polymarket keys.

Parameters (JSON Schema)
- venue (required): Exchange venue
- action (required): Trade action
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- source (optional): Intent source: agent, manual, ideaagent
- expireAt (optional): ISO timestamp when intent expires (default: +24h)
- marketId (required): Market ticker (e.g. KXFEDDEC-25DEC31-T100)
- maxPrice (optional): Max price per contract in cents (1-99). Omit for market order.
- sourceId (optional): Source reference (idea ID, thesis ID, etc.)
- direction (required): Contract direction
- rationale (optional): Why this trade — logged for audit trail
- autoExecute (optional): Auto-execute without human confirmation
- marketTitle (required): Human-readable market name
- triggerType (optional): When to execute (default: immediate)
- triggerPrice (optional): Price trigger threshold in cents (for price_below/price_above)
- targetQuantity (required): Number of contracts
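The "buy X when condition Y, expire at Z" pattern from the description maps onto the schema fields. A hypothetical sketch: all values are placeholders (the ticker reuses the schema's own example format), and none of this is trading advice.

```python
# Hypothetical create_intent payload: "buy 10 YES contracts if the
# price drops below 40 cents, expire at year end". Placeholder values only.
intent_args = {
    "apiKey": "sf_live_PLACEHOLDER",
    "venue": "kalshi",
    "action": "buy",
    "marketId": "KXFEDDEC-25DEC31-T100",   # ticker format from the schema example
    "marketTitle": "Example market",
    "direction": "yes",
    "targetQuantity": 10,                  # number of contracts
    "triggerType": "price_below",          # default would be "immediate"
    "triggerPrice": 40,                    # cents; used with price_below/price_above
    "maxPrice": 45,                        # 1-99 cents; omit for a market order
    "expireAt": "2025-12-31T00:00:00Z",    # ISO timestamp (default: +24h)
    "autoExecute": False,                  # keep human confirmation in the loop
    "rationale": "Illustrative limit entry, logged for audit",
}

# maxPrice is documented as 1-99 cents; validate before sending.
assert 1 <= intent_args["maxPrice"] <= 99
```

Keeping autoExecute False is the conservative default here, since the intent is ultimately executed against the user's real Kalshi/Polymarket keys.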
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions that intents are evaluated and executed via user's keys, hinting at external dependencies and potential financial actions, but lacks details on permissions, rate limits, error handling, or what happens upon execution (e.g., order confirmation). This leaves gaps for a tool with significant real-world impact.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is highly concise and front-loaded, using just two sentences that efficiently convey the core purpose and mechanism without any wasted words. Every sentence earns its place by defining the tool's role and execution process clearly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (15 parameters, financial trading context) and lack of annotations or output schema, the description is incomplete. It adequately explains the high-level purpose but misses critical details like return values, error cases, or behavioral nuances needed for safe use in a trading environment, leaving room for improvement.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the input schema already documents all 15 parameters thoroughly. The description adds minimal value by implying parameters like 'buy X' (action/market), 'condition Y' (trigger), and 'expire at Z' (expireAt), but does not provide additional semantics beyond what the schema offers, such as usage examples or constraints not in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('declare an execution intent') and resources ('buy X when condition Y, expire at Z'), explicitly distinguishing it from siblings by noting it's 'the single gateway for all order execution.' This goes beyond just restating the name to explain its unique role in the system.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context that this tool is for declaring intents that are evaluated and executed by a local runtime daemon, implying it should be used for automated trading setups. However, it does not explicitly state when not to use it or name specific alternatives among the many sibling tools, such as 'add_position' or 'update_position,' which might handle similar trading actions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_skill (Grade B)

Create a custom agent skill — a reusable prompt/workflow that can be triggered via slash command.

Parameters (JSON Schema)
- auto (optional): Auto-trigger condition
- name (required): Skill name (e.g. "precheck")
- tags (optional): Tags for discovery
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- prompt (required): The full prompt/instructions for the skill
- trigger (required): Slash command trigger (e.g. "/precheck")
- category (optional): Category: custom, trading, research, monitoring
- toolsUsed (optional): SF tools this skill uses
- description (required): What this skill does
- estimatedTime (optional): Estimated run time
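A skill bundles a name, a slash-command trigger, and a prompt. A hypothetical sketch reusing the schema's own "precheck" example; the prompt text and tags are illustrative placeholders.

```python
# Hypothetical create_skill payload; skill content is illustrative.
skill_args = {
    "apiKey": "sf_live_PLACEHOLDER",
    "name": "precheck",                     # example name from the schema
    "trigger": "/precheck",                 # slash command that fires the skill
    "prompt": "Before any trade, list open intents and current positions.",
    "description": "Pre-trade sanity checklist",
    # Optional fields:
    "category": "trading",                  # custom, trading, research, monitoring
    "tags": ["safety", "checklist"],        # tags for discovery
}

# Convention suggested by the schema examples: the trigger mirrors the name.
assert skill_args["trigger"] == "/" + skill_args["name"]
```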
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden, yet it only states the basic function. It doesn't disclose behavioral traits such as permissions needed (e.g., API key usage), whether creation is reversible, rate limits, or what happens on success/failure. This is inadequate for a mutation tool with zero annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose. Every word earns its place without redundancy, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a creation tool with 10 parameters, no annotations, and no output schema, the description is insufficient. It lacks context on usage scenarios, error handling, or what the tool returns (e.g., a skill ID). This leaves significant gaps for an agent to operate effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 10 parameters thoroughly. The description adds no additional meaning beyond what's in the schema (e.g., it doesn't clarify relationships between parameters like 'trigger' and 'name'). Baseline 3 is appropriate when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Create') and resource ('custom agent skill'), specifying it's a reusable prompt/workflow triggered via slash command. It distinguishes from siblings like 'fork_skill' (which duplicates) or 'list_skills' (which retrieves).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives like 'fork_skill' (for duplicating) or 'publish_skill' (for making public). The description implies usage for initial creation but lacks context on prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_strategy (A)

Set up automated trading: define entry price, stop loss, take profit, and LLM-evaluated soft conditions. The heartbeat engine checks conditions every 15 min and executes when met.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | |
| market | Yes | Human-readable market name | |
| horizon | No | Time horizon | medium |
| marketId | Yes | Market ticker, e.g. KXWTIMAX-26DEC31-T150 | |
| stopLoss | No | Stop loss: bid <= this value (cents) | |
| thesisId | Yes | Thesis ID | |
| direction | Yes | Trade direction | |
| rationale | No | Full logic description | |
| entryAbove | No | Entry trigger: ask >= this value (cents, for NO direction) | |
| entryBelow | No | Entry trigger: ask <= this value (cents) | |
| takeProfit | No | Take profit: bid >= this value (cents) | |
| maxQuantity | No | Max total contracts | |
| softConditions | No | LLM-evaluated conditions | |
| perOrderQuantity | No | Contracts per order | |
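The cent thresholds above imply an ordering for a coherent long setup: the stop sits below the entry trigger, which sits below the take-profit. A hypothetical argument set (values and IDs invented; only the parameter names and threshold semantics come from the schema) can be sanity-checked before submission:

```python
# Illustrative create_strategy arguments. Per the schema notes:
# entryBelow fires when ask <= value, stopLoss when bid <= value,
# takeProfit when bid >= value (all in cents).
args = {
    "apiKey": "YOUR_API_KEY",
    "thesisId": "thesis_123",                 # hypothetical thesis ID
    "marketId": "KXWTIMAX-26DEC31-T150",      # ticker format from the schema example
    "market": "WTI crude above $150 by end of 2026",
    "direction": "yes",
    "entryBelow": 30,    # buy when ask <= 30c
    "stopLoss": 20,      # exit when bid <= 20c
    "takeProfit": 60,    # exit when bid >= 60c
    "maxQuantity": 100,
    "perOrderQuantity": 25,
}

# A coherent long setup satisfies stopLoss < entryBelow < takeProfit.
coherent = args["stopLoss"] < args["entryBelow"] < args["takeProfit"]
```

Since the heartbeat engine only evaluates every 15 minutes, thresholds this tight may be crossed and re-crossed between checks; that latency is worth factoring into the spread between entry and stop.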
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It adds valuable context about the 'heartbeat engine' checking conditions every 15 minutes and executing when met, which reveals automation behavior. However, it doesn't cover important aspects like authentication needs (apiKey is documented in schema but not described in behavior), rate limits, error handling, or what happens on execution failure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized with two sentences that efficiently convey the core functionality. The first sentence establishes the purpose, and the second adds important behavioral context about the heartbeat engine. There's minimal waste, though it could be slightly more structured for clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex 14-parameter tool with no annotations and no output schema, the description provides adequate but incomplete context. It covers the automation aspect well but doesn't address important considerations like what the tool returns, error conditions, or how it interacts with other trading tools. The description compensates somewhat for the lack of structured metadata but leaves gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 14 parameters thoroughly. The description mentions key parameters like 'entry price, stop loss, take profit, and LLM-evaluated soft conditions' but doesn't add meaningful semantic context beyond what the schema provides. It establishes the purpose but doesn't clarify parameter relationships or usage nuances.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Set up automated trading') and resources ('define entry price, stop loss, take profit, and LLM-evaluated soft conditions'), distinguishing it from siblings like 'create_intent' or 'update_strategy' by focusing on automated trading strategy creation with heartbeat monitoring.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context ('automated trading', 'heartbeat engine checks conditions every 15 min') but doesn't explicitly state when to use this tool versus alternatives like 'create_intent' or 'update_strategy'. It provides some operational context but lacks explicit guidance on prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_thesis (C)

Create a new prediction market thesis. Builds a causal tree and scans for mispriced contracts.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| sync | No | Wait for formation to complete | |
| apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | |
| thesis | Yes | Your thesis statement | |
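A minimal argument sketch for `create_thesis`. The thesis text and API key placeholder are invented; the `sync` semantics (block until the causal tree and edge scan finish) are inferred from the schema description and should be treated as an assumption:

```python
# Hypothetical create_thesis arguments; parameter names from the schema above.
args = {
    "apiKey": "YOUR_API_KEY",
    "thesis": "OPEC+ supply cuts push WTI above $100 before 2027",  # invented example
    "sync": True,  # assumption: wait for formation (tree build + edge scan) to complete
}
```

With `sync` omitted, formation presumably runs in the background and the result would be polled via a read tool such as `get_context`.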
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'Builds a causal tree and scans for mispriced contracts', which hints at computational behavior, but it lacks critical details like whether this is a read-only or mutating operation, what permissions or API keys are required (though the schema covers the apiKey parameter), rate limits, or what the output looks like. For a creation tool with zero annotation coverage, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose and adds implementation details without redundancy. Every word earns its place, making it highly concise and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of creating a prediction market thesis with no annotations and no output schema, the description is incomplete. It lacks behavioral context (e.g., mutation effects, error handling), output details, and usage guidelines, leaving significant gaps for the agent to navigate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters (sync, apiKey, thesis) with descriptions. The description doesn't add any meaning beyond what the schema provides, such as explaining the relationship between parameters or usage examples. The baseline score of 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Create a new prediction market thesis') and the resource ('thesis'), and it adds implementation details about building a causal tree and scanning for mispriced contracts. However, it doesn't explicitly differentiate this tool from sibling tools like 'create_intent', 'create_skill', or 'create_strategy', which all involve creation operations in the same domain.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, compare it to similar tools like 'create_intent' or 'create_skill', or specify scenarios where it's appropriate. The agent must infer usage from the purpose alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

enrich_content (B)

Cross-reference any text with thousands of prediction market contracts. Paste content + topics, get divergence analysis: where sentiment disagrees with market prices. No auth, no Firecrawl needed. Demo/trial entry point.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| model | No | LLM model for digest generation | google/gemini-2.5-flash |
| topics | Yes | Topics to search in prediction markets, e.g. ["iran", "oil"] | |
| content | Yes | Text content to cross-reference (up to 50K chars) | |
| includeIndex | No | Include SF Index | |
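A hypothetical `enrich_content` argument set, with a local check against the one hard constraint the schema states (content capped at 50K characters). The content string is invented; the topics follow the schema's own example:

```python
# Illustrative enrich_content arguments (no auth required per the description).
args = {
    "topics": ["iran", "oil"],          # topics example taken from the schema
    "content": "Sanctions talk is heating up while tanker rates spike...",  # invented
    "model": "google/gemini-2.5-flash", # the schema default, spelled out explicitly
    "includeIndex": False,
}

# Validate against the documented 50K-character content cap before calling.
ok = len(args["content"]) <= 50_000 and len(args["topics"]) > 0
```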
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It adds useful context: 'No auth, no Firecrawl needed' indicates accessibility and integration simplicity, and 'Demo/trial entry point' hints at potential limitations or trial nature. However, it lacks details on rate limits, error handling, or what 'divergence analysis' entails in practice, leaving gaps for a tool with no output schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loaded, starting with the core functionality. Each sentence adds value: the first explains the tool's purpose, the second specifies input/output, and the third provides contextual details. However, the phrase 'Demo/trial entry point' could be more precise, and the structure slightly mixes operational details with purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of cross-referencing text with prediction markets and no output schema, the description is moderately complete. It covers the tool's purpose, input requirements, and some behavioral context (e.g., no auth needed), but lacks details on output format, error cases, or how 'divergence analysis' is presented. For a tool with 4 parameters and no annotations, more completeness would be beneficial.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description adds minimal value beyond the schema, mentioning 'Paste content + topics' which aligns with the required parameters but doesn't provide additional syntax or format details. With high schema coverage, the baseline score of 3 is appropriate as the description doesn't significantly enhance parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Cross-reference any text with thousands of prediction market contracts' and 'get divergence analysis: where sentiment disagrees with market prices.' It specifies the verb ('cross-reference', 'get divergence analysis') and resource ('text', 'prediction market contracts'), though it doesn't explicitly differentiate from sibling tools like 'scan_markets' or 'screen_markets' which might have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context by mentioning 'Demo/trial entry point' and 'No auth, no Firecrawl needed,' suggesting it's for initial exploration or testing. However, it doesn't provide explicit guidance on when to use this tool versus alternatives like 'scan_markets' or 'query', nor does it specify prerequisites or exclusions beyond the implied trial context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

explore_public (A)

Browse public theses from other users. No auth required. Pass a slug to get details, or omit to list all.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| slug | No | Specific thesis slug, or empty to list all | |
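The single optional parameter gives two call shapes, sketched below; the slug value and the `name`/`arguments` envelope are illustrative assumptions:

```python
# Two hypothetical call shapes for explore_public (no auth required).
list_all = {"name": "explore_public", "arguments": {}}  # omit slug: list all public theses
get_one = {"name": "explore_public",
           "arguments": {"slug": "example-thesis"}}     # pass a slug: fetch one thesis's details
```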
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses that 'No auth required' and implies read-only behavior ('Browse'), but lacks details on rate limits, pagination for listing, or error handling. This is adequate but has gaps for a tool with no annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences with zero waste: it states the purpose, auth requirement, and parameter usage efficiently. Every sentence adds value, and it's front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations, no output schema, and a simple input schema, the description is moderately complete. It covers purpose, auth, and parameter usage but lacks details on return values, error cases, or behavioral constraints like rate limits. This is adequate but not fully comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 100%, so the schema already documents the 'slug' parameter. The description adds minimal value by restating the parameter's purpose ('Pass a slug to get details, or omit to list all'), which aligns with the schema but doesn't provide additional semantics like format examples. Baseline 3 is appropriate here.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Browse') and resource ('public theses from other users'), making the purpose understandable. However, it doesn't explicitly differentiate from sibling tools like 'list_theses' or 'browse_public_skills', which would require a 5.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on when to use the tool: 'Pass a slug to get details, or omit to list all.' It also mentions 'No auth required,' which is helpful. However, it doesn't specify when to use this tool over alternatives like 'list_theses' or 'browse_public_skills', preventing a score of 5.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fork_skill (B)

Fork a public skill into your collection. No slug needed — just the skill ID.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | |
| skillId | Yes | Public skill ID to fork | |
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It mentions the operation ('fork') which implies a copy/create action, but doesn't describe what happens during forking (e.g., whether it creates an editable copy, maintains links to original, includes dependencies). No information about permissions needed, rate limits, error conditions, or what the forked skill inherits. The description is minimal and leaves significant behavioral aspects unspecified.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at just two sentences, with zero wasted words. It's front-loaded with the core purpose, followed by a brief parameter guidance. Every sentence earns its place by providing essential information about the tool's function and parameter expectations.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and no output schema, the description is incomplete for a mutation tool. While concise, it lacks critical information about what the fork operation actually does behaviorally, what permissions are required, what the response contains, or how the forked skill relates to the original. For a tool that presumably creates a copy of someone else's work, more context about the operation's implications would be valuable.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so both parameters are documented in the schema. The description adds minimal value beyond the schema: it mentions 'No slug needed' which clarifies what's NOT required, and 'just the skill ID' reinforces the skillId parameter's purpose. However, it doesn't provide additional context about parameter relationships, validation rules, or usage patterns that aren't already in the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Fork') and resource ('a public skill into your collection'), providing a specific verb+resource combination. It distinguishes from siblings like 'create_skill' by specifying it's for forking existing public skills rather than creating new ones from scratch. However, it doesn't explicitly differentiate from 'fork_thesis' which is a similar operation on a different resource type.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context by specifying 'public skill' and 'into your collection,' suggesting this is for copying shared skills to personal use. It mentions 'No slug needed' which provides some guidance about parameter requirements. However, it doesn't explicitly state when to use this versus alternatives like 'create_skill' or 'browse_public_skills,' nor does it provide exclusion criteria or prerequisites beyond having the skill ID.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fork_thesis (C)

Fork a public thesis into your collection. Copies the thesis statement and causal tree. You then run formation to scan for fresh edges.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | |
| idOrSlug | Yes | Public thesis ID or slug to fork | |
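The two fork tools differ only in their identifier parameter: `fork_skill` takes an ID (no slug), while `fork_thesis` accepts either an ID or a slug. A side-by-side sketch (IDs, slug, and key placeholder all invented):

```python
# Hypothetical payloads contrasting the two fork tools' identifier parameters.
api_key = "YOUR_API_KEY"  # both fork tools require an API key

fork_skill_call = {"name": "fork_skill",
                   "arguments": {"apiKey": api_key,
                                 "skillId": "skill_abc123"}}   # ID only; slugs not accepted

fork_thesis_call = {"name": "fork_thesis",
                    "arguments": {"apiKey": api_key,
                                  "idOrSlug": "example-thesis"}}  # ID or slug both work
```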
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It mentions copying content and running 'formation to scan for fresh edges,' which hints at post-fork automation, but fails to disclose critical behavioral traits like authentication needs (implied by apiKey), rate limits, side effects, or what 'formation' entails. This leaves significant gaps for a mutation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loaded with the main action, using three sentences that each add value: forking, copying content, and post-fork automation. There is no wasted text, though it could be slightly more structured for clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a mutation tool with no annotations and no output schema, the description is incomplete. It misses details on authentication (only implied by apiKey), error handling, what 'formation' does, and the return value. Given the complexity and lack of structured data, more context is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters (apiKey and idOrSlug). The description adds no additional meaning about parameters beyond implying 'idOrSlug' refers to a public thesis, which is minimal value. Baseline 3 is appropriate as the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Fork a public thesis into your collection') and specifies what gets copied ('thesis statement and causal tree'), which distinguishes it from generic creation tools. However, it doesn't explicitly differentiate from sibling tools like 'fork_skill' or 'create_thesis' beyond mentioning 'public thesis'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for forking public theses but provides no guidance on when to use this versus alternatives like 'create_thesis' or 'fork_skill', nor does it mention prerequisites or exclusions. It lacks explicit when/when-not instructions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_balance (C)

Advanced: Kalshi account balance and portfolio value.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | |
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'Advanced:' which might imply complexity or additional requirements, but doesn't explain what that entails (e.g., rate limits, authentication needs, or data freshness). It also doesn't describe the return format (e.g., numeric balance, structured portfolio data) or potential errors. For a financial tool with zero annotation coverage, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief with two phrases, but 'Advanced:' is vague and doesn't earn its place by adding useful information—it could be omitted without loss. The core purpose is stated efficiently, but the structure isn't front-loaded with the most critical details (e.g., it doesn't immediately clarify this is a read-only tool for account data). It's concise but could be more effectively structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of a financial tool with no annotations and no output schema, the description is incomplete. It doesn't explain what the tool returns (e.g., balance in dollars, portfolio breakdown), behavioral aspects like authentication requirements or rate limits, or how it fits with sibling tools. For a tool that likely provides critical account information, this leaves too much unspecified for reliable agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage for its single parameter (apiKey), detailing it as a 'SimpleFunctions API key' with a URL for acquisition. The description adds no parameter-specific information beyond what the schema provides, such as how the apiKey relates to the Kalshi account or any additional context. With high schema coverage, the baseline score of 3 is appropriate as the description doesn't enhance parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool retrieves 'Kalshi account balance and portfolio value', which specifies the resource (account/portfolio) and the action (get/retrieve). However, it's prefixed with 'Advanced:' which adds ambiguity rather than clarity, and it doesn't distinguish this from potential sibling tools like 'get_fills' or 'get_settlements' that might also relate to account data. The purpose is clear but not optimally specific.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing an API key for authentication), context for usage (e.g., checking account status before trading), or exclusions (e.g., not for real-time market data). With many sibling tools present, this lack of differentiation leaves the agent to infer usage from the name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_context (A)

START HERE. Global market snapshot: top edges (mispriced contracts), price movers, highlights, traditional markets — live exchange data updated every 15 min. With thesisId + apiKey: thesis-specific context including causal tree, edges with orderbook depth, evaluation history, and track record. No auth needed for global context.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| apiKey | No | SimpleFunctions API key. Required only for thesis-specific context. Get one at https://simplefunctions.dev/dashboard/keys or run: sf login | |
| thesisId | No | Thesis ID. Omit for global market snapshot (no auth needed). | |
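Because both parameters are optional but coupled, `get_context` effectively has two modes, sketched below; the thesis ID, key placeholder, and call envelope are invented for illustration:

```python
# Hypothetical payloads for get_context's two documented modes.
global_ctx = {"name": "get_context",
              "arguments": {}}  # no auth: global market snapshot (edges, movers, highlights)

thesis_ctx = {"name": "get_context",
              "arguments": {"apiKey": "YOUR_API_KEY",   # required alongside thesisId
                            "thesisId": "thesis_123"}}  # thesis-specific context

# The coupling the description implies: thesisId without apiKey is an invalid combination.
valid = "thesisId" not in thesis_ctx["arguments"] or "apiKey" in thesis_ctx["arguments"]
```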
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden and does well by disclosing key behavioral traits: data freshness ('live exchange data updated every 15 min'), authentication requirements ('No auth needed for global context'), and the two distinct operational modes. It doesn't mention rate limits, error conditions, or response format details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with front-loaded information about global context, followed by thesis-specific details. Every sentence adds value, though the 'START HERE' prefix is slightly redundant and the description could be more streamlined.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 2 parameters, no annotations, and no output schema, the description provides good coverage of functionality and usage modes. However, it lacks details about response format, error handling, and doesn't explain what 'edges' or 'causal tree' mean in this context, which could be important for proper tool selection.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description adds meaningful context beyond the schema by explaining the functional relationship between parameters: 'With thesisId + apiKey: thesis-specific context' and 'Omit for global market snapshot'. This clarifies how parameter combinations affect tool behavior.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool provides market data and thesis-specific context, with specific resources mentioned (mispriced contracts, price movers, highlights, traditional markets, causal tree, edges with orderbook depth, evaluation history, track record). However, it doesn't explicitly differentiate from siblings like 'get_edges' or 'get_world_state' which might provide overlapping market data.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use different modes: 'No auth needed for global context' and 'With thesisId + apiKey: thesis-specific context'. It clearly distinguishes between global snapshot and thesis-specific access, though it doesn't mention alternatives among sibling tools.

get_edges (Grade A)

Top mispriced markets across all theses, ranked by edge size. Shows where your model disagrees with the market. No auth = public thesis edges. Auth = private + public.

Parameters (JSON Schema)
limit (optional): Max edges to return
venue (optional): Filter: kalshi or polymarket
apiKey (optional): SimpleFunctions API key. Omit for public edges only.
minEdge (optional): Minimum edge in cents
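Since all four parameters are optional, a call can be sketched as a filter dict. The venue check mirrors the documented values (kalshi, polymarket); the helper itself is a hypothetical client-side convenience, not server behavior.

```python
VALID_VENUES = {"kalshi", "polymarket"}  # documented venue values

def build_get_edges_args(limit=None, venue=None, api_key=None, min_edge=None):
    """Build a get_edges argument dict, dropping unset optional filters."""
    if venue is not None and venue not in VALID_VENUES:
        raise ValueError(f"venue must be one of {sorted(VALID_VENUES)}")
    raw = {"limit": limit, "venue": venue, "apiKey": api_key, "minEdge": min_edge}
    return {k: v for k, v in raw.items() if v is not None}

# public edges only (no apiKey), Kalshi, at least 5 cents of edge
args = build_get_edges_args(limit=10, venue="kalshi", min_edge=5)
```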
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses key behavioral traits: the ranking mechanism (by edge size), the effect of auth (public vs. private+public edges), and the focus on model-market disagreement. However, it lacks details on rate limits, error handling, or output format, which would be needed for a perfect score.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by clarifying details in two concise sentences. Every sentence adds value (e.g., explaining auth impact), with no wasted words or redundancy, making it highly efficient.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (4 parameters, no annotations, no output schema), the description is adequate but has gaps. It covers the purpose, auth context, and ranking logic, but lacks details on output format, error cases, or deeper behavioral aspects like pagination. This makes it minimally viable for a tool with this parameter count and no structured support.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters (limit, venue, apiKey, minEdge). The description adds marginal value by implying the apiKey's role in auth (private vs. public edges), but does not provide additional semantics beyond what the schema offers, such as explaining 'edge' units or venue specifics.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: it retrieves 'Top mispriced markets across all theses, ranked by edge size' and explains that it 'Shows where your model disagrees with the market.' This specifies the verb (get/retrieve), resource (edges/mispriced markets), and scope (across all theses), distinguishing it from siblings like get_markets or get_forecast by focusing on edge-based ranking of mispricing.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use it: to find mispriced markets based on edge size. It distinguishes between public and private edges based on auth presence, but does not explicitly mention when not to use it or name specific alternatives among siblings (e.g., scan_markets or screen_markets might be related but are not referenced).

get_evaluation_history (Grade B)

Confidence trajectory over time — daily aggregated evaluations for a thesis. Use to analyze trends, detect convergence/divergence.

Parameters (JSON Schema)
apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
thesisId (required): Thesis ID
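With both parameters required, a hypothetical argument builder reduces to a presence check. The function name and the client-side rejection of empty values are assumptions, not documented behavior.

```python
def build_evaluation_history_args(api_key, thesis_id):
    """Both parameters are required by the schema; reject empties client-side."""
    if not api_key or not thesis_id:
        raise ValueError("apiKey and thesisId are both required")
    return {"apiKey": api_key, "thesisId": thesis_id}

args = build_evaluation_history_args("sf_key", "thesis-123")  # placeholders
```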
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It describes the tool as retrieving aggregated data over time, which implies a read-only operation, but does not specify aspects like rate limits, authentication needs (beyond the apiKey parameter), data format, or pagination. This leaves gaps in understanding how the tool behaves in practice.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loaded, consisting of two short sentences that directly state the tool's function and usage. There is no wasted language, and it efficiently communicates the core purpose without unnecessary elaboration.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (a read operation with 2 parameters) and the absence of annotations and output schema, the description is moderately complete. It explains what data is retrieved but lacks details on return values, error handling, or behavioral constraints. This is adequate for a basic tool but leaves room for improvement in guiding the agent fully.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters (thesisId and apiKey) documented in the schema. The description does not add any additional meaning or context beyond what the schema provides, such as explaining how the thesisId relates to evaluations or the aggregation process. Baseline 3 is appropriate as the schema handles parameter documentation adequately.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: to retrieve 'daily aggregated evaluations for a thesis' and analyze 'confidence trajectory over time.' It specifies the resource (thesis evaluations) and the verb (get/analyze), but does not explicitly differentiate from siblings like 'get_milestones' or 'get_world_state,' which might also involve thesis-related data.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions 'use to analyze trends, detect convergence/divergence,' which implies a context for usage but does not provide explicit guidance on when to use this tool versus alternatives. No sibling tools are referenced, and there is no mention of prerequisites or exclusions, leaving the agent to infer usage based on the purpose alone.

get_fills (Grade C)

Advanced: Recent trade fills on Kalshi.

Parameters (JSON Schema)
apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
ticker (optional): Filter by market ticker
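The call shape can be sketched as follows: apiKey is required and ticker optionally narrows the fills to one market. The helper name and placeholder ticker are illustrative assumptions.

```python
def build_get_fills_args(api_key, ticker=None):
    """apiKey is required; ticker optionally filters fills to one market."""
    args = {"apiKey": api_key}
    if ticker is not None:
        args["ticker"] = ticker  # "SOME-TICKER" below is a placeholder
    return args

all_fills = build_get_fills_args("sf_key")
one_market = build_get_fills_args("sf_key", ticker="SOME-TICKER")
```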
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'recent trade fills' but doesn't specify time ranges, pagination, rate limits, authentication needs beyond the apiKey parameter, or what 'Advanced:' entails operationally. This leaves significant gaps for a tool that likely involves financial data access.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise—a single sentence with no wasted words. It's front-loaded with the key information ('recent trade fills on Kalshi'), and the 'Advanced:' prefix, while minimal, adds context efficiently. Every part earns its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations, no output schema, and a tool that likely returns financial data (trade fills), the description is incomplete. It doesn't explain return values, error conditions, or behavioral traits like data recency or access constraints. For a tool with potential complexity in a trading context, this is inadequate.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters (apiKey and ticker) well-documented in the schema. The description adds no additional parameter semantics beyond implying filtering by 'recent' and 'Kalshi', which are already covered or inferred. This meets the baseline for high schema coverage.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('get') and resource ('recent trade fills on Kalshi'), making the purpose specific and understandable. It distinguishes from siblings like 'get_orders' or 'get_settlements' by focusing on trade fills, though it doesn't explicitly contrast with them. The 'Advanced:' prefix adds context but doesn't detract from clarity.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It mentions 'Advanced:' which might imply it's for experienced users, but offers no explicit when/when-not rules or references to sibling tools like 'get_orders' or 'get_settlements' for comparison. Usage is implied rather than stated.

get_forecast (Grade C)

Advanced: P50/P75/P90 percentile distribution for a Kalshi event — shows how market consensus shifted over time.

Parameters (JSON Schema)
days (optional): Days of history (default 7)
apiKey (required): SimpleFunctions API key (needed for authenticated Kalshi calls). Get one at https://simplefunctions.dev/dashboard/keys
eventTicker (required): Kalshi event ticker (e.g. KXWTIMAX-26DEC31)
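The 7-day default and the documented ticker format can be captured in a small hypothetical builder; only the parameter names, the default, and the example ticker come from the schema above, the rest is assumption.

```python
def build_get_forecast_args(api_key, event_ticker, days=7):
    """days defaults to 7, matching the schema; ticker format per the example."""
    if days < 1:
        raise ValueError("days must be a positive number of days of history")
    return {"apiKey": api_key, "eventTicker": event_ticker, "days": days}

# documented example ticker, with the 7-day default window
args = build_get_forecast_args("sf_key", "KXWTIMAX-26DEC31")
```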
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions the tool shows 'how market consensus shifted over time' which implies historical trend analysis, but doesn't disclose authentication requirements (though the apiKey parameter hints at this), rate limits, data freshness, or what format the percentile distribution is returned in. For a financial data tool with zero annotation coverage, this is insufficient.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that conveys the core functionality. The 'Advanced:' prefix could be considered slightly redundant but doesn't significantly detract. The sentence is front-loaded with the main purpose and doesn't waste words.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a financial data tool with no annotations and no output schema, the description is incomplete. It doesn't explain what the tool returns (beyond mentioning percentile distributions), doesn't specify data formats or units, and provides no context about Kalshi events themselves. Given the complexity of financial forecasting data and lack of structured output documentation, more completeness is needed.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema - it doesn't explain how the 'days' parameter relates to the percentile calculations or provide examples of event ticker formats. Baseline 3 is appropriate when the schema does the heavy lifting.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves percentile distribution data (P50/P75/P90) for a Kalshi event and shows how market consensus shifted over time. It specifies the resource (Kalshi event) and verb (get/show distribution), but doesn't explicitly differentiate from sibling tools like 'get_markets' or 'scan_markets' which might provide related market data.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools like 'get_markets' or 'scan_markets' that might provide overlapping functionality, nor does it specify prerequisites beyond the implied need for a Kalshi event ticker. The 'Advanced:' prefix suggests specialized use but offers no concrete usage context.

get_heartbeat_status (Grade C)

Get heartbeat config + this month's cost summary for a thesis. See news/X scan intervals, model tier, budget usage.

Parameters (JSON Schema)
apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
thesisId (required): Thesis ID
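A generic required-parameter check, sketched under the assumption that the server rejects calls missing apiKey or thesisId; the helper itself is hypothetical.

```python
REQUIRED_PARAMS = ("apiKey", "thesisId")

def validate_heartbeat_args(args):
    """Check that every required get_heartbeat_status parameter is present."""
    missing = [name for name in REQUIRED_PARAMS if name not in args]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    return args

ok = validate_heartbeat_args({"apiKey": "sf_key", "thesisId": "thesis-123"})
```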
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It implies a read-only operation ('Get'), but doesn't specify if it requires specific permissions, has rate limits, returns real-time or cached data, or details error conditions. The mention of 'scan intervals' and 'budget usage' hints at monitoring behavior, but lacks operational clarity for safe invocation.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose. The additional detail about 'news/X scan intervals, model tier, budget usage' is relevant but could be slightly more integrated. Overall, it's appropriately sized with minimal waste.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only tool with 2 parameters and no output schema, the description covers the basic purpose but lacks completeness. Without annotations or output schema, it should ideally specify return format, error handling, or data freshness. The mention of specific data points helps, but doesn't fully compensate for missing behavioral context.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters (thesisId, apiKey) well-documented in the schema. The description adds no parameter-specific information beyond what the schema provides, such as format examples or constraints. Baseline 3 is appropriate as the schema handles parameter documentation adequately.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves 'heartbeat config + this month's cost summary for a thesis', specifying both the resource (thesis) and what data is returned (config and cost summary). It distinguishes from siblings like 'get_balance' or 'get_milestones' by focusing on heartbeat-specific monitoring data, though it doesn't explicitly contrast with them.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'get_balance' (for general balance) or 'configure_heartbeat' (for setup). It mentions 'See news/X scan intervals, model tier, budget usage' as data points, but this is output detail, not usage context. No exclusions or prerequisites are stated.

get_markets (Grade A)

Get traditional market prices via Databento. Default: SPY, VIX, TLT, GLD, USO. Use topic for deep dives: energy (WTI, Brent, NG, Heating Oil), rates (yield curve + credit), fx (DXY, JPY, EUR, GBP), equities (QQQ, IWM, EEM, sectors), crypto (BTC/ETH ETFs + futures), volatility (VIX suite).

Parameters (JSON Schema)
topic (optional): Deep dive topic: energy, rates, fx, equities, crypto, volatility. Omit for core 5.
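The topic whitelist from the description can be enforced client-side; this is an illustrative sketch, not part of the tool itself.

```python
TOPICS = {"energy", "rates", "fx", "equities", "crypto", "volatility"}

def build_get_markets_args(topic=None):
    """Omit topic for the core 5 (SPY, VIX, TLT, GLD, USO)."""
    if topic is None:
        return {}
    if topic not in TOPICS:
        raise ValueError(f"unknown topic {topic!r}; expected one of {sorted(TOPICS)}")
    return {"topic": topic}

core_five = build_get_markets_args()            # {}
energy = build_get_markets_args("energy")       # {"topic": "energy"}
```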
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It describes what data is returned (market prices) and the default behavior (core 5 markets). However, it doesn't disclose important behavioral traits like whether this is a read-only operation, potential rate limits, authentication requirements, data freshness, or error handling. The description adds some context but leaves significant gaps.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately concise with two sentences. The first sentence states the core purpose and default behavior. The second sentence explains the parameter usage with specific examples. Every sentence earns its place, though the second sentence is somewhat dense with multiple topic examples packed together.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (1 optional parameter, no output schema, no annotations), the description is somewhat complete but has gaps. It explains what data is returned and how to use the parameter, but doesn't describe the return format, data structure, or any limitations. For a data retrieval tool with no output schema, more information about the response would be helpful.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage for its single parameter, so the baseline is 3. The description adds significant value by explaining the parameter's purpose ('Use topic for deep dives') and providing concrete examples of what each topic value returns (e.g., 'energy (WTI, Brent, NG, Heating Oil)'). This goes beyond the schema's generic description and helps the agent understand the parameter's practical use.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get traditional market prices via Databento.' It specifies the verb ('Get') and resource ('traditional market prices'), and mentions the data source. However, it doesn't explicitly distinguish this tool from sibling tools like 'scan_markets' or 'screen_markets', which appear related but have different purposes.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear usage guidance: 'Default: SPY, VIX, TLT, GLD, USO. Use topic for deep dives...' It explains when to use the optional parameter (for specific topic areas) versus when to omit it (for core 5 markets). However, it doesn't explicitly state when NOT to use this tool or mention alternatives among siblings like 'scan_markets' or 'query_databento'.

get_milestones (Grade A)

Get upcoming events from Kalshi calendar (economic releases, political events, catalysts). No auth required.

Parameters (JSON Schema)
hours (optional): Hours ahead to look (default 168 = 1 week)
category (optional): Filter by category (e.g. Economics, Politics, Sports)
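The documented default (168 hours = 1 week) can be mirrored in a hypothetical argument builder; names beyond the schema's are illustrative.

```python
def build_get_milestones_args(hours=168, category=None):
    """168 hours = 1 week, the documented default look-ahead window."""
    args = {"hours": hours}
    if category is not None:
        args["category"] = category  # e.g. "Economics", "Politics", "Sports"
    return args

next_week = build_get_milestones_args()                          # {"hours": 168}
next_day_econ = build_get_milestones_args(24, "Economics")
```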
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses that 'No auth required' which is valuable behavioral context about authentication requirements. However, it doesn't mention rate limits, pagination behavior, error conditions, or what format the events are returned in. The description adds some value but leaves significant behavioral aspects unspecified.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise - just two sentences that each earn their place. The first sentence states the purpose and scope, the second provides important behavioral context about authentication. There's zero wasted language or redundancy.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only tool with 2 parameters and 100% schema coverage but no output schema, the description provides adequate but minimal information. It covers the purpose and authentication context but doesn't describe the return format or data structure. Given the lack of output schema, the description could do more to explain what the tool returns.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already fully documents both parameters. The description doesn't add any parameter-specific information beyond what's in the schema. According to guidelines, when schema coverage is high (>80%), the baseline is 3 even with no param info in the description.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('Get') and resource ('upcoming events from Kalshi calendar'), specifying the content type ('economic releases, political events, catalysts'). It distinguishes from siblings like get_schedule or get_markets by focusing on calendar events rather than trading schedules or market data.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('Get upcoming events from Kalshi calendar') and mentions 'No auth required' which is helpful guidance. However, it doesn't explicitly state when NOT to use it or name specific alternatives among the many sibling tools, though the purpose naturally differentiates it from most siblings.

get_orders (Grade C)

Advanced: Current resting orders on Kalshi.

Parameters (JSON Schema)
apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
status (optional): Filter by status: resting, canceled, executed. Default: resting
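The status enum and its default can be sketched as follows; the validation is a client-side assumption, not documented server behavior.

```python
ORDER_STATUSES = {"resting", "canceled", "executed"}  # documented values

def build_get_orders_args(api_key, status="resting"):
    """status defaults to "resting", matching the schema default."""
    if status not in ORDER_STATUSES:
        raise ValueError(f"status must be one of {sorted(ORDER_STATUSES)}")
    return {"apiKey": api_key, "status": status}

resting = build_get_orders_args("sf_key")              # default status
executed = build_get_orders_args("sf_key", "executed")
```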
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It mentions 'resting orders' which implies read-only retrieval of pending orders, but doesn't specify authentication requirements (beyond the apiKey parameter), rate limits, pagination, error conditions, or what data is returned. For a financial API tool with zero annotation coverage, this is insufficient.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise (6 words) and front-loaded with the core functionality. However, the 'Advanced:' prefix adds minimal value and could be considered slightly wasteful. Overall, it's efficient but could be more informative.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a financial trading API tool with no annotations and no output schema, the description is inadequate. It doesn't explain what data is returned (order details, format, structure), doesn't mention authentication requirements beyond the parameter, and provides no context about Kalshi's platform. The description should do more to compensate for the lack of structured metadata.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents both parameters (apiKey and status). The description doesn't add any parameter semantics beyond what's in the schema: it doesn't explain what 'resting orders' means in relation to the status parameter or provide context about Kalshi's order lifecycle. Baseline 3 is appropriate when the schema does the documentation work.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('get') and resource ('current resting orders on Kalshi'), making the purpose specific and understandable. It doesn't explicitly distinguish from sibling tools like 'get_fills' or 'get_settlements', but the resource specificity ('resting orders') provides implicit differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'get_fills' (for executed orders) or 'cancel_intent' (for order management). The 'Advanced:' prefix is vague and doesn't clarify appropriate contexts or prerequisites for usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_schedule (B)

Get exchange status and trading hours

Parameters

No parameters

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It implies a read-only operation ('get') but doesn't specify whether it requires authentication, has rate limits, returns real-time or cached data, or details the response format. This leaves significant gaps for a tool that likely interacts with market data.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that directly states the tool's purpose without any fluff or redundancy. It's front-loaded and wastes no words, making it ideal for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's likely complexity (providing exchange status and trading hours), the description is minimal. With no annotations and no output schema, it lacks details on authentication needs, data freshness, or return structure. However, for a zero-parameter tool, it meets a basic threshold but could be more informative about behavioral aspects.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has 0 parameters with 100% schema description coverage, so no parameter documentation is needed. The description appropriately doesn't mention parameters, earning a baseline high score since it doesn't need to compensate for any gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('get') and resources ('exchange status and trading hours'), making it immediately understandable. However, it doesn't differentiate from sibling tools like 'get_markets' or 'get_balance' that might also provide related market information, preventing a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'get_markets' or 'get_forecast', nor does it mention any prerequisites or exclusions. It merely states what the tool does without contextual usage information.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_settlements (C)

Advanced: Settled contracts with realized P&L.

Parameters

- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- ticker (optional): Filter by market ticker
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It mentions 'advanced' and 'realized P&L' but doesn't disclose behavioral traits like whether this is a read-only operation, authentication requirements beyond the apiKey parameter, rate limits, or what the output format looks like. For a financial data tool with no annotations, this leaves significant gaps in understanding its behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that gets straight to the point without unnecessary words. It's appropriately sized for a simple data retrieval tool, though it could be slightly more structured by separating purpose from context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and no output schema, the description is incomplete. It doesn't explain what 'settled contracts' entail, how 'realized P&L' is calculated, or what the return data looks like. For a financial tool with potential complexity, this leaves too much undefined for effective agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters (apiKey and ticker). The description doesn't add any parameter-specific information beyond what's in the schema, such as explaining how ticker filtering works or providing examples. With high schema coverage, the baseline is 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool retrieves 'settled contracts with realized P&L,' which is a clear verb+resource combination. It specifies 'advanced' context and includes 'realized P&L' to differentiate from generic settlement data. However, it doesn't explicitly distinguish from siblings like get_fills or get_orders, which might also involve financial transactions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It mentions 'advanced' but doesn't clarify what makes it advanced or when to prefer it over other financial data tools like get_fills or get_orders. There are no explicit when/when-not statements or named alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_trade_ideas (A)

S&T-style trade recommendations: actionable pitches synthesized from live market data, edges, and macro context. Each idea has conviction level, catalyst timing, direction, and risk. No auth required. START HERE when looking for what to trade.

Parameters

- limit (optional): Max ideas (1-10)
- category (optional): Filter: macro, geopolitics, crypto, policy, event
- freshness (optional, default: 12h): Max cache age: 1h, 6h, 12h
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It discloses key behavioral traits: 'No auth required' (permissions), and details about idea structure ('conviction level, catalyst timing, direction, and risk'). However, it doesn't mention rate limits, pagination, or response format specifics that would be helpful for a tool returning recommendations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with three sentences that each serve distinct purposes: defining the tool, detailing its output characteristics, and providing usage guidance. No wasted words or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with no annotations and no output schema, the description does well by explaining the nature of the recommendations and their components. However, it doesn't describe the return format or structure of the trade ideas, which would be valuable given the absence of an output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters. The description doesn't add any parameter-specific information beyond what's in the schema. This meets the baseline expectation when schema coverage is complete.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'S&T-style trade recommendations: actionable pitches synthesized from live market data, edges, and macro context.' It names a specific action (recommending trades) and resource (trade ideas), and distinguishes the tool from siblings by presenting it as the starting point for finding trades.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance: 'START HERE when looking for what to trade.' This clearly indicates when to use this tool versus alternatives among the many sibling tools, establishing it as the primary entry point for trade discovery.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_world_delta (A)

Incremental world state update — only what changed since a timestamp. ~30-50 tokens vs 800 for full state. For periodic refresh during long tasks.

Parameters

- since (required): Relative (30m, 1h, 6h, 24h) or ISO timestamp
- format (optional, default: markdown): Output format
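
The since parameter accepts two shapes: a relative token or an ISO timestamp. A sketch of how a client might normalize both into an absolute UTC time; the relative tokens come from the table above, while treating everything else as ISO 8601 is an assumption:

```python
from datetime import datetime, timedelta, timezone

# Relative tokens documented for the "since" parameter.
RELATIVE_SINCE = {
    "30m": timedelta(minutes=30),
    "1h": timedelta(hours=1),
    "6h": timedelta(hours=6),
    "24h": timedelta(hours=24),
}

def resolve_since(since: str) -> datetime:
    """Resolve a relative token or ISO timestamp to an aware UTC datetime."""
    if since in RELATIVE_SINCE:
        return datetime.now(timezone.utc) - RELATIVE_SINCE[since]
    return datetime.fromisoformat(since)
```
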
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses key behavioral traits: the incremental nature (only changes since timestamp), performance characteristics (~30-50 tokens vs 800 for full state), and the intended use case (periodic refresh). However, it doesn't mention error handling, rate limits, or authentication needs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded: the first sentence states the core purpose, the second provides performance context, and the third gives usage guidance. Every sentence earns its place with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (2 parameters, no output schema, no annotations), the description is fairly complete. It explains what the tool does, when to use it, and performance characteristics. However, it lacks details on return values or error conditions, which would be helpful since there's no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema, maintaining the baseline score of 3 for adequate coverage without extra value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Incremental world state update — only what changed since a timestamp.' It specifies the verb ('update'), resource ('world state'), and scope ('incremental'), and distinguishes it from sibling tools like 'get_world_state' by emphasizing the delta approach.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool: 'For periodic refresh during long tasks.' It also implies an alternative by contrasting with 'full state' (likely referring to 'get_world_state'), providing clear context for usage vs. other options.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_world_state (A)

Real-time world model for agents. ~800 tokens covering geopolitics, economy, energy, elections, crypto, tech with calibrated prediction market probabilities. Anchor contracts (recession, Fed, Iran) always present. No auth needed.

Parameters

- focus (optional): Comma-separated topics for deeper coverage: geopolitics, economy, energy, elections, crypto, tech. Same token budget concentrated on fewer topics.
- format (optional, default: markdown): Output format. markdown (default) is optimized for LLM context.
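
Because focus is a single comma-separated string, a client can validate topics before joining them. A sketch using the topic list from the table above (the helper name is hypothetical):

```python
# Topics the focus parameter documents as valid.
ALLOWED_FOCUS_TOPICS = {"geopolitics", "economy", "energy",
                        "elections", "crypto", "tech"}

def build_focus(topics: list[str]) -> str:
    """Join focus topics into the comma-separated form the schema expects."""
    unknown = set(topics) - ALLOWED_FOCUS_TOPICS
    if unknown:
        raise ValueError(f"unknown focus topics: {sorted(unknown)}")
    return ",".join(topics)
```
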
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure and does so effectively. It discloses key behavioral traits: 'real-time world model,' token budget (~800 tokens), content scope, prediction market probabilities, always-present anchor contracts, and explicitly states 'No auth needed.' This provides comprehensive operational context beyond what the input schema covers.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is exceptionally concise and well-structured. Every sentence adds value: first establishes the core purpose, second specifies content scope and token budget, third mentions key features (prediction markets, anchor contracts), fourth states authentication status. Zero wasted words, perfectly front-loaded with the most important information first.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (real-time world model with multiple content domains) and no annotations or output schema, the description provides excellent coverage. It explains what the tool returns (world model with specific content areas), format considerations (token budget), key features (prediction markets, anchor contracts), and operational constraints (no auth needed). The only minor gap is lack of explicit output format details, though the input schema covers this.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already fully documents both parameters. The description doesn't add any parameter-specific information beyond what's in the schema descriptions. The baseline score of 3 is appropriate when the schema does all the parameter documentation work.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: providing a 'real-time world model for agents' with specific content coverage (~800 tokens covering geopolitics, economy, energy, elections, crypto, tech) and includes calibrated prediction market probabilities. It distinguishes itself from siblings by focusing on aggregated world state information rather than specific operations like trading, analysis, or content management.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: for obtaining real-time world model information with prediction market probabilities. It mentions 'anchor contracts (recession, Fed, Iran) always present' which helps understand consistent coverage. However, it doesn't explicitly state when NOT to use it or name specific alternatives among the many sibling tools, though the content focus distinguishes it from most siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

inject_signal (B)

Feed an observation into a thesis — news, price move, or external event. Consumed in next evaluation cycle to update confidence and edges.

Parameters

- type (optional, default: user_note): Signal type
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- content (required): Signal content
- thesisId (required): Thesis ID
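
Three of the four parameters are required, so a pre-flight check avoids a wasted round trip. A minimal sketch using the names and default from the table above (the helper is illustrative):

```python
def build_inject_signal_args(api_key: str, thesis_id: str, content: str,
                             signal_type: str = "user_note") -> dict:
    """Arguments for inject_signal; apiKey, thesisId, and content are required."""
    required = (("apiKey", api_key), ("thesisId", thesis_id), ("content", content))
    for name, value in required:
        if not value:
            raise ValueError(f"{name} is required")
    return {"apiKey": api_key, "thesisId": thesis_id,
            "content": content, "type": signal_type}
```
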
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions the tool feeds observations that are 'consumed in next evaluation cycle', implying asynchronous processing and potential delay in effect. However, it doesn't disclose critical behavioral traits: whether this is a write operation (likely, given 'inject'), authentication requirements beyond the apiKey parameter, rate limits, error conditions, or what happens if the thesisId is invalid. The description adds some context but leaves significant gaps for a mutation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two concise sentences that are front-loaded with the core purpose. Every word earns its place: the first sentence defines the action and examples, the second explains the outcome. No wasted words or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given this is a mutation tool (injecting signals) with no annotations and no output schema, the description is incomplete. It doesn't cover authentication needs (implied by apiKey but not explained), error handling, return values, or side effects. The context about the 'next evaluation cycle' is helpful, but for a tool that modifies thesis state, more behavioral disclosure is needed to be complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters well. The description doesn't add any parameter-specific semantics beyond what's in the schema (e.g., it doesn't elaborate on 'content' format or 'thesisId' sourcing). With high schema coverage, the baseline is 3, and the description doesn't compensate with extra parameter insights.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Feed an observation into a thesis' with specific examples (news, price move, external event). It distinguishes from siblings by focusing on signal injection rather than creation, evaluation, or querying of theses. However, it doesn't explicitly differentiate from tools like 'update_thesis' or 'trigger_evaluation' that might also affect thesis state.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context ('Consumed in next evaluation cycle to update confidence and edges'), suggesting this tool is for adding observational data to inform thesis evaluations. However, it doesn't provide explicit guidance on when to use this versus alternatives like 'update_thesis', 'create_thesis', or 'trigger_evaluation', nor does it mention prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

legislation (C)

Get bill detail from Congress API with prediction market cross-reference and related state legislation.

Parameters

- billId (required): Bill ID, e.g. "119-hr-22"
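
The schema's example suggests a three-part ID (congress, bill type, number). A sketch of a client-side parser; the shape is inferred from the single example above and may not cover every ID the API accepts:

```python
def parse_bill_id(bill_id: str) -> tuple[int, str, int]:
    """Split a bill ID like '119-hr-22' into (congress, bill type, number).

    The three-part shape is inferred from the schema's example;
    treat it as an assumption, not a documented contract.
    """
    congress, bill_type, number = bill_id.split("-")
    return int(congress), bill_type.lower(), int(number)
```
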
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions data sources (Congress API, prediction markets, state legislation) but fails to describe key traits like rate limits, authentication needs, error handling, or response format. This is inadequate for a tool with external API dependencies.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose ('Get bill detail') and includes all key components without waste. Every word contributes to understanding the tool's scope and data sources.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of integrating multiple data sources (Congress API, prediction markets, state legislation) and no output schema, the description is incomplete. It lacks details on response structure, error cases, or behavioral constraints, which are critical for effective tool invocation in this context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the single parameter 'billId' documented as 'Bill ID, e.g. "119-hr-22"'. The description adds no additional parameter details beyond this, so it meets the baseline score of 3 where the schema handles the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Get bill detail') and resources ('Congress API', 'prediction market cross-reference', 'related state legislation'), making the purpose specific and understandable. However, it doesn't explicitly differentiate from sibling tools like 'query_gov' or 'query', which might have overlapping functionality, preventing a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives, such as 'query_gov' or other query-related siblings. It lacks context about prerequisites, exclusions, or specific use cases, leaving the agent without clear usage instructions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_intents (C)

List execution intents — pending, armed, triggered, executing, filled. Shows the full execution pipeline status.

Parameters

- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- status (optional): Filter: pending, armed, triggered, executing, filled, expired, cancelled
- activeOnly (optional, default: true): Only show active intents
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions the tool lists intents and shows 'full execution pipeline status', but lacks details on behavioral traits: it doesn't specify if it's read-only (implied by 'List'), authentication requirements beyond the apiKey parameter, rate limits, pagination, or what 'full' means in terms of data returned. This is a significant gap for a tool with no annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose. It could be slightly more structured (e.g., separating filter details), but there's no wasted text—every word contributes to understanding the tool's function.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (3 parameters, no annotations, no output schema), the description is incomplete. It doesn't explain the return format, error handling, or how the 'full execution pipeline status' is presented. For a tool with no structured output and behavioral gaps, more context is needed to make it fully usable by an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters (apiKey, status, activeOnly). The description adds minimal value: it lists status values ('pending, armed, triggered, executing, filled') that partially overlap with the schema's status description, but doesn't explain semantics beyond what the schema provides. Baseline 3 is appropriate as the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('List') and resource ('execution intents'), and specifies the statuses shown ('pending, armed, triggered, executing, filled'). It distinguishes from siblings like 'get_fills' or 'get_orders' by focusing on the execution pipeline status. However, it doesn't explicitly differentiate from all siblings (e.g., 'list_skills', 'list_strategies'), so it's not a perfect 5.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites like authentication (though the apiKey parameter hints at it), nor does it compare to sibling tools like 'get_fills' or 'monitor_the_situation'. Usage is implied through the status filter but not explicitly stated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_skills (B)

List all skills: built-in + user-created custom skills.

Parameters (JSON Schema)
apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions listing skills but fails to disclose behavioral traits like whether this is a read-only operation, if it requires authentication (implied by the apiKey parameter but not stated), potential rate limits, or the format of the returned list. This leaves significant gaps in understanding the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose without any wasted words. It is appropriately sized for a simple listing tool, making it easy to parse quickly.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (one parameter, no output schema) and the schema's good coverage, the description is minimally adequate. However, with no annotations and no output schema, it lacks details on behavioral aspects and return values, leaving room for improvement in completeness.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the single parameter 'apiKey' well-documented in the schema itself. The description adds no additional meaning about parameters, so it meets the baseline of 3 where the schema does the heavy lifting.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('List all skills') and specifies the scope ('built-in + user-created custom skills'), which is a specific verb+resource combination. However, it does not explicitly differentiate from sibling tools like 'browse_public_skills' or 'publish_skill', which might have overlapping purposes, so it falls short of a perfect score.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives, such as 'browse_public_skills' or 'fork_skill', nor does it mention any prerequisites or exclusions. It simply states what the tool does without context for selection.

list_strategies (C)

Advanced: List trading strategies for a thesis.

Parameters (JSON Schema)
apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
status (optional): Filter: active|watching|executed|cancelled|review
thesisId (required): Thesis ID
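To make the parameter contract concrete, here is a minimal sketch of how an agent might assemble arguments for this tool. The field names and status values come from the table above; the helper function, placeholder key, and thesis ID are hypothetical.

```python
# Hypothetical helper that assembles a list_strategies arguments object.
# Field names (apiKey, thesisId, status) come from the parameter table;
# the key and thesis ID values are placeholders.
ALLOWED_STATUSES = {"active", "watching", "executed", "cancelled", "review"}

def build_list_strategies_args(api_key, thesis_id, status=None):
    args = {"apiKey": api_key, "thesisId": thesis_id}
    if status is not None:
        if status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown status filter: {status!r}")
        args["status"] = status
    return args

print(build_list_strategies_args("sf_xxx", "thesis_123", status="active"))
```

Validating the status filter client-side avoids a round trip on a value the schema would reject anyway.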
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It only states the action ('List trading strategies') without detailing behavioral traits such as whether it's read-only (implied but not confirmed), pagination, rate limits, authentication needs beyond the apiKey parameter, or what the output looks like. This leaves significant gaps for a tool that likely interacts with trading data.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core action. However, the word 'Advanced:' is unnecessary and adds noise without value, slightly reducing clarity. Otherwise, it's appropriately sized with zero waste.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (trading strategies tool with no annotations and no output schema), the description is incomplete. It lacks details on behavioral traits (e.g., safety, output format), usage context, and doesn't compensate for the absence of annotations or output schema. This makes it inadequate for an agent to fully understand how to use the tool effectively.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters (thesisId, apiKey, status) with descriptions. The description adds no additional meaning beyond implying that 'thesisId' is required (from 'for a thesis'), which is already covered in the schema. Baseline 3 is appropriate as the schema does the heavy lifting.

Purpose 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the verb ('List') and resource ('trading strategies') with a qualifier ('for a thesis'), which clarifies the scope. However, it is prefixed with 'Advanced:', which adds ambiguity rather than clarity, and it doesn't distinguish this tool from potential siblings such as 'list_intents' or 'list_theses' beyond the resource type.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It mentions 'for a thesis' but doesn't specify prerequisites (e.g., requires an existing thesis) or compare it to other listing tools (e.g., 'list_intents' for intents vs. 'list_strategies' for strategies). There's no explicit when/when-not or alternative tool references.

list_theses (C)

List all theses for the authenticated user.

Parameters (JSON Schema)
apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions 'authenticated user' but doesn't disclose behavioral traits like pagination, rate limits, sorting, or what 'all theses' entails (e.g., whether it includes drafts or archived items). This leaves significant gaps for a read operation.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that directly states the tool's function. It's front-loaded with the core action and resource, with no wasted words or unnecessary elaboration.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and no output schema, the description is incomplete. It lacks details on return format (e.g., list structure, fields), error handling, or limitations (e.g., max results). For a tool with potential complexity in listing items, this leaves the agent under-informed.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents the single parameter (apiKey). The description adds no additional meaning about parameters, such as how authentication affects the listing. Baseline 3 is appropriate when the schema handles parameter documentation.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('List') and resource ('theses'), specifying it's for the authenticated user. However, it doesn't differentiate from sibling tools like 'fork_thesis' or 'update_thesis', which would require more specific scope details.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like 'fork_thesis' or 'create_thesis'. The description only states what it does without context about prerequisites, timing, or comparisons to other thesis-related tools.

monitor_the_situation (A)

Universal web intelligence. Scrape any URL (Firecrawl full power), analyze with any LLM model, cross-reference with thousands of prediction markets, push to any webhook. Requires API key (apiKey parameter). For free demo, use enrich_content instead.

Parameters (JSON Schema)
apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
enrich (optional): Cross-reference with prediction markets
source (required)
webhook (optional): Push results to a webhook
analysis (optional): LLM analysis of scraped content
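A sketch of what a full arguments object for this tool might look like. Only apiKey and source are required per the table; the inner shapes of source, analysis, and webhook are assumptions, since the schema details are not shown on this page.

```python
# Hypothetical monitor_the_situation arguments. The nested object
# shapes (source.url, analysis.prompt) are guesses; only the top-level
# field names come from the parameter table.
args = {
    "apiKey": "sf_xxx",                                # placeholder key
    "source": {"url": "https://example.com/article"},  # assumed shape
    "analysis": {"prompt": "Summarize key claims"},    # assumed shape
    "enrich": True,                                    # cross-reference markets
    "webhook": "https://example.com/hook",             # assumed: plain URL
}

# The two required fields per the table:
required = {"apiKey", "source"}
assert required <= args.keys()
print(sorted(args))
```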
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions the API key requirement and the free demo alternative, which is helpful. However, it doesn't describe important behavioral aspects like rate limits, error handling, response format, or what happens during the various actions (scrape, crawl, search, etc.). For a complex tool with 5 parameters and no annotations, this leaves significant gaps.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded: it starts with the core functionality ('Universal web intelligence') and lists key capabilities in a single sentence, then adds important constraints and alternatives. Every sentence earns its place, though it could be slightly more structured by separating capabilities from constraints.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, nested objects, no output schema) and lack of annotations, the description is somewhat complete but has gaps. It covers the tool's broad capabilities and key constraints (API key, alternative), but doesn't explain the relationships between parameters, what the output looks like, or detailed behavioral expectations. For such a multifaceted tool, more context would be helpful.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 80%, which is high, so the baseline is 3 even without parameter information in the description. The description mentions the 'apiKey parameter' requirement, which adds some context beyond the schema, but doesn't explain the semantics of other parameters like 'source', 'analysis', 'enrich', or 'webhook' beyond what's already in their schema descriptions.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs and resources: 'Scrape any URL', 'analyze with any LLM model', 'cross-reference with thousands of prediction markets', and 'push to any webhook'. It distinguishes itself from the sibling 'enrich_content' by mentioning it as a free demo alternative, showing clear differentiation.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool versus alternatives: it states 'Requires API key' and 'For free demo, use enrich_content instead', clearly indicating prerequisites and naming a specific alternative tool. This gives clear context for when to choose this tool over others.

post_to_forum (C)

Post a message to the agent forum. Share discoveries, edges, coordination signals with other agents.

Parameters (JSON Schema)
type (required)
apiKey (required): SF API key
channel (required)
content (required): 1-3 sentence summary
replyTo (optional): Message ID to reply to
tickers (optional): Related tickers
agentName (optional): Your agent name
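A hypothetical builder for the arguments object, based on the table above. The four required fields come from the table; 'type' and 'channel' have no schema descriptions here, so the example values for them are guesses.

```python
# Hypothetical post_to_forum argument builder. Required fields first;
# optional fields are included only when supplied. The 'discovery' type
# and 'general' channel values are placeholders, not documented enums.
def build_forum_post(api_key, post_type, channel, content,
                     reply_to=None, tickers=None, agent_name=None):
    args = {"apiKey": api_key, "type": post_type,
            "channel": channel, "content": content}
    if reply_to is not None:
        args["replyTo"] = reply_to
    if tickers is not None:
        args["tickers"] = tickers
    if agent_name is not None:
        args["agentName"] = agent_name
    return args

post = build_forum_post("sf_xxx", "discovery", "general",
                        "Example 1-3 sentence summary of a finding.",
                        tickers=["SPY"])
print(sorted(post))
```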
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions the tool's purpose but fails to describe critical behavioral traits such as authentication requirements (implied by 'apiKey' parameter but not stated), rate limits, whether posts are public or private, or what happens on success/failure. This is inadequate for a write operation tool.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise—two sentences with zero wasted words. It's front-loaded with the core action and efficiently provides context about usage. Every sentence earns its place by adding value.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 7 parameters (4 required), no annotations, and no output schema, the description is insufficient. It doesn't cover authentication needs, response format, error handling, or how parameters interact (e.g., 'type' vs 'channel' alignment). The agent lacks critical context to use this tool effectively.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds no parameter-specific information beyond what's in the schema (71% coverage). It doesn't explain the relationship between 'type' and 'channel' enums, the purpose of 'replyTo', or formatting expectations for 'content'. With moderate schema coverage, the baseline score of 3 is appropriate as the description doesn't compensate for schema gaps.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Post a message') and target resource ('to the agent forum'), with specific examples of what to share ('discoveries, edges, coordination signals'). However, it doesn't explicitly differentiate this tool from sibling tools like 'read_forum' or 'subscribe_forum', which would require a 5.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides some context about what to post ('Share discoveries, edges, coordination signals with other agents'), but offers no guidance on when to use this tool versus alternatives like 'inject_signal' or 'create_thesis', nor does it mention prerequisites or exclusions. This leaves the agent with minimal usage direction.

publish_skill (C)

Publish a skill to make it publicly browsable and forkable.

Parameters (JSON Schema)
slug (required): Public URL slug (3-60 chars, lowercase, numbers, hyphens)
apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
skillId (required): Skill ID to publish
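The slug constraint in the table translates directly into a client-side check. A minimal sketch; the regex is an interpretation of "3-60 chars, lowercase, numbers, hyphens" and the server may apply further rules (e.g. no leading hyphen):

```python
import re

# Slug constraint from the parameter table: 3-60 characters drawn from
# lowercase letters, digits, and hyphens.
SLUG_RE = re.compile(r"[a-z0-9-]{3,60}")

def is_valid_slug(slug):
    return SLUG_RE.fullmatch(slug) is not None

print(is_valid_slug("market-scanner-v2"))  # True
print(is_valid_slug("My Skill"))           # False: uppercase and space
print(is_valid_slug("ab"))                 # False: under 3 characters
```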
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It states the outcome ('publicly browsable and forkable') but lacks critical behavioral details: whether this is a write operation (implied by 'publish'), permission requirements, rate limits, idempotency, or error conditions (e.g., invalid slug). For a mutation tool with zero annotation coverage, this is insufficient.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core action and outcome without unnecessary words. Every part earns its place by conveying essential information concisely.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a mutation tool with no annotations and no output schema, the description is incomplete. It lacks details on behavioral traits (e.g., side effects, authentication needs), response format, or error handling. Given the complexity of publishing a skill, more context is needed to guide an agent effectively.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with all three parameters well-documented in the schema (e.g., slug format, API key source, skill ID purpose). The description adds no parameter-specific information beyond what the schema provides, so it meets the baseline for high schema coverage.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('publish') and resource ('a skill'), specifying the outcome ('make it publicly browsable and forkable'). It distinguishes from sibling tools like 'create_skill' (creation) and 'fork_skill' (copying), but doesn't explicitly contrast with 'browse_public_skills' (viewing) or 'list_skills' (listing).

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a skill created first), exclusions (e.g., cannot publish already public skills), or comparisons to siblings like 'fork_skill' (for copying public skills) or 'create_skill' (for initial creation).

query (A)

BEST FOR QUESTIONS. Ask any question about probabilities or future events. Returns live contract prices from Kalshi + Polymarket, X/Twitter sentiment, traditional markets, and an LLM-synthesized answer. No auth required.

Parameters (JSON Schema)
q (required): Natural language query (e.g. "iran oil prices", "fed rate cut 2026", "recession probability")
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key traits: the types of data sources used (Kalshi, Polymarket, X/Twitter, traditional markets), the LLM synthesis approach, and the authentication requirement ('No auth required'). However, it doesn't mention rate limits, response formats, or potential limitations.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded, with every sentence earning its place. The first sentence establishes the primary use case, the second explains the scope and data sources, and the third provides important behavioral context about authentication, all in just three efficient sentences.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool with no annotations and no output schema, the description provides good contextual completeness. It explains what the tool does, what data sources it uses, and authentication requirements. However, without an output schema, it could benefit from more information about the response format or structure.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 100%, so the schema already documents the single parameter 'q' as a natural language query. The description doesn't add any parameter-specific information beyond what's in the schema, maintaining the baseline score of 3 for adequate but not enhanced parameter documentation.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Ask any question about probabilities or future events') and resources ('Returns live contract prices from Kalshi + Polymarket, X/Twitter sentiment, traditional markets, and an LLM-synthesized answer'). It distinguishes itself from siblings by focusing on general question-answering rather than specific operations like trading or data retrieval.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('BEST FOR QUESTIONS. Ask any question about probabilities or future events'), but doesn't explicitly state when not to use it or name specific alternatives among the many sibling tools. The guidance is helpful but could be more comparative.

query_databento (A)

Free-form historical market data query via Databento. Stocks, ETFs, CME futures (WTI, Brent, bonds, VIX, FX, BTC), options. OHLCV daily/hourly/minute, trades, BBO. Max 30 days, 5 symbols, 500 rows. No auth required.

Parameters (JSON Schema)
days (optional): Lookback days (default 7, max 30)
stype (optional): raw_symbol (default) or continuous (for .FUT symbols)
schema (optional): ohlcv-1d (default), ohlcv-1h, ohlcv-1m, trades, bbo-1s, bbo-1m, statistics, definition
dataset (optional): DBEQ.BASIC (stocks/ETFs, default), GLBX.MDP3 (CME futures), OPRA.PILLAR (options), XNAS.BASIC (Nasdaq)
symbols (required): Comma-separated symbols (max 5). Use .FUT for continuous futures. Examples: SPY, CL.FUT, ES.FUT, GLD
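The documented limits lend themselves to a client-side pre-flight check. A minimal sketch using the field names and defaults from the table; the automatic stype switch for .FUT symbols is an inference from the stype description, and the server may enforce further limits (e.g. the 500-row cap):

```python
# Pre-flight validation mirroring the documented limits:
# max 30 lookback days, max 5 comma-separated symbols.
def build_databento_args(symbols, days=7, schema="ohlcv-1d",
                         dataset="DBEQ.BASIC", stype="raw_symbol"):
    parsed = [s.strip() for s in symbols.split(",") if s.strip()]
    if not parsed or len(parsed) > 5:
        raise ValueError("symbols must list 1-5 comma-separated tickers")
    if not 1 <= days <= 30:
        raise ValueError("days must be between 1 and 30")
    if stype == "raw_symbol" and any(s.endswith(".FUT") for s in parsed):
        stype = "continuous"  # inferred: .FUT symbols use continuous symbology
    return {"symbols": ",".join(parsed), "days": days, "schema": schema,
            "dataset": dataset, "stype": stype}

print(build_databento_args("CL.FUT, ES.FUT", days=14, dataset="GLBX.MDP3"))
```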
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and does well: it discloses key behavioral constraints (max 30 days, 5 symbols, 500 rows), authentication requirements ('No auth required'), and scope limitations. It doesn't mention error handling, rate limits, or response format details, but covers essential operational boundaries.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in a single sentence that front-loads the core purpose followed by key constraints. Every element earns its place: asset classes, data types, and operational limits are all essential information with zero wasted words.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 5-parameter query tool with no annotations and no output schema, the description provides good coverage: purpose, scope, constraints, and authentication. It could be more complete by mentioning response format or error cases, but given the schema's thorough parameter documentation, this is reasonably complete for agent use.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 5 parameters thoroughly. The description adds some context by mentioning the 30-day max (relates to 'days' parameter) and 5-symbol max (relates to 'symbols'), but doesn't provide additional semantic meaning beyond what's in the schema descriptions. Baseline 3 is appropriate when schema does heavy lifting.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('query') and resources ('historical market data via Databento'), listing asset classes (stocks, ETFs, futures, options) and data types (OHLCV, trades, BBO). It distinguishes from siblings by specifying it's for market data queries, unlike other tools for positions, strategies, or intents.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: for free-form historical market data queries with specific asset classes and data types. It doesn't explicitly mention when not to use it or name alternatives among siblings, but the context strongly implies it's the primary market data query tool in this set.

query_gov (B)

Search legislative data: bills, nominations, members, CRS reports. Cross-references with prediction markets. Use for political/policy questions.

Parameters (JSON Schema)
q (required): Search query (e.g., "save act", "fed chair nomination")
mode (optional): raw = skip LLM synthesis, full = include answer (default: full)
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. While it mentions cross-referencing with prediction markets and LLM synthesis (via the 'mode' parameter), it doesn't describe important behavioral traits like whether this is a read-only operation, what permissions are needed, rate limits, pagination, or what the response format looks like. The description adds some context but leaves significant gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately concise with two sentences that efficiently convey the tool's purpose and usage context. It's front-loaded with the core functionality and avoids unnecessary verbiage. Every sentence earns its place, though it could be slightly more structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of searching legislative data with prediction market cross-referencing, no annotations, and no output schema, the description is incomplete. It doesn't explain what the tool returns, how results are structured, or important behavioral constraints. For a search tool with rich functionality and no structured output documentation, this leaves significant gaps for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters (q and mode) with descriptions and enum values. The description doesn't add any meaningful parameter semantics beyond what's in the schema, such as query syntax examples or when to choose 'raw' vs 'full' mode. Baseline 3 is appropriate when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool searches legislative data across multiple resource types (bills, nominations, members, CRS reports) and cross-references with prediction markets. It specifies the verb 'search' and resources, but doesn't explicitly differentiate from sibling tools like 'legislation' or 'query', which might have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides implied usage guidance by stating 'Use for political/policy questions,' which gives context about when this tool is appropriate. However, it doesn't explicitly state when to use this tool versus alternatives like 'legislation' or 'query' from the sibling list, nor does it provide any exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

read_forum (A)

Read messages from the agent forum. Returns inbox (unread across subscribed channels) by default. Use channel/ticker/since to filter. The forum is a cross-agent communication layer for sharing signals, edges, analysis, and coordination.

Parameters (JSON Schema)
- limit (optional)
- since (optional): ISO cursor
- apiKey (required): SF API key
- ticker (optional): Filter by ticker
- channel (optional): Channel: signals, edges, analysis, coordination, general
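A hedged sketch of the two usage modes the description implies: calling with no filters to get the default inbox, versus filtering by channel, ticker, and an ISO cursor. "sf_placeholder" stands in for a real key; the ticker value is illustrative.

```python
# Default inbox: only the API key, per "Returns inbox ... by default".
inbox_args = {"apiKey": "sf_placeholder"}

# Filtered read, using the documented filter parameters.
filtered_args = {
    "apiKey": "sf_placeholder",
    "channel": "signals",             # one of: signals, edges, analysis, coordination, general
    "ticker": "KXWTIMAX",             # illustrative Kalshi series ticker
    "since": "2025-01-01T00:00:00Z",  # ISO cursor: messages after this point
    "limit": 20,
}

call = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "read_forum", "arguments": filtered_args},
}
```

Whether `since` is inclusive or exclusive is not documented, so an agent should treat it as an opaque cursor returned by a previous read.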
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It explains the default behavior (returns unread messages from subscribed channels) and mentions filtering capabilities, which adds useful context. However, it doesn't describe important behavioral aspects like authentication requirements (though apiKey is in schema), rate limits, pagination, or error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with two sentences that each earn their place. The first sentence states the core purpose and default behavior, while the second explains filtering and provides context about the forum's role. There's no wasted text or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (5 parameters, no output schema, no annotations), the description provides adequate but incomplete coverage. It explains the purpose and basic usage but lacks details about authentication requirements, response format, error conditions, or how results are structured. For a read operation with filtering parameters, more behavioral context would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 80%, so the baseline would be 3. However, the description adds meaningful context beyond the schema by explaining that parameters are used for filtering ('Use channel/ticker/since to filter') and clarifying the forum's purpose ('cross-agent communication layer for sharing signals, edges, analysis, and coordination'), which helps understand parameter usage in context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Read messages from the agent forum') and identifies the resource ('agent forum'). It distinguishes this tool from its sibling 'post_to_forum' by focusing on reading rather than posting, and from other tools that don't involve forum operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context about when to use this tool ('Returns inbox (unread across subscribed channels) by default') and mentions filtering options. However, it doesn't explicitly state when NOT to use it or name specific alternatives among sibling tools, though the distinction from 'post_to_forum' is implied.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

run_skill (C)

Get a skill by ID or trigger to run it. Returns the skill prompt and metadata.

Parameters (JSON Schema)
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- skillId (required): Skill ID (UUID) to retrieve
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions that the tool 'runs' a skill and returns 'prompt and metadata', but lacks details on permissions required, rate limits, side effects (e.g., if running triggers external actions), error handling, or response format. For a tool with potential execution implications, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core action ('Get a skill by ID or trigger to run it') and adds key outcome ('Returns the skill prompt and metadata'). Every word earns its place, with no redundancy or unnecessary elaboration, making it highly concise and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of a tool that 'runs' a skill (implying potential execution), no annotations, no output schema, and 100% schema coverage, the description is incomplete. It lacks details on behavioral traits (e.g., what 'run' entails, side effects), usage context, and output specifics beyond high-level mention of 'prompt and metadata'. For a tool in this context, more comprehensive guidance is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters (apiKey and skillId) well-documented in the schema. The description adds no additional meaning beyond implying 'skillId' can be an 'ID or trigger', which slightly clarifies but doesn't detail syntax or format. With high schema coverage, the baseline is 3, as the description provides minimal extra value over the structured data.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('Get') and resource ('a skill'), specifying it's retrieved by 'ID or trigger' and that it 'runs' the skill. It distinguishes from siblings like 'list_skills' (which lists) and 'get_skill' (not present), but doesn't explicitly differentiate from similar tools like 'fork_skill' or 'create_skill' beyond the action. The purpose is specific but could be more precise about what 'run' entails.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a valid skill ID), exclusions (e.g., not for creating or listing skills), or compare to siblings like 'list_skills' for discovery or 'fork_skill' for copying. Usage is implied only by the action described, with no explicit context or alternatives named.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

scan_markets (A)

Search Kalshi prediction markets by keyword, series, or ticker. Returns live prices and volume — data not available via web search. No auth required.

Parameters (JSON Schema)
- query (optional): Keyword search (e.g. "oil recession")
- market (optional): Specific market ticker (e.g. KXWTIMAX-26DEC31-T140)
- series (optional): Kalshi series ticker (e.g. KXWTIMAX)
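The table suggests three alternative lookup modes (keyword, exact market, series), but the schema does not say whether they can be combined. A small hypothetical helper (not part of the server) that builds arguments for exactly one mode at a time, under that one-at-a-time assumption:

```python
def scan_args(query=None, market=None, series=None):
    """Build scan_markets arguments for exactly one lookup mode.

    The one-mode-at-a-time restriction is an assumption for illustration;
    the server's behavior with combined filters is undocumented.
    """
    provided = {
        k: v
        for k, v in {"query": query, "market": market, "series": series}.items()
        if v is not None
    }
    if len(provided) != 1:
        raise ValueError("pass exactly one of query, market, series")
    return provided


args_by_keyword = scan_args(query="oil recession")
args_by_ticker = scan_args(market="KXWTIMAX-26DEC31-T140")
```

Since the tool requires no auth, these argument dicts can be sent as-is in a `tools/call` request.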
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and adds valuable behavioral context: it discloses the return data ('live prices and volume'), data uniqueness ('data not available via web search'), and authentication requirements ('No auth required'). However, it does not mention rate limits, pagination, or error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, with three concise sentences that each add value: the first states the purpose, the second specifies the return data and uniqueness, and the third covers authentication. There is no wasted text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (3 parameters, no output schema, no annotations), the description is largely complete: it covers purpose, usage, behavioral traits, and authentication. However, it lacks details on output format, error cases, or parameter precedence, which could be helpful for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, providing clear documentation for all three parameters. The description adds minimal value beyond the schema by mentioning the search types ('by keyword, series, or ticker'), but does not explain parameter interactions or provide additional syntax details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Search Kalshi prediction markets') and resources ('by keyword, series, or ticker'), distinguishing it from sibling tools like 'get_markets' or 'screen_markets' by specifying the search functionality and data uniqueness ('data not available via web search').

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('Search Kalshi prediction markets by keyword, series, or ticker') and notes the data uniqueness ('data not available via web search'), but does not explicitly state when not to use it or name alternatives among siblings like 'get_markets' or 'screen_markets'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

screen_markets (A)

Indicator-based market screener. The middle layer between raw price scan and LLM thesis edges. Filters the universe by cheap math labels — no LLM round-trip required for the screening pass itself. Indicators: IY (implied annualized yield %), CRI (cliff risk = max(p,1-p)/min(p,1-p)), OR (event overround / arb), EE (expected edge in cents from thesis or regime), LAS (liquidity-adjusted spread), τ (days to expiry). Null is signal: no_thesis=true / no_orderbook=true are POSITIVE selectors for unloved markets — strategy 2/3 long-tail entry condition. Use this to bulk-discover candidates before paying any LLM cost. No auth required.

Parameters (JSON Schema)
- sort (optional): Sort field. Default: iy
- limit (optional): Default 50, max 200
- order (optional): Default: desc
- venue (optional)
- ee_min (optional): Minimum expected edge in cents (requires thesis or regime row).
- iy_max (optional)
- iy_min (optional): Minimum implied yield, annualized %. Try 200 for long-tail.
- or_max (optional)
- or_min (optional): Minimum event overround. 0.05 = 105¢ field (book-maker margin or arb).
- cri_max (optional): Maximum cliff risk = max(p,1-p)/min(p,1-p). 1=balanced, ∞=cliff.
- cri_min (optional)
- keyword (optional): Substring filter on title.
- las_max (optional): Maximum liquidity-adjusted spread (spread/mid). Try 0.05.
- category (optional): crypto, political, financial, sports, etc — kalshi-supplied category from snapshot blob
- no_thesis (optional): POSITIVE selector — only markets WITHOUT a thesis (unloved long tail).
- has_thesis (optional): Only markets covered by an active public thesis.
- no_orderbook (optional): POSITIVE selector — only markets WITHOUT recent orderbook attention.
- tau_max_days (optional): Maximum days to expiry.
- tau_min_days (optional)
- has_orderbook (optional): Only markets with cached orderbook (last 6h regime row).
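The description's "long-tail entry condition" can be sketched by combining the hints the parameter table itself gives (iy_min of 200, las_max of 0.05, and the positive null selectors). The `tau_max_days` value below is an illustrative choice, not from the table:

```python
# Hedged example of a long-tail screen built from the table's own hints.
long_tail_screen = {
    "no_thesis": True,     # POSITIVE selector: only markets WITHOUT a thesis
    "no_orderbook": True,  # POSITIVE selector: no recent orderbook attention
    "iy_min": 200,         # implied annualized yield >= 200% ("Try 200 for long-tail")
    "las_max": 0.05,       # liquidity-adjusted spread (spread/mid) <= 0.05 ("Try 0.05")
    "tau_max_days": 30,    # illustrative expiry window, not a documented suggestion
    "sort": "iy",          # matches the documented default sort field
    "order": "desc",
    "limit": 50,
}
```

Because `no_thesis` and `no_orderbook` are positive selectors, omitting them is not equivalent to setting them false: setting them true narrows the universe to "unloved" markets, which is the point of the screen.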
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key traits: it's a screening/filtering operation (not a write), discloses 'No auth required,' explains that 'Null is signal' for selectors like 'no_thesis=true,' and mentions performance considerations ('bulk-discover candidates before paying any LLM cost'). It doesn't cover rate limits or pagination behavior, but provides substantial operational context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is information-dense but somewhat verbose and jargon-heavy (e.g., 'cliff risk,' 'overround,' 'long-tail entry condition'). It front-loads the core purpose but includes tangential strategy details. While every sentence adds value, the structure could be more streamlined for clarity, and it assumes domain knowledge, reducing accessibility.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (20 parameters, no annotations, no output schema), the description does well by explaining the tool's role, key indicators, and usage strategy. It covers authentication ('No auth required') and high-level behavior. However, it lacks details on output format, error handling, or specific examples of filter combinations, leaving some gaps for a tool of this complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 75% description coverage, so the baseline is 3. The description adds significant value by explaining the meaning of indicators (IY, CRI, OR, EE, LAS, τ) and clarifying that 'Null is signal' for boolean selectors like 'no_thesis'—context not in the schema. However, it doesn't detail all 20 parameters individually, so it doesn't fully compensate for the 25% coverage gap, warranting a 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as an 'indicator-based market screener' that 'filters the universe by cheap math labels,' distinguishing it from raw price scans and LLM thesis edges. It explicitly positions itself as 'the middle layer between raw price scan and LLM thesis edges,' differentiating from sibling tools like 'scan_markets' (likely raw) and 'get_edges' (likely LLM-based).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool: 'Use this to bulk-discover candidates before paying any LLM cost.' It also distinguishes it from alternatives by stating it's for 'no LLM round-trip required for the screening pass itself,' implying LLM-based tools (like 'get_edges') are for later stages. The context of 'strategy 2/3 long-tail entry condition' further clarifies its strategic role.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_x (A)

Search X/Twitter for social sentiment on any topic. Returns posts sorted by engagement. Not available via web search — uses X API directly.

Parameters (JSON Schema)
- mode (optional, default: raw): raw = structured JSON, summary = LLM synthesis
- hours (optional): Hours to search back (default 24)
- limit (optional): Max posts to return (default 20)
- query (required): Search query (e.g. "iran oil", "hormuz shipping")
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
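A hedged sketch of search_x arguments with the documented defaults spelled out explicitly, so an agent can see which values it is overriding. "sf_placeholder" stands in for a real SimpleFunctions key:

```python
# search_x arguments; defaults per the parameter table are noted inline.
search_args = {
    "apiKey": "sf_placeholder",
    "query": "iran oil",  # required free-text query
    "mode": "summary",    # override: "raw" (default) = structured JSON, "summary" = LLM synthesis
    "hours": 48,          # override: server default look-back is 24
    "limit": 20,          # matches the server default
}
```

Note that `mode="summary"` presumably adds an LLM synthesis cost on the server side, so `raw` is the cheaper choice when the agent only needs the posts.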
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions the tool returns posts sorted by engagement and uses the X API directly, which adds useful context beyond the input schema. However, it doesn't cover important aspects like rate limits, authentication requirements beyond the apiKey parameter, error handling, or what the output looks like (since there's no output schema), leaving gaps for a read operation that depends on an external API.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise (two sentences) and front-loaded with the core purpose. Every sentence adds critical information: the first states what the tool does and its output, the second clarifies its unique API access. There's zero wasted text, making it highly efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (5 parameters, no annotations, no output schema), the description is incomplete. It covers the purpose and basic behavior but lacks details on output format, error conditions, or usage constraints. While it's adequate for a simple search tool, the absence of output schema means the description should ideally hint at return values, which it doesn't.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema (e.g., it doesn't explain the 'query' format or 'apiKey' usage in more detail). This meets the baseline of 3 when the schema does the heavy lifting, but no extra value is provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('search', 'returns') and resources ('X/Twitter', 'social sentiment', 'posts'), and distinguishes it from web search by specifying it uses the X API directly. This makes the purpose unambiguous and differentiated from potential alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool (searching X/Twitter for social sentiment) and explicitly states it's not available via web search, which helps differentiate it from general search tools. However, it doesn't mention when not to use it or name specific alternatives among the many sibling tools, which prevents a perfect score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

subscribe_forum (C)

Subscribe to forum channels to receive messages in your inbox.

Parameters (JSON Schema)
- apiKey (required): SF API key
- channels (required): Channel IDs to subscribe to
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden but lacks behavioral details. It states the tool subscribes to channels but doesn't disclose if this is a one-time action or recurring, what permissions are needed (e.g., user authentication via apiKey), potential side effects (e.g., inbox notifications), or error conditions. The description is minimal and misses key operational context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core action ('Subscribe to forum channels') and outcome. There is no wasted wording, and it's appropriately sized for a simple tool with two parameters.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and no output schema, the description is incomplete. It doesn't cover behavioral aspects (e.g., subscription mechanics, error handling) or return values (e.g., success confirmation). For a tool that likely involves user configuration and inbox integration, more context is needed to guide effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are documented in the schema. The description doesn't add meaning beyond the schema—it doesn't explain what 'channels' represent (e.g., forum names or IDs) or how the subscription works with the apiKey. Baseline 3 is appropriate as the schema handles parameter documentation adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Subscribe') and resource ('forum channels'), specifying the outcome ('receive messages in your inbox'). It distinguishes from siblings like 'read_forum' (which likely reads existing messages) and 'post_to_forum' (which posts messages), but doesn't explicitly contrast them. The purpose is specific and actionable.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing valid channel IDs), when not to use it (e.g., if already subscribed), or refer to sibling tools like 'read_forum' for reading messages versus subscribing to receive them. Usage is implied but not articulated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

trigger_evaluation (A)

Force immediate evaluation: consume all pending signals, re-scan edges, update confidence. Use after injecting important signals.

Parameters (JSON Schema)
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- thesisId (required): Thesis ID
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses key behavioral traits: it's a forced, immediate action that processes pending signals and updates confidence, implying mutation and potential side effects. However, it lacks details on permissions, rate limits, or what 'update confidence' entails in practice. The description doesn't contradict any annotations, as none exist.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded: two sentences that directly state the action and usage context with zero waste. Every word earns its place, making it easy for an AI agent to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (a mutation tool with no annotations and no output schema), the description is somewhat complete but has gaps. It explains what the tool does and when to use it, but lacks details on behavioral aspects like error handling, response format, or prerequisites. For a tool that forces evaluation, more context on outcomes or limitations would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters (thesisId and apiKey) with descriptions. The description doesn't add any meaning beyond what the schema provides, such as explaining how these parameters relate to the evaluation process. Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as 'Force immediate evaluation' with specific actions: 'consume all pending signals, re-scan edges, update confidence.' This is a specific verb+resource combination that distinguishes it from siblings like 'get_evaluation_history' or 'monitor_the_situation.' However, it doesn't explicitly differentiate from all possible alternatives like 'update_thesis' which might also affect evaluation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: 'Use after injecting important signals.' This gives a specific scenario, but it doesn't explicitly state when not to use it or name alternatives among siblings. For example, it doesn't contrast with 'get_evaluation_history' for passive monitoring or 'update_thesis' for manual updates.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_nodes (A)

Directly update causal tree node probabilities — zero LLM cost, instant. Use when agent observes a confirmed fact (e.g. "CPI came in at 3.2%") to immediately lock the node. Recomputes confidence automatically.

Parameters (JSON Schema)
- lock (optional): Node IDs to lock (prevents future LLM changes)
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- updates (required): Node updates to apply
- thesisId (required): Thesis ID
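Based on the parameter table above, a call's arguments can be sketched as below. The parameter names come from this page; the helper name, the shape of each entry in `updates`, and the `YOUR_API_KEY` placeholder are assumptions, since the page does not show an example payload.

```python
# Hedged sketch: assemble an update_nodes arguments payload.
# thesisId, apiKey, and updates are required; lock is optional.
def update_nodes_args(thesis_id, api_key, updates, lock=None):
    args = {"thesisId": thesis_id, "apiKey": api_key, "updates": updates}
    if lock:
        args["lock"] = lock  # node IDs to freeze against future LLM changes
    return args

# Lock in a confirmed fact (e.g. "CPI came in at 3.2%") at high probability.
args = update_nodes_args(
    "thesis-123",                          # hypothetical thesis ID
    "YOUR_API_KEY",                        # SimpleFunctions API key
    [{"id": "n1", "probability": 0.99}],   # 0.99 for confirmed facts
    lock=["n1"],
)
```

Sending only the fields that change keeps the payload minimal; the entry format inside `updates` would need to be confirmed against the JSON Schema.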
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key behavioral traits: the tool performs updates (mutations), has 'zero LLM cost, instant' execution characteristics, can 'lock nodes' to prevent future changes, and 'recomputes confidence automatically.' It doesn't mention rate limits, authentication requirements beyond the apiKey parameter, or error conditions, but provides substantial operational context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with two sentences that each earn their place. The first sentence states the core functionality and key characteristics, while the second provides usage guidance with a concrete example. There is zero wasted text, and the most important information is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a mutation tool with no annotations and no output schema, the description provides good contextual completeness. It covers the tool's purpose, when to use it, key behavioral characteristics (cost, speed, locking, recomputation), and connects to parameter usage. The main gap is the lack of information about return values or error responses, which would be helpful given there's no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 4 parameters thoroughly. The description adds some semantic context by mentioning 'node IDs to lock' and suggesting 'Use 0.99 for confirmed facts' for probability values, but doesn't provide significant additional parameter meaning beyond what's in the schema. The baseline of 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('update causal tree node probabilities') and resources ('causal tree node'), distinguishing it from sibling tools like update_position or update_thesis. It provides concrete context about what type of data is being manipulated (node probabilities in a causal tree).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool ('Use when agent observes a confirmed fact') and provides a concrete example ('CPI came in at 3.2%'). It also distinguishes this tool's purpose from potential alternatives by specifying it's for 'directly updating' probabilities with 'zero LLM cost, instant' execution.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_position (B)

Update a position: current price, edge, size, status (open→closed). Use to mark positions as closed or update tracking data.

Parameters (JSON Schema)
- edge (optional): Current edge in cents
- size (optional): Updated size
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- status (optional): open or closed
- thesisId (required): Thesis ID
- rationale (optional): Updated rationale
- positionId (required): Position ID
- currentPrice (optional): Current market price in cents
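A call built from the table above can be sketched as follows. The allowed field names and the open/closed status values come from this page; the helper itself and the `YOUR_API_KEY` placeholder are assumptions.

```python
# Hedged sketch: build an update_position payload, sending only the
# fields being changed. Required per the table: apiKey, thesisId, positionId.
def update_position_args(api_key, thesis_id, position_id, **changes):
    allowed = {"edge", "size", "status", "rationale", "currentPrice"}
    unknown = set(changes) - allowed
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if changes.get("status") not in (None, "open", "closed"):
        raise ValueError("status must be 'open' or 'closed'")
    return {"apiKey": api_key, "thesisId": thesis_id,
            "positionId": position_id, **changes}

# Mark a position closed and record its final price (cents).
args = update_position_args("YOUR_API_KEY", "thesis-123", "pos-7",
                            status="closed", currentPrice=64)
```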
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It mentions the tool can update fields and change status from open→closed, implying mutation. However, it doesn't disclose critical behavioral traits: whether this requires specific permissions, if updates are reversible, what happens to unchanged fields, error conditions, or response format. For a mutation tool with zero annotation coverage, this is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded: two concise sentences that immediately state the purpose and usage. Every word earns its place with no redundancy or fluff. It efficiently communicates core information without wasting space.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (mutation with 8 parameters), lack of annotations, and no output schema, the description is incomplete. It should explain more about behavioral aspects (e.g., permissions, side effects, response format) and usage constraints. The description alone doesn't provide enough context for safe and effective use by an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 8 parameters thoroughly. The description lists some parameters (current price, edge, size, status) but doesn't add meaning beyond what the schema provides—it doesn't explain relationships between parameters or usage nuances. With high schema coverage, the baseline is 3 even without extra param info in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Update a position' with specific fields (current price, edge, size, status). It distinguishes from 'add_position' (create new) and 'close_position' (only close), but doesn't explicitly contrast with 'update_thesis' or 'update_strategy' which might be related operations. The verb+resource+scope is well-defined but sibling differentiation is incomplete.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides some usage context: 'Use to mark positions as closed or update tracking data.' This implies when to use it (closing positions or updating tracking data) but doesn't explicitly state when NOT to use it or mention alternatives like 'close_position' for just closing or 'update_thesis' for higher-level updates. The guidance is helpful but not comprehensive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_strategy (C)

Advanced: Update a trading strategy (stop loss, take profit, status).

Parameters (JSON Schema)
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- status (optional): New status: active|watching|executed|cancelled|review
- priority (optional): New priority
- stopLoss (optional): New stop loss (cents)
- thesisId (required): Thesis ID
- rationale (optional): Updated rationale
- entryAbove (optional): New entry above trigger (cents)
- entryBelow (optional): New entry below trigger (cents)
- strategyId (required): Strategy ID (UUID)
- takeProfit (optional): New take profit (cents)
- softConditions (optional): Updated soft conditions
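The status enum and cent-denominated trigger fields in the table above suggest a payload like the sketch below. The helper, the `STRATEGY_UUID` placeholder, and the `YOUR_API_KEY` placeholder are assumptions; only the field names and enum values come from this page.

```python
# Hedged sketch: build an update_strategy payload with a validated status.
VALID_STATUS = {"active", "watching", "executed", "cancelled", "review"}

def update_strategy_args(api_key, thesis_id, strategy_id, **changes):
    if "status" in changes and changes["status"] not in VALID_STATUS:
        raise ValueError(f"status must be one of {sorted(VALID_STATUS)}")
    return {"apiKey": api_key, "thesisId": thesis_id,
            "strategyId": strategy_id, **changes}

# Tighten risk controls on an existing strategy (prices in cents).
args = update_strategy_args("YOUR_API_KEY", "thesis-123", "STRATEGY_UUID",
                            stopLoss=35, takeProfit=80, status="watching")
```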
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It mentions updating fields but doesn't disclose behavioral traits like required permissions, whether changes are reversible, rate limits, or error conditions. For a mutation tool with 11 parameters, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence with no wasted words. However, 'Advanced:' is unnecessary and doesn't add value, slightly reducing effectiveness. It's front-loaded with the core action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex mutation tool with 11 parameters and no annotations or output schema, the description is inadequate. It lacks context on permissions, side effects, return values, or error handling. The agent would struggle to use this tool correctly without additional information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema documents all parameters thoroughly. The description lists three example fields (stop loss, take profit, status) but doesn't add meaning beyond what the schema provides. Baseline 3 is appropriate as the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool updates a trading strategy with specific fields (stop loss, take profit, status), which is clear but vague. It uses 'Advanced:' which adds no clarity. It doesn't distinguish from sibling tools like 'update_position' or 'update_thesis', leaving ambiguity about scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like 'update_position' or 'update_thesis'. The 'Advanced:' prefix might imply complexity but doesn't specify prerequisites or contexts. The agent must infer usage from the description alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_thesis (B)

Update thesis metadata: title, status (active/paused/archived), webhookUrl. Use to pause monitoring, rename, or configure webhook notifications.

Parameters (JSON Schema)
- title (optional): New thesis title
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- status (optional): New status: active, paused, archived
- thesisId (required): Thesis ID
- webhookUrl (optional): Webhook URL for confidence change notifications (HTTPS only)
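The HTTPS-only webhook constraint and the status enum in the table above can be enforced client-side, as in this sketch. The helper and the example webhook endpoint are assumptions.

```python
# Hedged sketch: build an update_thesis payload, validating the
# documented constraints (status enum, HTTPS-only webhookUrl).
def update_thesis_args(api_key, thesis_id, title=None, status=None,
                       webhook_url=None):
    if status not in (None, "active", "paused", "archived"):
        raise ValueError("status must be active, paused, or archived")
    if webhook_url is not None and not webhook_url.startswith("https://"):
        raise ValueError("webhookUrl must be HTTPS")
    args = {"apiKey": api_key, "thesisId": thesis_id}
    if title is not None:
        args["title"] = title
    if status is not None:
        args["status"] = status
    if webhook_url is not None:
        args["webhookUrl"] = webhook_url
    return args

# Pause monitoring and point notifications at a hypothetical endpoint.
args = update_thesis_args("YOUR_API_KEY", "thesis-123", status="paused",
                          webhook_url="https://example.com/hooks/confidence")
```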
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. While it mentions the tool updates metadata and gives usage scenarios, it doesn't disclose critical behavioral traits: whether this requires specific permissions, if changes are reversible, what happens to existing metadata not mentioned, rate limits, or what the response looks like. For a mutation tool with zero annotation coverage, this is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded: the first sentence states the core purpose, and the second sentence provides usage guidance. Every sentence earns its place with zero waste or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given this is a mutation tool with no annotations and no output schema, the description is incomplete. It should disclose more about behavioral aspects (permissions, reversibility, response format) and provide clearer differentiation from sibling update tools. The description does the minimum for purpose and usage but lacks critical context for safe and effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 5 parameters thoroughly. The description mentions three parameters (title, status, webhookUrl) but doesn't add meaning beyond what the schema provides. It doesn't explain the relationship between parameters or provide additional context about their usage. Baseline 3 is appropriate when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool updates thesis metadata with specific fields (title, status, webhookUrl) and provides the verb 'update' with the resource 'thesis metadata'. It distinguishes from siblings like 'create_thesis' (creation) and 'list_theses' (listing), but doesn't explicitly contrast with other update tools like 'update_strategy' or 'update_position'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: 'to pause monitoring, rename, or configure webhook notifications'. This gives practical scenarios (pause, rename, configure) that help guide usage. However, it doesn't explicitly state when NOT to use it or mention alternatives among sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

what_if (B)

Scenario analysis: "what if OPEC cuts production?" Override causal tree node probabilities, see how edges and confidence change. Zero LLM cost, instant.

Parameters (JSON Schema)
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- thesisId (required): Thesis ID
- overrides (required): Node probability overrides, e.g. {"n1": 0.1, "n3": 0.2}
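The overrides example in the schema above translates directly into a payload, sketched here. The 0..1 probability range is an assumption (the 0.99 value cited elsewhere on this page suggests it), as is the helper itself.

```python
# Hedged sketch: build a what_if payload from node probability overrides.
def what_if_args(api_key, thesis_id, overrides):
    for node_id, p in overrides.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{node_id}: probability {p} outside [0, 1]")
    return {"apiKey": api_key, "thesisId": thesis_id, "overrides": overrides}

# "What if OPEC cuts production?" -- drive two nodes to low probability
# and inspect how edges and confidence change in the response.
args = what_if_args("YOUR_API_KEY", "thesis-123", {"n1": 0.1, "n3": 0.2})
```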
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It mentions 'Zero LLM cost, instant' which hints at performance and cost efficiency, but lacks details on permissions, rate limits, side effects (e.g., whether overrides are saved or transient), or response format. For a mutation-like tool (overrides), this is insufficient behavioral disclosure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with two sentences: one stating the purpose with an example, and another highlighting benefits. It's front-loaded with the core functionality. However, the first sentence could be more clearly structured, and opening with a quoted example is mildly distracting.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations, no output schema, and a tool that involves overrides (potentially mutation-like), the description is incomplete. It lacks information on what the tool returns (e.g., updated edges/confidence), error handling, or dependencies. The mention of 'Zero LLM cost, instant' adds some context but doesn't compensate for major gaps in behavioral and output details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters (thesisId, apiKey, overrides) with descriptions. The description adds minimal value by mentioning 'Override causal tree node probabilities' which aligns with the 'overrides' parameter, but doesn't provide additional syntax or format details beyond what the schema specifies. Baseline 3 is appropriate given high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs 'scenario analysis' with the example 'what if OPEC cuts production?' and specifies the action of overriding node probabilities to see changes in edges and confidence. It distinguishes from siblings like 'update_nodes' by focusing on hypothetical analysis rather than direct updates. However, it doesn't explicitly name the resource (e.g., 'causal tree' or 'thesis') beyond context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for scenario analysis with overrides, mentioning 'Zero LLM cost, instant' as a benefit. It doesn't explicitly state when to use this tool versus alternatives like 'update_nodes' (which might modify actual data) or 'explore_public' (for browsing), leaving some ambiguity. No exclusions or prerequisites are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

x_account (C)

Advanced: Recent posts from a specific X/Twitter account.

Parameters (JSON Schema)
- hours (optional): Hours to look back (default 24)
- limit (optional): Max posts (default 20)
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- username (required): X/Twitter username (without @)
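The defaults (hours=24, limit=20) and the "username without @" rule in the table above can be encoded in a small builder, sketched here. Stripping a leading @ is a convenience assumption, not documented behavior.

```python
# Hedged sketch: build an x_account payload with the documented defaults.
def x_account_args(api_key, username, hours=24, limit=20):
    return {"apiKey": api_key,
            "username": username.lstrip("@"),  # schema wants no leading @
            "hours": hours, "limit": limit}

# Pull the last 48 hours of posts from a hypothetical account.
args = x_account_args("YOUR_API_KEY", "@some_account", hours=48)
```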
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It mentions 'Advanced' and 'Recent posts' but doesn't explain what 'Advanced' entails, whether this is a read-only operation, rate limits, authentication requirements beyond the apiKey parameter, or what the output format looks like. For a tool with external API integration and no annotations, this leaves significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief and front-loaded with the core purpose. The single sentence is efficient with no wasted words. However, the 'Advanced:' prefix adds minimal value without explanation, slightly reducing effectiveness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (external API integration, 4 parameters, no output schema, and no annotations), the description is incomplete. It doesn't address authentication behavior, rate limits, error handling, or output format. For a tool that interacts with an external service and has no structured output documentation, more contextual information is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema (e.g., it doesn't explain how 'username' relates to X/Twitter handles or what 'hours' means operationally). With high schema coverage, the baseline score of 3 is appropriate as the description doesn't compensate with additional semantic context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: retrieving recent posts from a specific X/Twitter account. It specifies the resource (posts) and action (retrieving recent ones), making the verb+resource relationship explicit. However, it doesn't distinguish this tool from its sibling 'search_x' or 'x_news', which appear to be related X/Twitter tools, so it misses full sibling differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools like 'search_x' or 'x_news', nor does it specify prerequisites or constraints beyond what's implied in the parameters. The word 'Advanced' is vague and doesn't offer practical usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

x_news (C)

Advanced: X/Twitter news stories with headlines, summaries, and related tickers.

Parameters (JSON Schema)
- limit (optional): Max stories (default 10)
- query (required): News search query
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
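A call built from the table above is simple enough to sketch in a few lines. Only the field names and the default limit come from this page; the helper and the non-empty-query check are assumptions.

```python
# Hedged sketch: build an x_news payload; limit defaults to 10 per the table.
def x_news_args(api_key, query, limit=10):
    if not query.strip():
        raise ValueError("query is required and must be non-empty")
    return {"apiKey": api_key, "query": query, "limit": limit}

# Fetch a handful of stories for a hypothetical query.
args = x_news_args("YOUR_API_KEY", "OPEC production cuts", limit=5)
```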
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'Advanced' but doesn't explain what that entails (e.g., rate limits, authentication needs, data freshness, or pagination). The description lacks details on what the tool returns, error handling, or any side effects, leaving significant gaps for a tool with external API dependencies.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with a single sentence that front-loads key information ('Advanced: X/Twitter news stories...'). There's no wasted text, and it efficiently communicates the core functionality without unnecessary elaboration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (external API integration, 3 parameters, no output schema, and no annotations), the description is incomplete. It doesn't cover behavioral aspects like authentication, rate limits, return format, or error handling. Without annotations or output schema, the description should provide more context to guide effective use, but it falls short.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters (limit, query, apiKey). The description doesn't add any meaningful semantics beyond what's in the schema, such as query format examples, apiKey usage context, or limit constraints. Baseline 3 is appropriate when the schema handles parameter documentation adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: retrieving X/Twitter news stories with specific attributes (headlines, summaries, related tickers). It names a specific resource (news stories) and source (X/Twitter), but doesn't explicitly differentiate itself from sibling tools like 'search_x' or 'query', which might have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The description mentions 'Advanced' but doesn't explain what makes it advanced or when to prefer it over siblings like 'search_x' or 'query'. There's no mention of prerequisites, exclusions, or specific contexts for usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

x_volume (C)

Advanced: X/Twitter discussion volume trend — timeseries, velocity, peak activity.

Parameters (JSON Schema)
- hours (optional): Hours to look back (default 72)
- query (required): Search query
- apiKey (required): SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
- granularity (optional): Timeseries granularity (default: hour)
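The defaults in the table above (hours=72, granularity="hour") can be baked into a builder, sketched here. Other granularity values are not documented on this page, so none are assumed; the helper itself is also an assumption.

```python
# Hedged sketch: build an x_volume payload with the documented defaults.
def x_volume_args(api_key, query, hours=72, granularity="hour"):
    return {"apiKey": api_key, "query": query,
            "hours": hours, "granularity": granularity}

# Hourly discussion-volume timeseries over the default 72-hour window.
args = x_volume_args("YOUR_API_KEY", "CPI print")
```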
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It mentions what metrics are returned (timeseries, velocity, peak activity) but doesn't cover critical aspects: authentication requirements (implied by apiKey parameter but not stated), rate limits, data freshness, whether it's a read-only operation, or what format the output takes. For a tool with external API dependencies, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief and front-loaded with the core purpose. The single sentence efficiently conveys the tool's focus. However, the 'Advanced:' prefix adds minimal value and could be integrated more smoothly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 4 parameters, no annotations, and no output schema, the description is incomplete. It doesn't explain the relationship between parameters (e.g., how query interacts with hours/granularity), doesn't describe the output format, and doesn't cover authentication or rate limiting considerations. The description should do more given the lack of structured metadata.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents all 4 parameters. The description doesn't add any parameter-specific information beyond what's in the schema. It mentions 'timeseries' which relates to the granularity parameter, but doesn't explain this connection. Baseline 3 is appropriate when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool analyzes Twitter/X discussion volume trends with specific metrics (timeseries, velocity, peak activity). It distinguishes itself from sibling 'search_x' by focusing on volume trends rather than search results. However, it doesn't specify an exact verb (e.g., 'analyze' or 'retrieve'), which prevents a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'search_x' or 'x_account'. It doesn't mention prerequisites, use cases, or exclusions. The word 'Advanced' hints at complexity but doesn't clarify appropriate contexts.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
