Glama

Server Details

Calibrated world model for AI agents. 40 tools: world state, markets, trading. Kalshi + Polymarket.

Status: Healthy
Transport: Streamable HTTP
Repository: spfunctions/simplefunctions-cli
GitHub Stars: 10
Server Listing: SimpleFunctions

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client -> Glama -> MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: B

Average 3.4/5 across 56 of 56 tools scored. Lowest: 2.5/5.

Server Coherence: B

Disambiguation: 2/5

Many tools have overlapping or ambiguous purposes, which can confuse agents. For example, 'get_trade_ideas', 'get_edges', 'scan_markets', and 'screen_markets' all relate to finding trading opportunities, but the distinctions between them are unclear. Similarly, 'get_world_state' and 'get_world_delta' serve similar functions (full vs. incremental updates), and 'enrich_content' and 'monitor_the_situation' both cross-reference content with markets but differ in scope and authentication. The descriptions help somewhat, but the high tool count exacerbates the overlap.

Naming Consistency: 4/5

Tool names generally follow a consistent verb_noun pattern (e.g., 'create_intent', 'list_theses', 'update_position'), with clear actions like get, create, update, list. However, there are minor deviations such as 'augment_tree' (verb_noun but less common), 'browse_public_skills' (browse vs. list), 'enrich_content' (enrich vs. get), and 'what_if' (non-standard phrasing). Overall, the naming is mostly predictable and readable, with only a few inconsistencies.

Tool Count: 2/5

With 56 tools, the count is excessive for a single server, making it overwhelming and difficult to navigate. While the domain (prediction market trading and analysis) is broad, many tools could be consolidated or grouped (e.g., multiple 'get_' tools for market data, advanced vs. basic functions). This high number suggests poor scoping and likely includes redundant or overly granular operations that hinder usability.

Completeness: 5/5

The tool set provides comprehensive coverage for prediction market trading and analysis, with no obvious gaps. It includes full lifecycle management (create, update, list, delete for theses, intents, positions), data retrieval (market prices, edges, context, forecasts), execution tools (intents, strategies), and auxiliary functions (forum, skills, legislation). The surface supports end-to-end workflows from discovery to execution and monitoring, ensuring agents can perform all necessary operations without dead ends.

Available Tools

108 tools
add_position (grade B)

Record a new position in a thesis for tracking. Use after an intent fills or a manual trade.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| size | No | Position size (contracts) | |
| venue | Yes | Exchange venue | |
| apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | |
| thesisId | Yes | Thesis ID | |
| direction | Yes | Position direction | |
| rationale | No | Why this position | |
| entryPrice | Yes | Entry price in cents | |
| marketTitle | Yes | Human-readable market name | |
| externalMarketId | Yes | Market ticker | |

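
As a rough illustration of how a client might assemble a call to this tool, the sketch below wraps the listed parameters in an MCP `tools/call` JSON-RPC request. The helper name `build_add_position_call` and every value are hypothetical placeholders. In particular, `"kalshi"` and `"yes"` are assumed values, since the listing does not show what `venue` and `direction` accept.

```python
import json

# Parameters the listing marks as required for add_position.
REQUIRED = (
    "venue", "apiKey", "thesisId", "direction",
    "entryPrice", "marketTitle", "externalMarketId",
)


def build_add_position_call(arguments: dict, request_id: int = 1) -> dict:
    """Wrap add_position arguments in an MCP tools/call JSON-RPC request."""
    missing = [name for name in REQUIRED if name not in arguments]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "add_position", "arguments": arguments},
    }


example_call = build_add_position_call({
    "venue": "kalshi",                            # assumed value
    "apiKey": "sf_example_key",                   # placeholder, not a real key
    "thesisId": "thesis_123",                     # hypothetical ID
    "direction": "yes",                           # assumed value
    "entryPrice": 42,                             # cents
    "marketTitle": "Example market",              # placeholder
    "externalMarketId": "KXFEDDEC-25DEC31-T100",  # ticker format from create_intent
    "size": 10,                                   # optional: contracts
})
print(json.dumps(example_call, indent=2))
```

Checking required fields on the client side is a design choice here, not documented server behavior. The listing does not say how the server responds to a missing parameter.
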
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. While it implies a write operation ('Record a new position'), it lacks details on permissions, side effects, error handling, or response format. For a mutation tool with zero annotation coverage, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded, consisting of just two sentences that directly state the purpose and usage context. Every word earns its place, with no redundant or unnecessary information, making it efficient and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (a write operation with 9 parameters, 7 required), no annotations, and no output schema, the description is incomplete. It doesn't address behavioral aspects like what happens on success/failure, authentication needs (implied by 'apiKey' but not explained), or how it integrates with sibling tools, leaving significant gaps for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the input schema already documents all parameters thoroughly. The description doesn't add any additional meaning or context beyond what the schema provides (e.g., it doesn't explain relationships between parameters like 'thesisId' and 'externalMarketId'). Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with a specific verb ('Record') and resource ('new position in a thesis for tracking'), making it understandable. However, it doesn't explicitly differentiate from sibling tools like 'update_position' or 'close_position', which would require more specific context about when to use each.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides implied usage guidelines by stating 'Use after an intent fills or a manual trade,' which gives some context for when to invoke this tool. However, it doesn't explicitly mention alternatives (e.g., when to use 'update_position' instead) or exclusions, leaving room for ambiguity compared to sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

augment_tree (grade A)

Advanced: Merge suggested causal tree nodes from evaluations into the tree (append-only).

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | |
| dryRun | No | Preview without applying (default: false) | |
| thesisId | Yes | Thesis ID | |

Behavior: 3/5

With no annotations provided, the description carries the full burden. It discloses the 'append-only' behavioral trait, which is valuable context beyond the basic 'merge' action. However, it doesn't mention permissions, side effects, rate limits, or what happens on failure. The 'Advanced' label hints at complexity but lacks specifics.

Conciseness: 5/5

The description is a single, efficient sentence with zero waste. It's front-loaded with the core action and includes a qualifier ('append-only') that adds necessary context without verbosity. Every word earns its place.

Completeness: 3/5

For a mutation tool with no annotations and no output schema, the description is minimal but covers the core purpose. It lacks details on permissions, error handling, or return values, which are important for an 'Advanced' operation. The 100% schema coverage helps, but behavioral context is incomplete.

Parameters: 3/5

Schema description coverage is 100%, so parameters are well-documented in the schema. The description doesn't add any parameter-specific semantics beyond what the schema provides (e.g., it doesn't explain how 'dryRun' interacts with merging). Baseline 3 is appropriate when the schema does the heavy lifting.

Purpose: 4/5

The description clearly states the action ('merge suggested causal tree nodes from evaluations') and resource ('into the tree'), with the qualifier 'append-only' indicating a specific type of merge. It distinguishes from siblings like 'update_nodes' by focusing on merging from evaluations rather than general updates, though it doesn't explicitly name alternatives.

Usage Guidelines: 3/5

The description implies usage when there are 'suggested causal tree nodes from evaluations' to be merged, but doesn't specify when to use this versus alternatives like 'update_nodes' or 'update_thesis'. No explicit when-not or prerequisite guidance is provided, leaving usage context somewhat implied.

batch_markets (grade A)

Fetch many markets at once by ticker list. Cheaper than calling get_market_detail in a loop.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| depth | No | Orderbook depth levels (0 = none) | |
| tickers | Yes | Comma-separated tickers | |

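
Because `tickers` is a single comma-separated string rather than a list, a client typically has to join its tickers before calling the tool. The sketch below shows one way to do that. The helper name `build_batch_markets_args` is hypothetical, the second ticker is made up for illustration, and de-duplicating tickers is a client-side choice rather than anything the listing requires.

```python
def build_batch_markets_args(tickers, depth=None):
    """Join tickers into the comma-separated string batch_markets expects."""
    cleaned = [t.strip() for t in tickers if t and t.strip()]
    unique = list(dict.fromkeys(cleaned))  # drop duplicates, keep order
    if not unique:
        raise ValueError("at least one ticker is required")
    args = {"tickers": ",".join(unique)}
    if depth is not None:
        if depth < 0:
            raise ValueError("depth must be 0 (no orderbook) or positive")
        args["depth"] = depth
    return args


batch_args = build_batch_markets_args(
    ["KXFEDDEC-25DEC31-T100", " EXAMPLE-TICKER-2 ", "KXFEDDEC-25DEC31-T100"],
    depth=0,  # 0 = no orderbook levels, per the parameter description
)
print(batch_args)
```
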
Behavior: 3/5

With no annotations, the description adds only the cost hint ('Cheaper'), but does not disclose rate limits, error behavior, or response handling. It meets minimum disclosure but is not comprehensive.

Conciseness: 5/5

Two short, front-loaded sentences with no wasted words. Every part adds value.

Completeness: 3/5

For a 2-param tool with no output schema, the description covers the purpose and cost advantage but omits return format, limits, and error scenarios. Adequate but not complete.

Parameters: 3/5

Schema coverage is 100%, so parameters are already described. The description adds no extra meaning beyond 'by ticker list', which matches the schema's 'Comma-separated tickers'. Baseline 3 applies.

Purpose: 5/5

The description uses a clear verb ('Fetch') and resource ('markets'), and explicitly contrasts with a sibling ('get_market_detail'), distinguishing the tool's batch nature.

Usage Guidelines: 4/5

It states when to use this tool ('cheaper than calling get_market_detail in a loop'), providing a clear usage context, though it does not mention when not to use it or other alternatives.

browse_public_skills (grade B)

Browse public skills from the community. Free-tier and rate-limited. Filter by category, search, or sort by popularity.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| q | No | Search skills by name/description | |
| sort | No | Sort order: popular or new (default: new) | |
| category | No | Filter by category: custom, trading, research, monitoring | |

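
All three parameters are optional, and two of them accept a fixed set of values. A minimal sketch of building the arguments, assuming a client wants to reject unknown values before sending, might look like this. The helper name `build_browse_args` is hypothetical, and the listing does not say how the server itself treats unrecognized values.

```python
# Allowed values as listed in the parameter descriptions.
SORT_ORDERS = {"popular", "new"}
CATEGORIES = {"custom", "trading", "research", "monitoring"}


def build_browse_args(q=None, sort=None, category=None):
    """Build browse_public_skills arguments, omitting anything not set."""
    args = {}
    if q:
        args["q"] = q
    if sort is not None:
        if sort not in SORT_ORDERS:
            raise ValueError(f"sort must be one of {sorted(SORT_ORDERS)}")
        args["sort"] = sort
    if category is not None:
        if category not in CATEGORIES:
            raise ValueError(f"category must be one of {sorted(CATEGORIES)}")
        args["category"] = category
    return args


browse_args = build_browse_args(q="precheck", sort="popular", category="trading")
print(browse_args)
```
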
Behavior: 2/5

With no annotations, the description must disclose behavioral traits. It states 'Free-tier and rate-limited,' which adds context about access restrictions, but fails to mention other important behaviors such as whether the operation is read-only, what happens on rate limit, or the structure of the response. The description is minimal.

Conciseness: 5/5

The description is very concise, with three short sentences. The first states the purpose, and the second and third add constraints and filtering options. No unnecessary words.

Completeness: 3/5

Given the low complexity (0 required params, no nested objects, no output schema), the description covers purpose and constraints. However, it lacks a hint about the return value, which would be helpful since no output schema is provided. Overall adequate but not complete.

Parameters: 3/5

All three parameters have descriptions in the schema (100% coverage). The description does not add extra meaning beyond stating that filtering, search, and sorting are possible, which aligns with the schema. Baseline 3 is appropriate as the description adds no new information.

Purpose: 4/5

The description clearly states the resource ('public skills from the community') and verb ('Browse'), but does not explicitly differentiate from siblings like 'list_skills' or 'get_skills', which may also list skills. However, the term 'public' and 'community' implies a distinction from personal skills.

Usage Guidelines: 2/5

The description mentions 'Free-tier and rate-limited' as a constraint, but provides no guidance on when to use this tool versus alternatives. There is no explicit context for when to use or when not to use, nor references to sibling tools.

cancel_intent (grade B)

Cancel an active intent. Stops trigger evaluation and prevents execution.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | |
| intentId | Yes | Intent ID to cancel | |

Behavior: 2/5

No annotations are provided, so the description carries full burden. It mentions 'Stops trigger evaluation and prevents execution', which adds some behavioral context about halting processes. However, it lacks details on permissions, reversibility, side effects, or response format, leaving significant gaps for a mutation tool.

Conciseness: 5/5

The description is extremely concise with two short sentences that directly state the tool's purpose and effect. Every word contributes meaning, with no wasted text, making it front-loaded and efficient.

Completeness: 2/5

For a mutation tool with no annotations and no output schema, the description is incomplete. It lacks information on success/failure outcomes, error conditions, or broader system impact, which are critical for an agent to use this tool effectively in context.

Parameters: 3/5

Schema description coverage is 100%, with both parameters ('apiKey', 'intentId') well-documented in the schema. The description adds no additional parameter details beyond what the schema provides, so it meets the baseline of 3 without compensating for any gaps.

Purpose: 5/5

The description clearly states the action ('Cancel') and resource ('an active intent'), specifying what the tool does. It distinguishes from siblings like 'create_intent' or 'list_intents' by focusing on termination rather than creation or listing.

Usage Guidelines: 2/5

The description provides no guidance on when to use this tool versus alternatives. It does not mention prerequisites (e.g., needing an active intent) or compare to other tools like 'close_position' or 'update_strategy', leaving usage context unclear.

close_position (grade C)

Delete a position record from a thesis.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | |
| thesisId | Yes | Thesis ID | |
| positionId | Yes | Position ID | |

Behavior: 2/5

No annotations are provided, so the description carries full burden. It states 'Delete,' implying a destructive mutation, but doesn't disclose behavioral traits like whether deletion is permanent, requires specific permissions, has side effects on related data, or returns confirmation. This leaves significant gaps in understanding the tool's impact.

Conciseness: 5/5

The description is a single, direct sentence with zero wasted words, efficiently conveying the core action and target. It's appropriately sized and front-loaded, making it easy for an agent to parse quickly without unnecessary detail.

Completeness: 2/5

Given the tool's complexity as a destructive operation with no annotations and no output schema, the description is incomplete. It fails to address critical aspects like return values, error conditions, or confirmation of deletion, leaving the agent with insufficient information for reliable invocation.

Parameters: 3/5

Schema description coverage is 100%, with clear parameter descriptions in the input schema (e.g., 'Thesis ID,' 'Position ID,' 'SimpleFunctions API key'). The description adds no additional semantic context beyond what the schema provides, so it meets the baseline for adequate but not enhanced parameter understanding.

Purpose: 4/5

The description clearly states the action ('Delete') and target resource ('a position record from a thesis'), making the purpose unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'update_position' or 'add_position', which would require more specific context about what distinguishes deletion from modification or creation.

Usage Guidelines: 2/5

No guidance is provided on when to use this tool versus alternatives like 'update_position' or other position-related operations. The description lacks context about prerequisites, such as ensuring the position exists or isn't actively in use, leaving the agent without usage direction.

configure_heartbeat (grade A)

Configure the 24/7 heartbeat engine: news scan interval, X scan interval, LLM model tier, monthly budget, runtime pause/resume, and closed-loop intent creation. Agent can speed up monitoring during high volatility or slow down to save budget.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | |
| paused | No | Pause/resume heartbeat | |
| thesisId | Yes | Thesis ID | |
| xIntervalMin | No | X/social scan interval in minutes (60-1440, default 240) | |
| evalModelTier | No | LLM model for evaluations | |
| closedLoopExit | No | Enable/disable closed-loop exit intent creation | |
| closedLoopEntry | No | Enable/disable closed-loop entry intent creation | |
| newsIntervalMin | No | News scan interval in minutes (15-1440, default 240) | |
| monthlyBudgetUsd | No | Monthly budget cap in USD (0 = unlimited) | |

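
The two interval parameters have documented ranges, so an agent that adjusts monitoring speed could check its values before calling the tool. The sketch below is one possible client-side check. The helper name `validate_heartbeat_args`, the key, and the thesis ID are hypothetical, and whether the server rejects or clamps out-of-range values is not stated in the listing.

```python
# Allowed ranges in minutes, taken from the parameter descriptions.
INTERVAL_BOUNDS = {
    "newsIntervalMin": (15, 1440),
    "xIntervalMin": (60, 1440),
}


def validate_heartbeat_args(args: dict) -> dict:
    """Check configure_heartbeat arguments against the documented ranges."""
    for name in ("apiKey", "thesisId"):
        if name not in args:
            raise ValueError(f"{name} is required")
    for name, (low, high) in INTERVAL_BOUNDS.items():
        if name in args and not low <= args[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high} minutes")
    if args.get("monthlyBudgetUsd", 0) < 0:
        raise ValueError("monthlyBudgetUsd must be 0 (unlimited) or positive")
    return args


# Speed up news scanning during high volatility while capping spend.
fast_config = validate_heartbeat_args({
    "apiKey": "sf_example_key",   # placeholder, not a real key
    "thesisId": "thesis_123",     # hypothetical ID
    "newsIntervalMin": 15,        # fastest documented news interval
    "monthlyBudgetUsd": 50,
})
print(fast_config)
```
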
Behavior: 3/5

No annotations are provided, so the description carries the burden. It mentions 'runtime pause/resume' which is a behavioral trait, but lacks details on side effects, persistence, permissions beyond the required apiKey, or whether changes take effect immediately. More context on mutation impact would improve transparency.

Conciseness: 4/5

The description is concise with two sentences: one listing settings, one providing usage guidance. No redundant phrases. It earns its place but could include a brief note on return value or effect, keeping it slightly above average.

Completeness: 3/5

Given 9 parameters and no output schema or annotations, the description covers the tool's function and usage context. However, it does not explain what happens after configuration (e.g., success indication, instant apply), leaving gaps for a mutation tool. It is adequate but not fully complete.

Parameters: 3/5

Schema coverage is 100% with each parameter having a clear description. The description groups parameters by category but adds minimal new meaning beyond the schema. The phrase 'speed up monitoring' provides context for interval values, but overall does not significantly enhance parameter understanding.

Purpose: 5/5

The description uses the verb 'Configure' and specifies the resource '24/7 heartbeat engine', listing all major configurable items (intervals, model tier, budget, pause/resume, closed-loop intents). This clearly distinguishes it from sibling tools like get_heartbeat_config and get_heartbeat_status.

Usage Guidelines: 4/5

The description provides contextual guidance by stating the agent can speed up monitoring during high volatility or slow down to save budget. It implies when to use the tool based on volatility and budget constraints. However, it does not explicitly state when not to use it or suggest alternatives for read-only actions.

create_intent (grade A)

Declare an execution intent: "buy X when condition Y, expire at Z." Intents are the single gateway for all order execution. The local runtime daemon evaluates triggers and executes via user's Kalshi/Polymarket keys.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| venue | Yes | Exchange venue | |
| action | Yes | Trade action | |
| apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | |
| source | No | Intent source: agent, manual, ideaagent | |
| expireAt | No | ISO timestamp when intent expires (default: +24h) | |
| marketId | Yes | Market ticker (e.g. KXFEDDEC-25DEC31-T100) | |
| maxPrice | No | Max price per contract in cents (1-99). Omit for market order. | |
| sourceId | No | Source reference (idea ID, thesis ID, etc.) | |
| direction | Yes | Contract direction | |
| rationale | No | Why this trade — logged for audit trail | |
| autoExecute | No | Auto-execute without later human confirmation. Defaults to false unless explicitly set true. | |
| marketTitle | Yes | Human-readable market name | |
| triggerType | No | When to execute | immediate |
| triggerPrice | No | Price trigger threshold in cents (for price_below/price_above) | |
| targetQuantity | Yes | Number of contracts | |

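
Several of these parameters interact: `maxPrice` is omitted for a market order, `triggerPrice` applies to the `price_below` and `price_above` trigger types, and `expireAt` defaults to 24 hours out. The sketch below shows one way a client might encode those rules. The helper name `build_intent_args` is hypothetical, the values `"kalshi"`, `"buy"`, and `"yes"` are assumed rather than documented, and requiring `triggerPrice` for price triggers is an assumption about how the tool is meant to be used.

```python
from datetime import datetime, timedelta, timezone

# Trigger types the triggerPrice description names.
PRICE_TRIGGERS = {"price_below", "price_above"}


def build_intent_args(required, *, max_price=None, trigger_type="immediate",
                      trigger_price=None, expire_at=None):
    """Combine required create_intent fields with validated optional ones."""
    args = dict(required)
    if max_price is not None:  # omitted means market order
        if not 1 <= max_price <= 99:
            raise ValueError("maxPrice must be between 1 and 99 cents")
        args["maxPrice"] = max_price
    args["triggerType"] = trigger_type
    if trigger_type in PRICE_TRIGGERS:
        if trigger_price is None:
            raise ValueError(f"triggerPrice is needed for {trigger_type}")
        args["triggerPrice"] = trigger_price
    if expire_at is None:
        # Mirror the documented default of +24h explicitly.
        expire_at = datetime.now(timezone.utc) + timedelta(hours=24)
    args["expireAt"] = expire_at.isoformat()
    args["autoExecute"] = False  # keep human confirmation in the loop
    return args


intent_args = build_intent_args(
    {
        "venue": "kalshi",                    # assumed value
        "action": "buy",                      # assumed value
        "apiKey": "sf_example_key",           # placeholder, not a real key
        "marketId": "KXFEDDEC-25DEC31-T100",  # example ticker from the listing
        "direction": "yes",                   # assumed value
        "marketTitle": "Example market",      # placeholder
        "targetQuantity": 10,
    },
    max_price=45,
    trigger_type="price_below",
    trigger_price=40,
    expire_at=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
print(intent_args)
```

Setting `autoExecute` to false explicitly matches the documented default and makes the confirmation step visible in logged inputs.
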
Behavior: 4/5

Description explains that the local runtime daemon evaluates triggers and executes via user's keys, disclosing asynchronous execution and key requirement. No annotations were provided, so the description carries the full burden and adequately conveys the behavioral model.

Conciseness: 5/5

Three sentences, no wasted words. The first defines the function, and the second and third give operational context. Highly efficient.

Completeness: 4/5

Given the tool's complexity (15 parameters, no output schema, no annotations), the description provides a high-level behavioral overview and operational concept, which is valuable. It does not elaborate on parameter usage or return values, though the exhaustive schema descriptions compensate.

Parameters: 3/5

Schema description coverage is 100%, so baseline is 3. The description does not add significant parameter-level detail beyond what the schema already provides; it only hints at direction, market, trigger, and expiry.

Purpose: 5/5

Description clearly states it declares an execution intent for conditional orders, and positions itself as the single gateway for order execution, distinguishing it from any other trading tool among siblings.

Usage Guidelines: 5/5

Explicitly says 'Intents are the single gateway for all order execution,' providing clear guidance that this tool should be used for any trade execution, with no alternative for direct order submission.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_skill (grade: A)

Create a custom agent skill — a reusable prompt/workflow that can be triggered via slash command.

Parameters
Name | Required | Description | Default
auto | No | Auto-trigger condition | -
name | Yes | Skill name (e.g. "precheck") | -
tags | No | Tags for discovery | -
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | -
prompt | Yes | The full prompt/instructions for the skill | -
trigger | Yes | Slash command trigger (e.g. "/precheck") | -
category | No | Category: custom, trading, research, monitoring | -
toolsUsed | No | SimpleFunctions tools this skill uses | -
description | Yes | What this skill does | -
estimatedTime | No | Estimated run time | -
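As a rough illustration of how these parameters fit together, the sketch below assembles a create_skill arguments object and checks it against the Required column. The helper name `build_create_skill_args`, the rule that a trigger starts with `/`, and the treatment of the four listed categories as a closed set are assumptions inferred from the examples in the table, not documented server behavior.

```python
from typing import Any

# Required fields, taken from the "Required" column of the parameter table.
REQUIRED = ("name", "apiKey", "prompt", "trigger", "description")
# Assumption: the categories named in the schema description form a closed set.
CATEGORIES = {"custom", "trading", "research", "monitoring"}


def build_create_skill_args(**fields: Any) -> dict[str, Any]:
    """Return an arguments dict for create_skill, or raise ValueError."""
    missing = [key for key in REQUIRED if not fields.get(key)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    # Assumption: triggers are slash commands such as "/precheck".
    if not fields["trigger"].startswith("/"):
        raise ValueError("trigger should start with '/'")
    category = fields.get("category")
    if category is not None and category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return dict(fields)


args = build_create_skill_args(
    name="precheck",
    apiKey="sf_example_key",  # placeholder, not a real key
    prompt="Review open positions before the market opens.",
    trigger="/precheck",
    description="Pre-trade checklist",
    category="trading",
)
```

Validating on the client side like this is only a convenience; the server's own validation rules, naming constraints, and return value are not described in the listing.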
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided; description only says 'create', implying mutation. Does not disclose side effects, required permissions, rate limits, or any warnings. Fails to add behavioral context beyond the schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, front-loaded with purpose, no redundant words. Efficient and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 10 parameters and no output schema, the description is minimal. It covers the core purpose but lacks details on return values, naming constraints, or error conditions. Adequate but not thorough.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so each parameter is described. The tool description adds no additional meaning beyond referencing 'slash command' which is already in the trigger parameter description. Baseline score applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool creates a custom agent skill, defined as a reusable prompt/workflow triggered via slash command. This distinguishes it from sibling tools like list_skills and run_skill.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit when-to-use or alternatives mentioned. Context implies creation, but no guidance on when not to use or comparison with other skill-related siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_strategy (grade: A)

Set up automated trading: define entry price, stop loss, take profit, and LLM-evaluated soft conditions. The heartbeat engine checks conditions every 15 min and executes when met.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | -
market | Yes | Human-readable market name | -
horizon | No | Time horizon | medium
marketId | Yes | Market ticker e.g. KXWTIMAX-26DEC31-T150 | -
stopLoss | No | Stop loss: bid <= this value (cents) | -
thesisId | Yes | Thesis ID | -
direction | Yes | Trade direction | -
rationale | No | Full logic description | -
entryAbove | No | Entry trigger: ask >= this value (cents, for NO direction) | -
entryBelow | No | Entry trigger: ask <= this value (cents) | -
takeProfit | No | Take profit: bid >= this value (cents) | -
maxQuantity | No | Max total contracts | -
softConditions | No | LLM-evaluated conditions | -
perOrderQuantity | No | Contracts per order | -
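The price thresholds in this table read as simple comparisons against the current bid and ask. The sketch below shows how a single periodic check might evaluate them. The names `Quote`, `StrategyThresholds`, and `evaluate_hard_conditions` are hypothetical, and the real heartbeat engine's logic (soft conditions, direction handling, order sizing) is not documented in the listing, so this covers only the four comparisons stated in the schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quote:
    bid: int  # best bid, in cents
    ask: int  # best ask, in cents


@dataclass
class StrategyThresholds:
    entry_below: Optional[int] = None  # entryBelow: ask <= value
    entry_above: Optional[int] = None  # entryAbove: ask >= value
    stop_loss: Optional[int] = None    # stopLoss: bid <= value
    take_profit: Optional[int] = None  # takeProfit: bid >= value


def evaluate_hard_conditions(t: StrategyThresholds, q: Quote) -> list[str]:
    """Return the names of the thresholds that the quote currently satisfies."""
    hits = []
    if t.entry_below is not None and q.ask <= t.entry_below:
        hits.append("entryBelow")
    if t.entry_above is not None and q.ask >= t.entry_above:
        hits.append("entryAbove")
    if t.stop_loss is not None and q.bid <= t.stop_loss:
        hits.append("stopLoss")
    if t.take_profit is not None and q.bid >= t.take_profit:
        hits.append("takeProfit")
    return hits


thresholds = StrategyThresholds(entry_below=40, stop_loss=25, take_profit=70)
print(evaluate_hard_conditions(thresholds, Quote(bid=38, ask=40)))  # ['entryBelow']
print(evaluate_hard_conditions(thresholds, Quote(bid=72, ask=74)))  # ['takeProfit']
```

Unset thresholds are skipped, which mirrors the fact that all four price parameters are optional in the schema.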
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It adds valuable context about the 'heartbeat engine' checking conditions every 15 minutes and executing when met, which reveals automation behavior. However, it doesn't cover important aspects like authentication needs (apiKey is documented in schema but not described in behavior), rate limits, error handling, or what happens on execution failure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized with two sentences that efficiently convey the core functionality. The first sentence establishes the purpose, and the second adds important behavioral context about the heartbeat engine. There's minimal waste, though it could be slightly more structured for clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex 14-parameter tool with no annotations and no output schema, the description provides adequate but incomplete context. It covers the automation aspect well but doesn't address important considerations like what the tool returns, error conditions, or how it interacts with other trading tools. The description compensates somewhat for the lack of structured metadata but leaves gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 14 parameters thoroughly. The description mentions key parameters like 'entry price, stop loss, take profit, and LLM-evaluated soft conditions' but doesn't add meaningful semantic context beyond what the schema provides. It establishes the purpose but doesn't clarify parameter relationships or usage nuances.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Set up automated trading') and resources ('define entry price, stop loss, take profit, and LLM-evaluated soft conditions'), distinguishing it from siblings like 'create_intent' or 'update_strategy' by focusing on automated trading strategy creation with heartbeat monitoring.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context ('automated trading', 'heartbeat engine checks conditions every 15 min') but doesn't explicitly state when to use this tool versus alternatives like 'create_intent' or 'update_strategy'. It provides some operational context but lacks explicit guidance on prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_thesis (grade: A)

Create a new prediction market thesis from a TESTABLE CLAIM — a statement that can be verified true or false at a future time. GOOD: "Bitcoin closes 2026 above $50,000". BAD: "High conviction due to large price gap" (that is reasoning, not a claim). Builds a causal tree and scans for mispriced contracts. Formation takes ~60s.

Parameters
Name | Required | Description | Default
sync | No | Wait for formation to complete | -
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | -
thesis | Yes | Your thesis statement | -
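Because the server is reached over MCP, a create_thesis call travels as a JSON-RPC `tools/call` request. The sketch below builds such a request body. The helper name `tool_call_request` and the placeholder key are illustrative, the thesis text is the GOOD example from the description, and treating `sync` as a boolean is an assumption based on its schema description.

```python
import json
from typing import Any


def tool_call_request(request_id: int, tool: str, arguments: dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


body = tool_call_request(
    1,
    "create_thesis",
    {
        "apiKey": "sf_example_key",  # placeholder, not a real key
        "thesis": "Bitcoin closes 2026 above $50,000",
        "sync": True,  # assumed boolean: wait for formation (~60s) to finish
    },
)
print(body)
```

In practice an MCP client library would normally build and send this message; the response format for create_thesis is not documented in the listing.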
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses behavioral traits: formation takes ~60s, builds a causal tree, scans for mispriced contracts, and the sync parameter's effect. However, it does not mention authentication requirements beyond the apiKey parameter or potential side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: a few short sentences plus a GOOD/BAD example pair. It front-loads the core purpose and provides essential details without superfluous text. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and no annotations, the description explains the process (builds causal tree, scans mispriced contracts) and timing. The sync parameter is explained. The only minor gap is no mention of error conditions or output format, but overall sufficient for the tool complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, baseline 3. The description adds value by clarifying 'thesis' as a testable claim, providing examples, and explaining the 'sync' parameter's effect. This goes beyond the schema's simple descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Create a new prediction market thesis from a TESTABLE CLAIM' and distinguishes it from siblings like update_thesis and fork_thesis by specifying the input type (testable claim). It provides concrete examples of good and bad claims, making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use the tool (creating a thesis from a testable claim) but does not explicitly state when not to use it or suggest alternatives. It gives criteria for a good claim but lacks guidance on how it compares to sibling tools like create_strategy.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

enrich_content (grade: B)

Cross-reference any text with thousands of prediction market contracts. Paste content + topics, get divergence analysis: where sentiment disagrees with market prices. No auth, no Firecrawl needed. Demo/trial entry point.

Parameters
Name | Required | Description | Default
model | No | LLM model for digest generation. Default: google/gemini-2.5-flash | -
topics | Yes | Topics to search in prediction markets, e.g. ["iran", "oil"] | -
content | Yes | Text content to cross-reference (up to 50K chars) | -
includeIndex | No | Include SimpleFunctions Index | -
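A minimal sketch of preparing enrich_content arguments follows. It enforces the content size limit before the call and omits `apiKey`, since the description says no auth is needed. The helper name `build_enrich_args`, the reading of "50K" as exactly 50,000 characters, and `includeIndex` being a boolean are all assumptions.

```python
from typing import Any

# Assumption: "up to 50K chars" is interpreted as 50,000 characters.
MAX_CONTENT_CHARS = 50_000


def build_enrich_args(
    content: str, topics: list[str], include_index: bool = False
) -> dict[str, Any]:
    """Return an arguments dict for enrich_content, or raise ValueError."""
    if not topics:
        raise ValueError("topics is required and should not be empty")
    if len(content) > MAX_CONTENT_CHARS:
        raise ValueError(
            f"content has {len(content)} chars; limit is {MAX_CONTENT_CHARS}"
        )
    args: dict[str, Any] = {"content": content, "topics": topics}
    if include_index:
        args["includeIndex"] = True  # assumed to be a boolean flag
    return args


args = build_enrich_args(
    "Oil prices jumped after new shipping disruptions.", ["iran", "oil"]
)
print(sorted(args))  # ['content', 'topics']
```

The `model` parameter is left out so that the server-side default named in the schema applies.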
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions no auth and no Firecrawl, but does not disclose other behaviors such as rate limits, output limits, or what happens with large inputs beyond the schema's 50K char limit. This is insufficient for full transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four short sentences with no wasted words. It front-loads the core action, then provides input hints and context. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 4 parameters, no output schema, and no annotations, the description leaves significant gaps. It does not explain what the 'divergence analysis' output contains, how it is structured, or any constraints beyond input limits. For a trial entry point, this may be acceptable but not complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description only reiterates 'content + topics' and does not add meaning beyond the schema for parameters like 'model' or 'includeIndex'. It offers no additional context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: cross-referencing text with prediction market contracts to get divergence analysis. It uses a specific verb ('Cross-reference') and resource ('prediction market contracts'). While it doesn't explicitly differentiate from siblings, the unique function of sentiment/market price comparison is evident.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for quick demos or trials with phrases like 'Demo/trial entry point' and 'No auth, no Firecrawl needed'. It gives some context but lacks explicit guidance on when not to use or alternatives among the many sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

explore_public (grade: A)

Browse public theses from other users. Free-tier and rate-limited. Pass a slug to get details, or omit to list all.

Parameters
Name | Required | Description | Default
slug | No | Specific thesis slug, or empty to list all | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description adds rate-limit and free-tier context. However, it does not disclose authentication requirements, pagination, or whether the tool is read-only (implied by 'browse').

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three short sentences with no unnecessary words. The first states the purpose, the second notes the access limits, and the third explains parameter usage. Highly efficient and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one optional parameter and no output schema, the description sufficiently covers usage, constraints (free-tier, rate-limited), and parameter behavior. Could mention pagination but not essential.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema already describes the slug parameter well. The description restates the two use cases, adding no new information beyond what the schema provides. Schema description coverage is 100%, so baseline is 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Browse public theses from other users' and distinguishes two modes (list all or get details by slug). However, it does not explicitly differentiate from sibling tools like 'explore_theses' or 'list_theses'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use each mode ('Pass a slug to get details, or omit to list all') and mentions rate-limiting, but does not provide guidance on when to choose this tool over alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

explore_theses (grade: A)

Browse public theses (alias of explore_public). Pass slug to get one, omit to list.

Parameters
Name | Required | Description | Default
slug | No | Thesis slug, or empty to list | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry the burden. It states the tool is for browsing (read-only), which is appropriate, but lacks details on auth, rate limits, or other behavioral traits. It is adequate but not rich.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with the key action, and contains no waste. Every word serves a purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple browse tool with one optional parameter and no output schema, the description is mostly sufficient. However, it fails to mention what the tool returns (e.g., a list of thesis summaries or a full thesis object), which would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has one parameter 'slug' with a clear description. The tool description essentially repeats that ('Pass slug to get one, omit to list'). With high schema coverage (100%), baseline is 3, and the description adds no significant new meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Browse'), the resource ('public theses'), and the behavior (list or get one by slug). It also distinguishes itself from siblings by noting it's an alias of explore_public.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly explains how to use the tool: pass slug to get one thesis, omit to list all. It identifies itself as an alias of explore_public, but does not provide explicit alternatives or when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fork_skill (grade: B)

Fork a public skill into your collection. No slug needed — just the skill ID.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | -
skillId | Yes | Public skill ID to fork | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It mentions the operation ('fork') which implies a copy/create action, but doesn't describe what happens during forking (e.g., whether it creates an editable copy, maintains links to original, includes dependencies). No information about permissions needed, rate limits, error conditions, or what the forked skill inherits. The description is minimal and leaves significant behavioral aspects unspecified.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at just two sentences, with zero wasted words. It's front-loaded with the core purpose, followed by a brief parameter guidance. Every sentence earns its place by providing essential information about the tool's function and parameter expectations.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and no output schema, the description is incomplete for a mutation tool. While concise, it lacks critical information about what the fork operation actually does behaviorally, what permissions are required, what the response contains, or how the forked skill relates to the original. For a tool that presumably creates a copy of someone else's work, more context about the operation's implications would be valuable.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so both parameters are documented in the schema. The description adds minimal value beyond the schema: it mentions 'No slug needed' which clarifies what's NOT required, and 'just the skill ID' reinforces the skillId parameter's purpose. However, it doesn't provide additional context about parameter relationships, validation rules, or usage patterns that aren't already in the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Fork') and resource ('a public skill into your collection'), providing a specific verb+resource combination. It distinguishes from siblings like 'create_skill' by specifying it's for forking existing public skills rather than creating new ones from scratch. However, it doesn't explicitly differentiate from 'fork_thesis' which is a similar operation on a different resource type.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context by specifying 'public skill' and 'into your collection,' suggesting this is for copying shared skills to personal use. It mentions 'No slug needed' which provides some guidance about parameter requirements. However, it doesn't explicitly state when to use this versus alternatives like 'create_skill' or 'browse_public_skills,' nor does it provide exclusion criteria or prerequisites beyond having the skill ID.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fork_thesis (grade: A)

Fork a thesis. Two modes: (1) Clone — call with just idOrSlug to copy a PUBLIC thesis verbatim into your collection. (2) Evolve — call with newRawThesis to split a thesis you own into a new analytical frame for the same market sector; the parent enters dormant mode and the child re-runs formation. Use evolve when the current frame is fundamentally inadequate for what the evidence now shows.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | -
reason | No | Evolve mode: why the original frame is no longer adequate | -
idOrSlug | Yes | Thesis ID or public slug | -
newTitle | No | Evolve mode: short title for the new thesis, ≤60 chars | -
newRawThesis | No | Evolve mode: new frame in 1-3 sentences. Omit for clone mode. | -
inheritEdgeMarketIds | No | Evolve mode: subset of current edge marketIds to carry over. Omit to let formation rescan fresh. | -
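The two modes are selected purely by which parameters are present, which the sketch below makes explicit. The helper name `build_fork_args` is hypothetical, and rejecting evolve-only fields in clone mode is a client-side choice made here for clarity; the server may simply ignore them.

```python
from typing import Any, Optional


def build_fork_args(
    api_key: str,
    id_or_slug: str,
    new_raw_thesis: Optional[str] = None,
    new_title: Optional[str] = None,
    reason: Optional[str] = None,
    inherit_edge_market_ids: Optional[list[str]] = None,
) -> tuple[str, dict[str, Any]]:
    """Return (mode, arguments) for fork_thesis; mode is 'clone' or 'evolve'."""
    args: dict[str, Any] = {"apiKey": api_key, "idOrSlug": id_or_slug}
    evolve_only = {
        "newTitle": new_title,
        "reason": reason,
        "inheritEdgeMarketIds": inherit_edge_market_ids,
    }
    given = {key: value for key, value in evolve_only.items() if value is not None}

    if new_raw_thesis is None:
        # Clone mode: only idOrSlug (plus the key) is sent.
        if given:
            raise ValueError(f"evolve-only fields without newRawThesis: {sorted(given)}")
        return "clone", args

    # Evolve mode: newRawThesis is present; the schema caps newTitle at 60 chars.
    if new_title is not None and len(new_title) > 60:
        raise ValueError("newTitle should be at most 60 characters")
    args["newRawThesis"] = new_raw_thesis
    args.update(given)
    return "evolve", args


mode, args = build_fork_args("sf_example_key", "some-public-slug")
print(mode)  # clone
```

Leaving `inherit_edge_market_ids` unset in evolve mode corresponds to the schema's "omit to let formation rescan fresh".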
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description bears full burden. Discloses that in Evolve mode, the parent enters dormant mode and child re-runs formation. Also notes Clone requires public thesis. Lacks details on auth responsibilities or error states, but gives key behavioral insights.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, each earning its place: first states purpose, second describes clone, third describes evolve and when to use. Front-loaded with the overall action, no fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers the two modes and key parameter usage. However, lacks explanation of output (e.g., returns new thesis ID or success status) and edge cases like calling evolve on non-owned thesis. Overall adequate for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage, so baseline is 3. Description adds value by explaining how parameters relate to modes (e.g., omitting newRawThesis implies clone mode, inheritEdgeMarketIds for evolve). This clarifies usage beyond schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly describes the tool as forking a thesis with two distinct modes: Clone (copy a public thesis) and Evolve (split a thesis you own). The verb 'fork' and resource 'thesis' are specific, and the modes differentiate it from sibling tools like fork_skill.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use each mode: Clone for public theses, Evolve when the current frame is inadequate. Does not mention alternatives like create_thesis or update_thesis, but provides clear context for choosing between the two modes.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_agent_guide (grade: B)

Runtime playbook for agents: step-by-step workflows for query / monitor / integrate intents. Use when an agent is lost or needs onboarding.

Parameters
Name | Required | Description | Default
q | No | Specific question to scope the guide | -
intent | No | Workflow intent | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavior. It only says it's a 'playbook' with workflows, but doesn't state whether it's read-only, has side effects, or requires authentication. Inadequate given the lack of annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two lean sentences, front-loaded with purpose. Efficient but could include a bit more structure without bloat.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema or annotations, the description should cover return format, example usage, or cross-reference siblings. It only gives high-level purpose and usage scenario, leaving gaps for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description mentions intents (matching the enum) but adds no new meaning beyond what the schema already provides for either parameter.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it provides a runtime playbook with workflows for specific intents (query, monitor, integrate). It distinguishes itself from generic 'get_' tools by specifying its role for agent onboarding.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use when an agent is lost or needs onboarding', giving clear context. Does not mention alternatives or when not to use, but the guidance is direct.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_answer (grade B)

Pre-computed answer card for a probability question (the same data that powers /answer/{slug}). Returns probability, confidence, and citations.

Parameters
Name | Required | Description | Default
slug | Yes | Answer slug (e.g. will-the-fed-cut-rates-in-december) |
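As a rough illustration of how an agent might invoke this tool, the sketch below builds a JSON-RPC `tools/call` request of the kind the Model Context Protocol uses. The helper name `build_tool_call` and the request `id` are hypothetical, and the payload is only constructed, not sent. The response shape is not documented on this page, so none is assumed.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build an MCP-style JSON-RPC 2.0 `tools/call` request body (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# The slug is the example value from the parameter table above.
request = build_tool_call("get_answer", {"slug": "will-the-fed-cut-rates-in-december"})
print(json.dumps(request, indent=2))
```

The same helper would apply to any other tool on this server by swapping the tool name and arguments.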
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It discloses return fields but does not state whether the tool is read-only, requires authentication, has rate limits, or any side effects. The term 'pre-computed' implies safety but is not explicit.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences with no superfluous text. Purpose and output are front-loaded, making it easy for an AI agent to quickly understand the tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and no output schema, the description sufficiently explains what is returned. However, it lacks context about data availability (e.g., when answer cards exist) and does not clarify how this differs from similar tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and documents the slug parameter adequately. Description does not add any additional parameter information beyond what the schema provides, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool retrieves a pre-computed answer card for a probability question, including specific return fields (probability, confidence, citations). This differentiates it from sibling get_* tools like get_forecast or get_opinion by focusing on answer cards for probability questions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. While it mentions the data powers /answer/{slug}, it doesn't indicate when to prefer this over related tools (e.g., get_forecast) or any prerequisites or constraints.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_balance (grade C)

Advanced: Kalshi account balance and portfolio value.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'Advanced:' which might imply complexity or additional requirements, but doesn't explain what that entails (e.g., rate limits, authentication needs, or data freshness). It also doesn't describe the return format (e.g., numeric balance, structured portfolio data) or potential errors. For a financial tool with zero annotation coverage, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief with two phrases, but 'Advanced:' is vague and doesn't earn its place by adding useful information—it could be omitted without loss. The core purpose is stated efficiently, but the structure isn't front-loaded with the most critical details (e.g., it doesn't immediately clarify this is a read-only tool for account data). It's concise but could be more effectively structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of a financial tool with no annotations and no output schema, the description is incomplete. It doesn't explain what the tool returns (e.g., balance in dollars, portfolio breakdown), behavioral aspects like authentication requirements or rate limits, or how it fits with sibling tools. For a tool that likely provides critical account information, this leaves too much unspecified for reliable agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage for its single parameter (apiKey), detailing it as a 'SimpleFunctions API key' with a URL for acquisition. The description adds no parameter-specific information beyond what the schema provides, such as how the apiKey relates to the Kalshi account or any additional context. With high schema coverage, the baseline score of 3 is appropriate as the description doesn't enhance parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool retrieves 'Kalshi account balance and portfolio value', which specifies the resource (account/portfolio) and the action (get/retrieve). However, it's prefixed with 'Advanced:' which adds ambiguity rather than clarity, and it doesn't distinguish this from potential sibling tools like 'get_fills' or 'get_settlements' that might also relate to account data. The purpose is clear but not optimally specific.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing an API key for authentication), context for usage (e.g., checking account status before trading), or exclusions (e.g., not for real-time market data). With many sibling tools present, this lack of differentiation leaves the agent to infer usage from the name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_briefing (grade B)

Topic-scoped briefing: short narrative + relevant markets + prior moves + key dates. Reusable as a callable /briefing card.

Parameters
Name | Required | Description | Default
topic | No | Topic keyword |
window | No | Lookback window |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the burden of disclosure. It implies a read-only operation ('briefing') and lists outputs, but does not explicitly state side effects, permissions, or rate limits. It is adequate but not thorough.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short (two sentences) and front-loaded with the main purpose. It avoids redundancy but could be slightly more structured. However, it is concise enough for an agent.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and two parameters with full schema coverage, the description lists the output components (narrative, markets, moves, dates) which helps. However, it omits return format, error handling, and defaults, leaving gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description adds minimal value beyond 'topic keyword' and 'lookback window', merely restating them in context. No additional semantics are provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it's a 'topic-scoped briefing' with specific components (narrative, markets, moves, dates), distinguishing it from sibling tools like get_market_detail or get_calendar. However, it could be more explicit about how it differs from other get_* tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool vs alternatives. It mentions 'reusable as a callable /briefing card' but does not specify scenarios or prerequisites, leaving the agent uncertain about context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_calendar (grade A)

Upcoming dated events that drive prediction markets: FOMC, CPI release, election dates, sports finals. Returns date, topic, and linked tickers.

Parameters
Name | Required | Description | Default
days | No | Lookahead days (default 30) |
category | No | Category filter (econ, election, sports, geo) |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided. Description discloses output fields but omits details like default lookahead (30 days as per schema) or any authentication/rate limit requirements. Includes relevant examples, but lacks comprehensive behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with purpose, no wasted words. Efficient and to the point.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description outlines return fields (date, topic, tickers). With two optional parameters and clear context among many siblings, the description is mostly complete for a retrieval tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline 3 is appropriate. Description adds minimal value beyond schema by giving example categories but does not clarify parameter formats or constraints.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves 'upcoming dated events that drive prediction markets' with specific examples (FOMC, CPI, election dates, sports finals) and output fields (date, topic, linked tickers). This distinguishes it from other get_ tools on the server.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description implies it's for market-relevant events, but does not state when not to use it or compare with other tools like get_schedule.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_calibration (grade C)

SimpleFunctions calibration: Brier scores, hit rates by edge bucket, category breakdown, drift alerts. Measured against resolved/settled markets.

Parameters
Name | Required | Description | Default
period | No | Period (30d, 90d, all) |
category | No | Topic filter (fed, elections, ai, crypto, sports, ...) |
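For readers unfamiliar with the metric named in this description, the Brier score is the mean squared difference between forecast probabilities and binary outcomes: 0 is perfect, and a constant 0.5 forecast scores 0.25. A minimal sketch follows. The function name and sample data are illustrative only, and how SimpleFunctions itself computes or buckets its scores is not documented on this page.

```python
def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("forecasts and outcomes must be non-empty and the same length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Illustrative data only: three hypothetical resolved markets.
score = brier_score([0.9, 0.2, 0.6], [1, 0, 1])
print(round(score, 4))
```

Lower is better, which is why a per-category breakdown can show where forecasts are weakest.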
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must fully convey behavioral traits. It names outputs (e.g., drift alerts) but does not explain behaviors like data freshness, authentication requirements, or operational constraints. The description is insufficient for an agent to understand side effects or limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, listing key outputs and measurement context. It is concise without superfluous words. However, the first sentence is a comma-separated list which slightly reduces readability. Still efficient overall.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has two parameters, no output schema, and no annotations, the description is minimally complete. It covers what the tool returns but omits details like output format, interpretation of drift alerts, or examples. Could be improved with more context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Both parameters (period and category) are fully described in the input schema, achieving 100% coverage. The description adds no extra semantic nuance beyond the schema. Baseline score of 3 is appropriate as the schema already provides adequate meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns calibration data including Brier scores, hit rates, category breakdowns, and drift alerts. It specifies the resource (calibration) and that measurements are against resolved markets. However, it could be more explicit about what exactly is being calibrated (e.g., forecast probabilities).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It does not mention prerequisites, exclusions, or specific contexts. While the purpose is clear, an agent lacks cues for appropriate invocation relative to sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_changes (grade A)

Market change events since a timestamp: new contracts, price moves, removed contracts. Used by the live feed and agent context refreshers.

Parameters
Name | Required | Description | Default
q | No | Keyword filter |
type | No | Change type |
since | No | ISO timestamp lower bound |
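Since `since` is described only as an "ISO timestamp lower bound", a short sketch of producing one may help. The helper name `changes_arguments` is hypothetical, the accepted timestamp variant (offset versus `Z` suffix) is an assumption, and `type` is left out because its valid values are not listed in the table.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def changes_arguments(now: datetime, lookback: timedelta,
                      keyword: Optional[str] = None) -> dict:
    """Build get_changes arguments with an ISO 8601 `since` lower bound in UTC."""
    since = (now - lookback).astimezone(timezone.utc).isoformat()
    args = {"since": since}
    if keyword:
        args["q"] = keyword  # optional keyword filter
    return args


# Fixed clock so the example is reproducible; a poller would use datetime.now(timezone.utc).
now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
args = changes_arguments(now, timedelta(hours=1), keyword="fed")
print(args)
```

A context refresher could store the time of its last successful call and pass it as the next `since`, so each poll returns only new events.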
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It states the tool returns change events, implying a read operation, but does not disclose other behaviors like authentication, rate limits, or pagination. The description is adequate but lacks detailed behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences (21 words), front-loaded with key information, and wastes no words. It achieves efficiency without sacrificing clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with 3 parameters and no output schema, the description covers the essential aspects: purpose, relevant event types, and usage context. It does not explain return format, but that is acceptable given the lack of output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage, so the description adds limited additional value. It mentions the event types (new_contract, price_move, removed_contract) which align with the type parameter, but does not provide further semantic enrichment beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it retrieves market change events (new contracts, price moves, removed contracts) since a timestamp. It does not explicitly differentiate from sibling tools like get_changes_delta, but it provides enough context to understand its purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions it is used by the live feed and agent context refreshers, giving context for when to use it. However, it does not provide explicit alternatives or when-not-to-use guidance, leaving room for ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_changes_delta (grade A)

Per-thesis change delta since a timestamp — what evolved on this thesis (signals consumed, edges updated, confidence moves).

Parameters
Name | Required | Description | Default
since | No | ISO timestamp lower bound |
apiKey | Yes | SimpleFunctions API key |
thesisId | Yes | Thesis ID |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations present, so description carries full burden. It indicates a read operation returning evolution details but does not disclose side effects, permissions beyond the API key, or any limitations. Adequate but not exhaustive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, front-loaded with the main action. Every word contributes value; no wasted text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description sufficiently explains return value (signals, edges, confidence moves). Missing pagination or ordering details, but acceptable for a delta tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, baseline 3. Description adds context by linking 'since' to the timestamp and 'thesisId' to the thesis, but no extra details beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves a per-thesis change delta since a timestamp. It specifies the resource (thesis) and the action (get delta), and distinguishes from the sibling 'get_changes' by adding 'delta' and listing what evolved.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for tracking changes over time but does not explicitly mention when not to use it or suggest alternatives like 'get_changes'. No guidance on selection between similar tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_congress_member (grade B)

Get a single Congress member by bioguide ID.

Parameters
Name | Required | Description | Default
id | Yes | Bioguide ID |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, and the description does not disclose any behavioral traits (e.g., read-only, rate limits, or side effects). 'Get' implies read-only but is not explicit.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, no fluff, directly conveys the tool's purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple lookup by ID with one parameter, the description is sufficient. No output schema exists, but description doesn't need to detail return values. Minimal but adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with 'Bioguide ID' for the parameter. The description adds no additional meaning beyond the schema, so baseline 3 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Get a single Congress member by bioguide ID', specifying the action (get) and the resource (Congress member), and distinguishes from sibling 'list_congress_members'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like 'list_congress_members'. No explicit when-to-use or when-not-to-use context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_contagion (grade A)

Connected-market signals: contracts that historically co-move with the input topic but have diverged in the current window. Surfaces "this market should have moved but didn't" trades.

Parameters
Name | Required | Description | Default
topic | No | Topic keyword (fed, election, ai) |
window | No | Lookback (e.g. 24h, 7d) |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses the tool returns signals of diverged co-moving contracts and mentions temporal aspects ('current window'), but lacks details on data freshness, measurement methodology, or any side effects. Basic transparency is present but not comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences efficiently convey the tool's purpose and value proposition. No redundancy, but could be slightly more structured (e.g., separate usage note). Still, it's well within acceptable limits for conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given moderate complexity (2 params, no output schema), the description adequately explains the tool's function and context. However, it omits return format or example output, which would improve completeness for an agent. Still, it provides sufficient context for basic usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear parameter descriptions for 'topic' and 'window'. The description adds conceptual context (contagion, divergence) but does not enhance parameter semantics beyond what the schema already provides. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states verb (get/surfaces), resource (contracts that co-move and diverge), and unique scope (connected-market signals, divergence). It effectively differentiates from sibling tools like get_market_diff or get_changes by focusing on historical co-movement and current divergence.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description implies use when looking for diverged co-moving markets but provides no explicit when-to-use or when-not-to-use guidance. No alternatives are named, leaving the agent to infer context from sibling tool names.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_context (grade A)

START HERE. Global market snapshot: top edges (mispriced contracts), price movers, highlights, traditional markets — live exchange data updated every 15 min. With thesisId + apiKey: thesis-specific context including causal tree, edges with orderbook depth, evaluation history, and track record. Global context is free-tier and rate-limited; API keys unlock higher limits.

Parameters
Name | Required | Description | Default
apiKey | No | SimpleFunctions API key. Required only for thesis-specific context. Get one at https://simplefunctions.dev/dashboard/keys or run: sf login |
thesisId | No | Thesis ID. Omit for global market snapshot on the free tier. |
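The two modes described above depend on which parameters are supplied, which a small argument builder can make explicit. This is a sketch under assumptions: the function name and the placeholder ID and key are hypothetical, and passing `apiKey` without `thesisId` is inferred from the note that API keys unlock higher limits, not confirmed by the schema.

```python
from typing import Optional


def context_arguments(thesis_id: Optional[str] = None,
                      api_key: Optional[str] = None) -> dict:
    """Return get_context arguments for the global or thesis-specific mode."""
    if thesis_id is None:
        # Global snapshot: works on the free tier with no arguments.
        # Assumption: an API key may still be passed to raise rate limits.
        return {"apiKey": api_key} if api_key else {}
    if api_key is None:
        raise ValueError("thesis-specific context requires apiKey alongside thesisId")
    return {"thesisId": thesis_id, "apiKey": api_key}


global_args = context_arguments()
thesis_args = context_arguments("thesis-123", "sf_example_key")  # placeholder values
print(global_args, thesis_args)
```

Failing fast on a missing key keeps an agent from issuing a thesis-scoped call that the server would presumably reject.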
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Discloses update frequency (every 15 min), rate limits, and authentication requirements. No contradictions or omissions noted.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four short sentences: an entry-point marker, one sentence per mode, and a note on rate limits. No wasted words. Front-loaded with 'START HERE' for priority.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, description adequately describes return contents for both modes. Could be more explicit about format, but sufficient for a starting-point tool with sibling list showing many other tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with parameter descriptions. Description adds significant value by explaining the interplay between apiKey and thesisId, and what each combination returns, beyond schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool provides a global market snapshot or thesis-specific context, with specific details like top edges, price movers, causal tree, etc. It distinguishes between two modes based on parameters.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'START HERE' as entry point. Tells when to use each mode (omit thesisId for global snapshot, provide for thesis-specific). Mentions free-tier rate limits and that API keys unlock higher limits, guiding usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_economic_anchors (grade A)

Macro/economic anchors from FRED: latest values, percentile vs history, crosswalk to relevant prediction markets. For grounding macro theses.

Parameters
Name | Required | Description | Default
series | No | FRED series ID (e.g. CPIAUCSL) |
category | No | Category (rates, inflation, employment, growth) |
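
As a rough sketch, a call to this tool could be expressed as an MCP `tools/call` JSON-RPC request. The tool name and the two parameter names come from the listing; the helper name `build_anchor_call`, the request id, and the idea that `category` must be one of the four listed values are assumptions made for illustration, not confirmed server behavior.

```python
import json

# Categories named in the parameter description. Treating them as the
# complete allowed set is an assumption.
KNOWN_CATEGORIES = {"rates", "inflation", "employment", "growth"}


def build_anchor_call(series=None, category=None, request_id=1):
    """Build an MCP tools/call request for get_economic_anchors (hypothetical helper)."""
    if category is not None and category not in KNOWN_CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    # Both parameters are optional, so only include the ones provided.
    arguments = {}
    if series is not None:
        arguments["series"] = series
    if category is not None:
        arguments["category"] = category
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "get_economic_anchors", "arguments": arguments},
    }


print(json.dumps(build_anchor_call(series="CPIAUCSL"), indent=2))
```

The listing does not say how `series` and `category` interact when both are supplied, so the sketch simply passes through whatever is given.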
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided; description implies read-only operation and describes outputs, but does not disclose authentication needs, rate limits, or detailed behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with no waste, front-loading the purpose and key outputs.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema; description covers return value types (latest values, percentiles, crosswalks). Could benefit from more detail on format or structure, but sufficient for a moderately complex tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for both parameters. The description adds minimal new information beyond clarifying the source (FRED) and context (macro anchors).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves macro/economic anchors from FRED, providing latest values, percentiles, and crosswalks to prediction markets, distinguishing it from other data retrieval tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Indicates usage for grounding macro theses, but does not explicitly mention when not to use it or suggest alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_edges (grade A)

Top mispriced markets across all theses, ranked by edge size. Shows where your model disagrees with the market. No auth = public thesis edges. Auth = private + public.

Parameters
Name | Required | Description | Default
limit | No | Max edges to return |
venue | No | Filter: kalshi or polymarket |
apiKey | No | SimpleFunctions API key. Omit for public edges only. |
minEdge | No | Minimum edge in cents |
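
Since the tool has no output schema, the following is only a client-side illustration of what "ranked by edge size" combined with `minEdge`, `venue`, and `limit` might mean. The record fields `ticker`, `venue`, and `edge_cents`, the sample values, and the choice to rank by absolute edge are all hypothetical; the listing does not state whether the server uses signed or absolute edges.

```python
def rank_edges(edges, min_edge=0, venue=None, limit=None):
    """Filter and rank hypothetical edge records by absolute edge size in cents."""
    selected = [
        e for e in edges
        if abs(e["edge_cents"]) >= min_edge
        and (venue is None or e["venue"] == venue)
    ]
    # Largest disagreement between model and market first.
    selected.sort(key=lambda e: abs(e["edge_cents"]), reverse=True)
    return selected if limit is None else selected[:limit]


# Made-up sample data, not real market output.
sample = [
    {"ticker": "MKT-A", "venue": "kalshi", "edge_cents": 4},
    {"ticker": "MKT-B", "venue": "polymarket", "edge_cents": -12},
    {"ticker": "MKT-C", "venue": "kalshi", "edge_cents": 9},
]

top = rank_edges(sample, min_edge=5, limit=2)
print([e["ticker"] for e in top])  # ['MKT-B', 'MKT-C']
```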
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses key behavioral traits: the ranking mechanism (by edge size), the effect of auth (public vs. private+public edges), and the focus on model-market disagreement. However, it lacks details on rate limits, error handling, or output format, which would be needed for a perfect score.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by clarifying details in three concise sentences. Every sentence adds value (e.g., explaining auth impact), with no wasted words or redundancy, making it highly efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (4 parameters, no annotations, no output schema), the description is adequate but has gaps. It covers the purpose, auth context, and ranking logic, but lacks details on output format, error cases, or deeper behavioral aspects like pagination. This makes it minimally viable for a tool with this parameter count and no structured support.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters (limit, venue, apiKey, minEdge). The description adds marginal value by implying the apiKey's role in auth (private vs. public edges), but does not provide additional semantics beyond what the schema offers, such as explaining 'edge' units or venue specifics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: it retrieves 'Top mispriced markets across all theses, ranked by edge size' and explains that it 'Shows where your model disagrees with the market.' This specifies the verb (get/retrieve), resource (edges/mispriced markets), and scope (across all theses), distinguishing it from siblings like get_markets or get_forecast by focusing on edge-based ranking of mispricing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use it: to find mispriced markets based on edge size. It distinguishes between public and private edges based on auth presence, but does not explicitly mention when not to use it or name specific alternatives among siblings (e.g., scan_markets or screen_markets might be related but are not referenced).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_evaluation_history (grade B)

Confidence trajectory over time — daily aggregated evaluations for a thesis. Use to analyze trends, detect convergence/divergence.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
thesisId | Yes | Thesis ID |
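
The description suggests using the daily series to analyze trends. One simple way to do that, sketched here under the assumption that the result can be reduced to one confidence value per day (the actual response format is not documented), is a least-squares slope over the day index. The function name and sample values are illustrative only.

```python
def daily_slope(values):
    """Least-squares slope of values against day index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two daily points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


# Hypothetical daily confidence values for one thesis.
confidence = [0.52, 0.55, 0.58, 0.61, 0.64]
slope = daily_slope(confidence)
print(f"{slope:+.3f} per day")
```

A positive slope indicates rising confidence over the window and a negative slope indicates falling confidence. Detecting convergence or divergence against market prices would need a second series, which this sketch does not cover.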
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It describes the tool as retrieving aggregated data over time, which implies a read-only operation, but does not specify aspects like rate limits, authentication needs (beyond the apiKey parameter), data format, or pagination. This leaves gaps in understanding how the tool behaves in practice.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loaded, consisting of two short sentences that directly state the tool's function and usage. There is no wasted language, and it efficiently communicates the core purpose without unnecessary elaboration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (a read operation with 2 parameters) and the absence of annotations and output schema, the description is moderately complete. It explains what data is retrieved but lacks details on return values, error handling, or behavioral constraints. This is adequate for a basic tool but leaves room for improvement in guiding the agent fully.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters (thesisId and apiKey) documented in the schema. The description does not add any additional meaning or context beyond what the schema provides, such as explaining how the thesisId relates to evaluations or the aggregation process. Baseline 3 is appropriate as the schema handles parameter documentation adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: to retrieve 'daily aggregated evaluations for a thesis' and analyze 'confidence trajectory over time.' It specifies the resource (thesis evaluations) and the verb (get/analyze), but does not explicitly differentiate from siblings like 'get_milestones' or 'get_world_state,' which might also involve thesis-related data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions 'use to analyze trends, detect convergence/divergence,' which implies a context for usage but does not provide explicit guidance on when to use this tool versus alternatives. No sibling tools are referenced, and there is no mention of prerequisites or exclusions, leaving the agent to infer usage based on the purpose alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_feed (grade B)

Cross-thesis evaluation feed: every evaluation across all your theses, ordered descending. Powers sf feed CLI.

Parameters
Name | Required | Description | Default
hours | No | Lookback hours (default 24) |
limit | No | Max rows |
apiKey | Yes | SimpleFunctions API key |
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided. Description only states basic behavior (returns evaluations, ordered descending). It does not disclose important details like pagination behavior via the limit parameter, authentication requirements beyond mentioning apiKey, or whether results are filtered by user.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise: two short sentences. Front-loaded with the key purpose. Every word is necessary and adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool is a straightforward list, the description is adequate but lacks details that would help an agent choose it over siblings like get_evaluation_history. No output schema means the description could clarify return format. Also no mention of rate limits or other behavioral constraints.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% (all three parameters have descriptions). The tool description adds no additional meaning beyond the schema for the parameters. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Purpose is clear: returns evaluations across all theses, ordered descending. The description also mentions it powers the CLI. It implicitly distinguishes from per-thesis evaluation tools like get_evaluation_history by emphasizing cross-thesis aggregation, but does not explicitly differentiate from siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit when-to-use guidance or alternatives. The description implies it's for a broad overview, but provides no guidance on when not to use it or how it compares to get_evaluation_history or other similar tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_fills (grade C)

Advanced: Recent trade fills on Kalshi.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
ticker | No | Filter by market ticker |
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'recent trade fills' but doesn't specify time ranges, pagination, rate limits, authentication needs beyond the apiKey parameter, or what 'Advanced:' entails operationally. This leaves significant gaps for a tool that likely involves financial data access.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise—a single sentence with no wasted words. It's front-loaded with the key information ('recent trade fills on Kalshi'), and the 'Advanced:' prefix, while minimal, adds context efficiently. Every part earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations, no output schema, and a tool that likely returns financial data (trade fills), the description is incomplete. It doesn't explain return values, error conditions, or behavioral traits like data recency or access constraints. For a tool with potential complexity in a trading context, this is inadequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters (apiKey and ticker) well-documented in the schema. The description adds no additional parameter semantics beyond implying filtering by 'recent' and 'Kalshi', which are already covered or inferred. This meets the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('get') and resource ('recent trade fills on Kalshi'), making the purpose specific and understandable. It distinguishes from siblings like 'get_orders' or 'get_settlements' by focusing on trade fills, though it doesn't explicitly contrast with them. The 'Advanced:' prefix adds context but doesn't detract from clarity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It mentions 'Advanced:' which might imply it's for experienced users, but offers no explicit when/when-not rules or references to sibling tools like 'get_orders' or 'get_settlements' for comparison. Usage is implied rather than stated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_forecast (grade C)

Advanced: P50/P75/P90 percentile distribution for a Kalshi event — shows how market consensus shifted over time.

Parameters
Name | Required | Description | Default
days | No | Days of history (default 7) |
apiKey | Yes | SimpleFunctions API key (needed for authenticated Kalshi calls). Get one at https://simplefunctions.dev/dashboard/keys |
eventTicker | Yes | Kalshi event ticker (e.g. KXWTIMAX-26DEC31) |
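
A minimal sketch of assembling arguments for this tool, checking the two required parameters before anything is sent. The parameter names and the example ticker come from the table; the helper name `forecast_arguments`, the placeholder key, and the rule that `days` must be a positive integer are assumptions, since the listing gives no type or range constraints.

```python
def forecast_arguments(api_key, event_ticker, days=None):
    """Assemble arguments for get_forecast (hypothetical helper)."""
    if not api_key:
        raise ValueError("apiKey is required")
    if not event_ticker:
        raise ValueError("eventTicker is required")
    arguments = {"apiKey": api_key, "eventTicker": event_ticker}
    # Leave days out when unset so the documented default (7) applies.
    if days is not None:
        if not isinstance(days, int) or days < 1:
            raise ValueError("days must be a positive integer (assumed constraint)")
        arguments["days"] = days
    return arguments


# "YOUR_API_KEY" is a placeholder, not a real credential.
args = forecast_arguments("YOUR_API_KEY", "KXWTIMAX-26DEC31", days=14)
print(args)
```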
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions the tool shows 'how market consensus shifted over time' which implies historical trend analysis, but doesn't disclose authentication requirements (though the apiKey parameter hints at this), rate limits, data freshness, or what format the percentile distribution is returned in. For a financial data tool with zero annotation coverage, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that conveys the core functionality. The 'Advanced:' prefix could be considered slightly redundant but doesn't significantly detract. The sentence is front-loaded with the main purpose and doesn't waste words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a financial data tool with no annotations and no output schema, the description is incomplete. It doesn't explain what the tool returns (beyond mentioning percentile distributions), doesn't specify data formats or units, and provides no context about Kalshi events themselves. Given the complexity of financial forecasting data and lack of structured output documentation, more completeness is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema - it doesn't explain how the 'days' parameter relates to the percentile calculations or provide examples of event ticker formats. Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves percentile distribution data (P50/P75/P90) for a Kalshi event and shows how market consensus shifted over time. It specifies the resource (Kalshi event) and verb (get/show distribution), but doesn't explicitly differentiate from sibling tools like 'get_markets' or 'scan_markets' which might provide related market data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools like 'get_markets' or 'scan_markets' that might provide overlapping functionality, nor does it specify prerequisites beyond the implied need for a Kalshi event ticker. The 'Advanced:' prefix suggests specialized use but offers no concrete usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_glossary_term (grade B)

Get a single glossary term with full definition and links.

Parameters
Name | Required | Description | Default
slug | Yes | Glossary slug |
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description does not disclose error handling (e.g., missing slug), idempotency, or any side effects. It only states it 'gets' data, which is safe by default, but lacks behavioral details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is one short sentence with no redundant words. It is appropriately concise for a simple tool, though it could be slightly more structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a one-parameter read tool without an output schema, the description is adequate but could specify the return format (e.g., object with definition and links) to improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage with a description for 'slug'. The description adds no extra meaning beyond the schema, so it meets the baseline but does not enhance understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (Get), the resource (single glossary term), and the output (full definition and links). It effectively distinguishes from siblings like list_glossary.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. It implies usage for retrieving a single term, but does not mention when to prefer it over list_glossary or other get tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_heartbeat_config (grade C)

Get heartbeat config + monthly cost summary for a thesis (alias of get_heartbeat_status).

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key |
thesisId | Yes | Thesis ID |
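
Because this tool is described as an alias of get_heartbeat_status, a client may want to collapse the two names before planning calls so the same data is not fetched twice. The sketch below shows one way to do that; the alias pair comes from the description, while the helper names and the alias table itself are illustrative and not part of the server.

```python
# Alias pair taken from the tool description. The table is a client-side
# convenience, not something the server exposes.
TOOL_ALIASES = {"get_heartbeat_config": "get_heartbeat_status"}


def canonical_tool(name):
    """Map an alias to its canonical tool name, leaving other names unchanged."""
    return TOOL_ALIASES.get(name, name)


def dedupe_tools(names):
    """Collapse aliases and drop repeats while keeping first-seen order."""
    seen = []
    for name in names:
        canonical = canonical_tool(name)
        if canonical not in seen:
            seen.append(canonical)
    return seen


print(dedupe_tools(["get_heartbeat_config", "get_heartbeat_status", "get_feed"]))
# ['get_heartbeat_status', 'get_feed']
```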
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden for behavioral disclosure. It only states the tool returns config and cost summary, but does not mention side effects, permissions, rate limits, or whether it is read-only. For a read operation, this is minimal but not misleading.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very short (one sentence), which is concise but arguably under-specified. It lacks structure or front-loading of key details. While every word serves a purpose, more context could be added without verbosity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, and the description only briefly mentions return content (config + monthly cost summary). Given the tool's simplicity and high schema coverage, it is adequate but could be improved by listing specific config fields or cost details to fully inform invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema coverage is 100%, with both parameters (apiKey, thesisId) already described in the schema. The description adds negligible extra meaning beyond what the schema provides. A baseline score of 3 is appropriate given high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool gets heartbeat config and monthly cost summary for a thesis, and explicitly identifies it as an alias of get_heartbeat_status. This provides a specific verb-resource pair and clarifies its relationship to a sibling tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No usage guidance is provided. The description does not indicate when to use this tool versus alternatives, nor does it mention any prerequisites or exclusions. Given the sibling set includes a nearly identical name (get_heartbeat_status), explicit guidance would be helpful.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_heartbeat_status (grade C)

Get heartbeat config + this month's cost summary for a thesis. See news/X scan intervals, model tier, budget usage.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
thesisId | Yes | Thesis ID |
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It implies a read-only operation ('Get'), but doesn't specify if it requires specific permissions, has rate limits, returns real-time or cached data, or details error conditions. The mention of 'scan intervals' and 'budget usage' hints at monitoring behavior, but lacks operational clarity for safe invocation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two efficient sentences that front-load the core purpose. The additional detail about 'news/X scan intervals, model tier, budget usage' is relevant but could be slightly more integrated. Overall, it's appropriately sized with minimal waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only tool with 2 parameters and no output schema, the description covers the basic purpose but lacks completeness. Without annotations or output schema, it should ideally specify return format, error handling, or data freshness. The mention of specific data points helps, but doesn't fully compensate for missing behavioral context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters (thesisId, apiKey) well-documented in the schema. The description adds no parameter-specific information beyond what the schema provides, such as format examples or constraints. Baseline 3 is appropriate as the schema handles parameter documentation adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves 'heartbeat config + this month's cost summary for a thesis', specifying both the resource (thesis) and what data is returned (config and cost summary). It distinguishes from siblings like 'get_balance' or 'get_milestones' by focusing on heartbeat-specific monitoring data, though it doesn't explicitly contrast with them.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'get_balance' (for general balance) or 'configure_heartbeat' (for setup). It mentions 'See news/X scan intervals, model tier, budget usage' as data points, but this is output detail, not usage context. No exclusions or prerequisites are stated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_highlights (grade A)

Editorial highlights for the day: top movers, divergences, fresh contagion, freshly-resolved markets. Curated summary view.

Parameters

No parameters

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It describes the output as a 'curated summary view' but does not disclose potential side effects, data sources, or access requirements. It appears to be a read operation, but this is not stated explicitly.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences with no superfluous words. It is front-loaded with the purpose and provides specific examples. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with no parameters and no output schema, the description is reasonably complete but could add more context about return format (e.g., is it a list of items?), ordering, or data recency. It covers the basic purpose but lacks some operational details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are no parameters, so by baseline it scores 4. The description adds meaning about the content of the output (top movers, divergences, etc.), which is helpful beyond the empty schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool provides 'editorial highlights for the day' with specific examples like 'top movers, divergences, fresh contagion, freshly-resolved markets'. It is a specific verb+resource (get highlights) and distinguishes from siblings like get_feed or get_briefing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for a curated daily summary but does not explicitly state when to use versus alternatives or any exclusions. There is no guidance on context or when not to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
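
Because get_highlights takes no parameters, a call reduces to the bare MCP `tools/call` envelope. The following is a minimal sketch of that JSON-RPC request body, assuming a standard MCP client; the helper name `build_tool_call` and the request id are illustrative, and the response shape is not documented on this page.

```python
import json
from typing import Optional


def build_tool_call(name: str, arguments: Optional[dict] = None, request_id: int = 1) -> dict:
    """Build an MCP `tools/call` JSON-RPC request body (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # A tool with no parameters is called with an empty arguments object.
        "params": {"name": name, "arguments": arguments or {}},
    }


request = build_tool_call("get_highlights")
print(json.dumps(request, indent=2))
```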

get_index_history (grade B)

Historical SimpleFunctions Index snapshots. Pre-computed every 15 minutes, stored since v2 launch (2026-04-09). For charting trends.

Parameters

Name | Required | Description | Default
days | No | Lookback days (default 7) | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses update frequency ('Pre-computed every 15 minutes') and data start date ('stored since v2 launch (2026-04-09)'), adding value beyond annotations (which are absent). However, it omits key behavioral details like read-only nature, rate limits, or output format, limiting transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise, with three short sentences that front-load the purpose and add essential details. No extraneous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one optional parameter and no output schema, the description provides sufficient context: purpose, data freshness, and storage start date. It adequately informs an agent, though output structure is omitted.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema covers the only parameter ('days') with a description. The tool description adds no additional meaning beyond the schema, so baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it provides 'Historical SimpleFunctions Index snapshots' and 'For charting trends,' indicating the resource and action. However, it does not explicitly differentiate from sibling tools like get_market_index or get_market_history, leaving room for confusion.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for charting trends but offers no explicit guidance on when to use this tool versus alternatives, nor any conditions or exclusions. Given many sibling get_* tools, this is insufficient for optimal tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
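
The get_index_history description says snapshots are pre-computed every 15 minutes, so the `days` lookback implies an upper bound on how many snapshots a call can return. The sketch below works out that bound; it assumes the series has no gaps and that each snapshot is one row, neither of which the description confirms. `expected_snapshots` is an illustrative helper, not part of the server.

```python
SNAPSHOT_INTERVAL_MINUTES = 15  # cadence stated in the tool description


def expected_snapshots(days: int = 7) -> int:
    """Upper bound on snapshots for a lookback, assuming a gap-free series."""
    minutes_in_window = days * 24 * 60
    return minutes_in_window // SNAPSHOT_INTERVAL_MINUTES


arguments = {"days": 7}  # the schema's default lookback
print(expected_snapshots(**arguments))  # 672
```

At the default lookback that is 96 snapshots per day, which may matter when deciding how much history to request for a chart.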

get_legislation (grade C)

Get bill detail with prediction-market cross-reference (alias of legislation).

Parameters

Name | Required | Description | Default
billId | Yes | Bill ID, e.g. "119-hr-22" | -

Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description bears full burden. It does not disclose any behavioral traits (e.g., read vs. write, authentication needs, rate limits, or output behavior). The phrase 'prediction-market cross-reference' hints at extra data but is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence with no extraneous information. It could benefit from slight structuring (e.g., listing behavior), but it is short and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the low complexity (one parameter, no output schema), the description is minimally adequate. It provides the core purpose but lacks completeness regarding return format or any side effects. A more complete description would help the agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'billId' is fully described in the input schema with an example. The description adds minimal value beyond the schema, mentioning prediction-market cross-reference but no additional parameter context. Schema coverage is 100%, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves bill details and mentions a prediction-market cross-reference. It explicitly calls itself an alias of 'legislation', helping differentiate from a direct sibling. However, it could be more specific about the nature of the detail.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. While it mentions being an alias, it does not explain the context for using this over other related tools like 'list_legislation' or 'legislation'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_market_detail (grade A)

Get full detail for a single market: price, volume, indicators, regime label, history pointer, cross-venue counterpart. Lower-level than inspect_ticker — raw JSON only.

Parameters

Name | Required | Description | Default
depth | No | Orderbook depth levels to include (0 = none) | -
ticker | Yes | Market ticker | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It clarifies the output is 'raw JSON only', a behavioral trait. However, it does not disclose potential side effects, authorization requirements, or rate limits, leaving some behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: two sentences with no redundant information. It front-loads the primary purpose and immediately provides context that distinguishes it from a sibling, making every word earn its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description lists key return elements (price, volume, indicators, etc.) and compares to 'inspect_ticker'. The sibling context is large, but the description fully captures the tool's role and output, making it complete for its complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema covers both parameters ('ticker', 'depth') with descriptions, achieving 100% coverage. The description does not add new meaning beyond the schema, meeting the baseline expectation but not exceeding it.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves full detail for a single market, listing specific data types (price, volume, indicators, etc.). It explicitly distinguishes from the sibling 'inspect_ticker' by noting it is lower-level and returns raw JSON only, making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a clear comparison with the sibling 'inspect_ticker' (lower-level vs. higher-level), which helps the agent decide which tool to use. However, it does not explicitly state when not to use this tool or mention alternatives among other sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_market_diff (grade A)

Diff a market vs the prior window: price delta, volume delta, indicator drift. For "what changed in the last 6h?" questions.

Parameters

Name | Required | Description | Default
sort | No | Sort field (e.g. priceDelta) | -
topic | No | Topic keyword if no tickers | -
window | No | Lookback (default 24h) | -
tickers | No | Comma-separated tickers | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It mentions comparing to a prior window and outputs, but does not clarify side effects (e.g., read-only), authentication needs, or rate limits. The description is adequate but not comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with the main action and outputs. Every sentence serves a purpose, with no redundant information. Highly efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 4 parameters, no required fields, and no output schema, the description covers the core purpose and a typical use case. It lacks details on return format or edge cases, but given the tool's simplicity, it is reasonably complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description does not add significant meaning beyond the schema; it mentions output deltas but not how parameters like 'sort' or 'window' work. Minimal added value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's action (diff), resource (market), and specific outputs (price delta, volume delta, indicator drift). It also provides a concrete use case ('what changed in the last 6h?'), distinguishing it from sibling tools like get_changes or get_world_delta.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives a clear usage scenario ('For "what changed in the last 6h?" questions'), implying when to use it. However, it does not explicitly state when not to use it or mention alternatives among siblings, leaving some inference needed.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
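
The get_market_diff schema hints at one parameter interaction: `topic` is described as the fallback "if no tickers". A rough sketch of how a caller might assemble the arguments under that reading follows. The tickers are placeholders, `market_diff_args` is a hypothetical helper, and the "6h" window string is inferred from the description's example and the "24h" default rather than from a documented format.

```python
from typing import Optional, Sequence


def market_diff_args(
    tickers: Optional[Sequence[str]] = None,
    topic: Optional[str] = None,
    window: str = "24h",          # schema default
    sort: Optional[str] = None,   # e.g. "priceDelta", per the schema
) -> dict:
    """Assemble get_market_diff arguments, preferring tickers over topic."""
    args: dict = {"window": window}
    if tickers:
        args["tickers"] = ",".join(tickers)  # schema: comma-separated tickers
    elif topic:
        args["topic"] = topic                # schema: topic keyword if no tickers
    # If neither is given, both are omitted; server behavior is undocumented.
    if sort:
        args["sort"] = sort
    return args


# Placeholder tickers, not real market identifiers.
print(market_diff_args(tickers=["TICKER-A", "TICKER-B"], window="6h", sort="priceDelta"))
```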

get_market_history (grade A)

Get rolling 7-day price + indicator history for a single market. For trajectory questions and chart rendering.

Parameters

Name | Required | Description | Default
ticker | Yes | Market ticker | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, description carries full behavioral burden. It discloses the rolling 7-day window and that it returns price+indicator history, but omits specifics like which indicators, data resolution, or response format. Acceptable but leaves questions about exact output.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two compact sentences. First sentence states action and scope, second provides usage context. Every word earns its place with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema and description does not hint at return structure (e.g., array of objects, fields). For a charting/data tool, this leaves ambiguity. However, for a simple single-param tool, the description is minimally adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds meaning beyond the schema: 'for a single market' clarifies the ticker's context and that it's not for indices or batches. This elevates from baseline 3 to 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Get rolling 7-day price + indicator history for a single market', specifying the exact data scope and time window. It distinguishes from siblings like get_index_history (which is for indices) and get_market_detail (which provides different data).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly mentions two use cases: 'For trajectory questions and chart rendering', giving clear context for when to use. Does not explicitly exclude when not to use or name alternatives, but the guidance is sufficient for a focused data retrieval tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_market_index (grade A)

SimpleFunctions Prediction Market Index v2. Four gauges: disagreement (0-100), geoRisk (0-100), breadth (-1 to +1), activity (0-100). Updated every 15 minutes.

Parameters

No parameters

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description provides partial transparency: update frequency (every 15 minutes) and value ranges. However, it does not disclose behavior like rate limits, authentication requirements, or consequences of calling between updates.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three short sentences, front-loaded with purpose, no unnecessary words. Efficiently conveys purpose and key details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description adequately explains return values (four gauges with ranges). Could be more complete by specifying the output structure (e.g., object with named fields), but sufficient for a simple tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has zero parameters (100% coverage), so baseline is 4. The description adds meaning about the output (four gauges with ranges) which helps the agent understand what to expect.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it retrieves the prediction market index with specific gauges (disagreement, geoRisk, breadth, activity) and their ranges. It distinguishes from 'get_index_history' by implying current snapshot.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool vs alternatives like 'get_index_history' or other data tools. The description only mentions update frequency but no context on appropriate scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
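
The get_market_index description documents a range for each of the four gauges, which a caller can use as a sanity check on whatever comes back. The sketch below assumes the response is a flat object keyed by the gauge names in the description; that shape is a guess, since no output schema is published, and the sample values are made up.

```python
from typing import List

# Ranges as stated in the tool description.
GAUGE_RANGES = {
    "disagreement": (0.0, 100.0),
    "geoRisk": (0.0, 100.0),
    "breadth": (-1.0, 1.0),
    "activity": (0.0, 100.0),
}


def out_of_range(index: dict) -> List[str]:
    """Return the names of gauges that are missing or outside their documented range."""
    problems = []
    for name, (low, high) in GAUGE_RANGES.items():
        value = index.get(name)
        if value is None or not (low <= value <= high):
            problems.append(name)
    return problems


# Hypothetical response payload with invented values.
sample = {"disagreement": 42.0, "geoRisk": 17.5, "breadth": -0.2, "activity": 63.0}
print(out_of_range(sample))  # []
```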

get_market_microstructure_history (grade A)

Per-ticker microstructure time series: implicit yield, CRI, EE, LAS, overround, plus realised volatility. Used for charting indicator drift.

Parameters

Name | Required | Description | Default
days | No | Lookback days (default 7) | -
ticker | Yes | Market ticker | -
interval | No | Bucketing | -

Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must cover behavioral traits. It mentions the tool returns time series data but does not disclose data retention, update frequency, error handling, or any side effects. The lack of behavioral context limits tool selection.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, each adding clear value: the first defines the output, the second states the usage. No wasted words, well-front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema, the description lists the returned metrics, which partially compensates. However, it lacks details on data structure (e.g., array of objects, timestamps). Still, it provides sufficient context for a simple time series tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All three parameters have descriptions in the input schema (100% coverage), so the baseline is 3. The description adds context about the output metrics, but no additional parameter meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns per-ticker microstructure time series with specific metrics like implicit yield, CRI, etc., and explicitly mentions its use for charting indicator drift. This effectively distinguishes it from siblings like get_market_history.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for charting indicator drift but does not explicitly compare to alternative tools or provide when-not-to-use guidance. Without clear differentiation, an agent might not know when to choose this over similar tools like get_market_history.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_markets (grade A)

Get traditional market prices via Databento. Default: SPY, VIX, TLT, GLD, USO. Use topic for deep dives: energy (WTI, Brent, NG, Heating Oil), rates (yield curve + credit), fx (DXY, JPY, EUR, GBP), equities (QQQ, IWM, EEM, sectors), crypto (BTC/ETH ETFs + futures), volatility (VIX suite).

Parameters

Name | Required | Description | Default
topic | No | Deep dive topic: energy, rates, fx, equities, crypto, volatility. Omit for core 5. | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It describes what data is returned (market prices) and the default behavior (core 5 markets). However, it doesn't disclose important behavioral traits like whether this is a read-only operation, potential rate limits, authentication requirements, data freshness, or error handling. The description adds some context but leaves significant gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately concise at three sentences. The first states the core purpose, the second gives the default symbols, and the third explains the parameter usage with specific examples. Every sentence earns its place, though the third is somewhat dense with multiple topic examples packed together.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (1 optional parameter, no output schema, no annotations), the description is somewhat complete but has gaps. It explains what data is returned and how to use the parameter, but doesn't describe the return format, data structure, or any limitations. For a data retrieval tool with no output schema, more information about the response would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage for its single parameter, so the baseline is 3. The description adds significant value by explaining the parameter's purpose ('Use topic for deep dives') and providing concrete examples of what each topic value returns (e.g., 'energy (WTI, Brent, NG, Heating Oil)'). This goes beyond the schema's generic description and helps the agent understand the parameter's practical use.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get traditional market prices via Databento.' It specifies the verb ('Get') and resource ('traditional market prices'), and mentions the data source. However, it doesn't explicitly distinguish this tool from sibling tools like 'scan_markets' or 'screen_markets', which appear related but have different purposes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear usage guidance: 'Default: SPY, VIX, TLT, GLD, USO. Use topic for deep dives...' It explains when to use the optional parameter (for specific topic areas) versus when to omit it (for core 5 markets). However, it doesn't explicitly state when NOT to use this tool or mention alternatives among siblings like 'scan_markets' or 'query_databento'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
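
The get_markets description names six deep-dive topics and says to omit `topic` for the core five symbols. One way a client might guard that parameter before calling is sketched below. Treating the six listed topics as exhaustive is an assumption (the server may accept others), and `markets_args` is an illustrative helper.

```python
from typing import Optional

# Topics listed in the tool description; assumed here to be the full set.
DEEP_DIVE_TOPICS = {"energy", "rates", "fx", "equities", "crypto", "volatility"}


def markets_args(topic: Optional[str] = None) -> dict:
    """Build get_markets arguments; no topic means the default core-5 symbols."""
    if topic is None:
        return {}  # omit the parameter entirely, as the schema suggests
    if topic not in DEEP_DIVE_TOPICS:
        raise ValueError(f"unknown topic {topic!r}; expected one of {sorted(DEEP_DIVE_TOPICS)}")
    return {"topic": topic}


print(markets_args())          # {}
print(markets_args("energy"))  # {'topic': 'energy'}
```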

get_milestones (grade B)

Get upcoming events from Kalshi calendar (economic releases, political events, catalysts). Free-tier and rate-limited.

Parameters

Name | Required | Description | Default
hours | No | Hours ahead to look (default 168 = 1 week) | -
category | No | Filter by category (e.g. Economics, Politics, Sports) | -

Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description bears full disclosure burden. It only mentions rate limiting, omitting other traits like data freshness, authentication requirements, or whether past events are included.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short sentences, straight to the point with no redundant information. It front-loads the purpose and adds a constraint note efficiently.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with 2 optional params and no output schema, the description covers basic purpose and a constraint. However, it lacks clarity on the tool's range, overlap with siblings, and usage context, making it minimally adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema already provides descriptions for both parameters (hours and category), achieving 100% coverage. The tool description adds no extra parameter semantics, so baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves upcoming events from the Kalshi calendar, specifying types like economic releases and political events. However, it does not differentiate from the sibling tool 'get_calendar', leaving ambiguity for the agent.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions 'Free-tier and rate-limited', providing some constraints, but fails to specify when to use this tool over alternatives like 'get_calendar' or under what circumstances to avoid it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_newmarkets (Grade B)

Recently-listed markets (new contracts) on Kalshi and Polymarket. For finding fresh trading opportunities.

Parameters
Name | Required | Description | Default
hours | No | Lookback hours (default 24) | -
limit | No | Max rows | -
venue | No | kalshi or polymarket | -
minLiquidity | No | Minimum liquidity threshold | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are available, so the description must bear the full burden. It only states 'recently-listed' and 'fresh opportunities' but fails to disclose what 'recently' means, data freshness guarantees, rate limits, or any mutation behavior. The description is too vague.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very short (two sentences) and front-loaded with the core purpose. No unnecessary information, but it could be slightly more informative without being verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 4 parameters and no output schema, the description is minimal. It does not explain the return format, pagination, or the exact definition of 'new' (e.g., hours-based). Given many sibling tools, more context would help the agent select correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has all four parameters described, achieving 100% coverage. The description does not add extra meaning beyond the schema, so the baseline score of 3 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'recently-listed markets' and specifies the venues (Kalshi and Polymarket), which distinguishes it from sibling tools like get_markets or scan_markets. The purpose is specific and actionable for finding fresh trading opportunities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for finding fresh opportunities but does not explicitly state when to use this tool versus alternatives like get_markets or scan_markets. No guidance on when not to use it is provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
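
As a minimal sketch of how an agent might call this tool, the snippet below builds an MCP JSON-RPC `tools/call` request for get_newmarkets. The parameter names come from the schema above; the helper name `build_newmarkets_call`, the request id, and the example values are assumptions for illustration, and the unit of minLiquidity is not stated in the listing.

```python
import json

# Venue values taken from the schema row "kalshi or polymarket".
ALLOWED_VENUES = {"kalshi", "polymarket"}


def build_newmarkets_call(hours=24, limit=None, venue=None,
                          min_liquidity=None, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for get_newmarkets.

    Hypothetical helper: only the parameter names are from the listing.
    """
    if venue is not None and venue not in ALLOWED_VENUES:
        raise ValueError(f"venue must be one of {sorted(ALLOWED_VENUES)}")

    # Only send optional arguments the caller actually set.
    arguments = {"hours": hours}
    if limit is not None:
        arguments["limit"] = limit
    if venue is not None:
        arguments["venue"] = venue
    if min_liquidity is not None:
        arguments["minLiquidity"] = min_liquidity

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "get_newmarkets", "arguments": arguments},
    }


request = build_newmarkets_call(hours=6, venue="kalshi", min_liquidity=100)
print(json.dumps(request, indent=2))
```

Because the description does not say what "recently" means, passing hours explicitly keeps the lookback window visible in the call rather than relying on the default.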

get_opinion (Grade C)

Get a single opinion/essay by slug.

Parameters
Name | Required | Description | Default
slug | Yes | Opinion slug | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must fully convey behavioral traits. It implies a read-only operation but does not disclose failure behavior (e.g., invalid slug), rate limits, or whether full content is returned. The agent lacks crucial behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence, front-loaded with the action and resource. Every word is necessary, but it could be slightly more informative without losing conciseness (e.g., specifying that it returns the full essay).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and no annotations, the description leaves ambiguity about the response structure. The agent does not know if it returns content, metadata, or both. Siblings like 'get_context' suggest similar retrieval but differ in scope, yet this is not clarified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema already describes the 'slug' parameter. The description merely repeats 'by slug', adding no further meaning (e.g., format, endpoint variations). Baseline 3 is appropriate as the description adds no extra value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves a single opinion/essay identified by a slug. The verb 'Get' and resource 'opinion/essay' are specific. However, it does not differentiate from siblings like 'get_answer' or 'get_context', nor does it highlight that 'list_opinions' is for multiple opinions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. For instance, it doesn't mention that this is for fetching one specific opinion, while 'list_opinions' returns a list. There are no conditions, prerequisites, or exclusions stated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
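
The reviews repeatedly note that no annotations are provided. As a hedged illustration, the sketch below shows what a tool definition for get_opinion could look like with MCP tool annotations attached. The hint names (readOnlyHint, destructiveHint, idempotentHint, openWorldHint) are from the MCP specification; the hint values are assumptions inferred from the word "Get", not confirmed by the server, and `missing_hints` is a hypothetical helper.

```python
import json

# Behavior hints defined for tool annotations in the MCP specification.
KNOWN_HINTS = ("readOnlyHint", "destructiveHint", "idempotentHint", "openWorldHint")

opinion_tool = {
    "name": "get_opinion",
    "description": "Get a single opinion/essay by slug.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "slug": {"type": "string", "description": "Opinion slug"},
        },
        "required": ["slug"],
    },
    # Assumed values: the listing does not state how the tool behaves.
    "annotations": {
        "readOnlyHint": True,
        "idempotentHint": True,
    },
}


def missing_hints(tool):
    """Return the known behavior hints a tool definition leaves unspecified."""
    annotations = tool.get("annotations", {})
    return [hint for hint in KNOWN_HINTS if hint not in annotations]


print(json.dumps(opinion_tool, indent=2))
print("hints still unspecified:", missing_hints(opinion_tool))
```

Annotations are hints only; the description would still need to cover failure behavior such as an unknown slug.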

get_orders (Grade C)

Advanced: Current resting orders on Kalshi.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | -
status | No | Filter by status: resting, canceled, executed. Default: resting | resting

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It mentions 'resting orders' which implies read-only retrieval of pending orders, but doesn't specify authentication requirements (beyond the apiKey parameter), rate limits, pagination, error conditions, or what data is returned. For a financial API tool with zero annotation coverage, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise (6 words) and front-loaded with the core functionality. However, the 'Advanced:' prefix adds minimal value and could be considered slightly wasteful. Overall, it's efficient but could be more informative.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a financial trading API tool with no annotations and no output schema, the description is inadequate. It doesn't explain what data is returned (order details, format, structure), doesn't mention authentication requirements beyond the parameter, and provides no context about Kalshi's platform. The description should do more to compensate for the lack of structured metadata.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents both parameters (apiKey and status). The description doesn't add any parameter semantics beyond what's in the schema - it doesn't explain what 'resting orders' means in relation to the status parameter or provide context about Kalshi's order lifecycle. Baseline 3 is appropriate when schema does the documentation work.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('get') and resource ('current resting orders on Kalshi'), making the purpose specific and understandable. It doesn't explicitly distinguish from sibling tools like 'get_fills' or 'get_settlements', but the resource specificity ('resting orders') provides implicit differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'get_fills' (for executed orders) or 'cancel_intent' (for order management). The 'Advanced:' prefix is vague and doesn't clarify appropriate contexts or prerequisites for usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_positions (Grade A)

Advanced: Open Kalshi positions with live P&L. Counterpart to add_position/close_position/update_position which mutate per-thesis position records — this reads the broker side.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Requires Kalshi credentials configured via sf setup. | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavior on its own. It states that this is a read operation with live P&L, but does not detail the return format, error conditions, or permissions beyond the apiKey. It could be more transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two short sentences, every part valuable, front-loaded with the purpose and followed by the contrast with the mutating sibling tools.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema; description mentions live P&L but does not specify the output structure or potential errors. Adequate for a simple read but not fully complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the single parameter is already well described. The description adds no extra parameter details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it reads positions (not mutates) with live P&L, and distinguishes from sibling mutate tools like add_position/close_position/update_position.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly calls the tool the read-side counterpart to add_position/close_position/update_position, so the agent knows when to use it instead of those.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_public_skill (Grade B)

Get a single published skill by slug.

Parameters
Name | Required | Description | Default
slug | Yes | Public skill slug | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full responsibility. It only states the basic purpose without disclosing behavioral traits such as read-only nature, error handling (e.g., missing slug), or any side effects. The description is insufficient for an agent to understand behavior beyond the core operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that is concise and front-loaded. Every word carries meaning, and there is no fluff. It efficiently communicates the core function.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one parameter, no output schema), the description provides the essential purpose but lacks details like how to identify a published skill, typical return structure, or error cases. It is minimally complete but could benefit from a hint about the response format.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage, the schema already documents the 'slug' parameter. The description adds marginal value by confirming the slug is for a published skill. Baseline 3 is appropriate as the description does not deepen understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Get a single published skill by slug' clearly specifies the action (get), resource (a single published skill), and the method (by slug). It distinguishes from siblings like 'browse_public_skills' (likely listing) and 'get_skills' (possibly private).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description does not mention prerequisites, typical use cases, or conditions like needing a slug. There are no 'when to use' or 'when not to use' instructions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_regime_scan (Grade B)

Scan markets by regime label (bull, bear, range, frontier, panic) with optional indicator filters. For regime-based screening.

Parameters
Name | Required | Description | Default
sort | No | Sort field (e.g. as, score) | -
label | No | Regime label filter | -
limit | No | Max rows (default 50) | -
order | No | Sort order | -
venue | No | kalshi or polymarket | -
hasEdge | No | Filter to markets with non-trivial SimpleFunctions edge | -
eventType | No | binary, scalar, ladder | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided; description carries full burden. It fails to mention behavioral traits like read-only nature, pagination, rate limits, or authentication requirements. The description is too brief for a tool with no annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with no filler. Front-loaded with purpose and regime labels. Could be slightly more structured but remains concise and to the point.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for a simple scan tool with fully described schema. Lacks details on output or return format since no output schema exists. For a tool with 7 parameters and no annotations, more context would help.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are already documented. The description adds 'with optional indicator filters' as context but doesn't elaborate on specific parameters beyond the schema. Baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Scan markets by regime label' with enumerated labels, and adds 'with optional indicator filters'. It distinguishes its regime-based focus from generic scanning tools, but doesn't explicitly differentiate from siblings like scan_markets or screen_markets.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implied usage: regime-based screening. No explicit when-to-use, when-not-to-use, or alternatives. The phrase 'For regime-based screening' gives context but lacks guidance on selecting this tool over similar ones.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
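
Since the schema lists values for label, venue, and eventType only in free text, a client may want to check them before calling. The sketch below assembles a get_regime_scan arguments object under the assumption that the values named in the description and schema are the only accepted ones; the listing does not confirm that the server enforces this, and `regime_scan_arguments` is a hypothetical helper.

```python
# Value sets copied from the tool description and parameter table.
REGIME_LABELS = {"bull", "bear", "range", "frontier", "panic"}
VENUES = {"kalshi", "polymarket"}
EVENT_TYPES = {"binary", "scalar", "ladder"}


def regime_scan_arguments(label=None, venue=None, event_type=None,
                          has_edge=None, limit=50):
    """Assemble the `arguments` object for a get_regime_scan call.

    The limit default of 50 mirrors the schema text "Max rows (default 50)".
    """
    arguments = {"limit": limit}
    checks = [
        ("label", label, REGIME_LABELS),
        ("venue", venue, VENUES),
        ("eventType", event_type, EVENT_TYPES),
    ]
    for name, value, allowed in checks:
        if value is None:
            continue  # optional filter left unset
        if value not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
        arguments[name] = value
    if has_edge is not None:
        arguments["hasEdge"] = has_edge
    return arguments


args = regime_scan_arguments(label="panic", venue="polymarket", has_edge=True)
print(args)
```

The sort and order parameters are left out because the listing does not enumerate their valid values.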

get_schedule (Grade B)

Get exchange status and trading hours

Parameters

No parameters

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It implies a read-only operation ('get') but doesn't specify whether it requires authentication, has rate limits, returns real-time or cached data, or details the response format. This leaves significant gaps for a tool that likely interacts with market data.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that directly states the tool's purpose without any fluff or redundancy. It's front-loaded and wastes no words, making it ideal for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's likely complexity (providing exchange status and trading hours), the description is minimal. With no annotations and no output schema, it lacks details on authentication needs, data freshness, or return structure. However, for a zero-parameter tool, it meets a basic threshold but could be more informative about behavioral aspects.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has 0 parameters with 100% schema description coverage, so no parameter documentation is needed. The description appropriately doesn't mention parameters, earning a baseline high score since it doesn't need to compensate for any gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('get') and resources ('exchange status and trading hours'), making it immediately understandable. However, it doesn't differentiate from sibling tools like 'get_markets' or 'get_balance' that might also provide related market information, preventing a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'get_markets' or 'get_forecast', nor does it mention any prerequisites or exclusions. It merely states what the tool does without contextual usage information.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_settlements (Grade C)

Advanced: Settled contracts with realized P&L.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | -
ticker | No | Filter by market ticker | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It mentions 'advanced' and 'realized P&L' but doesn't disclose behavioral traits like whether this is a read-only operation, authentication requirements beyond the apiKey parameter, rate limits, or what the output format looks like. For a financial data tool with no annotations, this leaves significant gaps in understanding its behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that gets straight to the point without unnecessary words. It's appropriately sized for a simple data retrieval tool, though it could be slightly more structured by separating purpose from context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and no output schema, the description is incomplete. It doesn't explain what 'settled contracts' entail, how 'realized P&L' is calculated, or what the return data looks like. For a financial tool with potential complexity, this leaves too much undefined for effective agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters (apiKey and ticker). The description doesn't add any parameter-specific information beyond what's in the schema, such as explaining how ticker filtering works or providing examples. With high schema coverage, the baseline is 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool retrieves 'settled contracts with realized P&L,' which is a clear verb+resource combination. It specifies 'advanced' context and includes 'realized P&L' to differentiate from generic settlement data. However, it doesn't explicitly distinguish from siblings like get_fills or get_orders, which might also involve financial transactions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It mentions 'advanced' but doesn't clarify what makes it advanced or when to prefer it over other financial data tools like get_fills or get_orders. There are no explicit when/when-not statements or named alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
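
One way to picture the missing "use X instead of Y when Z" guidance is a lookup from order lifecycle stage to tool. The mapping below is an assumption pieced together from the reviews (resting orders for get_orders, executed orders for get_fills, settled contracts for get_settlements), not something the server documents; note that get_orders also accepts an executed status filter, so the boundaries may overlap in practice.

```python
# Assumed mapping, inferred from the review text rather than server docs.
TOOL_BY_STAGE = {
    "resting": "get_orders",       # orders still waiting on the book
    "executed": "get_fills",       # orders that have traded
    "settled": "get_settlements",  # contracts resolved, with realized P&L
}


def tool_for_stage(stage):
    """Pick the tool name for a given order lifecycle stage."""
    try:
        return TOOL_BY_STAGE[stage]
    except KeyError:
        known = ", ".join(sorted(TOOL_BY_STAGE))
        raise ValueError(f"unknown stage {stage!r}; expected one of: {known}") from None


for stage in ("resting", "executed", "settled"):
    print(stage, "->", tool_for_stage(stage))
```

Guidance of this kind, stated in each description, would let an agent choose among the three tools without trial calls.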

get_skills (Grade A)

List authenticated user's skills. (Public skills are at browse_public_skills.)

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, but the description implies a read-only, authenticated operation and scopes to the user's own skills. Does not detail error cases or side effects, but for a list operation, this is adequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Exceptionally concise: one sentence plus a parenthetical note. Every word adds value, no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple list tool with good schema coverage and no output schema, the description fully covers purpose, scope, and alternative tool. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% for the single parameter (apiKey). The description adds context by linking the apiKey to the authenticated user, going beyond the generic schema description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists the authenticated user's skills, distinguishing it from the sibling tool 'browse_public_skills' for public skills.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly tells when to use this tool (own skills) and directs to 'browse_public_skills' for public skills, providing clear usage guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
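
For a quick sense of how the six dimension scores relate to each other, the sketch below averages the get_skills scores listed above. The listing does not say how per-dimension scores are weighted or converted into the letter grade, so a plain unweighted mean is an assumption made only for illustration.

```python
from statistics import mean

# Scores for get_skills, copied from the six dimensions above.
get_skills_scores = {
    "Behavior": 4,
    "Conciseness": 5,
    "Completeness": 5,
    "Parameters": 4,
    "Purpose": 5,
    "Usage Guidelines": 5,
}

# Unweighted mean: an assumed aggregation, not the site's stated method.
average = mean(get_skills_scores.values())
weakest = min(get_skills_scores, key=get_skills_scores.get)

print(f"mean score: {average:.2f} / 5")
print(f"lowest-scoring dimension: {weakest}")
```

Behavior and Parameters tie for the lowest score here; `min` reports Behavior because it comes first in the dictionary.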

get_technical (Grade B)

Get a single technical reference doc by slug.

Parameters
Name | Required | Description | Default
slug | Yes | Technical doc slug | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden but only states the action. It does not disclose behavioral traits such as rate limits, authentication needs, or side effects. The verb 'Get' implies a read-only call, which is the only behavioral signal it conveys.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence of 8 words, front-loaded with the verb and resource, and contains no superfluous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity of the tool (one param, no output schema), the description is minimally adequate but lacks context about the return value or behavior. It is not incomplete but could be more helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents the 'slug' parameter. The description adds no extra meaning beyond what the schema provides, meeting the baseline of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and the resource 'technical reference doc', with the parameter 'slug' mentioned. It distinguishes from siblings like get_glossary_term by specifying 'technical reference doc'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like get_glossary_term or get_market_detail. The description does not provide context for usage exclusions or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_thesis_context (A)

Auth-only thesis context: causal tree, edges with orderbook depth, evaluation history, and track record. Use get_context with no thesisId for the global market snapshot.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key | -
thesisId | Yes | Thesis ID | -
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions 'Auth-only' hinting at authentication needs but does not disclose behavioral traits like error handling, rate limits, or what happens if the thesisId is invalid. The description focuses on content rather than behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences long, front-loading the key information in the first sentence and providing usage guidance in the second. No redundant text; every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the moderate complexity (2 params, no output schema), the description lists the specific data elements returned (causal tree, edges, evaluation history, track record), providing a good sense of the output. However, it does not specify the return format or whether pagination applies, which would elevate completeness further.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with two string parameters (apiKey, thesisId). The description adds no additional meaning beyond the schema; it does not explain the format or constraints of these parameters. Baseline 3 is appropriate as schema already covers them.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states what the tool does: returns a thesis-specific context including causal tree, edges with orderbook depth, evaluation history, and track record. It distinguishes from sibling tool 'get_context' by noting that 'get_context' without thesisId provides the global snapshot.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly tells when to use this tool (for a specific thesis) and provides an alternative: 'Use get_context with no thesisId for the global market snapshot.' This leaves no ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_trade_ideas (A)

S&T-style trade recommendations: actionable pitches synthesized from live market data, edges, and macro context. Each idea has conviction level, catalyst timing, direction, and risk. Free-tier and rate-limited. START HERE when looking for what to trade.

Parameters
Name | Required | Description | Default
limit | No | Max ideas (1-10) | -
category | No | Filter: macro, geopolitics, crypto, policy, event | -
freshness | No | Max cache age: 1h, 6h, 12h, 1d | 12h
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, description carries behavioral disclosure. It notes 'Free-tier and rate-limited' and 'actionable pitches synthesized from live market data', but does not clarify if ideas are hypothetical or real, or if the tool has any side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four short sentences that front-load the core purpose. Could be restructured with bullet points for readability, but no excessive verbosity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description hints at return structure (conviction, catalyst, direction, risk). Covers purpose and key constraints. Might lack mention of pagination or if limit is exact.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description restates parameter purposes but adds no extra meaning beyond the schema descriptions (e.g., filter options, cache age, limit range).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it provides S&T-style trade ideas with conviction, catalyst, direction, and risk. The phrase 'START HERE when looking for what to trade' distinguishes it from siblings like get_technical or get_edges.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'START HERE when looking for what to trade', guiding primary use. Mentions free-tier and rate-limited, but doesn't specify when to use alternatives like get_technical or get_forecast.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
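
The parameter constraints listed for get_trade_ideas can be checked on the client before a call is spent against the rate limit. The following is a minimal sketch, not the server's own validation: `build_trade_ideas_args` is a hypothetical helper, the allowed values are copied from the parameter table, and reading the freshness choices as 1h, 6h, 12h, 1d is an assumption.

```python
from typing import Any, Optional

# Values copied from the parameter table; whether the server accepts
# anything else is unknown (assumption).
CATEGORIES = {"macro", "geopolitics", "crypto", "policy", "event"}
FRESHNESS = {"1h", "6h", "12h", "1d"}


def build_trade_ideas_args(
    limit: Optional[int] = None,
    category: Optional[str] = None,
    freshness: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: validate get_trade_ideas arguments client-side."""
    args: dict[str, Any] = {}
    if limit is not None:
        if not 1 <= limit <= 10:
            raise ValueError("limit must be between 1 and 10")
        args["limit"] = limit
    if category is not None:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        args["category"] = category
    if freshness is not None:
        if freshness not in FRESHNESS:
            raise ValueError(f"unknown freshness: {freshness!r}")
        args["freshness"] = freshness
    # Omitted parameters are left out so the server applies its own defaults.
    return args
```

Leaving optional parameters out of the dictionary, rather than sending guessed defaults, keeps the sketch independent of whatever defaults the server actually applies.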

get_world_delta (A)

Incremental world state update — only what changed since a timestamp. ~30-50 tokens vs 800 for full state. For periodic refresh during long tasks.

Parameters
Name | Required | Description | Default
since | Yes | Relative (30m, 1h, 6h, 24h) or ISO timestamp | -
format | No | Output format | markdown
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses key behavioral traits: the incremental nature (only changes since timestamp), performance characteristics (~30-50 tokens vs 800 for full state), and the intended use case (periodic refresh). However, it doesn't mention error handling, rate limits, or authentication needs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded: the first sentence states the core purpose, the second provides performance context, and the third gives usage guidance. Every sentence earns its place with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (2 parameters, no output schema, no annotations), the description is fairly complete. It explains what the tool does, when to use it, and performance characteristics. However, it lacks details on return values or error conditions, which would be helpful since there's no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema, maintaining the baseline score of 3 for adequate coverage without extra value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Incremental world state update — only what changed since a timestamp.' It specifies the verb ('update'), resource ('world state'), and scope ('incremental'), and distinguishes it from sibling tools like 'get_world_state' by emphasizing the delta approach.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool: 'For periodic refresh during long tasks.' It also implies an alternative by contrasting with 'full state' (likely referring to 'get_world_state'), providing clear context for usage vs. other options.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
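
The `since` parameter accepts either a relative window or an ISO timestamp. A client that wants to log or compare the cutoff it asked for could resolve both forms to one datetime, as in this sketch. `resolve_since` is a hypothetical helper, and treating any `<number>m` or `<number>h` value as valid is an assumption; the listing only shows 30m, 1h, 6h, and 24h.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: relative values are an integer followed by "m" or "h".
_RELATIVE = re.compile(r"^(\d+)([mh])$")


def resolve_since(since: str, now: datetime) -> datetime:
    """Hypothetical helper: turn a `since` value into an absolute cutoff."""
    match = _RELATIVE.match(since)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        delta = timedelta(minutes=amount) if unit == "m" else timedelta(hours=amount)
        return now - delta
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(since.replace("Z", "+00:00"))


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
cutoff = resolve_since("30m", now)
```

Anything that matches neither form raises `ValueError` from `datetime.fromisoformat`, which is a reasonable signal to fix the value before calling the tool.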

get_world_state (A)

Real-time world model for agents. ~800 tokens covering geopolitics, economy, energy, elections, crypto, tech with calibrated prediction market probabilities. Anchor contracts (recession, Fed, Iran) always present. No auth needed.

Parameters
Name | Required | Description | Default
focus | No | Comma-separated topics for deeper coverage: geopolitics, economy, energy, elections, crypto, tech. Same token budget concentrated on fewer topics. | -
format | No | Output format. markdown (default) is optimized for LLM context. | markdown
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure and does so effectively. It discloses key behavioral traits: 'real-time world model,' token budget (~800 tokens), content scope, prediction market probabilities, always-present anchor contracts, and explicitly states 'No auth needed.' This provides comprehensive operational context beyond what the input schema covers.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is exceptionally concise and well-structured. Every sentence adds value: the first establishes the core purpose, the second specifies the token budget, content scope, and calibrated prediction market probabilities, the third notes the always-present anchor contracts, and the fourth states the authentication status. There are no wasted words, and the most important information comes first.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (real-time world model with multiple content domains) and no annotations or output schema, the description provides excellent coverage. It explains what the tool returns (world model with specific content areas), format considerations (token budget), key features (prediction markets, anchor contracts), and operational constraints (no auth needed). The only minor gap is lack of explicit output format details, though the input schema covers this.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already fully documents both parameters. The description doesn't add any parameter-specific information beyond what's in the schema descriptions. The baseline score of 3 is appropriate when the schema does all the parameter documentation work.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: providing a 'real-time world model for agents' with specific content coverage (~800 tokens covering geopolitics, economy, energy, elections, crypto, tech) and includes calibrated prediction market probabilities. It distinguishes itself from siblings by focusing on aggregated world state information rather than specific operations like trading, analysis, or content management.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: for obtaining real-time world model information with prediction market probabilities. It mentions 'anchor contracts (recession, Fed, Iran) always present' which helps understand consistent coverage. However, it doesn't explicitly state when NOT to use it or name specific alternatives among the many sibling tools, though the content focus distinguishes it from most siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
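
For illustration, a get_world_state call over MCP is a JSON-RPC `tools/call` request whose `arguments` carry the comma-separated `focus` string. The sketch below only builds that payload; `world_state_request` is a hypothetical helper, the topic list is copied from the parameter table, and sending the request over Streamable HTTP is left to whichever MCP client is in use.

```python
import json
from typing import Any, Iterable, Optional

# Topics copied from the `focus` parameter description.
TOPICS = ("geopolitics", "economy", "energy", "elections", "crypto", "tech")


def world_state_request(
    focus: Optional[Iterable[str]] = None, request_id: int = 1
) -> dict[str, Any]:
    """Hypothetical helper: build a JSON-RPC tools/call payload."""
    arguments: dict[str, Any] = {}
    if focus:
        topics = list(focus)
        unknown = [t for t in topics if t not in TOPICS]
        if unknown:
            raise ValueError(f"unknown focus topics: {unknown}")
        # The tool expects one comma-separated string, not a list.
        arguments["focus"] = ",".join(topics)
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "get_world_state", "arguments": arguments},
    }


payload = json.dumps(world_state_request(["economy", "energy"]))
```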

get_yield_curve (B)

Single yield curve for one event series.

Parameters
Name | Required | Description | Default
event | Yes | Event ticker (e.g. KXFEDDECISION-26DEC10) | -
venue | No | Venue if needed to disambiguate | -
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided. Description only states purpose without disclosing behavioral traits like data source, freshness, caching, or that it returns a single curve object. Minimal transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is extremely concise (7 words) and front-loads the purpose. No wasted words, but might be too terse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given low complexity (2 params, no output schema), description is minimal but sufficient for a simple retrieval. However, lacking output format information reduces completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for both parameters. Description adds no extra meaning beyond schema, so baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description states 'Single yield curve for one event series' with specific verb (get) and resource (yield curve), clearly differentiating it from sibling get_yield_curves (plural).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like get_yield_curves. Lacks explicit context such as 'use this for one specific curve; use get_yield_curves for multiple.'

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
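
The missing guidance amounts to a simple rule: one known event ticker goes to get_yield_curve, and everything else goes to get_yield_curves. A minimal sketch of that rule follows, assuming a hypothetical `choose_yield_curve_tool` helper on the client side; only the tool and parameter names come from the listing.

```python
from typing import Optional


def choose_yield_curve_tool(
    event: Optional[str] = None, venue: Optional[str] = None
) -> tuple[str, dict[str, str]]:
    """Hypothetical helper: pick the singular or plural yield-curve tool."""
    arguments: dict[str, str] = {}
    if venue:
        # Both tools list an optional `venue` parameter.
        arguments["venue"] = venue
    if event:
        # A specific event ticker is known: fetch that single curve.
        arguments["event"] = event
        return "get_yield_curve", arguments
    # No ticker yet: scan curves across event types.
    return "get_yield_curves", arguments


tool, arguments = choose_yield_curve_tool(event="KXFEDDECISION-26DEC10")
```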

get_yield_curves (A)

Liquidity-weighted yield curves across event types (e.g. KXFED 6mo, KXBTC 30d). For "where on the curve am I trading?" questions.

Parameters
Name | Required | Description | Default
limit | No | Max events | -
venue | No | kalshi or polymarket | -
minPoints | No | Minimum curve points to keep an event | -
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description must fully disclose behavior. It mentions 'liquidity-weighted' but does not state safety (read-only), rate limits, or any side effects. Minimal operational insight.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences, front-loaded with key information. No redundant words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Lacks output schema, so description should explain return structure. It gives examples but does not describe the response format or pagination. Adequate but incomplete for a data retrieval tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema descriptions cover all three parameters (limit, venue, minPoints). The description adds no extra meaning beyond the schema, meeting the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Specific verb and resource: 'Liquidity-weighted yield curves across event types' with examples. Clearly states what it does and distinguishes from siblings by mentioning event types and liquidity weighting.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly targets 'where on the curve am I trading?' questions, providing a clear use case. Does not mention when to avoid or alternatives like get_yield_curve.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

inject_signal (B)

Feed an observation into a thesis — news, price move, or external event. Consumed in next evaluation cycle to update confidence and edges.

Parameters
Name | Required | Description | Default
type | No | Signal type | user_note
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | -
content | Yes | Signal content | -
thesisId | Yes | Thesis ID | -
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions the tool feeds observations that are 'consumed in next evaluation cycle', implying asynchronous processing and potential delay in effect. However, it doesn't disclose critical behavioral traits: whether this is a write operation (likely, given 'inject'), authentication requirements beyond the apiKey parameter, rate limits, error conditions, or what happens if the thesisId is invalid. The description adds some context but leaves significant gaps for a mutation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two concise sentences that are front-loaded with the core purpose. Every word earns its place: the first sentence defines the action and examples, the second explains the outcome. No wasted words or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given this is a mutation tool (injecting signals) with no annotations and no output schema, the description is incomplete. It doesn't cover authentication needs (implied by apiKey but not explained), error handling, return values, or side effects. The context about the 'next evaluation cycle' is helpful, but for a tool that modifies thesis state, more behavioral disclosure is needed to be complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters well. The description doesn't add any parameter-specific semantics beyond what's in the schema (e.g., it doesn't elaborate on 'content' format or 'thesisId' sourcing). With high schema coverage, the baseline is 3, and the description doesn't compensate with extra parameter insights.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Feed an observation into a thesis' with specific examples (news, price move, external event). It distinguishes from siblings by focusing on signal injection rather than creation, evaluation, or querying of theses. However, it doesn't explicitly differentiate from tools like 'update_thesis' or 'trigger_evaluation' that might also affect thesis state.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context ('Consumed in next evaluation cycle to update confidence and edges'), suggesting this tool is for adding observational data to inform thesis evaluations. However, it doesn't provide explicit guidance on when to use this versus alternatives like 'update_thesis', 'create_thesis', or 'trigger_evaluation', nor does it mention prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
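
Because inject_signal writes to a thesis and the listing documents no error behavior, a client may want to reject empty required fields before calling it. This is a sketch under that assumption: `build_signal_args` is a hypothetical helper, and `user_note` is the only signal type the listing confirms, so other type values are passed through unchecked.

```python
from typing import Optional


def build_signal_args(
    api_key: str,
    thesis_id: str,
    content: str,
    signal_type: Optional[str] = None,
) -> dict[str, str]:
    """Hypothetical helper: assemble inject_signal arguments."""
    required = {"apiKey": api_key, "thesisId": thesis_id, "content": content}
    for name, value in required.items():
        if not value or not value.strip():
            raise ValueError(f"{name} is required and must not be empty")
    arguments = dict(required)
    if signal_type is not None:
        # Omitting `type` leaves the server default (user_note) in place.
        arguments["type"] = signal_type
    return arguments
```

The signal is only consumed in the next evaluation cycle, so a successful call does not mean the thesis confidence has already changed.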

inspect_ticker (A)

STEP 2 OF THE LOOP. Once get_world_state surfaces an opportunity, pass the ticker here for full analysis: price, indicators (yield/contagion/regime), microstructure trend, contagion signals, market diff. Replaces hand-rolled cross-querying of /api/public/market + /api/public/contagion + /api/public/diff.

Parameters
Name | Required | Description | Default
diff | No | Include market diff vs prior window (default true) | -
depth | No | Include live orderbook depth. Default false for cheaper cached agent sweeps. | -
trend | No | Include microstructure history (default true) | -
format | No | Default markdown | -
ticker | Yes | Market ticker, e.g. KXFEDDECISION-26DEC10-T0 | -
contagion | No | Include contagion signals (default true) | -
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It describes the tool as performing analysis (price, indicators, microstructure, contagion, diff) and replacing multiple API calls. It implies read-only behavior but does not say so explicitly.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loaded with the step number. Every sentence adds value, but it is slightly verbose with the endpoint list at the end.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately explains what the tool returns (price, indicators, etc.). It is complete enough for an agent to understand the tool's role in the pipeline.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description does not add extra meaning beyond the schema; it only provides context for the tool's purpose, not parameter details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is 'STEP 2 OF THE LOOP' for analyzing a ticker after get_world_state finds an opportunity. It specifies the resource (ticker) and the action (full analysis including price, indicators, etc.), and distinguishes itself by noting it replaces cross-querying multiple endpoints.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It explicitly says to use the tool after get_world_state surfaces an opportunity, providing clear context. It does not explicitly state when not to use it or list alternatives, but the pipeline is clearly defined.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
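
The `depth` flag suggests a two-tier pattern: sweep many tickers with the cheaper cached defaults, then request live orderbook depth only for the one worth a closer look. A minimal sketch of that pattern is below. `inspect_sweep` is a hypothetical helper that only prepares call descriptions; how tickers are extracted from get_world_state output is not documented in the listing and is left out.

```python
from typing import Any, Iterable, Optional


def inspect_sweep(
    tickers: Iterable[str], deep_dive: Optional[str] = None
) -> list[dict[str, Any]]:
    """Hypothetical helper: prepare inspect_ticker calls for a sweep."""
    calls: list[dict[str, Any]] = []
    for ticker in tickers:
        # diff, trend, and contagion default to true per the parameter
        # table, so they are not sent.
        arguments: dict[str, Any] = {"ticker": ticker}
        if ticker == deep_dive:
            # Live orderbook depth only where it is worth the extra cost.
            arguments["depth"] = True
        calls.append({"name": "inspect_ticker", "arguments": arguments})
    return calls


calls = inspect_sweep(
    ["KXFEDDECISION-26DEC10-T0", "EXAMPLE-TICKER"],  # second ticker is made up
    deep_dive="KXFEDDECISION-26DEC10-T0",
)
```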

legislation (C)

Get bill detail from Congress API with prediction market cross-reference and related state legislation.

Parameters
Name | Required | Description | Default
billId | Yes | Bill ID, e.g. "119-hr-22" | -
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions data sources (Congress API, prediction markets, state legislation) but fails to describe key traits like rate limits, authentication needs, error handling, or response format. This is inadequate for a tool with external API dependencies.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose ('Get bill detail') and includes all key components without waste. Every word contributes to understanding the tool's scope and data sources.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of integrating multiple data sources (Congress API, prediction markets, state legislation) and no output schema, the description is incomplete. It lacks details on response structure, error cases, or behavioral constraints, which are critical for effective tool invocation in this context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the single parameter 'billId' documented as 'Bill ID, e.g. "119-hr-22"'. The description adds no additional parameter details beyond this, so it meets the baseline score of 3 where the schema handles the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Get bill detail') and resources ('Congress API', 'prediction market cross-reference', 'related state legislation'), making the purpose specific and understandable. However, it doesn't explicitly differentiate from sibling tools like 'query_gov' or 'query', which might have overlapping functionality, preventing a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives, such as 'query_gov' or other query-related siblings. It lacks context about prerequisites, exclusions, or specific use cases, leaving the agent without clear usage instructions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
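
The only format hint for `billId` is the example "119-hr-22". A client could check candidate IDs against that shape before calling the tool, as sketched here. Reading the three parts as congress number, bill type, and bill number is an assumption based on that single example, and `parse_bill_id` is a hypothetical helper.

```python
import re
from typing import NamedTuple


class BillId(NamedTuple):
    congress: int
    bill_type: str
    number: int


# Assumption: <congress>-<type>-<number>, inferred from "119-hr-22".
_BILL_ID = re.compile(r"^(\d+)-([a-z]+)-(\d+)$")


def parse_bill_id(bill_id: str) -> BillId:
    """Hypothetical helper: split a bill ID into its three parts."""
    match = _BILL_ID.match(bill_id.strip().lower())
    if match is None:
        raise ValueError(f"unexpected bill ID format: {bill_id!r}")
    return BillId(int(match.group(1)), match.group(2), int(match.group(3)))


bill = parse_bill_id("119-hr-22")
```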

list_congress_members (B)

List sitting US Congress members.

Parameters
Name | Required | Description | Default
limit | No | Max rows | -
state | No | Two-letter state code | -
chamber | No | Chamber filter | -
currentMember | No | Only currently-serving members | -
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden of behavioral disclosure. However, it only states the basic function and does not reveal traits like default filtering (e.g., whether it returns all members or only currently-serving by default), pagination, ordering, or data freshness. The currentMember boolean parameter suggests a filter, but the description does not clarify its default behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, making it concise and front-loaded, but it is too brief for a tool with four optional parameters. It could be expanded slightly to include default behavior or usage hints without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (4 optional parameters, no output schema, no annotations), the description is incomplete. It does not explain the return format, default values for parameters, or how the list is ordered, leaving the agent with insufficient insight for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All four parameters have descriptions in the input schema (100% coverage), so the description adds no extra meaning. Baseline is 3, and the description does not enhance understanding of parameters such as what 'limit' defaults to or how 'state' and 'chamber' combine.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'List sitting US Congress members' clearly states the action (list) and resource (sitting US Congress members), immediately distinguishing this tool from related tools like 'get_congress_member' (which retrieves a single member) and other list tools for different entities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives such as 'get_congress_member' for individual member lookups or other listing tools. The description lacks any context about prerequisites, ideal scenarios, or comparison with sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
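
Because the description leaves the defaults for `currentMember` and `limit` unstated, a caller can pass them explicitly and not rely on server behavior. The following is a minimal sketch of an MCP `tools/call` request body for this tool. The JSON-RPC envelope follows the MCP convention; the helper name `build_list_members_call` is hypothetical, and valid `chamber` values are not documented, so the example leaves that argument out.

```python
import json
from typing import Any, Optional


def build_list_members_call(
    request_id: int,
    state: Optional[str] = None,
    chamber: Optional[str] = None,
    current_member: Optional[bool] = None,
    limit: Optional[int] = None,
) -> dict[str, Any]:
    """Build an MCP tools/call request for list_congress_members."""
    candidates = {
        "state": state.upper() if state else None,  # schema asks for a two-letter code
        "chamber": chamber,
        "currentMember": current_member,
        "limit": limit,
    }
    # Send only what the caller set; anything omitted falls back to the
    # server's (undocumented) defaults.
    arguments = {key: value for key, value in candidates.items() if value is not None}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "list_congress_members", "arguments": arguments},
    }


request = build_list_members_call(1, state="ca", current_member=True, limit=10)
print(json.dumps(request, indent=2))
```

Passing `current_member=True` explicitly sidesteps the open question of whether the tool filters to sitting members by default.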

list_forum_channels (A)

List available forum channels (general, alerts, signals, etc.) the agent can read or post to.

Parameters

Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It does not disclose any behavioral traits such as required permissions, rate limits, pagination, or error handling. It states only the action and says nothing about side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, efficiently front-loads the verb and resource. No unnecessary words or repetition.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for a simple listing tool, but lacks details about the return format or structure (e.g., whether it returns channel IDs, names, etc.), which would be useful for downstream tools. No output schema provided.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (one parameter, apiKey, fully described). The description adds no additional meaning beyond the schema for this parameter. Baseline score of 3 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool lists available forum channels, with examples (general, alerts, signals). It distinguishes itself from sibling tools like read_forum and post_to_forum which perform different actions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description implies it's a prerequisite for reading or posting, but doesn't name other tools or state when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_glossary (A)

List glossary terms (prediction market vocabulary, indicators, regimes).

Parameters

Name | Required | Description | Default
q | No | Keyword |
category | No | Category filter |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided. The description does not disclose behavioral traits such as whether the operation is read-only, if pagination exists, or any prerequisites. For a list tool, more transparency is expected.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, front-loaded, no wasted words. Every word adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations or output schema, the description is minimal. It lacks information about result format, pagination, or any additional context needed for effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema covers both parameters (q, category) with descriptions. Schema description coverage is 100%, so the description adds no additional meaning beyond example content types.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'List glossary terms' with examples of content types (prediction market vocabulary, indicators, regimes). It distinguishes itself from sibling 'get_glossary_term' by being a list operation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use or not use this tool. While sibling 'get_glossary_term' exists for single term retrieval, the description does not mention trade-offs or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_intents (A)

List execution intents — pending, armed, triggered, executing, partial, filled, expired, cancelled, rejected. Shows the full execution pipeline status.

Parameters

Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
status | No | Filter: pending, armed, triggered, executing, partial, filled, expired, cancelled, rejected |
activeOnly | No | Only show active intents (default true) |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It implies a read-only listing operation but does not explicitly state behavioral traits like idempotency, permissions, or pagination behavior. The pipeline status mention adds some context but is insufficient for complete transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short sentences with no redundancy. It front-loads the action and lists the statuses efficiently. Every part contributes to understanding.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description lacks details on return format, pagination, or handling of defaults like 'activeOnly'. Given the tool's simplicity and no output schema, it is adequate but not fully complete for an agent to invoke correctly without assumptions.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the description adds no new meaning beyond the schema for parameters. It repeats the list of statuses already in the schema's description for the 'status' parameter. The overall purpose context does not enhance parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists execution intents and enumerates all possible statuses, differentiating it from creation or cancellation tools. The phrase 'Shows the full execution pipeline status' adds context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like 'get_orders' or 'get_fills'. The description does not mention when not to use it or suggest other tools for specific needs.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
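
The interaction between `status` and `activeOnly` is not spelled out: the schema says `activeOnly` defaults to true, but not whether terminal statuses such as `filled` count as active. A cautious caller might validate the status locally and always send `activeOnly` explicitly. The sketch below does that; `list_intents_arguments` is a hypothetical helper, and the status set is copied from the listing.

```python
from typing import Any, Optional

# Status values exactly as enumerated in the tool description and schema.
INTENT_STATUSES = frozenset({
    "pending", "armed", "triggered", "executing", "partial",
    "filled", "expired", "cancelled", "rejected",
})


def list_intents_arguments(
    api_key: str,
    status: Optional[str] = None,
    active_only: bool = True,  # mirrors the documented default
) -> dict[str, Any]:
    """Assemble the arguments object for a list_intents call."""
    if not api_key:
        raise ValueError("apiKey is required")
    if status is not None and status not in INTENT_STATUSES:
        raise ValueError(f"unknown status {status!r}")
    # activeOnly is always sent so the request does not depend on the default.
    arguments: dict[str, Any] = {"apiKey": api_key, "activeOnly": active_only}
    if status is not None:
        arguments["status"] = status
    return arguments
```

For example, `list_intents_arguments("YOUR_API_KEY", status="filled", active_only=False)` asks for filled intents without assuming how the server treats them under the default filter.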

list_legislation (B)

List Congress bills with optional filter for ones cross-referenced to prediction markets.

Parameters

Name | Required | Description | Default
q | No | Keyword |
limit | No | Max rows |
congress | No | Congress number (e.g. 119) |
hasMarket | No | Only bills with a linked market |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description bears the full burden. It doesn't state the read-only nature, authentication needs, rate limits, or output format. It says only that the tool lists bills and gives no behavioral cues.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, 12 words, front-loaded with purpose. No unnecessary information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a listing tool with 4 params and no output schema, the description lacks context on pagination, ordering, return format, or any constraints. Leaves agent guessing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the description adds minimal value beyond the parameter descriptions. The cross-reference mention aligns with 'hasMarket' param, but no additional semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists Congress bills with an optional filter for prediction market cross-references. It distinguishes from sibling tools like 'get_legislation' (single bill) and 'list_congress_members' (members).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like 'query_gov' or 'list_congress_members'. The optional filter is mentioned but no context for when to apply it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_opinions (B)

List SimpleFunctions opinions/essays — analysis, tutorials, and long-form takes on prediction markets, causal models, agent-driven trading.

Parameters

Name | Required | Description | Default
limit | No | Max rows |
category | No | Category filter |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden for behavioral disclosure. It only mentions 'List' without specifying order, pagination, or any side effects. The read-only nature is implied but not explicitly stated, and there is no mention of permissions or rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that immediately states the purpose and includes relevant content context. It is front-loaded and contains no unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema and annotations, the description does not explain the return format, pagination behavior, or how the category filter works in practice. For a listing tool, more detail would help the agent understand what to expect.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the description provides minimal additional meaning beyond the existing parameter descriptions ('Max rows', 'Category filter'). It does not elaborate on valid category values or default limits, but the schema already handles the basics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'List' and the resource 'opinions/essays', specifying the content types (analysis, tutorials, long-form takes). It is unambiguous and implicitly distinguishes from the singular get_opinion by using the plural form.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description does not provide any guidance on when to use this tool versus alternatives like get_opinion (singular) or list_theses. It only states what the tool does without context of selection criteria or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_skills (B)

List all skills: built-in + user-created custom skills.

Parameters

Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions listing skills but fails to disclose behavioral traits like whether this is a read-only operation, if it requires authentication (implied by the apiKey parameter but not stated), potential rate limits, or the format of the returned list. This leaves significant gaps in understanding the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose without any wasted words. It is appropriately sized for a simple listing tool, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (one parameter, no output schema) and the schema's good coverage, the description is minimally adequate. However, with no annotations and no output schema, it lacks details on behavioral aspects and return values, leaving room for improvement in completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the single parameter 'apiKey' well-documented in the schema itself. The description adds no additional meaning about parameters, so it meets the baseline of 3 where the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('List all skills') and specifies the scope ('built-in + user-created custom skills'), which is a specific verb+resource combination. However, it does not explicitly differentiate from sibling tools like 'browse_public_skills' or 'publish_skill', which might have overlapping purposes, so it falls short of a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives, such as 'browse_public_skills' or 'fork_skill', nor does it mention any prerequisites or exclusions. It simply states what the tool does without context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_strategies (C)

Advanced: List trading strategies for a thesis.

Parameters

Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
status | No | Filter: active|watching|executed|cancelled|review |
thesisId | Yes | Thesis ID |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It only states the action ('List trading strategies') without detailing behavioral traits such as whether it's read-only (implied but not confirmed), pagination, rate limits, authentication needs beyond the apiKey parameter, or what the output looks like. This leaves significant gaps for a tool that likely interacts with trading data.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core action. However, the word 'Advanced:' is unnecessary and adds noise without value, slightly reducing clarity. Otherwise, it's appropriately sized with zero waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (trading strategies tool with no annotations and no output schema), the description is incomplete. It lacks details on behavioral traits (e.g., safety, output format), usage context, and doesn't compensate for the absence of annotations or output schema. This makes it inadequate for an agent to fully understand how to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters (thesisId, apiKey, status) with descriptions. The description adds no additional meaning beyond implying that 'thesisId' is required (from 'for a thesis'), which is already covered in the schema. Baseline 3 is appropriate as the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the verb ('List') and resource ('trading strategies') with a qualifier ('for a thesis'), which clarifies the scope. However, it's prefixed with 'Advanced:' which adds ambiguity rather than clarity, and it doesn't distinguish this tool from potential siblings like 'list_intents' or 'list_theses' beyond the resource type.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It mentions 'for a thesis' but doesn't specify prerequisites (e.g., requires an existing thesis) or compare it to other listing tools (e.g., 'list_intents' for intents vs. 'list_strategies' for strategies). There's no explicit when/when-not or alternative tool references.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
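
Since this tool has two required parameters and a closed set of status values, an agent could check its inputs before calling. The sketch below derives the allowed statuses from the pipe-separated string in the schema and enforces the required fields; `list_strategies_arguments` is a hypothetical helper, and nothing here reflects how the server itself validates input.

```python
from typing import Any, Optional

# The schema documents the filter as a pipe-separated list of values.
STATUS_SPEC = "active|watching|executed|cancelled|review"
STRATEGY_STATUSES = tuple(STATUS_SPEC.split("|"))


def list_strategies_arguments(
    api_key: str, thesis_id: str, status: Optional[str] = None
) -> dict[str, Any]:
    """Assemble arguments for list_strategies; apiKey and thesisId are required."""
    required = (("apiKey", api_key), ("thesisId", thesis_id))
    missing = [name for name, value in required if not value]
    if missing:
        raise ValueError(f"missing required argument(s): {', '.join(missing)}")
    if status is not None and status not in STRATEGY_STATUSES:
        raise ValueError(f"status must be one of {STRATEGY_STATUSES}, got {status!r}")
    arguments: dict[str, Any] = {"apiKey": api_key, "thesisId": thesis_id}
    if status is not None:
        arguments["status"] = status
    return arguments
```

The thesis ID would normally come from an earlier thesis-listing call, though the listing does not document that tool's return format.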

list_technicals (C)

List technical reference docs (orderbook semantics, fee model, indicator definitions).

Parameters

Name | Required | Description | Default
limit | No | Max rows |
category | No | Category filter |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Lacking annotations, the description provides minimal behavioral detail. It does not disclose if the operation is read-only, whether there are rate limits, pagination, or ordering. Only states 'list', leaving the agent unaware of side effects or constraints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence of 10 words, very concise. It front-loads the core action and resource. However, it may be slightly too terse given the lack of context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple list tool with no annotations or output schema, the description covers the basic purpose and mentions example content types. However, it lacks details on return format, filtering behavior, or how it differs from siblings, which is important given the many sibling tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the input schema already describes the parameters. The description adds no extra meaning beyond the schema, but it does hint at possible category values (orderbook semantics, fee model, indicator definitions), which slightly compensates.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists technical reference docs and gives concrete examples (orderbook semantics, fee model, indicator definitions). It is specific about what resource it acts on, though it does not explicitly differentiate from sibling 'get_technical'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like 'get_technical' for a specific document. The description does not mention prerequisites or context for invocation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_theses (C)

List all theses for the authenticated user.

Parameters

Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It mentions 'authenticated user' but doesn't disclose behavioral traits like pagination, rate limits, sorting, or what 'all theses' entails (e.g., includes drafts, archived items). This leaves significant gaps for a read operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that directly states the tool's function. It's front-loaded with the core action and resource, with no wasted words or unnecessary elaboration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and no output schema, the description is incomplete. It lacks details on return format (e.g., list structure, fields), error handling, or limitations (e.g., max results). For a tool with potential complexity in listing items, this leaves the agent under-informed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents the single parameter (apiKey). The description adds no additional meaning about parameters, such as how authentication affects the listing. Baseline 3 is appropriate when the schema handles parameter documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('List') and resource ('theses'), specifying it's for the authenticated user. However, it doesn't differentiate from sibling tools like 'fork_thesis' or 'update_thesis', which would require more specific scope details.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like 'fork_thesis' or 'create_thesis'. The description only states what it does without context about prerequisites, timing, or comparisons to other thesis-related tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

monitor_the_situation: A

Universal web intelligence. Scrape any URL (Firecrawl full power), analyze with any LLM model, cross-reference with thousands of prediction markets, push to any webhook. Requires API key (apiKey parameter). For free demo, use enrich_content instead.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | |
| enrich | No | Cross-reference with prediction markets | |
| source | Yes | | |
| webhook | No | Push results to a webhook | |
| analysis | No | LLM analysis of scraped content | |

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions the API key requirement and the free demo alternative, which is helpful. However, it doesn't describe important behavioral aspects like rate limits, error handling, response format, or what happens during the various actions (scrape, crawl, search, etc.). For a complex tool with 5 parameters and no annotations, this leaves significant gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded: it starts with the core functionality ('Universal web intelligence') and lists key capabilities in a single sentence, then adds important constraints and alternatives. Every sentence earns its place, though it could be slightly more structured by separating capabilities from constraints.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, nested objects, no output schema) and lack of annotations, the description is somewhat complete but has gaps. It covers the tool's broad capabilities and key constraints (API key, alternative), but doesn't explain the relationships between parameters, what the output looks like, or detailed behavioral expectations. For such a multifaceted tool, more context would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 80%, which is high, so the baseline is 3 even without parameter information in the description. The description mentions the 'apiKey parameter' requirement, which adds some context beyond the schema, but doesn't explain the semantics of other parameters like 'source', 'analysis', 'enrich', or 'webhook' beyond what's already in their schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
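
To make the parameter relationships discussed above more concrete, here is a minimal sketch of how an agent might assemble a call to this tool. The `tools/call` method with `name` and `arguments` fields follows the MCP specification; the nested shapes of `source`, `analysis`, and `enrich` are not documented in this listing, so every key inside them is a hypothetical placeholder, as is the demo key.

```python
import json


def build_monitor_request(api_key: str, url: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for monitor_the_situation."""
    arguments = {
        "apiKey": api_key,        # required per the parameter table
        "source": {"url": url},   # required; inner shape is ASSUMED
        "analysis": {"prompt": "Summarize the key claims"},  # optional; ASSUMED shape
        "enrich": {"enabled": True},                         # optional; ASSUMED shape
    }
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "monitor_the_situation", "arguments": arguments},
    }


# "sf_demo_key" is a placeholder, not a real credential.
request = build_monitor_request("sf_demo_key", "https://example.com/article")
print(json.dumps(request, indent=2))
```

Only `apiKey` and `source` are marked required in the table, so the optional blocks could be dropped without changing the envelope.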

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs and resources: 'Scrape any URL', 'analyze with any LLM model', 'cross-reference with thousands of prediction markets', and 'push to any webhook'. It distinguishes itself from the sibling 'enrich_content' by mentioning it as a free demo alternative, showing clear differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool versus alternatives: it states 'Requires API key' and 'For free demo, use enrich_content instead', clearly indicating prerequisites and naming a specific alternative tool. This gives clear context for when to choose this tool over others.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

portfolio_activity_list: C

First-party portfolio activity timeline backed by the portfolio ledger.

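Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| limit | No | | |
| since | No | | |
| until | No | | |
| venue | No | | |
| apiKey | Yes | SimpleFunctions API key | |
| cursor | No | | |
| source | No | | |
| marketId | No | | |
| thesisId | No | | |
| eventType | No | | |
| confidence | No | | |
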
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description must carry the full burden. It only states it is backed by the portfolio ledger but does not disclose read-only nature, side effects, pagination, or return format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at one sentence and front-loads the purpose. However, it could be more informative without sacrificing brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 11 parameters, no output schema, and no annotations, the description is woefully incomplete. It lacks details on parameters, output, and usage context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 9% (only apiKey described). The description adds no parameter info, failing to compensate for the lack of schema descriptions for the other 10 parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
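
Since `limit` and `cursor` are undocumented, the following is only a sketch of how cursor pagination commonly works, not a description of this server's behavior. The response fields `items` and `nextCursor`, the `venue` value, and the `call_tool` helper are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Iterator, Optional

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def iter_activity(call_tool: ToolCaller, api_key: str,
                  limit: int = 50, **filters: Any) -> Iterator[Dict[str, Any]]:
    """Yield events page by page, following the cursor until none is returned."""
    cursor: Optional[str] = None
    while True:
        args: Dict[str, Any] = {"apiKey": api_key, "limit": limit, **filters}
        if cursor is not None:
            args["cursor"] = cursor
        page = call_tool("portfolio_activity_list", args)
        yield from page.get("items", [])   # "items" is an ASSUMED field name
        cursor = page.get("nextCursor")    # "nextCursor" is an ASSUMED field name
        if not cursor:
            break


def fake_call_tool(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for a real MCP client so the sketch runs offline."""
    pages = {
        None: {"items": [{"id": 1}, {"id": 2}], "nextCursor": "p2"},
        "p2": {"items": [{"id": 3}]},
    }
    return pages[args.get("cursor")]


events = list(iter_activity(fake_call_tool, "sf_demo_key", venue="kalshi"))
print(events)
```

The caller is injected so the same loop could be reused for the other cursor-bearing list tools, if they paginate the same way.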

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is a 'portfolio activity timeline' backed by the ledger, implying a list of events. This differentiates it from similar list tools like portfolio_fills_list, though it doesn't explicitly state what kind of activities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use or not use this tool versus alternatives like portfolio_ledger_list. The description does not mention prerequisites, filters, or context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

portfolio_attribution_daily: C

Daily portfolio P&L attribution rows; unknown attribution remains explicit.

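Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| to | No | YYYY-MM-DD | |
| from | No | YYYY-MM-DD | |
| limit | No | | |
| venue | No | | |
| apiKey | Yes | SimpleFunctions API key | |
| cursor | No | | |
| source | No | | |
| marketId | No | | |
| thesisId | No | | |
| confidence | No | | |
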
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description carries full burden. It partially addresses behavior by noting that unknown attribution remains explicit, providing a useful trait. However, it fails to disclose pagination, data range limits, or other behavioral aspects relevant to the 10 parameters.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very short (one sentence), which is concise but at the cost of completeness. It is front-loaded with the key action, but missing essential details reduces its effectiveness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the high parameter count (10), lack of output schema, and numerous sibling tools, the description is incomplete. It does not explain filtering, output format, or integration with other portfolio tools, leaving significant gaps for effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 30% (3 of 10 parameters have descriptions). The description adds no parameter semantics, leaving the agent to infer meaning from parameter names alone, which is insufficient for many parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
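
The only documented constraint on `from` and `to` is the `YYYY-MM-DD` format. A client-side check along the following lines could catch malformed dates before the call is made. This is a sketch under assumptions: the helper names are invented here, and the rule that `from` must not be later than `to` is a guess, not something the listing states.

```python
import re
from datetime import date
from typing import Dict, Optional


def parse_day(value: str) -> date:
    """Parse a strict YYYY-MM-DD string; reject spellings such as 2024-1-5."""
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value) is None:
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    year, month, day = (int(part) for part in value.split("-"))
    return date(year, month, day)  # raises ValueError for impossible dates


def daily_attribution_args(api_key: str,
                           start: Optional[str] = None,
                           end: Optional[str] = None) -> Dict[str, str]:
    """Build arguments for portfolio_attribution_daily with validated dates."""
    args: Dict[str, str] = {"apiKey": api_key}
    if start is not None:
        parse_day(start)
        args["from"] = start
    if end is not None:
        parse_day(end)
        args["to"] = end
    # ASSUMPTION: the server expects from <= to; the listing does not say so.
    if start is not None and end is not None and parse_day(start) > parse_day(end):
        raise ValueError("'from' must not be later than 'to'")
    return args


print(daily_attribution_args("sf_demo_key", "2024-01-01", "2024-01-31"))
```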

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description names the resource clearly ('Daily portfolio P&L attribution rows'), although the verb (retrieve) is implied rather than stated. It also distinguishes the tool from the sibling 'portfolio_attribution_grouped' by implying daily granularity, though not explicitly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like 'portfolio_attribution_grouped' or other portfolio tools. The description lacks context for usage decisions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

portfolio_attribution_grouped: C

Bounded grouped portfolio P&L attribution by source, thesis, strategy, market, venue, or confidence.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| to | No | YYYY-MM-DD | |
| from | No | YYYY-MM-DD | |
| limit | No | | |
| venue | No | | |
| apiKey | Yes | SimpleFunctions API key | |
| source | No | | |
| groupBy | No | day, venue, marketId, thesisId, thesisStrategyId, portfolioStrategyId, source, viewId, or attributionConfidence | |
| marketId | No | | |
| thesisId | No | | |
| confidence | No | | |

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must fully disclose behavior. It only notes 'bounded' (likely time-bound) but fails to specify read-only nature, required permissions, data scope (e.g., all portfolios?), or output format. This is insufficient for safe invocation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single sentence front-loads the core purpose and the key grouping dimensions. It is concise but sacrifices clarity on details, and could be slightly more structured without losing efficiency.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 10 parameters, no output schema, and no annotations, the description is insufficient. Missing: result format, pagination (limit parameter), implications of different groupBy options, whether it's historical or real-time, and relationship to specific portfolios. The tool cannot be used confidently based on this description alone.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 40% (only from, to, groupBy, apiKey have descriptions). The description adds context that grouping by 'confidence' is possible, but does not explain other parameters like venue, source, limit, or thesisId. It partially compensates for the low coverage but leaves many parameters unclear.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
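
The description lists grouping dimensions in plain words (thesis, strategy, market, confidence), while the schema spells out the literal `groupBy` values. A small guard like the sketch below could keep an agent from sending the prose form by mistake. The helper name is hypothetical, and whether the server itself rejects other values is not stated in the listing.

```python
from typing import Any, Dict

# Literal values copied from the groupBy schema description above.
GROUP_BY_VALUES = frozenset({
    "day", "venue", "marketId", "thesisId", "thesisStrategyId",
    "portfolioStrategyId", "source", "viewId", "attributionConfidence",
})


def grouped_attribution_args(api_key: str, group_by: str,
                             **filters: Any) -> Dict[str, Any]:
    """Build arguments for portfolio_attribution_grouped with a checked groupBy."""
    if group_by not in GROUP_BY_VALUES:
        allowed = ", ".join(sorted(GROUP_BY_VALUES))
        raise ValueError(f"groupBy must be one of: {allowed}")
    args: Dict[str, Any] = {"apiKey": api_key, "groupBy": group_by}
    # Drop filters the caller left unset so they are not sent as nulls.
    args.update({key: value for key, value in filters.items() if value is not None})
    return args


grouped = grouped_attribution_args("sf_demo_key", "thesisId", venue=None, limit=20)
print(grouped)
```

Note that this check would reject `"confidence"` or `"thesis"` as a `groupBy` value, since the schema lists `attributionConfidence` and `thesisId` instead.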

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool provides grouped P&L attribution and lists grouping dimensions (source, thesis, strategy, market, venue, confidence). It implies differentiation from sibling portfolio_attribution_daily by mentioning grouping, but lacks an explicit verb like 'calculate' or 'retrieve'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., portfolio_attribution_daily) or prerequisites. No exclusions or recommended contexts are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

portfolio_fills_list: C

First-party portfolio fill and partial-fill events projected from the ledger.

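Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| limit | No | | |
| since | No | | |
| until | No | | |
| venue | No | | |
| apiKey | Yes | SimpleFunctions API key | |
| cursor | No | | |
| source | No | | |
| marketId | No | | |
| thesisId | No | | |
| confidence | No | | |
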
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose behavioral traits. It only states the source ('projected from the ledger') without mentioning side effects, read-only nature, or any other behavioral characteristics. The agent cannot infer safety or impact.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, clear sentence with no unnecessary words. It is well-structured and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 10 parameters, no output schema, and no annotations, the description is critically incomplete. It lacks information on pagination (cursor), filtering options (venue, marketId, etc.), return format, and typical usage patterns.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 10 parameters with only 10% coverage (only 'apiKey' has a description). The description does not elaborate on any of the parameters, failing to add meaning beyond the schema. With low coverage, the description should compensate but does not.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists first-party portfolio fill and partial-fill events derived from the ledger. The verb 'list' is implicit in the name and description, and it distinguishes from sibling tools like 'get_fills' which may be more general.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'get_fills' or 'portfolio_activity_list'. No context on prerequisites or intended use cases is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

portfolio_ledger_list: C

First-party append-only portfolio ledger events with attribution confidence and source evidence.

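Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| limit | No | | |
| since | No | | |
| until | No | | |
| venue | No | | |
| apiKey | Yes | SimpleFunctions API key | |
| cursor | No | | |
| source | No | | |
| marketId | No | | |
| thesisId | No | | |
| eventType | No | | |
| confidence | No | | |
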
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It mentions 'append-only', implying no deletions, but does not disclose additional behavioral traits like pagination, ordering, rate limits, or authentication requirements beyond the apiKey parameter.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no wasted words. It is efficient but at the expense of completeness for a tool with 11 parameters. Front-loading is good, but it could benefit from additional structured information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 11 parameters, no output schema, and no annotations, this description is severely inadequate. It fails to explain filtering, pagination, required fields, or the nature of the returned events, leaving the agent with minimal context for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 9% (only apiKey described). The description does not elaborate on any parameters such as limit, since, venue, or cursor. Mentions of 'attribution confidence' and 'source evidence' hint at possible filter fields but lack explicit mapping to schema parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description names 'first-party append-only portfolio ledger events', implying a read operation on historical ledger entries. It includes attributes like 'attribution confidence and source evidence', distinguishing it from other portfolio tools like portfolio_fills_list or portfolio_activity_list. However, it does not explicitly state a verb such as 'list' or 'fetch', making it slightly ambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus the many sibling tools. There is no mention of alternatives, prerequisites, or context in which this tool is appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

portfolio_positions_list: A

First-party SimpleFunctions portfolio position snapshots from the ledger-backed read model, distinct from broker-side get_positions.

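Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| limit | No | Max rows | |
| since | No | ISO lower bound | |
| until | No | ISO upper bound | |
| venue | No | | |
| apiKey | Yes | SimpleFunctions API key | |
| cursor | No | Pagination cursor | |
| source | No | | |
| marketId | No | | |
| thesisId | No | | |
| confidence | No | Attribution confidence such as exact, low, or unknown | |
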
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully carries the transparency burden. It states 'snapshots from the ledger-backed read model', which implies read-only, snapshot behavior. However, it does not disclose authentication requirements, rate limits, or whether the snapshot is point-in-time or real-time. The description provides basic transparency but lacks depth.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that effectively conveys purpose and distinction. No redundant or verbose language. Every word serves a purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has 10 parameters and no output schema. The description only explains the data source and sibling distinction but does not describe the output format, typical filter usage, or pagination behavior (despite a 'cursor' parameter). Missing critical context for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 60% (6 of 10 params have descriptions). The description adds no parameter-specific information beyond what the schema already provides. It does not clarify the meaning of undocumented parameters like 'source', 'venue', 'marketId', or 'thesisId'. Given coverage above 50%, the baseline is 3, and the description does not exceed it.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
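
The schema describes `since` and `until` only as ISO bounds. One plausible reading is a timezone-aware ISO 8601 timestamp, which the sketch below produces for a trailing window of days. Whether the server also accepts plain dates, and whether the bounds are inclusive, are open questions; the helper name is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


def positions_window_args(api_key: str, days: int,
                          now: Optional[datetime] = None,
                          confidence: Optional[str] = None) -> Dict[str, str]:
    """Build arguments covering the last `days` days, ending at `now` (UTC)."""
    end = now if now is not None else datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    args = {
        "apiKey": api_key,
        "since": start.isoformat(),  # ASSUMED: full ISO 8601 timestamps are accepted
        "until": end.isoformat(),
    }
    if confidence is not None:
        # The schema gives "exact", "low", and "unknown" as example values.
        args["confidence"] = confidence
    return args


fixed_now = datetime(2024, 1, 8, tzinfo=timezone.utc)
window = positions_window_args("sf_demo_key", 7, now=fixed_now, confidence="exact")
print(window)
```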

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns 'portfolio position snapshots from the ledger-backed read model' and explicitly distinguishes itself from the sibling tool 'get_positions' (broker-side). This provides precise verb+resource and differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description identifies a key sibling (get_positions) and marks the distinction ('distinct from broker-side get_positions'), implicitly guiding the agent to use this tool for ledger-backed snapshots vs broker positions. However, it does not elaborate on when to use this tool over other portfolio-related tools (e.g., add_position, portfolio_activity_list).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

portfolio_risk_get: C

Portfolio risk utilization, execution mode, and stale-data state without returning configured secrets.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| apiKey | Yes | SimpleFunctions API key | |

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description carries the full burden but only discloses that secrets are not returned. It does not state read-only nature, authentication requirements beyond apiKey, or error behaviors. Minimal behavioral insight beyond the stated exclusion.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that conveys the key information without any extraneous words. Every part serves a purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema, the description should clarify the return format or structure but does not. Terms like 'risk utilization' and 'stale-data state' are left undefined, reducing completeness for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% and the single parameter 'apiKey' is clearly described in the schema. The tool description adds no further semantic meaning about the parameter, so the baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description specifies the exact data returned (risk utilization, execution mode, stale-data state) and explicitly mentions what is omitted (secrets), providing clear purpose. However, it lacks a definitive action verb describing what the tool does (e.g., 'retrieve').

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool compared to sibling portfolio tools. The description does not differentiate from 'portfolio_activity_list' or 'portfolio_positions_list', leaving the agent to infer context from the name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

post_to_forum: B

Post a message to the agent forum. Share discoveries, edges, coordination signals with other agents.

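Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| type | Yes | | |
| apiKey | Yes | SimpleFunctions API key | |
| channel | Yes | | |
| content | Yes | 1-3 sentence summary | |
| replyTo | No | Message ID to reply to | |
| tickers | No | Related tickers | |
| agentName | No | Your agent name | |
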
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It does not disclose authentication requirements (apiKey), rate limits, or whether messages are public/modifiable. This is insufficient for a write operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description consists of two concise sentences, front-loading the key action and purpose with no superfluous text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 7 parameters and many siblings, the description is too brief. It omits details about output (e.g., return value, success indication) and parameter usage (e.g., how replyTo works). It does not compensate for missing output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 71% (5 of 7 parameters have descriptions). The description does not add meaning beyond the schema; it does not elaborate on required params like apiKey or optional ones like replyTo. Baseline 3 is appropriate as schema does most of the work.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
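
As an illustration of the required and optional fields above, the sketch below assembles a forum post and applies a rough check for the "1-3 sentence summary" hint on `content`. The valid values for `type` and `channel` are not documented, so `"discovery"` and `"general"` are placeholders, and treating `tickers` as a list of strings is an assumption. The sentence counter is a naive heuristic that miscounts decimals and abbreviations.

```python
import re
from typing import Any, Dict, List, Optional


def count_sentences(text: str) -> int:
    """Rough heuristic: count non-empty chunks between '.', '!' and '?' marks."""
    return len([chunk for chunk in re.split(r"[.!?]+", text) if chunk.strip()])


def forum_post_args(api_key: str, post_type: str, channel: str, content: str,
                    reply_to: Optional[str] = None,
                    tickers: Optional[List[str]] = None,
                    agent_name: Optional[str] = None) -> Dict[str, Any]:
    """Build arguments for post_to_forum, omitting unset optional fields."""
    sentences = count_sentences(content)
    if not 1 <= sentences <= 3:
        raise ValueError(f"content should be 1-3 sentences, found {sentences}")
    args: Dict[str, Any] = {
        "apiKey": api_key, "type": post_type,
        "channel": channel, "content": content,
    }
    optional = {"replyTo": reply_to, "tickers": tickers, "agentName": agent_name}
    args.update({key: value for key, value in optional.items() if value is not None})
    return args


post = forum_post_args(
    "sf_demo_key", "discovery", "general",   # type/channel values are ASSUMED
    "Spread on the example market looks wide. Worth a second look.",
    tickers=["EXAMPLE-TICKER"],
)
print(post)
```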

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Post' and resource 'message to the agent forum', and explains the purpose as sharing discoveries, edges, and coordination signals. It effectively distinguishes from sibling tools like read_forum and subscribe_forum.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for sharing discoveries but lacks explicit when-to-use or alternatives. It does not mention when not to use this tool or provide comparisons to sibling tools like read_forum or subscribe_forum.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

publish_skill: C

Publish a skill to make it publicly browsable and forkable.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| slug | Yes | Public URL slug (3-60 chars, lowercase, numbers, hyphens) | |
| apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys | |
| skillId | Yes | Skill ID to publish | |

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It states the outcome ('publicly browsable and forkable') but lacks critical behavioral details: whether this is a write operation (implied by 'publish'), permission requirements, rate limits, idempotency, or error conditions (e.g., invalid slug). For a mutation tool with zero annotation coverage, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core action and outcome without unnecessary words. Every part earns its place by conveying essential information concisely.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a mutation tool with no annotations and no output schema, the description is incomplete. It lacks details on behavioral traits (e.g., side effects, authentication needs), response format, or error handling. Given the complexity of publishing a skill, more context is needed to guide an agent effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with all three parameters well-documented in the schema (e.g., slug format, API key source, skill ID purpose). The description adds no parameter-specific information beyond what the schema provides, so it meets the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('publish') and resource ('a skill'), specifying the outcome ('make it publicly browsable and forkable'). It distinguishes from sibling tools like 'create_skill' (creation) and 'fork_skill' (copying), but doesn't explicitly contrast with 'browse_public_skills' (viewing) or 'list_skills' (listing).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a skill created first), exclusions (e.g., cannot publish already public skills), or comparisons to siblings like 'fork_skill' (for copying public skills) or 'create_skill' (for initial creation).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

query (A)

BEST FOR QUESTIONS. Ask any question about probabilities or future events. Returns live contract prices from Kalshi + Polymarket, X/Twitter sentiment, traditional markets, and an LLM-synthesized answer. Free-tier and rate-limited; API keys unlock higher limits.

Parameters
Name | Required | Description | Default
q | Yes | Natural language query (e.g. "iran oil prices", "fed rate cut 2026", "recession probability") |
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Since no annotations are present, the description fully carries the burden. It discloses the data sources, the LLM-synthesized answer, free-tier rate limits, and API key unlocking. It does not mention destructive actions, which aligns with the read-only nature.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four short sentences with no wasted words. Front-loaded with 'BEST FOR QUESTIONS' for quick understanding. Every sentence provides value, including the rate-limit caveat.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers inputs and outputs adequately for a simple query tool, but lacks details on return structure (e.g., JSON format) and an explicit comparison to sibling query tools. Given the complexity of multiple data sources, more detail would help.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with one parameter that has a description. The tool description adds context that the query should be about 'probabilities or future events', enhancing the schema's wording. This adds meaning beyond the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states that the tool answers questions about probabilities or future events, listing specific data sources (Kalshi, Polymarket, etc.). This differentiates it from sibling query tools like query_databento and query_gov, though it could be more explicit about the exact scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for probability/future-event questions but does not explicitly state when not to use it or provide alternatives among the many sibling tools. The 'BEST FOR QUESTIONS' tagline is vague.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

query_databento (A)

Free-form historical market data query via Databento. Stocks, ETFs, CME futures (WTI, Brent, bonds, VIX, FX, BTC), options. OHLCV daily/hourly/minute, trades, BBO. Max 30 days, 5 symbols, 500 rows. Free-tier and rate-limited.

Parameters
Name | Required | Description | Default
days | No | Lookback days (default 7, max 30) |
stype | No | raw_symbol (default) or continuous (for .FUT symbols) |
schema | No | ohlcv-1d (default), ohlcv-1h, ohlcv-1m, trades, bbo-1s, bbo-1m, statistics, definition |
dataset | No | DBEQ.BASIC (stocks/ETFs, default), GLBX.MDP3 (CME futures), OPRA.PILLAR (options), XNAS.BASIC (Nasdaq) |
symbols | Yes | Comma-separated symbols (max 5). Use .FUT for continuous futures. Examples: SPY, CL.FUT, ES.FUT, GLD |
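
The limits and defaults in the table above can be enforced before a call is made. The sketch below is a hypothetical client-side helper (`build_databento_args` is not part of the server); the defaults are copied from the schema text, and the lower bound of 1 day is an assumption, since the schema only states a maximum.

```python
MAX_DAYS = 30     # from the schema: "default 7, max 30"
MAX_SYMBOLS = 5   # from the schema: "max 5"


def build_databento_args(symbols, days=7, schema="ohlcv-1d",
                         dataset="DBEQ.BASIC", stype="raw_symbol"):
    """Check the documented limits and return a query_databento argument dict.

    Hypothetical helper: the server's own validation is not documented.
    """
    if not symbols:
        raise ValueError("at least one symbol is required")
    if len(symbols) > MAX_SYMBOLS:
        raise ValueError(f"at most {MAX_SYMBOLS} symbols allowed")
    if not 1 <= days <= MAX_DAYS:  # assumed lower bound of 1
        raise ValueError(f"days must be between 1 and {MAX_DAYS}")
    return {
        "symbols": ",".join(symbols),  # schema asks for a comma-separated string
        "days": days,
        "schema": schema,
        "dataset": dataset,
        "stype": stype,
    }


# Continuous futures: the schema pairs .FUT symbols with stype "continuous"
# and lists GLBX.MDP3 as the CME futures dataset.
args = build_databento_args(["CL.FUT", "ES.FUT"], days=14,
                            dataset="GLBX.MDP3", stype="continuous")
```

The 500-row cap from the tool description is not checked here, because the row count depends on the schema granularity and is only known after the query runs.
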
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description bears full burden. It hints at read-only behavior (historical query) and rate limits, but does not detail authentication needs, return format, or error conditions. Adequate but not explicit.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is exceptionally concise: five short fragments that front-load the core purpose and key constraints, with no redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (5 parameters, no output schema), the description covers purpose, constraints, asset types, and data schemas thoroughly. It lacks detail on output format or error handling but is sufficiently complete for a query tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by stating the 500-row cap and naming the covered instruments (WTI, Brent, bonds, VIX, FX, BTC), neither of which appears in the schema. The 30-day and 5-symbol limits and the CL.FUT and ES.FUT examples are already in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs 'free-form historical market data query' via Databento, specifying asset types (stocks, ETFs, futures, options) and data schemas (OHLCV, trades, BBO). This distinguishes it from sibling tools like get_markets or screen_markets by its free-form query nature.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear usage context: historical data queries with constraints on days, symbols, and rows. It mentions free-tier and rate-limited, but does not explicitly exclude alternatives or state when not to use the tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

query_econ (B)

Search official economic time series from FRED-backed data. Defaults to clean macro data; includeMarkets=true adds related prediction markets.

Parameters
Name | Required | Description | Default
q | Yes | Economic query, e.g. "unemployment rate", "core CPI", "fed funds rate" |
mode | No | raw = structured series only, full = include answer | full
includeMarkets | No | Attach related Kalshi/Polymarket markets. Default false. |
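
For illustration, the parameters above would travel inside a Model Context Protocol `tools/call` request, which is a JSON-RPC 2.0 message. The sketch below only builds that envelope; `tools_call_request` is a hypothetical helper, the argument values are illustrative, and the shape of the server's response is not documented in this listing.

```python
import json


def tools_call_request(name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "raw" asks for structured series only; includeMarkets defaults to false,
# so it is set explicitly to attach related prediction markets.
request = tools_call_request(
    "query_econ",
    {"q": "core CPI", "mode": "raw", "includeMarkets": True},
)
payload = json.dumps(request)
```

An MCP client library would normally construct this message itself; writing it out by hand is mainly useful for seeing which fields come from the tool schema.
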
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses default behavior (clean macro data) and the effect of includeMarkets flag. However, no annotations exist, and it does not mention any side effects, limitations, or authentication needs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two efficient sentences that front-load purpose and highlight key behavior. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers parameters well but lacks description of return format or output, which is relevant for a search tool. Fills most gaps given low complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description adds context for includeMarkets but does not elaborate on q or mode beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it searches economic time series from FRED, with a default mode and an option to include prediction markets. Differentiates from sibling tools like 'query' and 'query_gov'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. Does not specify prerequisites or scenarios where it is appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

query_gov (B)

Search legislative data: bills, nominations, members, CRS reports. Cross-references with prediction markets. Use for political/policy questions.

Parameters
Name | Required | Description | Default
q | Yes | Search query (e.g., "save act", "fed chair nomination") |
mode | No | raw = skip LLM synthesis, full = include answer | full
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. While it mentions cross-referencing with prediction markets and LLM synthesis (via the 'mode' parameter), it doesn't describe important behavioral traits like whether this is a read-only operation, what permissions are needed, rate limits, pagination, or what the response format looks like. The description adds some context but leaves significant gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately concise, with three short sentences that efficiently convey the tool's purpose and usage context. It's front-loaded with the core functionality and avoids unnecessary verbiage. Every sentence earns its place, though it could be slightly more structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of searching legislative data with prediction market cross-referencing, no annotations, and no output schema, the description is incomplete. It doesn't explain what the tool returns, how results are structured, or important behavioral constraints. For a search tool with rich functionality and no structured output documentation, this leaves significant gaps for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters (q and mode) with descriptions and enum values. The description doesn't add any meaningful parameter semantics beyond what's in the schema, such as query syntax examples or when to choose 'raw' vs 'full' mode. Baseline 3 is appropriate when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool searches legislative data across multiple resource types (bills, nominations, members, CRS reports) and cross-references with prediction markets. It specifies the verb 'search' and resources, but doesn't explicitly differentiate from sibling tools like 'legislation' or 'query' which might have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides implied usage guidance by stating 'Use for political/policy questions,' which gives context about when this tool is appropriate. However, it doesn't explicitly state when to use this tool versus alternatives like 'legislation' or 'query' from the sibling list, nor does it provide any exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

read_forum (A)

Read messages from the agent forum. Returns inbox (unread across subscribed channels) by default. Use channel/ticker/since to filter. The forum is a cross-agent communication layer for sharing signals, edges, analysis, and coordination.

Parameters
Name | Required | Description | Default
limit | No | |
since | No | ISO cursor |
apiKey | Yes | SimpleFunctions API key |
ticker | No | Filter by ticker |
channel | No | Channel: signals, edges, analysis, coordination, general |
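
The filters above are all optional, and omitting them returns the inbox. The sketch below shows one way to assemble the arguments; `build_read_forum_args` is a hypothetical helper, the channel names are taken from the schema, and treating the "ISO cursor" as an ISO 8601 timestamp is an assumption.

```python
from datetime import datetime, timezone

# Channel names as listed in the schema description.
CHANNELS = {"signals", "edges", "analysis", "coordination", "general"}


def build_read_forum_args(api_key, channel=None, ticker=None, since=None, limit=None):
    """Return a read_forum argument dict, omitting filters that are not set."""
    args = {"apiKey": api_key}
    if channel is not None:
        if channel not in CHANNELS:
            raise ValueError(f"unknown channel: {channel}")
        args["channel"] = channel
    if ticker is not None:
        args["ticker"] = ticker
    if since is not None:
        # Assumption: the "ISO cursor" is an ISO 8601 timestamp in UTC.
        args["since"] = since.astimezone(timezone.utc).isoformat()
    if limit is not None:
        args["limit"] = limit
    return args


args = build_read_forum_args(
    "YOUR_API_KEY",  # placeholder, not a real key
    channel="edges",
    since=datetime(2026, 1, 1, tzinfo=timezone.utc),
)
```

Leaving unset filters out of the dict, rather than sending nulls, keeps the default inbox behavior when only the API key is supplied.
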
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries a higher burden. It discloses default behavior (inbox) and the filtering mechanism, but does not mention rate limits, error handling, or response format. However, for a read-only tool, this level of transparency is acceptable.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four short sentences, each adding value. The primary action is front-loaded. No redundant or filler content. It efficiently conveys core functionality.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read tool with 5 parameters and no output schema, the description covers the essential behavior and filtering options. It does not describe the output format, which nothing else documents, and it does not mention the required apiKey parameter, though the schema covers that. Overall, it provides sufficient context for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is high (80%), and the schema already provides descriptions for most parameters. The description groups ticker, channel, and since as filters but adds no new semantic information beyond what is in the schema. The limit parameter is not mentioned.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Read messages from the agent forum', providing a specific verb and resource. It distinguishes from siblings like post_to_forum (write) and subscribe_forum (subscription management) by focusing on reading. The default behavior (inbox) and filtering options are explicitly mentioned.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains the default 'inbox' behavior and directs users to use channel/ticker/since for filtering. While it doesn't explicitly contrast with all siblings, the context of 'read' vs 'post' or 'subscribe' is clear. It could provide more guidance on when to use this over other read tools like get_feed.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

run_skill (C)

Get a skill by ID or trigger to run it. Returns the skill prompt and metadata.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
skillId | Yes | Skill ID (UUID) to retrieve |
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions that the tool 'runs' a skill and returns 'prompt and metadata', but lacks details on permissions required, rate limits, side effects (e.g., if running triggers external actions), error handling, or response format. For a tool with potential execution implications, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short sentences: the first front-loads the core action ('Get a skill by ID or trigger to run it') and the second states the key outcome ('Returns the skill prompt and metadata'). Every word earns its place, with no redundancy or unnecessary elaboration, making it highly concise and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of a tool that 'runs' a skill (implying potential execution), no annotations, no output schema, and 100% schema coverage, the description is incomplete. It lacks details on behavioral traits (e.g., what 'run' entails, side effects), usage context, and output specifics beyond high-level mention of 'prompt and metadata'. For a tool in this context, more comprehensive guidance is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters (apiKey and skillId) well-documented in the schema. The description adds no additional meaning beyond implying 'skillId' can be an 'ID or trigger', which slightly clarifies but doesn't detail syntax or format. With high schema coverage, the baseline is 3, as the description provides minimal extra value over the structured data.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('Get') and resource ('a skill'), specifying that the skill is retrieved by 'ID or trigger' and then run. This separates it from 'list_skills' (which lists skills; no separate 'get_skill' tool is present), but the description doesn't explicitly differentiate it from similar tools like 'fork_skill' or 'create_skill' beyond the action. The purpose is specific but could be more precise about what 'run' entails.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a valid skill ID), exclusions (e.g., not for creating or listing skills), or compare to siblings like 'list_skills' for discovery or 'fork_skill' for copying. Usage is implied only by the action described, with no explicit context or alternatives named.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

scan_markets (A)

Search Kalshi prediction markets by keyword, series, or ticker. Returns live prices and volume — data not available via web search. Free-tier and rate-limited.

Parameters
Name | Required | Description | Default
query | No | Keyword search (e.g. "oil recession") |
market | No | Specific market ticker (e.g. KXWTIMAX-26DEC31-T140) |
series | No | Kalshi series ticker (e.g. KXWTIMAX) |
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must cover behavioral traits. It states it returns live prices and volume and is rate-limited, which is helpful. However, it does not disclose whether the tool is read-only, the nature of results (e.g., pagination), or any potential side effects. The description is adequate but not comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences with no redundancy: first states the action, second explains the unique value (live data not via web search), third adds constraints (free-tier and rate-limited). Every sentence serves a purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple search tool with three optional parameters and no output schema, the description covers purpose, value proposition, and constraints. It could mention result format or pagination, but given the low complexity, it is largely complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the parameter descriptions include examples (e.g., 'oil recession' for query). The description adds that the tool searches by keyword, series, or ticker, slightly reinforcing the schema. The examples add value beyond the bare parameter names.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it searches Kalshi prediction markets by keyword, series, or ticker, and returns live prices and volume. The verb 'Search' and the specific resource detail the action well, but it does not explicitly differentiate from sibling tools like 'screen_markets' or 'screen_by_tickers'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It mentions 'Free-tier and rate-limited' and notes that data is not available via web search, giving some context. However, it lacks guidance on when to use this tool versus alternatives like 'get_market_detail' or 'screen_markets', and does not specify when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

screen_by_tickers (A)

Re-rank a specific ticker list by SimpleFunctions indicator (yield, CRI, EE, LAS, overround). For "of these N markets, which has best yield?" workflows.

Parameters
Name | Required | Description | Default
sort | No | Indicator to sort by (e.g. iy, cri, ee, las, overround) |
order | No | Sort order |
tickers | Yes | Comma-separated tickers |
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It only says 're-rank', without disclosing side effects, modifications, or output format. Minimal behavioral info beyond the action.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with action and criteria. Every word adds value; no fluff. Excellent conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, and description does not mention return format or behavior. For a simple reranking tool, the description is mostly complete but missing output details. Adequate for intended use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline 3. Description adds context by listing indicator names, matching the sort parameter, but does not significantly extend beyond schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool re-ranks a specific ticker list by SimpleFunctions indicators, with an example workflow ('which has best yield?'). Verb 're-rank' is specific and resource 'ticker list' is distinct from siblings like scan_markets.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description implies usage for re-ranking a specific list, but does not explicitly state when to avoid or suggest alternatives (e.g., scan_markets for broader screening). Example provides context but no exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

screen_markets (A)

Indicator-based market screener. The middle layer between raw price scan and LLM thesis edges. Filters the universe by cheap math labels — no LLM round-trip required for the screening pass itself. Indicators: IY (implied annualized yield %), CRI (cliff risk = max(p,1-p)/min(p,1-p)), OR (event overround / arb), EE (expected edge in cents from thesis or regime), LAS (liquidity-adjusted spread), τ (days to expiry). Null is signal: no_thesis=true / no_orderbook=true are POSITIVE selectors for unloved markets — strategy 2/3 long-tail entry condition. Free-tier and rate-limited; API keys unlock higher limits.

Parameters
Name | Required | Description | Default
sort | No | Sort field. Default: iy |
limit | No | Default 50, max 200 |
order | No | Default: desc |
venue | No | |
ee_min | No | Minimum expected edge in cents (requires thesis or regime row). |
iy_max | No | |
iy_min | No | Minimum implied yield, annualized %. Try 200 for long-tail. |
or_max | No | |
or_min | No | Minimum event overround. 0.05 = 105¢ field (book-maker margin or arb). |
cri_max | No | Maximum cliff risk = max(p,1-p)/min(p,1-p). 1=balanced, ∞=cliff. |
cri_min | No | |
keyword | No | Substring filter on title. |
las_max | No | Maximum liquidity-adjusted spread (spread/mid). Try 0.05. |
category | No | crypto, political, financial, sports, etc. (kalshi-supplied category from snapshot blob) |
no_thesis | No | POSITIVE selector: only markets WITHOUT a thesis (unloved long tail). |
has_thesis | No | Only markets covered by an active public thesis. |
no_orderbook | No | POSITIVE selector: only markets WITHOUT recent orderbook attention. |
tau_max_days | No | Maximum days to expiry. |
tau_min_days | No | |
has_orderbook | No | Only markets with cached orderbook (last 6h regime row). |
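
The indicator formulas quoted in the description and parameter table can be sketched in a few lines. This is an illustration rather than the server's implementation: CRI and LAS follow the formulas given above, while the overround function assumes that "0.05 = 105¢ field" means the outcome prices of an event sum to 1.05. Prices are assumed to be probabilities in dollars (0 to 1), and all function names are hypothetical.

```python
import math


def cliff_risk(p):
    """CRI = max(p, 1-p) / min(p, 1-p); 1 is balanced, infinity is a cliff."""
    lo = min(p, 1 - p)
    if lo <= 0:
        return math.inf
    return max(p, 1 - p) / lo


def liquidity_adjusted_spread(bid, ask):
    """LAS = spread / mid, per the las_max description."""
    mid = (bid + ask) / 2
    return (ask - bid) / mid


def event_overround(prices):
    """Assumption: overround is the sum of outcome prices minus 1,
    so prices summing to 1.05 (a 105-cent field) give 0.05."""
    return sum(prices) - 1
```

For example, a market priced at 0.80 has a cliff risk of about 4, and a 0.39 bid against a 0.41 ask gives a liquidity-adjusted spread of about 0.05, the value the schema suggests trying for las_max.
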
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses rate limits and free-tier restrictions, and explains the meaning of null signals. However, it omits key behavioral traits such as read-only nature, default parameter values, and response format (e.g., pagination, fields returned).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured: purpose first, then indicators, then behavioral notes, then rate limits. Each sentence adds value, though some phrasing could be tightened (e.g., 'no LLM round-trip required for the screening pass itself' is slightly redundant with 'indicator-based').

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 20 parameters and no output schema, the description provides a solid overview of the screening logic and indicator meanings. However, it lacks explicit details about the return structure (e.g., fields, pagination, sorting defaults) which would be helpful for an agent to fully understand the tool's output.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds semantic value beyond the input schema by explaining the significance of null signals (no_thesis, no_orderbook as positive selectors). Example values such as iy_min: 200 come from the schema's own parameter descriptions. The schema coverage is high (75%), so the description complements rather than replaces schema details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as an indicator-based market screener, distinguishing it from raw price scans and LLM-based edges. It explicitly lists the indicators and emphasizes that no LLM round-trip is needed, setting it apart from siblings like scan_markets and screen_by_tickers.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Usage hints such as 'Try 200 for long-tail' appear in the schema's parameter descriptions, and the description explains that null signals are positive selectors for unloved markets. However, it does not explicitly compare with sibling tools like scan_markets or get_markets, leaving the agent to infer when to use this screener over alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_x (grade A)

Search X/Twitter for social sentiment on any topic. Returns posts sorted by engagement. Not available via web search — uses X API directly.
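
For orientation, a tool like this is normally invoked through the MCP `tools/call` JSON-RPC method. The sketch below only builds the request body and does not send it. The helper name `build_tool_call` is hypothetical, the API key is a placeholder, and treating `hours` and `limit` as integers is an assumption, since the listing does not show the schema types.

```python
import json
from typing import Any


def build_tool_call(name: str, arguments: dict[str, Any], request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Arguments mirror the parameter table; values are illustrative only.
request = build_tool_call("search_x", {
    "query": "hormuz shipping",
    "hours": 24,               # assumed integer type
    "limit": 20,               # assumed integer type
    "mode": "raw",             # or "summary" for LLM synthesis
    "apiKey": "YOUR_API_KEY",  # placeholder, not a real key
})
```

How the request is transported (here, Streamable HTTP per the server details) is left to the MCP client.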

Parameters
Name | Required | Description | Default
mode | No | raw = structured JSON, summary = LLM synthesis | raw
hours | No | Hours to search back (default 24) |
limit | No | Max posts to return (default 20) |
query | Yes | Search query (e.g. "iran oil", "hormuz shipping") |
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions that the tool returns posts sorted by engagement and uses the X API directly, which adds useful context beyond the input schema. However, it doesn't cover rate limits, authentication requirements beyond the apiKey parameter, error handling, or what the output looks like (there is no output schema), leaving gaps for a tool that calls an external API.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise (three short sentences) and front-loaded with the core purpose. Every sentence adds critical information: the first states what the tool does, the second describes how results are ordered, and the third clarifies its direct X API access. There's zero wasted text, making it highly efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (5 parameters, no annotations, no output schema), the description is incomplete. It covers the purpose and basic behavior but lacks details on output format, error conditions, or usage constraints. While it's adequate for a simple search tool, the absence of output schema means the description should ideally hint at return values, which it doesn't.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema (e.g., it doesn't explain the 'query' format or 'apiKey' usage in more detail). This meets the baseline of 3 when the schema does the heavy lifting, but no extra value is provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('search', 'returns') and resources ('X/Twitter', 'social sentiment', 'posts'), and distinguishes it from web search by specifying it uses the X API directly. This makes the purpose unambiguous and differentiated from potential alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool (searching X/Twitter for social sentiment) and explicitly states it's not available via web search, which helps differentiate it from general search tools. However, it doesn't mention when not to use it or name specific alternatives among the many sibling tools, which prevents a perfect score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

stt (grade C)

Speech-to-text proxy. Pass base64-encoded audio, get transcribed text.
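
To make the input format concrete, here is a minimal sketch of preparing the `audio` argument. It shows only the base64 step. The helper name `encode_audio` is hypothetical, the byte string is a stand-in rather than a real recording, and the listing does not say which audio formats or sizes the proxy accepts.

```python
import base64


def encode_audio(raw: bytes) -> str:
    """Return the base64 text form of raw audio bytes."""
    return base64.b64encode(raw).decode("ascii")


fake_audio = b"RIFF\x00\x00\x00\x00WAVE"  # stand-in bytes, not a valid recording
arguments = {
    "audio": encode_audio(fake_audio),
    "apiKey": "YOUR_API_KEY",  # placeholder, not a real key
}
```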

Parameters
Name | Required | Description | Default
audio | Yes | Base64-encoded audio bytes |
apiKey | Yes | SimpleFunctions API key |
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description must carry the full burden. It only says 'proxy' without explaining what happens underneath, any restrictions, or return format details. Critical behavioral traits (e.g., supported audio codecs, latency) are omitted.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise, using only two short sentences to convey the core functionality. It avoids unnecessary words, though slightly more detail could be added without losing conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and simple parameters, the description covers the basic usage. However, it lacks important details like supported audio formats, maximum file size, and error handling, which are needed for reliable invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, with clear descriptions for both parameters. The description does not add extra meaning beyond the schema, so a baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is a speech-to-text proxy that accepts base64-encoded audio and returns transcribed text. The purpose is unambiguous and distinguishes it from sibling tools like tts (text-to-speech).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. There is no mention of when not to use it or what prerequisites (e.g., audio format, size limits) are needed.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

subscribe_forum (grade C)

Subscribe to forum channels to receive messages in your inbox.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key |
channels | Yes | Channel IDs to subscribe to |
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description should convey behavioral details. It only states the basic action without disclosing side effects (e.g., persistent state), authorization needs beyond the API key, or what happens on duplicate subscriptions. This is insufficient for a subscription tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single sentence of 10 words, efficiently conveying the core purpose. No unnecessary words, perfectly front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple action with two parameters, the description omits what the tool returns (e.g., confirmation, list of subscribed channels) and lacks behavioral context. Given no annotations or output schema, more detail is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters clearly described. The tool description adds no additional meaning beyond the schema, so baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('subscribe') and the resource ('forum channels'), and specifies the result ('receive messages in your inbox'). It distinguishes from siblings like 'read_forum' (reading messages) and 'post_to_forum' (posting), though not explicitly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives like 'read_forum' or 'list_forum_channels'. The description implies usage for receiving messages but does not provide context for exclusion or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

trigger_evaluation (grade A)

Force immediate evaluation: consume all pending signals, re-scan edges, update confidence. Use after injecting important signals.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
thesisId | Yes | Thesis ID |
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses key behavioral traits: it's a forced, immediate action that processes pending signals and updates confidence, implying mutation and potential side effects. However, it lacks details on permissions, rate limits, or what 'update confidence' entails in practice. The description doesn't contradict any annotations, as none exist.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded: two sentences that directly state the action and usage context with zero waste. Every word earns its place, making it easy for an AI agent to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (a mutation tool with no annotations and no output schema), the description is somewhat complete but has gaps. It explains what the tool does and when to use it, but lacks details on behavioral aspects like error handling, response format, or prerequisites. For a tool that forces evaluation, more context on outcomes or limitations would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters (thesisId and apiKey) with descriptions. The description doesn't add any meaning beyond what the schema provides, such as explaining how these parameters relate to the evaluation process. Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as 'Force immediate evaluation' with specific actions: 'consume all pending signals, re-scan edges, update confidence.' This is a specific verb+resource combination that distinguishes it from siblings like 'get_evaluation_history' or 'monitor_the_situation.' However, it doesn't explicitly differentiate from all possible alternatives like 'update_thesis' which might also affect evaluation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: 'Use after injecting important signals.' This gives a specific scenario, but it doesn't explicitly state when not to use it or name alternatives among siblings. For example, it doesn't contrast with 'get_evaluation_history' for passive monitoring or 'update_thesis' for manual updates.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

tts (grade A)

Text-to-speech proxy. Returns audio bytes encoded as base64.
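
Because the audio comes back as base64 text, a caller has to decode it before playback or saving. The sketch below covers only that decoding step. The helper name `decode_audio` is hypothetical, the sample string simulates a tool result, and the actual response envelope and audio format are not documented in the listing.

```python
import base64
import binascii


def decode_audio(b64_text: str) -> bytes:
    """Decode base64 audio text, rejecting malformed input."""
    try:
        return base64.b64decode(b64_text, validate=True)
    except binascii.Error as exc:
        raise ValueError("response is not valid base64") from exc


sample = base64.b64encode(b"\x00\x01audio").decode("ascii")  # simulated tool output
audio_bytes = decode_audio(sample)
```

Passing `validate=True` makes the decoder reject characters outside the base64 alphabet instead of silently skipping them.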

Parameters
Name | Required | Description | Default
text | Yes | Text to synthesize |
speed | No | Playback speed multiplier |
apiKey | Yes | SimpleFunctions API key |
voiceId | No | Voice identifier (provider-specific) |
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description must carry behavioral context. It discloses the return format (base64 audio bytes) but does not mention whether the operation is read-only, side effects, or authentication requirements beyond schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no redundancy. First sentence states purpose, second specifies output format. Extremely concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers purpose and output format for a 4-parameter tool with no output schema. Could briefly mention the need for an API key, but schema covers that.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage for all 4 parameters. The description adds no additional parameter-level information beyond what's in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it's a text-to-speech proxy and specifies the output as audio bytes encoded as base64. Differentiates from sibling 'stt' (speech-to-text) by its core function.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives like 'stt'. Does not mention prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_nodes (grade A)

Directly update causal tree node probabilities — zero LLM cost, instant. Use when the agent observes a confirmed fact (e.g. "CPI came in at 3.2%") and wants to reflect it immediately. Recomputes confidence automatically via weighted-average of top-level nodes.
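
The recomputation step can be pictured as a plain weighted average over the top-level nodes. The sketch below is an illustration under that assumption, not the server's implementation: the `Node` fields and the function name are hypothetical, and the listing does not say how weights are assigned or normalized.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: str        # hypothetical field names
    probability: float  # in [0, 1]
    weight: float       # relative importance, > 0


def recompute_confidence(top_level: list[Node]) -> float:
    """Weighted average of top-level node probabilities."""
    total_weight = sum(n.weight for n in top_level)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(n.weight * n.probability for n in top_level) / total_weight


nodes = [Node("n1", 0.60, 2.0), Node("n2", 0.40, 1.0)]
before = recompute_confidence(nodes)

nodes[0].probability = 0.99  # a confirmed fact raises the first node
after = recompute_confidence(nodes)
```

In this toy tree, raising the heavier node from 0.60 to 0.99 moves confidence from roughly 0.53 to roughly 0.79.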

Parameters
Name | Required | Description | Default
lock | No | Advisory pin: tells future evaluations that these nodes are well-supported. Evidence can still revise them; pinning no longer makes a node permanent. |
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
updates | Yes | Node updates to apply |
thesisId | Yes | Thesis ID |
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It discloses that updates are instant and zero LLM cost, and that confidence is recomputed automatically. However, it does not specify whether previous probabilities are overwritten, if the action is reversible, or any side effects beyond the probability updates. The transparency is adequate but not exhaustive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, front-loaded with the core action and benefit, followed by usage guidance and automatic behavior. Every sentence adds value; no filler or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the medium complexity (update tool, no output schema), the description covers the main use case, key behavior, and usage scenario. It does not mention that the thesisId and apiKey are required, but those are clear from the schema. The description is nearly complete for an agent to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description reinforces the parameter guidance (e.g., 'Use 0.99 for confirmed facts') but this is already present in the schema. No additional semantic value beyond what the schema provides is added by the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states it updates causal tree node probabilities with specific verbs ('update', 'recomputes'). It distinguishes from sibling tools like update_position or update_thesis by specifying 'causal tree node probabilities' and emphasizing zero LLM cost and instant execution.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a clear when-to-use scenario: 'Use when the agent observes a confirmed fact...' It explains the automatic recomputation behavior but does not explicitly state when not to use it or mention alternatives among siblings. The guidance is strong but lacks exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_position (grade B)

Update a position: current price, edge, size, status (open→closed). Use to mark positions as closed or update tracking data.

Parameters
Name | Required | Description | Default
edge | No | Current edge in cents |
size | No | Updated size |
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
status | No | open or closed |
thesisId | Yes | Thesis ID |
rationale | No | Updated rationale |
positionId | Yes | Position ID |
currentPrice | No | Current market price in cents |
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It mentions the tool can update fields and change status from open→closed, implying mutation. However, it doesn't disclose critical behavioral traits: whether this requires specific permissions, if updates are reversible, what happens to unchanged fields, error conditions, or response format. For a mutation tool with zero annotation coverage, this is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded: two concise sentences that immediately state the purpose and usage. Every word earns its place with no redundancy or fluff. It efficiently communicates core information without wasting space.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (mutation with 8 parameters), lack of annotations, and no output schema, the description is incomplete. It should explain more about behavioral aspects (e.g., permissions, side effects, response format) and usage constraints. The description alone doesn't provide enough context for safe and effective use by an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 8 parameters thoroughly. The description lists some parameters (current price, edge, size, status) but doesn't add meaning beyond what the schema provides—it doesn't explain relationships between parameters or usage nuances. With high schema coverage, the baseline is 3 even without extra param info in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Update a position' with specific fields (current price, edge, size, status). It distinguishes from 'add_position' (create new) and 'close_position' (only close), but doesn't explicitly contrast with 'update_thesis' or 'update_strategy' which might be related operations. The verb+resource+scope is well-defined but sibling differentiation is incomplete.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides some usage context: 'Use to mark positions as closed or update tracking data.' This implies when to use it (closing positions or updating tracking data) but doesn't explicitly state when NOT to use it or mention alternatives like 'close_position' for just closing or 'update_thesis' for higher-level updates. The guidance is helpful but not comprehensive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_strategy (grade C)

Advanced: Update a trading strategy (stop loss, take profit, status).

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
status | No | New status: active|watching|executed|cancelled|review |
priority | No | New priority |
stopLoss | No | New stop loss (cents) |
thesisId | Yes | Thesis ID |
rationale | No | Updated rationale |
entryAbove | No | New entry above trigger (cents) |
entryBelow | No | New entry below trigger (cents) |
strategyId | Yes | Strategy ID (UUID) |
takeProfit | No | New take profit (cents) |
softConditions | No | Updated soft conditions |
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It mentions updating fields but doesn't disclose behavioral traits like required permissions, whether changes are reversible, rate limits, or error conditions. For a mutation tool with 11 parameters, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence with no wasted words. However, 'Advanced:' is unnecessary and doesn't add value, slightly reducing effectiveness. It's front-loaded with the core action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex mutation tool with 11 parameters and no annotations or output schema, the description is inadequate. It lacks context on permissions, side effects, return values, or error handling. The agent would struggle to use this tool correctly without additional information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema documents all parameters thoroughly. The description lists three example fields (stop loss, take profit, status) but doesn't add meaning beyond what the schema provides. Baseline 3 is appropriate as the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states that the tool updates a trading strategy and names three fields (stop loss, take profit, status), which conveys the basic action but leaves the scope vague. The 'Advanced:' prefix adds no clarity. The description doesn't distinguish the tool from siblings like 'update_position' or 'update_thesis', leaving ambiguity about scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like 'update_position' or 'update_thesis'. The 'Advanced:' prefix might imply complexity but doesn't specify prerequisites or contexts. The agent must infer usage from the description alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_thesis (grade A)

Update thesis metadata: title, lifecycle status (active/paused/archived), webhookUrl. Use configure_heartbeat.paused for runtime heartbeat pause/resume.
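
The constraints stated here and in the schema (three lifecycle statuses, HTTPS-only webhook) can be checked on the client before calling the tool. The following is a minimal sketch of such a pre-flight check. The function name is hypothetical, and whether the server rejects or ignores invalid values is not stated in the listing.

```python
from typing import Optional
from urllib.parse import urlparse

ALLOWED_STATUSES = {"active", "paused", "archived"}


def build_update_thesis_args(
    thesis_id: str,
    api_key: str,
    title: Optional[str] = None,
    status: Optional[str] = None,
    webhook_url: Optional[str] = None,
) -> dict:
    """Assemble update_thesis arguments, including only the fields being changed."""
    args = {"thesisId": thesis_id, "apiKey": api_key}
    if title is not None:
        args["title"] = title
    if status is not None:
        if status not in ALLOWED_STATUSES:
            raise ValueError(f"status must be one of {sorted(ALLOWED_STATUSES)}")
        args["status"] = status
    if webhook_url is not None:
        if urlparse(webhook_url).scheme != "https":
            raise ValueError("webhookUrl must use HTTPS")
        args["webhookUrl"] = webhook_url
    return args


args = build_update_thesis_args(
    "thesis-123",    # hypothetical ID
    "YOUR_API_KEY",  # placeholder, not a real key
    status="paused",
    webhook_url="https://example.com/hook",
)
```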

ParametersJSON Schema
NameRequiredDescriptionDefault
titleNoNew thesis title
apiKeyYesSimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys
statusNoNew status: active, paused, archived
thesisIdYesThesis ID
webhookUrlNoWebhook URL for confidence change notifications (HTTPS only)
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Only lists updatable fields; lacks side effects, permissions, rate limits, or error conditions. Minimal behavioral disclosure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no fluff. First sentence states purpose and scope, second redirects to alternative tool. Efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers purpose and some cross-referencing but omits return value, error states, and authorization details (apiKey only in schema). Adequate for a simple update but not fully complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage 100% with detailed parameter descriptions. Description adds context of 'lifecycle status' but no additional parameter meaning beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description explicitly states 'Update thesis metadata' and lists specific fields (title, lifecycle status, webhookUrl). Clearly distinguishes from sibling tools like create_thesis and configure_heartbeat.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says to use configure_heartbeat.paused for heartbeat pause/resume, providing a clear alternative. Could mention prerequisites like apiKey but still effective.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

what_if (grade B)

Scenario analysis: "what if OPEC cuts production?" Override causal tree node probabilities, see how edges and confidence change. Zero LLM cost, instant.

Parameters
Name | Required | Description | Default
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
thesisId | Yes | Thesis ID |
overrides | Yes | Node probability overrides, e.g. {"n1": 0.1, "n3": 0.2} |
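
To make the shape of `overrides` concrete, the following sketch builds a what_if request as an MCP `tools/call` payload. The helper `build_what_if_call` is hypothetical, and the requirement that each probability lie in [0, 1] is an assumption: the schema shows example values (0.1, 0.2) but states no range.

```python
import json


def build_what_if_call(api_key, thesis_id, overrides, request_id=1):
    """Build a JSON-RPC `tools/call` request for what_if (hypothetical helper)."""
    if not overrides:
        raise ValueError("overrides must contain at least one node")
    for node_id, probability in overrides.items():
        is_number = (isinstance(probability, (int, float))
                     and not isinstance(probability, bool))
        # Assumed range check; the schema does not document valid values.
        if not is_number or not 0.0 <= probability <= 1.0:
            raise ValueError(f"override for {node_id!r} must be a number in [0, 1]")

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "what_if",
            "arguments": {
                "apiKey": api_key,
                "thesisId": thesis_id,
                "overrides": dict(overrides),
            },
        },
    }


# Node IDs follow the schema's example; the key and thesis ID are placeholders.
request = build_what_if_call("YOUR_API_KEY", "YOUR_THESIS_ID", {"n1": 0.1, "n3": 0.2})
print(json.dumps(request, indent=2))
```

Whether the overrides are persisted or applied only for the duration of the call is not stated in the description, so the sketch makes no claim about it.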
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It mentions 'Zero LLM cost, instant' which hints at performance and cost efficiency, but lacks details on permissions, rate limits, side effects (e.g., whether overrides are saved or transient), or response format. For a mutation-like tool (overrides), this is insufficient behavioral disclosure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with two sentences: one stating the purpose with an example, and another highlighting benefits. It's front-loaded with the core functionality. However, the first sentence could be slightly more structured, and the example is embedded in quotes, which is mildly distracting.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations, no output schema, and a tool that involves overrides (potentially mutation-like), the description is incomplete. It lacks information on what the tool returns (e.g., updated edges/confidence), error handling, or dependencies. The mention of 'Zero LLM cost, instant' adds some context but doesn't compensate for major gaps in behavioral and output details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters (thesisId, apiKey, overrides) with descriptions. The description adds minimal value by mentioning 'Override causal tree node probabilities' which aligns with the 'overrides' parameter, but doesn't provide additional syntax or format details beyond what the schema specifies. Baseline 3 is appropriate given high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs 'scenario analysis' with the example 'what if OPEC cuts production?' and specifies the action of overriding node probabilities to see changes in edges and confidence. It distinguishes from siblings like 'update_nodes' by focusing on hypothetical analysis rather than direct updates. However, it mentions the causal tree only in passing and never names the thesis that owns it, leaving the target resource to context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for scenario analysis with overrides, mentioning 'Zero LLM cost, instant' as a benefit. It doesn't explicitly state when to use this tool versus alternatives like 'update_nodes' (which might modify actual data) or 'explore_public' (for browsing), leaving some ambiguity. No exclusions or prerequisites are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

x_account (grade C)

Advanced: Recent posts from a specific X/Twitter account.

Parameters
Name | Required | Description | Default
hours | No | Hours to look back (default 24) |
limit | No | Max posts (default 20) |
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
username | Yes | X/Twitter username (without @) |
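
The schema asks for the username without a leading "@" and gives defaults of 24 hours and 20 posts. A small sketch of preparing those arguments follows; the helper `build_x_account_call` is hypothetical, and stripping the "@" and rejecting non-positive values on the client are assumptions rather than documented server behavior.

```python
import json


def build_x_account_call(api_key, username, hours=24, limit=20, request_id=1):
    """Build a JSON-RPC `tools/call` request for x_account (hypothetical helper)."""
    # The schema wants the handle without "@", so drop it if the caller included one.
    handle = username.strip().lstrip("@")
    if not handle:
        raise ValueError("username must not be empty")
    # Assumed sanity checks; the schema documents defaults but no bounds.
    if hours <= 0 or limit <= 0:
        raise ValueError("hours and limit must be positive")

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "x_account",
            "arguments": {
                "apiKey": api_key,
                "username": handle,
                "hours": hours,
                "limit": limit,
            },
        },
    }


# Placeholder key and handle, not real values.
request = build_x_account_call("YOUR_API_KEY", "@example_user", hours=48)
print(json.dumps(request["params"]["arguments"], indent=2))
```

Sending the defaults explicitly keeps the request self-describing; leaving them out should be equivalent if the server applies the documented defaults.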
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It mentions 'Advanced' and 'Recent posts' but doesn't explain what 'Advanced' entails, whether this is a read-only operation, rate limits, authentication requirements beyond the apiKey parameter, or what the output format looks like. For a tool with external API integration and no annotations, this leaves significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief and front-loaded with the core purpose. The single sentence is efficient with no wasted words. However, the 'Advanced:' prefix adds minimal value without explanation, slightly reducing effectiveness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (external API integration, 4 parameters, no output schema, and no annotations), the description is incomplete. It doesn't address authentication behavior, rate limits, error handling, or output format. For a tool that interacts with an external service and has no structured output documentation, more contextual information is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema (e.g., it doesn't explain how 'username' relates to X/Twitter handles or what 'hours' means operationally). With high schema coverage, the baseline score of 3 is appropriate as the description doesn't compensate with additional semantic context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: retrieving recent posts from a specific X/Twitter account. It specifies the resource (posts) and action (retrieving recent ones), making the verb+resource relationship explicit. However, it doesn't distinguish this tool from its siblings 'search_x' and 'x_news', which appear to be related X/Twitter tools, so it falls short of full sibling differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools like 'search_x' or 'x_news', nor does it specify prerequisites or constraints beyond what's implied in the parameters. The word 'Advanced' is vague and doesn't offer practical usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

x_news (grade C)

Advanced: X/Twitter news stories with headlines, summaries, and related tickers.

Parameters
Name | Required | Description | Default
limit | No | Max stories (default 10) |
query | Yes | News search query |
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'Advanced' but doesn't explain what that entails (e.g., rate limits, authentication needs, data freshness, or pagination). The description lacks details on what the tool returns, error handling, or any side effects, leaving significant gaps for a tool with external API dependencies.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with a single sentence that front-loads key information ('Advanced: X/Twitter news stories...'). There's no wasted text, and it efficiently communicates the core functionality without unnecessary elaboration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (external API integration, 3 parameters, no output schema, and no annotations), the description is incomplete. It doesn't cover behavioral aspects like authentication, rate limits, return format, or error handling. Without annotations or output schema, the description should provide more context to guide effective use, but it falls short.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters (limit, query, apiKey). The description doesn't add any meaningful semantics beyond what's in the schema, such as query format examples, apiKey usage context, or limit constraints. Baseline 3 is appropriate when the schema handles parameter documentation adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: retrieving X/Twitter news stories with specific attributes (headlines, summaries, related tickers). It names a specific resource (X/Twitter news stories) but no explicit verb, and it doesn't differentiate the tool from siblings like 'search_x' or 'query', which might have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The description mentions 'Advanced' but doesn't explain what makes it advanced or when to prefer it over siblings like 'search_x' or 'query'. There's no mention of prerequisites, exclusions, or specific contexts for usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

x_volume (grade C)

Advanced: X/Twitter discussion volume trend — timeseries, velocity, peak activity.

Parameters
Name | Required | Description | Default
hours | No | Hours to look back (default 72) |
query | Yes | Search query |
apiKey | Yes | SimpleFunctions API key. Get one at https://simplefunctions.dev/dashboard/keys |
granularity | No | Timeseries granularity | hour
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It mentions what metrics are returned (timeseries, velocity, peak activity) but doesn't cover critical aspects: authentication requirements (implied by apiKey parameter but not stated), rate limits, data freshness, whether it's a read-only operation, or what format the output takes. For a tool with external API dependencies, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief and front-loaded with the core purpose. The single sentence efficiently conveys the tool's focus. However, the 'Advanced:' prefix adds minimal value and could be integrated more smoothly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 4 parameters, no annotations, and no output schema, the description is incomplete. It doesn't explain the relationship between parameters (e.g., how query interacts with hours/granularity), doesn't describe the output format, and doesn't cover authentication or rate limiting considerations. The description should do more given the lack of structured metadata.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents all 4 parameters. The description doesn't add any parameter-specific information beyond what's in the schema. It mentions 'timeseries' which relates to the granularity parameter, but doesn't explain this connection. Baseline 3 is appropriate when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool analyzes Twitter/X discussion volume trends with specific metrics (timeseries, velocity, peak activity). It distinguishes itself from sibling 'search_x' by focusing on volume trends rather than search results. However, it doesn't specify the exact verb (e.g., 'analyze' or 'retrieve') which prevents a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'search_x' or 'x_account'. It doesn't mention prerequisites, use cases, or exclusions. The word 'Advanced' hints at complexity but doesn't clarify appropriate contexts.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
