Trillboards DOOH Advertising
Server Details
DOOH advertising via AI agents. 5,000+ screens with edge AI audience intelligence.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
- Repository: trillboards/dooh-agent-example
- GitHub Stars: 0
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.3/5 across 74 of 74 tools scored. Lowest: 3.4/5.
Many tools have distinct purposes (e.g., anomaly_detect vs. get_live_audience), but there is significant overlap in data retrieval and analysis functions. For example, get_analytics, get_network_stats, and get_campaign_performance all provide performance metrics with unclear boundaries, and query_observations overlaps with semantic_search_observations and find_similar_moments. Descriptions help differentiate, but the sheer number of tools (74) creates ambiguity in selection.
Most tools follow a consistent verb_noun or verb_adjective_noun pattern (e.g., get_audience_forecast, create_campaign, list_devices), with clear action prefixes like get, create, update, delete, list, and query. There are minor deviations (e.g., anomaly_detect vs. predictive_query, where 'detect' and 'query' are less standardized), but overall naming is predictable and readable across the set.
With 74 tools, the count is excessive for a single server, even given the broad domain of DOOH advertising. This leads to bloat and confusion, as many tools could be consolidated or omitted without losing functionality. While the domain is complex, a more focused set of 15-30 tools would improve coherence and reduce overlap, making it borderline overwhelming for agents to navigate effectively.
The tool set comprehensively covers the DOOH advertising domain, including campaign management (create, update, get performance), audience sensing (configure, query, forecast), attribution (get_attribution_timeseries, get_roas), billing (setup, get usage), and integration features (webhooks, AdCP protocols). There are no obvious gaps; tools support full CRUD operations, analytics, and advanced workflows like experiments and predictive modeling, ensuring agents can handle end-to-end tasks without dead ends.
Available Tools
74 tools
activate_signal
[AdCP Signals] Activate an audience signal for DSP targeting.
Generates IAB Audience Taxonomy 1.1 segment parameters that DSPs can use in bid requests. Returns an activation_key token for referencing this signal activation.
WHEN TO USE:
Converting audience signals into actionable targeting parameters
Generating IAB segment IDs for programmatic bid requests
Creating reusable targeting configurations
RETURNS:
activation_key: Token for referencing this activation (24h expiry)
targeting: { iab_segments, iab_taxonomy_version, custom_params }
screen_count, provider, data_source, methodology
EXAMPLE: User: "Activate high-income professional targeting on my retail screens" activate_signal({ signal_type: "audience", parameters: { income: "high", lifestyle: "professional" }, screen_ids: ["507f1f77bcf86cd799439011"] })
| Name | Required | Description | Default |
|---|---|---|---|
| parameters | Yes | | |
| screen_ids | No | | |
| signal_type | Yes | | |
| destinations | No | | |
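Since none of the parameters above carry schema descriptions, the prose and EXAMPLE are all an agent has to go on. For orientation, a minimal sketch of invoking this tool over Streamable HTTP with the MCP TypeScript SDK; the server URL is a placeholder (the listing above omits it) and the arguments mirror the tool's own EXAMPLE.

```typescript
// Minimal sketch: calling activate_signal over Streamable HTTP via the
// MCP TypeScript SDK. The endpoint URL is a placeholder; argument values
// are copied from the tool's EXAMPLE.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

async function main() {
  const transport = new StreamableHTTPClientTransport(
    new URL("https://example.com/mcp"), // placeholder endpoint
  );
  const client = new Client({ name: "dooh-agent", version: "1.0.0" });
  await client.connect(transport);

  const result = await client.callTool({
    name: "activate_signal",
    arguments: {
      signal_type: "audience",
      parameters: { income: "high", lifestyle: "professional" },
      screen_ids: ["507f1f77bcf86cd799439011"],
    },
  });
  console.log(result); // per RETURNS: activation_key, targeting, screen_count
  await client.close();
}

main().catch(console.error);
```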
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It successfully documents the 24-hour expiry of the activation_key, specifies the return structure (targeting object with iab_segments), and explains that it generates segment parameters for DSP bid requests. Minor gap: does not clarify if re-activation is idempotent or if there are rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear section headers ([AdCP Signals], WHEN TO USE, RETURNS, EXAMPLE). Information is front-loaded with the core purpose in the first sentence. No redundant text; every sentence earns its place. The example is appropriately scoped to clarify usage without being verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero schema descriptions, no annotations, and no output schema, the description compensates well by documenting return values (activation_key structure, targeting object fields) and providing usage scenarios. Minor deduction for failing to document the 'destinations' parameter and omitting error handling or prerequisite information (e.g., screen registration requirements).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. The EXAMPLE section demonstrates semantics for signal_type ('audience'), parameters (income/lifestyle object), and screen_ids. However, the 'destinations' parameter (a complex nested array with seat_id and platform fields) is completely undocumented in the description, leaving a significant semantic gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool 'Activate[s] an audience signal for DSP targeting' and specifies it generates 'IAB Audience Taxonomy 1.1 segment parameters.' The verb (activate), resource (audience signal), and scope (DSP targeting) are clearly defined, distinguishing it from sibling tools like get_signals or cross_signal_correlate.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains an explicit 'WHEN TO USE' section with three specific scenarios: 'Converting audience signals into actionable targeting parameters,' 'Generating IAB segment IDs for programmatic bid requests,' and 'Creating reusable targeting configurations.' This provides clear guidance on when to select this tool over alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
anomaly_detect
Detect anomalies in observation patterns. Alert when metrics deviate significantly from trailing averages.
Computes trailing mean and standard deviation for a given metric from the observation_stream, then identifies observations that fall beyond the configured sigma threshold (z-score based anomaly detection).
WHEN TO USE:
Monitoring for unusual audience patterns (sudden spikes or drops in face count)
Detecting equipment anomalies (confidence drops indicating sensor issues)
Identifying unusual commerce or vehicle patterns
Finding outlier moments that may indicate events, incidents, or opportunities
RETURNS:
anomalies: Array of anomalous observations with:
observation_id, device_id, venue_type, observed_at
metric_value: The observed value
z_score: How many standard deviations from the mean
direction: 'above' or 'below' the mean
payload: Full observation payload for context
baseline: { mean, stddev, sample_count, lookback_hours }
suggested_next_queries: Follow-up queries to investigate anomalies
EXAMPLE: User: "Are there any unusual audience patterns at retail venues?" anomaly_detect({ metric: "face_count", venue_type: "retail", lookback_hours: 24, threshold_sigma: 2.0 })
User: "Detect anomalies in vehicle counts at this screen" anomaly_detect({ metric: "vehicle_count", screen_id: "507f1f77bcf86cd799439011", lookback_hours: 48, threshold_sigma: 2.5 })
| Name | Required | Description | Default |
|---|---|---|---|
| metric | Yes | | |
| screen_id | No | | |
| venue_type | No | | |
| lookback_hours | No | | |
| threshold_sigma | No | | |
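The description names the algorithm outright (trailing mean and standard deviation with a sigma threshold), so it can be written down directly. A self-contained sketch of that z-score logic; the Observation and Anomaly shapes are hypothetical, modeled on the RETURNS fields above.

```typescript
// z-score detection as the description outlines: compute the trailing mean
// and standard deviation of a metric, then flag observations whose deviation
// exceeds the sigma threshold. Types are hypothetical, based on RETURNS.
interface Observation {
  observation_id: string;
  metric_value: number;
}

interface Anomaly extends Observation {
  z_score: number;
  direction: "above" | "below";
}

function detectAnomalies(observations: Observation[], thresholdSigma = 2.0) {
  const n = observations.length;
  if (n === 0) return { anomalies: [] as Anomaly[], baseline: { mean: 0, stddev: 0, sample_count: 0 } };

  const values = observations.map((o) => o.metric_value);
  const mean = values.reduce((a, b) => a + b, 0) / n;
  const stddev = Math.sqrt(values.reduce((a, b) => a + (b - mean) ** 2, 0) / n);

  const anomalies = observations
    .filter((o) => stddev > 0 && Math.abs(o.metric_value - mean) / stddev > thresholdSigma)
    .map((o): Anomaly => ({
      ...o,
      z_score: (o.metric_value - mean) / stddev,
      direction: o.metric_value > mean ? "above" : "below",
    }));

  return { anomalies, baseline: { mean, stddev, sample_count: n } };
}
```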
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Explains the computational algorithm (trailing mean/stddev calculation), discloses the exact return structure including anomalies array with z_score/direction fields and baseline statistics, and mentions 'suggested_next_queries' for follow-up behavior. Lacks explicit read-only declaration or rate limit warnings, but the detailed return contract compensates significantly.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear section headers (WHEN TO USE, RETURNS, EXAMPLE). Information is front-loaded with purpose and algorithm. No redundant text; each section serves distinct needs (use cases, return contract, invocation patterns). Appropriate length for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 5 parameters, no output schema in structured fields, and no annotations, the description achieves completeness through detailed RETURNS section documenting the exact anomaly object structure, baseline metadata, and follow-up suggestions. Algorithm explanation and multiple use cases cover behavioral complexity sufficiently for agent invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage. The description compensates partially through detailed examples showing metric='face_count', threshold_sigma=2.0, etc., from which parameter purposes can be inferred. However, it lacks explicit parameter definitions (e.g., no direct explanation that 'metric' is the field name to analyze), relying entirely on contextual inference from examples and the algorithm description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opening sentences specify the exact statistical method (z-score based anomaly detection using trailing mean/standard deviation) and the core action (detect anomalies in observation patterns). Distinguishes from siblings like query_observations or get_analytics by emphasizing statistical deviation detection rather than raw data retrieval.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section lists four concrete scenarios (audience pattern monitoring, equipment anomalies, commerce/vehicle patterns, outlier moment detection). Includes two detailed usage examples showing different parameter combinations for retail audience vs vehicle detection, providing clear invocation patterns.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
batch_impressions
Record multiple impressions in a single request (up to 100).
WHEN TO USE:
Bulk reporting impressions from offline period
Efficient batch processing of impressions
When device was offline and needs to sync
RETURNS:
success: Boolean indicating success
processed: Number of impressions processed
failed: Number of failed impressions
total_earnings: Total earnings credited
errors: Any error details for failed impressions
EXAMPLE: User: "Sync the last hour of impressions" batch_impressions({ impressions: [ { fingerprint: "P_abc123", ad_id: "507f1f77bcf86cd799439011", duration_seconds: 15 }, { fingerprint: "P_abc123", ad_id: "507f1f77bcf86cd799439012", duration_seconds: 10 } ] })
| Name | Required | Description | Default |
|---|---|---|---|
| impressions | Yes | | |
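Because a single request tops out at 100 impressions, syncing a long offline backlog means chunking. A sketch under that assumption; `callBatchImpressions` is a stand-in for the MCP tool call, and the Impression fields are the three shown in the EXAMPLE.

```typescript
// Split an offline backlog into chunks of at most 100 impressions (the
// per-request limit the description states) and submit each chunk.
// `callBatchImpressions` stands in for an MCP callTool invocation.
interface Impression {
  fingerprint: string; // fields taken from the tool's EXAMPLE
  ad_id: string;
  duration_seconds: number;
}

const MAX_BATCH = 100;

async function syncImpressions(
  backlog: Impression[],
  callBatchImpressions: (impressions: Impression[]) => Promise<{ processed: number; failed: number }>,
): Promise<{ processed: number; failed: number }> {
  let processed = 0;
  let failed = 0;
  for (let i = 0; i < backlog.length; i += MAX_BATCH) {
    const chunk = backlog.slice(i, i + MAX_BATCH);
    const result = await callBatchImpressions(chunk);
    processed += result.processed;
    failed += result.failed;
  }
  return { processed, failed };
}
```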
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It compensates well by documenting the return structure (success, processed, failed, total_earnings, errors) and disclosing the financial side effect ('total_earnings credited'). However, it lacks mention of idempotency guarantees, rate limits, or authorization requirements for the batch operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clearly demarcated sections (purpose, WHEN TO USE, RETURNS, EXAMPLE). Information is front-loaded and scannable. Every sentence serves a distinct purpose—no redundancy or verbosity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema, the RETURNS section adequately documents response fields. The example provides concrete usage patterns. However, with 0% schema coverage and nested object complexity, the description should explicitly define the impression object fields rather than relying solely on the example.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. While the example demonstrates the JSON structure, the description fails to explain what 'fingerprint' (device identifier?), 'ad_id', or 'duration_seconds' semantically represent. No parameter meanings are defined beyond the example values.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states 'Record multiple impressions in a single request (up to 100)'—a specific verb + resource with clear batch scope. This effectively distinguishes it from the sibling tool 'record_impression' (singular) by emphasizing bulk processing capabilities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Includes a dedicated 'WHEN TO USE' section with three specific scenarios: bulk reporting from offline periods, efficient batch processing, and device sync after offline status. This provides explicit guidance on when to prefer this over the single-record alternative.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
configure_sensing
Configure what a screen should sense using natural language. Generates and optionally pushes a sensing profile to the device.
Uses Gemini AI to interpret a natural language sensing intent and generate a sensing profile that maps to available on-device ML models (BlazeFace, AgeGender, FER+, MoveNet, YAMNet, WhisperTiny, EfficientDet, YOLOv8-nano).
WHEN TO USE:
Setting up a new screen to sense specific things (faces, vehicles, emotions, etc.)
Changing what a screen detects based on venue type or business needs
Configuring custom sensing for special events or campaigns
Translating business intent into ML model configuration
RETURNS:
data: The generated sensing profile with:
profile_name, profile_type, description
models: Array of ML model IDs to activate
classes: COCO classes to detect (for object detection models)
thresholds: Confidence and alert thresholds
observation_families: What types of observations will be produced
capture_interval_ms, report_interval_ms: Timing configuration
estimated_fps_impact: CPU cost estimate
data_fields_produced: All data fields the profile will generate
reasoning: Why these models/classes were chosen
deployment_status: 'generated' | 'pushed' | 'push_failed'
metadata: { screen_id, auto_deploy, profile_id }
suggested_next_queries: Follow-up actions
EXAMPLE: User: "Set up the lobby screen to detect foot traffic and emotions" configure_sensing({ screen_id: "507f1f77bcf86cd799439011", intent: "Detect foot traffic patterns, count people, and measure emotional reactions to displayed content", auto_deploy: false })
User: "Configure this drive-through screen for vehicle counting" configure_sensing({ screen_id: "507f1f77bcf86cd799439011", intent: "Count vehicles in drive-through lane, detect vehicle types, measure queue length", auto_deploy: true })
| Name | Required | Description | Default |
|---|---|---|---|
| intent | Yes | | |
| screen_id | Yes | | |
| auto_deploy | No | | |
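The RETURNS section is effectively this tool's output contract, so transcribing it as a type makes the shape concrete. A hypothetical TypeScript rendering of the documented fields; the exact types are assumptions inferred from the prose.

```typescript
// Hypothetical typing of the sensing profile configure_sensing returns,
// transcribed from the RETURNS section; field types are assumptions.
interface SensingProfile {
  profile_name: string;
  profile_type: string;
  description: string;
  models: string[];                    // ML model IDs to activate (e.g., "BlazeFace")
  classes?: string[];                  // COCO classes, for object detection models
  thresholds: Record<string, number>;  // confidence and alert thresholds
  observation_families: string[];      // types of observations produced
  capture_interval_ms: number;
  report_interval_ms: number;
  estimated_fps_impact: number;        // CPU cost estimate
  data_fields_produced: string[];
  reasoning: string;                   // why these models/classes were chosen
}

interface ConfigureSensingResult {
  data: SensingProfile;
  deployment_status: "generated" | "pushed" | "push_failed";
  metadata: { screen_id: string; auto_deploy: boolean; profile_id: string };
  suggested_next_queries: string[];
}
```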
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, yet description carries full burden excellently: discloses Gemini AI involvement, lists specific ML models available (BlazeFace, YOLOv8-nano, etc.), details deployment_status outcomes ('generated' | 'pushed' | 'push_failed'), warns of CPU cost via estimated_fps_impact, and documents side effects of profile generation/pushing.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear headers (WHEN TO USE, RETURNS, EXAMPLE). Front-loads the core purpose. Length is substantial but justified by complexity of ML configuration domain. Each section earns its place, though RETURNS section is verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, but description provides extensive RETURNS documentation covering profile structure, timing configuration, and data fields. Compensates for missing annotations with behavioral details. Minor gap: could explicitly document parameter formats rather than relying solely on examples.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage. Description compensates partially via two detailed JSON examples showing screen_id format (MongoDB ObjectID), intent string patterns, and auto_deploy boolean usage. However, lacks explicit parameter definitions or format specifications outside of examples, leaving some semantic ambiguity.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific action ('Configure what a screen should sense') and mechanism ('using natural language'), immediately distinguishes from sibling query tools like get_signals or query_observations. Explicitly names the AI backend (Gemini) and target resource (sensing profiles).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains explicit 'WHEN TO USE' section with four concrete scenarios (setting up new screens, changing venue detection, special events, translating business intent). Lacks explicit 'when not to use' or direct comparison to siblings like query_observations, but provides strong contextual guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_campaign
Create a new advertising campaign targeting DOOH screens.
WHEN TO USE:
Setting up a new ad campaign on Trillboards screens
Targeting specific venues, locations, or audience profiles
Allocating budget for programmatic DOOH buys
RETURNS:
campaign_id: Unique campaign identifier (UUID)
name, status, budget, screen_count, dates
Campaign starts in "draft" status. Use update_campaign to set status to "active".
EXAMPLE: User: "Create a campaign targeting retail screens in NYC at $5 CPM" create_campaign({ name: "NYC Retail Q1", budget_cpm: 5.0, daily_budget_usd: 100, venue_types: ["retail"], targeting: { geo: { city: "New York", state: "NY" } }, creative_url: "https://cdn.example.com/ad.mp4", start_date: "2026-03-01", end_date: "2026-03-31" })
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | | |
| end_date | No | | |
| targeting | No | | |
| budget_cpm | No | | |
| screen_ids | No | | |
| start_date | No | | |
| venue_types | No | | |
| creative_url | No | | |
| creative_type | No | | |
| daily_budget_usd | No | | |
| total_budget_usd | No | | |
| creative_duration | No | | |
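The description documents a two-step lifecycle: campaigns are created in "draft" and go live via update_campaign. A sketch of that flow; update_campaign's argument names are assumptions, since its schema is not shown in this listing.

```typescript
// Two-step lifecycle per the description: create in "draft", then activate
// via update_campaign. The update_campaign argument names are assumptions;
// only create_campaign's shape appears in this listing.
async function launchCampaign(
  callTool: (name: string, args: Record<string, unknown>) => Promise<any>,
) {
  const created = await callTool("create_campaign", {
    name: "NYC Retail Q1", // arguments mirror the tool's EXAMPLE
    budget_cpm: 5.0,
    daily_budget_usd: 100,
    venue_types: ["retail"],
    targeting: { geo: { city: "New York", state: "NY" } },
    creative_url: "https://cdn.example.com/ad.mp4",
    start_date: "2026-03-01",
    end_date: "2026-03-31",
  });

  // Campaign starts in "draft"; a second call is needed to go live.
  await callTool("update_campaign", {
    campaign_id: created.campaign_id, // campaign_id is documented in RETURNS
    status: "active",                 // hypothetical argument name
  });
}
```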
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It successfully discloses that campaigns start in 'draft' status (mutation side-effect) and documents the return structure (campaign_id, status, budget, etc.) compensating for the lack of output schema. It lacks disclosure of auth requirements, rate limits, or error conditions, preventing a perfect score.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is optimally structured with clear section headers (WHEN TO USE, RETURNS, EXAMPLE) that front-load decision-making information. No sentences are wasted; the example is concrete and immediately follows the conceptual explanation. Length is appropriate for the complexity (12 parameters, nested objects).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a complex creation tool with 12 parameters, nested objects, no annotations, and no output schema, the description provides strong contextual support through the return value documentation and working example. It adequately compensates for missing structured metadata, though explicit parameter documentation would achieve full completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description must compensate significantly. The EXAMPLE section demonstrates realistic usage of 8 parameters (name, budget_cpm, daily_budget_usd, venue_types, targeting, creative_url, dates), providing implicit semantics. However, it fails to explain 4 undocumented parameters (screen_ids vs venue_types distinction, creative_type enum values, total_budget_usd vs daily budget logic, or targeting sub-field semantics), leaving gaps the agent must infer.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a clear, specific action: 'Create a new advertising campaign targeting DOOH screens.' It specifies the verb (create), resource (advertising campaign), and distinct scope (DOOH/digital out-of-home screens), effectively distinguishing it from siblings like create_media_buy or create_experiment.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The 'WHEN TO USE' section explicitly lists three scenarios for tool selection. Critically, it states 'Use update_campaign to set status to active,' explicitly naming the next-step alternative and clarifying the campaign lifecycle (draft → active), which prevents confusion about whether this tool immediately activates campaigns.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_experiment
Create an incrementality experiment for a campaign.
Sets up a geo-holdout, ghost ads, or propensity score matching experiment to causally measure DOOH advertising lift.
WHEN TO USE:
Setting up a new A/B test before or during a campaign
Defining treatment and control DMAs for geo-holdout tests
Configuring experiment parameters (holdout %, MDE, power)
RETURNS: The created experiment object with experiment_id, status, and all parameters.
EXAMPLE: create_experiment({ campaign_id: "camp_abc123", experiment_type: "geo_holdout", treatment_dmas: ["501", "504"], control_dmas: ["503", "505"], holdout_pct: 0.15, target_mde: 0.10 })
| Name | Required | Description | Default |
|---|---|---|---|
| target_mde | No | | |
| campaign_id | Yes | | |
| holdout_pct | No | | |
| control_dmas | No | | |
| target_alpha | No | | |
| target_power | No | | |
| treatment_dmas | No | | |
| experiment_type | Yes | | |
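As the review below notes, terms like DMA and MDE go undefined. For reference, the EXAMPLE call annotated with the conventional experiment-design reading of each parameter; these glosses are standard statistical usage, not confirmed by the server's documentation.

```typescript
// The EXAMPLE call, annotated with the conventional reading of each
// statistical parameter. Glosses are standard experiment-design usage,
// not confirmed by this server's docs; alpha/power values are the usual
// defaults and do not appear in the tool's EXAMPLE.
const experimentArgs = {
  campaign_id: "camp_abc123",
  experiment_type: "geo_holdout",
  treatment_dmas: ["501", "504"], // DMA: Designated Market Area codes exposed to the campaign
  control_dmas: ["503", "505"],   // markets withheld as the control group
  holdout_pct: 0.15,              // fraction of inventory held out of treatment
  target_mde: 0.10,               // MDE: minimum detectable effect (10% lift)
  target_alpha: 0.05,             // significance level (conventional default)
  target_power: 0.8,              // probability of detecting a true MDE-sized effect
};
```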
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. States it 'Sets up' experiments and returns an object with status, indicating persistence. However, lacks disclosure of side effects, idempotency behavior, prerequisites (e.g., whether campaign_id must exist), or error conditions for invalid DMA combinations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear headers (WHEN TO USE, RETURNS, EXAMPLE). Front-loaded purpose statement. No redundant text; each section serves distinct function. Example JSON is appropriately scoped to illustrate usage without verbosity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Strong coverage given no output schema: RETURNS section documents the return object structure. Addresses the complexity of 8 parameters with statistical concepts via example. Minor gap: does not mention prerequisite that campaign_id should exist or relationship to get_incrementality for analysis phase.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, description partially compensates via the EXAMPLE (shows 6/8 parameters) and WHEN TO USE (mentions holdout %, MDE, power). However, target_alpha is only in the example with no explanation, and critical terms like 'DMA' and 'MDE' are not defined. Adds some meaning but insufficient for full semantic understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb 'Create' and resource 'incrementality experiment', clearly distinguishing from sibling tools like create_campaign or create_media_buy. Explicitly lists the three experiment types (geo-holdout, ghost ads, PSM) that match the schema enum, clarifying scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains explicit 'WHEN TO USE' section with three specific scenarios including timing (before/during campaign) and configuration contexts. Lacks explicit 'when not to use' or reference to sibling get_incrementality for reading existing experiments, but provides strong contextual guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_media_buy
[AdCP Media Buy] Create a media buy (campaign) from an AdCP buy specification.
Creates a campaign that targets DOOH screens based on the provided specification. Returns a media_buy_id for tracking and a creative_deadline for asset submission.
WHEN TO USE:
Executing a programmatic DOOH buy via an AI agent
Creating campaigns from DSP trading desk agents
Automated media buying workflows
RETURNS:
media_buy_id: Unique identifier for this media buy
campaign_id: Internal campaign identifier
creative_deadline: Deadline for creative asset submission
targeting_summary: What was targeted
budget_summary: Budget allocation details
EXAMPLE: User: "Buy retail screens in NYC at $5 CPM for next week" create_media_buy({ name: "NYC Retail Week 12", buy_spec: { venue_types: ["retail"], geo: { city: "New York", state: "NY" }, budget: { daily_usd: 500, bid_cpm: 5.0 }, schedule: { start_date: "2026-03-16", end_date: "2026-03-22" } }, creative: { url: "https://cdn.example.com/creative.mp4", type: "video", duration_seconds: 15 }, buyer_ref: "agency-order-12345" })
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | | |
| brand | No | | |
| buy_spec | No | | |
| creative | No | | |
| end_time | No | | |
| packages | No | | |
| buyer_ref | No | | |
| start_time | No | | |
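The review below flags the nested buy_spec as undocumented, so a typing inferred purely from the EXAMPLE may help. Everything here is an assumption drawn from that one call; the undocumented parameters (brand, packages, start_time, end_time) are deliberately left out.

```typescript
// Hypothetical typing of the buy_spec and creative arguments, inferred
// solely from the tool's EXAMPLE. Undocumented parameters are omitted
// rather than guessed at.
interface BuySpec {
  venue_types?: string[];
  geo?: { city?: string; state?: string };
  budget?: { daily_usd?: number; bid_cpm?: number };
  schedule?: { start_date?: string; end_date?: string }; // ISO dates in the EXAMPLE
}

interface Creative {
  url: string;
  type: string; // only "video" appears in the EXAMPLE
  duration_seconds: number;
}

interface CreateMediaBuyArgs {
  name: string;       // the only required parameter per the table
  buy_spec?: BuySpec;
  creative?: Creative;
  buyer_ref?: string; // e.g., an agency order reference
}
```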
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description must carry full behavioral burden. It successfully discloses return values (media_buy_id, creative_deadline) and the targeting mechanism, but omits critical operational details like idempotency, error conditions, authentication requirements, or billing implications.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-organized with clear section headers (WHEN TO USE, RETURNS, EXAMPLE). Information is front-loaded with the core purpose first. The JSON example is lengthy but justified given the 0% schema coverage. Minor redundancy exists between the opening sentence and RETURNS section.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given high complexity (8 params, nested objects) and absence of output schema, the description adequately compensates by documenting return fields and providing a concrete input example. However, it falls short of fully documenting all input parameters or error handling scenarios.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. The JSON example effectively illustrates the semantic structure of core parameters (buy_spec, creative) showing expected field names and formats. However, it fails to document several parameters (brand, packages, start_time/end_time) leaving significant gaps.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states it creates a 'media buy (campaign)' targeting 'DOOH screens' using an 'AdCP buy specification.' The [AdCP Media Buy] prefix and DOOH focus clearly distinguish it from the generic sibling create_campaign.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Includes a dedicated 'WHEN TO USE' section with three specific scenarios (programmatic DOOH buys, DSP trading desk agents, automated workflows). However, it lacks explicit 'when not to use' guidance or named alternatives like create_campaign.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_webhook
Create a new webhook subscription for real-time events.
WHEN TO USE:
Setting up real-time notifications for device events
Integrating with external systems
Monitoring ad playback and impressions
AVAILABLE EVENTS:
device.online: When a device comes online
device.offline: When a device goes offline
impression.recorded: When an impression is logged
campaign.allocated: When a campaign is allocated to a device
payout.processed: When a payout is processed
programmatic.ad_started: When a programmatic ad begins playing
programmatic.ad_ended: When a programmatic ad finishes playing
programmatic.no_fill: When a programmatic ad request gets no fill
programmatic.error: When a programmatic ad request errors
RETURNS:
webhook_id: Unique webhook identifier
url: The webhook endpoint URL
events: Subscribed events
secret: HMAC signing secret (if provided)
status: enabled/disabled
EXAMPLE: User: "Set up a webhook for device status changes" create_webhook({ url: "https://api.mycompany.com/trillboards/webhooks", events: ["device.online", "device.offline", "programmatic.error"], secret: "my-signing-secret-123" })
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | | |
| events | Yes | | |
| secret | No | | |
| description | No | | |
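The RETURNS section mentions an HMAC signing secret but not how to verify deliveries. A sketch of the receiving side, assuming HMAC-SHA256 over the raw body and a hypothetical signature header; neither the algorithm nor the header name appears in this listing.

```typescript
// Verify an incoming webhook with the HMAC signing secret. Both the
// HMAC-SHA256 algorithm and the signature-header convention are
// assumptions; the listing only says a signing secret exists.
import { createHmac, timingSafeEqual } from "node:crypto";

function verifyWebhook(rawBody: string, signatureHeader: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(signatureHeader);
  // Length check first: timingSafeEqual throws on unequal-length buffers.
  return a.length === b.length && timingSafeEqual(a, b);
}
```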
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Compensates well by documenting all nine available event triggers with human-readable conditions, and discloses return values (webhook_id, secret, status). Missing explicit retry policies or permission requirements, but event catalog and return structure are thoroughly covered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear section headers (WHEN TO USE, AVAILABLE EVENTS, RETURNS, EXAMPLE). Front-loaded one-liner summarizes purpose. Length is justified given zero schema documentation—every section adds necessary context that structured fields omit.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a 4-parameter mutation tool with no output schema. Covers critical path (event types, required params via example, return structure). Minor gaps on rate limits and full optional parameter documentation, but sufficient for correct invocation given the complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage. Description compensates effectively: AVAILABLE EVENTS explains the 'events' array enum values; EXAMPLE demonstrates 'url' format and 'secret' usage for HMAC signing. Only gap is the optional 'description' parameter which is undocumented.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb 'Create' + resource 'webhook subscription' + context 'real-time events.' The action clearly distinguishes from sibling tools like list_webhooks, delete_webhook, and test_webhook through the creation semantics.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains explicit 'WHEN TO USE' section with three concrete scenarios (real-time notifications, external integrations, monitoring ad playback). Lacks explicit 'when-not-to-use' or direct sibling comparisons, but the contextual scenarios provide strong implicit guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
cross_signal_correlate
Discover correlations between different signal types. Example: relationship between ad fill rate and audience attention for QSR venues.
Queries the cross_signal_insights table for pre-computed correlations, or computes ad-hoc correlations from the observation_stream when no pre-computed insight exists.
WHEN TO USE:
Understanding relationships between different sensing signals
Finding which audience behaviors correlate with business outcomes
Discovering hidden patterns (e.g., crowd_energy vs purchase_intent)
Validating hypotheses about audience-venue-time relationships
RETURNS:
data: Correlation analysis with:
signal_a, signal_b: The two signals being correlated
correlation_r: Pearson correlation coefficient (-1 to +1)
correlation_r2: R-squared (proportion of variance explained)
p_value: Statistical significance
sample_count: Number of data points used
effect_size: Cohen's d effect size
confidence_interval_lower, confidence_interval_upper: 95% CI bounds
insight_summary: Human-readable interpretation
metadata: { computation_method, window, filters_applied }
suggested_next_queries: Related correlation analyses to explore
EXAMPLE: User: "Is there a correlation between audience attention and ad fill rate at QSR venues?" cross_signal_correlate({ signal_a: "attention_score", signal_b: "ad_fill_rate", filters: { venue_type: "restaurant_qsr" } })
User: "How does crowd energy relate to purchase intent during lunch hours?" cross_signal_correlate({ signal_a: "crowd_energy", signal_b: "purchase_intent", filters: { daypart: "lunch" } })
| Name | Required | Description | Default |
|---|---|---|---|
| filters | No | | |
| signal_a | Yes | | |
| signal_b | Yes | | |
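The return contract names a Pearson coefficient and R². For reference, the textbook computation over two aligned samples; this mirrors the correlation_r and correlation_r2 fields but is not the server's implementation.

```typescript
// Pearson correlation and R-squared over two aligned samples, mirroring
// the correlation_r / correlation_r2 fields in the return contract.
// Textbook formula only; not the server's computation path.
function pearson(xs: number[], ys: number[]): { r: number; r2: number; sampleCount: number } {
  const n = Math.min(xs.length, ys.length);
  const meanX = xs.slice(0, n).reduce((a, b) => a + b, 0) / n;
  const meanY = ys.slice(0, n).reduce((a, b) => a + b, 0) / n;
  let cov = 0, varX = 0, varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  const r = cov / Math.sqrt(varX * varY); // NaN if either signal is constant
  return { r, r2: r * r, sampleCount: n };
}
```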
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses critical dual-path behavior (pre-computed insights vs ad-hoc computation from observation_stream) and extensively documents return structure (statistical measures, confidence intervals, metadata). However, with no annotations provided, it omits safety classification (read-only status), computational cost warnings, or rate limiting behavior for the ad-hoc computation path.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections (purpose, mechanism, when to use, returns, examples). Front-loaded with specific purpose statement. Lengthy but justified given lack of output schema and 0% parameter coverage. RETURNS section is verbose but necessary substitute for missing output schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given high complexity (statistical correlation analysis), 0% schema coverage, and absence of output schema, the description achieves high completeness. It documents the complex return structure extensively (correlation metrics, confidence intervals, interpretation), provides two full usage examples, and explains the underlying data source logic.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring description compensation. Examples demonstrate signal_a/signal_b are signal identifiers (e.g., 'attention_score') and show filters usage (venue_type, daypart), but nested filter properties (time_range format, dma_code semantics) remain undocumented. Examples partially compensate but don't fully close the documentation gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent specificity with clear verb ('Discover correlations'), resource ('different signal types'), and mechanism (pre-computed table vs ad-hoc observation_stream computation). Opening example immediately clarifies scope with concrete use case (ad fill rate vs attention at QSR venues).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section lists four specific scenarios (understanding relationships, finding behavior-outcome correlations, discovering hidden patterns, validating hypotheses). Lacks explicit 'when not to use' or direct sibling comparisons (e.g., vs query_observations for raw data), but examples effectively demonstrate appropriate use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_device
Soft-delete a device from the partner account.
WHEN TO USE:
Removing a device that's been decommissioned
Cleaning up test devices
Removing a device that's been relocated to another partner
RETURNS:
success: Boolean indicating success
device_id: The deleted device ID
message: Confirmation message
EXAMPLE: User: "Remove the old lobby kiosk" delete_device({ device_id: "lobby-kiosk-old" })
| Name | Required | Description | Default |
|---|---|---|---|
| device_id | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full disclosure burden. It effectively communicates the 'soft-delete' nature (critical for a destructive-looking operation) and documents return values (success, device_id, message) despite no structured output schema. Missing side effects on associated campaigns or recovery procedures.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear sections (main description, WHEN TO USE, RETURNS, EXAMPLE). Information is front-loaded with the critical 'soft-delete' behavior stated immediately. No redundant or wasted text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter mutation tool, the description is comprehensive: it explains the soft-delete behavior, usage contexts, and return structure. Could be improved by mentioning error scenarios (e.g., device not found) or side effects on active campaigns associated with the device.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% with no parameter descriptions. The description compensates partially through the EXAMPLE section showing device_id: 'lobby-kiosk-old', which implies the ID can be a human-readable string. However, it lacks explicit parameter documentation (e.g., whether it's a UUID, serial number, or friendly name).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with 'Soft-delete a device from the partner account,' providing a specific verb (soft-delete), resource (device), and scope (partner account). It clearly distinguishes from siblings like get_device, list_devices, and register_device through the specific soft-delete semantics.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Includes an explicit 'WHEN TO USE' section with three specific scenarios (decommissioned devices, test cleanup, relocation). However, it lacks explicit 'when not to use' guidance or alternatives (e.g., suggesting a 'deactivate' sibling tool for temporary removal instead of deletion).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_webhook
Delete a webhook subscription.
WHEN TO USE:
Removing a webhook that's no longer needed
Cleaning up old integrations
Removing test webhooks
RETURNS:
success: Boolean indicating success
webhook_id: The deleted webhook ID
message: Confirmation message
EXAMPLE: User: "Delete the old webhook" delete_webhook({ webhook_id: "wh_mmmpdbvj_8b7c5a59296d" })
| Name | Required | Description | Default |
|---|---|---|---|
| webhook_id | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Documents return values (success boolean, webhook_id, message) which is helpful given no output schema exists. However, fails to disclose critical behavioral traits for a destructive operation: does not warn that deletion is permanent, explain what happens to pending deliveries, or mention required permissions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear section headers (WHEN TO USE, RETURNS, EXAMPLE). Each section earns its place - RETURNS is necessary given no output schema exists. Slightly verbose but appropriately so for a tool with zero schema documentation.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter destructive operation with no annotations and no output schema, the description adequately covers purpose, usage scenarios, return structure, and usage example. Would be improved with a note about operation permanence, but otherwise complete for the complexity level.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% (no description field for webhook_id parameter). Description provides an EXAMPLE section showing the expected format 'wh_mmmpdbvj_8b7c5a59296d', which adds value beyond the raw pattern regex. However, lacks explicit parameter definition explaining that webhook_id is the unique identifier of the webhook to be deleted.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with the specific action 'Delete a webhook subscription' - clear verb + resource combination that immediately distinguishes it from sibling tools like create_webhook, update_webhook, or list_webhooks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Includes explicit 'WHEN TO USE' section with three concrete scenarios (removing unneeded webhooks, cleaning up old integrations, removing test webhooks). Lacks explicit 'when not to use' guidance or alternative suggestions (e.g., when to prefer update_webhook over deletion).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
describe_endpoint
Describe a single API operation including its parameters, response shape, and error codes.
WHEN TO USE:
Inspecting an endpoint's full contract before calling it.
Discovering which error codes an endpoint can return and how to recover.
RETURNS:
operation: Full discovery record for the endpoint.
parameters: Raw OpenAPI parameter definitions.
request_body: Body schema (when applicable).
responses: Map of status code → description/schema.
linked_error_codes: Error catalog entries the endpoint can emit.
EXAMPLE: Agent: "How do I call the screen audience endpoint?" describe_endpoint({ path: "/v1/data/screens/{screenId}/audience", method: "GET" })
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | | |
| method | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It clearly explains what information the tool returns (operation details, parameters, request_body, responses, linked_error_codes) and provides a concrete example of usage. However, it doesn't mention potential limitations like rate limits, authentication requirements, or error handling for invalid inputs.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with clear sections (purpose, usage guidelines, returns, example) and every sentence adds value. It's appropriately sized for a tool with two parameters and no annotations, with no redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 2 parameters, 0% schema coverage, no annotations, and no output schema, the description provides complete context. It explains what the tool does, when to use it, what it returns, and includes a concrete example showing parameter usage. This is sufficient for an agent to understand and invoke the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage for the two parameters (path and method), the description compensates fully by providing an example that shows exactly how to use these parameters: path as the endpoint path string and method as the HTTP method. The example demonstrates the expected format and usage context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('describe a single API operation') and resources ('parameters, response shape, and error codes'). It distinguishes itself from sibling tools like 'list_endpoints' by focusing on detailed inspection of individual endpoints rather than listing them.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description includes an explicit 'WHEN TO USE' section with two specific scenarios ('inspecting an endpoint's full contract before calling it' and 'discovering which error codes an endpoint can return and how to recover'), providing clear guidance on when this tool should be selected over alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
device_heartbeat
Send a heartbeat signal from a device to report its status.
WHEN TO USE:
Regular device health monitoring (every 30-60 seconds)
Reporting current playback status
Reporting errors or issues
RETURNS:
success: Boolean indicating success
device_status: Current device status in system
next_heartbeat_seconds: Recommended interval for next heartbeat
EXAMPLE: User: "Send heartbeat for device P_abc123" device_heartbeat({ fingerprint: "P_abc123", status: "playing", current_ad_id: "507f1f77bcf86cd799439011", uptime_seconds: 3600 })
| Name | Required | Description | Default |
|---|---|---|---|
| status | No | | |
| fingerprint | Yes | | |
| current_ad_id | No | | |
| error_message | No | | |
| uptime_seconds | No | | |
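The response's next_heartbeat_seconds lets a device follow the server's recommended cadence rather than a fixed timer. A sketch of such a loop; `callTool` is a stand-in for an MCP client call, and the response shape follows the RETURNS section.

```typescript
// Heartbeat loop honoring the server-recommended interval returned in
// next_heartbeat_seconds, starting from the 30s lower bound the WHEN TO USE
// section suggests. `callTool` stands in for an MCP client call.
async function heartbeatLoop(
  callTool: (name: string, args: Record<string, unknown>) => Promise<{ next_heartbeat_seconds?: number }>,
  fingerprint: string,
) {
  let intervalSeconds = 30; // lower bound of the suggested 30-60s cadence
  for (;;) {
    const res = await callTool("device_heartbeat", {
      fingerprint,
      status: "playing",
      uptime_seconds: Math.floor(process.uptime()),
    });
    // Prefer the server's recommendation when it provides one.
    intervalSeconds = res.next_heartbeat_seconds ?? intervalSeconds;
    await new Promise((resolve) => setTimeout(resolve, intervalSeconds * 1000));
  }
}
```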
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. It successfully documents return values (success, device_status, next_heartbeat_seconds) and timing recommendations in the RETURNS section. However, it lacks disclosure of idempotency, rate limiting, failure handling, or side effects on server state beyond the basic success indicator.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clearly demarcated sections (main description, WHEN TO USE, RETURNS, EXAMPLE). Every sentence serves a distinct purpose. The front-loaded purpose statement followed by specific usage contexts creates an efficient information hierarchy appropriate for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple flat schema (5 parameters, no nesting) and absence of structured output schema, the description compensates well by documenting return values in the RETURNS section and providing a concrete invocation example. Completeness would be perfect with explicit parameter descriptions, but the combination of example and usage guidelines provides sufficient context for invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. The EXAMPLE section provides implicit semantic context (fingerprint as device identifier, status values like 'playing', uptime_seconds format), but lacks explicit parameter definitions. It adequately compensates for basic understanding but doesn't fully document constraints or meanings for all five parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The opening sentence uses a specific verb ('Send') with clear resource ('heartbeat signal') and purpose ('report its status'). It effectively distinguishes from sibling CRUD operations like register_device, get_device, and delete_device by emphasizing the ongoing health-monitoring aspect versus lifecycle management.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The explicit 'WHEN TO USE' section provides precise guidance including frequency (every 30-60 seconds) and specific scenarios (health monitoring, playback reporting, error reporting). This clearly defines invocation timing and distinguishes from alternatives like register_device (initial setup) or get_device (retrieval).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
discover_inventory
Discover available DOOH screens in the Trillboards network.
WHEN TO USE:
Finding screens by venue type (retail, transit, office, etc.)
Finding screens in a specific city/state or within a radius
Finding screens with a specific audience profile (high income, professionals, etc.)
Getting an overview of available inventory with live audience data
RETURNS:
screens: Array of screen objects with location, venue type, online status, and live audience data
total: Total matching screens
online_count: Number of currently online screens
Each screen includes real-time audience data when available:
face_count, attention_score, income_level, mood, lifestyle
purchase_intent, crowd_density, ad_receptivity, dwell_time
EXAMPLE: User: "Find retail screens in New York with high-income audience" discover_inventory({ venue_types: ["retail"], location: { city: "New York", state: "NY" }, audience_profile: { income: "high" }, limit: 20 })
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| location | No | | |
| venue_types | No | | |
| audience_profile | No | | |
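As a sketch of how an agent might consume the documented return shape (screens, total, online_count), again using a hypothetical callTool helper; the conditional reflects the note that real-time audience fields appear only 'when available':

```typescript
// Hypothetical MCP client helper.
declare function callTool(name: string, args: Record<string, unknown>): Promise<any>;

async function findHighIncomeRetailScreens(): Promise<void> {
  const res = await callTool("discover_inventory", {
    venue_types: ["retail"],
    location: { city: "New York", state: "NY" },
    audience_profile: { income: "high" },
    limit: 20,
  });
  console.log(`${res.online_count}/${res.total} matching screens are online`);
  for (const screen of res.screens) {
    // Real-time audience data is only present "when available".
    if (screen.attention_score !== undefined) {
      console.log(screen.venue_type, screen.attention_score, screen.income_level);
    }
  }
}
```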
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. The detailed 'RETURNS' section discloses the output structure (screens array with location/venue/audience fields, total counts, online status) and explains real-time data availability ('when available'). It lacks error-handling and rate-limit disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections (purpose, WHEN TO USE, RETURNS, EXAMPLE). Front-loaded with specific purpose statement. Length is appropriate given complexity and lack of schema annotations. Example is concrete and illustrative without being verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive coverage for a 4-parameter discovery tool with nested objects. Compensates for 0% schema coverage and missing output schema by documenting return values and semantics in the description. Provides sufficient context for an agent to construct valid invocations and understand results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage, but description compensates via 'WHEN TO USE' (explaining venue_types values like retail/transit, location options like city/state/radius) and concrete 'EXAMPLE' showing exact syntax for audience_profile, location, and venue_types parameters. Covers primary use cases effectively despite not documenting every nested field (e.g., lat/lng).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb+resource: 'Discover available DOOH screens in the Trillboards network.' Clearly distinguishes from sibling 'list_devices' (which likely lists registered/owned devices) by specifying this queries available inventory in the broader network for media buying purposes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains explicit 'WHEN TO USE' section with four specific scenarios (venue type filtering, geographic filtering, audience profiling, inventory overview). Lacks explicit 'when not to use' or named alternatives (e.g., distinguishing from 'list_devices'), but the contextual guidance is strong.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
export_cohort
Export exposed audience cohort to a DSP for retargeting.
Pushes MAID hashes from the campaign's exposed cohort to the specified DSP (The Trade Desk, DV360, or Meta). Creates or reuses a DSP segment.
WHEN TO USE:
Activating DOOH-exposed audiences for retargeting on digital channels
Pushing cohorts to TTD, DV360, or Meta Custom Audiences
Measuring cross-channel retargeting lift
RETURNS:
status: 'synced', 'no_cohort', 'credentials_missing', or 'empty_cohort'
destination: the DSP name
segmentId: internal segment ID
externalSegmentId: DSP-side segment ID
maidCount: number of MAIDs uploaded
accepted: number accepted by DSP
Supported destinations: ttd, dv360, meta, cadent, mediaocean
EXAMPLE: export_cohort({ campaign_id: "camp_abc123", destination: "ttd" })
| Name | Required | Description | Default |
|---|---|---|---|
| campaign_id | Yes | | |
| destination | Yes | | |
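Because the status codes are enumerated, an agent can branch on them defensively. A minimal sketch (hypothetical callTool helper; field names taken from the RETURNS section):

```typescript
// Hypothetical MCP client helper.
declare function callTool(name: string, args: Record<string, unknown>): Promise<any>;

async function pushCohortToTradeDesk(campaignId: string): Promise<void> {
  const res = await callTool("export_cohort", {
    campaign_id: campaignId,
    destination: "ttd",
  });
  switch (res.status) {
    case "synced":
      // accepted may be lower than maidCount if the DSP rejects some hashes.
      console.log(`${res.accepted}/${res.maidCount} MAIDs accepted into segment ${res.externalSegmentId}`);
      break;
    case "credentials_missing":
      console.warn("DSP credentials must be configured before exporting");
      break;
    case "no_cohort":
    case "empty_cohort":
      console.warn("No exposed cohort available for this campaign yet");
      break;
  }
}
```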
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden and discloses: (1) data type transferred ('MAID hashes'), (2) idempotency ('Creates or reuses'), (3) detailed return status codes ('synced', 'credentials_missing', etc.), and (4) specific output fields with semantics (maidCount vs accepted). Lacks rate limiting or retry behavior details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear visual hierarchy (main description → WHEN TO USE → RETURNS → EXAMPLE). Every section earns its place; the RETURNS documentation is particularly valuable given the lack of output schema. No redundant or filler text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive coverage for a 2-parameter integration tool. The description documents input parameters via example, explains behavioral side effects (DSP segment creation), details return values (status codes and counts), and lists all supported DSP destinations, fully compensating for missing annotations and schema descriptions.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage, but the description compensates effectively: the 'EXAMPLE' block demonstrates parameter usage, 'Supported destinations' documents the enum values, and contextual sentences ('campaign's exposed cohort', 'specified DSP') imply the semantics of both required parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a precise action ('Export'), resource ('exposed audience cohort'), and purpose ('for retargeting'). It clearly distinguishes from sibling 'export_dataset' by specifying this is for DOOH-exposed audience activation rather than general data export.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The 'WHEN TO USE:' section explicitly lists three specific scenarios: activating DOOH-exposed audiences, pushing to specific DSPs (TTD/DV360/Meta), and measuring cross-channel lift. This provides clear selection criteria against alternatives like 'get_campaign_performance' or 'export_dataset'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
export_dataset
Export observation data as a structured dataset. Supports filtering by time, geography, venue type, and observation family. Applies k-anonymity (k=5) to protect individual privacy.
Queries the relevant table based on the selected dataset type, applies filters, enforces k-anonymity by suppressing groups with fewer than 5 observations, and returns structured data.
WHEN TO USE:
Exporting audience data for external analysis
Building datasets for machine learning or reporting
Getting structured vehicle or commerce data for a specific time/place
Creating cross-signal datasets for correlation analysis
RETURNS:
data: Array of dataset rows (schema varies by dataset type)
metadata: { row_count, k_anonymity_applied, export_id, dataset, filters_applied, time_range }
suggested_next_queries: Related exports or analyses
Dataset types:
observations: Raw observation stream data (all families)
audience: Audience-specific data (face_count, demographics, attention, emotion)
vehicle: Vehicle counting and classification data
cross_signal: Pre-computed cross-signal correlation insights
EXAMPLE: User: "Export audience data from retail venues last week" export_dataset({ dataset: "audience", filters: { time_range: { start: "2026-03-09", end: "2026-03-16" }, venue_type: ["retail"] }, format: "json" })
User: "Get vehicle data near geohash 9q8yy" export_dataset({ dataset: "vehicle", filters: { time_range: { start: "2026-03-15", end: "2026-03-16" }, geo: "9q8yy" } })
| Name | Required | Description | Default |
|---|---|---|---|
| format | No | | |
| dataset | Yes | | |
| filters | Yes | | |
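The k-anonymity rule described above (suppress any group with fewer than 5 observations) is easy to illustrate in isolation. The sketch below is not the server's implementation, just the same rule applied to a local array:

```typescript
// Illustrative only: the k=5 suppression rule, applied to local rows.
function kAnonymize<T>(rows: T[], groupKey: (row: T) => string, k = 5): T[] {
  const counts = new Map<string, number>();
  for (const row of rows) {
    const key = groupKey(row);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  // Rows in groups smaller than k are suppressed entirely.
  return rows.filter((row) => (counts.get(groupKey(row)) ?? 0) >= k);
}
```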
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full disclosure burden. It explicitly details the k-anonymity algorithm (k=5, suppressing groups with fewer than 5 observations) and privacy protection behavior. It also describes the return structure (data array, metadata object, suggested_next_queries), which compensates for the missing output schema. It lacks rate limits or timeout characteristics, preventing a perfect score.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-organized with clear section headers (WHEN TO USE, RETURNS, EXAMPLE) that front-load critical information. Length is justified by the complexity (4 dataset types, privacy algorithm, nested filters) and absence of output schema. Each section earns its place, though the RETURNS section is verbose due to lack of structured output schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given high complexity (k-anonymity, multiple dataset types, nested filter objects), zero schema descriptions, no annotations, and no output schema, the description achieves comprehensive coverage. It explains return values, parameter semantics via examples, dataset type distinctions, and privacy behavior—sufficient for correct agent invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring heavy description compensation. The description maps filter parameters to semantic concepts (time, geography, venue type, observation family) and provides extensive enum value descriptions for the dataset parameter (4 types with explanations). Examples demonstrate nested structure usage, though explicit documentation of filter object internals (e.g., time_range having start/end) is implied rather than stated.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description opens with specific verb+resource ('Export observation data as a structured dataset') and immediately distinguishes scope with filtering capabilities and k-anonymity features. The dataset type enumeration (observations, audience, vehicle, cross_signal) clearly differentiates this from sibling tools like query_observations (simple queries) and export_cohort (cohort-specific exports).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains explicit 'WHEN TO USE' section with four concrete scenarios (external analysis, ML/reporting, vehicle/commerce data, correlation analysis) that clearly delineate appropriate contexts. These scenarios help agents distinguish between this export tool and analytics/query siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
find_similar_moments
Find historically similar audience moments across the screen network using embedding similarity search. Input a natural-language description of the target moment.
Moment embeddings are 768-D vectors generated from multi-modal observation data (visual, audio, environmental, social) via the MomentEmbeddingService. This tool embeds your query text and finds the closest real-world moments via pgvector cosine similarity.
WHEN TO USE:
Searching for historical moments similar to a target scenario
Finding "moments like this one" across different venues/times
Discovering when similar audience compositions or behaviors occurred
Planning ad placements based on past similar contexts
RETURNS:
data: Array of matching observations with similarity scores
observation_id, observed_at, venue_type, device_id, screen_mongo_id
payload: full observation data
evidence_grade: quality of observation
similarity: cosine similarity score (0-1, higher = more similar)
metadata: { result_count, embedding_model, min_similarity_threshold }
suggested_next_queries: Follow-up queries
EXAMPLE: User: "Find moments with high engagement in evening restaurants with families" find_similar_moments({ query: "evening restaurant venue with families present, high emotional engagement and attention" })
User: "When did we see young adults highly engaged at transit screens?" find_similar_moments({ query: "transit venue morning commute young adults high attention" })
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| query | Yes | | |
| venue_type | No | | |
| min_similarity | No | | |
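For intuition about the similarity field in the results, this is cosine similarity as pgvector computes it, shown here over plain arrays; a score near 1 means near-identical moment embeddings:

```typescript
// Cosine similarity between two embedding vectors (e.g., 768-D moment vectors).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```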
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses the technical implementation (768-D vectors, multi-modal observation data, pgvector cosine similarity), the return structure (similarity scores 0-1, evidence_grade), and the data source (MomentEmbeddingService). It is missing operational constraints (rate limits, latency), but offers strong algorithmic transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear headers (WHEN TO USE, RETURNS, EXAMPLE) and a front-loaded purpose statement. Technical details (768-D vectors, pgvector) provide necessary context but add length. No wasted sentences, though it could condense the embedding-service implementation details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a complex search tool lacking annotations/output schema. Documents return structure (data array with observation_id, similarity, payload), metadata fields, and follow-up suggestions. Covers multi-modal data sources and similarity methodology. Missing only operational/constraint details given the complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage for 4 parameters, requiring heavy description compensation. While 'query' is well-explained ('natural-language description of the target moment') and demonstrated in examples, the other three parameters (limit, venue_type, min_similarity) are completely undocumented in the description text. Only partial compensation achieved.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb ('Find') + resource ('audience moments') + mechanism ('embedding similarity search') and scope ('historically similar... across the screen network'). It clearly distinguishes from siblings like 'semantic_search_observations' by emphasizing multi-modal embeddings and historical moment retrieval versus general content search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Includes explicit 'WHEN TO USE' section with four concrete scenarios (searching historical moments, finding 'moments like this one', discovering similar compositions, planning ad placements). Two detailed examples show query syntax. Lacks explicit 'when not to use' or named sibling alternatives (e.g., when to prefer 'query_observations'), hence not a 5.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_adcp_capabilities
[AdCP] Get Trillboards AdCP capabilities and supported protocols.
Returns the full capability declaration for Trillboards as an AdCP DOOH seller agent. This tool does NOT require authentication.
WHEN TO USE:
Discovering what protocols Trillboards supports (Signals, Media Buy)
Understanding available audience signals and data methodology
Getting MCP endpoint and discovery URLs
RETURNS:
supported_protocols: ['signals', 'media_buy']
inventory: DOOH format details, network size
audience_data: signal list, methodology, refresh rate
pricing: model, currency, floor CPM
discovery: well_known_url, mcp_endpoint
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
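Since the call takes no parameters and requires no authentication, it works as a cheap preflight before other AdCP tools. A sketch, using the same hypothetical callTool helper:

```typescript
// Hypothetical MCP client helper.
declare function callTool(name: string, args?: Record<string, unknown>): Promise<any>;

async function supportsMediaBuy(): Promise<boolean> {
  const caps = await callTool("get_adcp_capabilities");
  // supported_protocols is documented as, e.g., ['signals', 'media_buy'].
  return caps.supported_protocols.includes("media_buy");
}
```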
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It explicitly states 'does NOT require authentication,' which is critical safety info, and compensates for the missing output_schema by detailing the return structure (supported_protocols, inventory, audience_data, pricing, discovery). It does not mention rate limits or caching behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with [AdCP] tag, concise purpose statement, authentication note, and clearly delineated WHEN TO USE and RETURNS sections. Every sentence earns its place; no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema exists, the detailed RETURNS section documenting the five top-level response fields provides essential completeness. Auth note covers security context. Missing only operational details like rate limits or cache headers, but fully adequate for a capabilities discovery tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has zero parameters. Per the rubric, zero parameters establishes a baseline of 4. The description appropriately does not invent parameter semantics where none exist.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool retrieves 'Trillboards AdCP capabilities and supported protocols' and positions it as a 'capability declaration for Trillboards as an AdCP DOOH seller agent.' This effectively distinguishes it from 50+ operational siblings (create_campaign, update_media_buy, etc.) by establishing it as a meta/discovery tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains explicit 'WHEN TO USE' section with three specific scenarios: discovering protocols, understanding audience signals, and getting MCP endpoints. Lacks explicit 'when not to use' or named alternatives, though as a discovery endpoint it serves a unique prerequisite role to other AdCP tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_analytics
Get analytics data for the partner account.
WHEN TO USE:
Viewing overall performance metrics
Analyzing device performance
Generating reports on impressions and earnings
Comparing performance over time periods
RETURNS:
summary: Overall stats (impressions, earnings, active_devices)
time_series: Data points over time
top_devices: Best performing devices
breakdown: Data grouped by requested dimension
EXAMPLE: User: "Show me last week's analytics by device" get_analytics({ start_date: "2026-01-01", end_date: "2026-01-07", group_by: "device" })
User: "Get monthly performance breakdown" get_analytics({ start_date: "2025-12-01", end_date: "2025-12-31", group_by: "day" })
| Name | Required | Description | Default |
|---|---|---|---|
| end_date | No | | |
| group_by | No | | |
| device_id | No | | |
| start_date | No | | |
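Because the default behavior for omitted dates is undocumented (see the completeness note below), a cautious agent can always pass an explicit window. A sketch computing a last-7-days range in the YYYY-MM-DD format shown in the examples:

```typescript
// Hypothetical MCP client helper.
declare function callTool(name: string, args: Record<string, unknown>): Promise<any>;

async function lastWeekByDevice(): Promise<any> {
  const end = new Date();
  const start = new Date(end.getTime() - 7 * 24 * 60 * 60 * 1000);
  const fmt = (d: Date) => d.toISOString().slice(0, 10); // YYYY-MM-DD
  return callTool("get_analytics", {
    start_date: fmt(start),
    end_date: fmt(end),
    group_by: "device",
  });
}
```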
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full behavioral disclosure burden. The 'RETURNS' section effectively documents the response structure (summary, time_series, top_devices, breakdown) and data types (impressions, earnings, active_devices), compensating for the missing output schema. It does not mention rate limits or caching behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clearly delineated sections (description, WHEN TO USE, RETURNS, EXAMPLE). Information is front-loaded with the core purpose. The examples are concrete and relevant. No wasted text despite the length.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Missing crucial clarification on parameter requirements: the schema indicates 0 required parameters, but the examples imply start_date and end_date are necessary (or at least commonly used). Should specify default behavior when dates are omitted. The RETURNS section adequately compensates for the lack of output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description relies on the 'EXAMPLE' section to convey parameter semantics. The examples demonstrate date formats (YYYY-MM-DD) and group_by values ('day', 'device'), but omit documentation for device_id purpose and the remaining enum values ('week', 'month'). Partial compensation for schema gaps.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it retrieves 'analytics data for the partner account,' specifying the scope (account-level vs. entity-specific). This distinguishes it from siblings like get_campaign_performance or get_content_performance. However, it could more explicitly contrast with these specific analytic tools to prevent confusion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The 'WHEN TO USE' section provides four specific scenarios (viewing performance metrics, analyzing devices, generating reports, comparing time periods). This gives clear activation signals. Lacks explicit 'when NOT to use' guidance or named alternatives for campaign-specific analytics.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_attention_metrics
Get edge AI attention metrics for a campaign (FEIN-powered).
This is what makes DOOH attribution better than digital: Trillboards MEASURES viewability via FEIN edge AI instead of estimating it.
WHEN TO USE:
Measuring actual human attention to ads (not just impressions)
Comparing attention-adjusted CPM (aCPM) vs standard CPM
Getting face count, dwell time, and emotion engagement data
RETURNS:
impressions: total, uniqueDevices
attention: avgScore (0-1), medianScore, p90Score, avgDwellSeconds, avgFaceCount, qualifiedPct
economics: standardCpm, attentionCpm (aCPM), costPerAttentiveReach
emotion: avgEngagement (0-1), positiveEmotionPct
aCPM = total_media_cost / (SUM(attention_score * face_count) / 1000)
| Name | Required | Description | Default |
|---|---|---|---|
| campaign_id | Yes | | |
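The aCPM formula in the description transcribes directly to code; nothing below is added beyond the formula itself:

```typescript
// aCPM = total_media_cost / (SUM(attention_score * face_count) / 1000)
function attentionCpm(
  totalMediaCost: number,
  observations: { attentionScore: number; faceCount: number }[],
): number {
  const attentiveViewsPerThousand =
    observations.reduce((sum, o) => sum + o.attentionScore * o.faceCount, 0) / 1000;
  return totalMediaCost / attentiveViewsPerThousand;
}
```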
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It adds substantial value by explaining FEIN technology (measurement vs. estimation), providing the aCPM calculation formula, and detailing the return structure across four categories (impressions, attention, economics, emotion). It does not cover operational traits like rate limits or caching.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections: one-line summary, value proposition, WHEN TO USE bullets, RETURNS bullets, and formula. Despite length, every sentence adds distinct value (technology context, use cases, return schema, calculation methodology).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite lacking an output schema, the description comprehensively documents all return fields (impressions, attention metrics, economics, emotion) with their types/ranges. It explains the FEIN technology context and provides the aCPM formula, making it complete for this specialized tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage for the single 'campaign_id' parameter. The description states metrics are retrieved 'for a campaign', implying the parameter purpose, but does not explicitly name or document 'campaign_id'. With only one required parameter, this is minimally adequate but does not fully compensate for the schema gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb ('Get') + resource ('edge AI attention metrics') + scope ('for a campaign'), and explicitly distinguishes from sibling analytics tools by specifying 'FEIN-powered' and contrasting DOOH measurement with digital estimation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides an explicit 'WHEN TO USE' section listing three specific scenarios (measuring human attention, comparing aCPM, getting face/emotion data). Lacks explicit 'when-not-to-use' or named sibling alternatives (e.g., get_creative_attention), but the context makes the scope clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_attribution_timeseries
Get daily attribution timeseries for a campaign.
WHEN TO USE:
Tracking attribution trends over time
Identifying which days had the strongest lift
Building attribution dashboards with daily granularity
RETURNS: Array of daily data points, each with:
date, uniqueDevices, totalExposures, avgFrequency
exposedVisitors, controlVisitors, liftPct, incrementalVisits
costPerVisit, totalMediaCost, isSignificant
EXAMPLE: get_attribution_timeseries({ campaign_id: "camp_abc123", start_date: "2026-03-01", end_date: "2026-03-10" })
| Name | Required | Description | Default |
|---|---|---|---|
| end_date | No | | |
| start_date | No | | |
| campaign_id | Yes | | |
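The per-day lift fields follow standard exposed-vs-control arithmetic. The sketch below assumes conventional definitions; the server's exact formulas are not documented, so treat it as an interpretation aid rather than a specification:

```typescript
// Conventional lift arithmetic; assumed, not confirmed by the tool's docs.
function liftPct(exposedVisitRate: number, controlVisitRate: number): number {
  return ((exposedVisitRate - controlVisitRate) / controlVisitRate) * 100;
}

function costPerVisit(totalMediaCost: number, incrementalVisits: number): number {
  return totalMediaCost / incrementalVisits;
}
```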
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, but the description compensates with a detailed RETURNS section documenting all output fields (date, liftPct, incrementalVisits, etc.) and a complete usage EXAMPLE. It does not mention rate limits or caching behavior, but it clearly establishes the read-only data retrieval pattern.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear section headers (WHEN TO USE, RETURNS, EXAMPLE). Front-loaded purpose statement. No wasted words; every sentence provides actionable guidance or structural information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema exists, the description comprehensively documents return values. With 3 parameters and zero schema descriptions, the example provides minimal viable context. Does not mention pagination or date range limits, but sufficient for tool selection and basic invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage. The description compensates partially via the EXAMPLE block, which shows parameter formats (e.g., '2026-03-01' for dates, 'camp_abc123' for campaign_id), but it lacks explicit semantic definitions of what each parameter represents and does not note that start_date/end_date are optional.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action (Get), resource (daily attribution timeseries), and scope (for a campaign). However, it does not explicitly differentiate from siblings like get_campaign_attribution or get_multi_touch_attribution, leaving the agent to infer the distinction from 'daily' and 'timeseries' keywords.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains explicit 'WHEN TO USE' section with three concrete scenarios (tracking trends, identifying lift days, building dashboards). Lacks explicit exclusions or alternative tool recommendations for non-timeseries attribution needs.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_audience_forecast
Predict what the audience will look like at a screen at a specific time.
WHEN TO USE:
Planning campaigns for specific time slots
Estimating audience composition before buying
Comparing audience at different times of day
Uses historical audience data to predict typical audience patterns.
RETURNS:
predicted_face_count: Expected number of viewers
predicted_attention: Expected attention score
typical_income: Most common income level at that time
typical_lifestyle: Most common lifestyle segment at that time
confidence: Prediction confidence (0-1, based on sample count)
sample_count: Number of historical data points used
EXAMPLE: User: "What's the typical audience at this screen on Monday at 3pm?" get_audience_forecast({ screen_id: "507f1f77bcf86cd799439011", hour: 15, day: 1, lookback_days: 30 })
| Name | Required | Description | Default |
|---|---|---|---|
| day | Yes | | |
| hour | Yes | | |
| screen_id | Yes | | |
| lookback_days | No | | |
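Because the day-numbering convention is undocumented (a gap flagged in the parameter-clarity note below), a client mapping from a calendar date should hedge. This sketch assumes JavaScript's Date.getDay() convention (0 = Sunday), which matches the example only if day 1 means Monday:

```typescript
// Hypothetical MCP client helper.
declare function callTool(name: string, args: Record<string, unknown>): Promise<any>;

async function forecastFor(screenId: string, when: Date): Promise<any> {
  return callTool("get_audience_forecast", {
    screen_id: screenId,
    hour: when.getHours(), // 0-23
    day: when.getDay(),    // 0 = Sunday in JS; the tool's convention is unverified
    lookback_days: 30,
  });
}
```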
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It discloses the methodology ('Uses historical audience data'), explains prediction confidence (0-1 scale based on sample count), and documents all six return fields in the RETURNS section. Minor gap: no error conditions or data-freshness details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear headers (WHEN TO USE, RETURNS, EXAMPLE). Every sentence earns its place: purpose statement, three use cases, methodology note, six return value explanations, and concrete JSON example. Length is justified by 0% schema coverage and missing output schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Excellent compensation for missing output schema via detailed RETURNS section. Given 0% input schema coverage and no annotations, the example provides implicit context for all four parameters, though explicit parameter descriptions would strengthen completeness further.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage. The description compensates partially via the EXAMPLE showing valid values (hour: 15, day: 1, lookback_days: 30), but it lacks explicit semantics for the 'day' numbering (0 = Sunday vs. Monday) and the purpose of 'lookback_days'. The example coverage keeps it viable at baseline, but explicit parameter docs would be needed for full compensation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb 'Predict' targeting 'audience' at a 'screen at a specific time', clearly distinguishing from sibling get_live_audience (real-time) and predict_moment_quality (quality scoring). The forecasting intent is unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains explicit 'WHEN TO USE' section listing three specific scenarios: planning campaigns, estimating before buying, and comparing times of day. This provides clear context for selection over alternatives like get_live_audience.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_audience_lookalike
Find screens with similar audience profiles using pgvector similarity.
Uses 64-dimensional audience vectors with HNSW cosine similarity index to find screens whose audience demographics, attention, and behavioral patterns match a target screen.
WHEN TO USE:
Expanding campaign reach to screens with similar audiences
Finding new inventory that matches a high-performing screen
Building lookalike audience segments for targeting
RETURNS: Array of similar screens ranked by cosine similarity, each with:
screen_id, similarity (0-1), metadata (face_count, attention, income, lifestyle), last_seen
EXAMPLE: get_audience_lookalike({ screen_id: "scr_abc123", limit: 10, min_similarity: 0.8 })
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| country | No | | |
| screen_id | Yes | | |
| venue_type | No | | |
| min_similarity | No | | |
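A sketch of the reach-expansion workflow the WHEN TO USE section describes: start from a proven screen and keep only strong matches via min_similarity:

```typescript
// Hypothetical MCP client helper.
declare function callTool(name: string, args: Record<string, unknown>): Promise<any>;

async function expandFromSeedScreen(seedScreenId: string): Promise<string[]> {
  const matches = await callTool("get_audience_lookalike", {
    screen_id: seedScreenId,
    limit: 10,
    min_similarity: 0.8, // cosine-similarity floor, as in the example above
  });
  // RETURNS documents an array ranked by similarity.
  return matches.map((m: { screen_id: string }) => m.screen_id);
}
```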
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden and delivers substantial value: it discloses the 64-dimensional vector methodology, HNSW cosine similarity algorithm, and detailed return structure (similarity scores 0-1, metadata fields). It could improve by mentioning performance characteristics or caching behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections (technical overview, when to use, returns, example). Front-loaded with the core purpose. Technical details (HNSW, pgvector) are appropriately included to inform the agent about the similarity methodology without being overly verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema and no annotations, the description adequately documents return values through the RETURNS section (array structure, field definitions, similarity ranges). The complexity of the vector search operation is sufficiently contextualized for agent usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. While the EXAMPLE block demonstrates usage of screen_id, limit, and min_similarity, it fails to document the filtering semantics of 'country' and 'venue_type' parameters, leaving 40% of the parameter space unexplained.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb ('Find') and resource ('screens with similar audience profiles'), clearly distinguishing this from siblings like discover_inventory (browsing) or get_live_audience (real-time state). The technical method (pgvector similarity) further clarifies the unique capability.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides an explicit 'WHEN TO USE' section with three concrete scenarios (expanding campaign reach, finding inventory, building segments). This offers clear positive guidance, though it lacks explicit 'when not to use' comparisons or named alternative tools from the sibling list.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_billing_status
Check current billing status including whether billing is set up, credit balance, Stripe customer ID, and payment method status. Use this to determine if billing setup is needed before making paid API calls.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
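The description's own advice ('determine if billing setup is needed before making paid API calls') translates to a one-check gate. A sketch; the response field names are assumptions, since no output schema is published:

```typescript
// Hypothetical MCP client helper.
declare function callTool(name: string, args?: Record<string, unknown>): Promise<any>;

async function ensureBillingReady(): Promise<void> {
  const status = await callTool("get_billing_status");
  // Field names below are assumed for illustration; no output schema exists.
  if (!status.billing_set_up || !status.payment_method_ok) {
    throw new Error("Run setup_billing before making paid API calls");
  }
}
```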
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden and successfully discloses what data is retrieved (Stripe ID, credit balance, payment status). It clarifies this is a status retrieval operation. It does not mention rate limits or caching, but covers the essential behavioral scope.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste. First sentence front-loads the action and return value; second sentence provides usage context. No filler or redundant restatement of the tool name.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (no input params) and lack of output schema, the description compensates adequately by listing the four key data points returned. Sufficient for an agent to understand what information will be available after invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has zero parameters. Per scoring rules, 0 params establishes a baseline of 4. Description correctly does not invent parameter semantics where none exist.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Uses specific verb 'Check' with clear resource 'billing status' and enumerates exact fields returned (credit balance, Stripe customer ID, payment method status). Implicitly distinguishes from sibling 'setup_billing' by describing a read-only verification action versus setup/configuration.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'before making paid API calls' and the specific decision it enables: 'determine if billing setup is needed.' This naturally implies the alternative workflow (use setup_billing if not set up) without being verbose.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_campaign_attribution
Get comprehensive attribution summary for a DOOH campaign.
WHEN TO USE:
Measuring overall campaign effectiveness (reach, footfall, sales lift)
Getting a high-level view of campaign attribution metrics
Checking statistical significance of attribution results
RETURNS:
reach: uniqueDevices, totalImpressions, avgFrequency
footfall: exposedVisitors, controlVisitors, incrementalLiftPct, incrementalVisits
cost: totalMediaCost, costPerUniqueReach, costPerIncrementalVisit
quality: avgMatchConfidence, statisticalSignificance, isSignificant
dataFreshness: latestOutcomeAt, provisionalCount, finalizedCount
Returns null if no attribution data exists for the campaign.
| Name | Required | Description | Default |
|---|---|---|---|
| campaign_id | Yes | | |
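Two documented behaviors deserve defensive handling: the null return when no attribution data exists, and the isSignificant flag. A sketch using the documented field paths:

```typescript
// Hypothetical MCP client helper.
declare function callTool(name: string, args: Record<string, unknown>): Promise<any>;

async function reportFootfallLift(campaignId: string): Promise<string> {
  const summary = await callTool("get_campaign_attribution", { campaign_id: campaignId });
  if (summary === null) return "No attribution data exists for this campaign yet";
  const lift = summary.footfall.incrementalLiftPct;
  return summary.quality.isSignificant
    ? `Footfall lift: ${lift}% (statistically significant)`
    : `Footfall lift: ${lift}% (not significant; treat as provisional)`;
}
```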
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It extensively documents the return structure across five categories (reach, footfall, cost, quality, dataFreshness) and explicitly states 'Returns null if no attribution data exists.' It is missing operational details like caching behavior or latency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear section headers (WHEN TO USE, RETURNS). Front-loaded purpose statement followed by contextual details. Every sentence provides value; no repetition of schema constraints or tautology.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite lacking structured output schema, the RETURNS section comprehensively documents all return fields (15+ metrics) including data freshness indicators (provisional vs finalized). Appropriate completeness for a read-only retrieval tool with simple input requirements.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage for the single campaign_id parameter. While the parameter name is self-documenting and the schema includes length constraints, the description doesn't compensate by explaining the ID format, where to obtain it, or examples. This earns the baseline 3 for a minimal parameter count with self-evident naming.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb+resource: 'Get comprehensive attribution summary for a DOOH campaign.' The 'comprehensive' and 'DOOH' qualifiers distinguish it from sibling tools like get_attribution_timeseries (time-based) or get_creative_attribution (creative-level).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains explicit 'WHEN TO USE' section with three specific scenarios (measuring effectiveness, high-level views, checking significance). However, lacks explicit alternatives or exclusions (e.g., when to use get_attribution_timeseries instead for temporal analysis).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_campaign_heatmap
Get geographic exposure heatmap data for a campaign.
Returns lat/lng clusters with exposure counts and device reach, useful for visualizing where ads were shown on a map.
WHEN TO USE:
Visualizing campaign geographic coverage
Identifying hotspots of ad exposure
Analyzing geographic distribution of attributed foot traffic
RETURNS: Array of geographic clusters (max 500), each with:
lat, lng (rounded to 3 decimal places)
uniqueDevices, totalExposures
avgConfidence (match confidence score)
| Name | Required | Description | Default |
|---|---|---|---|
| campaign_id | Yes | | |
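Rounding lat/lng to 3 decimal places defines a grid of roughly 111 m per cell in latitude, so clusters can be merged client-side with the same key. A sketch:

```typescript
// 0.001 degrees of latitude is roughly 111 m, so 3-decimal rounding sets the grid cell.
function gridKey(lat: number, lng: number): string {
  return `${lat.toFixed(3)},${lng.toFixed(3)}`;
}

function mergeExposures(
  clusters: { lat: number; lng: number; totalExposures: number }[],
): Map<string, number> {
  const merged = new Map<string, number>();
  for (const c of clusters) {
    const key = gridKey(c.lat, c.lng);
    merged.set(key, (merged.get(key) ?? 0) + c.totalExposures);
  }
  return merged;
}
```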
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses important behavioral traits: data structure (lat/lng clusters), volume limits (max 500), precision (3 decimal places), and return fields (uniqueDevices, totalExposures, avgConfidence). However, it omits operational details like error handling, rate limits, or required permissions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clearly demarcated sections (main description, WHEN TO USE, RETURNS). Information is front-loaded with the core purpose. Every sentence provides unique value; no repetition of structured data or filler content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of an output schema, the description adequately compensates by detailing the return structure, field meanings, and data limits (max 500). With only one input parameter, the tool is inherently simple, though mentioning the campaign_id parameter would have achieved completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0% description coverage (campaign_id lacks a description field), and the tool description makes no mention of the parameter whatsoever. While the parameter name is semantically clear given the tool name, the description does not compensate for the complete lack of schema documentation with any context about valid campaign_id formats or sources.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb ('Get') and clearly identifies the resource ('geographic exposure heatmap data') and scope ('for a campaign'). It distinguishes itself from siblings like get_campaign_performance and get_campaign_attribution by emphasizing geographic visualization and map-based data.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains an explicit 'WHEN TO USE' section with three specific scenarios (visualizing coverage, identifying hotspots, analyzing foot traffic distribution). Provides clear positive guidance but lacks explicit exclusions or named alternatives for when other campaign tools might be more appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_campaign_performance
Get detailed performance metrics for a campaign.
WHEN TO USE:
Monitoring active campaign performance
Reviewing completed campaign results
Getting per-screen impression breakdowns
RETURNS:
campaign_id, name, status, budget, dates
performance: impressions, spend_estimate_usd, avg_cpm, unique_screens, avg_latency_ms
screen_breakdown: per-screen impressions and CPM
EXAMPLE: User: "How is my NYC retail campaign performing?" get_campaign_performance({ campaign_id: "550e8400-e29b-41d4-a716-446655440000" })
| Name | Required | Description | Default |
|---|---|---|---|
| campaign_id | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. The 'RETURNS' section excellently discloses the response structure (campaign fields, performance metrics, screen_breakdown), compensating for the lack of output schema. However, it omits explicit statements about read-only safety or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Perfectly structured with clear headers (WHEN TO USE, RETURNS, EXAMPLE). Information is front-loaded with the core purpose in sentence one. No wasted words; every section earns its place by adding distinct value beyond the schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter tool, the description is comprehensive. The detailed RETURNS section fully compensates for the missing output schema, documenting the nested structure (performance, screen_breakdown) that the agent needs to interpret results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0% description coverage (campaign_id lacks a description field). The tool description compensates partially via the EXAMPLE section showing a UUID format, but the main text lacks explicit parameter semantics (e.g., 'UUID identifier of the campaign to retrieve').
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with the specific verb 'Get' and resource 'detailed performance metrics for a campaign'. It distinguishes itself from sibling tools like get_campaign_attribution and get_campaign_heatmap by explicitly mentioning 'per-screen impression breakdowns' and specific metrics like avg_cpm and latency.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The 'WHEN TO USE' section provides three explicit scenarios (monitoring active, reviewing completed, per-screen breakdowns). While it lacks explicit 'when not to use' guidance or named alternative tools, the specific contexts provided effectively guide selection among the many sibling analytics tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_content_performance (Grade: A)
Get performance metrics for a video across the Trillboards DOOH network.
WHEN TO USE:
Checking how a specific video performs across screens (plays, attention, audience size)
Analyzing which venue types and dayparts a video resonates best in
Finding the top-performing screens for a piece of content
Comparing content performance over different time windows
RETURNS:
videoId, title, totalPlays, uniqueScreens
avgAttention (0-1), avgAudienceSize, avgDwellMs
venueDistribution: Array of { venue_type, plays }
daypartDistribution: Array of { daypart, plays }
topScreens: Top 10 screens by play count with attention scores
period: { start, end } date range
EXAMPLE: User: "How is video dQw4w9WgXcQ performing on retail screens?" get_content_performance({ video_id: "dQw4w9WgXcQ", venue_type: "retail", days: 30 })
User: "Show me the last 7 days of performance for this video" get_content_performance({ video_id: "abc123xyz", days: 7 })
| Name | Required | Description | Default |
|---|---|---|---|
| days | No | | |
| video_id | Yes | | |
| venue_type | No | | |
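Because the RETURNS section is the only output contract, consuming code has to lean on it directly. A minimal sketch of reading the documented daypartDistribution array to find where a video resonates best:

```typescript
interface DaypartSlice {
  daypart: string;
  plays: number;
}

// Pick the daypart with the most plays from the documented
// daypartDistribution array (assumes a non-empty result).
function bestDaypart(daypartDistribution: DaypartSlice[]): DaypartSlice | undefined {
  return daypartDistribution.reduce<DaypartSlice | undefined>(
    (best, cur) => (best && best.plays >= cur.plays ? best : cur),
    undefined,
  );
}

// Usage against a response shaped like the RETURNS block:
// bestDaypart(result.daypartDistribution) -> e.g. { daypart: "evening", plays: 1240 }
```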
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Comprehensive 'RETURNS' section discloses data structure (aggregates, distributions, top 10 screens), which substitutes for missing output schema. However, misses operational traits: no mention of data freshness/lag, rate limits, cost of query, or caching behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections (header, WHEN TO USE, RETURNS, EXAMPLE). Front-loaded purpose statement. Lengthy but appropriate given zero schema coverage and lack of output schema—every section adds necessary value that structured fields omit.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Highly complete for a complex analytics tool with no annotations or output schema. Use cases, return structure, and query examples provide rich context. Only gap is explicit parameter documentation, though examples partially cover this.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage (no parameter descriptions). Description partially compensates with examples showing 'retail' as venue_type and 30/7 as days values, but fails to document constraints (1-90 range for days), valid venue_type enumerations, or video_id format requirements.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb ('Get'), resource ('performance metrics'), and scope ('video across the Trillboards DOOH network'). Clearly distinguishes from sibling 'get_campaign_performance' (content-level vs campaign-level) and 'search_content' (metrics retrieval vs discovery).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Excellent 'WHEN TO USE' section lists four specific scenarios (checking plays/attention, analyzing venue types/dayparts, finding top screens, comparing time windows). Examples demonstrate query patterns. Lacks explicit 'when NOT to use' warnings (e.g., don't use for real-time streaming), but scope is clear from positive examples.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_content_recommendations (Grade: A)
Get best-performing content recommendations for a venue type and optional time context.
WHEN TO USE:
Deciding what content to schedule at a specific venue type
Finding content that drives the highest audience engagement at a location
Optimizing content rotation by daypart (morning, afternoon, evening, overnight)
Content programming decisions based on performance data
RETURNS:
data: Array of recommended content ranked by performance score
videoId, title, contentCategory, durationSeconds
totalPlays, uniqueScreens
avgAttention (0-1), avgDwellMs
performanceScore (composite of attention, replay density, dwell time)
meta: { count, venue_type, daypart, limit }
Performance score formula: attention(40%) + replay_density(30%) + dwell_time(30%)
EXAMPLE: User: "What content works best in bars during the evening?" get_content_recommendations({ venue_type: "bar", daypart: "evening", limit: 10 })
User: "Best performing content for transit screens" get_content_recommendations({ venue_type: "transit", limit: 20 })
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| daypart | No | | |
| venue_type | Yes | | |
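The quoted formula is specific enough to reproduce. A sketch of the composite score, assuming each component is already normalized to 0-1 (the description does not say how replay density and dwell time are scaled, so that normalization is an assumption):

```typescript
interface ContentSignals {
  attention: number;     // 0-1, matches avgAttention in the RETURNS block
  replayDensity: number; // assumed pre-normalized to 0-1
  dwellTime: number;     // assumed pre-normalized to 0-1
}

// performanceScore = attention(40%) + replay_density(30%) + dwell_time(30%)
function performanceScore(s: ContentSignals): number {
  return 0.4 * s.attention + 0.3 * s.replayDensity + 0.3 * s.dwellTime;
}
```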
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full disclosure burden. It excels by documenting the performance score formula (attention/replay/dwell weighting) and detailed return structure including metadata fields. Missing operational details like data freshness, caching behavior, or rate limits keep it from a 5.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear sections: purpose statement, WHEN TO USE bullets, RETURNS schema documentation, performance formula, and EXAMPLES. Every section earns its place; the length is appropriate given the lack of output schema and schema descriptions.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero schema descriptions, no annotations, and no output schema, the description is remarkably complete. It compensates with detailed return value documentation, usage examples showing different parameter combinations, and business logic explanation (scoring formula).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, requiring heavy description compensation. The daypart enum values are documented in the WHEN TO USE bullet. venue_type is mentioned, but its valid values aren't specified. limit appears only in the examples, with no textual description of its purpose or constraints (1-100). Partial compensation via examples.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves 'best-performing content recommendations' using a 'venue type' and optional 'time context', providing specific verb and resource. However, it does not explicitly distinguish from similar siblings like `get_content_performance` or `recommend_creative`.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The 'WHEN TO USE' section provides explicit scenarios including content scheduling decisions, engagement optimization, and daypart programming. While excellent for positive guidance, it lacks explicit 'when not to use' guidance or naming of alternative tools like `search_content` for browsing non-ranked content.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_creative_attention (Grade: A)
Get per-creative attention breakdown for a campaign.
WHEN TO USE:
A/B testing creative variants by attention score
Identifying which creative drives the most engagement
Comparing aCPM across creative assets
RETURNS: Array of creatives ranked by attention score, each with:
creativeId, totalImpressions, uniqueDevices
avgAttentionScore (0-1), avgDwellSeconds, avgFaceCount
attentionCpm, avgEmotionEngagement, positiveEmotionPct, attentionQualifiedPct
| Name | Required | Description | Default |
|---|---|---|---|
| campaign_id | Yes | | |
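To illustrate the A/B-testing use case the description targets, a sketch that picks a winning creative from the documented return array; the minimum-impression guard is our own addition, not documented server behavior:

```typescript
interface CreativeAttention {
  creativeId: string;
  totalImpressions: number;
  avgAttentionScore: number; // 0-1, per the RETURNS block
  attentionCpm: number;
}

// Choose the creative with the highest attention score, ignoring
// low-traffic variants (the threshold is a hypothetical guard).
function pickWinner(
  creatives: CreativeAttention[],
  minImpressions = 1_000,
): CreativeAttention | undefined {
  return creatives
    .filter((c) => c.totalImpressions >= minImpressions)
    .sort((a, b) => b.avgAttentionScore - a.avgAttentionScore)[0];
}
```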
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Extensive 'RETURNS' section documents output structure (array ranking, specific fields like avgAttentionScore 0-1), which adds value. However, lacks explicit statement that this is read-only/safe, pagination behavior, or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Perfectly structured with three distinct sections (purpose, WHEN TO USE, RETURNS). No filler text; every sentence provides actionable information. Appropriate length for the complexity level (single parameter input).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given single parameter input and no output schema, the RETURNS section adequately documents the expected response structure. Would benefit from mentioning pagination limits for large campaigns or error conditions, but covers the essential contract sufficiently.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage with only minLength/maxLength constraints. Description mentions 'for a campaign' which implies the campaign_id parameter exists, but adds no semantics about parameter format, where to obtain campaign IDs, or example values. Fails to compensate for empty schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description opens with specific verb+resource: 'Get per-creative attention breakdown for a campaign.' Clearly distinguishes from sibling get_attention_metrics (general metrics) and get_creative_attribution (attribution focus) by specifying 'per-creative attention breakdown' granularity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section lists three specific scenarios: A/B testing creative variants, identifying engagement drivers, and comparing aCPM. Provides clear contextual triggers for tool selection without needing to infer from the name.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_creative_attribution (Grade: A)
Get attribution performance by individual creative variant.
Links creative execution to attribution outcomes: which creative variant drove the most store visits?
WHEN TO USE:
Comparing creative A/B/C test performance on attribution outcomes
Finding the optimal creative x venue_type x daypart x weather combination
Identifying the creative with the highest visit rate
RETURNS: Array of creatives ranked by store visits, each with:
creativeId, variant, totalVisits, avgVisitRate
attention: avgScore, avgDwell, avgEmotion, dominantEmotion
avgLiftPct, avgCostPerVisit
bestContext: { venueType, daypart, weather }
dateRange: { first, last, daysMeasured }
| Name | Required | Description | Default |
|---|---|---|---|
| campaign_id | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It comprehensively documents the return structure in the 'RETURNS' section (ranking by store visits, attention metrics, bestContext, dateRange), compensating for the lack of output schema. However, it omits operational details like whether the operation is read-only, rate limits, or authentication requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with clear section headers (WHEN TO USE, RETURNS) and front-loaded purpose statement. Every sentence earns its place—no filler text. The multi-line format improves readability without sacrificing density of information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's analytical complexity and the absence of an output schema, the description provides exceptional completeness by detailing the exact return structure including nested objects (bestContext, dateRange) and specific metrics (avgLiftPct, dominantEmotion). The single input parameter (campaign_id) is simple enough that its omission in the description doesn't critically impair usability.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0% description coverage (campaign_id lacks a description field), and the description text fails to compensate by explaining the parameter's semantics, constraints, or format. While 'campaign_id' is somewhat self-explanatory, the rubric requires the description to make up for low schema coverage, and this one does not.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool 'Get[s] attribution performance by individual creative variant' and distinguishes its purpose from siblings like get_campaign_attribution or get_creative_attention by specifying it links 'creative execution to attribution outcomes: which creative variant drove the most store visits?' This provides a specific verb, resource, and outcome that differentiates it from related tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains an explicit 'WHEN TO USE' section listing three specific scenarios: comparing A/B/C test performance, finding optimal creative x context combinations, and identifying highest visit rates. This provides clear contextual guidance for when to select this tool over alternatives like get_campaign_attribution or get_creative_attention.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_cross_channel_journey (Grade: A)
Get cross-channel customer journey data (Sankey flow) for a campaign.
Shows how users flow between channels: DOOH -> mobile -> web -> store.
WHEN TO USE:
Visualizing the customer journey across DOOH and digital channels
Understanding channel transition patterns
Building Sankey diagrams of marketing funnels
RETURNS:
flows: Array of { source, target, count } transitions between channels
channels: Array of { channel, touchpoints, uniqueDevices } distribution
| Name | Required | Description | Default |
|---|---|---|---|
| campaign_id | Yes | | |
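The documented flows array maps directly onto the node/link structure most Sankey renderers expect. A library-agnostic sketch using only the fields named in the RETURNS block:

```typescript
interface Flow {
  source: string;
  target: string;
  count: number;
}

// Convert { source, target, count } transitions into indexed Sankey
// nodes and links, e.g. for d3-sankey-style renderers.
function toSankey(flows: Flow[]) {
  const names = [...new Set(flows.flatMap((f) => [f.source, f.target]))];
  const index = new Map<string, number>(
    names.map((n, i) => [n, i] as [string, number]),
  );
  return {
    nodes: names.map((name) => ({ name })),
    links: flows.map((f) => ({
      source: index.get(f.source)!,
      target: index.get(f.target)!,
      value: f.count,
    })),
  };
}
```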
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It compensates partially by detailing return structure (flows array with source/target/count, channels array with touchpoints/uniqueDevices) and example channel transitions. However, it lacks disclosure on safety profile (read-only status), rate limits, or error behaviors.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear sections (main description, WHEN TO USE, RETURNS). No filler text; every line provides value. Front-loaded with the core action, and efficiently compensates for missing output schema by documenting return values directly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero schema descriptions and no output schema, the description provides substantial context by documenting the return data structure and usage scenarios. Only minor gap is the lack of explicit campaign_id parameter documentation; otherwise comprehensive for a single-parameter read operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage for the single required parameter (campaign_id). Description mentions 'for a campaign' which provides implicit context, but fails to explicitly document the parameter's purpose, format, or constraints, leaving significant semantic gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Get' with resource 'cross-channel customer journey data (Sankey flow)' and scope 'for a campaign'. The mention of 'Sankey flow' and specific channel examples (DOOH -> mobile -> web -> store) clearly distinguishes it from siblings like get_campaign_performance or get_analytics.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section lists three specific scenarios (visualizing journeys, understanding transitions, building Sankey diagrams). Provides clear context for invocation, though lacks explicit 'when not to use' guidance or named sibling alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_dataset_stats (Grade: A)
Get statistics about available causal training data: total tuples, unique creatives, venue diversity, date range.
Queries observation_stream for rows that have both a creative ID and a VAS outcome recorded, giving a picture of how much training data is available for the causal prediction engine.
WHEN TO USE:
Checking if enough data exists for reliable causal predictions
Understanding the diversity of training data (creatives, venues, time range)
Monitoring causal dataset health and growth
Planning data collection strategies
RETURNS:
data: Dataset statistics
total_tuples: number of context-action-outcome records
unique_creatives: number of distinct creatives with VAS data
unique_venue_types: number of distinct venue types represented
date_range: { start, end } of available data
observations_per_creative: { min, max, mean, median } distribution
metadata: { query_window_days }
suggested_next_queries: Follow-up queries
EXAMPLE: User: "How much causal training data do we have?" get_dataset_stats({})
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
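The observations_per_creative summary is a standard descriptive-statistics rollup. Purely for illustration, this is how such a summary is computed from per-creative counts:

```typescript
// Summarize per-creative observation counts the way the
// observations_per_creative field is described: min, max, mean, median.
// Assumes a non-empty input array.
function summarize(counts: number[]) {
  const sorted = [...counts].sort((a, b) => a - b);
  const n = sorted.length;
  const mid = Math.floor(n / 2);
  return {
    min: sorted[0],
    max: sorted[n - 1],
    mean: sorted.reduce((s, x) => s + x, 0) / n,
    median: n % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2,
  };
}
```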
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, description carries the burden well. It discloses the data source (observation_stream), filtering logic (rows with both creative ID and VAS outcome), and intended effect (picture of training data availability). It also details the return structure. Minor gap: doesn't explicitly state read-only nature or performance characteristics (e.g., query cost).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with clear summary sentence. Structure uses headers (WHEN TO USE, RETURNS, EXAMPLE) for scannability. The RETURNS section is lengthy, but justified because no output schema exists to document the nested return structure elsewhere. No redundant tautology.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a zero-parameter tool without output schema. Description compensates for missing outputSchema by fully documenting return values (data, metadata, suggested_next_queries). Purpose is clear even among the server's 74 tools. The example demonstrates practical usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has zero parameters with 100% coverage (empty object). Baseline score of 4 applies. Description adds value through the EXAMPLE section showing invocation with empty object {}, confirming no parameters are expected.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description opens with specific verb+resource: 'Get statistics about available causal training data' and lists concrete dimensions (tuples, creatives, venue diversity). It distinguishes from generic analytics siblings like get_analytics by specifying this queries observation_stream specifically for the causal prediction engine training context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains explicit 'WHEN TO USE' section with four specific scenarios (checking data sufficiency, understanding diversity, monitoring health, planning collection). This provides clear positive guidance, though it lacks explicit 'when not to use' statements or named alternatives (e.g., query_observations for raw data).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_device (Grade: A)
Get detailed information about a specific device.
WHEN TO USE:
Checking status of a single device
Getting device configuration details
Debugging device issues
RETURNS:
device_id: Your internal device ID
trillboards_device_id: Internal Trillboards ID
fingerprint: Device fingerprint
name: Device name
status: online/offline
last_seen: Last heartbeat timestamp
location: Location details
specs: Device specifications
stats: Impression and earnings stats
EXAMPLE: User: "Get details for vending machine 001" get_device({ device_id: "vending-001-nyc" })
| Name | Required | Description | Default |
|---|---|---|---|
| device_id | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It extensively documents the return structure (9 fields including status, location, specs, stats) which compensates for the lack of output schema. However, it omits error behavior (e.g., device not found), permission requirements, and whether this is cached or real-time.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Uses clear structural headers (WHEN TO USE, RETURNS, EXAMPLE) that make the description scannable. Despite listing 9 return fields, every sentence earns its place by compensating for missing structured schemas. The example is concrete and immediately clarifies usage patterns.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero schema descriptions, no output schema, and no annotations, the description is remarkably complete. It documents all return fields (substituting for output schema), provides usage scenarios, and includes a working example. Sufficient for an agent to invoke correctly without external documentation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. The EXAMPLE section provides crucial semantic context showing 'vending-001-nyc' as a descriptive internal ID format. The RETURNS section clarifies that device_id is 'Your internal device ID', adding meaning beyond the raw schema type (string).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states 'Get detailed information about a specific device' — a specific verb (get) + resource (device) + scope (specific/single). The 'specific device' qualifier effectively distinguishes this from sibling list_devices (bulk retrieval) and delete_device (mutation).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The 'WHEN TO USE' section provides explicit scenarios: checking status of a single device, getting configuration details, and debugging. While it clearly implies this is for single-device lookups versus bulk operations, it does not explicitly name sibling alternatives like list_devices or differentiate from get_device_ads.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_device_ads (Grade: A)
Get current ads scheduled for a device (for testing).
WHEN TO USE:
Testing device ad delivery
Debugging which ads are being shown
Verifying ad targeting is working
RETURNS:
ads: Array of advertisement objects
default_stream: Default content when no ads
schedule: Current ad schedule
EXAMPLE: User: "What ads are showing on device P_abc123?" get_device_ads({ fingerprint: "P_abc123" })
| Name | Required | Description | Default |
|---|---|---|---|
| fingerprint | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full disclosure burden. It compensates well by including a 'RETURNS' section documenting the response structure (ads array, default_stream, schedule) since no output schema exists. However, it lacks mention of whether data is real-time vs cached, rate limits, or permission requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear section headers (WHEN TO USE, RETURNS, EXAMPLE) that front-load the critical information. Every sentence earns its place; the example is concise and immediately illustrates both parameter usage and expected format without verbosity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple single-parameter read operation without output schema, the description is complete. It compensates for missing schema documentation via the example, documents return values via the RETURNS section, and provides sufficient usage context through the WHEN TO USE guidelines.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage for the 'fingerprint' parameter, the description partially compensates via the EXAMPLE section showing 'P_abc123' format. However, it fails to explicitly define what a fingerprint represents (device ID, hardware hash, etc.) or its semantic meaning beyond the pattern hint in the example.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb ('Get') and resource ('current ads scheduled for a device'), clearly distinguishing it from sibling tools like get_device (which retrieves device metadata rather than ad schedules). The parenthetical '(for testing)' effectively scopes the tool's intended use case.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains an explicit 'WHEN TO USE' section with three specific scenarios (testing delivery, debugging ads shown, verifying targeting). This provides clear contextual triggers for agent invocation that go beyond the basic function description.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_incrementality (Grade: A)
Get incrementality/lift test results for a campaign.
Uses Bayesian (Beta-Binomial with 10K Monte Carlo samples) and frequentist (chi-square with Yates correction) methods for causal measurement.
WHEN TO USE:
Proving causal DOOH advertising effectiveness
Getting both Bayesian and frequentist significance measures
Seeing treatment vs control group visit rates and lift
RETURNS: Array of experiments, each with:
experimentId, type (geo_holdout/ghost_ads/psm), status
treatmentDmas, controlDmas
latestResult: treatment/control rates, lift%, incrementalVisits, pValue, posteriorProbPositive, expectedUplift, credibleInterval
Returns empty array if no experiments exist for this campaign.
| Name | Required | Description | Default |
|---|---|---|---|
| campaign_id | Yes | | |
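The methodology line is unusually specific, so the Bayesian half can be sketched end to end: with uniform Beta(1, 1) priors, each arm's posterior visit rate is Beta(1 + visits, 1 + non-visits), and posteriorProbPositive is the share of 10K Monte Carlo draws in which the treatment rate beats control. The prior choice and sampler below are our assumptions; the server's implementation may differ:

```typescript
// Standard normal draw via Box-Muller.
function randn(): number {
  let u = 0, v = 0;
  while (u === 0) u = Math.random();
  while (v === 0) v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Marsaglia-Tsang gamma sampler, valid for shape >= 1
// (always true here, since shape = 1 + a non-negative count).
function randGamma(shape: number): number {
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  for (;;) {
    const x = randn();
    const v = (1 + c * x) ** 3;
    if (v <= 0) continue;
    const u = Math.random();
    if (Math.log(u) < 0.5 * x * x + d - d * v + d * Math.log(v)) return d * v;
  }
}

// Beta(a, b) draw as Ga / (Ga + Gb).
function randBeta(a: number, b: number): number {
  const x = randGamma(a);
  return x / (x + randGamma(b));
}

// P(treatment visit rate > control visit rate) under Beta-Binomial posteriors.
function posteriorProbPositive(
  tVisits: number, tExposed: number,
  cVisits: number, cExposed: number,
  samples = 10_000,
): number {
  let wins = 0;
  for (let i = 0; i < samples; i++) {
    const t = randBeta(1 + tVisits, 1 + (tExposed - tVisits));
    const c = randBeta(1 + cVisits, 1 + (cExposed - cVisits));
    if (t > c) wins++;
  }
  return wins / samples;
}
```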
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, description carries full burden and discloses statistical methodology (10K Monte Carlo samples, Yates correction), return structure complexity, and empty-array behavior. Lacks operational details like rate limits, caching, or error conditions beyond the no-experiments case.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections (implied headers), logical progression from purpose to methodology to usage to returns. No redundant text; every sentence adds specific value about statistical approach or return data structure. Appropriate length for the tool's analytical complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given lack of output schema and no annotations, description comprehensively documents return structure (experimentId, DMAs, statistical fields) and methodology. Would benefit from mentioning error cases (invalid campaign_id, permission errors) or computational cost of Monte Carlo sampling.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage for the single campaign_id parameter. The description implies the parameter through context ('for a campaign') but does not explicitly name campaign_id or describe its constraints (minLength 1, maxLength 128). This barely meets the minimum compensation for the schema gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description opens with specific verb 'Get' and precise resource 'incrementality/lift test results', immediately distinguishing it from sibling analytics tools like get_analytics or get_campaign_performance. The statistical methodology detail (Bayesian Beta-Binomial, frequentist chi-square) further clarifies the specific measurement approach.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section lists three specific scenarios (proving causal DOOH effectiveness, getting significance measures, seeing treatment vs control rates). This clearly positions the tool against alternatives by emphasizing 'causal' measurement versus standard correlation-based analytics available in sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_live_audience (Grade: A)
Get real-time audience data for a specific screen.
WHEN TO USE:
Checking current audience at a screen before buying
Monitoring audience during a live campaign
Getting detailed audience signals (attention, mood, purchase intent, demographics)
RETURNS real-time data from edge AI sensors (refreshed every 10 seconds):
face_count: Number of people currently viewing
attention_score: How attentively the audience is watching (0-1)
income_level: Estimated income bracket (from Gemini Vision)
mood: Current audience mood
lifestyle: Primary lifestyle segment
purchase_intent: Purchase intent level
crowd_density: Estimated venue occupancy
ad_receptivity: How receptive the audience is to ads (0-1)
emotional_engagement: Emotional engagement score (0-1)
group_composition: Solo/couples/families/friends/work groups
signals_age_ms: How fresh the data is in milliseconds
EXAMPLE: User: "What's the current audience at screen 507f1f77bcf86cd799439011?" get_live_audience({ screen_id: "507f1f77bcf86cd799439011" })
| Name | Required | Description | Default |
|---|---|---|---|
| screen_id | Yes | | |
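Because the description documents both the 10-second refresh cadence and the signals_age_ms field, an agent can gate decisions on staleness. A small sketch with hypothetical thresholds:

```typescript
interface LiveAudience {
  face_count: number;
  attention_score: number; // 0-1
  ad_receptivity: number;  // 0-1
  signals_age_ms: number;  // freshness of the edge AI reading
}

// Treat a reading as actionable only if it is fresher than one
// refresh cycle (10s) and the audience is actually attentive.
// Both thresholds are illustrative, not server-defined.
function isActionable(a: LiveAudience): boolean {
  return (
    a.signals_age_ms <= 10_000 &&
    a.face_count > 0 &&
    a.attention_score >= 0.5
  );
}
```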
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, but description fully compensates by disclosing data refresh rate (10 seconds), source (edge AI sensors, Gemini Vision), data freshness metric (signals_age_ms), and specific AI-derived attributes (mood, purchase intent, income_level).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear section headers (WHEN TO USE, RETURNS, EXAMPLE). Front-loaded purpose statement. RETURNS section efficiently documents 11 fields without prose bloat. Every sentence earns its place given lack of output schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, yet description comprehensively documents all return fields (face_count, attention_score, etc.). No annotations exist, yet behavioral traits are fully disclosed. Single parameter is demonstrated in example. Complete for the tool's complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage for screen_id parameter. Description includes helpful EXAMPLE showing parameter usage with realistic ID format, but does not explicitly document what screen_id represents or how to obtain it. Baseline 3 for partial compensation via example.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opening sentence 'Get real-time audience data for a specific screen' provides specific verb (Get), resource (real-time audience data), and scope (specific screen). The real-time emphasis distinguishes it from sibling get_audience_forecast.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section lists three specific scenarios (checking before buying, monitoring live campaigns, getting detailed signals). Clear context but lacks explicit mention of alternatives like get_audience_forecast for predictive vs real-time data.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_media_buy_delivery (Grade: B)
[AdCP Media Buy] Get delivery/performance report for a media buy.
Returns campaign performance with breakdowns by screen, venue, hour, and audience segment.
WHEN TO USE:
Monitoring campaign delivery in real-time
Getting performance breakdowns for optimization
Reporting on campaign results
RETURNS:
delivery: impressions, spend, avg_cpm, unique_screens, fill_rate
breakdowns: by_screen, by_venue, by_hour (top performers)
EXAMPLE: get_media_buy_delivery({ media_buy_id: "mbuy_abc123" })
| Name | Required | Description | Default |
|---|---|---|---|
| dimensions | No | | |
| breakdown_by | No | | |
| media_buy_id | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It compensates by detailing the return structure (delivery metrics and breakdowns) since no output schema exists. However, it lacks operational details like caching behavior, rate limits, or permission requirements that would be essential for a data retrieval tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear section headers (WHEN TO USE, RETURNS, EXAMPLE). Information is front-loaded with the action statement. The RETURNS section efficiently documents the data structure without verbosity, though the bullet format is slightly informal.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero schema coverage, no annotations, and no output schema, the description adequately compensates by documenting return values and use cases. However, the incomplete parameter documentation leaves a significant gap for a 3-parameter tool where users need to understand the difference between 'dimensions' and 'breakdown_by'.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, requiring the description to fully document parameters. While it mentions available breakdown types (screen, venue, hour, audience_segment), it fails to explain the critical distinction between the 'dimensions' and 'breakdown_by' parameters (which have overlapping enums) or document the required media_buy_id format beyond the example.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool 'Get[s] delivery/performance report for a media buy' with the '[AdCP Media Buy]' prefix providing domain context. It distinguishes from siblings like get_campaign_performance by specifying the 'media buy' resource rather than general campaign analytics.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains an explicit 'WHEN TO USE' section listing three specific scenarios (monitoring delivery, optimization breakdowns, reporting). However, it lacks 'when not to use' guidance or explicit sibling alternatives (e.g., when to use get_campaign_performance instead).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_multi_touch_attribution (Grade: A)
Get multi-touch attribution model results for a campaign.
Supported models: time_decay, position_based, attention_weighted.
WHEN TO USE:
Understanding how DOOH fits into the full marketing funnel
Seeing credit allocation across DOOH, mobile, web, and store channels
Quantifying DOOH's contribution to conversions
RETURNS:
totalChains: number of multi-touch journeys found
avgTouchpoints: average touchpoints per chain
channelAttribution: { dooh, mobile, web, store } (each 0-1, sums to 1)
conversions: total conversion events
totalConversionValue: sum of conversion values (cents)
avgConfidence: average match confidence across chains
Returns null if no multi-touch chains exist.
| Name | Required | Description | Default |
|---|---|---|---|
| campaign_id | Yes | | |
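Of the three named models, time_decay is the easiest to sketch: each touchpoint's credit decays with its age at conversion, and per-channel credit is normalized so the shares sum to 1, matching the documented channelAttribution contract. The half-life is a hypothetical parameter:

```typescript
type Channel = "dooh" | "mobile" | "web" | "store";

interface Touchpoint {
  channel: Channel;
  msBeforeConversion: number; // age of the touch relative to the conversion
}

// Time-decay attribution: weight = 0.5^(age / halfLife), then
// normalize so the channel shares sum to 1 (each 0-1).
function timeDecayAttribution(
  touches: Touchpoint[],
  halfLifeMs = 24 * 3600 * 1000, // hypothetical 24h half-life
): Record<Channel, number> {
  const credit: Record<Channel, number> = { dooh: 0, mobile: 0, web: 0, store: 0 };
  let total = 0;
  for (const t of touches) {
    const w = Math.pow(0.5, t.msBeforeConversion / halfLifeMs);
    credit[t.channel] += w;
    total += w;
  }
  for (const ch of Object.keys(credit) as Channel[]) credit[ch] /= total || 1;
  return credit;
}
```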
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses supported attribution models, detailed return structure (including value ranges like 0-1 summing to 1 and cents denomination), and critical edge case ('Returns null if no multi-touch chains exist'). Lacks explicit safety/idempotency statements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear sections: one-line purpose, supported models list, WHEN TO USE bullets, and RETURNS documentation. Every sentence adds value; no redundancy. Front-loaded with the core action statement.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's analytical complexity and lack of structured output schema, the RETURNS section comprehensively documents output fields, types, and semantics (cents, confidence scores). However, incomplete due to missing parameter documentation for the single required input.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% (campaign_id lacks a description field). The description mentions 'for a campaign' but never explicitly documents the campaign_id parameter, its format, its constraints (minLength/maxLength from the schema), or how to obtain a valid ID. It does not compensate for the undocumented schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb 'Get' and clear resource 'multi-touch attribution model results'. Distinguishes from siblings like get_campaign_attribution by specifying supported models (time_decay, position_based, attention_weighted) and emphasizing cross-channel analysis (DOOH, mobile, web, store).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains explicit 'WHEN TO USE' section with three concrete scenarios: understanding DOOH in the funnel, credit allocation across channels, and quantifying DOOH contribution. This provides clear context for selection without requiring the agent to guess based on the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_network_stats (Grade: A)
Get network-wide statistics across all partner screens.
WHEN TO USE:
Getting a high-level overview of network performance
Checking how many screens are online
Reviewing total impressions and revenue estimates
RETURNS:
total_screens, online_screens
impressions, total_auctions
revenue_estimate_usd, avg_cpm, fill_rate
EXAMPLE: User: "How is my network performing this week?" get_network_stats({ time_range: "7d" })
| Name | Required | Description | Default |
|---|---|---|---|
| time_range | No | | |
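Neither the schema nor the description pins down how the aggregates relate, so the definitions below (CPM as revenue per 1,000 impressions, fill rate as impressions over auctions) are assumptions; they show how an agent might cross-check a response:

```typescript
interface NetworkStats {
  impressions: number;
  total_auctions: number;
  revenue_estimate_usd: number;
  avg_cpm: number;
  fill_rate: number;
}

// Cross-check the returned aggregates under assumed standard definitions.
function looksConsistent(s: NetworkStats, tolerance = 0.05): boolean {
  const impliedCpm = (s.revenue_estimate_usd / s.impressions) * 1000;
  const impliedFill = s.impressions / s.total_auctions;
  const close = (a: number, b: number) =>
    Math.abs(a - b) <= tolerance * Math.max(a, b);
  return close(impliedCpm, s.avg_cpm) && close(impliedFill, s.fill_rate);
}
```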
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. Compensates partially by documenting return fields (total_screens, revenue_estimate_usd, etc.) but omits safety traits (read-only vs destructive), rate limits, or aggregation methodology. 'Get' implies safe read operation, but this is not explicit.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear section headers (WHEN TO USE, RETURNS, EXAMPLE). Front-loaded purpose statement. Zero redundant text; every sentence serves selection or invocation guidance. Appropriate length for single-parameter tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, but description documents return structure comprehensively (7 fields listed). Example demonstrates valid invocation pattern. Minor gap: parameter semantics not explained textually. Otherwise complete for a simple aggregation endpoint.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage (time_range property lacks description field). Description shows parameter usage in example ('time_range: 7d') but fails to explain parameter semantics, valid values, or defaults in prose. Relies entirely on schema enum constraints without textual explanation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb 'Get' + resource 'network-wide statistics' + scope 'across all partner screens'. Effectively distinguishes from sibling analytics tools (e.g., get_campaign_performance, get_analytics) by emphasizing 'network-wide' aggregation vs. specific entity queries.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section lists three concrete scenarios (high-level overview, checking online screens, reviewing impressions/revenue). Provides clear context for selection. Lacks explicit 'when not to use' or named sibling alternatives, preventing a 5.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_partner_info (Grade: A)
Get information about the authenticated partner account.
WHEN TO USE:
Checking current partner status and stats
Verifying API key is working
Getting partner account details
RETURNS:
partner_id: Partner identifier
company_name: Registered company name
status: Account status (active, suspended, etc.)
device_count: Number of registered devices
total_impressions: Lifetime impression count
earnings: Earnings summary
EXAMPLE: User: "What's my partner account status?" get_partner_info({})
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses the authentication context ('authenticated partner account') and comprehensively documents the return structure (partner_id, status, earnings, etc.), compensating for the lack of output schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear section headers (WHEN TO USE, RETURNS, EXAMPLE). No wasted words; every sentence provides actionable guidance or data structure information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema exists, the description comprehensively documents all return fields. For a zero-parameter read operation, the description covers purpose, usage contexts, return values, and invocation example—sufficient for correct agent operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Zero parameters present, establishing baseline of 4. The example 'get_partner_info({})' correctly signals the empty parameter requirement, reinforcing the schema structure.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific verb ('Get') and resource ('information about the authenticated partner account'), distinguishing it from siblings like register_partner (creation) and get_device/get_campaign (different entities).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit WHEN TO USE section with three concrete scenarios (checking status, verifying API key, getting details). Lacks explicit 'when not to use' or named alternatives (e.g., contrast with register_partner), but the scope is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_pricing
Get machine-readable pricing for all Trillboards products. Returns graduated usage-based pricing, free tier thresholds, and committed-use discount tiers. No authentication required — use this to evaluate costs before integrating.
| Name | Required | Description | Default |
|---|---|---|---|
| product | No | | |
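To make "graduated usage-based pricing" concrete, here is a small worked sketch of tiered billing with a free tier. The tier boundaries and rates are invented for illustration; they are not Trillboards' actual prices.

```typescript
// Illustrative only: tiers and rates below are invented, not Trillboards' pricing.
interface Tier { upTo: number | null; centsPerUnit: number } // null = unbounded top tier

const tiers: Tier[] = [
  { upTo: 10_000, centsPerUnit: 0 },    // free tier threshold
  { upTo: 100_000, centsPerUnit: 0.5 }, // graduated paid tier
  { upTo: null, centsPerUnit: 0.2 },    // committed-use style discount above 100k
];

// Graduated pricing: each unit is billed at the rate of the tier it falls into.
function costCents(units: number): number {
  let remaining = units;
  let floor = 0;
  let total = 0;
  for (const t of tiers) {
    const span = t.upTo === null ? remaining : Math.min(remaining, t.upTo - floor);
    total += span * t.centsPerUnit;
    remaining -= span;
    floor = t.upTo ?? floor;
    if (remaining <= 0) break;
  }
  return total;
}

console.log(costCents(150_000)); // 0*10k + 0.5*90k + 0.2*50k = 55,000 cents
```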
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden and succeeds in disclosing auth requirements (none needed) and return structure (graduated pricing, free tiers, discounts). Could improve by mentioning caching behavior or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three well-structured sentences with zero waste: action statement, return value specification, and auth/usage guidance. Front-loaded with the essential verb and resource.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriate for a simple single-parameter tool without output schema. Describes return content (pricing tiers) adequately. Minor gap in not clarifying the optional parameter's default behavior when omitted.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage for the 'product' parameter. The description mentions 'all Trillboards products' but fails to clarify that the parameter is optional, what specific values are accepted (enum meanings), or whether omitting it returns all products versus a default.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Get') and resource ('machine-readable pricing') with scope ('all Trillboards products'). Distinguishes from billing/purchase siblings by emphasizing 'evaluate costs before integrating,' though it could explicitly differentiate from get_products.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use context ('evaluate costs before integrating') and prerequisites ('No authentication required'). Lacks explicit when-not-to-use distinctions versus related tools like get_products or setup_billing.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_products
[AdCP Media Buy] Get available DOOH advertising products and packages.
Returns DOOH screen packages organized by venue type, location, and audience profile, with real-time pricing based on current demand and audience quality.
WHEN TO USE:
Browsing available inventory before creating a campaign
Comparing pricing across venue types and locations
Understanding what's available in a specific market
RETURNS:
products: Array of product packages with pricing, reach, and audience data
Each product includes: name, venue_type, screen_count, pricing, avg_audience
EXAMPLE: User: "What DOOH products are available in NYC?" get_products({ market: "New York", venue_types: ["retail", "transit"] })
| Name | Required | Description | Default |
|---|---|---|---|
| brief | No | | |
| market | No | | |
| filters | No | | |
| buying_mode | No | | |
| venue_types | No | | |
| audience_profile | No | | |
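Because the six parameters ship without schema descriptions, an agent has to reverse-engineer the request shape from the EXAMPLE. A hedged TypeScript rendering of that shape might look like this; every field not shown in the EXAMPLE is an assumption.

```typescript
// Hedged sketch of the request shape implied by the EXAMPLE; fields beyond
// `market` and `venue_types` are assumptions, since the schema ships no descriptions.
interface GetProductsRequest {
  brief?: string;                             // free-text campaign brief (assumed)
  market?: string;                            // e.g. "New York" (from the EXAMPLE)
  venue_types?: string[];                     // e.g. ["retail", "transit"] (from the EXAMPLE)
  buying_mode?: string;                       // enum values undocumented; treat as opaque
  filters?: Record<string, unknown>;          // nested object; structure undocumented
  audience_profile?: Record<string, unknown>; // nested object; structure undocumented
}

const request: GetProductsRequest = {
  market: "New York",
  venue_types: ["retail", "transit"],
};
console.log(JSON.stringify(request));
```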
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It successfully discloses key behavioral traits: real-time pricing dynamics based on demand/audience quality, and the organization structure (venue type, location, audience profile). It could improve by explicitly stating this is a read-only/safe operation, but the 'Get' verb and output documentation provide implicit clarity.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clearly delineated sections (description, WHEN TO USE, RETURNS, EXAMPLE). Every sentence earns its place; no redundancy. The information density is high with no filler text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the high complexity (6 parameters, nested objects, 0% schema coverage) and lack of output schema, the description partially compensates with the RETURNS section explaining output structure and the EXAMPLE showing input usage. However, significant gaps remain regarding input parameter semantics, particularly for the buying_mode enum and filters object structure.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage across 6 complex parameters (including nested objects for filters and audience_profile, and an enum for buying_mode), the description fails to compensate adequately. While the EXAMPLE demonstrates market and venue_types usage, it leaves brief, buying_mode enum values, audience_profile fields, and the complex filters object completely unexplained.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb ('Get') and resource ('DOOH advertising products and packages'), and the [AdCP Media Buy] prefix clearly scopes the domain. It distinguishes from siblings like create_campaign by emphasizing browsing/comparing inventory rather than creating or managing campaigns.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The explicit 'WHEN TO USE' section provides three concrete scenarios: browsing before campaign creation, comparing pricing across venues, and market exploration. This clearly positions the tool relative to siblings like create_campaign (use this first) and get_pricing (use this for cross-venue comparison).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_roas
Get Return on Ad Spend (ROAS) with transaction attribution data.
Closes the ROAS loop: matches purchase events to DOOH exposures with time-decay weighting, and computes attributed revenue and incremental ROAS.
WHEN TO USE:
Measuring revenue directly attributable to DOOH advertising
Getting ROAS and incremental ROAS (iROAS) figures
Seeing sales lift between exposed and control groups
RETURNS:
transactions: total, uniquePurchasers, totalRevenueCents, avgBasketCents
attribution: attributedTransactions, attributedRevenueCents, totalMediaCostCents, roas, iroas
salesLift: exposedPurchasers, controlPurchasers, incrementalTransactions, incrementalRevenueCents, salesLiftPct, posteriorProbPositive
timing: avgHoursToPurchase, medianHoursToPurchase
Returns null if no transaction data exists.
| Name | Required | Description | Default |
|---|---|---|---|
| campaign_id | Yes | | |
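The description names time-decay weighting but not the decay function. A plausible sketch, assuming an exponential decay with a 24-hour half-life (both the form and the half-life are assumptions), looks like this:

```typescript
// Sketch of time-decay attribution: the description names the technique, not the
// function; the exponential form and 24 h half-life here are assumptions.
interface Exposure { screenId: string; at: number } // epoch ms

const HALF_LIFE_MS = 24 * 3600 * 1000;

function decayWeight(exposureAt: number, purchaseAt: number): number {
  const age = purchaseAt - exposureAt;
  return age < 0 ? 0 : Math.pow(0.5, age / HALF_LIFE_MS);
}

// Split one purchase's revenue across prior exposures, proportional to decayed weight.
function attribute(revenueCents: number, purchaseAt: number, exposures: Exposure[]) {
  const weights = exposures.map((e) => decayWeight(e.at, purchaseAt));
  const total = weights.reduce((a, b) => a + b, 0);
  return exposures.map((e, i) => ({
    screenId: e.screenId,
    attributedCents: total > 0 ? (revenueCents * weights[i]) / total : 0,
  }));
}

const now = Date.now();
console.log(attribute(5000, now, [
  { screenId: "scr_a", at: now - 6 * 3600 * 1000 },  // 6 h ago: weight ~0.84
  { screenId: "scr_b", at: now - 48 * 3600 * 1000 }, // 48 h ago: weight 0.25
]));
```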
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It effectively discloses algorithmic behavior (time-decay weighting, purchase-to-exposure matching) and edge cases ('Returns null if no transaction data exists'). It does not, however, disclose operational behaviors like rate limits, required permissions, or data freshness/latency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with a clear visual hierarchy: purpose statement, 'WHEN TO USE' section, and 'RETURNS' section. The RETURNS section extensively lists output fields, which is justified given the absence of an output schema. However, the length is substantial and could be trimmed by dropping self-explanatory fields (e.g., 'totalRevenueCents') while keeping the structural outline.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema, the description compensates admirably with detailed return value documentation covering transactions, attribution, salesLift, and timing objects. Usage context is also well-covered. The only significant omission is the input parameter description (campaign_id), preventing a perfect score.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% (campaign_id has no description), and the description fails to compensate by explaining the parameter. It never mentions 'campaign_id', what format it expects, or how to obtain a valid campaign ID. While the parameter name is somewhat self-evident, the complete lack of documentation for the sole required parameter is a significant gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description opens with specific verb ('Get') and resource ('Return on Ad Spend with transaction attribution data'). The second paragraph distinguishes this from generic analytics siblings by specifying it 'matches purchase events to DOOH exposures with time-decay weighting' and computes 'incremental ROAS', clearly defining the tool's unique scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains explicit 'WHEN TO USE' section with three concrete scenarios (measuring DOOH-attributable revenue, getting ROAS/iROAS figures, seeing sales lift). This provides clear positive guidance. However, it lacks explicit 'when not to use' guidance or named sibling alternatives (e.g., get_campaign_attribution vs. get_incrementality) that would help the agent avoid calling the wrong attribution tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_signals
[AdCP Signals] Get real-time audience signals from DOOH screens.
This is an AdCP (Ad Context Protocol) compliant tool. It returns deterministic audience signals captured by edge AI (vision + audio + speech) on Trillboards screens.
WHEN TO USE:
Discovering available audience signals before buying inventory
Evaluating audience composition at specific venues or locations
Building targeting segments based on real-time audience data
Unlike probabilistic data, Trillboards signals are DETERMINISTIC — captured by on-device cameras and microphones, analyzed by ML Kit and Gemini Vision.
RETURNS:
signals: Array of per-screen signal objects with demographics, venue, behavior, geo
metadata: total_screens, matching_screens, screens_with_live_data
EXAMPLE: User: "What audience signals are available at retail locations?" get_signals({ signal_spec: { signal_types: ["demographics", "behavior"], filters: { venue_type: "retail" } } })
| Name | Required | Description | Default |
|---|---|---|---|
| signal_spec | No | | |
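As with get_products, the nested signal_spec parameter carries no schema descriptions, so the shape below is reverse-engineered from the EXAMPLE and should be treated as an assumption.

```typescript
// Shape inferred from the EXAMPLE call; fields beyond those shown are unknown.
interface SignalSpec {
  signal_types?: string[]; // "demographics" and "behavior" appear in the EXAMPLE
  filters?: { venue_type?: string; [key: string]: unknown }; // only venue_type is demonstrated
}

const spec: SignalSpec = {
  signal_types: ["demographics", "behavior"],
  filters: { venue_type: "retail" },
};

// signal_spec is optional, so an unfiltered discovery call is also valid: get_signals({})
console.log(JSON.stringify({ signal_spec: spec }));
```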
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses deterministic capture methodology (cameras, microphones, ML Kit, Gemini Vision), edge AI processing location, and detailed return structure (signals array fields, metadata counts). Lacks privacy/safety disclosure for biometric capture.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear headers ([AdCP Signals], WHEN TO USE, RETURNS, EXAMPLE). Information is front-loaded, no redundancy, and every sentence serves a distinct purpose (protocol compliance, methodology, usage, returns).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists; description adequately documents return structure (signals array with demographics/venue/behavior/geo, metadata fields). Explains AdCP compliance and technology stack. Complex domain (AdTech/DOOH) well-contextualized, though data freshness/latency not mentioned.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage. Description compensates partially with concrete EXAMPLE showing signal_spec structure and valid values (demographics, behavior, venue_type: retail), but does not explicitly document all filter options (city, state, income, lifestyle) or enumerate all signal_types (geo, venue) outside the example.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb-resource combination ('Get real-time audience signals from DOOH screens') and distinguishes from siblings by emphasizing 'deterministic' capture vs probabilistic data, and explicitly contrasting with inventory buying tools via 'before buying inventory' context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains explicit 'WHEN TO USE' section with three distinct bullet scenarios (discovering signals, evaluating composition, building segments). Clear differentiation from probabilistic data sources and purchasing workflows.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_social_attention
Query social attention contagion metrics from the observation stream. Returns windows where attention propagated between viewers (social amplification factor > 1).
Social attention data is produced by the AttentionGraphBuilder running on CTV edge devices, which models viewer attention as a directed graph and detects when one viewer looking at the screen triggers nearby viewers to also look (attention contagion / social amplification).
WHEN TO USE:
Finding moments where social proof drove collective engagement
Identifying which venues or dayparts exhibit highest attention contagion
Understanding cascading attention patterns (cascade depth)
Correlating social amplification with ad effectiveness (VAS)
RETURNS:
data: Array of observation_stream rows with socialAttention payload
payload.socialAttention.socialAmplificationFactor (SAF): ratio of actual-to-expected group attention (>1 = contagion detected)
payload.socialAttention.cascadeDepth: max depth of attention propagation chain
payload.socialAttention.viralAttentionScore: composite metric combining SAF and cascade depth
payload.socialAttention.contagionWindowMs: time window over which cascade occurred
payload.socialAttention.triggerViewerIndex: which viewer initiated the cascade
metadata: { result_count, time_range, min_saf_filter }
suggested_next_queries: Follow-up queries
EXAMPLE: User: "Show me moments where attention went viral in bar venues" get_social_attention({ min_saf: 2.0, venue_type: "bar" })
User: "Find the strongest social amplification events this week" get_social_attention({ min_saf: 3.0, time_range: { start: "2026-03-09T00:00:00Z", end: "2026-03-16T00:00:00Z" } })
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| min_saf | No | | |
| screen_id | No | | |
| time_range | No | | |
| venue_type | No | | |
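The RETURNS prose defines SAF as the ratio of actual to expected group attention. A toy computation, using an independence model for expected attention (our assumption, not a documented one), shows how a min_saf threshold filters windows:

```typescript
// SAF per the RETURNS doc: actual-to-expected group attention; > 1 means contagion.
// The expected-attention model (independent per-viewer probability) is an assumption.
interface AttentionWindow { viewers: number; lookers: number; pLookAlone: number }

function saf(w: AttentionWindow): number {
  const expected = w.viewers * w.pLookAlone; // expected lookers if viewers act independently
  return expected > 0 ? w.lookers / expected : 0;
}

const windows: AttentionWindow[] = [
  { viewers: 8, lookers: 6, pLookAlone: 0.25 }, // SAF = 6 / 2 = 3.0
  { viewers: 5, lookers: 1, pLookAlone: 0.25 }, // SAF = 0.8, below threshold
];

const minSaf = 2.0; // mirrors the min_saf parameter in the EXAMPLE
console.log(windows.filter((w) => saf(w) >= minSaf));
```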
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It comprehensively explains the data source (AttentionGraphBuilder on CTV edge devices), the graph model (directed graph of viewer attention), detection mechanism (triggering nearby viewers), and detailed return payload structure (SAF, cascadeDepth, viralAttentionScore). Could improve by mentioning performance characteristics or error conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear visual hierarchy: single-sentence purpose upfront, technical context paragraph, explicit WHEN TO USE bullet list, detailed RETURNS schema, and concrete EXAMPLES. Every section earns its place; information is front-loaded and appropriately scoped for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given high complexity (attention contagion modeling) and lack of output schema, the description provides thorough domain context and comprehensive return value documentation (down to individual payload fields). However, it should explicitly differentiate from 'get_social_contagion_summary' and document input parameters to fully compensate for the sparse input schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage, requiring the description to compensate. While the examples demonstrate min_saf (threshold values 2.0/3.0) and venue_type ('bar'), and the RETURNS section mentions min_saf_filter, the description lacks explicit documentation for all five parameters (limit, screen_id, the time_range structure) and their relationships. A baseline of 3 is appropriate: the examples partially compensate but do not fully document the interface.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description explicitly states the tool queries 'social attention contagion metrics from the observation stream' and returns 'windows where attention propagated between viewers (social amplification factor > 1)'. It uses specific verbs (Query, Returns) and domain-specific resources (observation stream, social amplification factor) that distinguish it from generic analytics siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit 'WHEN TO USE' section with four specific scenarios (finding social proof moments, identifying high-contagion venues, understanding cascade patterns, correlating with ad effectiveness). However, it fails to distinguish from the similarly named sibling 'get_social_contagion_summary', which could cause selection confusion.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_social_contagion_summary
Aggregate social attention metrics across screens and time periods. Shows which venues and dayparts have the highest social amplification.
Queries observation_stream for social attention data and aggregates by the requested dimension (venue, daypart, or screen), computing average SAF, average cascade depth, average viral attention score, and event count.
WHEN TO USE:
Understanding which venues generate the most social amplification
Comparing daypart effectiveness for social contagion
Identifying top-performing screens for attention cascading
Planning campaigns that leverage social proof
RETURNS:
data: Array of aggregated rows, sorted by avg SAF descending
group_key: the dimension value (venue type, daypart, or screen ID)
avg_saf: average social amplification factor
avg_cascade_depth: average attention cascade depth
avg_viral_attention_score: average viral attention score
event_count: number of social attention events in the group
metadata: { group_by, time_range, total_events }
suggested_next_queries: Follow-up queries
EXAMPLE: User: "Which venues have the highest social amplification this week?" get_social_contagion_summary({ group_by: "venue", time_range: { start: "2026-03-09", end: "2026-03-16" } })
User: "Show me social attention by daypart over the last 7 days" get_social_contagion_summary({ group_by: "daypart" })
| Name | Required | Description | Default |
|---|---|---|---|
| group_by | No | | |
| time_range | No | | |
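The documented aggregation (group by a dimension, average SAF, sort descending) is easy to replicate client-side for spot-checking; the sample rows below are invented:

```typescript
// Client-side replica of the documented aggregation; sample rows are invented.
interface Row { venue: string; daypart: string; saf: number }

function summarize(rows: Row[], groupBy: "venue" | "daypart") {
  const groups = new Map<string, number[]>();
  for (const r of rows) {
    const key = r[groupBy];
    const arr = groups.get(key) ?? [];
    arr.push(r.saf);
    groups.set(key, arr);
  }
  return [...groups.entries()]
    .map(([group_key, safs]) => ({
      group_key,
      avg_saf: safs.reduce((a, b) => a + b, 0) / safs.length,
      event_count: safs.length,
    }))
    .sort((a, b) => b.avg_saf - a.avg_saf); // sorted by avg SAF descending, as documented
}

console.log(summarize(
  [
    { venue: "bar", daypart: "evening", saf: 3.1 },
    { venue: "bar", daypart: "evening", saf: 2.4 },
    { venue: "retail", daypart: "midday", saf: 1.2 },
  ],
  "venue",
));
```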
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Strong disclosure of internal mechanics: states it 'Queries observation_stream', specific computations performed (average SAF, cascade depth, viral score), and sorting behavior ('sorted by avg SAF descending'). With no annotations provided, this carries the full burden. Missing operational details (caching, rate limits, data latency).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections (purpose, WHEN TO USE, RETURNS, EXAMPLE). Front-loaded with the core action. No wasted words; every sentence provides net new information not duplicated in schema. Appropriate length given lack of output schema and 0% input schema coverage.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given high complexity (multi-dimensional aggregation with computed metrics), zero schema descriptions, no output schema, and no annotations, the description compensates effectively by detailing return structure (data array fields, metadata object) and providing concrete usage examples. Minor gap: time_range string format not explicitly defined.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema coverage, description must compensate. It successfully maps group_by enum values to semantic concepts ('dimension value', 'venue type', 'daypart'). The EXAMPLES section provides implicit documentation for time_range format (ISO date strings in object structure), though explicit parameter documentation would be stronger.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent specificity with 'Aggregate social attention metrics' and clear resource scope (screens/time periods). Distinguishes from sibling get_social_attention by emphasizing aggregation (computes averages) versus raw observation retrieval, and explicitly names the three grouping dimensions (venue, daypart, screen).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section lists four specific scenarios (understanding venues, comparing dayparts, identifying screens, planning campaigns). However, lacks explicit 'when not to use' guidance or reference to sibling tools like get_social_attention for non-aggregated/raw data queries.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_usage_summary
Get your current billing period usage summary with per-product breakdown and costs. Shows free tier consumption, paid usage, and total cost.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
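Since no output schema exists, a client should validate whatever shape it infers from the prose before trusting it. A minimal sketch follows; the UsageSummary shape itself is an assumption.

```typescript
// Shape inferred from the prose description (no output schema ships with the tool);
// validate before use rather than trusting the inference.
interface UsageSummary {
  products: { name: string; free_units: number; paid_units: number; cost_cents: number }[];
  total_cost_cents: number;
}

function isUsageSummary(v: unknown): v is UsageSummary {
  const o = v as UsageSummary;
  return !!o && Array.isArray(o.products) && typeof o.total_cost_cents === "number";
}

const raw: unknown = JSON.parse(
  '{"products":[{"name":"sensing","free_units":9000,"paid_units":1200,"cost_cents":600}],"total_cost_cents":600}',
);
if (isUsageSummary(raw)) console.log(raw.total_cost_cents); // 600
```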
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully adds context about what data is returned (per-product breakdown, free vs paid tier details), but fails to disclose operational traits: whether the query is cached or real-time, any rate limits, or the fact that it is a safe read-only operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of two efficient sentences with zero waste. The first sentence front-loads the core action and scope ('Get your current billing period usage summary'), while the second sentence enumerates the specific cost components included, earning its place by detailing the output structure.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of an output schema, the description adequately compensates by enumerating the specific data points returned (per-product breakdown, free tier consumption, paid usage, total cost). It loses a point for not describing the return format or structure (object vs array), but is otherwise complete for a zero-parameter read operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema contains zero parameters (an empty properties object with additionalProperties: false). Per the rubric, zero parameters warrant a baseline score of 4, since there are no parameter semantics to clarify beyond the schema definition.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Get') and resource ('billing period usage summary') and clearly distinguishes the scope with specific details ('per-product breakdown', 'free tier consumption', 'paid usage', 'total cost'). This effectively differentiates it from siblings like get_billing_status or get_pricing by emphasizing usage tracking rather than account status or rate cards.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains what data is returned but provides no explicit guidance on when to select this tool versus related billing siblings (get_billing_status, get_pricing, purchase_credits). It does not specify prerequisites, decision criteria, or exclusions that would help an agent choose the correct billing tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_webhook_deliveries
Get delivery history for a webhook.
WHEN TO USE:
Debugging failed webhook deliveries
Auditing webhook activity
Checking delivery success rates
RETURNS:
deliveries: Array of delivery records with:
delivery_id: Unique delivery ID
event: Event type
status: success/failed
response_code: HTTP response code
response_time_ms: Response time
attempted_at: Attempt timestamp
error: Error message (if failed)
total: Total delivery count
success_rate: Percentage of successful deliveries
EXAMPLE: User: "Show me failed deliveries for this webhook" get_webhook_deliveries({ webhook_id: "wh_mmmpdbvj_8b7c5a59296d", status: "failed", limit: 20 })
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| status | No | | |
| webhook_id | Yes | | |
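Typing the delivery records from the RETURNS prose makes debugging code self-documenting; the shape below is inferred from that prose, not schema-backed:

```typescript
// Delivery record typed from the RETURNS prose (no output schema ships with the tool).
interface Delivery {
  delivery_id: string;
  event: string;
  status: "success" | "failed";
  response_code: number;
  response_time_ms: number;
  attempted_at: string;
  error?: string; // present only on failures, per the description
}

function successRate(deliveries: Delivery[]): number {
  if (deliveries.length === 0) return 0;
  const ok = deliveries.filter((d) => d.status === "success").length;
  return (100 * ok) / deliveries.length;
}

const sample: Delivery[] = [
  { delivery_id: "d1", event: "campaign.completed", status: "success", response_code: 200, response_time_ms: 84, attempted_at: "2026-03-10T12:00:00Z" },
  { delivery_id: "d2", event: "campaign.completed", status: "failed", response_code: 500, response_time_ms: 3012, attempted_at: "2026-03-10T12:05:00Z", error: "upstream timeout" },
];
console.log(successRate(sample)); // 50
```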
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Extensive RETURNS section discloses output structure (delivery records with 7 fields, total count, success rate), compensating for missing output schema. Does not explicitly state read-only/idempotent nature or rate limits, but 'Get' implies safe operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear section headers (WHEN TO USE, RETURNS, EXAMPLE). Front-loaded with a one-line summary. Length is appropriate given the lack of an output schema; the RETURNS section earns its space by documenting return values. Slightly verbose, but justified by context.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 3 simple parameters, no annotations, and no output schema, description is comprehensive. RETURNS section fully documents return structure that would normally be in output schema. Sufficient for agent to understand tool capabilities and results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%: no parameter descriptions exist in the schema. The description shows the parameters in the EXAMPLE (webhook_id, status, limit) with sample values but never explains what each parameter does, its valid values, or its constraints. It fails to compensate for the zero schema coverage with explicit parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb 'Get' + resource 'delivery history for a webhook'. Clearly distinguishes from sibling tools like create_webhook, list_webhooks, or test_webhook by focusing on historical delivery records rather than webhook configuration or testing.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section lists three specific scenarios (debugging failed deliveries, auditing activity, checking success rates). Includes concrete EXAMPLE showing parameter usage with realistic values, providing clear invocation pattern.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_devices
List all devices registered to the partner account.
WHEN TO USE:
Getting an overview of all connected devices
Finding devices by status (online/offline)
Auditing the device fleet
RETURNS:
devices: Array of device objects
total: Total device count
online_count: Number of online devices
offline_count: Number of offline devices
EXAMPLE: User: "Show me all my online devices" list_devices({ status: "online", limit: 50 })
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| offset | No | | |
| status | No | | |
| device_type | No | | |
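Because pagination is implied by limit/offset but never documented, a defensive client should iterate until the reported total is reached. A sketch with a hypothetical callTool wrapper and an in-memory stub, both assumptions:

```typescript
// Hypothetical callTool wrapper; the real call would go through an MCP client.
type CallTool = (
  name: string,
  args: Record<string, unknown>,
) => Promise<{ devices: unknown[]; total: number }>;

async function listAllDevices(callTool: CallTool): Promise<unknown[]> {
  const pageSize = 50;
  const all: unknown[] = [];
  for (let offset = 0; ; offset += pageSize) {
    const page = await callTool("list_devices", { limit: pageSize, offset });
    all.push(...page.devices);
    // Assumed termination: stop at the reported total or on an empty page.
    if (all.length >= page.total || page.devices.length === 0) break;
  }
  return all;
}

// In-memory stub standing in for the real server, to make the sketch runnable.
const stub: CallTool = async (_name, args) => {
  const fleet = Array.from({ length: 120 }, (_, i) => ({ device_id: `dev_${i}` }));
  const { limit = 50, offset = 0 } = args as { limit?: number; offset?: number };
  return { devices: fleet.slice(offset, offset + limit), total: fleet.length };
};

listAllDevices(stub).then((d) => console.log(d.length)); // 120
```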
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full burden. The 'RETURNS' section effectively discloses the response structure (devices array, counts), compensating for the missing output schema. However, it omits read-only safety, pagination behavior (despite the limit/offset params), and the fact that retrieving 'all' devices requires iteration due to the 100-item limit.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear section headers (WHEN TO USE, RETURNS, EXAMPLE). Front-loaded with precise one-line summary. No redundant text; every section adds distinct value beyond the schema and annotations.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 4-parameter list operation with no output schema, the description is quite complete. It provides return value documentation and usage examples that compensate for missing structured metadata. Minor gap: does not explicitly document the pagination model implied by limit/offset parameters.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, requiring description compensation. The 'WHEN TO USE' and 'EXAMPLE' sections explain 'status' and 'limit' parameters, but 'offset' (pagination cursor) and 'device_type' filter remain undocumented. Partial compensation achieved but gaps remain for half the parameter set.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verb ('List') + resource ('devices') + scope ('registered to the partner account'). Explicitly distinguishes from sibling 'get_device' (singular retrieval) by emphasizing 'all devices' and listing behavior.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section provides three concrete scenarios: getting overviews, finding devices by status, and auditing. The status filtering example and example invocation further clarify appropriate usage patterns.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_endpoints
List every registered Trillboards API operation.
WHEN TO USE:
First call in an agent session to learn what the API offers.
Filter to agent_safe=true to list only side-effect-free endpoints.
Narrow to a single surface (data-api, sdk-api, device-api, sensing-api, partner-api-generated, dsp-api-generated).
RETURNS:
operations: Array of { surface, method, path, operation_id, summary, description, agent_safe, idempotent, cost_tier, tags, doc_url, example_request }
total_operations: Total count.
surfaces: Known surface identifiers.
EXAMPLE: Agent: "What read-only endpoints can I call?" list_endpoints({ agent_safe: true })
| Name | Required | Description | Default |
|---|---|---|---|
| surface | No | | |
| agent_safe | No | | |
| idempotent | No | | |
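A typical first-call pattern is to build an allowlist of side-effect-free operations. The Operation shape below copies a subset of the fields named in RETURNS, with invented sample values:

```typescript
// Subset of the fields named in RETURNS; sample values are invented.
interface Operation {
  surface: string;
  method: string;
  path: string;
  operation_id: string;
  agent_safe: boolean;
  idempotent: boolean;
}

const operations: Operation[] = [
  { surface: "data-api", method: "GET", path: "/v1/devices", operation_id: "listDevices", agent_safe: true, idempotent: true },
  { surface: "dsp-api-generated", method: "POST", path: "/v1/campaigns", operation_id: "createCampaign", agent_safe: false, idempotent: false },
];

// Allowlist of side-effect-free operations, mirroring list_endpoints({ agent_safe: true }).
const safeOps = new Set(operations.filter((o) => o.agent_safe).map((o) => o.operation_id));
console.log([...safeOps]); // ["listDevices"]
```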
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It does this well by describing what the tool returns (operations array with specific fields, total count, surfaces), providing an example of usage, and explaining filtering capabilities. However, it doesn't mention potential limitations like rate limits, authentication requirements, or pagination behavior, which keeps it from a perfect score.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with clear sections (WHEN TO USE, RETURNS, EXAMPLE) and every sentence adds value. It's appropriately sized for a tool that needs to explain its purpose, usage, and output format. The information is front-loaded with the core purpose stated first, followed by practical guidance.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only discovery tool with 3 parameters and no output schema, the description provides excellent context about what the tool does, when to use it, and what it returns. The example adds practical guidance. The only minor gap is the lack of explicit mention that this is a read-only operation (though implied by 'list' and 'agent_safe' filtering), which prevents a perfect score.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage for the 3 parameters, the description fully compensates by explaining what each parameter does: 'surface' is described as filtering to specific API surfaces, 'agent_safe' filters to side-effect-free endpoints, and 'idempotent' is listed as a filter parameter. The example shows how to use the agent_safe parameter, providing practical semantic context beyond the bare schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verb ('List') and resource ('every registered Trillboards API operation'). It distinguishes itself from sibling tools like 'describe_endpoint' by focusing on listing all operations rather than describing a single one. The description explicitly mentions what the tool returns, making its purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('First call in an agent session to learn what the API offers') and how to filter results (by agent_safe, surface). It includes a concrete example showing an agent query and the corresponding tool invocation, making usage scenarios clear. The WHEN TO USE section is structured and comprehensive.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_error_codes
List every error code in the Trillboards API error catalog.
WHEN TO USE:
Understanding what error codes the API can return.
Building a client-side error handler that covers all cases.
Looking up error types, HTTP statuses, and documentation URLs.
RETURNS:
object: "list"
data: Array of { code, type, http_status, description, doc_url }
total: Total number of error codes.
Equivalent to GET /v1/errors but executed in-process (no HTTP round-trip).
EXAMPLE: Agent: "What error codes can the API return?" list_error_codes()
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
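The stated use case, building a client-side error handler that covers all cases, reduces to a lookup table keyed by code. The catalog entries below are invented samples in the documented { code, type, http_status, description, doc_url } shape:

```typescript
// Error catalog entry typed from the RETURNS prose; sample entries are invented.
interface ErrorCode { code: string; type: string; http_status: number; description: string; doc_url: string }

const catalog: ErrorCode[] = [
  { code: "rate_limited", type: "throttle", http_status: 429, description: "Too many requests", doc_url: "https://example.invalid/errors/rate_limited" },
  { code: "not_found", type: "invalid_request", http_status: 404, description: "Resource missing", doc_url: "https://example.invalid/errors/not_found" },
];

// Handler that covers every catalogued code, plus a fallback for unknown codes.
const handlers = new Map(catalog.map((e) => [e.code, e]));

function describe(code: string): string {
  const e = handlers.get(code);
  return e ? `${e.http_status} ${e.type}: ${e.description}` : `unknown error code: ${code}`;
}

console.log(describe("rate_limited")); // "429 throttle: Too many requests"
```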
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses that the tool returns a list of error codes with specific fields (code, type, http_status, etc.) and mentions it's equivalent to a GET endpoint but executed in-process. However, it doesn't cover potential limitations like rate limits, authentication needs, or error conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with clear sections (description, when to use, returns, note, example). Every sentence adds value—no redundancy or fluff. It's front-loaded with the core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (0 parameters, no annotations, no output schema), the description is nearly complete. It explains what the tool does, when to use it, and what it returns. A minor gap is the lack of explicit mention of authentication or rate limits, but for a read-only list tool, this is acceptable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has zero parameters, so schema description coverage is trivially complete and the baseline is 4. The description appropriately doesn't discuss parameters, focusing instead on the tool's purpose and output.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('List every error code') and resource ('Trillboards API error catalog'), distinguishing it from sibling tools which focus on analytics, campaigns, devices, etc. It provides a complete picture of what the tool does.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description includes an explicit 'WHEN TO USE' section with three specific scenarios (understanding API errors, building error handlers, looking up error details). It clearly defines the context for using this tool versus alternatives, though no sibling tools directly overlap.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_webhooks
List all webhook subscriptions for the partner account.
WHEN TO USE:
Viewing all configured webhooks
Auditing webhook subscriptions
Finding a webhook to update or delete
RETURNS:
webhooks: Array of webhook objects with:
webhook_id: Unique identifier
url: Endpoint URL
events: Subscribed events
enabled: Whether webhook is active
created_at: Creation timestamp
last_delivery: Last successful delivery time
EXAMPLE: User: "Show me all my webhooks" list_webhooks({})
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It compensates effectively by detailing the complete return structure (webhooks array with webhook_id, url, events, enabled, timestamps) since no output schema exists. The verb 'List' implies read-only behavior, though it could explicitly state this is a safe, non-destructive operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear headers (WHEN TO USE, RETURNS, EXAMPLE). Information is front-loaded with the purpose statement. Every section earns its place—the RETURNS section is particularly valuable given the absence of an output schema. No wasted words or redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple zero-parameter list operation, the description is comprehensive. It covers purpose, usage contexts, return value structure (compensating for missing output schema), and invocation example. Given the low complexity of the tool, no additional information (like pagination or rate limits) appears necessary.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema contains zero parameters (empty object). Following the rule that 0 params equals a baseline of 4, this is appropriate. The description adds value by providing an explicit example call 'list_webhooks({})' which confirms the empty parameter usage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a clear, specific statement: 'List all webhook subscriptions for the partner account.' It uses a specific verb (List) with a specific resource (webhook subscriptions) and scope (partner account). It clearly distinguishes from siblings like create_webhook, delete_webhook, and update_webhook through its singular focus on listing/reading.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains an explicit 'WHEN TO USE' section with three specific scenarios including 'Finding a webhook to update or delete,' which directly references sibling mutation tools (update_webhook, delete_webhook). This provides clear guidance on when to use this tool versus alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
log_event
[AdCP Media Buy] Record a conversion or attribution event.
Records conversion events for post-campaign attribution analysis. Events are deduplicated by event_id + event_type combination.
WHEN TO USE:
Recording offline conversions (store visits, purchases)
Tracking post-view attribution events
Logging custom KPI events
EXAMPLE: log_event({ media_buy_id: "mbuy_abc123", event: { event_id: "conv_12345", event_type: "store_visit", value_cents: 5000, screen_id: "507f1f77bcf86cd799439011", metadata: { store: "NYC-001", dwell_minutes: 12 } } })
| Name | Required | Description | Default |
|---|---|---|---|
| event | Yes | | |
| media_buy_id | Yes | | |
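Since events are deduplicated by event_id + event_type, a client can mirror that composite key locally to keep retries idempotent. The key format below is an assumption about client-side bookkeeping, not the server's internals:

```typescript
// The description promises dedup on event_id + event_type; this client-side guard
// mirrors that contract so retries stay idempotent.
interface ConversionEvent { event_id: string; event_type: string; value_cents?: number }

const sent = new Set<string>();

function dedupKey(e: ConversionEvent): string {
  return `${e.event_id}:${e.event_type}`; // composite key, per the description
}

function shouldSend(e: ConversionEvent): boolean {
  const key = dedupKey(e);
  if (sent.has(key)) return false; // the server would drop it as a duplicate anyway
  sent.add(key);
  return true;
}

const evt: ConversionEvent = { event_id: "conv_12345", event_type: "store_visit", value_cents: 5000 };
console.log(shouldSend(evt)); // true
console.log(shouldSend(evt)); // false: duplicate of event_id + event_type
```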
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full disclosure burden. It successfully documents the deduplication logic ('deduplicated by event_id + event_type combination'), which is critical idempotency information. However, it lacks details on return values, error conditions, or whether the operation is synchronous/asynchronous.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear visual separation: domain tag, one-line summary, behavioral note, bullet-point usage guidelines, and concrete code example. No redundant phrases; every section provides distinct value (usage context, behavioral constraints, param examples).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the nested parameter complexity (7 fields in event object) and complete absence of schema descriptions or output schema, the description is minimally adequate. The example compensates for some param documentation gaps, but the tool lacks any description of return values, success indicators, or error responses, which is a significant omission for a write operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description must compensate. It partially succeeds via the JSON example showing field structure and values, and textually explains the semantic relationship between event_id and event_type (composite key for deduplication). However, it omits explanations for timestamp formats, value_cents currency semantics, screen_id references, and event_source purpose.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a clear verb ('Record') and resource ('conversion or attribution event'), immediately clarifying scope. The '[AdCP Media Buy]' prefix contextualizes the domain, and the second sentence ('Records conversion events for post-campaign attribution analysis') distinguishes this from sibling tools like record_impression or get_campaign_attribution.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains an explicit 'WHEN TO USE' section listing three specific scenarios (offline conversions, post-view attribution, custom KPIs). This provides concrete guidance on appropriate invocation contexts without requiring the agent to infer from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
predictive_query
Generate predictive insights from observation patterns. Predict whether a venue is likely to see increased foot traffic based on current patterns.
Uses historical observation_stream data to compute trend analysis via linear regression on time-bucketed metrics. Generates predictions with confidence intervals based on the observed trend, variance, and sample size.
WHEN TO USE:
Predicting future audience patterns at a venue or screen
Forecasting foot traffic trends for campaign planning
Understanding whether metrics are trending up, down, or stable
Making data-driven decisions about inventory and pricing
RETURNS:
prediction: The predicted trend and expected values
trend: 'increasing' | 'decreasing' | 'stable'
current_avg: Current average metric value
predicted_avg: Predicted average over the time horizon
change_pct: Expected percentage change
confidence_interval: { lower, upper } bounds
confidence: Overall prediction confidence (0-1)
supporting_data: Recent data points that inform the prediction
data_points: Array of { bucket, avg_value, sample_count }
total_observations: Total observations analyzed
methodology: Description of the prediction approach
suggested_next_queries: Follow-up queries to refine the prediction
EXAMPLE: User: "Will this QSR venue see more foot traffic next week?" predictive_query({ question: "Will foot traffic increase at QSR venues?", venue_type: "restaurant_qsr", time_horizon: "7d" })
User: "Predict audience attention trends for this screen" predictive_query({ question: "What will audience attention look like?", screen_id: "507f1f77bcf86cd799439011", time_horizon: "3d" })
| Name | Required | Description | Default |
|---|---|---|---|
| question | Yes | | |
| screen_id | No | | |
| venue_type | No | | |
| time_horizon | No | | |
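The documented approach, linear regression over time-bucketed metrics, can be replicated to sanity-check a prediction. The trend-classification thresholds below are assumptions, since the tool does not publish its cutoffs:

```typescript
// Minimal replica of the documented approach: least-squares slope over
// time-bucketed averages, classified into the documented trend labels.
interface Bucket { bucket: number; avg_value: number }

function slope(points: Bucket[]): number {
  const n = points.length;
  const mx = points.reduce((a, p) => a + p.bucket, 0) / n;
  const my = points.reduce((a, p) => a + p.avg_value, 0) / n;
  let num = 0, den = 0;
  for (const p of points) {
    num += (p.bucket - mx) * (p.avg_value - my);
    den += (p.bucket - mx) ** 2;
  }
  return den === 0 ? 0 : num / den;
}

function trend(points: Bucket[]): "increasing" | "decreasing" | "stable" {
  const my = points.reduce((a, p) => a + p.avg_value, 0) / points.length;
  const rel = slope(points) / (my || 1); // relative slope per bucket (assumed normalization)
  if (rel > 0.02) return "increasing";   // cutoffs are assumptions
  if (rel < -0.02) return "decreasing";
  return "stable";
}

const daily: Bucket[] = [0, 1, 2, 3, 4, 5, 6].map((d) => ({ bucket: d, avg_value: 100 + 8 * d }));
console.log(trend(daily)); // "increasing"
```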
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full behavioral disclosure burden effectively. It reveals the algorithm (linear regression), data source (observation_stream), statistical basis (variance, sample size), and exhaustively documents return values including confidence intervals and methodology. Missing only operational details like rate limits or costs.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear demarcation between purpose, implementation details, usage guidelines, return values, and examples. Slightly verbose due to inline output schema documentation, but justified given the absence of a structured output schema. Every section earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's statistical complexity and 0% input schema coverage, the description compensates adequately with detailed methodology explanation, comprehensive return value documentation (acting as surrogate output schema), and concrete usage examples. Only lacks explicit parameter definitions to be fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. While the EXAMPLES section demonstrates parameter usage patterns (e.g., '7d' for time_horizon, 'restaurant_qsr' for venue_type), it lacks explicit definitions for each parameter's semantics and valid values.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool generates predictive insights via linear regression on observation_stream data, using specific verbs ('generate', 'predict', 'compute'). It distinguishes from simple querying tools by emphasizing trend analysis and confidence intervals, though explicit differentiation from sibling 'get_audience_forecast' would strengthen it further.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The explicit 'WHEN TO USE' section provides four concrete scenarios including forecasting for campaign planning and inventory decisions. However, it lacks 'when NOT to use' guidance or explicit alternatives (e.g., when to use 'get_audience_forecast' instead).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
predict_moment_quality
Predict the VAS (Viewability Attention Score) a specific creative would achieve at a given moment, based on historical data and causal modeling.
Uses the CausalPredictionService which:
Embeds the moment description to find historically similar moments
If >= 5 similar moments exist with the same creative, uses weighted-average prediction
If insufficient data, falls back to Gemini generative prediction
Always decomposes the prediction into causal factors
WHEN TO USE:
Evaluating whether a creative will perform well in a specific context
A/B testing creative placement hypotheses before committing budget
Understanding which causal factors drive VAS for a creative
Comparing expected performance across different moment types
RETURNS:
prediction: { predictedVAS (0-1), confidence (0-1), method ('historical'|'model'), sampleSize }
causal_factors: { audienceMatch, contextMatch, attentionState, socialPotential } (each 0-1)
metadata: { creative_id, moment_description }
suggested_next_queries: Follow-up queries
EXAMPLE: User: "How would a coffee ad perform at a transit station during morning rush?" predict_moment_quality({ moment_description: "transit venue, morning commute, 12 viewers, high attention, mostly 25-34 age range", creative_id: "coffee-brand-morning-30s" })
| Name | Required | Description | Default |
|---|---|---|---|
| creative_id | Yes | | |
| moment_description | Yes | | |
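The historical-versus-generative fallback can be summarized in a few lines. This is a hedged sketch with hypothetical types; only the >= 5-sample threshold and the 'historical'/'model' method labels come from the description.

```typescript
// Sketch of the documented fallback: similarity-weighted average over
// >= 5 similar moments, otherwise a generative (e.g., Gemini) estimate.
interface SimilarMoment { vas: number; similarity: number } // both in [0, 1]

function predictVAS(similar: SimilarMoment[], generativeEstimate: () => number) {
  if (similar.length >= 5) {
    const wSum = similar.reduce((acc, m) => acc + m.similarity, 0);
    const vas = similar.reduce((acc, m) => acc + m.vas * m.similarity, 0) / wSum;
    return { predictedVAS: vas, method: "historical" as const, sampleSize: similar.length };
  }
  // Insufficient history for this creative: fall back to the model.
  return { predictedVAS: generativeEstimate(), method: "model" as const, sampleSize: similar.length };
}
```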
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, yet the description fully compensates by disclosing the internal algorithm (embedding similarity, the >= 5-sample threshold, the Gemini fallback), the prediction decomposition logic, and the complete return structure, including confidence scores and causal-factor breakdowns.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent information architecture with clear section headers (WHEN TO USE, RETURNS, EXAMPLE). Every sentence earns its place; the four algorithm steps convey the fallback logic without verbosity, and the core purpose is well front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a complex prediction tool with causal modeling and no output schema, the description comprehensively documents return values (prediction object with method/sampleSize, causal_factors decomposition, metadata) and sibling differentiation. The example query demonstrates realistic usage patterns.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description relies on the EXAMPLE section to demonstrate parameter semantics (showing moment_description format as 'transit venue, morning commute...' and creative_id pattern). While illustrative, it lacks explicit definitions of what creative_id references or moment_description schema requirements beyond the example.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb 'Predict' and precise resource 'VAS (Viewability Attention Score)', clearly distinguishing this as a predictive/simulation tool rather than measurement (get_creative_attention) or recommendation (recommend_creative). Expands with methodology details that reinforce its unique scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section lists four distinct scenarios including budget commitment timing and causal factor analysis. Effectively differentiates from sibling measurement tools by emphasizing 'before committing budget' and hypothesis testing use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
provide_performance_feedback
[AdCP Media Buy] Provide optimization signals from buyer agent.
Accepts feedback from buyer agents for floor price adjustment and inventory optimization. Enables closed-loop optimization between buyer and seller agents.
WHEN TO USE:
Sending bid response feedback to optimize future pricing
Providing conversion data for bid price calibration
Adjusting floor prices based on demand signals
EXAMPLE: provide_performance_feedback({ media_buy_id: "mbuy_abc123", feedback: { type: "bid_response", avg_bid_price_cpm: 6.5, fill_rate_percent: 72, preferred_hours: [8, 9, 10, 17, 18], quality_score: 0.85 } })
| Name | Required | Description | Default |
|---|---|---|---|
| feedback | Yes | | |
| media_buy_id | Yes | | |
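Since the schema itself is undescribed, the EXAMPLE is the only source of field semantics. Spelling its payload out as a type makes the gaps visible; the comments below mark which parts are assumptions rather than documented contract.

```typescript
// Field names come from the tool's EXAMPLE; units and ranges are guesses.
interface BidResponseFeedback {
  type: "bid_response";       // other 'type' enum values exist but are undocumented
  avg_bid_price_cpm: number;  // assumed: average bid in CPM dollars
  fill_rate_percent: number;  // assumed: 0-100
  preferred_hours: number[];  // assumed: hours of day, 0-23
  quality_score: number;      // assumed: 0-1 scale
}

const feedback: BidResponseFeedback = {
  type: "bid_response",
  avg_bid_price_cpm: 6.5,
  fill_rate_percent: 72,
  preferred_hours: [8, 9, 10, 17, 18],
  quality_score: 0.85,
};
```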
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It mentions behavioral outcomes ('floor price adjustment,' 'inventory optimization,' 'closed-loop optimization') but omits critical safety properties: whether the operation is destructive, idempotent, requires specific media_buy states, or has rate limits. It also fails to describe what happens on success or failure, given that no output schema exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with domain prefix [AdCP Media Buy], concise purpose statement, WHEN TO USE bullets, and EXAMPLE code block. Every sentence earns its place; the most critical information is front-loaded before the structured sections.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with nested objects, 0% schema coverage, and no output schema, the description covers primary use cases via examples but leaves gaps. It doesn't explain the relationship to sibling create_media_buy (prerequisite?), doesn't document the 'type' enum values comprehensively, and doesn't specify return behavior or side effects despite the complex input structure.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. The EXAMPLE section illustrates the nested feedback structure (avg_bid_price_cpm, fill_rate_percent, etc.), providing implicit semantics for some fields. However, it fails to document all enum values for 'type' (missing 'quality' and 'pacing' explanations), doesn't explain valid ranges (e.g., is 0.85 quality_score good?), and doesn't describe the media_buy_id reference format.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool 'Provide[s] optimization signals from buyer agent' and specifies it accepts feedback for 'floor price adjustment and inventory optimization.' It distinguishes from siblings like create_media_buy or get_analytics by focusing specifically on the feedback loop role rather than creation or retrieval.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Includes explicit 'WHEN TO USE' section with three concrete scenarios: 'Sending bid response feedback,' 'Providing conversion data,' and 'Adjusting floor prices.' This directly guides the agent on appropriate invocation contexts versus alternatives like get_media_buy_delivery or update_media_buy.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
purchase_credits
Purchase committed-use credits at a discount. Three tiers: tier_500 ($500 → $625 credit, 25% bonus), tier_2000 ($2,000 → $3,100 credit, 55% bonus), tier_5000 ($5,000 → $10,000 credit, 100% bonus). Requires an active payment method.
| Name | Required | Description | Default |
|---|---|---|---|
| tier | Yes | | |
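The tier economics are fully specified in the description and can be checked mechanically. A small sketch, using only the documented numbers:

```typescript
// The three documented tiers; credit = price * (1 + bonus).
const CREDIT_TIERS = {
  tier_500:  { price_usd: 500,  credit_usd: 625,   bonus: 0.25 },
  tier_2000: { price_usd: 2000, credit_usd: 3100,  bonus: 0.55 },
  tier_5000: { price_usd: 5000, credit_usd: 10000, bonus: 1.00 },
} as const;

for (const [tier, t] of Object.entries(CREDIT_TIERS)) {
  // Sanity check: each stated credit equals price plus its bonus percentage.
  console.assert(Math.round(t.price_usd * (1 + t.bonus)) === t.credit_usd, tier);
}
```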
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full disclosure burden. It excellently documents the financial conversion mechanics ($500→$625, etc.) and 'committed-use' nature implying pre-payment constraints. Minor gap: doesn't explicitly state transaction finality or immediate charging behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three well-structured sentences with zero waste. Front-loaded purpose statement followed by necessary tier specifications and prerequisite. The tier details, while lengthy, are essential domain knowledge for correct invocation.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for input parameters and purchase mechanics given the single-parameter complexity. Lacks description of return values or confirmation behavior (e.g., 'credits are immediately applied'), which would be helpful given no output schema exists and this is a financial transaction.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Perfect compensation for 0% schema description coverage. The description exhaustively maps each enum value (tier_500, tier_2000, tier_5000) to exact dollar amounts and bonus percentages, providing complete semantic meaning that the bare schema lacks.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent specificity with clear verb 'Purchase', resource 'committed-use credits', and value proposition 'at a discount'. The financial nature clearly distinguishes it from sibling billing tools like get_billing_status or setup_billing.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
States the prerequisite 'Requires an active payment method' but lacks explicit workflow guidance (e.g., when to use setup_billing first) and doesn't contrast with get_pricing for price checking vs purchasing. Usage is implied but not guided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
query_changelog
Query the Trillboards API changelog for recent changes, breaking changes, deprecations, and fixes.
WHEN TO USE:
Check what has changed in the API before upgrading an integration.
Find breaking changes since a specific date.
Discover new features added to a specific API surface.
PARAMETERS:
since (YYYY-MM-DD, optional): Only entries dated on or after this date. Unreleased entries are always included.
type (string, optional): Filter by change category. Accepts:
"breaking" → changed + removed entries
"additive" → added entries
"deprecation" → deprecated entries
"fix" → fixed entries
Can be comma-separated: "breaking,deprecation"
RETURNS:
object: "list"
data: Array of { version, date, type, surface, description }
total: Number of matching entries.
EXAMPLE: Agent: "What broke since April 1st?" query_changelog({ since: "2026-04-01", type: "breaking" })
| Name | Required | Description | Default |
|---|---|---|---|
| type | No | | |
| since | No | | |
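The type filter's mapping to entry categories is the one piece of non-obvious logic here. A sketch of that mapping, including the documented comma-separated form (the function name is ours, not the API's):

```typescript
// Documented mapping from the `type` filter to changelog categories.
const TYPE_TO_CATEGORIES: Record<string, string[]> = {
  breaking: ["changed", "removed"],
  additive: ["added"],
  deprecation: ["deprecated"],
  fix: ["fixed"],
};

// e.g., "breaking,deprecation" -> ["changed", "removed", "deprecated"]
function categoriesFor(typeParam: string): string[] {
  return typeParam.split(",").flatMap(t => TYPE_TO_CATEGORIES[t.trim()] ?? []);
}
```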
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes what the tool does (queries changelog entries), including that it returns both released and unreleased entries, and specifies the return format (list with data array and total count). However, it doesn't mention potential limitations like rate limits, authentication needs, or error conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with clear sections (description, WHEN TO USE, PARAMETERS, RETURNS, EXAMPLE), each containing only essential information. Every sentence earns its place by providing specific guidance, parameter details, or usage context without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (2 parameters, no output schema, no annotations), the description is nearly complete. It covers purpose, usage, parameters, and returns thoroughly. The only minor gap is the lack of explicit mention of authentication requirements or error handling, which would be helpful for full completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 0%, so the description must fully compensate. It provides comprehensive parameter semantics: 'since' is explained as a date filter (YYYY-MM-DD format) with the note that unreleased entries are always included, and 'type' is detailed with all accepted values ('breaking', 'additive', 'deprecation', 'fix') and the comma-separated syntax for multiple categories. This adds significant value beyond the bare schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('query', 'check', 'find', 'discover') and resources ('Trillboards API changelog', 'recent changes, breaking changes, deprecations, and fixes'). It distinguishes itself from sibling tools by focusing exclusively on changelog queries, unlike other tools that handle analytics, campaigns, devices, or webhooks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The 'WHEN TO USE' section provides explicit guidance with three distinct scenarios: checking changes before upgrades, finding breaking changes since a date, and discovering new features. It clearly indicates when to use this tool versus alternatives (e.g., for changelog-specific queries rather than general analytics or configuration tasks).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
query_observations
Query the universal observation stream using natural language or structured filters. Returns multi-modal sensing data (audience, vehicle, environment, commerce) from physical-world observations across the screen network.
WHEN TO USE:
Exploring raw observation data from edge AI sensors on screens
Filtering observations by venue type, device, time range, or geography
Getting audience, vehicle, environment, or commerce observation data
Answering natural language questions about what screens are sensing
RETURNS:
data: Array of observation objects with device, venue, payload, confidence, model versions
metadata: { observation_count, time_range, coverage_pct, model_versions }
suggested_next_queries: Contextual follow-up queries
Each observation includes:
observation_id, device_id, screen_mongo_id, venue_type
observed_at: Timestamp of the observation
observation_family: audience | vehicle | environment | commerce
payload: JSONB with model outputs (face_count, emotion, vehicle_count, etc.)
confidence: Model confidence score (0-1)
evidence_grade: Quality grade of the observation
model_versions: Which ML models produced this data
EXAMPLE: User: "Show me audience observations at QSR venues in the last hour" query_observations({ query: "audience observations at QSR venues", filters: { observation_family: ["audience"], venue_type: ["restaurant_qsr"], time_range: { start: "2026-03-16T14:00:00Z", end: "2026-03-16T15:00:00Z" } }, limit: 50 })
User: "What are screens sensing right now?" query_observations({ query: "latest observations from all screens", limit: 20 })
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| query | Yes | | |
| filters | No | | |
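The EXAMPLE hardcodes ISO timestamps for "the last hour"; in practice an agent would derive them from the clock. A minimal sketch, assuming a generic callTool helper (the MCP client wiring is out of scope here):

```typescript
// Hypothetical MCP client helper; wiring omitted.
declare function callTool(name: string, args: unknown): Promise<unknown>;

const end = new Date();
const start = new Date(end.getTime() - 60 * 60 * 1000); // one hour ago

await callTool("query_observations", {
  query: "audience observations at QSR venues",
  filters: {
    observation_family: ["audience"],
    venue_type: ["restaurant_qsr"],
    time_range: { start: start.toISOString(), end: end.toISOString() },
  },
  limit: 50,
});
```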
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden; it extensively documents the return structure (data array, metadata object, suggested_next_queries) and data semantics (confidence scores, evidence_grade, model_versions). One point is deducted for not explicitly stating safety/destructiveness or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections (WHEN TO USE, RETURNS, EXAMPLES). Front-loaded with purpose. Length is justified given lack of output schema and annotation coverage, though slightly verbose in listing observation field details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a complex multi-modal sensing tool with nested schema and no annotations. Compensates for missing output schema by detailing return structure and providing realistic JSON examples. Missing only operational details like pagination cursors or explicit read-only guarantees.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage. The description compensates partially via examples showing the query string and filters structure, and mentions filterable dimensions in WHEN TO USE. However, it fails to document several parameters (geohash, device_id, screen_id) or to explain distinctions between similar fields (device_id vs screen_id).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb 'Query' and resource 'universal observation stream'. Distinguishes from siblings like 'get_signals' and 'semantic_search_observations' by specifying 'raw observation data from edge AI sensors' and listing specific observation families (audience, vehicle, environment, commerce).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section with four clear bullet points covering exploration, filtering, data retrieval, and natural language queries. Provides strong contextual guidance, though does not explicitly name sibling alternatives (e.g., when to use 'semantic_search_observations' instead).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
recommend_creative
Given a moment description, rank candidate creatives by predicted VAS performance.
Evaluates each creative candidate against the described moment context using historical similarity and causal prediction. Returns a ranked list sorted by predicted VAS score, with confidence levels for each prediction.
WHEN TO USE:
Choosing which creative to show at a specific moment/venue
Comparing multiple creatives for a campaign across different contexts
Optimizing creative rotation for maximum VAS
Pre-campaign creative selection based on audience and venue
RETURNS:
rankings: Array sorted by predicted VAS (descending)
creativeId, predictedVAS (0-1), confidence (0-1), rank (1-N)
metadata: { candidate_count, moment_description }
suggested_next_queries: Follow-up queries
EXAMPLE: User: "Which of these 3 creatives will perform best at a gym in the evening?" recommend_creative({ moment_description: "gym venue, evening, 6 viewers, high attention, mostly male 18-34", creative_ids: ["fitness-brand-30s", "energy-drink-15s", "tech-gadget-20s"] })
| Name | Required | Description | Default |
|---|---|---|---|
| creative_ids | Yes | | |
| moment_description | Yes | | |
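On the consumer side, the documented rankings shape lends itself to a simple selection rule. A sketch; the 0.6 confidence threshold is our assumption, not part of the tool contract:

```typescript
// Ranking shape per the RETURNS section; threshold logic is illustrative.
interface Ranking { creativeId: string; predictedVAS: number; confidence: number; rank: number }

function pickCreative(rankings: Ranking[], minConfidence = 0.6): string | null {
  const top = rankings.find(r => r.rank === 1);
  return top && top.confidence >= minConfidence ? top.creativeId : null;
}
```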
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses the prediction methodology ('historical similarity and causal prediction') and the detailed return structure, including confidence levels. Operational details such as rate limits or sync/async behavior are missing, but core behavioral traits are covered well.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Perfectly structured with clear sections: purpose statement, WHEN TO USE, RETURNS, and EXAMPLE. Every sentence earns its place; no redundancy. Information density is high without being verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive coverage of input purpose, usage scenarios, and output format (including metadata and suggested next queries). Complete for a recommendation tool despite lack of formal output schema in structured metadata. Minor gap around error conditions or edge case behavior.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%: neither parameter has a schema description. The description partially compensates by mapping 'moment_description' to 'moment context' and 'creative_ids' to 'candidate creatives' in the narrative, and provides an example showing expected formats. However, it fails to document critical constraints (max 20 creatives, max 1,000-character description) that exist only in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verb ('rank') plus resource ('candidate creatives') and metric ('VAS performance'). Distinguishes from sibling get_content_recommendations by specifying moment-context evaluation using 'historical similarity and causal prediction'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section with four specific scenarios covering campaign optimization, pre-campaign selection, and creative rotation. Includes concrete example showing exact parameter values and user intent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
record_impression
Record a single ad impression from a device.
WHEN TO USE:
Reporting that an ad was displayed on a device
Recording impression with detailed metadata
Single impression events (for batch, use batch_impressions)
RETURNS:
success: Boolean indicating success
impression_id: Unique impression identifier
earnings: Earnings credited for this impression
EXAMPLE: User: "Record an impression for ad 507f1f77bcf86cd799439011" record_impression({ fingerprint: "P_abc123", ad_id: "507f1f77bcf86cd799439011", duration_seconds: 15 })
| Name | Required | Description | Default |
|---|---|---|---|
| ad_id | Yes | | |
| metadata | No | | |
| timestamp | No | | |
| fingerprint | Yes | | |
| duration_seconds | No | | |
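Because the description explicitly routes batches to batch_impressions, a thin wrapper can encode that choice. A sketch; the batch_impressions argument shape is assumed, and callTool is a hypothetical client helper:

```typescript
declare function callTool(name: string, args: unknown): Promise<unknown>;

interface Impression { fingerprint: string; ad_id: string; duration_seconds?: number }

async function reportImpressions(events: Impression[]) {
  // Single events go to record_impression, per the description's guidance.
  if (events.length === 1) return callTool("record_impression", events[0]);
  return callTool("batch_impressions", { impressions: events }); // assumed shape
}
```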
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses the return structure (success boolean, impression_id, earnings), which adds value, but lacks mutation characteristics: no mention of idempotency, rate limits, billing side effects, or error conditions for invalid fingerprints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear section headers (WHEN TO USE, RETURNS, EXAMPLE). Information is front-loaded with the core action. Slightly verbose formatting but zero wasted content; every sentence provides actionable guidance.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite 5 parameters with nested objects and zero schema descriptions, the description provides return-value documentation (compensating for the missing output schema) and a concrete usage example. Gaps remain in parameter documentation and error handling for a financial/tracking operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, requiring description to compensate. Example demonstrates three parameters (fingerprint, ad_id, duration_seconds) with realistic values, but 'metadata' and 'timestamp' are undocumented. 'Fingerprint' semantics (device fingerprint?) remain unexplained despite being required.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb 'Record' + resource 'ad impression' + scope 'from a device'. Explicitly distinguishes from sibling tool 'batch_impressions' ('Single impression events (for batch, use batch_impressions)'), clarifying scope vs alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Dedicated 'WHEN TO USE' section lists three specific scenarios including 'Reporting that an ad was displayed' and 'Recording impression with detailed metadata'. Explicitly names alternative 'batch_impressions' for batch operations, providing clear selection criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
register_device
Register or update a device in the partner's network.
WHEN TO USE:
Adding a new screen/kiosk/vending machine to the network
Updating device location or configuration
Re-registering a device after maintenance
RETURNS:
device_id: Your internal device ID (echoed back)
trillboards_device_id: Internal Trillboards device ID
fingerprint: Device fingerprint (e.g., "P_abc123")
embed_url: URL to load in the device's WebView
status: Device status
EXAMPLE: User: "Register a vending machine in NYC" register_device({ device_id: "vending-001-nyc", name: "NYC Office Lobby Vending", device_type: "vending_machine", location: { lat: 40.7128, lng: -74.0060, city: "New York", state: "NY", venue_type: "office" } })
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | | |
| specs | No | | |
| location | No | | |
| metadata | No | | |
| device_id | Yes | | |
| device_type | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses the upsert nature (register OR update) and documents return fields (device_id, fingerprint, embed_url, status), but omits operational traits like idempotency, authentication requirements, rate limits, or safety warnings that would be critical for a registration endpoint.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clearly delineated sections (main description, WHEN TO USE, RETURNS, EXAMPLE). Information is front-loaded and every sentence serves a purpose. The formatting with headers makes it scannable and appropriately sized for the complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema, the RETURNS section effectively documents output structure. The example covers the primary use case. However, with 6 parameters (including complex nested objects) and 0% schema coverage, the description falls short of fully documenting optional parameters like 'specs' and 'metadata', preventing a perfect score.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, requiring the description to compensate. While the EXAMPLE demonstrates usage of device_id, name, device_type, and location, it fails to document the semantics of 'specs' (nested hardware configuration) and 'metadata' (arbitrary additional properties), leaving significant gaps in parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the dual purpose ('Register or update') with specific resource (device) and scope (partner's network). It effectively distinguishes from siblings like get_device, delete_device, and list_devices by emphasizing the upsert behavior and network registration context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Excellent explicit guidance via the 'WHEN TO USE' section listing three specific scenarios: adding new hardware (screens/kiosks/vending machines), updating location/configuration, and re-registering after maintenance. This provides clear context for when to invoke versus alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
register_partner
Register a new partner organization with Trillboards.
WHEN TO USE:
First-time setup for a new partner integration
Creating a new partner account to manage devices
RETURNS:
partner_id: Unique partner identifier
api_key: API key for authenticated requests (store securely!)
status: Account status
EXAMPLE: User: "Register my vending machine company" register_partner({ company_name: "Acme Vending Co", email: "tech@acmevending.com", industry: "vending", expected_devices: 50 })
| Name | Required | Description | Default |
|---|---|---|---|
| email | Yes | | |
| website | No | | |
| industry | No | | |
| company_name | Yes | | |
| contact_name | No | | |
| contact_phone | No | | |
| expected_devices | No | | |
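Given the 'store securely!' note on api_key, the one behavior worth sketching is keeping the key out of logs. The return type and the saveSecret helper below are assumptions for illustration:

```typescript
// Hypothetical helpers; the return shape follows the RETURNS section.
declare function callTool(
  name: string,
  args: unknown
): Promise<{ partner_id: string; api_key: string; status: string }>;
declare function saveSecret(name: string, value: string): Promise<void>; // e.g., a secrets manager

const res = await callTool("register_partner", {
  company_name: "Acme Vending Co",
  email: "tech@acmevending.com",
  industry: "vending",
  expected_devices: 50,
});
await saveSecret("trillboards_api_key", res.api_key); // never echo the key
console.log(`Registered partner ${res.partner_id} (status: ${res.status})`);
```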
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It documents the return structure (partner_id, api_key, status), including the critical security note 'store securely!' for the API key. However, it lacks disclosure of side effects (emails sent? webhooks triggered?), idempotency behavior, and error conditions (duplicate email handling).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear section headers (WHEN TO USE, RETURNS, EXAMPLE). Information is front-loaded with the core purpose, followed by usage context, return values, and concrete example. No redundant or wasted text; every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema exists, the description compensates by documenting return values and their meanings. The example is comprehensive and realistic. For a partner registration tool, covers essential behavioral context, though could benefit from mentioning error scenarios or account activation workflows given the complexity of the operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage. The description provides an EXAMPLE showing usage of company_name, email, industry, and expected_devices, which implicitly conveys semantics. However, it fails to document the remaining parameters (website, contact_name, contact_phone) or explicitly describe parameter purposes, formats, or validation rules beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states 'Register a new partner organization with Trillboards': a specific verb (Register), a clear resource (partner organization), and platform context. It clearly distinguishes this tool from siblings like `get_partner_info` (retrieval) and `register_device` (device registration vs partner account creation).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains explicit 'WHEN TO USE' section with two clear scenarios: 'First-time setup for a new partner integration' and 'Creating a new partner account to manage devices'. Provides clear context for invocation. Lacks explicit 'when not to use' or pointer to alternatives like `get_partner_info` for existing accounts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
search_content
Semantic search over content library using natural language queries and 768-D pgvector embeddings.
WHEN TO USE:
Finding content by description or theme ("upbeat music videos", "cooking shows")
Discovering content similar to a concept or mood
Searching the content library without knowing exact titles or IDs
Content discovery for programmatic content scheduling
RETURNS:
data: Array of matching content with similarity scores
videoId, title, contentCategory, description, durationSeconds
reviewStatus (approved/pending/rejected)
similarity (0-1, cosine similarity against query embedding)
meta: { count, query, limit, minSimilarity }
EXAMPLE: User: "Find fitness and workout content" search_content({ query: "fitness workout exercise gym", limit: 10, min_similarity: 0.6 })
User: "Search for calming nature content suitable for medical offices" search_content({ query: "calming nature scenes peaceful landscapes meditation", min_similarity: 0.5 })
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| query | Yes | | |
| min_similarity | No | | |
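Mechanically, min_similarity is a cutoff on the cosine similarity between the query embedding and each item's embedding. The server does this inside pgvector; the client-side sketch below only illustrates the math:

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] ** 2;
    nb += b[i] ** 2;
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Filter and rank items the way a min_similarity threshold would.
function filterBySimilarity(
  queryVec: number[],
  items: { id: string; vec: number[] }[],
  minSimilarity = 0.6,
) {
  return items
    .map(it => ({ id: it.id, similarity: cosine(queryVec, it.vec) }))
    .filter(it => it.similarity >= minSimilarity)
    .sort((x, y) => y.similarity - x.similarity);
}
```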
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses the embedding mechanism (768-D pgvector), the scoring method (cosine similarity, 0-1), and the return structure (data array with reviewStatus, plus a meta object), detailing similarity thresholds and result composition. It could explicitly state the read-only nature or pagination behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear sections: mechanism summary front-loaded, followed by WHEN TO USE, RETURNS, and EXAMPLES. No wasted text; every section earns its place by aiding tool selection or clarifying output format. Appropriate length for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description fills critical gaps: it documents the return format extensively (since no output schema exists), explains the embedding-search mechanism, and provides concrete usage examples. Given zero schema coverage and no annotations, it is substantially complete, though it could explicitly contrast with 'find_similar_moments' or confirm read-only safety.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%. Description provides examples showing natural language 'query' usage and explains 'min_similarity' range (0-1) in RETURNS section, but lacks explicit parameter documentation (e.g., 'limit sets maximum results'). Examples partially compensate but systematic parameter semantics are missing.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb ('search') and resource ('content library') with technical mechanism ('768-D pgvector embeddings', 'cosine similarity'). Distinguishes from siblings like 'semantic_audience_search' and 'semantic_search_observations' by specifying 'content library' target, and from exact-match tools by emphasizing 'natural language' and 'without knowing exact titles'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section with four specific scenarios including natural language queries, concept/mood discovery, and content discovery for scheduling. Implicitly excludes exact-ID lookups ('without knowing exact titles'), though it does not explicitly name sibling alternatives like 'find_similar_moments' or 'discover_inventory'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
semantic_audience_search
Search screens by natural language scene description using pgvector.
Uses 768-dimensional Gemini embeddings on scene descriptions from FEIN edge AI to find screens matching a natural language query.
WHEN TO USE:
Finding screens by audience context ("families eating lunch in a food court")
Contextual ad placement based on real-time scene understanding
Discovering inventory matching a specific audience scenario
RETURNS: Array of matching screens ranked by semantic similarity, each with:
screen_id, mongo_screen_id, scene_description, contextual_relevance, similarity, created_at
EXAMPLE: semantic_audience_search({ query: "young professionals in a coffee shop looking at phones", limit: 10 })
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| query | Yes | | |
| since | No | | |
| min_similarity | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, but the description carries the burden well. It discloses the technical implementation (pgvector, 768-dim embeddings), the data source (FEIN edge AI scene descriptions), and the return structure (an array of specific fields ranked by similarity). It lacks explicit safety notes (read-only) or rate limits, though 'search' implies it is non-destructive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear section headers (WHEN TO USE, RETURNS, EXAMPLE). Front-loaded with core purpose. Technical implementation details (pgvector, Gemini) are slightly verbose but relevant for capability understanding. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description compensates effectively for the missing output schema by documenting return fields (screen_id, contextual_relevance, etc.) in the RETURNS section. It addresses the 0% input schema coverage via an example covering the primary parameters, though coverage is incomplete for the secondary parameters (since, min_similarity). Overall, it is adequate for the tool's complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description compensates partially via the EXAMPLE showing query and limit usage. However, critical parameters 'since' (time filter pattern) and 'min_similarity' (threshold) remain undocumented in description text, leaving semantic gaps the schema fails to cover.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent specificity: states exact action ('Search screens'), resource type ('screens'), and unique method ('natural language scene description using pgvector'). Technical details (Gemini embeddings, FEIN edge AI) clearly distinguish it from keyword-based siblings like search_content or discover_inventory.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Strong 'WHEN TO USE' section provides three concrete scenarios (audience context, contextual ad placement, inventory discovery). Distinguishes intent from other search tools. Could be improved by explicitly naming sibling alternatives to avoid (e.g., 'use search_content for keyword searches').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
semantic_search_observations
Search observations by semantic similarity. Find moments that match a description like "lunch rush at fast casual restaurants" using vector embeddings.
Uses 768-dimensional Gemini embeddings on observation payloads to find promoted observations matching a natural language query via pgvector cosine similarity search.
WHEN TO USE:
Finding observations that match a conceptual description
Discovering contextual moments across the screen network
Searching for audience situations ("families waiting in line", "professionals on coffee break")
Finding commerce patterns ("high purchase intent near checkout")
RETURNS:
data: Array of matching observations ranked by semantic similarity, each with:
observation_id, device_id, venue_type, observation_family
observed_at, payload, confidence, evidence_grade
similarity: Cosine similarity score (0-1, higher = more relevant)
metadata: { result_count, query_embedding_model, search_scope }
suggested_next_queries: Related semantic queries to explore
EXAMPLE: User: "Find lunch rush moments at fast casual restaurants" semantic_search_observations({ query: "lunch rush at fast casual restaurants with high foot traffic", filters: { venue_type: ["restaurant_qsr"] }, limit: 20 })
User: "Find moments with high emotional engagement" semantic_search_observations({ query: "audience showing strong positive emotional reactions", filters: { observation_family: ["audience"] }, limit: 10 })
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| query | Yes | | |
| filters | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden; it discloses the implementation (Gemini embeddings, pgvector, cosine similarity), the search scope ('promoted observations'), and a detailed return structure including similarity scoring (0-1). Mentioning latency or rate limits would have earned a perfect score.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear headers (WHEN TO USE, RETURNS, EXAMPLE). Technical implementation details are slightly verbose but add value. Core purpose is front-loaded in the first sentence.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite lacking an output schema, the RETURNS section comprehensively documents the data array structure, metadata fields, and suggested_next_queries. Sufficient for an AI agent to understand the full contract.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage. The description compensates via concrete examples showing natural-language queries and filter usage, but lacks explicit prose definitions of what the 'query', 'filters', and 'limit' parameters semantically represent.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb-resource pair ('Search observations by semantic similarity') and clarifies the mechanism ('using vector embeddings'). The technical details (768-dimensional Gemini embeddings, pgvector cosine similarity) distinguish it from sibling keyword-search tools like query_observations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit 'WHEN TO USE' section with four concrete scenarios (conceptual descriptions, audience situations, commerce patterns). Lacks explicit 'when not to use' guidance or naming of alternatives like query_observations for exact filtering.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
setup_billing
Set up pay-per-use billing with a Stripe payment method. Required after exceeding free tier limits. Pass a Stripe payment method token (pm_xxx) obtained from Stripe.js or Stripe Elements.
| Name | Required | Description | Default |
|---|---|---|---|
| payment_method_id | Yes | | |
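Since the tool takes a raw Stripe token, the only client-side safeguard worth sketching is a prefix check before forwarding. The pm_ token itself would come from Stripe.js or Stripe Elements, as the description says; callTool is hypothetical:

```typescript
declare function callTool(name: string, args: unknown): Promise<unknown>;

async function enableBilling(paymentMethodId: string) {
  // The description specifies a Stripe payment method token (pm_xxx).
  if (!paymentMethodId.startsWith("pm_")) {
    throw new Error("Expected a Stripe payment method token (pm_...)");
  }
  return callTool("setup_billing", { payment_method_id: paymentMethodId });
}
```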
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses the external dependency (Stripe), payment model (pay-per-use), and trigger condition (free tier limits). However, it omits critical mutation details: whether the method replaces existing ones, immediate validation behavior, idempotency, or return value structure for a financial operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences with zero waste: purpose declaration, usage trigger, and parameter specification. Front-loaded with the core action and appropriately sized for the single-parameter input.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero schema coverage and no output schema, the description adequately covers invocation prerequisites but lacks post-invocation context (success/failure states, whether billing activates immediately). For a billing mutation tool, this leaves operational gaps despite covering the input requirements well.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% (no description field for payment_method_id). The description fully compensates by specifying the token format ('pm_xxx'), source ('Stripe.js or Stripe Elements'), and constraints (maxLength 200 implied by token format), providing complete semantic meaning beyond the raw schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Set up') and identifies the exact resource (pay-per-use billing with Stripe), clearly distinguishing it from sibling tools like 'purchase_credits' (prepaid model) and 'get_billing_status' (read-only).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states the trigger condition ('Required after exceeding free tier limits'), providing clear context for when to invoke. Lacks explicit mention of 'purchase_credits' as an alternative prepaid option, though 'pay-per-use' implies the distinction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
sync_creatives
[AdCP Media Buy] Validate and sync creative assets for a media buy.
Validates creative assets (resolution, duration, format) against screen specifications. Returns compatibility status for each screen in the campaign.
WHEN TO USE:
Submitting creative assets before campaign launch
Checking if a creative meets screen requirements
Validating VAST tags
EXAMPLE: sync_creatives({ media_buy_id: "mbuy_abc123", creatives: [{ url: "https://cdn.example.com/ad.mp4", type: "video", width: 1920, height: 1080, duration_seconds: 15, file_size_mb: 12 }] })
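The validation criteria named above (resolution, duration, format) suggest a simple client-side pre-check before submission; a minimal sketch under the assumption of a 1920x1080, 15-second slot (real per-screen specs come from the compatibility results the tool returns):

```typescript
// Illustrative pre-check only; the 1920x1080 / 15s spec is an assumed example,
// not an actual Trillboards screen requirement.
interface Creative {
  url: string;
  type: "video" | "image";
  width: number;
  height: number;
  duration_seconds: number;
}

function preCheck(c: Creative): string[] {
  const issues: string[] = [];
  if (c.width !== 1920 || c.height !== 1080) issues.push("resolution mismatch");
  if (c.duration_seconds > 15) issues.push("duration exceeds assumed 15s slot");
  if (c.type === "video" && !c.url.endsWith(".mp4")) issues.push("unexpected container format");
  return issues;
}
```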
| Name | Required | Description | Default |
|---|---|---|---|
| creatives | Yes | | |
| media_buy_id | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It successfully discloses what gets validated (resolution, duration, format) and what is returned (compatibility status per screen). However, it fails to clarify the safety implications of 'sync' (whether this performs writes, requires specific permissions, or is idempotent) or explain failure modes.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear section headers (WHEN TO USE, EXAMPLE) that front-load critical information. The categorization prefix [AdCP Media Buy] aids quick identification. No redundant or filler text; every sentence conveys usage context, behavioral traits, or parameter examples.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with nested object parameters and no output schema, the description adequately covers the return value ('compatibility status for each screen') and validation scope. It integrates well with the sibling tool ecosystem (create_media_buy, update_media_buy). Minor deduction for not clarifying the 'sync' behavior's side effects or detailing the output structure more precisely.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. The EXAMPLE block provides concrete semantics for media_buy_id format and the creatives array structure, demonstrating field usage effectively. However, it omits explanation of the format_id field (present in schema but not example) and constraints like maxItems: 10, leaving gaps in parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a clear verb-resource combination ('Validate and sync creative assets') and includes the '[AdCP Media Buy]' prefix to distinguish it from analytics siblings like get_creative_attention and recommendation tools like recommend_creative. It specifically mentions the validation criteria (resolution, duration, format) and return value (compatibility status), leaving no ambiguity about the tool's function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains an explicit 'WHEN TO USE' section listing three specific scenarios: submitting assets before launch, checking screen requirements, and validating VAST tags. This provides clear triggering conditions that help the agent select this tool over alternatives like create_media_buy or update_media_buy.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
test_webhook
Send a test event to a webhook endpoint.
WHEN TO USE:
Verifying webhook endpoint is working
Testing integration during development
Debugging webhook delivery issues
RETURNS:
success: Boolean indicating delivery success
response_code: HTTP response code from endpoint
response_time_ms: Response time in milliseconds
error: Error message if delivery failed
EXAMPLE: User: "Test my webhook with a device.online event" test_webhook({ webhook_id: "wh_mmmpdbvj_8b7c5a59296d", event: "device.online" })
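With no output schema published, the RETURNS list above is the only contract; a minimal TypeScript sketch of that documented shape (marking error optional is an inference from 'if delivery failed'):

```typescript
// Inferred from the RETURNS list; not an official output schema.
interface TestWebhookResult {
  success: boolean;          // delivery succeeded
  response_code: number;     // HTTP status returned by the endpoint
  response_time_ms: number;  // round-trip time in milliseconds
  error?: string;            // assumed present only when delivery failed
}
```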
| Name | Required | Description | Default |
|---|---|---|---|
| event | No | | |
| webhook_id | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It compensates well by documenting the complete return structure (success, response_code, response_time_ms, error) in the 'RETURNS' section. It could explicitly state whether the operation is safe/non-destructive, but the 'test' context and return documentation provide strong implicit guidance.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear section headers (WHEN TO USE, RETURNS, EXAMPLE) that front-load critical information. Each sentence earns its place, progressing from purpose to usage contexts to technical contract to concrete example without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's limited complexity (2 parameters, no nested objects) and absence of output schema, the description achieves completeness through the RETURNS section documenting output fields and the EXAMPLE demonstrating invocation syntax. No gaps remain for agent selection or invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. While the EXAMPLE section demonstrates valid parameter values (webhook_id format and event type), the description lacks explicit semantic definitions for the parameters (e.g., 'webhook_id is the identifier of the webhook to test'). The example mitigates but does not fully resolve the documentation gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with the specific action 'Send a test event to a webhook endpoint', clearly identifying the verb (send), resource (test event), and target (webhook endpoint). It effectively distinguishes from siblings like create_webhook, delete_webhook, or list_webhooks by emphasizing the testing/debugging function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The explicit 'WHEN TO USE' section provides three distinct scenarios: verifying endpoints work, testing during development, and debugging delivery issues. This creates clear boundaries between this tool and alternatives like create_webhook (setup) or get_webhook_deliveries (retrieving history).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_media_buy
[AdCP Media Buy] Update an existing media buy (campaign).
Modify budget, targeting, schedule, or status of an existing media buy.
WHEN TO USE:
Adjusting campaign budget mid-flight
Pausing or resuming a campaign
Changing targeting parameters
Extending campaign dates
EXAMPLE: update_media_buy({ media_buy_id: "mbuy_abc123", updates: { status: "paused", budget: { daily_usd: 300 } } })
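The WHEN TO USE list also mentions extending campaign dates, which the single example does not demonstrate; a hypothetical call in the same style (the schedule field name and date format inside updates are assumptions, since the schema ships without descriptions):

```typescript
// Hypothetical: the shape of `updates.schedule` is inferred, not documented.
update_media_buy({ media_buy_id: "mbuy_abc123", updates: { schedule: { end_date: "2026-04-30" } } });
```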
| Name | Required | Description | Default |
|---|---|---|---|
| updates | Yes | | |
| media_buy_id | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It identifies the operation as a mutation ('Update', 'Modify') but fails to disclose idempotency, reversibility of changes, potential side effects on billing/delivery, or authorization requirements. The mention of 'mid-flight' hints at operational context but safety characteristics remain undocumented.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with domain tag [AdCP Media Buy], immediate purpose statement, field enumeration, bullet-pointed usage guidelines, and a concrete code example. Every section earns its place; no redundant or filler text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complex nested input schema (7+ sub-fields) and lack of schema descriptions, the description adequately covers inputs by listing fields and providing an example. However, with no output schema provided, it fails to describe the return value or success/failure behavior expected from this mutation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. It effectively does so by listing updatable fields in prose and providing a concrete EXAMPLE showing the nested structure of the 'updates' object (including nested budget properties), which clarifies how to construct the complex parameter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Update an existing media buy') and enumerates exactly which fields can be modified (budget, targeting, schedule, status). The term 'existing' effectively distinguishes this from the sibling 'create_media_buy' tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides an explicit 'WHEN TO USE' section with four concrete scenarios (e.g., 'Adjusting campaign budget mid-flight', 'Pausing or resuming'). However, it lacks explicit 'when not to use' guidance or named alternatives (e.g., create_media_buy vs update) to prevent confusion.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_webhook
Update an existing webhook subscription.
WHEN TO USE:
Changing the webhook endpoint URL
Adding or removing subscribed events
Enabling or disabling a webhook
Updating the webhook description
RETURNS:
webhook_id: The updated webhook ID
url: Updated endpoint URL
events: Updated event subscriptions
enabled: Updated enabled status
updated_at: Update timestamp
EXAMPLE: User: "Disable the webhook for maintenance" update_webhook({ webhook_id: "wh_mmmpdbvj_8b7c5a59296d", enabled: false })
User: "Add impression events to my webhook" update_webhook({ webhook_id: "wh_mmmpdbvj_8b7c5a59296d", events: ["device.online", "device.offline", "impression.recorded"] })
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses the mutation nature ('Update') and documents the return fields (webhook_id, url, events, enabled, updated_at), which compensates for the missing output schema. However, it lacks disclosure of error behavior (e.g., an invalid webhook_id), idempotency guarantees, and side effects on pending deliveries.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections (purpose, WHEN TO USE, RETURNS, EXAMPLES) and efficient bullet formatting, front-loaded with the core action. There is minor verbosity in the dialogue-style examples; the RETURNS section is justified by the lack of an output schema, and its field list is efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero annotations, an empty input schema, and no output schema, the description achieves strong completeness through usage scenarios, return value documentation, and concrete examples. The only gap is explicit parameter documentation, which the examples partially cover.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema declares 0 parameters (an empty properties object), establishing a baseline score of 4. The EXAMPLES section compensates by demonstrating parameter semantics: webhook_id follows a 'wh_' prefix pattern, enabled is a boolean, and events accepts string arrays like 'device.online'. While helpful, explicit parameter documentation would be stronger than implicit examples.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb 'Update' and resource 'webhook subscription'. The phrase 'existing webhook' clearly distinguishes from sibling create_webhook, while 'subscription' differentiates from test_webhook or delete_webhook.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'WHEN TO USE' section lists four specific scenarios (changing URL, modifying events, enabling/disabling, updating description). The EXAMPLES section provides concrete dialogue contrasting maintenance disable vs. event addition use cases, effectively guiding selection over alternatives like create_webhook or delete_webhook.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
validate_request
Validate a proposed request payload against the registered Zod schema for an operation, returning the exact canonical error envelope the HTTP surface would emit.
WHEN TO USE:
Before calling a write endpoint, to catch payload bugs locally.
Debugging 400 validation_error responses.
RETURNS:
valid: true when the payload would pass Zod validation.
When invalid, the canonical { error: { type, code, message, param, doc_url, details[] } } envelope is included under error.
EXAMPLE: validate_request({ path: "/v1/data/query", method: "POST", payload: { dataset: "inference_outcomes", limit: 9999 } })
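The canonical envelope described above maps naturally to a small type; a sketch inferred from the listed fields (optionality and the element shape of details[] are not specified by the source):

```typescript
// Inferred from the documented envelope; optionality and types are assumptions.
interface ValidationErrorEnvelope {
  error: {
    type: string;        // e.g., "validation_error"
    code: string;
    message: string;
    param?: string;      // offending parameter, assumed optional
    doc_url?: string;    // link to relevant docs, assumed optional
    details: unknown[];  // per-issue detail objects; shape not documented
  };
}

// The documented contract: `valid: true` on success, the envelope on failure.
type ValidateRequestResult = { valid: true } | ({ valid: false } & ValidationErrorEnvelope);
```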
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | | |
| method | Yes | | |
| payload | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes the tool's behavior: it validates payloads against Zod schemas and returns error envelopes matching HTTP responses. It specifies the return structure ('valid: true' or error envelope) and mentions it's for local debugging, which adds useful context beyond basic functionality.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with clear sections (purpose, usage guidelines, returns, example) and every sentence adds value. It's appropriately sized for a validation tool with 3 parameters and no output schema, with no redundant or unnecessary information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a validation tool with 3 parameters, 0% schema coverage, and no output schema, the description provides good completeness. It explains purpose, usage scenarios, return format, and includes an example. The main gap is lack of explicit output schema documentation, but the return description partially compensates for this.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage for 3 parameters, the description must compensate. It provides an example that clarifies parameter usage: 'path' specifies the endpoint path, 'method' specifies HTTP method, and 'payload' contains the data to validate. This adds meaningful semantics beyond the bare schema, though it doesn't detail all possible values or constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('validate a proposed request payload') and resources ('against the registered Zod schema for an operation'). It distinguishes itself from sibling tools by focusing on validation rather than data retrieval, creation, or analysis operations like 'create_campaign' or 'get_analytics'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description includes an explicit 'WHEN TO USE' section with two specific scenarios: 'Before calling a write endpoint, to catch payload bugs locally' and 'Debugging 400 validation_error responses.' This provides clear guidance on when this tool should be used versus alternatives, though it doesn't explicitly name sibling tools as alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
verify_proof_of_play
Verify cryptographic proof of ad delivery or get campaign proofs.
Requires either campaign_id or proof_payload (at least one must be provided).
Two modes:
Verify a proof: pass proof_payload with signature fields to verify
Get proofs: pass campaign_id to get Ed25519-signed proofs for a campaign
Uses Ed25519 signatures (v2) that can be independently verified by third parties using the Trillboards public key.
WHEN TO USE:
Verifying that ads were actually delivered to screens
Exporting cryptographically signed proof records for auditors
Getting proof-of-play data for campaign transparency reports
RETURNS (verify mode):
valid: boolean, reason: string if invalid, version: 'v1' or 'v2'
RETURNS (get proofs mode):
campaignId, totalImpressions, proofsReturned
proofs: Array of signed impression proofs
pagination: { limit, hasMore, nextCursor }
signatureVersion, publicKeyUrl
EXAMPLE (verify): verify_proof_of_play({ proof_payload: { signature: "ed25519=abc123...", timestamp: "2026-03-10T15:30:00Z", adId: "ad_123", impressionId: "imp_456", screenId: "scr_789", deviceId: "dev_012" } })
EXAMPLE (get proofs): verify_proof_of_play({ campaign_id: "camp_abc123", start_date: "2026-03-01", end_date: "2026-03-10" })
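Because the proofs are advertised as independently verifiable with the Trillboards public key, a third party could check a signature without calling the tool at all; a minimal sketch using Node's built-in crypto (the PEM key encoding, the canonicalization of the signed message, and the hex signature encoding are all assumptions the source does not specify):

```typescript
import { createPublicKey, verify } from "node:crypto";

// Sketch of independent Ed25519 verification. Assumptions (not documented):
// - the key at publicKeyUrl is PEM-encoded,
// - the signed message is the proof JSON minus its signature field,
// - the signature is hex-encoded after the "ed25519=" prefix.
function verifyProofLocally(proof: Record<string, string>, publicKeyPem: string): boolean {
  const { signature, ...fields } = proof;
  const message = Buffer.from(JSON.stringify(fields));
  const sig = Buffer.from(signature.replace(/^ed25519=/, ""), "hex");
  return verify(null, message, createPublicKey(publicKeyPem), sig); // Ed25519 takes a null digest
}
```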
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| cursor | No | | |
| end_date | No | | |
| start_date | No | | |
| campaign_id | No | | |
| proof_payload | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden, and it excels: it discloses the Ed25519 signature algorithm (v2), notes that proofs can be independently verified by third parties using the Trillboards public key, and details pagination behavior for the get-proofs mode.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent structure with clear headers (Two modes, WHEN TO USE, RETURNS, EXAMPLE). Every sentence earns its place; examples are provided for both modes without excessive verbosity. Complex cryptographic concepts are explained succinctly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (cryptographic verification, dual modes, nested objects) and lack of output schema/annotations, the description is remarkably complete. It documents return structures for both modes, explains signature verification details, and provides working examples.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so description must compensate. It effectively documents the complex proof_payload nested object via example and explains the campaign_id requirement, but does not explicitly explain the limit and cursor parameters (only mentioned in returns pagination).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the dual purpose with specific verbs: 'Verify cryptographic proof of ad delivery or get campaign proofs.' It distinguishes from sibling analytics tools (like get_campaign_performance) by emphasizing the cryptographic/audit nature of the proofs and Ed25519 signatures.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Includes an explicit 'WHEN TO USE' section listing three specific scenarios (verifying delivery, exporting for auditors, transparency reports). Also clearly delineates between the two modes (verify vs get proofs) and states the parameter requirement logic ('Requires either campaign_id or proof_payload').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Claiming this connector lets you:
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.