lilo Vacation Rentals

Name: lilo Vacation Rentals
Author: lilo-property

by io.github.lilo-property

Server Details

Vacation rental discovery, direct booking, and property protection for AI agents.

Status: Healthy
Last Tested: 2026-07-05 07:36
Transport: Streamable HTTP
URL
Repository: lilo-property/mcp-server
GitHub Stars: 0
Server Listing: lilo-vacation-rentals

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client

Glama

MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

A3.6/5.0

Tool DescriptionsA

Average 4/5 across 67 of 67 tools scored. Lowest: 3.2/5.

Server CoherenceB

Disambiguation2/5

There is significant overlap and redundancy among tools, causing ambiguity. For example, analyze_booking_threat_risk, analyze_guest_communication_risk, and analyze_guest_interaction_risk are aliases or near-duplicates, and multiple tools like get_vacation_rental_details and fetch_vacation_rental_details appear to serve the same purpose. This makes it difficult for an agent to choose the correct tool without confusion.

Naming Consistency4/5

Most tools follow a consistent verb_noun pattern (e.g., check_vacation_rental_availability_and_pricing, get_vacation_rental_details), which aids readability. However, there are minor deviations, such as some tools using 'analyze' vs. 'assess' or 'detect' for similar risk functions, and a few tools like 'ingest_philadelphia_public_records' use a different verb style, but overall the naming is largely predictable.

Tool Count2/5

With 67 tools, the count is excessive for a vacation rental server, leading to bloat and potential overwhelm. Many tools could be consolidated (e.g., multiple risk analysis tools, duplicate property detail fetchers), and the scope feels unfocused, including niche tools like those for Philadelphia events or World Cup compliance that might be better handled as parameters in broader tools.

Completeness5/5

The tool set provides comprehensive coverage for the vacation rental domain, including booking, risk assessment, compliance, maintenance, evidence management, and guest interactions. There are no obvious gaps; it supports full CRUD/lifecycle operations and specialized workflows, ensuring agents can handle most scenarios without dead ends.

Available Tools

67 tools

analyze_booking_threat_riskA

Read-only

Inspect

Analyze a vacation rental booking or guest interaction for potential threats and risks. Returns risk assessment level, identified concerns, and recommended actions for the host. Pass booking_id, message_content, and/or guest_profile for analysis.

ParametersJSON Schema

Name	Required	Description
`booking_id`	No	Booking UUID to analyze
`guest_profile`	No	Guest profile data
`message_content`	No	Message text to analyze

Tool Definition Quality

A3.5/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds value by specifying the return format (risk assessment level, identified concerns, recommended actions), which isn't covered by annotations. However, it doesn't disclose additional behavioral traits like rate limits, authentication needs, or data sources used for analysis.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately concise with two sentences: one stating the purpose and outputs, and another specifying the parameters. It's front-loaded with the core function. There's minimal waste, though it could be slightly more structured by explicitly separating input guidance from output details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (risk analysis with multiple optional inputs) and lack of output schema, the description does a good job covering the essentials: purpose, parameters, and return format. With annotations providing safety context, it's mostly complete, though adding more behavioral context (e.g., analysis methodology) could enhance it further.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with all three parameters clearly documented in the schema. The description mentions the same parameters (booking_id, message_content, guest_profile) but doesn't add meaningful semantics beyond what's in the schema, such as how they interact or which combinations are most effective. Baseline 3 is appropriate given the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: analyzing vacation rental bookings or guest interactions for threats and risks, with specific outputs (risk assessment level, concerns, recommended actions). It distinguishes from some siblings like 'assess_extended_stay_squatter_risk' by being more general, but doesn't explicitly differentiate from similar tools like 'analyze_guest_communication_risk' or 'detect_guest_communication_risk'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context by specifying what to pass (booking_id, message_content, guest_profile) for analysis, suggesting it's for evaluating existing bookings or communications. However, it doesn't provide explicit guidance on when to use this tool versus alternatives like 'screen_guest_before_booking' (pre-booking) or 'assess_vacation_rental_booking_risk' (similar function), leaving room for ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

analyze_guest_communication_riskB

Read-only

Inspect

Analyze guest messages at a vacation rental for concerning patterns. Returns risk assessment, evidence documentation, and response guidance for hosts. Alias for detect_guest_communication_risk.

ParametersJSON Schema

Name	Required	Description
`message`	Yes	Guest message to analyze for risk patterns
`booking_id`	No	Booking UUID for evidence chain
`guest_name`	No	Guest name for documentation
`property_id`	No	Property UUID for cross-reference
`conversation_history`	No	Previous messages for conversation context analysis

Tool Definition Quality

B3.4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds value by specifying the output includes 'risk assessment, evidence documentation, and response guidance for hosts,' which goes beyond annotations. However, it doesn't disclose behavioral traits like rate limits, authentication needs, or error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loaded, stating the core purpose in the first sentence. The second sentence adds output details and alias information efficiently. Both sentences earn their place, but minor improvements could make it more structured (e.g., separating output details into a bullet list).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (risk analysis with multiple parameters) and lack of output schema, the description is moderately complete. It outlines the purpose and output components but doesn't detail the format of 'risk assessment' or 'evidence documentation.' With annotations covering safety, it's adequate but could benefit from more context on output structure or usage scenarios.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are well-documented in the schema. The description doesn't add any parameter-specific information beyond what's in the schema (e.g., it doesn't explain how 'conversation_history' affects analysis). With high schema coverage, the baseline score of 3 is appropriate as the description doesn't compensate with extra semantic details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Analyze guest messages at a vacation rental for concerning patterns.' It specifies the verb ('analyze'), resource ('guest messages'), and context ('vacation rental'). However, it doesn't explicitly differentiate from sibling tools like 'analyze_booking_threat_risk' or 'detect_guest_message_threat_pattern', which appear to have overlapping domains.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context ('for concerning patterns' and 'for hosts') but doesn't provide explicit guidance on when to use this tool versus alternatives. It mentions an alias ('detect_guest_communication_risk'), which helps identify a direct alternative, but doesn't explain differences from other risk analysis tools in the sibling list.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

analyze_guest_interaction_riskA

Read-only

Inspect

Analyze a guest interaction at a vacation rental for potential risks. Returns risk assessment and recommended actions. Alias for analyze_booking_threat_risk. Pass booking_id, message_content, and/or guest_profile.

ParametersJSON Schema

Name	Required	Description
`booking_id`	No	Booking UUID to analyze
`guest_profile`	No	Guest profile data
`message_content`	No	Message text to analyze

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds value by specifying the return format ('risk assessment and recommended actions'), which is useful context beyond annotations. However, it does not disclose other behavioral traits like rate limits, authentication needs, or error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in three sentences: purpose, return value, and parameters. Each sentence earns its place by providing essential information without redundancy, making it easy for an agent to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (risk analysis with optional parameters) and the absence of an output schema, the description is mostly complete. It covers purpose, return values, and parameters, but could improve by detailing the format of the risk assessment or examples of recommended actions to fully compensate for the missing output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters. The description lists the parameters ('booking_id, message_content, and/or guest_profile') but does not add meaning beyond what the schema provides, such as explaining how they interact or which combinations are most effective. The baseline score of 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('analyze'), resource ('guest interaction at a vacation rental'), and outcome ('potential risks', 'risk assessment and recommended actions'). It explicitly distinguishes this tool from its sibling 'analyze_booking_threat_risk' by noting it's an alias, helping the agent avoid duplication.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('analyze a guest interaction... for potential risks') and mentions an alternative ('alias for analyze_booking_threat_risk'), but does not explicitly state when NOT to use it or compare it to other risk-related siblings like 'detect_guest_communication_risk' or 'assess_vacation_rental_booking_risk'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ask_vacation_rental_questionA

Read-only

Inspect

Ask a natural language question about a vacation rental property and get an answer grounded in verified data. Examples: 'What is the WiFi password?', 'Where do I park?', 'Is there a dishwasher?', 'What time is check-out?'. Requires property_id and the question text.

ParametersJSON Schema

Name	Required	Description
`guest_id`	No	Optional: Guest ID for vetting verification
`question`	Yes	Natural language question (e.g., 'What's the WiFi password?', 'Where do I park?')
`property_id`	Yes	Property UUID or lilo_code

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds that answers are 'grounded in verified data,' which provides useful context about data reliability. However, it does not disclose other behavioral traits like response format, potential limitations (e.g., unanswered questions), or any rate limits. With annotations covering safety, a 3 is appropriate as the description adds some value but not rich behavioral details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured: it starts with the core purpose, provides concrete examples, and ends with parameter requirements. Every sentence earns its place, with no redundant or vague language, making it easy to parse and front-loaded with essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (interactive Q&A), rich annotations (readOnlyHint, destructiveHint), and 100% schema coverage, the description is mostly complete. It covers purpose, usage examples, and parameter requirements. However, without an output schema, it does not describe the return format (e.g., structured answer, confidence score), which is a minor gap for an agent invoking the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear descriptions for all parameters (property_id, question, guest_id). The description mentions that property_id and question are required, and provides examples of question content, but does not add significant meaning beyond what the schema already documents. Baseline 3 is correct when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Ask a natural language question about a vacation rental property and get an answer grounded in verified data.' It specifies the verb ('ask'), resource ('vacation rental property'), and outcome ('answer grounded in verified data'), distinguishing it from sibling tools like 'get_vacation_rental_faqs' or 'get_vacation_rental_details' which provide static information rather than interactive Q&A.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: for asking natural language questions about vacation rentals, with examples like 'What is the WiFi password?' and 'What time is check-out?'. It mentions the required parameters (property_id and question text) but does not explicitly state when not to use it or name specific alternatives among the many sibling tools, though the examples help differentiate its interactive nature from static lookup tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

assess_extended_stay_squatter_riskA

Read-only

Inspect

Assess the risk of a guest establishing tenancy rights during an extended vacation rental stay. Evaluates stay duration against jurisdiction-specific tenant protection laws. Returns risk level, relevant state laws, and preventive recommendations. Pass check_in_date, check_out_date, and state code.

ParametersJSON Schema

Name	Required	Description
`state`	No	State code (e.g., CA, NY)
`property_id`	No	Property UUID
`check_in_date`	Yes	Check-in date
`check_out_date`	Yes	Check-out date

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds valuable behavioral context beyond annotations by specifying what the tool evaluates (stay duration against laws) and what it returns (risk level, state laws, recommendations). It doesn't mention rate limits, authentication needs, or data sources, but provides meaningful operational context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise with three sentences that each serve distinct purposes: stating the tool's purpose, explaining its evaluation logic, and specifying required parameters. There's no wasted language, and the information is front-loaded with the core functionality stated immediately.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only tool with good annotations and full schema coverage, the description provides sufficient context about what the tool does and returns. The main gap is the lack of output schema, so the description doesn't detail the structure of 'risk level, relevant state laws, and preventive recommendations.' However, given the tool's straightforward analytical purpose, the description is reasonably complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema already documents all four parameters thoroughly. The description mentions three parameters (check_in_date, check_out_date, state code) but doesn't add semantic meaning beyond what the schema provides. It omits 'property_id' entirely. The baseline score of 3 reflects adequate parameter documentation primarily through the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('assess', 'evaluates') and resources ('guest establishing tenancy rights', 'extended vacation rental stay', 'jurisdiction-specific tenant protection laws'). It distinguishes itself from sibling tools by focusing specifically on squatter risk assessment for extended stays, unlike broader risk analysis tools like 'assess_vacation_rental_booking_risk' or communication-focused tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through its focus on extended stays and jurisdiction-specific laws, suggesting it should be used for longer-term vacation rentals where tenancy rights become relevant. However, it doesn't explicitly state when to use this tool versus alternatives like 'assess_vacation_rental_booking_risk' or provide clear exclusions for short-term stays.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

assess_vacation_rental_booking_riskA

Read-only

Inspect

Assess risk factors for a specific vacation rental booking with protection recommendations. Returns risk level, identified concerns, and suggested protective actions. Pass booking_id (UUID) and optional include_guest_risk (default true).

ParametersJSON Schema

Name	Required	Description	Default
`booking_id`	Yes	Booking UUID to assess
`include_guest_risk`	No	Include guest risk analysis (default true)

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds valuable context about what the tool returns ('risk level, identified concerns, and suggested protective actions') and mentions default behavior for 'include_guest_risk'. It doesn't disclose rate limits or authentication needs, but with annotations covering safety, this provides good supplemental behavioral insight.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured sentence that efficiently covers purpose, parameters, and output. It's front-loaded with the core function and avoids any redundant or unnecessary information, making it highly effective for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (risk assessment with recommendations), lack of output schema, and rich annotations, the description is mostly complete. It explains the return content but could benefit from more detail on risk categories or example outputs. However, it adequately supports the agent's understanding for a read-only tool with clear parameters.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters ('booking_id' and 'include_guest_risk') fully documented in the schema. The description mentions these parameters and the default for 'include_guest_risk', but doesn't add significant meaning beyond what's already in the schema descriptions. This meets the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Assess risk factors'), target resource ('vacation rental booking'), and output ('risk level, identified concerns, and suggested protective actions'). It distinguishes from siblings like 'analyze_booking_threat_risk' by focusing on comprehensive risk assessment with protection recommendations rather than just threat analysis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through 'risk factors' and 'protection recommendations', suggesting it's for evaluating booking safety. However, it doesn't explicitly state when to use this tool versus alternatives like 'screen_guest_before_booking' or 'predict_booking_chargeback_probability', nor does it mention prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

assign_cleaner_to_rental_turnoverA

Destructive

Inspect

Assign a cleaner to a vacation rental property turnover between bookings. Supports primary and backup cleaner assignments. Pass property_id, cleaner_id, booking_id, optional role (primary/backup), and scheduled_date.

ParametersJSON Schema

Name	Required	Description
`role`	No	Role: primary or backup
`booking_id`	Yes	Booking UUID for the turnover
`cleaner_id`	Yes	Cleaner UUID
`property_id`	Yes	Property UUID or lilo_code
`scheduled_date`	No	Scheduled date (ISO format)

Tool Definition Quality

A3.9/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=false and destructiveHint=true, which the description aligns with by implying a write operation ('Assign'). The description adds valuable context beyond annotations by specifying that it handles 'primary and backup cleaner assignments' and 'turnover between bookings,' clarifying the operational scenario. However, it does not detail potential side effects, error conditions, or confirmation behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences: the first states purpose and scope, the second lists parameters. It is front-loaded with essential information, avoids redundancy, and every sentence contributes directly to tool understanding without waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a destructive tool with no output schema, the description adequately covers the basic operation and parameters. However, it lacks details on return values, error handling, or confirmation steps, which could be important for safe invocation. Given the annotations provide safety hints, the description is minimally complete but could be more informative.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are well-documented in the schema. The description lists the parameters ('property_id, cleaner_id, booking_id, optional role (primary/backup), and scheduled_date') but adds minimal semantic value beyond the schema, such as explaining relationships between parameters or usage nuances. Baseline 3 is appropriate given high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Assign a cleaner'), target resource ('to a vacation rental property turnover between bookings'), and scope ('Supports primary and backup cleaner assignments'). It distinguishes itself from sibling tools by focusing on cleaner assignment rather than analysis, booking, verification, or other functions listed among the siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for assigning cleaners during turnover periods, but does not explicitly state when to use this tool versus alternatives (e.g., 'get_rental_cleaning_schedule' for viewing schedules). No exclusions or prerequisites are mentioned, leaving the agent to infer context from the tool's purpose alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

book_vacation_rental_directA

Destructive

Inspect

Book a vacation rental property directly through AI. Host receives 100% of nightly rate; 3% guest service fee added at checkout. Creates booking record, calculates pricing, and notifies the host. Returns booking_id, confirmation_code, pricing breakdown, and check-in link. Always call check_vacation_rental_availability_and_pricing first.

ParametersJSON Schema

Name	Required	Description
`guest_name`	Yes	Name of the guest
`guest_count`	No	Number of guests (default 1)
`guest_email`	Yes	Guest email address
`guest_phone`	No	Guest phone number
`property_id`	Yes	Property UUID or lilo_code
`check_in_date`	Yes	Check-in date (YYYY-MM-DD)
`check_out_date`	Yes	Check-out date (YYYY-MM-DD)
`special_requests`	No	Any special requests from guest

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=false and destructiveHint=true, but the description adds valuable context beyond this: it explains financial details ('Host receives 100% of nightly rate; 3% guest service fee added at checkout'), the notification process ('notifies the host'), and return values. It doesn't contradict annotations, as 'destructiveHint=true' aligns with creating a booking record, but could mention irreversible aspects more explicitly.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with key actions and structured efficiently in two sentences: one for the booking process and fees, another for operations and prerequisites. Every sentence adds value without redundancy, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (destructive booking operation) and lack of output schema, the description is mostly complete: it covers purpose, usage guidelines, financial behavior, and return values. However, it could improve by detailing error cases or confirmation steps, but annotations help fill gaps, making it sufficient for agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 8 parameters. The description adds no specific parameter semantics beyond what the schema provides, such as format examples or constraints. It implies parameters like 'property_id' and dates but doesn't enhance their meaning, meeting the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Book', 'Creates', 'calculates', 'notifies') and resources ('vacation rental property'), distinguishing it from siblings like 'check_vacation_rental_availability_and_pricing' by emphasizing the booking action. It explicitly differentiates from risk analysis or search tools in the sibling list.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance: 'Always call check_vacation_rental_availability_and_pricing first.' This clearly states a prerequisite and distinguishes when to use this tool versus alternatives, such as availability checks or risk assessments among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_str_permit_requirementsA

Read-only

Inspect

Check what permits and licenses are required to operate a short-term rental at a specific location. Returns required permits, application processes, fees, and renewal schedules. Pass state code (required), optional city and property_id.

ParametersJSON Schema

Name	Required	Description
`city`	No	City name
`state`	Yes	State code
`property_id`	No	Property UUID

Tool Definition Quality

A3.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds useful context about what information is returned (permits, application processes, fees, renewal schedules), which goes beyond annotations, but does not detail behavioral aspects like rate limits, authentication needs, or data freshness.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states the purpose and return values, the second specifies parameters. Every sentence adds necessary information with zero waste, making it front-loaded and appropriately sized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (3 parameters, 1 required), rich annotations (readOnlyHint, destructiveHint), and 100% schema coverage, the description is largely complete. It explains the purpose, return values, and parameter usage. However, without an output schema, it could benefit from more detail on return format or limitations, but annotations help cover safety aspects.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear descriptions for all parameters (state, city, property_id). The description adds minimal value by noting that state is required and city/property_id are optional, but does not provide additional semantic context beyond what the schema already documents.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Check what permits and licenses are required') and resource ('to operate a short-term rental at a specific location'), distinguishing it from siblings like 'get_short_term_rental_regulations' or 'get_str_insurance_requirements' by focusing on permit requirements for a specific location rather than general regulations or insurance.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by specifying the target (short-term rental permits at a location) and listing parameters, but does not explicitly state when to use this tool versus alternatives like 'get_short_term_rental_regulations' or provide exclusions. The context is clear but lacks explicit guidance on alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_vacation_rental_availability_and_pricingA

Read-only

Inspect

Check real-time availability and pricing for a vacation rental property on specific dates. Returns whether the property is available, calculated total price, capacity check, and any conflicting bookings. Host receives 100% of nightly rate; 3% guest service fee at checkout. Always call this BEFORE book_vacation_rental_direct.

ParametersJSON Schema

Name	Required	Description
`guest_count`	No	Number of guests
`property_id`	Yes	Property UUID or lilo_code
`check_in_date`	Yes	Check-in date (YYYY-MM-DD)
`check_out_date`	Yes	Check-out date (YYYY-MM-DD)

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds valuable behavioral context beyond this: it specifies 'real-time' availability/pricing, discloses financial details ('Host receives 100% of nightly rate; 3% guest service fee at checkout'), and mentions what the tool returns (availability, total price, capacity check, conflicting bookings). While it doesn't cover rate limits or error handling, it significantly enhances transparency beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is highly concise and well-structured: two sentences that front-load the core functionality and follow with critical usage guidance. Every sentence earns its place—the first defines purpose and output, the second adds financial context and prerequisite rule—with zero wasted words or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (real-time checks with financial implications), the description is mostly complete. It explains the tool's purpose, output, financial context, and usage sequence. However, there is no output schema, and the description doesn't detail the return format (e.g., structure of 'conflicting bookings'). With annotations covering safety and schema covering inputs, the description provides strong context but leaves some output ambiguity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with all parameters well-documented in the schema (e.g., property_id as 'Property UUID or lilo_code', dates in YYYY-MM-DD format). The description adds no additional parameter semantics beyond what the schema provides, such as explaining how guest_count affects pricing or capacity. Given the high schema coverage, a baseline score of 3 is appropriate as the description doesn't compensate but doesn't need to.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Check real-time availability and pricing') and resource ('vacation rental property'), distinguishing it from siblings like 'book_vacation_rental_direct' (which books) or 'get_vacation_rental_details' (which retrieves static info). It explicitly mentions what it returns (availability, pricing, capacity check, conflicting bookings), making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance: 'Always call this BEFORE book_vacation_rental_direct.' This clearly indicates when to use this tool (as a prerequisite check) versus its sibling 'book_vacation_rental_direct' (for actual booking). It also implies an alternative (the booking tool) and sets a clear sequence, leaving no ambiguity about its role.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_vacation_rental_protection_statusA

Read-only

Inspect

Check whether a vacation rental property is actively protected by lilo. Returns protection status (active/inactive/expired), protection tier, active features, and last activity timestamp.

ParametersJSON Schema

Name	Required	Description	Default
`property_id`	Yes	Property UUID or lilo_code

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds valuable context by specifying the return data (protection status, tier, features, timestamp), which goes beyond annotations and helps the agent understand the output structure without an output schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the purpose and lists the return values clearly. There is no wasted verbiage, and every part of the sentence contributes essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (1 parameter, no output schema) and rich annotations, the description is mostly complete. It specifies what is returned, compensating for the lack of output schema. However, it could slightly improve by mentioning error cases or prerequisites, but it's sufficient for effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the parameter 'property_id' fully documented in the schema. The description does not add any additional meaning or details about the parameter beyond what the schema provides, so it meets the baseline of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Check whether a vacation rental property is actively protected by lilo') and the resource ('vacation rental property'), distinguishing it from siblings like 'get_vacation_rental_details' or 'get_vacation_rental_onboarding_status' by focusing on protection status rather than general information or onboarding.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for checking protection status but does not explicitly state when to use this tool versus alternatives (e.g., 'get_vacation_rental_onboarding_status' might overlap). It provides context but lacks explicit guidance on exclusions or named alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_world_cup_2026_str_complianceA

Read-only

Inspect

Check World Cup 2026 specific short-term rental compliance requirements for the 16 US host cities. Returns special regulations, surge pricing rules, enhanced permit requirements, and safety standards that apply during the tournament. Pass the city name (required), optional property_id.

ParametersJSON Schema

Name	Required	Description	Default
`city`	Yes	World Cup host city
`property_id`	No	Property UUID (optional)

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and destructiveHint=false, which the description aligns with by describing a checking operation. The description adds valuable context beyond annotations by specifying the temporal scope ('during the tournament') and the types of compliance details returned, enhancing the agent's understanding of what to expect without contradicting annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured sentence that efficiently conveys purpose, scope, output details, and parameter guidance. It is front-loaded with key information and avoids redundancy, making every word count without unnecessary elaboration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (checking compliance with specific parameters), the description is mostly complete: it covers purpose, scope, output types, and parameter notes. However, without an output schema, it could benefit from more detail on return format or error handling, but annotations and schema coverage mitigate this gap adequately.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear descriptions for 'city' and 'property_id'. The description adds minimal semantics by noting that city is required and property_id is optional, but does not provide additional details like city format or property_id usage beyond what the schema already covers, meeting the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('Check') and resource ('World Cup 2026 specific short-term rental compliance requirements'), specifying the scope ('for the 16 US host cities') and output details ('special regulations, surge pricing rules, enhanced permit requirements, and safety standards'). It effectively distinguishes itself from sibling tools like 'get_short_term_rental_regulations' by focusing on tournament-specific requirements.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context by mentioning 'during the tournament' and lists sibling tools, but does not explicitly state when to use this tool versus alternatives like 'get_short_term_rental_regulations' or 'get_philadelphia_world_cup_2026_info'. It provides some guidance through scope but lacks explicit comparisons or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compare_vacation_rentals_side_by_sideA

Read-only

Inspect

Compare up to 5 vacation rental properties side-by-side on key metrics including price, bedrooms, bathrooms, amenities, reputation score, and protection status. Pass an array of property_ids.

ParametersJSON Schema

Name	Required	Description	Default
`property_ids`	Yes	Array of property IDs to compare (max 5)

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds valuable context beyond annotations by specifying the comparison scope (up to 5 properties) and the key metrics included, which helps the agent understand what behavioral output to expect. However, it does not mention rate limits, authentication needs, or response format details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured sentence that efficiently conveys the tool's purpose, scope, and key metrics without any redundant information. It is front-loaded with the core action and appropriately sized for the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (comparison of multiple properties), the description is mostly complete. It lacks an output schema, so the description does not explain return values, but it compensates by listing the key metrics compared. With annotations covering safety and the schema fully documenting parameters, the description provides sufficient context for agent use, though output details are missing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the parameter 'property_ids' fully documented in the schema as an array of strings with a max of 5 items. The description adds no additional parameter semantics beyond what the schema provides, such as format examples or ID sourcing. Baseline 3 is appropriate when the schema handles all parameter documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('compare'), resource ('vacation rental properties'), scope ('up to 5'), and key metrics ('price, bedrooms, bathrooms, amenities, reputation score, and protection status'). It distinguishes itself from sibling tools like 'get_vacation_rental_details' or 'search_vacation_rentals_by_amenities' by emphasizing side-by-side comparison rather than individual retrieval or filtering.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when comparing multiple properties, but does not explicitly state when to use this tool versus alternatives like 'get_vacation_rental_details' for single properties or 'search_vacation_rentals_by_amenities' for filtering. It provides basic context (comparing up to 5 properties) but lacks explicit exclusions or named alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_rental_maintenance_taskA

Destructive

Inspect

Create a new maintenance task for a vacation rental property. Supports one-time and recurring tasks. Pass property_id, task_name, and task_type (hvac_filter, deep_clean, appliance_service, pest_control, exterior, safety_check, other). Optional: priority, description, recurring, frequency_days, due_date.

ParametersJSON Schema

Name	Required	Description
`due_date`	No	Due date (ISO format)
`priority`	No	Priority: low, medium, high, urgent
`recurring`	No	Is this a recurring task?
`task_name`	Yes	Name of the maintenance task
`task_type`	Yes	Type: hvac_filter, deep_clean, appliance_service, pest_control, exterior, safety_check, other
`description`	No	Detailed description
`property_id`	Yes	Property UUID or lilo_code
`frequency_days`	No	Days between occurrences (if recurring)

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint=false and destructiveHint=true, indicating a mutation operation with destructive potential. The description adds valuable context beyond annotations by specifying that it 'Supports one-time and recurring tasks' and listing the task_type enum values, which helps the agent understand the tool's capabilities. However, it doesn't mention authentication requirements, rate limits, or what exactly gets destroyed (e.g., whether existing tasks are overwritten).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states the purpose and key features, the second lists parameters with clear required/optional distinction. Every word serves a purpose with no redundancy or fluff, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a creation tool with 8 parameters, 100% schema coverage, and annotations indicating destructive mutation, the description provides adequate context about what the tool does and what parameters it accepts. However, without an output schema, it doesn't describe what the tool returns (e.g., task ID, confirmation message), leaving a minor gap in completeness for the agent's understanding of the full operation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with all parameters well-documented in the schema (including enums for priority and task_type, and format for due_date). The description lists parameters and provides the task_type enum values, but adds minimal semantic value beyond what the schema already provides. The baseline of 3 is appropriate given the comprehensive schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Create a new maintenance task') and resource ('for a vacation rental property'), distinguishing it from sibling tools like 'get_rental_maintenance_schedule' (which retrieves rather than creates) and 'assign_cleaner_to_rental_turnover' (which handles cleaning assignments rather than maintenance tasks). It specifies the scope of supported task types (one-time and recurring).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by listing required parameters and optional ones, but does not explicitly state when to use this tool versus alternatives. It mentions 'Supports one-time and recurring tasks' which provides some context, but lacks explicit guidance on prerequisites, when-not-to-use scenarios, or named alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

detect_guest_communication_riskA

Read-only

Inspect

Analyze guest messages for risk patterns in vacation rental communications. Returns risk assessment, evidence documentation, and response guidance for hosts. More comprehensive than detect_guest_message_threat_pattern — includes documentation and guest/property cross-referencing.

ParametersJSON Schema

Name	Required	Description
`message`	Yes	Guest message to analyze for risk patterns
`booking_id`	No	Booking UUID for evidence chain
`guest_name`	No	Guest name for documentation
`property_id`	No	Property UUID for cross-reference
`conversation_history`	No	Previous messages for conversation context analysis

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds useful behavioral context: it returns 'risk assessment, evidence documentation, and response guidance for hosts' and includes 'guest/property cross-referencing.' However, it doesn't disclose other behavioral traits like rate limits, authentication needs, or error conditions. With annotations covering safety, a 3 is appropriate—the description adds some value but not rich behavioral details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded: the first sentence states the core purpose, and the second adds differentiation from a sibling tool. Every sentence earns its place with no wasted words, making it efficient and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (risk analysis with multiple parameters) and lack of output schema, the description is reasonably complete. It explains the purpose, distinguishes from a sibling, and hints at return values ('risk assessment, evidence documentation, and response guidance'). However, without an output schema, more detail on return structure would be helpful, but the description compensates adequately for a non-critical analysis tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 5 parameters thoroughly. The description implies the tool uses 'guest/property cross-referencing' (hinting at booking_id, guest_name, property_id) and 'conversation context analysis' (hinting at conversation_history), but doesn't add syntax or format details beyond what the schema provides. Baseline 3 is correct when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Analyze guest messages for risk patterns in vacation rental communications.' It specifies the verb ('Analyze'), resource ('guest messages'), and domain context ('vacation rental communications'). It also explicitly distinguishes from sibling 'detect_guest_message_threat_pattern' by stating it's 'more comprehensive' and includes 'documentation and guest/property cross-referencing.'

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: analyzing guest messages for risk patterns. It explicitly mentions an alternative ('detect_guest_message_threat_pattern') and explains this tool is 'more comprehensive.' However, it doesn't specify when NOT to use it or mention other potential alternatives among the many sibling tools, which prevents a perfect score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

detect_guest_message_threat_patternB

Read-only

Inspect

Analyze a guest message or conversation for concerning patterns that may indicate risks to the vacation rental host. Pass the message text and optional conversation_history for context. Returns threat assessment and recommended response.

ParametersJSON Schema

Name	Required	Description
`message`	Yes	Message text to analyze
`booking_id`	No	Booking UUID for context
`conversation_history`	No	Previous messages for context

Tool Definition Quality

B3.4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds useful context about the tool's focus on 'concerning patterns' and 'risks to the vacation rental host,' and mentions the return format ('threat assessment and recommended response'). However, it doesn't disclose behavioral details like rate limits, authentication requirements, or how the analysis is performed beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and well-structured in two sentences: the first explains the purpose and inputs, and the second states the output. It avoids unnecessary details and is front-loaded with the core functionality. However, it could be slightly more efficient by combining ideas more tightly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (analyzing messages for threats), the description covers the basic purpose, inputs, and outputs. However, there's no output schema, so the description must fully explain return values, which it does minimally ('threat assessment and recommended response'). With annotations covering safety, it's adequate but lacks depth on behavioral aspects like error handling or analysis methodology.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the input schema already documents all parameters thoroughly. The description mentions 'message text and optional conversation_history for context,' which aligns with the schema but doesn't add significant semantic value beyond it. The baseline score of 3 is appropriate since the schema handles most parameter documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Analyze a guest message or conversation for concerning patterns that may indicate risks to the vacation rental host.' It specifies the verb ('analyze'), resource ('guest message or conversation'), and goal ('detect risks'). However, it doesn't explicitly differentiate from similar siblings like 'detect_guest_communication_risk' or 'analyze_guest_communication_risk', which appear to have overlapping functions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides some usage context by mentioning 'optional conversation_history for context' and specifying the input parameters. However, it doesn't explicitly state when to use this tool versus alternatives like 'detect_guest_communication_risk' or 'analyze_booking_threat_risk', nor does it outline any prerequisites or exclusions for its use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fetch_vacation_rental_detailsA

Read-only

Inspect

Fetch complete details for a specific vacation rental property by its lilo code (e.g. PROP-2343). Returns full property data including title, description as markdown text, photos, reviews, amenities, trust score, and booking URL. Optimized for AI deep research consumption.

ParametersJSON Schema

Name	Required	Description	Default
`lilo_code`	Yes	The property lilo code (e.g. PROP-2343)

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds useful context about the return format ('full property data including...') and optimization for AI consumption, but doesn't disclose behavioral traits like rate limits, authentication needs, or error conditions beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states the purpose and parameter, the second details the return data and optimization. Every phrase adds value with zero wasted words, and key information is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only tool with good annotations and a simple single-parameter schema, the description provides adequate context about what data is returned and the tool's optimization. However, without an output schema, more detail about the return structure would be helpful, and it doesn't address potential limitations or error cases.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with only one parameter fully documented. The description reinforces the parameter's purpose ('by its lilo code') and provides an example format ('e.g. PROP-2343'), but doesn't add significant meaning beyond what the schema already provides. Baseline 3 is appropriate when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Fetch complete details'), resource ('vacation rental property'), and identifier mechanism ('by its lilo code'). It distinguishes from sibling tools by specifying it returns comprehensive property data rather than performing analysis, booking, or searching functions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('for a specific vacation rental property by its lilo code') and implies it's for 'AI deep research consumption.' However, it doesn't explicitly state when not to use it or name alternative tools for similar purposes.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

find_similar_vacation_rentalsA

Read-only

Inspect

Find vacation rentals similar to a given property. Useful for recommending alternatives when a property is unavailable or when the traveler wants to compare similar options. Pass a property_id (UUID or lilo_code) and optional limit (default 5).

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No	Max results (default 5)
`property_id`	Yes	Property UUID or lilo_code to find similar to

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds useful context about the tool's purpose (finding similar rentals) and default behavior (limit default 5), but does not disclose additional behavioral traits like how similarity is determined, response format, or error handling. With annotations covering safety, the description adds some value but not rich behavioral details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with the core purpose followed by usage context and parameter hints. Every sentence earns its place with no wasted words, making it efficient and well-structured for quick understanding.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (2 parameters, read-only operation), the description covers purpose, usage, and basic parameter info. However, there is no output schema, and the description does not explain return values or similarity criteria. With annotations providing safety context, it is mostly complete but could benefit from more behavioral detail for a higher score.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear descriptions for both parameters (property_id and limit). The description adds minimal semantics by mentioning property_id accepts 'UUID or lilo_code' and limit is 'optional (default 5)', but this mostly repeats or slightly elaborates on schema info. Baseline 3 is appropriate as the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'find' and resource 'vacation rentals similar to a given property', making the purpose specific. It distinguishes from siblings like 'search_vacation_rentals_by_location' or 'compare_vacation_rentals_side_by_side' by focusing on similarity based on a property ID rather than location, amenities, or direct comparison.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: 'Useful for recommending alternatives when a property is unavailable or when the traveler wants to compare similar options.' This gives practical scenarios, but it does not explicitly state when not to use it or name specific alternatives among siblings, which prevents a score of 5.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flag_booking_for_enhanced_monitoringA

Destructive

Inspect

Flag a vacation rental booking for enhanced monitoring by lilo's protection system. Creates an alert with the specified reason and risk indicators. Use this when a host identifies concerning behavior. Requires booking_id and reason. Optional: array of risk_indicators.

ParametersJSON Schema

Name	Required	Description
`reason`	Yes	Reason for flagging
`booking_id`	Yes	Booking UUID to flag
`risk_indicators`	No	Risk indicators observed

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=false and destructiveHint=true, confirming this is a mutation with potential side effects. The description adds valuable context beyond annotations by specifying that it 'Creates an alert' and involves 'lilo's protection system,' which clarifies the behavioral impact. It does not detail rate limits or auth needs, but with annotations covering safety, this provides sufficient transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by additional details in two concise sentences. Every sentence adds value: the first defines the action, the second explains the outcome, and the third provides usage and parameter guidance. There is no wasted text, making it efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a mutation tool with destructiveHint=true and no output schema, the description is reasonably complete. It covers purpose, usage context, and parameters, though it lacks details on response format or error handling. Given the annotations provide safety context and schema covers parameters, this is adequate but could be enhanced with output expectations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear descriptions for all parameters (booking_id, reason, risk_indicators). The description adds minimal semantics by mentioning 'Requires booking_id and reason' and 'Optional: array of risk_indicators,' but this mostly repeats what the schema already states. Baseline 3 is appropriate as the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Flag a vacation rental booking for enhanced monitoring') and resource ('booking'), distinguishing it from sibling tools like 'analyze_booking_threat_risk' or 'assess_vacation_rental_booking_risk' which analyze rather than flag. It specifies the system involved ('lilo's protection system') and the outcome ('Creates an alert'), making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('when a host identifies concerning behavior'), which helps differentiate it from analysis tools. However, it does not explicitly state when not to use it or name specific alternatives among the many siblings, such as 'report_rental_inventory_issue' for other issues, leaving some ambiguity in tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

forecast_vacation_rental_demandA

Read-only

Inspect

Forecast booking demand for vacation rentals in a specific location over a date range. Returns seasonal trends, event-driven demand spikes (World Cup, holidays, concerts), occupancy predictions, and pricing recommendations. Pass location (required), date_range_start, and date_range_end.

ParametersJSON Schema

Name	Required	Description
`location`	Yes	Location to forecast
`date_range_end`	No	Forecast end date
`date_range_start`	No	Forecast start date

Tool Definition Quality

A3.7/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and destructiveHint=false, which the description aligns with by describing a forecasting operation. The description adds valuable behavioral context beyond annotations by specifying the return content (seasonal trends, event-driven demand spikes, occupancy predictions, pricing recommendations), which helps the agent understand the output format and scope.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose and efficiently lists return values and parameters in two sentences. It avoids unnecessary details, though it could be slightly more structured (e.g., separating return items with bullets).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (forecasting with 3 parameters), annotations cover safety (read-only, non-destructive), and the description adds output details, it is mostly complete. However, without an output schema, the description could benefit from more specifics on return format (e.g., structured data types), but it adequately conveys the key information for agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters. The description mentions the parameters ('location (required), date_range_start, and date_range_end') but does not add significant semantic details beyond what the schema provides, such as format examples or constraints. Baseline 3 is appropriate as the schema handles most of the parameter documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('forecast booking demand') and resources ('vacation rentals in a specific location over a date range'). It distinguishes from siblings by focusing on demand forecasting rather than risk analysis, booking, or compliance checks, which are covered by other tools like 'analyze_booking_threat_risk' or 'check_vacation_rental_availability_and_pricing'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It does not mention prerequisites, exclusions, or compare it to similar tools like 'search_vacation_rental_market' or 'get_vacation_rental_pricing_analysis', leaving the agent to infer usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_google_vacation_rentals_feedA

Destructive

Inspect

Generate a Google Vacation Rentals XML feed for a host's properties. Enables direct booking via Google Search results with a 3% guest service fee. Pass host_id (UUID) and optional format (xml or json, default xml).

ParametersJSON Schema

Name	Required	Description	Default
`format`	No	Output format: xml or json (default xml)
`host_id`	Yes	Host UUID to generate feed for

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate destructiveHint=true and readOnlyHint=false, suggesting this is a write operation that may have side effects. The description adds valuable context beyond annotations by specifying the 3% guest service fee and confirming it's a generation operation (not just a read). There's no contradiction with annotations - 'generate' aligns with destructiveHint=true as it likely creates/modifies feed data.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: first stating the core purpose and business value, second covering required parameters. Every element earns its place with no redundant information. It's appropriately sized for a tool with 2 parameters and clear functionality.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 2 parameters, 100% schema coverage, and annotations covering safety profile, the description provides good contextual completeness. It explains the business purpose (direct booking via Google Search), mentions the service fee, and covers parameter basics. The main gap is lack of output schema, so the description doesn't specify what the generated feed looks like, but this is partially compensated by mentioning output format options.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema already fully documents both parameters (host_id as 'Host UUID to generate feed for' and format as 'Output format: xml or json (default xml)'). The description repeats this information but doesn't add meaningful semantic context beyond what's in the schema. Baseline 3 is appropriate when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Generate a Google Vacation Rentals XML feed'), the resource ('for a host's properties'), and the business purpose ('Enables direct booking via Google Search results with a 3% guest service fee'). It distinguishes itself from sibling tools by focusing on feed generation rather than analysis, booking, or verification operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context about when to use this tool (to enable direct booking via Google Search results), but doesn't explicitly state when not to use it or name specific alternatives among the many sibling tools. The mention of '3% guest service fee' implies a cost consideration, which is helpful guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_str_tax_documentationA

Destructive

Inspect

Generate tax documentation for short-term rental income. Includes occupancy tax calculations, income summaries, and reporting data for tax filing. Pass host_id (UUID) and year (required). Optional: quarter (1-4) for quarterly reports.

ParametersJSON Schema

Name	Required	Description
`year`	Yes	Tax year
`host_id`	Yes	Host UUID
`quarter`	No	Quarter (1-4, optional)

Tool Definition Quality

A3.7/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=false and destructiveHint=true, suggesting a write operation with potential destructive effects. The description adds value by clarifying the output ('tax documentation... for tax filing') and the optional quarter parameter, but it does not elaborate on the destructive nature (e.g., what gets modified or overwritten) or other behavioral traits like permissions or rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by parameter guidance in two efficient sentences. Every sentence adds value: the first defines the tool's function and components, and the second specifies parameter requirements and options, with no redundant or vague language.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (tax documentation generation with destructive hints) and lack of an output schema, the description is adequate but incomplete. It covers the purpose and parameters well but does not detail the output format, potential side effects from destructiveHint, or error conditions, leaving gaps for an AI agent to infer behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear descriptions for year, host_id, and quarter. The description adds minimal semantics by reiterating that host_id is a UUID and quarter is optional (1-4), but it does not provide additional context like format examples or usage scenarios beyond what the schema already documents.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Generate tax documentation') and resource ('for short-term rental income'), with detailed components like 'occupancy tax calculations, income summaries, and reporting data for tax filing.' It effectively distinguishes itself from sibling tools focused on risk analysis, booking, compliance, and other rental management functions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by specifying required parameters (host_id and year) and an optional quarter for quarterly reports, but it does not explicitly state when to use this tool versus alternatives or provide context about prerequisites. No sibling tools overlap in tax documentation generation, so differentiation is inherent but not articulated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_chargeback_defense_for_bookingA

Read-only

Inspect

Generate a chargeback defense packet for a payment dispute on a vacation rental booking. Includes verified evidence of guest consent, check-in documentation, and interaction history. Pass booking_id (UUID), optional dispute_id, and disputed amount in cents.

ParametersJSON Schema

Name	Required	Description
`amount`	No	Disputed amount in cents
`booking_id`	Yes	Booking UUID
`dispute_id`	No	Dispute ID (if available)

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and destructiveHint=false, covering safety. The description adds valuable behavioral context beyond annotations: it specifies the output includes 'verified evidence of guest consent, check-in documentation, and interaction history', which helps the agent understand what the generated packet contains. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with the core purpose and efficiently listing parameters. Every sentence adds value without waste, making it easy to scan and understand quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with good annotations (read-only, non-destructive) and full schema coverage, the description is mostly complete. It adds output details (evidence types in the packet), though no output schema exists. It could improve by specifying format (e.g., PDF bundle) or usage timing, but gaps are minor given the context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents parameters. The description mentions parameters ('Pass booking_id (UUID), optional dispute_id, and disputed amount in cents') but adds no additional meaning beyond the schema's descriptions. Baseline 3 is appropriate as the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Generate a chargeback defense packet') and resource ('for a payment dispute on a vacation rental booking'), with details about the evidence included. It distinguishes from sibling tools like 'get_dispute_defense_packet_for_booking' by specifying the packet is for chargebacks and includes verified evidence types.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context (payment disputes on vacation rentals) and mentions optional parameters, but does not explicitly state when to use this tool versus alternatives like 'get_dispute_defense_packet_for_booking' or 'get_dispute_evidence_bundle_for_booking'. No exclusions or prerequisites are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_dispute_defense_packet_for_bookingA

Read-only

Inspect

Generate a dispute defense packet with independently verified evidence for a vacation rental payment dispute. Alias for get_chargeback_defense_for_booking.

ParametersJSON Schema

Name	Required	Description
`amount`	No	Disputed amount in cents
`booking_id`	Yes	Booking UUID
`dispute_id`	No	Dispute ID (if available)

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds value by specifying the output includes 'independently verified evidence', which clarifies the nature of the generated packet beyond what annotations convey. However, it lacks details on format, size, or any limitations like rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose and includes the alias information without unnecessary elaboration. Every word contributes to understanding the tool's function.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (generating evidence packets), annotations cover safety, and schema fully describes inputs, the description is mostly complete. However, without an output schema, it could benefit from more detail on the packet format or evidence types, though the mention of 'independently verified evidence' partially compensates.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear descriptions for booking_id, amount, and dispute_id. The description does not add any parameter-specific details beyond the schema, such as explaining how amount or dispute_id affect the output. Baseline score of 3 is appropriate as the schema adequately documents parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Generate a dispute defense packet with independently verified evidence') and resource ('for a vacation rental payment dispute'), distinguishing it from siblings like 'get_chargeback_defense_for_booking' by noting it's an alias, and from other tools that analyze risk or manage bookings rather than generating evidence packets.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context ('for a vacation rental payment dispute') and mentions an alternative ('Alias for get_chargeback_defense_for_booking'), but does not explicitly state when to use this tool versus other dispute-related tools like 'get_dispute_evidence_bundle_for_booking' or 'predict_booking_chargeback_probability', leaving some ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_dispute_evidence_bundle_for_bookingA

Read-only

Inspect

Generate a complete evidence bundle for dispute resolution on a vacation rental booking. Includes all verified interactions, consent records, and documentation. Pass booking_id (UUID) and dispute_type (chargeback, damage, review, or general).

ParametersJSON Schema

Name	Required	Description	Default
`booking_id`	Yes	Booking UUID for the dispute
`dispute_type`	No	Type: chargeback, damage, review, or general

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds valuable context beyond annotations by specifying what the bundle includes ('all verified interactions, consent records, and documentation') and the purpose ('for dispute resolution'), which helps the agent understand the output's nature and use case. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states the purpose and content, the second specifies required parameters. Every sentence adds essential information with zero wasted words, making it easy to parse and front-loaded with key details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (2 parameters, no output schema), annotations cover safety (read-only, non-destructive), and the description clarifies the bundle's contents and purpose. However, without an output schema, the description could better explain the return format (e.g., file type, structure) to fully prepare the agent for what to expect.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters (booking_id and dispute_type) fully described in the schema. The description adds minimal semantic value by restating the parameter names and dispute_type values, but does not provide additional context like format examples or usage nuances beyond what the schema already covers.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Generate a complete evidence bundle') and resource ('for dispute resolution on a vacation rental booking'), including the scope ('all verified interactions, consent records, and documentation'). It distinguishes from sibling tools like 'get_chargeback_defense_for_booking' or 'get_evidence_timeline_for_rental' by focusing on comprehensive bundle generation rather than specific defenses or timelines.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context by mentioning 'dispute resolution' and listing dispute_type values, but does not explicitly state when to use this tool versus alternatives like 'get_dispute_defense_packet_for_booking' or 'query_vacation_rental_evidence_chain'. It provides basic parameter guidance but lacks explicit when/when-not instructions or named alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_evidence_timeline_for_rentalA

Read-only

Inspect

Get a chronological timeline of all evidence records for a vacation rental property or specific booking. Filter by date range. Returns events in order with timestamps, types, and verification status.

ParametersJSON Schema

Name	Required	Description
`end_date`	No	End date filter (YYYY-MM-DD)
`booking_id`	No	Booking UUID (optional)
`start_date`	No	Start date filter (YYYY-MM-DD)
`property_id`	Yes	Property UUID

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds useful context about the return format ('events in order with timestamps, types, and verification status'), but does not disclose behavioral traits like pagination, rate limits, or authentication needs beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by filtering and return details in two efficient sentences. Every sentence earns its place without redundancy, making it appropriately sized and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (4 parameters, no output schema), the description is complete enough for a read-only operation with good annotations. It covers purpose, filtering, and return format, though it could benefit from mentioning any limitations (e.g., max date range) or error handling for missing parameters.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all four parameters. The description adds marginal value by mentioning 'Filter by date range' and 'specific booking', which aligns with the schema's 'start_date', 'end_date', and 'booking_id', but does not provide additional syntax or format details beyond what the schema specifies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Get a chronological timeline') and resources ('evidence records for a vacation rental property or specific booking'), distinguishing it from siblings like 'query_vacation_rental_evidence_chain' or 'verify_vacation_rental_evidence_record' by focusing on timeline retrieval rather than querying or verification.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('Filter by date range') and implies usage for chronological evidence retrieval. However, it does not explicitly state when not to use it or name specific alternatives among siblings, such as 'query_vacation_rental_evidence_chain' for non-chronological queries.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_lilo_founding_member_availabilityA

Read-only

Inspect

Check how many founding member spots remain for lilo's vacation rental protection. Returns total_spots (200), spots_claimed, spots_remaining, and founding_price ($149/month locked for life). No parameters required.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A3.9/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and destructiveHint; description adds no additional behavioral context beyond listing return fields, which is adequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single concise sentence that front-loads purpose and lists return values, no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with no parameters and no output schema, the description covers purpose and main return fields. Could elaborate on what 'founding member' means but sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters, so no extra meaning needed. The description accurately states 'No parameters required.'

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Check' and the resource 'founding member spots for lilo's vacation rental protection', differentiating it from sibling tools like get_lilo_protection_network_stats.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives. It could mention that for broader stats, use get_lilo_protection_network_stats.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_lilo_protection_network_statsA

Read-only

Inspect

Get aggregate statistics about the lilo vacation rental protection network. Returns total properties protected, evidence integrity status, and active protection capabilities. No parameters required.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, establishing this as a safe read operation. The description adds valuable context beyond annotations by specifying what data is returned (three specific metrics) and confirming no parameters are needed. It doesn't mention rate limits, authentication requirements, or response format details, but provides useful behavioral information about the tool's scope and output.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly front-loaded with the core purpose in the first clause, followed by specific return details, and ends with the crucial parameter information. Every sentence earns its place, with zero wasted words or redundant information. The structure moves from general to specific in a logical flow.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a zero-parameter read-only tool with good annotations, the description provides excellent coverage of purpose, output details, and parameter requirements. The main gap is the absence of an output schema, so the agent doesn't know the exact structure of the returned data. However, the description does specify the three metrics that will be included, which provides substantial guidance for a simple stats retrieval tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0 parameters and 100% schema description coverage, the baseline would be 4. The description explicitly states 'No parameters required,' which reinforces what the empty input schema already shows. This clear confirmation adds value by eliminating any potential ambiguity about whether parameters might be optional rather than truly absent.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('Get') and resource ('aggregate statistics about the lilo vacation rental protection network'), specifying the exact data returned (total properties protected, evidence integrity status, active protection capabilities). It distinguishes itself from sibling tools like 'check_vacation_rental_protection_status' by focusing on network-wide aggregates rather than individual property status.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('Get aggregate statistics'), and explicitly states 'No parameters required' which helps the agent understand this is a simple data retrieval operation. However, it doesn't explicitly mention when NOT to use it or name specific alternatives among the many sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_local_recommendations_near_rentalB

Read-only

Inspect

Get local recommendations near a vacation rental property. Returns nearby restaurants, coffee shops, grocery stores, activities, and attractions. Filter by type: restaurants, coffee, grocery, activities, attractions.

ParametersJSON Schema

Name	Required	Description	Default
`type`	No	Optional: recommendation type (restaurants, coffee, grocery, activities, attractions)
`property_id`	Yes	Property UUID or lilo_code

Tool Definition Quality

B3.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds context about what types of recommendations are returned (restaurants, coffee shops, etc.) and mentions filtering capability, which provides useful behavioral details beyond the annotations. However, it doesn't disclose rate limits, authentication needs, or result format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately concise with two sentences that efficiently convey the core functionality. The first sentence states the purpose and return values, the second explains filtering. No wasted words, though it could be slightly more structured with bullet points for the recommendation types.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only tool with good annotations and full schema coverage, the description provides adequate context about what the tool returns. However, without an output schema, it doesn't specify the structure or format of recommendations (e.g., distance, ratings, addresses). The description covers the basics but leaves return format ambiguous.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already fully documents both parameters. The description mentions filtering by type which aligns with the 'type' parameter, but adds no additional semantic context beyond what's in the schema. This meets the baseline expectation when schema coverage is complete.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get local recommendations near a vacation rental property' with specific resources listed (restaurants, coffee shops, etc.). It distinguishes itself from most siblings by focusing on local recommendations, though it doesn't explicitly differentiate from 'get_neighborhood_info_for_rental' which might overlap in scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, timing, or how it differs from similar tools like 'get_neighborhood_info_for_rental' or location-based search tools in the sibling list. Usage is implied but not explicitly stated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_neighborhood_info_for_rentalA

Read-only

Inspect

Get neighborhood information for a vacation rental location including safety scores, walkability, public transit access, nearby amenities, and local character. Helps travelers understand the area around a property.

ParametersJSON Schema

Name	Required	Description	Default
`property_id`	Yes	Property UUID or lilo_code

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate read-only and non-destructive behavior, which the description aligns with by using 'Get' (implying retrieval). The description adds valuable context beyond annotations by specifying the types of neighborhood information returned (safety scores, walkability, etc.), which helps the agent understand the tool's output scope. No contradictions with annotations are present.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states the tool's purpose with specific details, and the second provides usage context. Every word adds value, with no redundancy or unnecessary information, making it easy to parse and understand quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (retrieval of neighborhood data), the description is complete enough with annotations covering safety (read-only, non-destructive) and the description outlining the information scope. However, the lack of an output schema means the agent must infer the return format from the description alone, which is adequate but could be more precise about data structure or limitations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the single parameter 'property_id' clearly documented. The description does not add any parameter-specific details beyond what the schema provides, such as format examples or constraints. With high schema coverage, the baseline score of 3 is appropriate as the description doesn't compensate but doesn't need to.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Get neighborhood information') and resources ('for a vacation rental location'), listing concrete information types like safety scores, walkability, and amenities. It distinguishes itself from sibling tools like 'get_vacation_rental_details' or 'get_local_recommendations_near_rental' by focusing on neighborhood characteristics rather than property details or specific recommendations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context ('Helps travelers understand the area around a property'), suggesting it's for travel planning or property evaluation. However, it lacks explicit guidance on when to use this tool versus alternatives like 'get_local_recommendations_near_rental' or 'get_vacation_rental_details', and does not mention any prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_philadelphia_250th_anniversary_eventsA

Read-only

Inspect

Get America's 250th Anniversary (July 4, 2026) event information for Philadelphia. Returns key historic sites, planned celebrations, expected visitor numbers, and accommodation surge predictions for the semiquincentennial. No parameters required.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A4.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds useful context beyond annotations: it specifies the return content (historic sites, celebrations, visitor numbers, accommodation predictions) and notes 'No parameters required.' The annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description doesn't add behavioral details like rate limits or authentication requirements, but provides adequate context given the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise and front-loaded: a single sentence that immediately states the tool's purpose, specifies what it returns, and notes the parameter situation. Every word earns its place with zero wasted text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a parameterless read-only tool with good annotations, the description is nearly complete. It specifies what information will be returned (historic sites, celebrations, visitor numbers, accommodation predictions) and the event context. Without an output schema, the description provides adequate information about return values. The only minor gap is lack of detail about format or structure of returned data.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0 parameters and 100% schema description coverage, the baseline is 4. The description explicitly states 'No parameters required,' which adds clarity beyond the empty schema. This helps the agent understand this is a parameterless query tool.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get America's 250th Anniversary (July 4, 2026) event information for Philadelphia.' It specifies the exact resource (Philadelphia's 250th anniversary events), the date context, and distinguishes itself from sibling tools like 'get_philadelphia_landmark_details' or 'search_philadelphia_event_venues' by focusing specifically on the semiquincentennial celebrations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: when information about Philadelphia's 250th anniversary events is needed. It doesn't explicitly mention when not to use it or name specific alternatives, but the context is sufficiently clear given the tool's specialized focus on a specific historical event.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_philadelphia_landmark_detailsB

Read-only

Inspect

Get detailed information about a specific Philadelphia landmark including full property manifest and Schema.org structured data. Pass landmark_id (e.g., phl-landmark-123) and source type (landmark, religious, aahs).

ParametersJSON Schema

Name	Required	Description	Default
`source`	No	Data source type
`landmark_id`	Yes	Landmark ID (e.g., phl-landmark-123)

Tool Definition Quality

B3.4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds useful context about what information is returned (full property manifest and Schema.org structured data), which goes beyond the annotations. However, it doesn't mention potential limitations like rate limits, authentication requirements, or data freshness, leaving some behavioral aspects uncovered.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately concise with two sentences that efficiently convey the tool's purpose and parameter requirements. The first sentence states what the tool does, and the second explains what to pass. There's no unnecessary verbiage, though it could be slightly more front-loaded by moving the parameter guidance to a separate section in a longer description.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that there's no output schema, the description should ideally provide more information about the return format. While it mentions 'full property manifest and Schema.org structured data,' it doesn't specify the structure, fields, or examples of what's returned. The annotations cover safety aspects, but for a tool with no output schema, the description leaves the agent guessing about the exact response format and content.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters clearly documented in the schema. The description repeats the landmark_id example from the schema and lists the source enum values, adding minimal value beyond what's already in the structured schema. This meets the baseline expectation when schema coverage is complete, but doesn't provide additional semantic context like parameter interactions or special formatting requirements.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get detailed information about a specific Philadelphia landmark including full property manifest and Schema.org structured data.' It specifies the resource (Philadelphia landmark) and the type of information returned (detailed info, property manifest, structured data). However, it doesn't explicitly differentiate from sibling tools like 'search_philadelphia_historic_properties' or 'get_philadelphia_250th_anniversary_events', which prevents a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides implied usage guidance by specifying the required parameters (landmark_id and source type) and giving examples of valid source values. However, it doesn't explicitly state when to use this tool versus alternatives like 'search_philadelphia_historic_properties' (which might be for broader searches) or 'get_philadelphia_world_cup_2026_info' (which is event-specific). No explicit when-not-to-use or alternative recommendations are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_philadelphia_world_cup_2026_infoA

Read-only

Inspect

Get World Cup 2026 information for Philadelphia. FREE TOOL. Returns match schedule, venue details (Lincoln Financial Field), expected accommodation demand surge, transportation info, and STR compliance requirements. No parameters required.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable behavioral context beyond annotations: it specifies the return content (match schedule, venue details, etc.), mentions 'FREE TOOL' indicating no payment requirements, and states 'No parameters required' clarifying simplicity. Annotations already cover read-only/non-destructive nature, so this additional context earns a strong score.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by specific return details and operational notes. Every sentence adds value: the first states what it does, the second lists return items, and the third covers cost and parameters. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 0 parameters, read-only annotations, and no output schema, the description is quite complete: it covers purpose, return content, cost, and parameter requirements. The only minor gap is lack of output format details, but given the tool's simplicity, this is acceptable.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0 parameters and 100% schema coverage, the baseline would be 4. The description explicitly states 'No parameters required,' which adds clarity beyond the empty schema, confirming the tool's parameter-free nature and preventing confusion.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the verb ('Get') and resource ('World Cup 2026 information for Philadelphia'), with specific details like match schedule and venue details. It clearly distinguishes this tool from sibling tools focused on vacation rental analysis, compliance, or other Philadelphia events.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool (for World Cup 2026 information in Philadelphia) and mentions 'FREE TOOL' which implies no cost barriers. However, it doesn't explicitly state when not to use it or name specific alternatives among the sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_rental_cleaning_scheduleA

Read-only

Inspect

Get the cleaning schedule and cleaner assignments for a vacation rental property. Shows upcoming turnovers between bookings, assigned cleaners, and scheduling gaps. Pass property_id and optional days_ahead (default 14).

ParametersJSON Schema

Name	Required	Description	Default
`days_ahead`	No	Days to look ahead (default 14)
`property_id`	Yes	Property UUID or lilo_code

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds valuable context beyond this by specifying what the tool shows (upcoming turnovers, cleaner assignments, scheduling gaps), which helps the agent understand the output behavior. It does not mention rate limits or auth needs, but the annotations cover the safety profile adequately.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by specific details and parameter guidance in the second sentence. Every sentence adds value without redundancy, making it efficiently structured and appropriately sized for the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (2 parameters, no output schema), the description is mostly complete: it explains the purpose, what it shows, and parameter usage. However, it lacks details on output format (e.g., structure of returned schedule) and any error conditions, which could be helpful since there's no output schema. Annotations cover safety, but more behavioral context would enhance completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters (property_id and days_ahead) with their types and descriptions. The description adds minimal value by mentioning the default for days_ahead, but it does not provide additional semantic context beyond what the schema offers, aligning with the baseline for high coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific verb ('Get') and resource ('cleaning schedule and cleaner assignments for a vacation rental property'), with explicit details about what it shows (upcoming turnovers, assigned cleaners, scheduling gaps). It distinguishes from sibling tools like 'get_rental_maintenance_schedule' by focusing on cleaning rather than maintenance.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by specifying the required property_id and optional days_ahead parameter, but it does not explicitly state when to use this tool versus alternatives (e.g., no comparison to other scheduling or assignment tools). It provides basic context but lacks explicit guidance on exclusions or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_rental_maintenance_scheduleA

Read-only

Inspect

Get upcoming and overdue maintenance tasks for a vacation rental property. Includes HVAC filter changes, appliance servicing, pest control, and all recurring maintenance. Pass property_id, optional include_overdue (default true), and days_ahead (default 30).

ParametersJSON Schema

Name	Required	Description
`days_ahead`	No	Days to look ahead (default 30)
`property_id`	Yes	Property UUID or lilo_code
`include_overdue`	No	Include overdue tasks (default true)

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds useful context beyond annotations by specifying the scope ('upcoming and overdue') and types of tasks included, though it does not detail rate limits, auth needs, or return format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by parameter details in a second sentence. It is efficiently structured with zero wasted words, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only tool with full parameter coverage and no output schema, the description adequately covers purpose and parameters. However, it lacks details on return format or pagination, which could be helpful given the absence of an output schema, leaving minor gaps in completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are fully documented in the schema. The description adds minimal value by listing parameters and their defaults, but does not provide additional semantics beyond what the schema already covers, aligning with the baseline for high coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and the resource 'upcoming and overdue maintenance tasks for a vacation rental property', with specific examples like HVAC filter changes and appliance servicing. It distinguishes from sibling tools like 'get_rental_cleaning_schedule' by focusing on maintenance rather than cleaning.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for retrieving maintenance tasks but does not explicitly state when to use this tool versus alternatives like 'create_rental_maintenance_task' or other get_* tools. No exclusions or prerequisites are mentioned, leaving usage context partially inferred.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_short_term_rental_regulationsA

Read-only

Inspect

Get local short-term rental (STR) regulations for a specific city and state. Returns permit requirements, occupancy taxes, maximum guest limits, zoning restrictions, and operational requirements. Essential for hosts to understand compliance. Pass state code (required), optional city and property_type.

ParametersJSON Schema

Name	Required	Description
`city`	No	City name
`state`	Yes	State code (e.g., CA, NY)
`property_type`	No	Property type for specific rules

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds valuable context about what information is returned (permit requirements, occupancy taxes, etc.) and the essential purpose for hosts, which goes beyond the annotations. No contradictions exist.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states purpose and return values, the second provides essential context and parameter guidance. Every sentence adds value with zero waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only tool with good annotations and full schema coverage, the description adequately covers purpose, return values, and parameter basics. However, without an output schema, it could benefit from more detail on response format or data structure, though the listed return items provide reasonable guidance.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are well-documented in the schema. The description adds minimal value by noting 'state code (required), optional city and property_type' and implying property_type affects 'specific rules', but doesn't provide additional syntax or format details beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and resource 'local short-term rental (STR) regulations' with specific scope 'for a specific city and state'. It distinguishes from sibling tools like 'check_str_permit_requirements' by covering broader regulatory information beyond just permits.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context ('Essential for hosts to understand compliance') but doesn't explicitly state when to use this tool versus alternatives like 'check_str_permit_requirements' or 'get_str_insurance_requirements'. It mentions parameter requirements but not comparative guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_str_insurance_requirementsA

Read-only

Inspect

Get insurance requirements and recommendations for short-term rental hosts by state. Returns liability minimums, STR-specific coverage options, and recommended policy limits based on property type and estimated revenue.

ParametersJSON Schema

Name	Required	Description
`state`	Yes	State code
`property_type`	No	Property type
`estimated_revenue`	No	Estimated annual revenue

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds valuable context about what specific information is returned (liability minimums, STR-specific coverage options, recommended policy limits) and the factors influencing recommendations (property type and estimated revenue), which goes beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states the purpose and scope, the second details the return content and influencing factors. Every sentence earns its place with no wasted words, and key information is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only tool with good annotations and full schema coverage, the description provides sufficient context about what information is returned and how it's tailored. However, without an output schema, it doesn't specify the return format (e.g., structured data, text summary), which could be helpful for an agent. It's mostly complete but has a minor gap in output details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters. The description mentions that recommendations are 'based on property type and estimated revenue,' which aligns with the optional parameters, but doesn't add syntax, format, or constraint details beyond what the schema provides. Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Get insurance requirements and recommendations') and resources ('for short-term rental hosts by state'), distinguishing it from sibling tools that focus on risk analysis, booking, compliance, or other aspects of vacation rental management. It precisely defines what information will be returned.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context by specifying 'for short-term rental hosts by state,' but doesn't explicitly state when to use this tool versus alternatives like 'check_str_permit_requirements' or 'get_short_term_rental_regulations.' No guidance is provided on prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_vacation_rental_ai_manifestA

Read-only

Inspect

Get the AI-optimized property manifest in schema.org format with lilo extensions. Available in YAML, JSON, or JSON-LD format. Use this for structured data integration and AI agent consumption.

ParametersJSON Schema

Name	Required	Description	Default
`format`	No	Output format: yaml, json, or jsonld (default jsonld)
`property_id`	Yes	Property UUID or lilo_code

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds valuable context beyond this by specifying the available output formats and the tool's intended use cases (structured data integration, AI consumption), which helps the agent understand behavioral aspects like format selection and integration scenarios.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states the core purpose and available formats, the second specifies the use cases. Every element adds value without redundancy, making it easy to parse and understand quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (2 parameters, read-only operation), the description provides sufficient context about what the tool returns (AI-optimized manifest in specific formats) and its primary use cases. While there's no output schema, the description compensates by specifying formats and integration purposes, though it could briefly mention the content scope of the manifest.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear documentation for both parameters (property_id and format). The description mentions 'Available in YAML, JSON, or JSON-LD format' which aligns with the format parameter but doesn't add significant meaning beyond what the schema already provides. This meets the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get'), the resource ('AI-optimized property manifest'), the format ('schema.org format with lilo extensions'), and available output formats ('YAML, JSON, or JSON-LD'). It distinguishes this from sibling tools like 'get_vacation_rental_details' or 'get_vacation_rental_identity_manifest' by focusing on structured data integration and AI consumption.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('for structured data integration and AI agent consumption'), which helps differentiate it from other get_* tools that might return different data formats or purposes. However, it doesn't explicitly state when not to use it or name specific alternatives among the many sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_vacation_rental_detailsA

Read-only

Inspect

Get complete details for a vacation rental property including name, location, address, property type, bedrooms, bathrooms, nightly rate, amenities, house rules, photos, protection status, and host reputation score. Use the property_id (UUID) or lilo_code (e.g. PROP-2343) to identify the property.

ParametersJSON Schema

Name	Required	Description	Default
`property_id`	Yes	Property UUID or lilo_code

Tool Definition Quality

A3.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds value by specifying the comprehensive data returned (amenities, photos, host reputation score, etc.) and clarifying the identifier formats (UUID or lilo_code), which provides useful context beyond the annotations. However, it does not disclose behavioral traits like rate limits, error conditions, or response format details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first lists all returned details, and the second specifies the required input. It is front-loaded with the key information and contains no redundant or unnecessary content, making it highly concise and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (retrieving comprehensive property details), the description is mostly complete. It outlines the scope of returned data and input requirements. However, without an output schema, it does not specify the structure or format of the response (e.g., JSON fields, nested objects), which is a minor gap. The annotations cover safety, and the schema covers parameters well.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the parameter 'property_id' fully documented in the schema as 'Property UUID or lilo_code'. The description repeats this information but does not add significant meaning beyond what the schema provides, such as examples of lilo_code formats or validation rules. The baseline score of 3 is appropriate given the high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get complete details') and resource ('vacation rental property'), and it enumerates the comprehensive scope of information returned (name, location, address, etc.). This distinguishes it from sibling tools like 'get_vacation_rental_host_reputation' or 'get_vacation_rental_house_rules' which retrieve only specific subsets of data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by specifying the required identifier ('property_id' or 'lilo_code'), but it does not explicitly state when to use this tool versus alternatives like 'search_vacation_rentals_by_location' or 'find_similar_vacation_rentals'. It provides basic context but lacks explicit guidance on exclusions or comparisons with sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_vacation_rental_faqsA

Read-only

Inspect

Get all frequently asked questions and answers for a vacation rental property. Covers WiFi password, check-in instructions, parking, appliances, emergency contacts, and local info. Filter by category: check_in, check_out, wifi_internet, parking, appliances, emergency, local_info.

ParametersJSON Schema

Name	Required	Description	Default
`category`	No	Optional: filter by category (check_in, check_out, wifi_internet, parking, appliances, emergency, local_info)
`property_id`	Yes	Property UUID or lilo_code

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds valuable context beyond annotations by detailing the specific topics covered (e.g., emergency contacts, local info) and the filtering capability by category, which helps the agent understand the tool's behavioral scope. No contradictions with annotations exist.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states the core purpose and scope, and the second specifies the filtering capability. Every sentence adds essential information without redundancy, making it front-loaded and easy to parse. No wasted words or unnecessary details are present.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (a read-only FAQ retrieval with two parameters), annotations cover safety (readOnlyHint, destructiveHint), and schema coverage is 100%, the description is largely complete. It adds useful context on content areas and filtering. However, without an output schema, it could briefly hint at the return format (e.g., list of Q&A pairs), but this is a minor gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents both parameters (property_id and category) with descriptions and optionality. The description adds marginal value by listing the category options in a more readable format, but does not provide additional semantic context beyond what the schema already states. This meets the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with a specific verb ('Get') and resource ('frequently asked questions and answers for a vacation rental property'), and distinguishes it from siblings by focusing on FAQs rather than details, rules, or other property information. It explicitly lists the content areas covered (WiFi password, check-in instructions, etc.), making the scope unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by specifying the filterable categories, suggesting it should be used to retrieve FAQ content for a property. However, it does not explicitly state when to use this tool versus alternatives like 'get_vacation_rental_details' or 'get_vacation_rental_house_rules', nor does it provide exclusions or prerequisites. The guidance is functional but lacks sibling differentiation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_vacation_rental_host_reputationA

Read-only

Inspect

Get the reputation score and verified evidence summary for a vacation rental host. Returns protection stats, evidence count, dispute resolution rate, and verification status. Use this to verify a host's trustworthiness.

ParametersJSON Schema

Name	Required	Description	Default
`host_id`	Yes	Host UUID

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds context about what data is returned (e.g., protection stats, evidence count), which is useful beyond the annotations, but does not disclose behavioral traits like rate limits, authentication needs, or error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by a usage guideline, making it efficient with zero waste. Both sentences earn their place by adding value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (a read-only query with one parameter) and lack of output schema, the description adequately covers the purpose and usage. However, it could be more complete by detailing the return format or potential errors, though annotations help mitigate some gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the parameter 'host_id' fully documented as 'Host UUID'. The description does not add any meaning beyond what the schema provides, such as format examples or constraints, so it meets the baseline of 3 for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Get the reputation score and verified evidence summary') and resources ('for a vacation rental host'), distinguishing it from siblings like 'get_vacation_rental_reputation_score' by including additional details like protection stats and verification status.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for usage ('Use this to verify a host's trustworthiness'), but does not explicitly mention when not to use it or name specific alternatives among the many sibling tools, such as 'check_vacation_rental_protection_status' or 'get_vacation_rental_trust_certificate'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_vacation_rental_house_rulesA

Read-only

Inspect

Get the house rules for a vacation rental in structured, machine-readable format. Returns maximum guests, quiet hours, pet policy, smoking policy, check-in/check-out times, parking, and special restrictions.

ParametersJSON Schema

Name	Required	Description	Default
`property_id`	Yes	Property UUID or lilo_code

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, covering safety. The description adds valuable behavioral context by specifying the exact fields returned (maximum guests, quiet hours, etc.) and the structured format, which helps the agent understand what to expect. No contradiction with annotations exists.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose and immediately details the return content. Every word adds value without redundancy, making it easy for an agent to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only tool with one parameter and no output schema, the description provides strong context by listing the specific data fields returned. However, it lacks details on error handling or edge cases (e.g., what happens if the property_id is invalid), which would elevate it to a 5.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the single parameter 'property_id' fully documented in the schema. The description does not add any parameter-specific information beyond what the schema provides, such as examples of valid IDs. Baseline 3 is appropriate when the schema handles parameter documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get the house rules') and resource ('for a vacation rental'), and specifies the output format ('structured, machine-readable format'). It distinguishes from sibling tools like 'get_vacation_rental_details' by focusing exclusively on house rules rather than general property information.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when house rules are needed, but provides no explicit guidance on when to use this tool versus alternatives like 'get_vacation_rental_details' or 'get_vacation_rental_faqs'. No exclusions or prerequisites are mentioned, leaving the agent to infer context from the tool name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_vacation_rental_identity_manifestB

Read-only

Inspect

Get the machine-readable property identity manifest with structured evidence data, visibility metadata, and verification references. Use this for programmatic property data consumption.

ParametersJSON Schema

Name	Required	Description	Default
`property_id`	Yes	Property UUID or lilo_code

Tool Definition Quality

B3.4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds value by specifying the output includes 'structured evidence data, visibility metadata, and verification references,' which provides context beyond annotations. However, it doesn't detail behavioral aspects like rate limits, error handling, or authentication needs, leaving room for improvement.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loaded, with two sentences that efficiently convey the tool's purpose and usage. Every sentence adds value: the first defines what the tool does, and the second provides usage context. It avoids redundancy and is appropriately sized for a single-parameter tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (simple read operation with one parameter), annotations cover safety (read-only, non-destructive), and schema fully documents the input. However, there's no output schema, and the description only hints at output content without detailing structure or examples. For a tool focused on 'machine-readable' data, more completeness on output expectations would be beneficial.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the single parameter 'property_id' fully documented in the schema as 'Property UUID or lilo_code.' The description doesn't add any parameter-specific information beyond this, so it meets the baseline of 3 where the schema handles the heavy lifting without extra semantic value from the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get the machine-readable property identity manifest with structured evidence data, visibility metadata, and verification references.' It specifies the verb ('Get') and resource ('property identity manifest') with details about content. However, it doesn't explicitly differentiate from sibling tools like 'get_vacation_rental_ai_manifest' or 'get_vacation_rental_details', which prevents a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides some usage context: 'Use this for programmatic property data consumption,' which implies when to use it (for automated data processing). However, it lacks explicit guidance on when to choose this tool over alternatives (e.g., vs. 'get_vacation_rental_details' for human-readable info) or any exclusions, making it only implied rather than comprehensive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_vacation_rental_inventoryA

Read-only

Inspect

Get the inventory list for a vacation rental property. Returns all tracked items with quantities, locations, condition, and recent check history. Filter by category: furniture, appliance, electronics, linen, kitchenware, bathroom, decor, outdoor, safety, amenity, supply.

ParametersJSON Schema

Name	Required	Description	Default
`category`	No	Filter by category: furniture, appliance, electronics, linen, kitchenware, bathroom, decor, outdoor, safety, amenity, supply
`property_id`	Yes	Property UUID or lilo_code

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and destructiveHint=false, which the description aligns with by describing a retrieval operation ('Get the inventory list'). The description adds value by detailing the return content (tracked items with quantities, locations, condition, check history) and filtering categories, but it does not disclose additional behavioral traits like rate limits, auth needs, or pagination. With annotations covering safety, this is adequate but not rich in extra context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence and efficiently adds details in the second sentence without waste. Every sentence earns its place by specifying return data and filtering options, making it appropriately sized and structured for clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (2 parameters, no output schema), the description is mostly complete: it explains what the tool does, what it returns, and filtering options. However, it lacks details on output format (e.g., structure of the inventory list) and any limitations (e.g., pagination or error handling), which could be useful since there's no output schema. With good annotations and schema coverage, it's sufficient but not fully comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters (property_id and category) well-documented in the schema. The description adds minimal semantics by listing the filter categories in its text, but this mostly repeats what's in the schema's description for the category parameter. Since the schema does the heavy lifting, the baseline score of 3 is appropriate as the description provides little additional parameter insight.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Get the inventory list') and resources ('for a vacation rental property'), distinguishing it from sibling tools like 'get_vacation_rental_details' or 'report_rental_inventory_issue' by focusing on inventory items with specific attributes. It explicitly mentions what is returned (tracked items with quantities, locations, condition, check history) and filtering capabilities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool by specifying it returns inventory data with filtering options, but it does not explicitly state when not to use it or name alternatives among sibling tools (e.g., 'report_rental_inventory_issue' might be for reporting issues, not retrieving data). The guidance is implied through the focus on inventory retrieval rather than explicit exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_vacation_rental_onboarding_statusA

Read-only

Inspect

Check onboarding progress for a vacation rental property on lilo. Returns which setup steps are complete and which are still needed (property details, photos, house rules, calendar sync, payment setup, etc.). Pass lilo_code (e.g. PROP-6408) or property_id (UUID).

ParametersJSON Schema

Name	Required	Description	Default
`lilo_code`	No	Property lilo code (e.g. PROP-6408)
`property_id`	No	Property UUID (alternative to lilo_code)

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds valuable context beyond annotations by specifying the return content ('setup steps are complete and which are still needed') and listing examples like 'property details, photos, house rules, calendar sync, payment setup'. It does not mention rate limits, auth needs, or error conditions, but provides useful behavioral details about the output.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states the purpose and return value, and the second specifies the parameters. Every sentence adds essential information without redundancy, making it front-loaded and easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (read-only status check with 2 parameters) and rich annotations (readOnlyHint, destructiveHint), the description is mostly complete. It explains the purpose, output content, and parameters. However, without an output schema, it could benefit from more detail on the return format (e.g., structure of steps). It adequately covers core functionality but leaves some behavioral aspects unspecified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters ('lilo_code' and 'property_id') well-documented in the schema. The description adds minimal value by restating the parameters and providing an example ('e.g. PROP-6408'), but does not explain semantics beyond what the schema already covers, such as parameter precedence or mutual exclusivity. Baseline 3 is appropriate given high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Check onboarding progress') and resource ('vacation rental property on lilo'), distinguishing it from sibling tools like 'get_vacation_rental_details' or 'get_vacation_rental_inventory' that retrieve different property information. It explicitly mentions what is returned ('which setup steps are complete and which are still needed'), making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by specifying the required identifiers ('lilo_code or property_id') and listing examples of setup steps, but it does not explicitly state when to use this tool versus alternatives like 'get_vacation_rental_details' or provide exclusions. The context is clear for checking onboarding status, but no explicit guidance on alternatives or prerequisites is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_vacation_rental_pricing_analysisA

Read-only

Inspect

Get competitive pricing analysis for a vacation rental market. Returns comparable property rates, market average, pricing optimization suggestions, and seasonal adjustments. Pass location (required), optional property_id for direct comparison, and optional bedrooms count.

ParametersJSON Schema

Name	Required	Description
`bedrooms`	No	Number of bedrooms
`location`	Yes	Location to analyze
`property_id`	No	Property UUID for comparison

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds valuable context by detailing the return content (comparable property rates, market average, suggestions, seasonal adjustments) and specifying that location is required, which enhances transparency beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by return details and parameter guidance in two efficient sentences. Every sentence adds value without redundancy, making it appropriately sized and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (pricing analysis with multiple outputs), annotations cover safety, and schema covers parameters, the description is mostly complete. However, without an output schema, it could benefit from more detail on return format or limitations, but it adequately describes the tool's function and outputs.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents the three parameters. The description adds minimal semantics by noting that location is required and property_id is for direct comparison, but this mostly repeats schema information. Baseline 3 is appropriate as the schema handles the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Get competitive pricing analysis') and resources ('vacation rental market'), and distinguishes it from siblings by focusing on pricing analysis rather than risk assessment, availability checking, or other functions listed in the sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by specifying required and optional parameters, but does not explicitly state when to use this tool versus alternatives like 'check_vacation_rental_availability_and_pricing' or 'search_vacation_rental_market'. It provides basic context but lacks explicit guidance on tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_vacation_rental_reputation_scoreA

Read-only

Inspect

Get the reputation score and performance data for a specific vacation rental property. Returns dispute win rate, protection statistics, host response times, overall reputation score, and a narrative summary. Use this to evaluate a property's track record before recommending it.

ParametersJSON Schema

Name	Required	Description	Default
`property_id`	Yes	Property UUID or lilo_code

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and destructiveHint=false, which the description aligns with by describing a retrieval operation. The description adds valuable context beyond annotations by specifying the return data types (e.g., dispute win rate, protection statistics, host response times) and the narrative summary, which helps the agent understand the output structure and behavioral traits not covered by annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized with two sentences: the first states the purpose and return data, and the second provides usage guidance. Every sentence earns its place by adding clarity and context without redundancy, making it front-loaded and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (retrieving reputation data), annotations cover safety, and schema covers parameters fully, the description adds necessary context like return data types and usage guidance. However, without an output schema, it could benefit from more detail on output structure, but it's largely complete for a read-only tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the parameter 'property_id' documented as 'Property UUID or lilo_code'. The description does not add further meaning beyond this, such as examples or format details, so it meets the baseline of 3 where the schema does the heavy lifting without extra value from the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and resource 'reputation score and performance data for a specific vacation rental property', making the purpose specific. It distinguishes from siblings like 'get_vacation_rental_details' by focusing on reputation metrics rather than general property information, and from 'get_vacation_rental_host_reputation' by targeting property-level data instead of host-level.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for usage with 'Use this to evaluate a property's track record before recommending it', which implicitly guides when to use it. However, it does not explicitly state when not to use it or name alternatives among siblings, such as when to prefer 'get_vacation_rental_details' for basic info instead.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_vacation_rental_trust_certificateA

Read-only

Inspect

Access the continuously maintained trust certificate for a lilo-protected vacation rental property. Includes verification data for independent validation of the property's protection status, evidence integrity, and host reputation. Pass lilo_code (e.g. PROP-6408).

ParametersJSON Schema

Name	Required	Description	Default
`lilo_code`	Yes	Property lilo code (e.g. PROP-6408)

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds valuable behavioral context beyond annotations by specifying 'continuously maintained trust certificate' and 'includes verification data for independent validation', which helps the agent understand the nature and freshness of the data returned.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states the core purpose and value, the second provides parameter guidance. Every phrase adds value without redundancy, and it's appropriately front-loaded with the main functionality.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter read-only tool with good annotations and full schema coverage, the description provides adequate context. It explains what the tool returns (trust certificate with verification data) and the parameter requirement. The main gap is the lack of output schema, but the description compensates reasonably by describing the certificate's contents.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with the parameter fully documented in the schema. The description adds minimal value beyond the schema by providing an example format ('e.g. PROP-6408'), which is already present in the schema description. This meets the baseline expectation when schema coverage is complete.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('access', 'includes verification data') and resources ('trust certificate for a lilo-protected vacation rental property'). It distinguishes from siblings like 'check_vacation_rental_protection_status' by focusing on certificate retrieval with verification data rather than just status checking.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context by mentioning 'lilo-protected vacation rental property' and the required parameter, but doesn't explicitly state when to use this tool versus alternatives like 'check_vacation_rental_protection_status' or 'verify_vacation_rental_trust_chain'. It provides basic parameter guidance but lacks explicit comparison with sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ingest_philadelphia_public_recordsA

Destructive

Inspect

PREMIUM: Full Philadelphia data ingestion from public records. Ingests landmarks, historic religious properties, and African American historic sites into the lilo system. Pass source: all, landmarks, religious, or aahs.

ParametersJSON Schema

Name	Required	Description	Default
`source`	No	Data source to ingest (default: all)

Tool Definition Quality

A3.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint=false and destructiveHint=true, indicating this is a write operation with potential destructive effects. The description adds valuable context by specifying what gets ingested (landmarks, religious properties, AAHS sites) and the target system (lilo), which goes beyond the annotations. However, it doesn't mention rate limits, authentication requirements, or what 'destructive' specifically means in this context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: first establishes scope and purpose, second provides parameter guidance. Every phrase earns its place, though 'PREMIUM:' could be more clearly integrated. It's appropriately sized for a single-parameter tool with clear behavioral implications.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a destructive write operation with no output schema, the description provides adequate context about what gets ingested and where. However, it doesn't explain what 'ingestion' entails operationally, what format the data takes in the lilo system, or what confirmation/response to expect. Given the destructive annotation and complexity of data ingestion, more behavioral detail would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with the single parameter 'source' fully documented in the schema. The description adds minimal value by listing the enum values in natural language ('all, landmarks, religious, or aahs'), but doesn't provide additional semantics about what each source contains or when to choose specific options. Baseline 3 is appropriate since the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Full Philadelphia data ingestion from public records' with specific resource types (landmarks, historic religious properties, African American historic sites) and target system (lilo). It distinguishes itself from siblings like 'get_philadelphia_landmark_details' by focusing on ingestion rather than retrieval. However, it doesn't explicitly contrast with 'search_philadelphia_historic_properties' which might overlap in scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for ingesting specific Philadelphia public record categories into the lilo system, but provides no explicit guidance on when to use this tool versus alternatives like 'search_philadelphia_historic_properties' or 'get_philadelphia_landmark_details'. It mentions 'PREMIUM' which might indicate access level, but doesn't clarify prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

predict_booking_chargeback_probabilityA

Read-only

Inspect

Predict the probability of a chargeback (payment dispute) for a vacation rental booking. Returns risk score, key risk factors, and specific prevention recommendations. Pass booking_id (UUID), optional amount in cents, and optional guest_profile.

ParametersJSON Schema

Name	Required	Description
`amount`	No	Booking amount in cents
`booking_id`	Yes	Booking UUID
`guest_profile`	No	Guest profile data

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds valuable context by specifying the return format ('risk score, key risk factors, and specific prevention recommendations'), which goes beyond the annotations. However, it does not mention potential limitations like rate limits or authentication requirements.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states the purpose and output, the second specifies the input parameters. Every sentence adds essential information with zero wasted words, making it easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of a prediction tool with no output schema, the description provides a good overview of what the tool does and returns. However, it could be more complete by detailing the format of the output (e.g., score range, factor structure) or mentioning any dependencies, though annotations cover the safety profile adequately.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters (booking_id, amount, guest_profile). The description adds minimal value by mentioning these parameters but does not provide additional semantics like format examples or usage guidance beyond what the schema states.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('predict'), resource ('chargeback probability for a vacation rental booking'), and output details ('risk score, key risk factors, and specific prevention recommendations'). It distinguishes itself from sibling tools like 'assess_vacation_rental_booking_risk' by focusing specifically on chargeback prediction rather than general risk assessment.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through the mention of 'chargeback (payment dispute)' and 'vacation rental booking', but does not explicitly state when to use this tool versus alternatives like 'get_chargeback_defense_for_booking' or 'assess_vacation_rental_booking_risk'. No explicit exclusions or prerequisites are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

query_vacation_rental_evidence_chainA

Read-only

Inspect

Query the evidence chain for a vacation rental property or specific booking. Returns independently verified evidence records filtered by property_id, booking_id, or event_type. Use this to audit the complete evidence trail.

ParametersJSON Schema

Name	Required	Description
`limit`	No	Max results (default 50)
`booking_id`	No	Booking UUID (optional)
`event_type`	No	Filter by event type
`property_id`	No	Property UUID

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and destructiveHint=false, covering safety aspects. The description adds valuable context beyond annotations: it specifies that returns are 'independently verified evidence records' and mentions an 'audit' purpose, which helps the agent understand the tool's role in verification and compliance without contradicting the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with the core purpose and key parameters, followed by the audit context. Every sentence adds value without redundancy, making it efficient and well-structured for quick comprehension by an AI agent.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity as a query with filtering options, annotations cover safety, and schema fully describes parameters, the description is mostly complete. However, the lack of an output schema means the description does not detail return values (e.g., structure of evidence records), leaving a minor gap in full contextual understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear parameter descriptions in the input schema. The description adds minimal semantic value by listing filter options ('property_id, booking_id, or event_type'), but does not provide additional details like format examples or usage tips beyond what the schema already documents, meeting the baseline for high coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Query the evidence chain') and resource ('vacation rental property or specific booking'), distinguishing it from sibling tools like 'get_evidence_timeline_for_rental' by emphasizing 'independently verified evidence records' and 'audit the complete evidence trail', which suggests a broader or more detailed scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for auditing evidence trails, but does not explicitly state when to use this tool versus alternatives like 'get_evidence_timeline_for_rental' or 'verify_vacation_rental_evidence_record'. It provides some context ('filtered by property_id, booking_id, or event_type') but lacks explicit exclusions or comparisons to sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

record_guest_interaction_to_evidenceA

Destructive

Inspect

Record a guest interaction to the vacation rental's evidence chain. Creates a verified evidence record of the interaction. Pass property_id, interaction_type (inquiry, complaint, request, confirmation), content text, and channel (mcp, voice, sms, email). Optional: booking_id.

ParametersJSON Schema

Name	Required	Description
`channel`	No	Channel: mcp, voice, sms, email
`content`	Yes	Interaction content/summary
`booking_id`	No	Booking UUID (optional)
`property_id`	Yes	Property UUID
`interaction_type`	Yes	Type: inquiry, complaint, request, confirmation

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint=false and destructiveHint=true, indicating a mutation operation with potential data impact. The description adds value by specifying that it 'creates a verified evidence record', clarifying the creation behavior beyond the annotations. However, it doesn't detail side effects, permissions needed, or rate limits, keeping it from a perfect score.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences: the first states the purpose and action, the second lists parameters efficiently. It's front-loaded with the core functionality and avoids redundancy, making every sentence earn its place without waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (mutation with destructive hint), annotations cover safety aspects, and schema fully documents parameters. The description adds context about creating verified evidence, but lacks output details (no output schema) and doesn't fully address behavioral nuances like error handling or confirmation messages, leaving minor gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with all parameters well-documented in the input schema. The description lists parameters (property_id, interaction_type, content, channel, booking_id) but doesn't add significant meaning beyond the schema, such as examples or constraints. Baseline 3 is appropriate since the schema carries the primary documentation burden.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Record a guest interaction to the vacation rental's evidence chain') and resource ('creates a verified evidence record'), distinguishing it from sibling tools like 'query_vacation_rental_evidence_chain' (which queries) and 'verify_vacation_rental_evidence_record' (which verifies). It explicitly identifies the tool as a creation/writing operation for evidence records.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through parameter requirements (e.g., property_id, interaction_type) but doesn't explicitly state when to use this tool versus alternatives. It doesn't mention prerequisites, exclusions, or compare with siblings like 'detect_guest_communication_risk' or 'report_rental_inventory_issue', leaving usage guidance incomplete.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

report_rental_inventory_issueA

Destructive

Inspect

Report a missing, damaged, or low-stock inventory item at a vacation rental property. Creates an issue record with optional photo evidence. Pass property_id, item_name, and issue_type (missing, damaged, low_stock, needs_replacement). Optional: description, booking_id, photo_url.

ParametersJSON Schema

Name	Required	Description
`item_name`	Yes	Name of the inventory item
`photo_url`	No	URL to photo evidence (optional)
`booking_id`	No	Associated booking UUID (optional)
`issue_type`	Yes	Type: missing, damaged, low_stock, needs_replacement
`description`	No	Description of the issue
`property_id`	Yes	Property UUID or lilo_code

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate destructiveHint=true and readOnlyHint=false, which the description aligns with by stating 'Creates an issue record'. The description adds valuable context beyond annotations: it specifies the record includes optional photo evidence and lists the exact issue_type values, providing behavioral details not captured in structured fields.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states purpose and core functionality, the second lists parameters with clear required/optional distinctions. Every element serves a purpose with zero redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a mutation tool with destructiveHint=true and no output schema, the description provides adequate context: it explains what gets created (issue record), includes parameter guidance, and aligns with annotations. However, it doesn't mention potential side effects, confirmation requirements, or what the tool returns, leaving some behavioral gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema already documents all 6 parameters thoroughly. The description repeats parameter names and optionality but doesn't add meaningful semantic context beyond what's in the schema (e.g., explaining relationships between parameters or usage patterns).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Report a missing, damaged, or low-stock inventory item') and resource ('at a vacation rental property'), distinguishing it from sibling tools like 'create_rental_maintenance_task' or 'get_vacation_rental_inventory'. It precisely defines the tool's scope as creating issue records for inventory problems.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context by specifying the types of inventory issues (missing, damaged, low_stock, needs_replacement) but doesn't explicitly state when to use this tool versus alternatives like 'create_rental_maintenance_task' for non-inventory issues. No exclusions or prerequisites are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

screen_guest_before_bookingA

Read-only

Inspect

Pre-booking guest risk assessment for vacation rental hosts. Evaluates guest profile, booking details, and communication patterns to provide a risk level (low/medium/high/critical) with specific recommendations. Helps hosts decide whether to accept a booking request. Pass guest_email, guest_phone, guest_name, message_text, and/or booking_details.

ParametersJSON Schema

Name	Required	Description
`guest_name`	No	Guest name
`guest_email`	No	Guest email address
`guest_phone`	No	Guest phone number
`message_text`	No	Initial message from guest
`booking_details`	No	Booking details

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds useful context about what the tool evaluates (guest profile, booking details, communication patterns) and the output format (risk level with recommendations), which goes beyond annotations. However, it doesn't mention rate limits, authentication needs, or data sources.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in three sentences: purpose, evaluation scope, and usage guidance. Every sentence adds value with zero wasted words, and it's front-loaded with the core purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (risk assessment with multiple inputs) and lack of output schema, the description adequately covers the purpose, inputs, and output format (risk level with recommendations). However, it could better explain the risk assessment methodology or data sources for a more complete picture, though annotations provide safety context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are well-documented in the schema. The description lists the parameters ('Pass guest_email, guest_phone, guest_name, message_text, and/or booking_details') but doesn't add meaningful semantics beyond what the schema already provides about each parameter's purpose and format.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Pre-booking guest risk assessment for vacation rental hosts' with specific actions ('Evaluates guest profile, booking details, and communication patterns') and outcome ('provide a risk level with specific recommendations'). It distinguishes from siblings like 'analyze_booking_threat_risk' by focusing on pre-booking screening rather than general threat analysis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context: 'Helps hosts decide whether to accept a booking request.' It implies usage timing (pre-booking) but doesn't explicitly state when NOT to use it or name specific alternatives among the many sibling tools, though the purpose differentiation helps.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_philadelphia_event_venuesA

Read-only

Inspect

Search Philadelphia event and wedding venues. Returns venue details including capacity, type, availability, and booking information. Filter by venue_type (wedding, event, historic, all) and minimum capacity.

ParametersJSON Schema

Name	Required	Description
`query`	No	Search query
`venue_type`	No	Venue type filter
`capacity_min`	No	Minimum capacity requirement

Tool Definition Quality

A3.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds context about what is returned (venue details including capacity, type, availability, booking information) and filtering capabilities, which is useful but doesn't disclose rate limits, auth needs, or other behavioral traits beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded with the core purpose in the first sentence, followed by return details and filter criteria. Every sentence earns its place with no wasted words, making it efficient and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (search with filters), rich annotations (readOnlyHint, destructiveHint), and 100% schema coverage, the description is mostly complete. It explains what is returned and filtering options, but lacks output schema details (e.g., format of venue details), which is a minor gap since no output schema exists.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters (query, venue_type, capacity_min) with descriptions. The description adds marginal value by mentioning the venue_type enum values and capacity_min purpose, but doesn't provide additional syntax or format details beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Search Philadelphia event and wedding venues') and resources ('venues'), and distinguishes it from siblings by focusing on event/wedding venues rather than vacation rentals or other Philadelphia-related tools like 'search_philadelphia_historic_properties' or 'get_philadelphia_landmark_details'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by mentioning filter criteria (venue_type, capacity_min), but does not explicitly state when to use this tool versus alternatives like 'search_philadelphia_historic_properties' or other venue-related tools. It provides context for filtering but lacks explicit guidance on tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_philadelphia_historic_propertiesA

Read-only

Inspect

Search Philadelphia historic properties and landmarks from public records. Useful for World Cup 2026 and America's 250th Anniversary (2026) planning. Filter by type: landmark, historic_religious, african_american, or all.

ParametersJSON Schema

Name	Required	Description
`type`	No	Property type filter
`limit`	No	Max results (default 20)
`query`	No	Search query (property name, address, etc.)

Tool Definition Quality

A3.6/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds useful context about the data source ('public records') and specific use cases (World Cup 2026, America's 250th Anniversary), which enhances understanding. However, it doesn't disclose behavioral traits like rate limits, authentication needs, or pagination behavior beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loaded: it starts with the core purpose, then adds context (use cases), and ends with parameter guidance. Both sentences earn their place by providing essential information without redundancy. However, the second sentence could be slightly more streamlined by integrating the use cases more seamlessly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (3 parameters, no output schema), the description is fairly complete. It covers purpose, use cases, and parameter options, and annotations handle safety. The main gap is the lack of output schema, so the description doesn't explain return values, but this is mitigated by the tool being a straightforward search operation. It could benefit from more behavioral details like result format or limitations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the input schema already documents all three parameters (type, limit, query) with descriptions. The description adds value by listing the enum values for 'type' ('landmark, historic_religious, african_american, or all'), which clarifies options, but doesn't provide additional syntax or format details beyond what the schema offers. Baseline 3 is appropriate when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Search Philadelphia historic properties and landmarks from public records.' It specifies the verb ('search'), resource ('historic properties and landmarks'), and source ('public records'). However, it doesn't explicitly differentiate from sibling tools like 'get_philadelphia_landmark_details' or 'search_philadelphia_event_venues', which are related but not identical.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: 'Useful for World Cup 2026 and America's 250th Anniversary (2026) planning.' This gives specific scenarios where the tool is applicable. However, it doesn't explicitly state when not to use it or name alternatives among sibling tools, such as 'get_philadelphia_landmark_details' for detailed info on a specific landmark.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_vacation_rental_marketA

Read-only

Inspect

Search lilo's market discovery for vacation rental properties in any US location. Filter by price range, bedrooms, superhost status, and World Cup 2026 host cities. Returns property listings including non-activated properties for market research.

ParametersJSON Schema

Name	Required	Description
`limit`	No	Max results (default 20)
`location`	Yes	City, state, or neighborhood to search
`max_price`	No	Maximum nightly price
`min_price`	No	Minimum nightly price
`min_bedrooms`	No	Minimum bedrooms
`superhost_only`	No	Only show superhosts
`world_cup_city`	No	Only World Cup 2026 host cities

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds valuable behavioral context beyond annotations: it specifies that results include 'non-activated properties for market research,' which is crucial for understanding the tool's behavior. It does not mention rate limits or authentication needs, but with annotations covering safety, this is sufficient for a high score.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with the core purpose and followed by key features and context. Every word earns its place, with no redundancy or fluff, making it highly efficient and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, no output schema), the description is mostly complete. It clarifies the tool's purpose, usage context, and key behavioral aspects. However, without an output schema, it could benefit from more detail on return values (e.g., listing format), though the mention of 'property listings' provides some guidance. The annotations help cover safety, making this adequate but not perfect.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, so all parameters are documented in the schema. The description mentions filtering by 'price range, bedrooms, superhost status, and World Cup 2026 host cities,' which aligns with parameters but does not add significant semantic detail beyond what the schema provides. This meets the baseline of 3 for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Search lilo's market discovery for vacation rental properties in any US location.' It specifies the exact resource (vacation rental properties) and scope (US locations), and distinguishes itself from sibling tools like 'search_vacation_rentals_by_location' by emphasizing market research and including non-activated properties.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for usage: 'for market research' and mentions filtering capabilities. However, it does not explicitly state when to use this tool versus alternatives like 'search_vacation_rentals_by_location' or 'find_similar_vacation_rentals', which would be needed for a perfect score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_vacation_rentalsA

Read-only

Inspect

Search for vacation rental properties with structured filtering. Returns results optimized for AI deep research with id, title, url, snippet, pricing, and property attributes. Use this for broad property discovery. Supports filtering by city, state, bedrooms, price, and pet-friendliness.

ParametersJSON Schema

Name	Required	Description
`city`	No	Filter by city name
`limit`	No	Max results to return (default 10, max 50)
`query`	Yes	Search query (e.g. 'beach house in Delaware', 'family rental near Philadelphia')
`state`	No	Filter by state name or abbreviation
`max_price`	No	Maximum nightly rate in USD
`min_bedrooms`	No	Minimum number of bedrooms
`pet_friendly`	No	Filter for pet-friendly properties only

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds useful context about the return format ('optimized for AI deep research with id, title, url, snippet, pricing, and property attributes') and filtering capabilities, but does not disclose behavioral traits like rate limits, pagination, or error handling beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in three sentences: first states the purpose and return format, second provides usage guidance, third lists key filters. Every sentence adds value with zero waste, making it front-loaded and appropriately sized for the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (7 parameters, no output schema), the description is mostly complete. It covers purpose, usage, return format, and key filters, but lacks details on behavioral aspects like result limits or optimization specifics. With annotations covering safety, it provides adequate context, though could be more comprehensive for a search tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents all 7 parameters. The description lists the supported filters ('city, state, bedrooms, price, and pet-friendliness'), which aligns with the schema but does not add meaningful semantics beyond it. The baseline is 3 when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with a specific verb ('Search') and resource ('vacation rental properties'), and distinguishes it from siblings by mentioning it's for 'broad property discovery' rather than detailed analysis or booking. This differentiates it from tools like 'fetch_vacation_rental_details' or 'book_vacation_rental_direct'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('for broad property discovery'), but does not explicitly state when not to use it or name specific alternatives. It implies usage for initial search rather than detailed analysis, but lacks explicit exclusions or comparisons to similar tools like 'search_vacation_rentals_by_amenities'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_vacation_rentals_by_amenitiesA

Read-only

Inspect

Search for vacation rentals by amenity description using natural language. Examples: 'pool and hot tub', 'pet-friendly with fenced yard', 'EV charger and garage'. Pass the amenity query and optional location filter.

ParametersJSON Schema

Name	Required	Description
`limit`	No	Max results (default 10)
`query`	Yes	Natural language amenity query
`location`	No	Location to search

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds useful context about natural language querying and examples, but does not disclose behavioral traits like rate limits, authentication needs, or result format details. With annotations covering safety, a 3 is appropriate as the description adds some value without rich behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, with the core purpose stated first, followed by examples and parameter guidance. Every sentence earns its place by providing essential information without redundancy, making it efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (search with natural language input), rich annotations (readOnlyHint, destructiveHint), and 100% schema coverage, the description is mostly complete. It explains the natural language aspect and provides examples, but lacks output details (no output schema) and could mention result format or limitations. It's adequate but has minor gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters (query, limit, location). The description mentions the query and location parameters but does not add significant meaning beyond what the schema provides, such as syntax details or usage nuances. Baseline 3 is correct when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with a specific verb ('Search'), resource ('vacation rentals'), and method ('by amenity description using natural language'). It distinguishes itself from sibling tools like 'search_vacation_rentals_by_location' and 'search_vacation_rentals_by_description' by focusing on amenities, providing clear differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('Search for vacation rentals by amenity description using natural language') and includes examples to illustrate appropriate queries. However, it does not explicitly state when not to use it or name specific alternatives among the many sibling tools, though the amenity focus implies differentiation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_vacation_rentals_by_descriptionA

Read-only

Inspect

Search for vacation rentals using a natural language description. Examples: 'romantic beachfront getaway with hot tub', 'family-friendly house with pool near Disney', 'quiet cabin in the mountains for a writers retreat'. Finds matching properties by semantic meaning, not just keywords. Pass the description as 'query'. Optional: threshold (how strict result matching should be) and limit (max results, default 10).

ParametersJSON Schema

Name	Required	Description
`limit`	No	Max results (default 10)
`query`	Yes	Natural language description of what you're looking for
`threshold`	No	Match strictness (higher = stricter)

Tool Definition Quality

A3.6/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already mark the tool as read-only and non-destructive. The description adds that search uses semantic meaning rather than keywords, but does not disclose further behavioral traits like error handling or result format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with no wasted words. It leads with purpose, includes examples, and parameter explanations in a clear structure.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a search tool with 3 parameters and no output schema, the description covers essential usage but lacks details on threshold range, pagination, or empty result behavior. It is adequate but not comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage, so baseline is 3. The description adds context to parameters: explains threshold as 'how strict' and states limit default as 10, providing value beyond schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool searches for vacation rentals using natural language description, with helpful examples. It distinguishes from keyword search but does not explicitly differentiate from sibling tools like search_vacation_rentals or search_vacation_rentals_by_amenities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description does not provide explicit guidance on when to use this tool versus siblings. Usage is implied for natural language queries, but no exclusions or alternatives are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_vacation_rentals_by_locationA

Read-only

Inspect

Search for vacation rentals, short-term rentals, and accommodation properties by location, guest count, and property type. Use this when a traveler wants to find a place to stay. Returns matching properties with names, locations, nightly rates, photos, and protection status.

ParametersJSON Schema

Name	Required	Description
`limit`	No	Max results to return (default 10)
`location`	Yes	Location to search (city, state, or address)
`property_type`	No	Property type filter (entire_home, private_room, etc.)
`verified_only`	No	Only verified properties (default true)

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds useful behavioral context by specifying the return format ('matching properties with names, locations, nightly rates, photos, and protection status'), but doesn't mention potential limitations like pagination, rate limits, or authentication requirements beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first explains the search functionality and usage context, the second details the return format. Every phrase adds value without redundancy, making it easy to parse and understand quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a search tool with good annotations (read-only, non-destructive) and full schema coverage, the description provides adequate context about what it does and returns. However, without an output schema, it could benefit from more detail about response structure or error handling, though the return format description partially compensates for this gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so all parameters are documented in the schema. The description mentions 'guest count' as a search parameter, but this isn't reflected in the input schema, creating a minor inconsistency. Otherwise, it adds little semantic value beyond what the schema already provides, meeting the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Search for vacation rentals, short-term rentals, and accommodation properties') and resources involved, with explicit parameters ('by location, guest count, and property type'). It distinguishes from sibling tools like 'search_vacation_rentals_by_amenities' and 'search_vacation_rentals_by_description' by emphasizing location-based searching.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for usage ('when a traveler wants to find a place to stay'), which helps differentiate from risk analysis or booking tools. However, it doesn't explicitly state when NOT to use it or name specific alternatives among the many sibling tools, such as when searching by amenities instead of location.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_world_cup_rentalsA

Read-only

Inspect

Search for vacation rentals in FIFA World Cup 2026 host cities. Returns properties with stadium proximity information, match schedules, and expected demand surge data. Use this when a traveler is looking for accommodation for World Cup 2026 matches. Supports 11 US host cities: Miami, New York, Los Angeles, Dallas, Philadelphia, Atlanta, Houston, Seattle, San Francisco, Boston, Kansas City.

ParametersJSON Schema

Name	Required	Description
`city`	Yes	World Cup 2026 host city (e.g. 'Miami', 'Philadelphia')
`limit`	No	Max results to return (default 10, max 50)
`check_in`	No	Check-in date (YYYY-MM-DD format, tournament runs Jun 11 - Jul 19, 2026)
`check_out`	No	Check-out date (YYYY-MM-DD format)
`group_size`	No	Number of guests in the group

Tool Definition Quality

A4.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read operation. The description adds useful context about what data is returned (stadium proximity, match schedules, demand surge) and the specific scope (11 US host cities, tournament dates). However, it doesn't describe pagination behavior, error conditions, or rate limits. With annotations covering the safety profile, a 3 is appropriate - the description adds some behavioral context but not comprehensive operational details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in three sentences: first states purpose and returns, second provides usage guidance, third specifies scope. Every sentence adds value - no redundancy or wasted words. It's appropriately sized for the tool's complexity and front-loads the most important information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a search tool with good annotations (readOnlyHint=true, destructiveHint=false) and 100% schema coverage, the description provides excellent context about the specialized World Cup focus, specific data returns, usage scenario, and geographic scope. The main gap is the lack of output schema, but the description compensates by detailing what information will be returned. It's nearly complete for this type of tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so all parameters are well-documented in the input schema. The description adds context about the tournament timeframe ('tournament runs Jun 11 - Jul 19, 2026') which helps interpret date parameters, and lists the 11 valid cities. However, it doesn't provide additional parameter semantics beyond what's already in the schema descriptions. Baseline 3 is correct when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Search for vacation rentals'), target resource ('in FIFA World Cup 2026 host cities'), and key differentiators ('stadium proximity information, match schedules, and expected demand surge data'). It distinguishes this from generic rental search tools by focusing on World Cup-specific features and explicitly listing the 11 supported host cities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool ('when a traveler is looking for accommodation for World Cup 2026 matches'). It implicitly distinguishes from sibling tools like 'search_vacation_rentals' or 'search_vacation_rentals_by_location' by specifying the World Cup context, host cities, and tournament-specific data returns.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verify_evidence_anchor_integrityA

Read-only

Inspect

Verify the integrity and authenticity of a specific evidence anchor record. Confirms the evidence is tamper-proof and independently verifiable. Pass evidence_id (UUID) or evidence_hash.

ParametersJSON Schema

Name	Required	Description	Default
`evidence_id`	No	Evidence UUID
`evidence_hash`	No	Evidence hash (alternative to evidence_id)

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint=true and destructiveHint=false, indicating this is a safe read operation. The description adds valuable behavioral context beyond annotations by specifying what the verification confirms ('tamper-proof and independently verifiable') and that it accepts alternative identifiers (evidence_id or evidence_hash). It doesn't describe rate limits or authentication requirements, but adds meaningful operational context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise with two sentences that each earn their place. The first sentence establishes the core purpose, while the second provides essential parameter guidance. There's no wasted language, repetition, or unnecessary elaboration, making it highly efficient and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (verification operation with two parameters), the description provides adequate context. With annotations covering safety (read-only, non-destructive) and 100% schema coverage for parameters, the description adds necessary purpose and behavioral context. The main gap is the lack of output schema, but the description compensates by explaining what the verification confirms. For a verification tool, this is reasonably complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema already documents both parameters (evidence_id as UUID, evidence_hash as alternative). The description adds minimal semantic value by mentioning these are alternatives ('Pass evidence_id (UUID) or evidence_hash'), but doesn't provide additional details about format requirements or usage scenarios beyond what the schema provides. This meets the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('verify integrity and authenticity', 'confirms tamper-proof and independently verifiable') and identifies the resource ('evidence anchor record'). It distinguishes itself from sibling tools like 'verify_vacation_rental_evidence_record' and 'query_vacation_rental_evidence_chain' by focusing specifically on integrity verification rather than general evidence checking or chain querying.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through the phrase 'verify integrity and authenticity of a specific evidence anchor record', suggesting it should be used when tamper-proof verification is needed. However, it doesn't explicitly state when to use this tool versus alternatives like 'verify_vacation_rental_evidence_record' or provide clear exclusions. The guidance is present but not explicit about alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verify_guest_identity_for_check_inA

Destructive

Inspect

Create a secure guest verification and check-in link for a confirmed vacation rental booking. Verifies guest identity via phone, presents house rules for acknowledgment, records consent with verified evidence, and provides access codes upon agreement. Returns a unique handshake link.

ParametersJSON Schema

Name	Required	Description
`guest_name`	Yes	Name of the guest
`guest_email`	No	Guest email address
`guest_phone`	No	Guest phone for verification
`property_id`	Yes	Property UUID or lilo_code
`check_in_date`	Yes	Check-in date (YYYY-MM-DD)
`check_out_date`	Yes	Check-out date (YYYY-MM-DD)

Tool Definition Quality

A3.9/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=false and destructiveHint=true, suggesting a write operation with potential data changes. The description adds valuable behavioral context beyond annotations: it specifies that the tool creates a link, verifies identity via phone, presents house rules, records consent with evidence, and provides access codes. This clarifies the tool's multi-step process and output (a unique handshake link), though it doesn't detail rate limits or auth needs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, starting with the core action ('Create a secure guest verification and check-in link') and following with key steps. Every sentence adds value, but it could be slightly more concise by combining some clauses without losing clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multi-step verification process) and annotations (destructiveHint=true), the description is mostly complete. It explains the tool's behavior and output (unique handshake link), but lacks an output schema, so return values are only briefly mentioned. With good annotations and schema coverage, it provides sufficient context for agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are well-documented in the schema. The description does not add specific parameter semantics beyond what the schema provides (e.g., it doesn't explain how guest_phone is used for verification or format details). Baseline 3 is appropriate as the schema handles parameter documentation adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('create', 'verify', 'present', 'record', 'provide') and resources ('secure guest verification and check-in link', 'confirmed vacation rental booking'). It distinguishes itself from sibling tools by focusing on guest verification and check-in link generation, unlike analysis, booking, or search tools in the sibling list.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for guest verification and check-in processes in vacation rental contexts, but does not explicitly state when to use this tool versus alternatives (e.g., other verification or check-in tools). It mentions 'confirmed vacation rental booking' as a prerequisite, but lacks explicit exclusions or comparisons to sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verify_rental_checkout_conditionA

Destructive

Inspect

Record checkout verification for a vacation rental with checklist completion and photo evidence. Used by cleaners or hosts to document property condition after a guest departure. Pass property_id, booking_id, overall_condition (excellent/good/fair/poor/damaged), optional checklist_items, issues_found, photo_urls, and verified_by.

ParametersJSON Schema

Name	Required	Description
`booking_id`	Yes	Booking UUID
`photo_urls`	No	Photo evidence URLs
`property_id`	Yes	Property UUID or lilo_code
`verified_by`	No	Name of person verifying
`issues_found`	No	List of issues found
`checklist_items`	No	Completed checklist items
`overall_condition`	Yes	Overall condition: excellent, good, fair, poor, damaged

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate destructiveHint=true and readOnlyHint=false, which the description aligns with by implying data creation/recording. The description adds valuable context about who uses it (cleaners/hosts) and the purpose (documentation after departure), which goes beyond annotations. It doesn't mention side effects like notifications or data persistence details, but provides good operational context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with zero waste: first states purpose, second provides usage context, third enumerates parameters efficiently. Every sentence earns its place by adding distinct value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a mutation tool with destructiveHint=true and no output schema, the description provides good operational context (who, when, what). It could be more complete by mentioning potential side effects or response format, but covers the essential usage scenario adequately given the annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are fully documented in the schema. The description lists all parameters but doesn't add meaningful semantic context beyond what's in schema descriptions (e.g., explaining relationships between parameters or business logic). Baseline 3 is appropriate when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Record checkout verification') with the resource ('vacation rental') and scope ('with checklist completion and photo evidence'). It distinguishes from siblings by focusing on post-departure documentation rather than risk analysis, booking, or other operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use it ('Used by cleaners or hosts to document property condition after a guest departure'), providing clear context. However, it doesn't specify when not to use it or name alternative tools for similar documentation tasks, which prevents a perfect score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verify_vacation_rental_evidence_recordA

Read-only

Inspect

Verify the authenticity and integrity of a specific evidence record. Confirms the evidence has not been tampered with, existed at the claimed timestamp, and is independently verifiable. Pass the evidence_id (UUID).

ParametersJSON Schema

Name	Required	Description	Default
`evidence_id`	Yes	Evidence record UUID

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds valuable context beyond annotations by specifying what verification entails: checking for tampering, timestamp validity, and independent verifiability. It does not contradict annotations, as 'verify' aligns with read-only behavior. However, it lacks details on rate limits, authentication needs, or output format, leaving some behavioral aspects uncovered.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by a concise instruction for parameter usage. Both sentences earn their place: the first defines the tool's function, and the second specifies the required input. There is no redundant or verbose language, making it efficiently structured and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (verification with integrity checks), annotations cover safety (read-only, non-destructive), and schema fully describes the single parameter. The description adds context on what verification entails, which is helpful. However, without an output schema, it does not explain return values (e.g., success/failure, verification details), leaving a minor gap. Overall, it is mostly complete but could benefit from output information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the single parameter 'evidence_id' documented as 'Evidence record UUID.' The description adds minimal semantics by reiterating 'Pass the evidence_id (UUID),' which does not provide additional meaning beyond the schema. Since schema coverage is high, the baseline score of 3 is appropriate, as the description does not compensate with extra details like format examples or constraints.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific verb ('verify') and resource ('evidence record'), with detailed purpose: 'authenticity and integrity,' 'not been tampered with,' 'existed at the claimed timestamp,' and 'independently verifiable.' It distinguishes from siblings like 'query_vacation_rental_evidence_chain' (which likely queries multiple records) and 'verify_evidence_anchor_integrity' (which may focus on anchor points rather than individual records).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by specifying to 'pass the evidence_id (UUID),' suggesting this tool is for verifying a specific record. However, it does not explicitly state when to use this tool versus alternatives (e.g., 'verify_evidence_anchor_integrity' or 'query_vacation_rental_evidence_chain'), nor does it provide exclusions or prerequisites. The guidance is limited to parameter input without broader context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verify_vacation_rental_trust_chainA

Read-only

Inspect

Independently verify a specific evidence record in a vacation rental property's trust chain. Confirms the evidence has been independently verified and is tamper-proof. Pass lilo_code (e.g. PROP-6408) and envelope_hash of the evidence record to verify.

ParametersJSON Schema

Name	Required	Description	Default
`lilo_code`	Yes	Property lilo code (e.g. PROP-6408)
`envelope_hash`	Yes	Hash of the evidence envelope to verify

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and destructiveHint=false, which the description aligns with by using 'verify' and 'confirms' (non-destructive actions). The description adds value beyond annotations by specifying that it confirms evidence is 'independently verified and tamper-proof', providing context about the verification outcome. However, it does not detail error handling or rate limits, keeping it from a perfect score.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by a concise instruction on parameter usage. Every sentence earns its place by clarifying the tool's function and how to invoke it, with no redundant or verbose language. It efficiently communicates essential information in three sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (verification with two parameters), annotations cover safety (read-only, non-destructive), and schema fully describes inputs, the description is mostly complete. It adds context about the verification outcome (tamper-proof evidence). However, without an output schema, it does not specify return values or potential errors, leaving a minor gap in completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear descriptions for both parameters (lilo_code and envelope_hash). The description adds minimal semantics by reinforcing the parameter usage ('Pass lilo_code... and envelope_hash...') but does not provide additional context beyond what the schema already covers, such as format details or examples beyond the schema's example. Baseline 3 is appropriate given high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('verify', 'confirms') and resources ('evidence record in a vacation rental property's trust chain'), distinguishing it from siblings like 'query_vacation_rental_evidence_chain' (which likely queries) and 'verify_evidence_anchor_integrity' (which focuses on anchor integrity). It explicitly mentions verifying tamper-proof evidence, making the scope distinct.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by specifying the required parameters (lilo_code and envelope_hash) for verification, but does not explicitly state when to use this tool versus alternatives like 'verify_evidence_anchor_integrity' or 'query_vacation_rental_evidence_chain'. It provides basic context but lacks explicit guidance on exclusions or comparisons with sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.

Server Details

Available Tools

Discussions

Your Connectors

Resources