HelloBooks AI Agents

Server Details

agents.hellobooks.ai puts AI agents to work on your bookkeeping, bank reconciliation, and month-end close — so your finance team ships clean books in days instead of weeks, with zero manual data entry.

Status: Healthy
Last Tested: 2026-07-20 19:34
Transport: Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

A (3.9/5.0)

Tool Descriptions: A

Average 4.3/5 across 26 of 26 tools scored. Lowest: 3/5.

Server Coherence: A

Disambiguation: 5/5

Each tool has a clearly distinct purpose, with detailed descriptions that specify the exact input and use case. Even similar tools like QBO vs Xero journal analyzers are differentiated by source system and specific checks, leaving no ambiguity.

Naming Consistency: 4/5

The majority of tools follow a verb_noun snake_case pattern (e.g., analyze_balance_sheet, list_features). A few exceptions like how_munimji_helps and feature_search deviate slightly, but overall the convention is consistent and readable.

Tool Count: 4/5

With 26 tools covering financial analysis, compliance, migration, and product info, the count is on the higher side but justified by the breadth of the accounting domain. Each tool serves a specific need without redundancy.

Completeness: 4/5

The tool set covers all major financial statement checks (balance sheet, P&L, trial balance, journal entries), compliance, migration, and product details. A notable gap is the lack of cash flow analysis, but the set is otherwise comprehensive for its scope.

Available Tools

29 tools

analyze_balance_sheet (Grade A)

Take a Balance Sheet CSV export from QuickBooks Online, Xero, Zoho Books, or Wave (source auto-detected) and run three checks: (1) bs.equation_broken — the fundamental accounting equation Assets = Liabilities + Equity does not hold (every downstream ratio analysis is invalid until fixed); (2) bs.negative_asset — Cash / AR / Inventory line items with negative balances (reconciliation error signal); (3) bs.negative_equity — Total Equity < 0 (insolvency signal). Input is raw CSV text of a Balance Sheet (Reports → Balance Sheet in QBO / Xero / Zoho / Wave). Max 5,000 rows; max 5 MB. Returns flags with severity, totals (totalAssets, totalLiabilities, totalEquity, equationBalances boolean), and a shareable URL. Use this when a user pastes a Balance Sheet and asks "does my balance sheet balance?", "is the accounting equation satisfied?", or "is my company solvent on paper?". A Balance Sheet that fails Assets = Liabilities + Equity invalidates every downstream financial-ratio analysis — this is the single most important check for any BS.

Parameters

Name	Required	Description	Default
`csvText`	Yes	Raw CSV text of a Balance Sheet report. Works with QuickBooks Online (Reports → Balance Sheet), Xero (Reports → Balance Sheet), Zoho Books (Reports → Balance Sheet), and Wave (Reports → Balance Sheet). Statement should include Total Assets, Total Liabilities, and Total Equity rows. Source is auto-detected from section name signatures.
`fileName`	No	Optional filename for the share-page label.
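
The three checks named in the description can be illustrated with a minimal Python sketch. It assumes the totals and the Cash / AR / Inventory lines have already been parsed out of the CSV. The function name `check_balance_sheet`, the 0.01 tolerance, and the flag dictionary shape are hypothetical and are not the server's actual implementation or response format.

```python
from typing import Dict, List


def check_balance_sheet(
    total_assets: float,
    total_liabilities: float,
    total_equity: float,
    asset_lines: Dict[str, float],
    tolerance: float = 0.01,  # assumed rounding tolerance, not stated in the source
) -> List[dict]:
    """Return flags for the three checks named in the tool description."""
    flags: List[dict] = []

    # (1) bs.equation_broken: Assets must equal Liabilities + Equity.
    if abs(total_assets - (total_liabilities + total_equity)) > tolerance:
        flags.append({"code": "bs.equation_broken"})

    # (2) bs.negative_asset: Cash / AR / Inventory lines with negative balances.
    for name, balance in asset_lines.items():
        if balance < 0:
            flags.append({"code": "bs.negative_asset", "account": name})

    # (3) bs.negative_equity: Total Equity below zero.
    if total_equity < 0:
        flags.append({"code": "bs.negative_equity"})

    return flags
```

Calling `check_balance_sheet(1000.0, 600.0, 300.0, {"Cash": -50.0})` would report both a broken equation and a negative asset, since 600 + 300 does not equal 1000 and Cash is below zero.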

Tool Definition Quality

A (4.3/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses auto-detection of the source, the three checks, return fields (flags, totals, URL), and constraints (max rows, max size). It lacks an explicit statement that the tool is read-only, but since it is analysis-only, the disclosure is adequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with core action and checks, followed by input details and usage prompts. While verbose, each sentence serves a purpose. Minor reduction could improve conciseness but structure is logical.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description sufficiently covers return values (flags, totals, URL) and constraints. It answers key agent questions about input format, limits, and typical use cases. Lacks details on severity levels but otherwise complete for an analysis tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds significant value beyond the schema descriptions by detailing the supported accounting software, the required row structure, and the auto-detection logic. It also clarifies that fileName is an optional label.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool takes a Balance Sheet CSV and runs three specific checks (equation broken, negative asset, negative equity). It uniquely identifies the resource and verb, and distinguishes from sibling tools like analyze_profit_loss by focusing on balance sheet analysis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly lists example user queries that should trigger this tool, such as 'does my balance sheet balance?' and 'is my company solvent on paper?'. It also warns that downstream analysis is invalid if the equation fails. However, it does not contrast the tool with siblings like analyze_trial_balance or state when not to use it, which prevents a perfect score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

analyze_journal_variance (Grade A)

Compare two periods of journal-entry data (QBO or Xero — source auto-detected from headers) and flag accounts whose movement deviates materially between periods. Aggregates lines per account into a net total for each period, then surfaces accounts where the period-over-period change crosses a materiality threshold (≥5% relative AND ≥$100 absolute; severity high at ≥50%, medium at ≥20%, low at ≥5%). Inputs are two CSV exports — periodACsv (earlier period) and periodBCsv (later period). Optional periodALabel / periodBLabel for human-readable flag messages (e.g. "Q1 FY2024" vs "Q2 FY2024"). Max 5,000 rows per period; max 5 MB each. Use this when a user pastes two periods and asks "what changed?", "show me variances", "what jumped period-over-period". Returns a flag list ordered by largest delta, a roll-up, and a shareable URL. Both periods must be the same source — mixing QBO + Xero in one call returns an error.

Parameters

Name	Required	Description
`periodACsv`	Yes	Raw CSV text of the EARLIER period's journal-entry export (QBO Journal Entries or Xero Manual Journals). Source is auto-detected from the headers.
`periodBCsv`	Yes	Raw CSV text of the LATER period's journal-entry export. Source is auto-detected from the headers (must match periodACsv).
`periodALabel`	No	Optional human label for the earlier period — e.g. "Q1 FY2024". Used in flag messages.
`periodBLabel`	No	Optional human label for the later period — e.g. "Q2 FY2024".
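
The aggregation and materiality rules in the description can be sketched in Python as follows. This is an illustration under stated assumptions, not the server's code: the relative change is assumed to be measured against the earlier period, a zero base is treated as an unbounded change, and the names `net_by_account`, `variance_severity`, and `flag_variances` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def net_by_account(lines: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Aggregate (account, signed amount) lines into a net total per account."""
    totals: Dict[str, float] = defaultdict(float)
    for account, amount in lines:
        totals[account] += amount
    return dict(totals)


def variance_severity(period_a: float, period_b: float) -> Optional[str]:
    """Apply the thresholds: >=5% relative AND >=$100 absolute to be flagged."""
    delta = period_b - period_a
    if abs(delta) < 100:
        return None
    # Assumption: relative change uses the earlier period as the base.
    relative = abs(delta) / abs(period_a) if period_a != 0 else float("inf")
    if relative >= 0.50:
        return "high"
    if relative >= 0.20:
        return "medium"
    if relative >= 0.05:
        return "low"
    return None


def flag_variances(lines_a, lines_b) -> List[Tuple[str, float, str]]:
    """Return (account, delta, severity) tuples ordered by largest absolute delta."""
    a, b = net_by_account(lines_a), net_by_account(lines_b)
    flags = []
    for account in sorted(set(a) | set(b)):
        before, after = a.get(account, 0.0), b.get(account, 0.0)
        severity = variance_severity(before, after)
        if severity is not None:
            flags.append((account, after - before, severity))
    flags.sort(key=lambda flag: abs(flag[1]), reverse=True)
    return flags
```

Both conditions must hold for a flag, so a $50 movement on a $1,000 account (5% but under $100) and a $200 movement on a $100,000 account (over $100 but 0.2%) would each be ignored.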

Tool Definition Quality

A (4.7/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses behavior: auto-detection of source, aggregation logic, materiality thresholds (≥5% relative and ≥$100 absolute), severity levels, row and size limits, and return type. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Reasonably concise for the detail provided, but dense. Could benefit from bullet points for readability, but not a major issue for AI use.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and no annotations, the description covers inputs, behavior, constraints, output (flag list, roll-up, shareable URL), and error condition. Complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds significant value: explains periodACsv/BCsv as earlier/later, optional labels for human-readable output, and auto-detection. Goes beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool compares two periods of journal-entry data and flags material deviations. It uses specific verbs ('compare', 'flag') and resources ('journal-entry data'), and distinguishes well from sibling tools like analyze_balance_sheet.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly mentions when to use (e.g., 'what changed?', 'show me variances'), and includes constraints like same source required. Lacks explicit 'when not to use' or alternatives, but context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

analyze_profit_loss (Grade A)

Take a Profit & Loss / Income Statement CSV export from QuickBooks Online, Xero, Zoho Books, or Wave (source auto-detected from section names) and run three checks: (1) pnl.subtotal_mismatch — each "Total Section" subtotal equals the sum of its preceding line items (catches missing or duplicated rows); (2) pnl.negative_expense — flags expense-section line items with negative amounts (usually sign-flips or refunds posted to the wrong side); (3) pnl.margin_red_flag — gross-profit margin < 5% or > 95%, or negative total revenue. Input is raw CSV text of a P&L report (Reports → Profit and Loss in QBO / Xero / Zoho / Wave). Max 5,000 rows; max 5 MB. Returns flags with severity, a summary with totalRevenue / totalCogs / grossProfit / grossMarginPct / netIncome (when detected), and a shareable URL at agents.hellobooks.ai/r/{slug}. Use this when a user pastes a P&L and asks "does my P&L look right?", "any sign errors?", "what is my gross margin?", or "anything suspicious in my income statement?". For period-over-period comparison use analyze_journal_variance with two periods of journal-entry data; this tool is single-period only.

Parameters

Name	Required	Description	Default
`csvText`	Yes	Raw CSV text of a Profit & Loss / Income Statement report. Works with QuickBooks Online (Reports → Profit and Loss), Xero (Reports → Profit and Loss), Zoho Books (Reports → Profit & Loss), and Wave (Reports → Profit & Loss). Source is auto-detected from section names. Statement should include section headers, line items, "Total X" subtotals, and a Net Income / Net Profit row at the bottom.
`fileName`	No	Optional filename for the share-page label.
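
Two of the checks in the description reduce to simple arithmetic, sketched below in Python. The function names, the 0.01 tolerance, and the decision not to flag a zero-revenue statement are assumptions made for illustration; the source does not say how the server handles those details.

```python
from typing import Sequence


def subtotal_mismatch(
    line_items: Sequence[float],
    reported_total: float,
    tolerance: float = 0.01,  # assumed rounding tolerance
) -> bool:
    """pnl.subtotal_mismatch: the subtotal should equal the sum of its line items."""
    return abs(sum(line_items) - reported_total) > tolerance


def margin_red_flag(total_revenue: float, total_cogs: float) -> bool:
    """pnl.margin_red_flag: gross margin < 5% or > 95%, or negative total revenue."""
    if total_revenue < 0:
        return True
    if total_revenue == 0:
        # Assumption: margin is undefined with zero revenue, so it is not flagged here.
        return False
    gross_margin = (total_revenue - total_cogs) / total_revenue
    return gross_margin < 0.05 or gross_margin > 0.95
```

With revenue of 1,000 and COGS of 980 the gross margin is 2%, which falls below the 5% floor and would raise the flag; COGS of 600 gives a 40% margin and would not.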

Tool Definition Quality

A (4.6/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It details the three checks performed (subtotal_mismatch, negative_expense, margin_red_flag), constraints (max 5000 rows, max 5 MB), and output (flags with severity, summary, shareable URL). It does not mention any destructive effects or authentication requirements, but the tool is read-only by nature, so the description is sufficiently transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is detailed but efficient, front-loading the core action and checks. Each sentence adds value: sources, three checks, input format, limits, output, usage guidance. It could be slightly shortened without losing substance, but it is well-structured and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description is complete for a tool with no output schema. It covers the input format, supported sources, the three checks with their purposes, constraints, output fields (flags, summary with financial metrics, shareable URL), and explicit usage examples. It also provides a clear alternative for multi-period analysis, making it self-contained.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema already documents both parameters. The description adds significant value by explaining the expected format of csvText (source auto-detection, required sections) and the optional nature of fileName. This goes beyond the schema's minimal descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it analyzes a P&L CSV from specific sources (QuickBooks, Xero, Zoho, Wave) and runs three specific checks. It distinguishes itself from sibling tools like analyze_balance_sheet and analyze_journal_variance by explicitly mentioning the tool's scope (single-period P&L) and alternatives for period-over-period analysis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool (user pastes a P&L and asks about correctness, sign errors, gross margin, or suspicious items) and when not to (period-over-period comparison should use analyze_journal_variance). This provides clear guidance for an AI agent to select the correct tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

analyze_qbo_journal_anomalies (Grade A)

Scan a QuickBooks Online "Journal Entries" CSV export for anomalies — currently round-number lines (debit or credit amounts that are exact multiples of $1,000, above a $1,000 materiality threshold). Round numbers are statistically rare in real bookkeeping and frequently indicate estimates, plugs, or fraud signals worth review. Input is raw CSV text from QBO Reports → Accountant → Journal. Max 5,000 rows; max 5 MB. Returns flagged lines with severity ($100K+ high, $10K+ medium, else low) and a shareable URL. Use this when a user pastes QBO data and asks "any anomalies?", "look for round numbers", or "anything suspicious". Tier-0 subset — HelloBooks Phase 3.0 anomaly detection in the paid product additionally catches GL outliers vs entity history, vendor-history mismatches, archived-vendor activity, and AI-narrated suspicious lines (which require the live HelloBooks account).

Parameters

Name	Required	Description	Default
`csvText`	Yes	Raw CSV text of a QuickBooks Online "Journal Entries" report. Export from QBO: Reports → Accountant → Journal → Export as CSV. Paste the file contents directly.
`fileName`	No	Optional original filename, used only as a label on the share page.
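
The round-number rule and its severity tiers can be sketched in a few lines of Python. The function name `round_number_severity` is hypothetical, and the sketch assumes that an amount of exactly $1,000 meets the materiality threshold; the description's wording ("above a $1,000 materiality threshold") leaves that boundary open.

```python
from decimal import Decimal
from typing import Optional


def round_number_severity(amount: Decimal) -> Optional[str]:
    """Return a severity for exact multiples of $1,000, or None if not flagged."""
    value = abs(amount)
    # Assumption: exactly $1,000 counts as meeting the materiality threshold.
    if value < 1000 or value % 1000 != 0:
        return None
    if value >= 100_000:
        return "high"    # $100K+
    if value >= 10_000:
        return "medium"  # $10K+
    return "low"
```

Using `Decimal` rather than `float` keeps the multiple-of-1,000 test exact for amounts parsed from CSV text such as "12000.00".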

Tool Definition Quality

A (4.5/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It details the scanning behavior (round-number lines, multiples of $1,000 above $1,000 threshold), severity levels, and output including a shareable URL. It also states limits (5k rows, 5 MB). It does not cover authentication or data storage, but the description is sufficiently transparent for the tool's purpose.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured, starting with the main action, then details, usage cues, constraints, and relationship to paid product. Every sentence adds value without redundancy. It is concise yet comprehensive.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description explains output (flagged lines with severity, URL) and input constraints. It covers the main use cases. However, it does not differentiate from sibling tools like analyze_qbo_journal_cleanup, which is a minor gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with both parameters described. The description adds significant meaning beyond the schema: csvText is specified as raw CSV text from QBO export, and fileName is an optional label. This provides essential context for correct invocation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool scans a QBO Journal Entries CSV for anomalies, specifically round-number lines. It specifies the input format, constraints, output (flagged lines with severity and URL), and includes usage cues. This distinguishes it from sibling tools by focusing on journal entries and round-number detection.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says when to use this tool: when a user pastes QBO data and asks about anomalies, round numbers, or suspicious items. It notes this is a Tier-0 subset and mentions the paid product for broader detection. However, it does not explicitly state when not to use it or compare to sibling tools like analyze_qbo_journal_cleanup.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

analyze_qbo_journal_cleanup (Grade A)

Scan a QuickBooks Online "Journal Entries" CSV export for cleanup issues — unbalanced journals (debits ≠ credits, with severity by deviation), duplicate journals (same date + same totals, likely posted twice), and schema problems (invalid dates, malformed amounts, missing accounts, missing journal numbers). Input is the raw CSV content the user pastes after exporting from QBO via Reports → Accountant → Journal → Export. Max 5,000 rows; max 5 MB. Returns a structured flag list with severity (high/medium/low), a roll-up summary by category and severity, parse diagnostics (column mapping + unmapped columns), and a shareable URL at agents.hellobooks.ai/r/{slug} (7-day TTL) that renders a branded analysis page suitable for sending to a CA or bookkeeper. Use this when a user pastes QBO journal data, asks "check my books", "find issues in my QBO journal", or "what is wrong with my journal entries". Each flag includes a fixableInHellobooks boolean — true means HelloBooks can resolve it automatically in the paid product.

Parameters

Name	Required	Description	Default
`csvText`	Yes	Raw CSV text of a QuickBooks Online "Journal Entries" report. Export from QBO: Reports → Accountant → Journal → Export as CSV. Paste the file contents directly.
`fileName`	No	Optional original filename, used only as a label on the share page.
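
The unbalanced-journal and duplicate-journal checks can be sketched in Python as shown below. The row keys (`journal_no`, `date`, `debit`, `credit`), the function name, and the tuple-shaped flags are hypothetical; the real export's column names and the server's flag structure are not specified in this listing. The duplicate rule follows the description's definition (same date and same totals).

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable, List, Tuple


def find_cleanup_flags(rows: Iterable[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Flag unbalanced journals and journals that repeat an earlier date + totals."""
    totals = defaultdict(lambda: [Decimal("0"), Decimal("0")])  # [debits, credits]
    dates: Dict[str, str] = {}
    for row in rows:
        journal = totals[row["journal_no"]]
        journal[0] += Decimal(row["debit"] or "0")   # blank cells count as zero
        journal[1] += Decimal(row["credit"] or "0")
        dates[row["journal_no"]] = row["date"]

    flags: List[Tuple[str, str]] = []
    seen = {}
    for journal_no, (debits, credits) in totals.items():
        if debits != credits:
            flags.append(("unbalanced", journal_no))
        key = (dates[journal_no], debits, credits)
        if key in seen:
            flags.append(("duplicate", journal_no))  # same date + same totals
        else:
            seen[key] = journal_no
    return flags
```

Grouping by journal number before comparing totals matters because a QBO journal export lists one row per line, not one row per journal.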

Tool Definition Quality

A (4.6/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description fully discloses behavioral traits: max 5,000 rows, max 5 MB input, output structure (flag list, summary, diagnostics, shareable URL with 7-day TTL), and the 'fixableInHellobooks' boolean. It leaves no ambiguity about what the tool does and its constraints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is comprehensive but slightly verbose; however, it is well-structured with clear sections and front-loaded with the primary purpose. Every sentence adds value, though some details could be streamlined without losing clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately explains the return format (structured flag list, roll-up summary, parse diagnostics, shareable URL). It covers input limits, output details, and use cases, making it complete for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema coverage is 100%, and the description adds value beyond the schema by explaining the origin of the CSV (QBO export path) and that the user pastes the content. While the schema descriptions already include the export instructions, the main description reinforces the context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('Scan') with a specific target ('cleanup issues') and identifies three distinct categories of issues (unbalanced journals, duplicates, schema problems). It clearly establishes the tool's focus on cleanup of QBO journal exports, differentiating it from siblings like 'analyze_qbo_journal_anomalies', which likely addresses anomalies rather than cleanup.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool: when a user pastes QBO journal data or asks specific queries like 'check my books' or 'find issues in my QBO journal'. It provides concrete examples but does not mention when not to use it or list alternative tools, missing a clear exclusionary criterion.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

analyze_trial_balance (Grade A)

Take a Trial Balance CSV export from QuickBooks Online, Xero, Zoho Books, or Wave (source auto-detected from headers — YTD columns indicate Xero, Opening Balance indicates Zoho, etc.) and run three checks: (1) tb.unbalanced — debits ≠ credits (every downstream P&L / BS / cash-flow report built from this TB is wrong until fixed); (2) tb.wrong_sign — accounts whose name suggests a class (Revenue / COGS / Expense / AR / AP) carrying a balance on the wrong side (classic posting-error signal); (3) tb.round_balance — exact-multiple-of-$10,000 balances (plug-entry signal). Input is raw CSV text of a Trial Balance report. Max 5,000 rows; max 5 MB. Returns flagged accounts with severity, a roll-up showing whether the TB balances, parse diagnostics, and a shareable URL at agents.hellobooks.ai/r/{slug}. Use this when a user pastes a Trial Balance and asks "does my TB balance?", "are there sign errors?", "what looks suspicious?", or "is this TB clean?". The Trial Balance is the foundation document for every other financial statement — if it does not balance, every downstream report is invalid.

Parameters

Name	Required	Description	Default
`csvText`	Yes	Raw CSV text of a Trial Balance report. Works with QuickBooks Online (Reports → Trial Balance), Xero (Reports → Trial Balance), Zoho Books (Reports → Accountant → Trial Balance), and Wave (Reports → Trial Balance). Source is auto-detected from column headers.
`fileName`	No	Optional filename for the share-page label.
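
A rough Python sketch of the three trial-balance checks follows. The keyword matching used to guess an account's class from its name is an assumption made for illustration, as are the function name and the tuple-shaped flags; only the check codes and the $10,000 multiple come from the description. The expected sides follow the standard double-entry convention (revenue and payables carry credit balances; COGS, expenses, and receivables carry debit balances).

```python
from decimal import Decimal
from typing import List, Optional, Tuple

# Normal balance side per account class (standard double-entry convention).
# Matching on these keywords is a hypothetical heuristic, not the server's rule.
NORMAL_SIDE = {
    "revenue": "credit",
    "cogs": "debit",
    "expense": "debit",
    "receivable": "debit",
    "payable": "credit",
}


def check_trial_balance(
    accounts: List[Tuple[str, Decimal, Decimal]],
) -> List[Tuple[str, Optional[str]]]:
    """accounts holds (name, debit, credit); returns (check code, account name) flags."""
    flags: List[Tuple[str, Optional[str]]] = []

    # tb.unbalanced: total debits must equal total credits.
    total_debit = sum((debit for _, debit, _ in accounts), Decimal("0"))
    total_credit = sum((credit for _, _, credit in accounts), Decimal("0"))
    if total_debit != total_credit:
        flags.append(("tb.unbalanced", None))

    for name, debit, credit in accounts:
        # tb.wrong_sign: balance sits on the opposite side from the class norm.
        if debit != credit:
            side = "debit" if debit > credit else "credit"
            for keyword, normal in NORMAL_SIDE.items():
                if keyword in name.lower() and side != normal:
                    flags.append(("tb.wrong_sign", name))
                    break

        # tb.round_balance: non-zero balance that is an exact multiple of $10,000.
        balance = abs(debit - credit)
        if balance > 0 and balance % 10_000 == 0:
            flags.append(("tb.round_balance", name))

    return flags
```

An "Office Expense" account carrying a credit balance would be flagged as tb.wrong_sign, while a 50,000 revenue balance would be flagged only as tb.round_balance.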

Tool Definition Quality

A (4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Given no annotations, description fully discloses behavior: three checks with implications, output components (flagged accounts, roll-up, diagnostics, URL), and limits (5000 rows, 5MB). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is informative but lengthy, especially the detailed explanation of each check's implications. While well-structured and front-loaded, it could be more concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers input, auto-detection, checks, output (including shareable URL), and usage context. Without an output schema, it provides adequate detail on return values, though some specifics of the response structure are omitted.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%; both parameters have descriptions. The description reinforces the csvText parameter's source details but adds little beyond schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool analyzes a Trial Balance CSV from specific sources, runs three distinct checks, and distinguishes itself from sibling tools like analyze_balance_sheet or analyze_profit_loss.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly specifies when to use ('does my TB balance?', 'are there sign errors?') and implies alternatives for journal-related queries. Does not explicitly state when not to use, but the context is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

analyze_xero_journal_anomalies (Grade A)

Scan a Xero "Manual Journals" CSV export for anomalies — currently round-number lines (debit or credit amounts that are exact multiples of $1,000, above a $1,000 materiality threshold). Input is raw CSV text from Xero Accounting → Advanced → Manual Journals → Export. Max 5,000 rows; max 5 MB. Returns flagged lines with severity ($100K+ high, $10K+ medium, else low) and a shareable URL. Use this when a user pastes Xero data and asks "any anomalies?", "look for round numbers", or "anything suspicious". Same Tier-0 / paid-product split as the QBO variant — history-aware anomaly checks (GL outliers, vendor history, archived-vendor activity, LLM-narrated suspicious) live in the authenticated MCP / paid product.

Parameters

Name	Required	Description	Default
`csvText`	Yes	Raw CSV text of a Xero "Manual Journals" report. Export from Xero: Accounting → Advanced → Manual Journals → Export. Paste the file contents directly.
`fileName`	No	Optional original filename, used only as a label on the share page.

Tool Definition Quality

A (4.3/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses input constraints (max rows, size), the type of anomalies detected, and the severity classification. It also clarifies what it does not cover (advanced checks).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the purpose and includes necessary details about input, usage, and limitations. It is structured but somewhat lengthy; however, each sentence contributes meaningful information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately explains the output (flagged lines with severity levels and shareable URL) and provides context about anomaly criteria and tool scope. It covers input format and constraints.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema descriptions cover both parameters at 100%. The description adds value by specifying the Xero export path and file size/row limits, supplementing the schema beyond basic descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool scans a Xero Manual Journals CSV for anomalies, currently round-number lines above a specific threshold. It specifies the source format and what it detects, and differentiates the tool from the QBO variant and the cleanup tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases ('when a user pastes Xero data and asks...') and distinguishes basic anomaly detection from advanced checks in the paid product. Could be more explicit about when to use sibling tools like analyze_xero_journal_cleanup.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

analyze_xero_journal_cleanup (Grade A)

Scan a Xero "Manual Journals" CSV export for cleanup issues — unbalanced journals, duplicate journals (same date + same totals), and schema problems (invalid dates, malformed amounts, missing account code/name, missing group key). Input is the raw CSV content the user pastes after exporting from Xero via Accounting → Advanced → Manual Journals → Export. Xero-specific idioms handled: signed Amount column (positive = credit, negative = debit), explicit Debit/Credit fallback shape, Reference-or-Narration+Date grouping, account code preferred over name. Max 5,000 rows; max 5 MB. Returns structured flags with severity, a roll-up summary, parse diagnostics, and a shareable URL at agents.hellobooks.ai/r/{slug}. Use this when a user pastes Xero manual-journal data, asks "check my Xero books", or "find issues in my Xero journal". The funnel CTA routes to /migrate/from-xero for users who want to fix at scale.

Parameters

Name	Required	Description	Default
`csvText`	Yes	Raw CSV text of a Xero "Manual Journals" report. Export from Xero: Accounting → Advanced → Manual Journals → Export. Paste the file contents directly.
`fileName`	No	Optional original filename, used only as a label on the share page.
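
The unbalanced-journal check over a signed Amount column might look roughly like the sketch below. The column names (`Date`, `Reference`, `Narration`, `Amount`) and the reading of "Reference-or-Narration+Date grouping" as "use Reference if present, else Narration, combined with Date" are assumptions for illustration; the real export headers and the server's grouping logic may differ. Malformed amounts, which the tool reports as schema problems, are not handled here.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Hypothetical sample in the signed-Amount shape (positive = credit, negative = debit).
SAMPLE = """Date,Reference,Narration,Amount
2026-01-31,JNL-1,Accrual,-500.00
2026-01-31,JNL-1,Accrual,500.00
2026-01-31,JNL-2,Reclass,-200.00
2026-01-31,JNL-2,Reclass,150.00
"""


def unbalanced_journals(csv_text: str) -> dict:
    """Sum signed amounts per journal group and return the groups that do not net to zero."""
    totals = defaultdict(Decimal)  # Decimal() starts each group at 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumed group key: Reference if present, otherwise Narration, plus Date.
        key = (row.get("Reference") or row.get("Narration") or "", row["Date"])
        totals[key] += Decimal(row["Amount"])
    return {key: total for key, total in totals.items() if total != 0}


print(unbalanced_journals(SAMPLE))
```

In this sample, JNL-1 nets to zero and JNL-2 is left with a 50.00 debit excess, so only JNL-2 is returned.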

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It discloses important behaviors: max 5,000 rows, max 5 MB, Xero-specific idioms (signed Amount, Debit/Credit fallback), input format, and output structure (flags, summary, diagnostics, shareable URL). It lacks explicit error handling or idempotency info but is still strong.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is fairly long but every sentence is informative. It is front-loaded with the main purpose, followed by details, constraints, and output. It could be slightly more concise, but the thoroughness is justified given the complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and moderate complexity, the description covers all essential aspects: what it does, input format, specific checks, constraints, output structure, and even a funnel CTA. It is fully sufficient for an agent to invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions. The description adds extra context: explains the csvText is raw CSV from Xero export via specific path, and the fileName is an optional label. It also details the Xero-specific data format, which helps the agent prepare correct input.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Scan a Xero Manual Journals CSV export for cleanup issues', specifying verb (scan) and resource (Manual Journals CSV). It lists specific issues (unbalanced, duplicate, schema) and distinguishes from sibling tools like 'analyze_xero_journal_anomalies' by focusing on cleanup.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Use this when a user pastes Xero manual-journal data' and gives example queries ('check my Xero books', 'find issues in my Xero journal'). It provides clear context, though it does not explicitly mention alternatives or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compare_books_to_hellobooks (Grade A)

Take a QBO or Xero journal-entry CSV (source auto-detected), run the full Tier-0 detection set (imbalance + duplicates + round-number + schema), and return a structured side-by-side comparison — "your books have X issues; here is how HelloBooks resolves each phase". This is the direct funnel tool: the response includes per-category counts mapped to HelloBooks Phases 1, 2, 3.0, 3.1, with exclusive-advantage bullets (command-center dashboard, conversational interface, one-prompt JE posting, cross-phase orchestration, auto ID resolution). Use this when a user is evaluating HelloBooks vs their current QBO/Xero, asks "should I migrate?", or pastes data while comparing accounting software. Output is suitable for the host LLM to narrate as a positioning argument; the share URL points at a branded landing page with the issue breakdown and a 1-click migrate CTA.

Parameters

Name	Required	Description	Default
`csvText`	Yes	Raw CSV text of a journal-entry export from QBO ("Journal Entries") or Xero ("Manual Journals"). Source is auto-detected from the headers.
`fileName`	No	Optional filename label.

Tool Definition Quality

A4.4/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, but the description fully discloses the behavior: it auto-detects the source, runs Tier-0 detection (imbalance, duplicates, round-number, schema), returns per-category counts mapped to phases, includes advantage bullets, and provides a share URL. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is informative but somewhat lengthy with promotional language. However, it is well-structured, front-loading the main action, and every sentence adds value for its purpose as a migration funnel tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

There is no output schema, but the description fully explains the return value: a structured side-by-side comparison with per-category counts, phases, advantage bullets, and a share URL. It also notes that the output is suitable for host LLM narration. This is comprehensive for a comparison tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% for both parameters (csvText, fileName). Description adds little beyond schema details; it reiterates auto-detection and optional label. Baseline 3 is appropriate as the schema already explains parameters adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool compares a QBO/Xero journal-entry CSV against HelloBooks, running detection and returning a structured comparison. It distinguishes itself from sibling analysis tools by being a 'direct funnel tool' for migration evaluation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool: when a user is evaluating HelloBooks against QBO/Xero, asking about migration, or comparing accounting software. It does not mention when not to use it or list alternatives, but the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compliance_capabilities (Grade A)

Return supported compliance frameworks for a country (BAS, STP, GST, MTD, 1099, etc.) with version and certification info.

Parameters

Name	Required	Description	Default
`country`	Yes	Required ISO country code.

Tool Definition Quality

A3.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry the full burden. It discloses that the tool returns version and certification info with examples, but does not explicitly state that it is a read-only operation, nor does it discuss error handling or authorization requirements.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single sentence that is front-loaded with the action and resource, includes illustrative examples, and contains no extraneous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple input (one enum parameter), full schema coverage, and no output schema, the description is largely complete. It could be improved by clarifying the structure of the response, but it sufficiently conveys what the tool does.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The sole parameter 'country' has full schema coverage (enum values with description). The description adds context by listing example frameworks, but does not add meaning beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Return'), the resource ('supported compliance frameworks for a country'), and provides specific examples (BAS, STP, GST, etc.). It distinguishes itself from siblings like compliance_deadlines.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description indicates the tool is for retrieving compliance frameworks for a country, but does not explicitly state when to use it versus alternatives, nor does it mention any exclusions or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compliance_deadlines (Grade A)

When statutory returns and payroll filings are due, per country. Covers IN (GSTR-1/3B/9/9C, CMP-08, Form 24Q, Form 16, PF ECR, ESI), AU (BAS, STP, Super Guarantee), GB (VAT MTD, RTI, Self Assessment), US (1099-NEC/MISC, W-2, Form 941/940), CA (T4, GST/HST). Optional country, frequency, and form filters. Note: dates rotate annually — every response carries a disclaimer with the per-deadline source URL for authority confirmation.

Parameters

Name	Required	Description
`form`	No	Substring match against form name, e.g. "GSTR-3B", "BAS", "1099", "T4". Case-insensitive.
`country`	No	ISO country code. Filters to one country (IN, AU, GB, US, CA covered today).
`frequency`	No	Filing cadence. Useful for "what are my monthly returns" style queries.
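
How the three optional filters combine can be illustrated with a small client-side analogue. This is a sketch under stated assumptions: the record shape, the `filter_deadlines` name, and the frequency labels are hypothetical placeholders (not authoritative filing cadences), and filters are assumed to combine with AND. Only the case-insensitive substring match on `form` comes from the parameter table.

```python
# Illustrative records only: form names come from the description above;
# the frequency labels are placeholders, not authoritative filing cadences.
DEADLINES = [
    {"country": "IN", "form": "GSTR-3B", "frequency": "monthly"},
    {"country": "IN", "form": "GSTR-1", "frequency": "monthly"},
    {"country": "AU", "form": "BAS", "frequency": "quarterly"},
    {"country": "US", "form": "1099-NEC", "frequency": "annual"},
    {"country": "US", "form": "1099-MISC", "frequency": "annual"},
    {"country": "CA", "form": "T4", "frequency": "annual"},
]


def filter_deadlines(records, country=None, frequency=None, form=None):
    """Apply the optional filters; an omitted filter matches every record."""
    needle = form.lower() if form else None
    return [
        r for r in records
        if (country is None or r["country"] == country)
        and (frequency is None or r["frequency"] == frequency)
        # `form` is a case-insensitive substring match, per the parameter table.
        and (needle is None or needle in r["form"].lower())
    ]


print([r["form"] for r in filter_deadlines(DEADLINES, form="gstr")])
```

Because `form` is a substring match, a query such as "1099" would be expected to return every 1099 variant rather than a single form.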

Tool Definition Quality

A3.9/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses that dates rotate annually and responses include a disclaimer with source URL. No annotations exist, so this adds value. However, it doesn't mention whether the data is real-time or cached, or any authentication or rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph that efficiently front-loads the purpose, then covers scope and filters. Every sentence adds information. It could be slightly improved with bullet points for the form lists, but it is already concise and scannable.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately tells what the tool does, its parameters, and notable behavior (rotating dates, disclaimer). It lacks a description of the response format, but for a list-returning tool, this is generally acceptable.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description lists the three optional filters (country, frequency, form) but provides minimal additional meaning beyond the schema's own descriptions. It doesn't repeat the substring match detail for form, so value is marginal.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns due dates for statutory returns and payroll filings per country, listing specific forms for multiple countries. This distinguishes it from all sibling tools, which focus on analysis or other compliance matters.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on what the tool covers (countries and forms) and the optional filters. While it doesn't explicitly exclude alternative uses, the sibling tool list shows no competing deadlines tool, making usage guidance adequate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

country_support (Grade A)

Return features available per supported country (AU, IN, UK, US, CA, AE, SG, NZ).

Parameters

Name	Required	Description	Default
`country`	No	Single ISO country code. Omit for the full matrix.

Tool Definition Quality

A3.7/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided. The description indicates a read-only operation but lacks details on caching, rate limits, or data freshness. With so little behavioral context, it is minimally adequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The single sentence is concise and front-loaded. However, it could include a brief usage hint to improve completeness without adding much length.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one optional parameter, no output schema), the description covers the essential points: what it returns and for which countries. Adequate for its complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The tool description adds the list of countries and mentions 'Omit for the full matrix', which duplicates the schema's parameter description. No additional semantics beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('Return') and the resource ('features available per supported country'), and lists all supported countries. It distinguishes the tool from siblings that focus on analysis and compliance.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives like 'feature_search' or 'list_features'. Usage is implied but not clarified.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

estimate_migration_effort (Grade A)

Take a QBO or Xero journal-entry CSV (source auto-detected) and return a structured migration-effort estimate — row counts, unique-account count, period span, complexity classification (low / medium / high), human-hours estimate, assisted-hours estimate, and an indicative price quote in USD. Heuristic-based — refined against the live entity once the user signs up. Accepts larger files than the other analytical tools (up to 50,000 rows / 20 MB) because no detection runs here, just sizing. Use this when a user is weighing the cost of moving books to HelloBooks, pastes data and asks "how long will migration take?", "what would this cost?", or "is it worth migrating?". The funnel CTA points at /migrate/?ref= to start the assisted flow with the parsed sizing pre-populated.

Parameters

Name	Required	Description	Default
`csvText`	Yes	Raw CSV text of a journal-entry export from QBO ("Journal Entries") or Xero ("Manual Journals"). Source is auto-detected from headers.
`fileName`	No
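
A caller could pre-check a pasted CSV against the two size tiers mentioned in the descriptions (5,000 rows / 5 MB for the analytical tools, 50,000 rows / 20 MB for this sizing tool) before choosing a tool. The sketch below is a hypothetical helper: it assumes rows are counted excluding the header, that 1 MB means 1,048,576 bytes, and that size is measured on the UTF-8 encoding. The server may count differently.

```python
import csv
import io

# (max data rows, max bytes) per tier; the numbers come from the tool descriptions.
LIMITS = {
    "analysis": (5_000, 5 * 1024 * 1024),   # anomaly and cleanup tools
    "sizing": (50_000, 20 * 1024 * 1024),   # estimate_migration_effort
}


def fits_limits(csv_text: str, tier: str) -> bool:
    """Return True when the CSV is within the row and size limits of the given tier."""
    max_rows, max_bytes = LIMITS[tier]
    size = len(csv_text.encode("utf-8"))  # assumption: size is measured in UTF-8 bytes
    # Assumption: the header line does not count toward the row limit.
    rows = max(sum(1 for _ in csv.reader(io.StringIO(csv_text))) - 1, 0)
    return rows <= max_rows and size <= max_bytes


big = "Date,Amount\n" + "2026-01-01,1\n" * 6_000
print(fits_limits(big, "analysis"), fits_limits(big, "sizing"))
```

A 6,000-row file is over the analytical limit but within the sizing limit, so it could still be sent to this tool for an effort estimate.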

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description carries full burden. It discloses that the estimation is heuristic-based and refined later, and that the tool only does sizing (no detection runs). This provides good transparency, though it could explicitly state it doesn't modify data.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is dense but not verbose, front-loading the core purpose and adding necessary details (file size limits, heuristic nature, use cases) in a well-organized manner. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description enumerates output fields clearly. It covers heuristic behavior, file size limits, and use cases. Could mention error handling for invalid CSVs, but overall it's sufficiently complete for an agent to understand what the tool does and returns.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 50%, and the description adds meaning beyond schema: source auto-detection, larger file capacity (50k rows/20 MB), and clarifies that the tool only sizes. For 'fileName', no additional info is provided, but overall the description compensates for schema gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool takes a QBO or Xero journal-entry CSV, auto-detects the source, and returns a structured migration-effort estimate with specific fields (row counts, unique-account count, period span, complexity classification, hours estimates, price quote). This clearly differentiates it from the sibling analytical tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool: 'Use this when a user is weighing the cost of moving books to HelloBooks, pastes data and asks...'. It also mentions that the tool accepts larger files than the other analytical tools because no detection runs, providing clear context for tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

feature_search (Grade A)

Free-text search across the marketing feature catalog, plan features, integrations, country features, compliance frameworks, competitor positioning, statutory deadlines, local payment methods, and published articles on hellobooks.ai. Queries like "vs Xero", "QuickBooks alternative", "GSTR-3B due", "UPI invoice", "1099 article", or "agentic accounting" surface the matching entry near the top.

Parameters

Name	Required	Description	Default
`limit`	No	Max results to return (default 20).
`query`	Yes	Free-text query, e.g. "BAS lodgement", "multi-currency", "vs QuickBooks", "GSTR-3B due", or "UPI invoice cap".

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It describes search behavior and result ordering ('surface the matching entry near the top') but does not explicitly state that the tool is read-only, mention authentication or rate limits, or say what happens on empty results.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph that efficiently conveys the scope and examples. It is front-loaded with the purpose, but could be slightly more structured (e.g., separate lines for examples) for readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description should indicate return format. It says 'surface the matching entry near the top' but does not clarify if results are a list, sorted by relevance, or include metadata. For a search tool with many siblings, more detail on response structure would help.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%. Description adds value by providing example queries for the 'query' parameter and stating the default for 'limit'. This supplements the schema without redundancy.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'Free-text search across the marketing feature catalog...' listing many specific sources. It distinguishes from sibling list tools by being a search across those catalogs, and provides clear examples of queries.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides example queries and contexts ('vs Xero', 'GSTR-3B due'), implying it is for free-text search. However, it lacks explicit guidance on when to use this tool versus sibling list tools (e.g., list_features, list_articles) or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

free_tier_eligibility (Grade A)

Return the HelloBooks Free-plan annual-invoice-turnover thresholds for the 8 supported countries (IN ₹40 lakh / US $100K / GB £90K / AU A$75K / CA C$30K / NZ NZ$60K / SG S$500K / AE AED 187.5K). Free is unlimited features and unlimited AI credits subject to the monthly allowance, but per-entity invoice turnover above the country cap forces an upgrade to Pro or Business. Call with no args to get the full table, with country for one threshold, or with country AND annualInvoiceRevenue (in the country currency, NOT USD-equivalent) for a freeEligible verdict with headroom math. Bank-feed total, cash receipts, and gross transaction volume are explicitly NOT used.

Parameters

Name	Required	Description	Default
`country`	No	ISO country code of the entity. Omit to return the full 8-country threshold table.
`annualInvoiceRevenue`	No	Annual invoice turnover for the entity, in its home currency (NOT USD-equivalent). Only used when `country` is also provided — the tool then returns a `freeEligible` verdict. Bank-feed total and cash receipts are explicitly NOT used by the gate.
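
The verdict and "headroom math" for the country-plus-revenue mode can be approximated as follows. The thresholds are the ones listed in the description (₹40 lakh is 4,000,000). The function name and the `cap` and `headroom` keys are hypothetical; only `freeEligible` is named in the source. Treating revenue exactly at the cap as eligible is an assumption based on "above the country cap forces an upgrade".

```python
from decimal import Decimal

# Annual invoice-turnover caps in each country's home currency, from the description.
FREE_PLAN_CAPS = {
    "IN": Decimal("4000000"),  # 40 lakh INR
    "US": Decimal("100000"),
    "GB": Decimal("90000"),
    "AU": Decimal("75000"),
    "CA": Decimal("30000"),
    "NZ": Decimal("60000"),
    "SG": Decimal("500000"),
    "AE": Decimal("187500"),
}


def free_tier_verdict(country: str, annual_invoice_revenue: Decimal) -> dict:
    """Compare home-currency invoice turnover with the country cap."""
    cap = FREE_PLAN_CAPS[country]
    headroom = cap - annual_invoice_revenue  # negative when the cap is exceeded
    # Assumption: turnover exactly equal to the cap still counts as eligible.
    return {"freeEligible": headroom >= 0, "cap": cap, "headroom": headroom}


print(free_tier_verdict("IN", Decimal("3500000")))
print(free_tier_verdict("US", Decimal("120000")))
```

The revenue argument must already be in the entity's home currency; converting to USD first would compare against the wrong scale for every country except the US.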

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully bears the burden of behavioral disclosure. It explains the eligibility logic, the exact currencies per country, the condition for upgrade, and explicitly excludes irrelevant revenue metrics.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single dense paragraph. It is concise without wasted words, but could benefit from slight structuring (e.g., list of modes) for easier scanning.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description covers all invocation behaviors, parameter dependencies, and the core logic. It is self-contained and leaves no ambiguity about tool behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds critical context: revenue must be in home currency, the verdict includes headroom math, and the thresholds per country. This far exceeds the schema's descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns free plan eligibility thresholds with country-specific caps and provides a verdict when revenue is supplied. It distinguishes itself from sibling analysis tools by focusing on a concrete pricing check.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description specifies three invocation modes (no args, country only, country + revenue) and explicitly notes that bank-feed total, cash receipts, and gross transaction volume are not used. It does not contrast with sibling tools, but the usage is clearly scoped.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

how_munimji_helpsAInspect

Explain how HelloBooks and Munimji (the in-app AI assistant) help a specific business — given a free-text description of the user's own operations. Returns a curated capability knowledge base: business-operation areas (sales, purchases, banking, tax, reports, inventory, payroll, multi-entity, setup), and for each AI capability WHO does the work — autonomous (Munimji does it on its own, e.g. OCR extraction, running reports), approval (Munimji prepares the entry and you one-click approve before it posts to the ledger, e.g. AI categorization, find-and-match, creating invoices/bills by chat), assist (co-pilot, e.g. guided onboarding, voice), or manual (a software feature you run yourself). Each capability links to the backing software features. Use this when a user describes their business and asks "how can HelloBooks help me?", "what can the AI do for my shop/practice/agency?", or "what can Munimji do on its own vs what do I approve?". Pass their description in businessDescription; optionally filter by area or autonomy. The AI never posts to a ledger without approval. For the full software catalog call list_features; for pricing call list_plans.

ParametersJSON Schema

Name	Required	Description
`area`	No	Optional. Narrow to one business-operation area.
`autonomy`	No	Optional. Filter Munimji capabilities by who does the work: `autonomous` (Munimji does it alone), `approval` (Munimji prepares, you approve before it posts), `assist` (co-pilot), `manual` (software feature you run yourself).
`businessDescription`	No	Optional. The user describes their business and operations in their own words (industry, what they sell, how they get paid, who they pay, tax regime, pain points). It is echoed back as context — the calling assistant maps it to the returned areas + capabilities. No keyword scoring is done here; the LLM does the matching.
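
As a rough illustration of the parameters above, the sketch below builds a JSON-RPC body for an MCP `tools/call` request to `how_munimji_helps`. The envelope follows the usual MCP shape. The helper name `build_call`, the request id, and the sample business description are made up for illustration. The `area` value `banking` comes from the areas named in the description, but its exact enum spelling is an assumption.

```python
import json


def build_call(tool_name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 body for an MCP tools/call request.

    Optional filters set to None are dropped, so the server only sees
    the arguments the caller actually chose.
    """
    cleaned = {key: value for key, value in arguments.items() if value is not None}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": cleaned},
    }


request = build_call(
    "how_munimji_helps",
    {
        # Free-text description in the user's own words (sample text, invented here).
        "businessDescription": "Two-person bakery, paid by card and cash, files GST quarterly.",
        "area": "banking",       # assumed enum spelling
        "autonomy": "approval",  # one of: autonomous, approval, assist, manual
    },
)
print(json.dumps(request, indent=2))
```

All three parameters are optional, so passing only `businessDescription` (or nothing at all) should also produce a valid request body.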

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, but the description discloses that the AI never posts to a ledger without approval, and explains how capabilities are categorized by autonomy. It is transparent about the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is detailed but every sentence adds value. It is front-loaded with the main purpose. Could be slightly more concise, but it's well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description fully explains what is returned (curated capability knowledge base). It covers usage, parameters, return values, and provides complete context for an agent to use the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with good descriptions. The description adds extra context, such as that businessDescription is echoed back and no keyword scoring is done, which adds meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool explains how HelloBooks and Munimji help a specific business given a description. It lists the areas and autonomy levels, and distinguishes from sibling tools like list_features and list_plans.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says when to use this tool (when user asks 'how can HelloBooks help me?') and when to use alternatives ('for the full software catalog call list_features; for pricing call list_plans').

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_articlesAInspect

List published articles on hellobooks.ai — head-to-head compare pages and curated flagship blog posts. Filter by country, tag or free-text query. Use this when a user asks "do you have a blog/article about X?".

ParametersJSON Schema

Name	Required	Description
`tag`	No	Single tag to filter on (case-insensitive substring match against the article tag list). e.g. "gst", "1099", "tally".
`limit`	No	Max articles to return (default 20).
`query`	No	Free-text query — substring-matched against the title, excerpt and tags of each article. e.g. "QuickBooks alternative" or "audit trail".
`country`	No	ISO country code or "global". Returns articles whose countryRelevance matches OR is "global". Omit to return everything.
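
The filter rules in the table can be sketched locally as follows. This is only an illustration of the documented matching semantics, not the server's implementation. The field names `title`, `excerpt`, and `tags`, the sample articles, and the choice to treat `query` as case-insensitive and `countryRelevance` as a single string are all assumptions.

```python
def filter_articles(articles, tag=None, query=None, country=None, limit=20):
    """Apply the documented list_articles filters to in-memory article dicts."""
    results = []
    for article in articles:
        tags = [t.lower() for t in article.get("tags", [])]
        # tag: case-insensitive substring match against the article's tag list
        if tag is not None and not any(tag.lower() in t for t in tags):
            continue
        # query: substring match against title, excerpt and tags
        if query is not None:
            haystack = " ".join(
                [article.get("title", ""), article.get("excerpt", ""), *tags]
            ).lower()
            if query.lower() not in haystack:
                continue
        # country: keep articles for that country OR marked "global"
        if country is not None and article.get("countryRelevance") not in (country, "global"):
            continue
        results.append(article)
    return results[:limit]  # default limit of 20, as in the schema


# Invented sample data, for illustration only.
articles = [
    {"title": "HelloBooks vs QuickBooks", "excerpt": "A QuickBooks alternative compared",
     "tags": ["compare", "quickbooks"], "countryRelevance": "global"},
    {"title": "GST filing basics", "excerpt": "Filing returns in India",
     "tags": ["GST", "india"], "countryRelevance": "IN"},
    {"title": "1099 deadlines", "excerpt": "US contractor forms",
     "tags": ["1099"], "countryRelevance": "US"},
]

print([a["title"] for a in filter_articles(articles, country="IN")])
```

With the sample data, filtering by `country="IN"` keeps both the India article and the global one, which mirrors the "matches OR is global" rule.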

Tool Definition Quality

A4.0/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description must cover behavioral traits. It states it lists published articles and filtering options, but does not disclose pagination, default limit (20), or what happens on no results. Adequate but could be more transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences: first defines purpose and scope, second provides usage context and examples. Every word is informative with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's 4 optional parameters and lack of output schema, the description covers the core functionality and when to use it. Missing details like default limit or return structure, but parameter descriptions in schema compensate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description reiterates filtering capabilities but does not add new meaning beyond the schema (e.g., parameter interaction logic).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the action 'List' and the specific resource 'published articles on hellobooks.ai' including subtypes (compare pages, blog posts). Distinct from sibling tools like list_videos or list_features.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit, actionable usage guidance: 'Use this when a user asks "do you have a blog/article about X?"'. Does not mention when not to use it or explicitly compare it to alternatives, but the context is sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_competitorsAInspect

Return competitor positioning entries (QuickBooks, Xero, FreshBooks, Wave, Zoho Books, Tally) with where HelloBooks wins, where the competitor wins, and pricing notes. Optional country, tier (primary / secondary), and id filters.

ParametersJSON Schema

Name	Required	Description
`id`	No	Return a single competitor by id (e.g. "quickbooks", "xero", "tally").
`tier`	No	Filter to head-on rivals (primary) or adjacent / segment-specific overlaps (secondary).
`country`	No	Only return competitors whose primary market is this country, or who are also evaluated in this market.

Tool Definition Quality

A4.3/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It states the tool returns data with win/loss and pricing notes, implying a read-only operation, but it does not explicitly declare nondestructive behavior or disclose any safety, authentication, or rate-limiting aspects. The description is adequate but not enhanced beyond basic functionality.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences: first sentence states purpose and return content, second sentence lists optional filters. No extraneous information, very efficient and to the point.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (3 optional parameters, no nested objects, no output schema), the description adequately covers the return values (win/loss, pricing notes) and filter behavior. It is complete for an agent to understand and invoke correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for all three parameters. The description adds value by mentioning specific competitors, return fields, and the purpose of filters (country, tier, id). This goes beyond the schema definitions that only name and enumerate the parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Return', the resource 'competitor positioning entries', and enumerates specific competitors (QuickBooks, Xero, etc.). It also specifies return fields (wins, losses, pricing notes). This distinguishes it well from sibling tools which are mostly analysis functions or other list tools in different domains.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on when to use the tool (to retrieve competitor positioning with optional filters) but does not explicitly mention when not to use it or suggest alternatives. However, the sibling tools are quite distinct in purpose (balance sheet analysis, compliance, etc.), so the usage context is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_credit_packsAInspect

List HelloBooks AI credit packs — one-time pay-as-you-go top-ups (Boost 5,000, Power 15,000, Mega 50,000, Ultra 150,000 credits) priced in 8 regional currencies (USD, INR, CAD, GBP, AUD, AED, SGD, NZD). Credit packs stack on any plan, including Free. Use this when a user asks how to buy more AI credits or top up after exhausting a plan allowance. Filter by id (boost / power / mega / ultra) or country (ISO code).

ParametersJSON Schema

Name	Required	Description	Default
`id`	No	Restrict the response to a single credit pack.
`country`	No	ISO country code. Filters prices to one country. Omit to return all 8 markets.
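
One way an agent might use the pack sizes quoted in the description is to suggest the smallest single pack that covers a credit shortfall. The sketch below is a hypothetical agent-side helper, not something the tool itself does. The name `smallest_covering_pack` is invented, and a real recommendation would also weigh the regional prices returned by the tool.

```python
# Credit amounts as quoted in the tool description above.
PACK_CREDITS = {"boost": 5_000, "power": 15_000, "mega": 50_000, "ultra": 150_000}


def smallest_covering_pack(shortfall):
    """Return the id of the smallest single pack with at least `shortfall` credits.

    Returns None when no single pack is large enough.
    """
    for pack_id, credits in sorted(PACK_CREDITS.items(), key=lambda item: item[1]):
        if credits >= shortfall:
            return pack_id
    return None


print(smallest_covering_pack(12_000))
```

The returned id could then be passed as the `id` filter, together with a `country` code, to fetch the price for just that pack.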

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It explains that the tool lists credit packs, their types, pricing in multiple currencies, and that they stack on any plan. It does not mention any destructive behavior or side effects, and none are expected for a read-only listing tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, each adding value: first states purpose and examples, second explains stackability, third gives usage context and filtering. No fluff or unnecessary details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given there is no output schema, the description provides a reasonable sense of the response by mentioning pack names and regional currencies. It covers the main behaviors and filtering. It could be more explicit about the response structure, but it is sufficient for the agent to understand what the tool returns.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds some context (e.g., 'Filter by id (boost / power / mega / ultra) or country (ISO code)') but this largely repeats what is already in the schema's parameter descriptions. Thus, the description adds minimal additional meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'List' and the resource 'HelloBooks AI credit packs', specifying that they are one-time top-ups with specific pack names and regional pricing. It effectively distinguishes this tool from sibling tools which are mostly analysis or compliance tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool: 'Use this when a user asks how to buy more AI credits or top up after exhausting a plan allowance.' It also describes filtering options. While it does not explicitly mention when not to use it or provide alternatives, the context is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_feature_categoriesAInspect

List the 13 feature categories on the marketing site (Core Accounting, Invoicing, Banking, Reports, Tax & Compliance, Inventory, Warehouse, Manufacturing, AI, Integrations, Mobile, Operations, Industry Modules) with per-category counts by status.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description provides key behavioral details: it lists exactly 13 named categories and includes counts by status. No disclosure of side effects or additional traits is needed for such a simple query tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that immediately states the action and lists the categories. It is front-loaded and contains no unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (no parameters, no output schema), the description fully covers what the agent needs to know: the resource and the output format (counts by status). The 13 categories are enumerated for clarity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has zero parameters, and the description correctly avoids adding parameter information. The baseline for 0 parameters is 4, and the schema coverage is complete.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool lists the 13 feature categories on the marketing site, specifying each category and noting per-category counts by status. It distinguishes itself from siblings like list_features and list_integrations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Although no explicit when-to-use or when-not-to-use guidance is given, the description makes it clear this tool is for listing categories, which is distinct from other listing tools. The context is sufficient for an agent to infer appropriate usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_featuresCInspect

List the full HelloBooks marketing feature catalog (145+ items). Filter by category, tier, status, marketedOnly, or substring query.

ParametersJSON Schema

Name	Required	Description
`tier`	No	Filter by the minimum plan tier / add-on that unlocks the feature.
`limit`	No	Max number of features to return. Default 200.
`query`	No	Optional substring match against label + shortDescription.
`status`	No	Filter by rollout status. Defaults to all.
`category`	No	Filter to one feature category (core-accounting, invoicing-billing, etc.).
`marketedOnly`	No	If true, only return features marketed on the public website.

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It only restates filter capabilities already in the schema and does not disclose behavioral traits like pagination, ordering, authentication needs, or data freshness beyond the default limit.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that conveys the tool's purpose and key filters. It is appropriately front-loaded, with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema and no annotations, the description covers core functionality but omits details like default filter values, response format, and ordering. It is adequate but not fully complete for a filterable list tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description enumerates filter names but adds no extra meaning or context beyond what the schema already provides for each parameter.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'List' and the resource 'full HelloBooks marketing feature catalog' with a count of 145+ items. It lists filter criteria, making the purpose clear. However, it does not explicitly differentiate from sibling tools like 'feature_search', which may cause confusion.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives such as 'feature_search' or 'list_feature_categories'. It lacks explicit when-to-use or when-not-to-use instructions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_integrationsBInspect

List integrations (banks, payments, payroll, time tracking, shipping, accounting sync, ecommerce, CRM).

ParametersJSON Schema

Name	Required	Description
`status`	No	Filter by rollout status.
`country`	No	Only return integrations available in this country (or global).
`category`	No	Filter to one integration category.

Tool Definition Quality

B3.0/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose behavioral traits such as read-only nature, output format, or pagination. It only restates the basic function, providing no additional context beyond the name.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that efficiently conveys the tool's purpose with relevant examples. It is front-loaded and contains no unnecessary words, though it could be slightly expanded without losing conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite low complexity, the description omits any details about the output format, such as whether it returns a list of integration names or detailed objects. While adequate for a simple listing, it leaves the agent guessing about the response structure.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear parameter descriptions (status, country, category). The description adds no extra meaning beyond what the schema already provides, earning the baseline score of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'List' and the resource 'integrations', with examples of categories that distinguish it from sibling list tools like 'list_articles' or 'list_features'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The description does not mention any conditions, prerequisites, or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_plansAInspect

List HelloBooks pricing plans with monthly + annual prices in 8 regional currencies (USD, INR, CAD, GBP, AUD, AED, SGD, NZD). Covers four core tiers — Free / Pro / Business / Partner Program (the cpa plan id) — plus two per-entity stackable add-ons (Warehouse, Manufacturing). Returns AI credit allowance, feature bullets (AI auto-categorization, unlimited users, multi-entity, 3-way matching, API access, etc.), and the public signup URL. Filter by plan (one of free / pro / business / cpa) or country (ISO code). Pricing follows Doc 19 v3 (Web-Fire #514, 2026-06-12): Business re-introduced as a 4th tier sized ~4× Pro to match Partner Points; the retired "$59.99/mo + $4.99/client" CPA SKU is gone and the cpa plan id now resolves to the free Partner Program (call partner_program_info for the status ladder + points math). HelloCPA Practice Management is a separate product on practice.hellobooks.ai — call practice_management_info, NOT this tool.

ParametersJSON Schema

Name	Required	Description	Default
`plan`	No	Restrict the response to a single plan tier (`cpa` = free Partner Program).
`country`	No	ISO country code. Filters prices to one country. Omit to return all 8 markets.
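
The description's routing advice (pricing here, the Partner Program ladder and Practice Management elsewhere) can be sketched as a small lookup. The topic labels and the function name `pick_tool` are hypothetical. Only the tool names come from the description above.

```python
# Topic labels are invented for illustration; tool names come from the description.
ROUTES = {
    "pricing": "list_plans",
    "partner_status_ladder": "partner_program_info",
    "practice_management": "practice_management_info",
}

VALID_PLAN_IDS = {"free", "pro", "business", "cpa"}  # cpa = free Partner Program


def pick_tool(topic):
    """Return the tool to call for a topic, or None if the topic is not covered."""
    return ROUTES.get(topic)


def plan_arguments(plan=None, country=None):
    """Build list_plans arguments, rejecting plan ids the description does not list."""
    if plan is not None and plan not in VALID_PLAN_IDS:
        raise ValueError(f"unknown plan id: {plan!r}")
    return {k: v for k, v in {"plan": plan, "country": country}.items() if v is not None}


print(pick_tool("practice_management"), plan_arguments(plan="cpa", country="IN"))
```

Omitting `country` leaves it out of the arguments entirely, which per the schema returns prices for all 8 markets.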

Tool Definition Quality

A4.7/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It transparently describes returns (prices, features, signup URL), pricing version reference (Doc 19 v3), and the deprecated/rebranded cpa plan. Implicitly read-only, but no explicit statement about safety. Slight gap in not stating it's a read operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded with main purpose, but subsequent sentences are dense with valuable context. Could be slightly more concise, but every sentence adds necessary detail. Not overly verbose given the complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all aspects: pricing tiers, currencies, add-ons, plan details, filtering, pricing version, disambiguation from related tools. No output schema, but return values are well described. Complete for a list/pricing tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions, but description adds significant value: explains enum meanings (e.g., cpa = free Partner Program), notes effect of omitting country, and provides context on plan tiers (Free, Pro, Business, Partner Program).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it lists HelloBooks pricing plans with monthly and annual prices in 8 currencies, covering specific tiers and add-ons. It distinguishes itself from sibling tools like partner_program_info and practice_management_info by explicitly stating when to use those instead.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use (list plans) and when-not-to-use (e.g., for Partner Program status ladder call partner_program_info, for Practice Management call practice_management_info). Also explains filtering options by plan and country.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_tax_ratesAInspect

List statutory tax-rate slabs by jurisdiction — IN GST (5/12/18/28 + zero + exempt + composition trader/manufacturer/restaurant + compensation cess), UK VAT (20 / 5 / zero / exempt), AU GST (10 / GST-free), US sales-tax (state-administered summary, no federal rate), CA GST 5% + HST 13% ON / 15% Atlantic, SG GST 9%, NZ GST 15%, AE VAT 5%. Filter by country, taxType (GST/VAT/Sales-Tax/HST/IGST/CGST-SGST/TDS/TCS), or scheme (standard / reduced / zero / exempt / composition / cess / state-summary). Every entry carries an effective-from date and an authoritative source URL (CBIC, gov.uk, ATO, CRA, IRAS, IRD, FTA, Tax Foundation) — agents should confirm the rate against the source before quoting figures to a user. Use this when a user asks "what is the GST rate on X?", "what VAT band does Y fall into?", or "what are the composition slabs in India?". This is the public statutory reference — for an org-specific tax assignment use the authenticated books_classify_event tool.

ParametersJSON Schema

Name	Required	Description
`scheme`	No	Filter by slab category — standard, reduced, zero, exempt, composition, cess.
`country`	No	Filter to one jurisdiction. Omit to return every supported country.
`taxType`	No	Filter by statutory tax type (GST, VAT, Sales-Tax, HST, etc.).
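
To make the slab figures concrete, the sketch below applies a few of the single-rate standard slabs quoted in the description to a net amount. It is a minimal arithmetic example, not part of the tool. The country keys simply reuse the labels from the description, the half-up rounding to two decimals is an assumption (rounding rules vary by jurisdiction), and, as the description advises, rates should be confirmed against the source URL before being quoted.

```python
from decimal import Decimal, ROUND_HALF_UP

# Standard rates (percent) as quoted in the tool description; confirm before use.
STANDARD_RATE_PERCENT = {
    "UK": Decimal("20"),  # VAT
    "AU": Decimal("10"),  # GST
    "SG": Decimal("9"),   # GST
    "NZ": Decimal("15"),  # GST
    "AE": Decimal("5"),   # VAT
}


def add_tax(net_amount, country):
    """Return (tax, gross) for a net amount at the country's standard rate.

    Rounds the tax to two decimals, half-up (an assumed convention).
    """
    net = Decimal(net_amount)
    rate = STANDARD_RATE_PERCENT[country]
    tax = (net * rate / Decimal("100")).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return tax, net + tax


tax, gross = add_tax("200.00", "NZ")
print(tax, gross)  # 30.00 230.00
```

Amounts are passed as strings and handled with `Decimal` so that currency values are not subject to binary floating-point error.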

Tool Definition Quality

A4.8/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses that every entry has an effective-from date and authoritative source URL, and warns agents to confirm rates against the source before quoting. No destructive behavior implied.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is fairly long but front-loaded with the core purpose, followed by detailed specifics. Every sentence adds value, though the description could be slightly more concise if the country examples were grouped more tightly. It is still well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description explains what the output contains: slabs, rates, effective dates, source URLs. It covers all supported countries and tax types, making it complete for a list tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with enums, but the description adds meaning by listing specific rates for countries (e.g., IN GST 5/12/18/28) and by explaining what each filter achieves. It enriches the schema rather than repeating it.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool lists statutory tax-rate slabs by jurisdiction, using the specific verb 'list' and the resource 'tax-rate slabs'. It distinguishes this public statutory reference from the authenticated books_classify_event tool, so the agent knows when to use each.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly describes when to use (user asks about tax rates like 'what is the GST rate on X?') and when not to (org-specific tax assignment, directing to books_classify_event). Also advises confirming rates against provided source URLs.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_videos (Grade A)

List HelloBooks product videos curated on the marketing site (homepage demo + feature walkthroughs) and the official @hellobooksai YouTube channel link. Each video returns title, description, category, watch URL, embed URL and thumbnail. Filter by category (demo / features / overview), featuredOnly, or free-text query. Use this when a user asks for a demo, walkthrough or video. Note: this is the curated set, not a live mirror of every channel upload — the response includes the channel URL for the full catalog.

Parameters

Name	Required	Description
`query`	No	Optional substring match against video title + description.
`category`	No	Filter to one video category (demo, features, overview).
`featuredOnly`	No	If true, only return videos flagged as featured on the marketing site.

Tool Definition Quality

Grade A (4.4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It discloses that the tool returns a curated set (not live mirror) and lists return fields. It implies read-only behavior but does not discuss authentication, rate limits, or pagination, which is acceptable for a simple list tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is five sentences, front-loading the purpose and return fields, followed by filters and a usage note. It is efficient but not perfectly concise; some minor redundancy exists (e.g., 'curated' is mentioned twice).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (3 optional params, no output schema), the description is complete: it covers purpose, return fields, filter options, and context (curated set, channel URL for full catalog). No obvious gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description reinforces filter options by mentioning categories, featuredOnly, and query, but adds no new semantic details beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists HelloBooks product videos curated on the marketing site and YouTube channel. It specifies the resources and distinguishes from sibling tools like list_articles and list_features, which focus on other content types.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly advises when to use this tool: 'Use this when a user asks for a demo, walkthrough or video.' It also notes the limitation of being a curated set and directs to the channel URL for the full catalog, providing an alternative for broader searches.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

local_payment_methods (Grade A)

List local bank-rail / wallet payment methods relevant to HelloBooks invoice collection (AR), B2B supplier payments (AP), and contractor payouts (UPI, RuPay, Razorpay, IMPS, NEFT, RTGS, BACS, FPS, CHAPS, Open Banking, Interac e-Transfer, EFT, PayID, PayTo, NPP, BPAY, ACH, Same Day ACH, Fedwire, RTP, Zelle, PayNow, FAST, GIRO, NZ Direct Credit, etc.). Returns rail (instant / same-day / next-day / multi-day), use-cases, issuing authority, HelloBooks support level, and operational notes (per-transaction caps, settlement windows, retirement timelines). Filter by country, useCase, rail, or id.

Parameters

Name	Required	Description
`id`	No	Return a single payment method by id (e.g. "in-upi", "au-payid", "us-rtp").
`rail`	No	Filter by settlement rail (instant, same-day, next-day, multi-day).
`country`	No	Filter to one country (IN, US, CA, GB, AU, AE, SG, NZ).
`useCase`	No	Filter by payment use-case. Defaults to HelloBooks' invoice-collection + b2b-supplier + contractor-payout scope; pass an explicit value to widen.
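
A minimal sketch of client-side validation for these filters, using only the `rail` and `country` values listed in the table above. The function name `payment_method_args` is hypothetical, and whether the server itself rejects or silently ignores unknown values is not stated on this page, so the checks here are purely a local safeguard.

```python
# Allowed values as listed in the parameter table above.
RAILS = {"instant", "same-day", "next-day", "multi-day"}
COUNTRIES = {"IN", "US", "CA", "GB", "AU", "AE", "SG", "NZ"}


def payment_method_args(country=None, rail=None, use_case=None, method_id=None):
    """Return an arguments dict for local_payment_methods, or raise ValueError."""
    if country is not None and country not in COUNTRIES:
        raise ValueError(f"unsupported country: {country!r}")
    if rail is not None and rail not in RAILS:
        raise ValueError(f"unknown rail: {rail!r}")
    args = {"id": method_id, "rail": rail, "country": country, "useCase": use_case}
    # Drop unset filters so the server applies its own defaults.
    return {key: value for key, value in args.items() if value is not None}


# Instant rails in Australia; useCase is left unset to keep the default scope.
args = payment_method_args(country="AU", rail="instant")
print(args)
```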

Tool Definition Quality

Grade A (4.4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry the full burden. It identifies the tool as a listing operation, which implies no side effects, and it details the return fields (rail, use-cases, issuing authority, support level, operational notes). It does not mention authentication or rate limits, but for a read-only list tool this is acceptable.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single dense paragraph but is well-structured with the main purpose front-loaded. It includes many examples and return fields, which is necessary for a diverse set of payment methods. Could be slightly more concise, but each sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite lacking an output schema, the description explicitly states what the tool returns (rail, use-cases, issuing authority, support level, operational notes). It covers the filtering options and default behavior, providing complete guidance for an agent to select and invoke the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All 4 parameters have descriptions in the JSON schema (100% coverage). The description adds value beyond the schema by clarifying the default behavior of the useCase filter and listing examples of payment methods. It provides meaningful context for parameter usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists local payment methods relevant to HelloBooks' invoice collection, B2B supplier payments, and contractor payouts. It provides specific examples (UPI, RuPay, etc.) and defines the return fields. No sibling tools have similar functionality, so it's well-distinguished.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains the tool's purpose and the available filters (country, useCase, rail, id). It mentions a default scope for useCase, which guides typical usage. However, it does not explicitly state when not to use this tool or suggest alternatives, though sibling tools are unrelated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

lookup_tax_rate (Grade A)

Pick a single statutory tax-rate slab — either by exact id (e.g. IN-standard-18, GB-zero-0, CA-hst-13-on) for a deterministic lookup, or by country + free-text category (e.g. "office supplies", "restaurant", "exports", "domestic fuel") for a fuzzy best-match. Returns the matched rate, the match score, and the authoritative source URL. Use this when a user asks "what slab does X fall into in India?" or "what VAT rate applies to children's car seats?". For broader exploration (all slabs in a country / all rates of one scheme), use list_tax_rates. No customer data — public statutory reference only.

Parameters

Name	Required	Description
`id`	No	Exact rate id, e.g. `IN-standard-18` or `GB-zero-0`. When set, country/category are ignored.
`country`	No	Country to search within. Required when `id` is not provided.
`category`	No	Free-text query — "office supplies", "restaurant", "exports", "domestic fuel".
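
The two invocation modes and the precedence rule in the table above can be sketched as a small argument builder. This is an illustration under stated assumptions, not the server's own logic: the function name `lookup_tax_rate_args` is hypothetical, and dropping `country`/`category` on the client when `id` is set simply mirrors the documented "ignored" behavior.

```python
def lookup_tax_rate_args(rate_id=None, country=None, category=None):
    """Build arguments for lookup_tax_rate.

    Exact mode: `id` alone (country/category would be ignored anyway).
    Fuzzy mode: `country` is required, `category` is optional free text.
    """
    if rate_id:
        return {"id": rate_id}
    if not country:
        raise ValueError("country is required when id is not provided")
    args = {"country": country}
    if category:
        args["category"] = category
    return args


# Deterministic lookup: the extra filters are discarded.
exact = lookup_tax_rate_args(rate_id="GB-zero-0", country="IN", category="exports")

# Fuzzy best-match within one country.
fuzzy = lookup_tax_rate_args(country="GB", category="children's car seats")

print(exact)
print(fuzzy)
```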

Tool Definition Quality

Grade A (4.9/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description fully carries the burden of behavioral disclosure. It explains the deterministic lookup for id, fuzzy matching for category, return values (rate, match score, source URL), and explicitly states 'No customer data — public statutory reference only.' This is comprehensive for a read-only lookup tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is reasonably concise but slightly verbose with examples; however, every sentence serves a purpose. It is front-loaded with the core action and then elaborates with modes and examples. Minor improvement could be made by tightening the example list, but overall well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (3 parameters, 1 enum, no output schema), the description is complete: it explains both usage modes, parameter semantics, return values, and differentiates from the sibling tool. The lack of output schema is compensated by stating what is returned (rate, match score, source URL).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Although schema coverage is 100%, the description adds significant meaning beyond the schema by explaining the two invocation modes, the precedence rule (id overrides country/category), and providing concrete examples for both id (e.g., 'GB-zero-0') and category (e.g., 'office supplies'). This aids the agent in constructing correct parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Pick a single statutory tax-rate slab' with two distinct modes (exact id or country+category). It distinguishes from the sibling tool list_tax_rates by specifying that this tool is for a single slab lookup, while the sibling is for broader exploration.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance: use when a user asks 'what slab does X fall into in India?' or 'what VAT rate applies to children's car seats?', and explicitly advises against using for broader exploration, directing to list_tax_rates instead. This clearly delineates when and when not to use the tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

partner_program_info (Grade A)

Return the HelloBooks Partner Program structure — free reseller channel for accountants/CPAs/bookkeepers who put their clients on Pro/Business plans. Earned wholesale discount model (no commission): Bronze 5% at 25 pts / Silver 10% at 75 / Gold 15% at 300 / Platinum 20% at 1,000+. Pro client = 1 pt/mo, Business client = 4 pts/mo. Call with no args for the full ladder + meta; with points for current status + how many points to next tier; with proClients and/or businessClients to derive points from the client book and then return the same status verdict. The Partner Program is the cpa plan id in list_plans (the retired $59.99 SKU is gone). HelloCPA Practice Management at practice.hellobooks.ai is a different product and NOT the Partner Program.

Parameters

Name	Required	Description
`points`	No	Current Partner Points total. If supplied, the response includes the partner's status + how many more points are needed to reach the next tier.
`proClients`	No	Active Pro-plan clients (1 point each per month). Combined with `businessClients` to derive points if `points` is not supplied directly.
`businessClients`	No	Active Business-plan clients (4 points each per month). Combined with `proClients` to derive points.
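
The point arithmetic described above can be worked through locally. The sketch below uses the tier thresholds and per-client weights from the description, and it makes several assumptions that are not confirmed by this page: thresholds are treated as inclusive, the derived total is one month's points from the active client book, and the result field names (`tier`, `discountPct`, `nextTier`, `pointsToNext`) are hypothetical rather than the server's response shape.

```python
# Tier ladder from the description: (name, minimum points, wholesale discount %).
TIERS = [("Bronze", 25, 5), ("Silver", 75, 10), ("Gold", 300, 15), ("Platinum", 1000, 20)]


def partner_status(pro_clients=0, business_clients=0, points=None):
    """Place a partner on the ladder.

    Points are derived from the client book (Pro = 1, Business = 4)
    unless `points` is supplied directly.
    """
    if points is None:
        points = pro_clients * 1 + business_clients * 4
    tier, discount = None, 0
    next_tier, points_to_next = None, 0
    for name, threshold, pct in TIERS:
        if points >= threshold:  # assumption: reaching the threshold earns the tier
            tier, discount = name, pct
        else:
            next_tier, points_to_next = name, threshold - points
            break
    return {
        "points": points,
        "tier": tier,
        "discountPct": discount,
        "nextTier": next_tier,
        "pointsToNext": points_to_next,
    }


# 30 Pro clients + 15 Business clients = 30 + 60 = 90 points.
status = partner_status(pro_clients=30, business_clients=15)
print(status)
```

With 90 points the sketch reports Silver (10%), with 210 more points needed for Gold.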

Tool Definition Quality

Grade A (4.4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description carries full burden. It discloses the tool is read-only, describes the point derivation from client counts, and lists tier thresholds. It does not mention error handling or auth requirements, but the behavior is well-explained for a read operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single dense paragraph. It is efficient and front-loaded with the core purpose, but could benefit from bullet points or structured sections for readability. However, every sentence adds value with no fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, so the description must cover return values. It mentions returning full ladder, current status, points to next tier, and status verdict. It does not detail the exact response structure, but given the simple nature of the tool, it is nearly complete. Sibling tools are mostly unrelated analysis tools, so context is adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All three parameters have descriptions in the schema, but the description adds significant value beyond those: it explains point calculations (1 pt for Pro, 4 pts for Business), tier breakpoints (Bronze 5% at 25, etc.), and how parameters interact. This is a strong enhancement.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns the Partner Program structure, specifies the audience (accountants/CPAs/bookkeepers), and distinguishes from sibling tools like list_plans (noting the plan id) and practice_management_info (explicitly stating it is a different product).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains three parameter combinations (no args, with points, with proClients/businessClients) and what each returns. It lacks explicit when-not-to-use instructions but provides sufficient context for usage scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

practice_management_info (Grade A)

Return HelloCPA Practice Management info — the standalone product at practice.hellobooks.ai for running a CPA / CA / bookkeeping practice (proposals + CPQ, workflow, time tracking, billing, 6-role RBAC, Gmail/Outlook/Calendar sync, CSV migration from TaxDome / Karbon / Canopy). NOT the Partner Program and NOT a tier in list_plans. Per-user pricing model — US shipped at $9.99/user/month (free up to 2 users + 10 clients, 90-day trial, enterprise at 50+ users). 7 other markets (IN, GB, AU, CA, AE, SG, NZ) are roadmap as of 2026-06-12. Call with no args for the full 8-region matrix + features + meta, or with country for one region's status + pricing + competitor frame.

Parameters

Name	Required	Description	Default
`country`	No	ISO country code. Omit to get the full 8-region matrix. US is shipped; the other 7 are roadmap as of 2026-06-12.

Tool Definition Quality

Grade A (4.9/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It discloses the tool returns an 8-region matrix with status, pricing, and competitor framing, and notes that 7 markets are roadmap as of a specific date. It also mentions per-user pricing, free tier, trial, and enterprise. No behavioral traits are hidden.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is packed with useful information but is relatively long. It front-loads the core function, then distinguishes from siblings, then provides pricing and usage details. Each sentence adds value, though slight trimming could improve conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema, the description fully explains return values: full matrix or per-region status/pricing/competitor frame. It also covers pricing model, free tier, trial, and roadmap. The tool has only one optional parameter, and all relevant details are covered.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% (country parameter documented with enum and description). The description adds significant context beyond the schema, such as 'US is shipped; the other 7 are roadmap as of 2026-06-12' and explains behavior when omitted vs. provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns HelloCPA Practice Management info, specifies the standalone product at practice.hellobooks.ai, lists features, and distinguishes it from the Partner Program and plan tiers. The verb 'Return' plus explicit resource makes purpose unmistakable.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says when to use it (for Practice Management info) and what it is NOT (not Partner Program, not a plan tier). It provides guidance on calling without args for full matrix or with country for one region, and references sibling tools like list_plans and partner_program_info indirectly.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
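
One way to generate the claim file is sketched below. It writes the structure shown above to `.well-known/glama.json` under a site root. The function name `write_glama_claim` and the use of a temporary directory as the site root are assumptions for illustration; how the file is actually served depends on your hosting setup.

```python
import json
import tempfile
from pathlib import Path


def write_glama_claim(site_root, email):
    """Write .well-known/glama.json under site_root and return its path."""
    claim = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email}],
    }
    target = Path(site_root) / ".well-known" / "glama.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(claim, indent=2) + "\n", encoding="utf-8")
    return target


# Demo against a throwaway directory; point site_root at your real web root instead.
with tempfile.TemporaryDirectory() as root:
    path = write_glama_claim(root, "your-email@example.com")
    written = json.loads(path.read_text(encoding="utf-8"))
    relative_path = path.relative_to(root).as_posix()

print(relative_path)
```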
