HelloBooks AI Agents
Server Details
agents.hellobooks.ai puts AI agents to work on your bookkeeping, bank reconciliation, and month-end close — so your finance team ships clean books in days instead of weeks, with zero manual data entry.
- Status
- Healthy
- Last Tested
- Transport
- Streamable HTTP
- URL
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.3/5 across 26 of 26 tools scored. Lowest: 3/5.
Each tool has a clearly distinct purpose, with detailed descriptions that specify the exact input and use case. Even similar tools like QBO vs Xero journal analyzers are differentiated by source system and specific checks, leaving no ambiguity.
The majority of tools follow a verb_noun snake_case pattern (e.g., analyze_balance_sheet, list_features). A few exceptions like how_munimji_helps and feature_search deviate slightly, but overall the convention is consistent and readable.
With 26 tools covering financial analysis, compliance, migration, and product info, the count is on the higher side but justified by the breadth of the accounting domain. Each tool serves a specific need without redundancy.
The tool set covers all major financial statement checks (balance sheet, P&L, trial balance, journal entries), compliance, migration, and product details. A notable gap is the lack of cash flow analysis, but the set is otherwise comprehensive for its scope.
Available Tools
26 toolsanalyze_balance_sheetAInspect
Take a Balance Sheet CSV export from QuickBooks Online, Xero, Zoho Books, or Wave (source auto-detected) and run three checks: (1) bs.equation_broken — the fundamental accounting equation Assets = Liabilities + Equity does not hold (every downstream ratio analysis is invalid until fixed); (2) bs.negative_asset — Cash / AR / Inventory line items with negative balances (reconciliation error signal); (3) bs.negative_equity — Total Equity < 0 (insolvency signal). Input is raw CSV text of a Balance Sheet (Reports → Balance Sheet in QBO / Xero / Zoho / Wave). Max 5,000 rows; max 5 MB. Returns flags with severity, totals (totalAssets, totalLiabilities, totalEquity, equationBalances boolean), and a shareable URL. Use this when a user pastes a Balance Sheet and asks "does my balance sheet balance?", "is the accounting equation satisfied?", or "is my company solvent on paper?". A Balance Sheet that fails Assets = Liabilities + Equity invalidates every downstream financial-ratio analysis — this is the single most important check for any BS.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a Balance Sheet report. Works with QuickBooks Online (Reports → Balance Sheet), Xero (Reports → Balance Sheet), Zoho Books (Reports → Balance Sheet), and Wave (Reports → Balance Sheet). Statement should include Total Assets, Total Liabilities, and Total Equity rows. Source is auto-detected from section name signatures. | |
| fileName | No | Optional filename for the share-page label. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses auto-detection of source, three checks, return fields (flags, totals, URL), and constraints (max rows, max size). It lacks explicit statement about read-only nature but since it's analysis-only, it's adequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with core action and checks, followed by input details and usage prompts. While verbose, each sentence serves a purpose. Minor reduction could improve conciseness but structure is logical.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description sufficiently covers return values (flags, totals, URL) and constraints. It answers key agent questions about input format, limits, and typical use cases. Lacks details on severity levels but otherwise complete for an analysis tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and baseline is 3. The description adds significant value beyond schema descriptions by detailing supported accounting software, required row structure, and auto-detection logic. The fileName parameter is clarified as optional label input.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool takes a Balance Sheet CSV and runs three specific checks (equation broken, negative asset, negative equity). It uniquely identifies the resource and verb, and distinguishes from sibling tools like analyze_profit_loss by focusing on balance sheet analysis.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly lists example user queries that trigger use this tool, such as 'does my balance sheet balance?' and 'is my company solvent?'. It also warns about downstream invalidity if equation fails. However, it does not contrast with sibling tools like analyze_trial_balance or state when not to use, preventing a perfect score.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
analyze_journal_varianceAInspect
Compare two periods of journal-entry data (QBO or Xero — source auto-detected from headers) and flag accounts whose movement deviates materially between periods. Aggregates lines per account into a net total for each period, then surfaces accounts where the period-over-period change crosses a materiality threshold (≥5% relative AND ≥$100 absolute; severity high at ≥50%, medium at ≥20%, low at ≥5%). Inputs are two CSV exports — periodACsv (earlier period) and periodBCsv (later period). Optional periodALabel / periodBLabel for human-readable flag messages (e.g. "Q1 FY2024" vs "Q2 FY2024"). Max 5,000 rows per period; max 5 MB each. Use this when a user pastes two periods and asks "what changed?", "show me variances", "what jumped period-over-period". Returns a flag list ordered by largest delta, a roll-up, and a shareable URL. Both periods must be the same source — mixing QBO + Xero in one call returns an error.
| Name | Required | Description | Default |
|---|---|---|---|
| periodACsv | Yes | Raw CSV text of the EARLIER period's journal-entry export (QBO Journal Entries or Xero Manual Journals). Source is auto-detected from the headers. | |
| periodBCsv | Yes | Raw CSV text of the LATER period's journal-entry export. Source is auto-detected from the headers (must match periodACsv). | |
| periodALabel | No | Optional human label for the earlier period — e.g. "Q1 FY2024". Used in flag messages. | |
| periodBLabel | No | Optional human label for the later period — e.g. "Q2 FY2024". |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses behavior: auto-detection of source, aggregation logic, materiality thresholds (≥5%, ≥100), severity levels, row/ size limits, and return type. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Reasonably concise for the detail provided, but dense. Could benefit from bullet points for readability, but not a major issue for AI use.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and no annotations, the description covers inputs, behavior, constraints, output (flag list, roll-up, shareable URL), and error condition. Complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds significant value: explains periodACsv/BCsv as earlier/later, optional labels for human-readable output, and auto-detection. Goes beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool compares two periods of journal-entry data and flags material deviations. It uses specific verbs ('compare', 'flag') and resources ('journal-entry data'), and distinguishes well from sibling tools like analyze_balance_sheet.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly mentions when to use (e.g., 'what changed?', 'show me variances'), and includes constraints like same source required. Lacks explicit 'when not to use' or alternatives, but context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
analyze_profit_lossAInspect
Take a Profit & Loss / Income Statement CSV export from QuickBooks Online, Xero, Zoho Books, or Wave (source auto-detected from section names) and run three checks: (1) pnl.subtotal_mismatch — each "Total Section" subtotal equals the sum of its preceding line items (catches missing or duplicated rows); (2) pnl.negative_expense — flags expense-section line items with negative amounts (usually sign-flips or refunds posted to the wrong side); (3) pnl.margin_red_flag — gross-profit margin < 5% or > 95%, or negative total revenue. Input is raw CSV text of a P&L report (Reports → Profit and Loss in QBO / Xero / Zoho / Wave). Max 5,000 rows; max 5 MB. Returns flags with severity, a summary with totalRevenue / totalCogs / grossProfit / grossMarginPct / netIncome (when detected), and a shareable URL at agents.hellobooks.ai/r/{slug}. Use this when a user pastes a P&L and asks "does my P&L look right?", "any sign errors?", "what is my gross margin?", or "anything suspicious in my income statement?". For period-over-period comparison use analyze_journal_variance with two periods of journal-entry data; this tool is single-period only.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a Profit & Loss / Income Statement report. Works with QuickBooks Online (Reports → Profit and Loss), Xero (Reports → Profit and Loss), Zoho Books (Reports → Profit & Loss), and Wave (Reports → Profit & Loss). Source is auto-detected from section names. Statement should include section headers, line items, "Total X" subtotals, and a Net Income / Net Profit row at the bottom. | |
| fileName | No | Optional filename for the share-page label. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It details the three checks performed (subtotal_mismatch, negative_expense, margin_red_flag), constraints (max 5000 rows, max 5 MB), and output (flags with severity, summary, shareable URL). It does not mention any destructive effects or authentication requirements, but the tool is read-only by nature, so the description is sufficiently transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is detailed but efficient, front-loading the core action and checks. Each sentence adds value: sources, three checks, input format, limits, output, usage guidance. It could be slightly shortened without losing substance, but it is well-structured and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description is complete for a tool with no output schema. It covers the input format, supported sources, the three checks with their purposes, constraints, output fields (flags, summary with financial metrics, shareable URL), and explicit usage examples. It also provides a clear alternative for multi-period analysis, making it self-contained.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the schema already documents both parameters. The description adds significant value by explaining the expected format of csvText (source auto-detection, required sections) and the optional nature of fileName. This goes beyond the schema's minimal descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it analyzes a P&L CSV from specific sources (QuickBooks, Xero, Zoho, Wave) and runs three specific checks. It distinguishes itself from sibling tools like analyze_balance_sheet and analyze_journal_variance by explicitly mentioning the tool's scope (single-period P&L) and alternatives for period-over-period analysis.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool (user pastes a P&L and asks about correctness, sign errors, gross margin, or suspicious items) and when not to (period-over-period comparison should use analyze_journal_variance). This provides clear guidance for an AI agent to select the correct tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
analyze_qbo_journal_anomaliesAInspect
Scan a QuickBooks Online "Journal Entries" CSV export for anomalies — currently round-number lines (debit or credit amounts that are exact multiples of $1,000, above a $1,000 materiality threshold). Round numbers are statistically rare in real bookkeeping and frequently indicate estimates, plugs, or fraud signals worth review. Input is raw CSV text from QBO Reports → Accountant → Journal. Max 5,000 rows; max 5 MB. Returns flagged lines with severity ($100K+ high, $10K+ medium, else low) and a shareable URL. Use this when a user pastes QBO data and asks "any anomalies?", "look for round numbers", or "anything suspicious". Tier-0 subset — HelloBooks Phase 3.0 anomaly detection in the paid product additionally catches GL outliers vs entity history, vendor-history mismatches, archived-vendor activity, and AI-narrated suspicious lines (which require the live HelloBooks account).
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a QuickBooks Online "Journal Entries" report. Export from QBO: Reports → Accountant → Journal → Export as CSV. Paste the file contents directly. | |
| fileName | No | Optional original filename, used only as a label on the share page. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It details the scanning behavior (round-number lines, multiples of $1,000 above $1,000 threshold), severity levels, and output including a shareable URL. It also states limits (5k rows, 5 MB). It does not cover authentication or data storage, but the description is sufficiently transparent for the tool's purpose.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured, starting with the main action, then details, usage cues, constraints, and relationship to paid product. Every sentence adds value without redundancy. It is concise yet comprehensive.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description explains output (flagged lines with severity, URL) and input constraints. It covers the main use cases. However, it does not differentiate from sibling tools like analyze_qbo_journal_cleanup, which is a minor gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both parameters described. The description adds significant meaning beyond the schema: csvText is specified as raw CSV text from QBO export, and fileName is an optional label. This provides essential context for correct invocation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool scans a QBO Journal Entries CSV for anomalies, specifically round-number lines. It specifies the input format, constraints, output (flagged lines with severity and URL), and includes usage cues. This distinguishes it from sibling tools by focusing on journal entries and round-number detection.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says when to use this tool: when a user pastes QBO data and asks about anomalies, round numbers, or suspicious items. It notes this is a Tier-0 subset and mentions the paid product for broader detection. However, it does not explicitly state when not to use it or compare to sibling tools like analyze_qbo_journal_cleanup.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
analyze_qbo_journal_cleanupAInspect
Scan a QuickBooks Online "Journal Entries" CSV export for cleanup issues — unbalanced journals (debits ≠ credits, with severity by deviation), duplicate journals (same date + same totals, likely posted twice), and schema problems (invalid dates, malformed amounts, missing accounts, missing journal numbers). Input is the raw CSV content the user pastes after exporting from QBO via Reports → Accountant → Journal → Export. Max 5,000 rows; max 5 MB. Returns a structured flag list with severity (high/medium/low), a roll-up summary by category and severity, parse diagnostics (column mapping + unmapped columns), and a shareable URL at agents.hellobooks.ai/r/{slug} (7-day TTL) that renders a branded analysis page suitable for sending to a CA or bookkeeper. Use this when a user pastes QBO journal data, asks "check my books", "find issues in my QBO journal", or "what is wrong with my journal entries". Each flag includes a fixableInHellobooks boolean — true means HelloBooks can resolve it automatically in the paid product.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a QuickBooks Online "Journal Entries" report. Export from QBO: Reports → Accountant → Journal → Export as CSV. Paste the file contents directly. | |
| fileName | No | Optional original filename, used only as a label on the share page. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fully discloses behavioral traits: max 5,000 rows, max 5 MB input, output structure (flag list, summary, diagnostics, shareable URL with 7-day TTL), and the 'fixableInHellobooks' boolean. It leaves no ambiguity about what the tool does and its constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is comprehensive but slightly verbose; however, it is well-structured with clear sections and front-loaded with the primary purpose. Every sentence adds value, though some details could be streamlined without losing clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains the return format (structured flag list, roll-up summary, parse diagnostics, shareable URL). It covers input limits, output details, and use cases, making it complete for the tool's complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema coverage is 100%, and the description adds value beyond the schema by explaining the origin of the CSV (QBO export path) and that the user pastes the content. While the schema descriptions already include the export instructions, the main description reinforces the context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Scan', 'for cleanup issues') and identifies three distinct categories of issues (unbalanced, duplicates, schema problems). It clearly distinguishes the tool's focus on cleanup of QBO journal exports, differentiating it from siblings like 'analyze_qbo_journal_anomalies' which likely addresses anomalies rather than cleanup.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use the tool: when a user pastes QBO journal data or asks specific queries like 'check my books' or 'find issues in my QBO journal'. It provides concrete examples but does not mention when not to use it or list alternative tools, missing a clear exclusionary criterion.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
analyze_trial_balanceAInspect
Take a Trial Balance CSV export from QuickBooks Online, Xero, Zoho Books, or Wave (source auto-detected from headers — YTD columns indicate Xero, Opening Balance indicates Zoho, etc.) and run three checks: (1) tb.unbalanced — debits ≠ credits (every downstream P&L / BS / cash-flow report built from this TB is wrong until fixed); (2) tb.wrong_sign — accounts whose name suggests a class (Revenue / COGS / Expense / AR / AP) carrying a balance on the wrong side (classic posting-error signal); (3) tb.round_balance — exact-multiple-of-$10,000 balances (plug-entry signal). Input is raw CSV text of a Trial Balance report. Max 5,000 rows; max 5 MB. Returns flagged accounts with severity, a roll-up showing whether the TB balances, parse diagnostics, and a shareable URL at agents.hellobooks.ai/r/{slug}. Use this when a user pastes a Trial Balance and asks "does my TB balance?", "are there sign errors?", "what looks suspicious?", or "is this TB clean?". The Trial Balance is the foundation document for every other financial statement — if it does not balance, every downstream report is invalid.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a Trial Balance report. Works with QuickBooks Online (Reports → Trial Balance), Xero (Reports → Trial Balance), Zoho Books (Reports → Accountant → Trial Balance), and Wave (Reports → Trial Balance). Source is auto-detected from column headers. | |
| fileName | No | Optional filename for the share-page label. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Given no annotations, description fully discloses behavior: three checks with implications, output components (flagged accounts, roll-up, diagnostics, URL), and limits (5000 rows, 5MB). No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is informative but lengthy, especially the detailed explanation of each check's implications. While well-structured and front-loaded, it could be more concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers input, auto-detection, checks, output (including shareable URL), and usage context. Without an output schema, it provides adequate detail on return values, though some specifics of the response structure are omitted.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%; both parameters have descriptions. The description reinforces the csvText parameter's source details but adds little beyond schema. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool analyzes a Trial Balance CSV from specific sources, runs three distinct checks, and distinguishes itself from sibling tools like analyze_balance_sheet or analyze_profit_loss.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly specifies when to use ('does my TB balance?', 'are there sign errors?') and implies alternatives for journal-related queries. Does not explicitly state when not to use, but the context is clear enough.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
analyze_xero_journal_anomaliesAInspect
Scan a Xero "Manual Journals" CSV export for anomalies — currently round-number lines (debit or credit amounts that are exact multiples of $1,000, above a $1,000 materiality threshold). Input is raw CSV text from Xero Accounting → Advanced → Manual Journals → Export. Max 5,000 rows; max 5 MB. Returns flagged lines with severity ($100K+ high, $10K+ medium, else low) and a shareable URL. Use this when a user pastes Xero data and asks "any anomalies?", "look for round numbers", or "anything suspicious". Same Tier-0 / paid-product split as the QBO variant — history-aware anomaly checks (GL outliers, vendor history, archived-vendor activity, LLM-narrated suspicious) live in the authenticated MCP / paid product.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a Xero "Manual Journals" report. Export from Xero: Accounting → Advanced → Manual Journals → Export. Paste the file contents directly. | |
| fileName | No | Optional original filename, used only as a label on the share page. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses input constraints (max rows, size), the type of anomalies detected, and the severity classification. It also clarifies what it does not cover (advanced checks).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the purpose and includes necessary details about input, usage, and limitations. It is structured but somewhat lengthy; however, each sentence contributes meaningful information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains the output (flagged lines with severity levels and shareable URL) and provides context about anomaly criteria and tool scope. It covers input format and constraints.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema descriptions cover both parameters at 100%. The description adds value by specifying the Xero export path and file size/row limits, supplementing the schema beyond basic descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool scans a Xero Manual Journals CSV for anomalies like round-number lines with a specific threshold. It specifies the source format and what it detects, and differentiates from QBO variant and cleanup tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit use cases ('when a user pastes Xero data and asks...') and distinguishes basic anomaly detection from advanced checks in the paid product. Could be more explicit about when to use sibling tools like analyze_xero_journal_cleanup.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
analyze_xero_journal_cleanupAInspect
Scan a Xero "Manual Journals" CSV export for cleanup issues — unbalanced journals, duplicate journals (same date + same totals), and schema problems (invalid dates, malformed amounts, missing account code/name, missing group key). Input is the raw CSV content the user pastes after exporting from Xero via Accounting → Advanced → Manual Journals → Export. Xero-specific idioms handled: signed Amount column (positive = credit, negative = debit), explicit Debit/Credit fallback shape, Reference-or-Narration+Date grouping, account code preferred over name. Max 5,000 rows; max 5 MB. Returns structured flags with severity, a roll-up summary, parse diagnostics, and a shareable URL at agents.hellobooks.ai/r/{slug}. Use this when a user pastes Xero manual-journal data, asks "check my Xero books", or "find issues in my Xero journal". The funnel CTA routes to /migrate/from-xero for users who want to fix at scale.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a Xero "Manual Journals" report. Export from Xero: Accounting → Advanced → Manual Journals → Export. Paste the file contents directly. | |
| fileName | No | Optional original filename, used only as a label on the share page. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses important behaviors: max 5,000 rows, max 5 MB, Xero-specific idioms (signed Amount, Debit/Credit fallback), input format, and output structure (flags, summary, diagnostics, shareable URL). It lacks explicit error handling or idempotency info but is still strong.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is fairly long but every sentence is informative. It is front-loaded with the main purpose, followed by details, constraints, and output. It could be slightly more concise, but the thoroughness is justified given the complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and moderate complexity, the description covers all essential aspects: what it does, input format, specific checks, constraints, output structure, and even a funnel CTA. It is fully sufficient for an agent to invoke the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions. The description adds extra context: explains the csvText is raw CSV from Xero export via specific path, and the fileName is an optional label. It also details the Xero-specific data format, which helps the agent prepare correct input.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Scan a Xero Manual Journals CSV export for cleanup issues', specifying verb (scan) and resource (Manual Journals CSV). It lists specific issues (unbalanced, duplicate, schema) and distinguishes from sibling tools like 'analyze_xero_journal_anomalies' by focusing on cleanup.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use this when a user pastes Xero manual-journal data' and gives example queries ('check my Xero books', 'find issues in my Xero journal'). It provides clear context, though it does not explicitly mention alternatives or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
compare_books_to_hellobooksAInspect
Take a QBO or Xero journal-entry CSV (source auto-detected), run the full Tier-0 detection set (imbalance + duplicates + round-number + schema), and return a structured side-by-side comparison — "your books have X issues; here is how HelloBooks resolves each phase". This is the direct funnel tool: the response includes per-category counts mapped to HelloBooks Phases 1, 2, 3.0, 3.1, with exclusive-advantage bullets (command-center dashboard, conversational interface, one-prompt JE posting, cross-phase orchestration, auto ID resolution). Use this when a user is evaluating HelloBooks vs their current QBO/Xero, asks "should I migrate?", or pastes data while comparing accounting software. Output is suitable for the host LLM to narrate as a positioning argument; the share URL points at a branded landing page with the issue breakdown and a 1-click migrate CTA.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a journal-entry export from QBO ("Journal Entries") or Xero ("Manual Journals"). Source is auto-detected from the headers. | |
| fileName | No | Optional filename label. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, but description fully discloses behavior: auto-detects source, runs Tier-0 detection (imbalance, duplicates, round-number, schema), returns per-category counts mapped to phases, includes advantage bullets, and provides a share URL. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is informative but somewhat lengthy with promotional language. However, it is well-structured, front-loading the main action, and every sentence adds value for its purpose as a migration funnel tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but description fully explains the return: structured side-by-side comparison with per-category counts, phases, advantage bullets, and share URL. It also notes output is suitable for host LLM narration. Comprehensive for a comparison tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for both parameters (csvText, fileName). Description adds little beyond schema details; it reiterates auto-detection and optional label. Baseline 3 is appropriate as the schema already explains parameters adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool compares a QBO/Xero journal-entry CSV against HelloBooks, running detection and returning a structured comparison. It distinguishes itself from sibling analysis tools by being a 'direct funnel tool' for migration evaluation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: when evaluating HelloBooks vs QBO/Xero, asking about migration, or comparing accounting software. Does not mention when not to use or list alternatives, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
compliance_capabilitiesAInspect
Return supported compliance frameworks for a country (BAS, STP, GST, MTD, 1099, etc.) with version and certification info.
| Name | Required | Description | Default |
|---|---|---|---|
| country | Yes | Required ISO country code. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden. It discloses that the tool returns version and certification info with examples, but does not explicitly state that it is a read-only operation, nor does it discuss error handling or authorization requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single sentence that is front-loaded with the action and resource, includes illustrative examples, and contains no extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple input (one enum parameter), full schema coverage, and no output schema, the description is largely complete. It could be improved by clarifying the structure of the response, but it sufficiently conveys what the tool does.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The sole parameter 'country' has full schema coverage (enum values with description). The description adds context by listing example frameworks, but does not add meaning beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Return'), the resource ('supported compliance frameworks for a country'), and provides specific examples (BAS, STP, GST, etc.). It distinguishes itself from siblings like compliance_deadlines.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description indicates the tool is for retrieving compliance frameworks for a country, but does not explicitly state when to use it versus alternatives, nor does it mention any exclusions or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
compliance_deadlinesAInspect
When statutory returns and payroll filings are due, per country. Covers IN (GSTR-1/3B/9/9C, CMP-08, Form 24Q, Form 16, PF ECR, ESI), AU (BAS, STP, Super Guarantee), GB (VAT MTD, RTI, Self Assessment), US (1099-NEC/MISC, W-2, Form 941/940), CA (T4, GST/HST). Optional country, frequency, and form filters. Note: dates rotate annually — every response carries a disclaimer with the per-deadline source URL for authority confirmation.
| Name | Required | Description | Default |
|---|---|---|---|
| form | No | Substring match against form name, e.g. "GSTR-3B", "BAS", "1099", "T4". Case-insensitive. | |
| country | No | ISO country code. Filters to one country (IN, AU, GB, US, CA covered today). | |
| frequency | No | Filing cadence. Useful for "what are my monthly returns" style queries. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses that dates rotate annually and responses include a disclaimer with source URL. No annotations exist, so this adds value. However, it doesn't mention whether the data is real-time or cached, or any authentication or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph that efficiently front-loads the purpose, then covers scope and filters. Every sentence adds information. It could be slightly improved with bullet points for the form lists, but it is already concise and scannable.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately tells what the tool does, its parameters, and notable behavior (rotating dates, disclaimer). It lacks a description of the response format, but for a list-returning tool, this is generally acceptable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the baseline is 3. The description lists the three optional filters (country, frequency, form) but provides minimal additional meaning beyond the schema's own descriptions. It doesn't repeat the substring match detail for form, so value is marginal.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns due dates for statutory returns and payroll filings per country, listing specific forms for multiple countries. This distinguishes it from all sibling tools, which focus on analysis or other compliance matters.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context on what the tool covers (countries and forms) and the optional filters. While it doesn't explicitly exclude alternative uses, the sibling tool list shows no competing deadlines tool, making usage guidance adequate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
country_supportAInspect
Return features available per supported country (AU, IN, UK, US, CA, AE, SG, NZ).
| Name | Required | Description | Default |
|---|---|---|---|
| country | No | Single ISO country code. Omit for the full matrix. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided. Description indicates a read-only operation, but lacks details on caching, rate limits, or data freshness. With no behavioral context, it is minimally adequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence is concise and front-loaded. However, could include a brief usage hint to improve completeness without adding much length.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one optional parameter, no output schema), the description covers the essential points: what it returns and for which countries. Adequate for its complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The tool description adds the list of countries and mentions 'Omit for the full matrix', which duplicates the schema's parameter description. No additional semantics beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the verb 'Return' and the resource 'features available per supported country', listing all supported countries. Distinguishes from sibling tools that focus on analysis and compliance.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives like 'feature_search' or 'list_features'. Usage is implied but not clarified.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
estimate_migration_effortAInspect
Take a QBO or Xero journal-entry CSV (source auto-detected) and return a structured migration-effort estimate — row counts, unique-account count, period span, complexity classification (low / medium / high), human-hours estimate, assisted-hours estimate, and an indicative price quote in USD. Heuristic-based — refined against the live entity once the user signs up. Accepts larger files than the other analytical tools (up to 50,000 rows / 20 MB) because no detection runs here, just sizing. Use this when a user is weighing the cost of moving books to HelloBooks, pastes data and asks "how long will migration take?", "what would this cost?", or "is it worth migrating?". The funnel CTA points at /migrate/?ref= to start the assisted flow with the parsed sizing pre-populated.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a journal-entry export from QBO ("Journal Entries") or Xero ("Manual Journals"). Source is auto-detected from headers. | |
| fileName | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries full burden. It discloses that the estimation is heuristic-based and refined later, and that the tool only does sizing (no detection runs). This provides good transparency, though it could explicitly state it doesn't modify data.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is dense but not verbose, front-loading the core purpose and adding necessary details (file size limits, heuristic nature, use cases) in a well-organized manner. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema, the description enumerates output fields clearly. It covers heuristic behavior, file size limits, and use cases. Could mention error handling for invalid CSVs, but overall it's sufficiently complete for an agent to understand what the tool does and returns.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 50%, and the description adds meaning beyond schema: source auto-detection, larger file capacity (50k rows/20 MB), and clarifies that the tool only sizes. For 'fileName', no additional info is provided, but overall the description compensates for schema gaps.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool takes a QBO or Xero journal-entry CSV, auto-detects the source, and returns a structured migration-effort estimate with specific fields (row counts, unique-account count, period span, complexity classification, hours estimates, price quote). This distinctively differentiates it from sibling analytical tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'Use this when a user is weighing the cost of moving books to HelloBooks, pastes data and asks...'. Also mentions it accepts larger files than other analytical tools because no detection runs, providing clear context for tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
feature_searchAInspect
Free-text search across the marketing feature catalog, plan features, integrations, country features, compliance frameworks, competitor positioning, statutory deadlines, local payment methods, and published articles on hellobooks.ai. Queries like "vs Xero", "QuickBooks alternative", "GSTR-3B due", "UPI invoice", "1099 article", or "agentic accounting" surface the matching entry near the top.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results to return (default 20). | |
| query | Yes | Free-text query, e.g. "BAS lodgement", "multi-currency", "vs QuickBooks", "GSTR-3B due", or "UPI invoice cap". |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description must disclose behavioral traits. It describes search behavior and result ordering ('surface the matching entry near the top') but does not explicitly state it is read-only, mention authentication, rate limits, or what happens on empty results.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph that efficiently conveys the scope and examples. It is front-loaded with the purpose, but could be slightly more structured (e.g., separate lines for examples) for readability.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description should indicate return format. It says 'surface the matching entry near the top' but does not clarify if results are a list, sorted by relevance, or include metadata. For a search tool with many siblings, more detail on response structure would help.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%. Description adds value by providing example queries for the 'query' parameter and stating the default for 'limit'. This supplements the schema without redundancy.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states 'Free-text search across the marketing feature catalog...' listing many specific sources. It distinguishes from sibling list tools by being a search across those catalogs, and provides clear examples of queries.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides example queries and contexts ('vs Xero', 'GSTR-3B due'), implying it is for free-text search. However, it lacks explicit guidance on when to use this tool versus sibling list tools (e.g., list_features, list_articles) or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
how_munimji_helpsAInspect
Explain how HelloBooks and Munimji (the in-app AI assistant) help a specific business — given a free-text description of the user's own operations. Returns a curated capability knowledge base: business-operation areas (sales, purchases, banking, tax, reports, inventory, payroll, multi-entity, setup), and for each AI capability WHO does the work — autonomous (Munimji does it on its own, e.g. OCR extraction, running reports), approval (Munimji prepares the entry and you one-click approve before it posts to the ledger, e.g. AI categorization, find-and-match, creating invoices/bills by chat), assist (co-pilot, e.g. guided onboarding, voice), or manual (a software feature you run yourself). Each capability links to the backing software features. Use this when a user describes their business and asks "how can HelloBooks help me?", "what can the AI do for my shop/practice/agency?", or "what can Munimji do on its own vs what do I approve?". Pass their description in businessDescription; optionally filter by area or autonomy. The AI never posts to a ledger without approval. For the full software catalog call list_features; for pricing call list_plans.
| Name | Required | Description | Default |
|---|---|---|---|
| area | No | Optional. Narrow to one business-operation area. | |
| autonomy | No | Optional. Filter Munimji capabilities by who does the work: `autonomous` (Munimji does it alone), `approval` (Munimji prepares, you approve before it posts), `assist` (co-pilot), `manual` (software feature you run yourself). | |
| businessDescription | No | Optional. The user describes their business and operations in their own words (industry, what they sell, how they get paid, who they pay, tax regime, pain points). It is echoed back as context — the calling assistant maps it to the returned areas + capabilities. No keyword scoring is done here; the LLM does the matching. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, but the description discloses that the AI never posts to a ledger without approval, and explains how capabilities are categorized by autonomy. It is transparent about the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is detailed but every sentence adds value. It is front-loaded with the main purpose. Could be slightly more concise, but it's well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description fully explains what is returned (curated capability knowledge base). It covers usage, parameters, return values, and provides complete context for an agent to use the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with good descriptions. The description adds extra context, such as that businessDescription is echoed back and no keyword scoring is done, which adds meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool explains how HelloBooks and Munimji help a specific business given a description. It lists the areas and autonomy levels, and distinguishes from sibling tools like list_features and list_plans.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says when to use this tool (when user asks 'how can HelloBooks help me?') and when to use alternatives ('for the full software catalog call list_features; for pricing call list_plans').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_articlesAInspect
List published articles on hellobooks.ai — head-to-head compare pages and curated flagship blog posts. Filter by country, tag or free-text query. Use this when a user asks "do you have a blog/article about X?".
| Name | Required | Description | Default |
|---|---|---|---|
| tag | No | Single tag to filter on (case-insensitive substring match against the article tag list). e.g. "gst", "1099", "tally". | |
| limit | No | Max articles to return (default 20). | |
| query | No | Free-text query — substring-matched against the title, excerpt and tags of each article. e.g. "QuickBooks alternative" or "audit trail". | |
| country | No | ISO country code or "global". Returns articles whose countryRelevance matches OR is "global". Omit to return everything. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description must cover behavioral traits. It states it lists published articles and filtering options, but does not disclose pagination, default limit (20), or what happens on no results. Adequate but could be more transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences: first defines purpose and scope, second provides usage context and examples. Every word is informative with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's 4 optional parameters and lack of output schema, the description covers the core functionality and when to use it. Missing details like default limit or return structure, but parameter descriptions in schema compensate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description reiterates filtering capabilities but does not add new meaning beyond the schema (e.g., parameter interaction logic).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the action 'List' and the specific resource 'published articles on hellobooks.ai' including subtypes (compare pages, blog posts). Distinct from sibling tools like list_videos or list_features.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides an explicit, actionable usage guidance: 'Use this when a user asks "do you have a blog/article about X?"'. Does not mention when not to use or explicitly compare to alternatives, but the context is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_competitorsAInspect
Return competitor positioning entries (QuickBooks, Xero, FreshBooks, Wave, Zoho Books, Tally) with where HelloBooks wins, where the competitor wins, and pricing notes. Optional country, tier (primary / secondary), and id filters.
| Name | Required | Description | Default |
|---|---|---|---|
| id | No | Return a single competitor by id (e.g. "quickbooks", "xero", "tally"). | |
| tier | No | Filter to head-on rivals (primary) or adjacent / segment-specific overlaps (secondary). | |
| country | No | Only return competitors whose primary market is this country, or who are also evaluated in this market. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the tool returns data with win/loss and pricing notes, implying a read-only operation, but it does not explicitly declare nondestructive behavior or disclose any safety, authentication, or rate-limiting aspects. The description is adequate but not enhanced beyond basic functionality.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences: first sentence states purpose and return content, second sentence lists optional filters. No extraneous information, very efficient and to the point.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (3 optional parameters, no nested objects, no output schema), the description adequately covers the return values (win/loss, pricing notes) and filter behavior. It is complete for an agent to understand and invoke correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all three parameters. The description adds value by mentioning specific competitors, return fields, and the purpose of filters (country, tier, id). This goes beyond the schema definitions that only name and enumerate the parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Return', the resource 'competitor positioning entries', and enumerates specific competitors (QuickBooks, Xero, etc.). It also specifies return fields (wins, losses, pricing notes). This distinguishes it well from sibling tools which are mostly analysis functions or other list tools in different domains.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context on when to use the tool (to retrieve competitor positioning with optional filters) but does not explicitly mention when not to use it or suggest alternatives. However, the sibling tools are quite distinct in purpose (balance sheet analysis, compliance, etc.), so the usage context is clear enough.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_credit_packsAInspect
List HelloBooks AI credit packs — one-time pay-as-you-go top-ups (Boost 500, Power 1,500, Mega 5,000, Ultra 15,000 credits) priced in 8 regional currencies (USD, INR, CAD, GBP, AUD, AED, SGD, NZD). Credit packs stack on any plan, including Free. Use this when a user asks how to buy more AI credits or top up after exhausting a plan allowance. Filter by id (boost / power / mega / ultra) or country (ISO code).
| Name | Required | Description | Default |
|---|---|---|---|
| id | No | Restrict the response to a single credit pack. | |
| country | No | ISO country code. Filters prices to one country. Omit to return all 8 markets. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Describes the tool as a list operation returning credit pack details and pricing. Adds context about stacking on any plan and filtering capabilities. Lacks mention of authentication, rate limits, or data recency, but is sufficient for a simple read-only list.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, no fluff. Front-loads the main purpose, then provides filtering details. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple list functionality with two optional parameters and no output schema, the description fully covers what the tool does, when to use it, and how to filter. It also explains the credit packs and currencies, making it self-contained.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with enum descriptions, but the description adds significant meaning: explains what credit packs are, lists specific pack names with credit amounts, and maps country codes to currencies. This enriches the schema enums with practical context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists credit packs, specifies the four pack names and credit amounts, and mentions regional currencies. It distinguishes from siblings like list_plans by clarifying these are one-time top-ups, not plans.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use guidance: 'Use this when a user asks how to buy more AI credits or top up after exhausting a plan allowance.' Does not explicitly mention when not to use or compare to alternative tools, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_feature_categoriesAInspect
List the 13 feature categories on the marketing site (Core Accounting, Invoicing, Banking, Reports, Tax & Compliance, Inventory, Warehouse, Manufacturing, AI, Integrations, Mobile, Operations, Industry Modules) with per-category counts by status.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description provides key behavioral details: it lists exactly 13 named categories and includes counts by status. No side effects or additional traits are needed for such a simple query tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that immediately states the action and lists the categories. It is front-loaded and contains no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (no parameters, no output schema), the description fully covers what the agent needs to know: the resource and the output format (counts by status). The 13 categories are enumerated for clarity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has zero parameters, and the description correctly avoids adding parameter information. The baseline for 0 parameters is 4, and the schema coverage is complete.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool lists the 13 feature categories on the marketing site, specifying each category and noting per-category counts by status. It distinguishes from siblings like list_features and list_integrations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Although no explicit when-to-use or when-not-to-use guidance is given, the description makes it clear this tool is for listing categories, which is distinct from other listing tools. The context is sufficient for an agent to infer appropriate usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_featuresAInspect
List the full HelloBooks marketing feature catalog (145+ items). Filter by category, tier, status, marketedOnly, or substring query.
| Name | Required | Description | Default |
|---|---|---|---|
| tier | No | Filter by the minimum plan tier / add-on that unlocks the feature. | |
| limit | No | Max number of features to return. Default 200. | |
| query | No | Optional substring match against label + shortDescription. | |
| status | No | Filter by rollout status. Defaults to all. | |
| category | No | Filter to one feature category (core-accounting, invoicing-billing, etc.). | |
| marketedOnly | No | If true, only return features marketed on the public website. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the burden. It indicates the tool lists items and allows filtering, which is appropriate for a read operation. However, it does not mention pagination, rate limits, or the actual behavior of the limit parameter (defaults to 200), which would be useful.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that efficiently conveys the tool's core function and filtering capabilities. No unnecessary words, and the key information is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 6 optional parameters and no output schema, the description provides a high-level overview but omits details like the default limit of 200 or the return format. While adequate for simple use, it could be more complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema provides 100% coverage of parameter descriptions, so the baseline is 3. The description mentions the main filter parameters (category, tier, status, marketedOnly, substring query) but does not add significant new meaning beyond what the schema already states.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists the full HelloBooks marketing feature catalog and specifies filtering options, making the purpose straightforward. However, it does not explicitly differentiate from the sibling tool 'feature_search', which could cause confusion about which tool to use for searching vs listing.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies the tool is used to list features with filters, but it lacks explicit guidance on when to use this tool versus alternatives like 'feature_search'. No exclusions or when-not-to-use advice is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_integrationsBInspect
List integrations (banks, payments, payroll, time tracking, shipping, accounting sync, ecommerce, CRM).
| Name | Required | Description | Default |
|---|---|---|---|
| status | No | Filter by rollout status. | |
| country | No | Only return integrations available in this country (or global). | |
| category | No | Filter to one integration category. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must disclose behavioral traits such as read-only nature, output format, or pagination. It only restates the basic function, providing no additional context beyond the name.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that efficiently conveys the tool's purpose with relevant examples. It is front-loaded and contains no unnecessary words, though it could be slightly expanded without losing conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite low complexity, the description omits any details about the output format, such as whether it returns a list of integration names or detailed objects. While adequate for a simple listing, it leaves the agent guessing about the response structure.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions (status, country, category). The description adds no extra meaning beyond what the schema already provides, earning the baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'List' and the resource 'integrations', with examples of categories that distinguish it from sibling list tools like 'list_articles' or 'list_features'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description does not mention any conditions, prerequisites, or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_plansAInspect
List HelloBooks pricing plans with monthly + annual prices in 8 regional currencies (USD, INR, CAD, GBP, AUD, AED, SGD, NZD). Covers three core tiers — Free, Pro, CPA/CA Partner — plus two per-entity stackable add-ons (Warehouse, Manufacturing). Returns AI credit allowance, feature bullets (AI auto-categorization, unlimited users, multi-entity, 3-way matching, API access, etc.), and the public signup URL. Filter by plan (one of free / pro / cpa) or country (ISO code). Pricing follows Doc 19 v2 (2026-05-08): Free-first + single Pro tier; the previous Business tier was merged into Pro.
| Name | Required | Description | Default |
|---|---|---|---|
| plan | No | Restrict the response to a single plan tier. | |
| country | No | ISO country code. Filters prices to one country. Omit to return all 8 markets. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description discloses key behaviors: returns AI credit allowance, feature bullets, signup URL, and references pricing version (Doc 19 v2) and a plan merge, providing rich behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the main purpose, then details, then filtering, then version info. It is somewhat verbose but well-organized with no redundant sentences.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description thoroughly explains return contents (plans, prices, currencies, features, URL) and filtering, making it complete for an agent to understand the tool's output.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema already describes both parameters with enums and descriptions, but the description adds context (filtering logic, country meaning, plan enum values) and version info, adding some value beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists pricing plans with details (prices, currencies, tiers, add-ons) and distinguishes itself from sibling analysis tools by focusing on plan information.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description indicates when to use (need pricing info, filter by plan/country) but does not explicitly exclude overlapping tools like list_credit_packs or list_features, though sibling names suggest they are distinct.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_tax_ratesAInspect
List statutory tax-rate slabs by jurisdiction — IN GST (5/12/18/28 + zero + exempt + composition trader/manufacturer/restaurant + compensation cess), UK VAT (20 / 5 / zero / exempt), AU GST (10 / GST-free), US sales-tax (state-administered summary, no federal rate), CA GST 5% + HST 13% ON / 15% Atlantic, SG GST 9%, NZ GST 15%, AE VAT 5%. Filter by country, taxType (GST/VAT/Sales-Tax/HST/IGST/CGST-SGST/TDS/TCS), or scheme (standard / reduced / zero / exempt / composition / cess / state-summary). Every entry carries an effective-from date and an authoritative source URL (CBIC, gov.uk, ATO, CRA, IRAS, IRD, FTA, Tax Foundation) — agents should confirm the rate against the source before quoting figures to a user. Use this when a user asks "what is the GST rate on X?", "what VAT band does Y fall into?", or "what are the composition slabs in India?". This is the public statutory reference — for an org-specific tax assignment use the authenticated books_classify_event tool.
| Name | Required | Description | Default |
|---|---|---|---|
| scheme | No | Filter by slab category — standard, reduced, zero, exempt, composition, cess. | |
| country | No | Filter to one jurisdiction. Omit to return every supported country. | |
| taxType | No | Filter by statutory tax type (GST, VAT, Sales-Tax, HST, etc.). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses that every entry has an effective-from date and authoritative source URL, and warns agents to confirm rates against the source before quoting. No destructive behavior implied.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is fairly long but front-loaded with core purpose, then detailed specifics. Every sentence adds value, but could be slightly more concise by grouping country examples more tightly. Still well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description explains what the output contains: slabs, rates, effective dates, source URLs. It covers all supported countries and tax types, making it complete for a list tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with enums, but description adds meaning by listing specific rates for countries (e.g., IN GST 5/12/18/28) and contextualizing what each filter achieves. It doesn't repeat schema but enriches understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists statutory tax-rate slabs by jurisdiction, using specific verb 'list' and resource 'tax-rate slabs'. It distinguishes from sibling tools like lookup_tax_rate and books_classify_event, ensuring the agent knows when to use this one.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly describes when to use (user asks about tax rates like 'what is the GST rate on X?') and when not to (org-specific tax assignment, directing to books_classify_event). Also advises confirming rates against provided source URLs.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_videosAInspect
List HelloBooks product videos curated on the marketing site (homepage demo + feature walkthroughs) and the official @hellobooksai YouTube channel link. Each video returns title, description, category, watch URL, embed URL and thumbnail. Filter by category (demo / features / overview), featuredOnly, or free-text query. Use this when a user asks for a demo, walkthrough or video. Note: this is the curated set, not a live mirror of every channel upload — the response includes the channel URL for the full catalog.
| Name | Required | Description | Default |
|---|---|---|---|
| query | No | Optional substring match against video title + description. | |
| category | No | Filter to one video category (demo, features, overview). | |
| featuredOnly | No | If true, only return videos flagged as featured on the marketing site. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. It discloses that the tool returns a curated set (not live mirror) and lists return fields. It implies read-only behavior but does not discuss authentication, rate limits, or pagination, which is acceptable for a simple list tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences, front-loading the purpose and return fields, followed by filters and usage note. It is efficient but not perfectly concise; some minor redundancy exists (e.g., 'curated' mentioned twice).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (3 optional params, no output schema), the description is complete: it covers purpose, return fields, filter options, and context (curated set, channel URL for full catalog). No obvious gaps remain.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description reinforces filter options by mentioning categories, featuredOnly, and query, but adds no new semantic details beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists HelloBooks product videos curated on the marketing site and YouTube channel. It specifies the resources and distinguishes from sibling tools like list_articles and list_features, which focus on other content types.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly advises when to use this tool: 'Use this when a user asks for a demo, walkthrough or video.' It also notes the limitation of being a curated set and directs to the channel URL for the full catalog, providing an alternative for broader searches.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
local_payment_methodsAInspect
List local bank-rail / wallet payment methods relevant to HelloBooks invoice collection (AR), B2B supplier payments (AP), and contractor payouts (UPI, RuPay, Razorpay, IMPS, NEFT, RTGS, BACS, FPS, CHAPS, Open Banking, Interac e-Transfer, EFT, PayID, PayTo, NPP, BPAY, ACH, Same Day ACH, Fedwire, RTP, Zelle, PayNow, FAST, GIRO, NZ Direct Credit, etc.). Returns rail (instant / same-day / next-day / multi-day), use-cases, issuing authority, HelloBooks support level, and operational notes (per-transaction caps, settlement windows, retirement timelines). Filter by country, useCase, rail, or id.
| Name | Required | Description | Default |
|---|---|---|---|
| id | No | Return a single payment method by id (e.g. "in-upi", "au-payid", "us-rtp"). | |
| rail | No | Filter by settlement rail (instant, same-day, next-day, multi-day). | |
| country | No | Filter to one country (IN, US, CA, GB, AU, AE, SG, NZ). | |
| useCase | No | Filter by payment use-case. Defaults to HelloBooks' invoice-collection + b2b-supplier + contractor-payout scope; pass an explicit value to widen. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden. It discloses that the tool is a listing operation with no side effects, and it details the return fields (rail, use-cases, issuing authority, support level, operational notes). It does not mention authentication or rate limits, but for a read-only list tool this is acceptable.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single dense paragraph but is well-structured with the main purpose front-loaded. It includes many examples and return fields, which is necessary for a diverse set of payment methods. Could be slightly more concise, but each sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite lacking an output schema, the description explicitly states what the tool returns (rail, use-cases, issuing authority, support level, operational notes). It covers the filtering options and default behavior, providing complete guidance for an agent to select and invoke the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All 4 parameters have descriptions in the JSON schema (100% coverage). The description adds value beyond the schema by clarifying the default behavior of the useCase filter and listing examples of payment methods. It provides meaningful context for parameter usage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists local payment methods relevant to HelloBooks' invoice collection, B2B supplier payments, and contractor payouts. It provides specific examples (UPI, RuPay, etc.) and defines the return fields. No sibling tools have similar functionality, so it's well-distinguished.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains the tool's purpose and the available filters (country, useCase, rail, id). It mentions a default scope for useCase, which guides typical usage. However, it does not explicitly state when not to use this tool or suggest alternatives, though sibling tools are unrelated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
lookup_tax_rateAInspect
Pick a single statutory tax-rate slab — either by exact id (e.g. IN-standard-18, GB-zero-0, CA-hst-13-on) for a deterministic lookup, or by country + free-text category (e.g. "office supplies", "restaurant", "exports", "domestic fuel") for a fuzzy best-match. Returns the matched rate, the match score, and the authoritative source URL. Use this when a user asks "what slab does X fall into in India?" or "what VAT rate applies to children's car seats?". For broader exploration (all slabs in a country / all rates of one scheme), use list_tax_rates. No customer data — public statutory reference only.
| Name | Required | Description | Default |
|---|---|---|---|
| id | No | Exact rate id, e.g. ``IN-standard-18`` or ``GB-zero-0``. When set, country/category are ignored. | |
| country | No | Country to search within. Required when ``id`` is not provided. | |
| category | No | Free-text query — "office supplies", "restaurant", "exports", "domestic fuel". |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fully carries the burden of behavioral disclosure. It explains the deterministic lookup for id, fuzzy matching for category, return values (rate, match score, source URL), and explicitly states 'No customer data — public statutory reference only.' This is comprehensive for a read-only lookup tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is reasonably concise but slightly verbose with examples; however, every sentence serves a purpose. It is front-loaded with the core action and then elaborates with modes and examples. Minor improvement could be made by tightening the example list, but overall well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (3 parameters, 1 enum, no output schema), the description is complete: it explains both usage modes, parameter semantics, return values, and differentiates from the sibling tool. The lack of output schema is compensated by stating what is returned (rate, match score, source URL).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Although schema coverage is 100%, the description adds significant meaning beyond the schema by explaining the two invocation modes, the precedence rule (id overrides country/category), and providing concrete examples for both id (e.g., 'GB-zero-0') and category (e.g., 'office supplies'). This aids the agent in constructing correct parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Pick a single statutory tax-rate slab' with two distinct modes (exact id or country+category). It distinguishes from the sibling tool list_tax_rates by specifying that this tool is for a single slab lookup, while the sibling is for broader exploration.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidance: use when a user asks 'what slab does X fall into in India?' or 'what VAT rate applies to children's car seats?', and explicitly advises against using for broader exploration, directing to list_tax_rates instead. This clearly delineates when and when not to use the tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.
Discussions
No comments yet. Be the first to start the discussion!