HelloBooks AI Agents MCP Server
Server Details
AI agents for bookkeeping, reconciliation, and financial close for SMBs.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.4/5 across 29 of 29 tools scored. Lowest: 3.4/5.
Each tool targets a very specific input (e.g., balance sheet CSV, QBO journal CSV) or query (e.g., tax rate lookup, competitor info). Even the multiple 'analyze_*' and 'list_*' tools have clearly differentiated purposes based on their detailed descriptions, making confusion unlikely.
Most tools follow a `verb_noun` pattern (e.g., `analyze_balance_sheet`, `compare_books_to_hellobooks`) or `noun_noun` (e.g., `compliance_capabilities`). However, `how_munimji_helps` deviates from the pattern, and `feature_search` is `noun_verb` while many info tools use `list_*`, a minor inconsistency.
With 29 tools, the surface is extensive but justified by the server's broad domain: financial analysis, compliance, product information, migration, and partner programs. While slightly high, the tools are modular and each addresses a distinct need without redundancy.
The tool set covers the full lifecycle of accounting analysis (balance sheet, P&L, trial balance, journal anomalies, cleanup), compliance (deadlines, rates, capabilities), product information (features, plans, integrations), migration estimation, and partner/practice management. There are no obvious gaps for an informational/analytical agent server.
Available Tools
29 tools
analyze_balance_sheet (A)
Take a Balance Sheet CSV export from QuickBooks Online, Xero, Zoho Books, or Wave (source auto-detected) and run three checks: (1) bs.equation_broken — the fundamental accounting equation Assets = Liabilities + Equity does not hold (every downstream ratio analysis is invalid until fixed); (2) bs.negative_asset — Cash / AR / Inventory line items with negative balances (reconciliation error signal); (3) bs.negative_equity — Total Equity < 0 (insolvency signal). Input is raw CSV text of a Balance Sheet (Reports → Balance Sheet in QBO / Xero / Zoho / Wave). Max 5,000 rows; max 5 MB. Returns flags with severity, totals (totalAssets, totalLiabilities, totalEquity, equationBalances boolean), and a shareable URL. Use this when a user pastes a Balance Sheet and asks "does my balance sheet balance?", "is the accounting equation satisfied?", or "is my company solvent on paper?". A Balance Sheet that fails Assets = Liabilities + Equity invalidates every downstream financial-ratio analysis — this is the single most important check for any BS.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a Balance Sheet report. Works with QuickBooks Online (Reports → Balance Sheet), Xero (Reports → Balance Sheet), Zoho Books (Reports → Balance Sheet), and Wave (Reports → Balance Sheet). Statement should include Total Assets, Total Liabilities, and Total Equity rows. Source is auto-detected from section name signatures. | |
| fileName | No | Optional filename for the share-page label. | |
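The three checks described above can be sketched roughly as follows. This is an illustrative sketch, not the server's implementation: the function name, the `asset_lines` argument, and the rounding tolerance are assumptions, and only the flag codes come from the tool description.

```python
from decimal import Decimal

# Hypothetical tolerance; the tool description does not state one.
TOLERANCE = Decimal("0.01")


def check_balance_sheet(total_assets, total_liabilities, total_equity, asset_lines):
    """Return flag codes for the three checks named in the description.

    asset_lines maps an asset line-item name (e.g. "Cash") to its balance.
    """
    flags = []
    # (1) Assets = Liabilities + Equity must hold within the tolerance.
    if abs(total_assets - (total_liabilities + total_equity)) > TOLERANCE:
        flags.append("bs.equation_broken")
    # (2) Cash / AR / Inventory style lines should not be negative.
    if any(balance < 0 for balance in asset_lines.values()):
        flags.append("bs.negative_asset")
    # (3) Total Equity below zero.
    if total_equity < 0:
        flags.append("bs.negative_equity")
    return flags
```

Parsing the CSV into totals and line items is left out here, since the source-specific layouts are not documented on this page.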
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Though no annotations exist, description fully discloses behavior: it's read-only, processes CSV, enforces limits (5k rows, 5MB), outputs flags with severity and totals, and creates a shareable URL. It explains consequences of each check (e.g., 'every downstream ratio analysis is invalid until fixed').
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is dense but well-structured: starts with core action, then enumerates checks, then input specs, then output, then usage context. It could be slightly shorter, but every sentence adds unique value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Without an output schema, description clearly explains return structure (flags, totals, shareable URL) and the criticality of the checks. It provides complete context for an agent to understand the tool's purpose and output.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers both parameters with descriptions, but description adds value by listing compatible sources (QBO, Xero, etc.) and specifying expected row names (Total Assets, etc.) beyond the schema's generic description. The baseline of 3 is exceeded due to the added context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool takes a Balance Sheet CSV from specific sources (QuickBooks, Xero, Zoho, Wave) and runs three named checks. It distinguishes from sibling tools like analyze_trial_balance by focusing on balance-sheet-specific accounting equation and solvency checks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use this when a user pastes a Balance Sheet and asks...' followed by example questions, giving clear when-to-use guidance. It also explains the importance of the equation check, helping agents prioritize.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
analyze_journal_variance (A)
Compare two periods of journal-entry data (QBO or Xero — source auto-detected from headers) and flag accounts whose movement deviates materially between periods. Aggregates lines per account into a net total for each period, then surfaces accounts where the period-over-period change crosses a materiality threshold (≥5% relative AND ≥$100 absolute; severity high at ≥50%, medium at ≥20%, low at ≥5%). Inputs are two CSV exports — periodACsv (earlier period) and periodBCsv (later period). Optional periodALabel / periodBLabel for human-readable flag messages (e.g. "Q1 FY2024" vs "Q2 FY2024"). Max 5,000 rows per period; max 5 MB each. Use this when a user pastes two periods and asks "what changed?", "show me variances", "what jumped period-over-period". Returns a flag list ordered by largest delta, a roll-up, and a shareable URL. Both periods must be the same source — mixing QBO + Xero in one call returns an error.
| Name | Required | Description | Default |
|---|---|---|---|
| periodACsv | Yes | Raw CSV text of the EARLIER period's journal-entry export (QBO Journal Entries or Xero Manual Journals). Source is auto-detected from the headers. | |
| periodBCsv | Yes | Raw CSV text of the LATER period's journal-entry export. Source is auto-detected from the headers (must match periodACsv). | |
| periodALabel | No | Optional human label for the earlier period — e.g. "Q1 FY2024". Used in flag messages. | |
| periodBLabel | No | Optional human label for the later period, e.g. "Q2 FY2024". | |
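The materiality rule in the description (at least 5% relative and at least $100 absolute, with severity bands at 50%, 20%, and 5%) can be sketched as below. The function name is hypothetical, and the handling of an account whose earlier-period total is zero is an assumption, since the description does not cover that case.

```python
def classify_variance(period_a_total, period_b_total):
    """Return 'high', 'medium', 'low', or None for one account's net totals."""
    abs_delta = abs(period_b_total - period_a_total)
    # Absolute floor: changes under $100 are never flagged.
    if abs_delta < 100:
        return None
    if period_a_total == 0:
        # Assumption: a move away from zero counts as an unbounded relative change.
        relative = float("inf")
    else:
        relative = abs_delta / abs(period_a_total)
    if relative >= 0.50:
        return "high"
    if relative >= 0.20:
        return "medium"
    if relative >= 0.05:
        return "low"
    return None
```

For example, an account moving from 1,000 to 1,250 is a 25% change of $250 and would be classified as medium under these thresholds.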
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses auto-detection of source, materiality thresholds with severity levels, row and size limits, and error conditions for mixing sources. This is highly transparent.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is information-rich but slightly long. It is front-loaded with the core purpose and then provides details. Every sentence adds value, though minor trimming could improve conciseness.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite lacking an output schema, the description explains the return format (flag list, roll-up, shareable URL). It covers all inputs, constraints, and error conditions, making it complete for a complex tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so each parameter already has a description. The tool description adds context about auto-detection, optional label usage, and constraints, adding value beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it compares two periods of journal-entry data, flags accounts with material changes, and specifies the inputs and outputs. It distinguishes itself from sibling tools by focusing on period-over-period variance analysis.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit use cases (e.g., 'what changed?', 'show me variances') and constraints (e.g., max rows, same source). It does not explicitly mention when NOT to use or alternative tools, but the context is clear enough.
analyze_profit_loss (A)
Take a Profit & Loss / Income Statement CSV export from QuickBooks Online, Xero, Zoho Books, or Wave (source auto-detected from section names) and run three checks: (1) pnl.subtotal_mismatch — each "Total Section" subtotal equals the sum of its preceding line items (catches missing or duplicated rows); (2) pnl.negative_expense — flags expense-section line items with negative amounts (usually sign-flips or refunds posted to the wrong side); (3) pnl.margin_red_flag — gross-profit margin < 5% or > 95%, or negative total revenue. Input is raw CSV text of a P&L report (Reports → Profit and Loss in QBO / Xero / Zoho / Wave). Max 5,000 rows; max 5 MB. Returns flags with severity, a summary with totalRevenue / totalCogs / grossProfit / grossMarginPct / netIncome (when detected), and a shareable URL at agents.hellobooks.ai/r/{slug}. Use this when a user pastes a P&L and asks "does my P&L look right?", "any sign errors?", "what is my gross margin?", or "anything suspicious in my income statement?". For period-over-period comparison use analyze_journal_variance with two periods of journal-entry data; this tool is single-period only.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a Profit & Loss / Income Statement report. Works with QuickBooks Online (Reports → Profit and Loss), Xero (Reports → Profit and Loss), Zoho Books (Reports → Profit & Loss), and Wave (Reports → Profit & Loss). Source is auto-detected from section names. Statement should include section headers, line items, "Total X" subtotals, and a Net Income / Net Profit row at the bottom. | |
| fileName | No | Optional filename for the share-page label. | |
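As a rough illustration of the three P&L checks, the sketch below works on already-parsed numbers. All function names, the input shapes, and the subtotal tolerance are assumptions. Treating zero revenue as "not flagged" is also an assumption, because the description only mentions negative revenue.

```python
def subtotal_mismatches(sections, tolerance=0.01):
    """sections maps a section name to (line_amounts, reported_total).

    Returns the names of sections whose reported total differs from the
    sum of their line items (pnl.subtotal_mismatch).
    """
    return [
        name
        for name, (line_amounts, reported_total) in sections.items()
        if abs(sum(line_amounts) - reported_total) > tolerance
    ]


def negative_expenses(expense_lines):
    """Return expense line names with negative amounts (pnl.negative_expense)."""
    return [name for name, amount in expense_lines.items() if amount < 0]


def margin_red_flag(total_revenue, total_cogs):
    """True when pnl.margin_red_flag would apply under the stated thresholds."""
    if total_revenue < 0:
        return True
    if total_revenue == 0:
        return False  # assumption: margin is undefined, so nothing is flagged
    gross_margin = (total_revenue - total_cogs) / total_revenue
    return gross_margin < 0.05 or gross_margin > 0.95
```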
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, so description fully carries the burden. It details the three checks, return values (flags, summary fields, shareable URL), constraints (max rows, size), and auto-detection of source. No contradictions.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph but front-loads the main purpose, checks, and examples. All sentences add value, though it could be slightly more structured for readability. No waste.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description compensates by listing output fields, constraints, and usage examples. It is fully self-contained and covers all necessary context for agent invocation.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, and the description adds extra context for both parameters (csvText: sources, structure; fileName: share-page label), providing value beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool analyzes P&L CSVs from specific sources and performs three named checks. It distinguishes from siblings by explicitly mentioning when to use a different tool for period-over-period comparison.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit use cases with example user queries and directs when not to use (period-over-period) with an alternative tool. Also notes it is single-period only.
analyze_qbo_journal_anomalies (A)
Scan a QuickBooks Online "Journal Entries" CSV export for anomalies — currently round-number lines (debit or credit amounts that are exact multiples of $1,000, above a $1,000 materiality threshold). Round numbers are statistically rare in real bookkeeping and frequently indicate estimates, plugs, or fraud signals worth review. Input is raw CSV text from QBO Reports → Accountant → Journal. Max 5,000 rows; max 5 MB. Returns flagged lines with severity ($100K+ high, $10K+ medium, else low) and a shareable URL. Use this when a user pastes QBO data and asks "any anomalies?", "look for round numbers", or "anything suspicious". Tier-0 subset — HelloBooks Phase 3.0 anomaly detection in the paid product additionally catches GL outliers vs entity history, vendor-history mismatches, archived-vendor activity, and AI-narrated suspicious lines (which require the live HelloBooks account).
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a QuickBooks Online "Journal Entries" report. Export from QBO: Reports → Accountant → Journal → Export as CSV. Paste the file contents directly. | |
| fileName | No | Optional original filename, used only as a label on the share page. | |
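The round-number rule and its severity bands can be sketched as follows. The function name is hypothetical. The description says "above a $1,000 materiality threshold" without stating whether exactly $1,000 is flagged, so this sketch assumes the threshold is inclusive.

```python
from decimal import Decimal


def round_number_severity(amount):
    """Return 'high', 'medium', 'low', or None for one debit or credit amount."""
    amount = abs(Decimal(amount))
    # Assumption: the $1,000 threshold is inclusive.
    if amount < 1000 or amount % 1000 != 0:
        return None
    if amount >= 100_000:
        return "high"
    if amount >= 10_000:
        return "medium"
    return "low"
```

Using `Decimal` rather than floats keeps the exact-multiple test reliable for amounts parsed from CSV text such as `"5000.00"`.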
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fully discloses behavior: it checks for round-number lines, applies a materiality threshold, returns flagged lines with severity levels and a shareable URL, and imposes limits of 5,000 rows and 5 MB. No behavioral traits are hidden.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph that efficiently conveys core information first (scanning for round numbers), then details. It is informative without being overly verbose, though it could be more structured (e.g., bullet points) for easier scanning.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple input (CSV text) and no output schema, the description covers all necessary context: what anomalies are detected, input constraints, output format (flagged lines with severity and URL), and usage scenarios. It also references the broader product context.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema descriptions already cover both parameters (csvText and fileName) with details on origin and constraints. The tool description adds value by clarifying the CSV source (QBO Reports → Accountant → Journal) and reiterating size limits, providing context beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool scans QBO Journal Entries CSV for round-number anomalies, specifying the exact criteria (multiples of $1,000, above $1,000 threshold). It distinguishes from siblings by noting this is a Tier-0 subset and mentioning the paid product's broader detection capabilities.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly tells when to use this tool: when a user asks about anomalies, round numbers, or suspicious entries. It also clarifies limitations by indicating that more advanced detection (GL outliers, vendor-history mismatches, etc.) requires the paid HelloBooks product, guiding the agent to alternative tools.
analyze_qbo_journal_cleanup (A)
Scan a QuickBooks Online "Journal Entries" CSV export for cleanup issues — unbalanced journals (debits ≠ credits, with severity by deviation), duplicate journals (same date + same totals, likely posted twice), and schema problems (invalid dates, malformed amounts, missing accounts, missing journal numbers). Input is the raw CSV content the user pastes after exporting from QBO via Reports → Accountant → Journal → Export. Max 5,000 rows; max 5 MB. Returns a structured flag list with severity (high/medium/low), a roll-up summary by category and severity, parse diagnostics (column mapping + unmapped columns), and a shareable URL at agents.hellobooks.ai/r/{slug} (7-day TTL) that renders a branded analysis page suitable for sending to a CA or bookkeeper. Use this when a user pastes QBO journal data, asks "check my books", "find issues in my QBO journal", or "what is wrong with my journal entries". Each flag includes a fixableInHellobooks boolean — true means HelloBooks can resolve it automatically in the paid product.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a QuickBooks Online "Journal Entries" report. Export from QBO: Reports → Accountant → Journal → Export as CSV. Paste the file contents directly. | |
| fileName | No | Optional original filename, used only as a label on the share page. | |
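Two of the cleanup checks, unbalanced journals and duplicate journals (same date plus same totals), can be sketched on pre-parsed lines as below. The function name and the tuple layout are assumptions. Schema validation and severity grading are omitted because the description does not give their rules.

```python
from collections import defaultdict
from decimal import Decimal


def find_cleanup_flags(lines):
    """lines is an iterable of (journal_no, date, debit, credit) tuples.

    Returns (unbalanced, duplicates): journal numbers whose debits differ
    from credits, and groups of journal numbers sharing date and totals.
    """
    totals = defaultdict(lambda: [Decimal(0), Decimal(0)])
    dates = {}
    for journal_no, date, debit, credit in lines:
        totals[journal_no][0] += debit
        totals[journal_no][1] += credit
        dates.setdefault(journal_no, date)

    unbalanced = [no for no, (dr, cr) in totals.items() if dr != cr]

    # Group journals by (date, total debits, total credits).
    by_signature = defaultdict(list)
    for no, (dr, cr) in totals.items():
        by_signature[(dates[no], dr, cr)].append(no)
    duplicates = [nos for nos in by_signature.values() if len(nos) > 1]
    return unbalanced, duplicates
```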
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses behavior: input format, source path, size limits, output structure (flags, roll-up, diagnostics, shareable URL with TTL, fixable boolean), and relation to paid product. No hidden traits.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is somewhat lengthy but each sentence serves a purpose: action, specifics, usage hints. It is well-structured with clear sections. Could be slightly tighter but no wasted words.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description thoroughly explains all output components, input source, constraints, and shareable URL details. For a complex analysis tool, it provides a complete picture sufficient for an agent to invoke and use the result.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameters. Description adds value by clarifying csvText provenance (specific QBO export path) and fileName role as a label. This goes beyond the schema's descriptions.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it scans a QBO journal CSV for three specific cleanup issues (unbalanced, duplicates, schema problems). It identifies the exact input and output, and distinguishes itself from siblings like analyze_qbo_journal_anomalies by focusing on 'cleanup' rather than general anomaly detection.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: when a user pastes QBO journal data or asks specific questions like 'check my books'. Also provides constraints (max rows, size) and export path. However, it doesn't explicitly mention when not to use or compare to alternatives like analyze_xero_journal_cleanup.
analyze_trial_balance (A)
Take a Trial Balance CSV export from QuickBooks Online, Xero, Zoho Books, or Wave (source auto-detected from headers — YTD columns indicate Xero, Opening Balance indicates Zoho, etc.) and run three checks: (1) tb.unbalanced — debits ≠ credits (every downstream P&L / BS / cash-flow report built from this TB is wrong until fixed); (2) tb.wrong_sign — accounts whose name suggests a class (Revenue / COGS / Expense / AR / AP) carrying a balance on the wrong side (classic posting-error signal); (3) tb.round_balance — exact-multiple-of-$10,000 balances (plug-entry signal). Input is raw CSV text of a Trial Balance report. Max 5,000 rows; max 5 MB. Returns flagged accounts with severity, a roll-up showing whether the TB balances, parse diagnostics, and a shareable URL at agents.hellobooks.ai/r/{slug}. Use this when a user pastes a Trial Balance and asks "does my TB balance?", "are there sign errors?", "what looks suspicious?", or "is this TB clean?". The Trial Balance is the foundation document for every other financial statement — if it does not balance, every downstream report is invalid.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a Trial Balance report. Works with QuickBooks Online (Reports → Trial Balance), Xero (Reports → Trial Balance), Zoho Books (Reports → Accountant → Trial Balance), and Wave (Reports → Trial Balance). Source is auto-detected from column headers. | |
| fileName | No | Optional filename for the share-page label. | |
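The `tb.unbalanced` and `tb.round_balance` checks can be sketched as below, assuming the Trial Balance has already been parsed into account rows. The function name and row layout are hypothetical. `tb.wrong_sign` is omitted because the description does not say how account names are mapped to classes.

```python
from decimal import Decimal


def check_trial_balance(accounts):
    """accounts is a list of (name, debit, credit) rows.

    Returns a list of (flag_code, account_name) pairs; the account name is
    None for the statement-level tb.unbalanced flag.
    """
    flags = []
    total_debits = sum(debit for _, debit, _ in accounts)
    total_credits = sum(credit for _, _, credit in accounts)
    if total_debits != total_credits:
        flags.append(("tb.unbalanced", None))
    for name, debit, credit in accounts:
        balance = debit - credit
        # Exact non-zero multiples of $10,000 are treated as plug-entry signals.
        if balance != 0 and balance % 10_000 == 0:
            flags.append(("tb.round_balance", name))
    return flags
```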
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fully explains the three checks, row/size limits, and return values (flagged accounts, severity, roll-up, diagnostics, shareable URL). It adds behavioral context beyond what structured fields provide.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single dense paragraph that conveys all key information without unnecessary words. It could benefit from bullet points for readability, but it remains concise given the complexity.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema, the description fully specifies return value components (flagged accounts, severity, roll-up, diagnostics, shareable URL) and constraints (max rows, size). It provides complete context for an agent to select and invoke the tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%. The description adds value by explaining auto-detection of source from headers and the purpose of the optional fileName parameter, going beyond the schema's basic descriptions.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool analyzes Trial Balance CSVs from QuickBooks, Xero, Zoho, or Wave, and specifies three distinct checks (unbalanced, wrong sign, round balance). It distinguishes itself from siblings like analyze_balance_sheet which target different financial statements.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly lists when to use this tool (e.g., when a user asks 'does my TB balance?') and emphasizes the foundational importance of the Trial Balance. It does not explicitly mention when not to use, but the context is clear enough.
analyze_xero_journal_anomalies (A)
Scan a Xero "Manual Journals" CSV export for anomalies — currently round-number lines (debit or credit amounts that are exact multiples of $1,000, above a $1,000 materiality threshold). Input is raw CSV text from Xero Accounting → Advanced → Manual Journals → Export. Max 5,000 rows; max 5 MB. Returns flagged lines with severity ($100K+ high, $10K+ medium, else low) and a shareable URL. Use this when a user pastes Xero data and asks "any anomalies?", "look for round numbers", or "anything suspicious". Same Tier-0 / paid-product split as the QBO variant — history-aware anomaly checks (GL outliers, vendor history, archived-vendor activity, LLM-narrated suspicious) live in the authenticated MCP / paid product.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a Xero "Manual Journals" report. Export from Xero: Accounting → Advanced → Manual Journals → Export. Paste the file contents directly. | |
| fileName | No | Optional original filename, used only as a label on the share page. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses max rows (5,000), max file size (5 MB), output format (flagged lines with severity and shareable URL), and scope (only round-number anomalies). Could be more explicit about being read-only and safe, but overall sufficient.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is well-structured: first sentence defines purpose, then details on anomalies, limits, output, usage context, and comparison with paid product. Some tangential detail about paid product features could be trimmed, but overall efficient.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and simple input (2 params), the description covers all needed aspects: purpose, input source, limitations, output summary, and usage triggers. Sets expectations about what it does not cover (advanced checks). Complete for its complexity.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear parameter descriptions. The tool description reinforces the schema but adds minimal new meaning (e.g., confirming the export path). Baseline 3 is appropriate.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool scans Xero Manual Journals CSV for round-number anomalies above $1,000. Differentiates from siblings by mentioning Xero-specific scope and a paid product for advanced checks, and its name contrasts with the QBO variant.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: when user pastes Xero data and asks about anomalies, round numbers, or suspicious items. Mentions that advanced checks are in the paid product, implying when not to rely on this tool. However, does not differentiate from sibling tools like analyze_xero_journal_cleanup.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
analyze_xero_journal_cleanup (A)
Scan a Xero "Manual Journals" CSV export for cleanup issues — unbalanced journals, duplicate journals (same date + same totals), and schema problems (invalid dates, malformed amounts, missing account code/name, missing group key). Input is the raw CSV content the user pastes after exporting from Xero via Accounting → Advanced → Manual Journals → Export. Xero-specific idioms handled: signed Amount column (positive = credit, negative = debit), explicit Debit/Credit fallback shape, Reference-or-Narration+Date grouping, account code preferred over name. Max 5,000 rows; max 5 MB. Returns structured flags with severity, a roll-up summary, parse diagnostics, and a shareable URL at agents.hellobooks.ai/r/{slug}. Use this when a user pastes Xero manual-journal data, asks "check my Xero books", or "find issues in my Xero journal". The funnel CTA routes to /migrate/from-xero for users who want to fix at scale.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a Xero "Manual Journals" report. Export from Xero: Accounting → Advanced → Manual Journals → Export. Paste the file contents directly. | |
| fileName | No | Optional original filename, used only as a label on the share page. | |
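
The unbalanced-journal check described above can be illustrated with a minimal sketch. This is not the server's implementation: the column names (`Date`, `Reference`, `Narration`, `Amount`) and the helper `find_unbalanced_journals` are assumptions made for illustration. It relies only on the idioms the description states: a signed Amount column (positive = credit, negative = debit) and grouping by Reference-or-Narration plus Date, so a balanced journal nets to zero.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal, InvalidOperation


def find_unbalanced_journals(csv_text: str) -> list[dict]:
    """Flag journal groups whose signed amounts do not net to zero.

    Hypothetical helper: column names are assumed, not taken from Xero docs.
    """
    totals: dict[tuple[str, str], Decimal] = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Group key: Reference if present, otherwise Narration, plus Date.
        group = (row.get("Reference") or row.get("Narration") or "").strip()
        date = (row.get("Date") or "").strip()
        raw = (row.get("Amount") or "").replace(",", "").strip()
        try:
            amount = Decimal(raw)
        except InvalidOperation:
            continue  # a malformed amount would be a separate schema flag
        totals[(group, date)] += amount
    return [
        {"reference": ref, "date": date, "net": net}
        for (ref, date), net in totals.items()
        if net != 0
    ]


# Made-up sample data: MJ-1 balances, MJ-2 is off by 20.00.
SAMPLE = (
    "Date,Reference,Narration,AccountCode,Amount\n"
    "2024-01-31,MJ-1,Accrual,200,500.00\n"
    "2024-01-31,MJ-1,Accrual,610,-500.00\n"
    "2024-02-15,MJ-2,Reclass,400,120.00\n"
    "2024-02-15,MJ-2,Reclass,620,-100.00\n"
)
flags = find_unbalanced_journals(SAMPLE)
```

`Decimal` is used instead of `float` so that currency amounts sum exactly. With the sample above, only the `MJ-2` group is flagged.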
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses constraints (5k rows, 5MB), Xero-specific handling (signed Amount column, account code preference), and return structure (flags, summary, diagnostics, shareable URL). Highly transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Reasonably concise given the complexity. Front-loads purpose, then provides necessary details. Every sentence contributes; no fluff. Slightly long but justified by content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Thoroughly covers input constraints, processing specifics, output details (flags, summary, diagnostics, shareable URL), and usage context. No output schema, but description sufficiently explains return values.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for both parameters. Description adds significant value: specifies that input is raw CSV from exact Xero export path, Xero idioms, and constraints, going beyond schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it scans a Xero Manual Journals CSV for cleanup issues, listing specific issues like unbalanced journals, duplicates, and schema problems. It distinguishes from siblings by detailing Xero-specific idioms and exact export path.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says when to use: when user pastes Xero data or asks 'check my Xero books'. Provides context of export path. Could be improved with explicit exclusions, but sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
compare_books_to_hellobooks (A)
Take a QBO or Xero journal-entry CSV (source auto-detected), run the full Tier-0 detection set (imbalance + duplicates + round-number + schema), and return a structured side-by-side comparison — "your books have X issues; here is how HelloBooks resolves each phase". This is the direct funnel tool: the response includes per-category counts mapped to HelloBooks Phases 1, 2, 3.0, 3.1, with exclusive-advantage bullets (command-center dashboard, conversational interface, one-prompt JE posting, cross-phase orchestration, auto ID resolution). Use this when a user is evaluating HelloBooks vs their current QBO/Xero, asks "should I migrate?", or pastes data while comparing accounting software. Output is suitable for the host LLM to narrate as a positioning argument; the share URL points at a branded landing page with the issue breakdown and a 1-click migrate CTA.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a journal-entry export from QBO ("Journal Entries") or Xero ("Manual Journals"). Source is auto-detected from the headers. | |
| fileName | No | Optional filename label. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses that the tool auto-detects source, runs detection sets (imbalance, duplicates, round-number, schema), and returns structured output with phases and advantage bullets. It does not mention any destructive actions or side effects, implying read-only analysis. The description adds behavioral context beyond schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description runs to four sentences and is front-loaded with the core action. It includes necessary details about the detection set, output structure, and use cases. Every sentence adds value, though it could be slightly more concise. The structure is logical and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, so the description must compensate. It adequately describes the output as a structured side-by-side comparison with per-category counts mapped to HelloBooks phases and a share URL. Given the complexity and sibling tools, this is sufficient for an agent to understand what the tool returns.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description mentions CSV text and auto-detection, but does not add new parameter semantics beyond what the schema already provides. The schema already describes csvText and fileName adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description specifies the tool takes a QBO or Xero journal-entry CSV, runs a detection set, and returns a side-by-side comparison with per-category counts mapped to HelloBooks phases. It clearly differentiates from sibling tools like analyze_qbo_journal_anomalies by positioning itself as a direct funnel tool for comparison and migration evaluation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description states explicit use cases: when a user is evaluating HelloBooks vs current QBO/Xero, asks about migration, or pastes data while comparing software. It does not explicitly mention when not to use, but the context is clear. The sibling tools are analytical, while this is comparative, so guidance is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
compliance_capabilities (A)
Return supported compliance frameworks for a country (BAS, STP, GST, MTD, 1099, etc.) with version and certification info.
| Name | Required | Description | Default |
|---|---|---|---|
| country | Yes | Required ISO country code. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose behavioral traits. It reveals the output contains frameworks with version and certification info, but does not state if the operation is read-only, requires auth, or has any side effects. For a simple query tool, the description is adequate but not fully transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence with zero wasted words. Efficient and to the point.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with one required parameter and no output schema, the description adequately conveys the return content (supported compliance frameworks with version and certification info). It does not specify the return format (e.g., array) but is sufficient for the tool's simplicity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema covers 100% of the single parameter 'country' with a clear enum and description. The description only reiterates 'for a country', adding no new meaning beyond what the schema provides. Baseline 3 due to high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'return' and the resource 'supported compliance frameworks for a country', with examples (BAS, STP, etc.) and output scope (version and certification info). It is easily distinguishable from sibling tools like compliance_deadlines or lookup_tax_rate.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies when to use this tool (to check compliance frameworks for a country) but provides no explicit guidance on when not to use it or alternatives among siblings. Usage is implied from the purpose.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
compliance_deadlines (A)
When statutory returns and payroll filings are due, per country. Covers IN (GSTR-1/3B/9/9C, CMP-08, Form 24Q, Form 16, PF ECR, ESI), AU (BAS, STP, Super Guarantee), GB (VAT MTD, RTI, Self Assessment), US (1099-NEC/MISC, W-2, Form 941/940), CA (T4, GST/HST). Optional country, frequency, and form filters. Note: dates rotate annually — every response carries a disclaimer with the per-deadline source URL for authority confirmation.
| Name | Required | Description | Default |
|---|---|---|---|
| form | No | Substring match against form name, e.g. "GSTR-3B", "BAS", "1099", "T4". Case-insensitive. | |
| country | No | ISO country code. Filters to one country (IN, AU, GB, US, CA covered today). | |
| frequency | No | Filing cadence. Useful for "what are my monthly returns" style queries. | |
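
The `form` filter is documented as a case-insensitive substring match against the form name. A minimal sketch of that matching rule follows. The helper names and the in-memory list are assumptions for illustration; the form names themselves come from the description above, and the server's real matching logic is not shown on this page.

```python
def matches_form(form_name: str, query: str) -> bool:
    """Case-insensitive substring match, as the `form` parameter describes."""
    return query.casefold() in form_name.casefold()


def filter_forms(forms: list[str], query: str) -> list[str]:
    """Keep only the form names that contain the query text."""
    return [name for name in forms if matches_form(name, query)]


# Form names taken from the tool description; the list itself is illustrative.
FORMS = ["GSTR-1", "GSTR-3B", "GSTR-9", "BAS", "1099-NEC", "1099-MISC", "T4"]

gstr_forms = filter_forms(FORMS, "gstr")   # lower-case query still matches
nec_misc = filter_forms(FORMS, "1099")     # one query can match several forms
```

Because the match is a substring match, a short query such as "1099" returns every variant of that form rather than a single entry, which is worth keeping in mind when an agent needs one specific deadline.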
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It explains that dates rotate annually and every response includes a disclaimer with a source URL for authority confirmation, which provides important behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (4 sentences) and front-loads the purpose. It efficiently conveys scope and constraints without excessive detail.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains that it returns deadlines and includes a disclaimer with source URLs. It covers key usage context, though return format could be clearer.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds value by giving examples of form names (e.g., 'GSTR-3B', 'BAS') and specifying substring match and case-insensitivity, which extends beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool provides due dates for statutory returns and payroll filings across multiple countries. It lists specific forms for each country, distinguishing it from sibling tools like compliance_capabilities or country_support.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description indicates the tool is for finding deadlines and includes optional filters (country, frequency, form). However, it does not explicitly mention when to use this tool over alternatives or any exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
country_support (A)
Return features available per supported country (AU, IN, UK, US, CA, AE, SG, NZ).
| Name | Required | Description | Default |
|---|---|---|---|
| country | No | Single ISO country code. Omit for the full matrix. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully carries behavioral disclosure. It clearly states the action (return features), scope (per supported country), and the special behavior of omitting the country parameter for the full matrix. No contradictory or missing key behaviors are evident.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, zero wasted words, immediate front-loading of purpose. No unnecessary elaboration or duplication of schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple one-parameter tool with no output schema or nested objects, the description covers the core behavior and parameter usage. It does not specify the return format, but the simplicity makes it adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%; the schema already documents parameter 'country' with enum and description. The description adds the list of countries and the omit-for-matrix hint, which adds some context but does not significantly extend meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description specifies the verb 'return' and the resource 'features available per supported country', listing the exact countries (AU, IN, UK, US, CA, AE, SG, NZ). This clearly distinguishes it from sibling tools like 'feature_search' or 'list_features'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage (for retrieving features by country or full matrix), but does not explicitly state when to use this tool over alternatives, nor does it mention exclusions or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
estimate_migration_effort (A)
Take a QBO or Xero journal-entry CSV (source auto-detected) and return a structured migration-effort estimate — row counts, unique-account count, period span, complexity classification (low / medium / high), human-hours estimate, assisted-hours estimate, and an indicative price quote in USD. Heuristic-based — refined against the live entity once the user signs up. Accepts larger files than the other analytical tools (up to 50,000 rows / 20 MB) because no detection runs here, just sizing. Use this when a user is weighing the cost of moving books to HelloBooks, pastes data and asks "how long will migration take?", "what would this cost?", or "is it worth migrating?". The funnel CTA points at /migrate/?ref= to start the assisted flow with the parsed sizing pre-populated.
| Name | Required | Description | Default |
|---|---|---|---|
| csvText | Yes | Raw CSV text of a journal-entry export from QBO ("Journal Entries") or Xero ("Manual Journals"). Source is auto-detected from headers. | |
| fileName | No | | |
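
Since the size limits differ between tools (5,000 rows / 5 MB for analyze_xero_journal_cleanup, 50,000 rows / 20 MB here), a client could check a pasted CSV before choosing which tool to call. The sketch below is one way to do that. It assumes that 1 MB means 1024 × 1024 bytes and that the header row does not count toward the row limit; neither detail is stated in the descriptions, and the `fits_limits` helper is hypothetical.

```python
import csv
import io

# (max data rows, max size in MB), taken from the tool descriptions.
LIMITS = {
    "analyze_xero_journal_cleanup": (5_000, 5),
    "estimate_migration_effort": (50_000, 20),
}


def fits_limits(tool: str, csv_text: str) -> bool:
    """Return True if the CSV is within the documented limits for `tool`."""
    max_rows, max_mb = LIMITS[tool]
    # Assumption: MB is 1024 * 1024 bytes, measured on the UTF-8 encoding.
    size_ok = len(csv_text.encode("utf-8")) <= max_mb * 1024 * 1024
    # Assumption: the header row is not counted against the row limit.
    total_rows = sum(1 for _ in csv.reader(io.StringIO(csv_text)))
    data_rows = max(total_rows - 1, 0)
    return size_ok and data_rows <= max_rows


# A synthetic 6,000-row file: too large for cleanup, fine for sizing.
big_csv = "Date,Amount\n" + "2024-01-01,1\n" * 6_000
cleanup_ok = fits_limits("analyze_xero_journal_cleanup", big_csv)
estimate_ok = fits_limits("estimate_migration_effort", big_csv)
```

The server remains the authority on what it accepts; a check like this only avoids an obviously oversized request.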
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses behavioral traits: heuristic-based, refined against live entity after signup, accepts up to 50k rows/20MB because no detection runs, and that it only performs sizing. This gives the agent clear expectations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph that efficiently covers purpose, inputs, outputs, usage guidelines, and behavioral traits. It is front-loaded with essential info. While slightly long, every sentence adds value and there is no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and two parameters, the description fully explains the return values (row counts, account count, complexity, hours, price) and the input requirements (file size, source auto-detection). It also mentions the heuristic and later refinement, providing complete context for the agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 50% (csvText has description, fileName does not). The description adds meaning: it explains the source of csvText (QBO 'Journal Entries' or Xero 'Manual Journals'), auto-detection from headers, and file size limits. This compensates well for the schema gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool takes a QBO or Xero journal-entry CSV and returns a structured migration-effort estimate with specific outputs (row counts, complexity, hours, price). It distinguishes from sibling analytical tools by noting it accepts larger files and performs sizing, not analysis.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage scenarios: when a user asks about migration cost/time, or when evaluating cost of moving books to HelloBooks. It implies when not to use (for other analytical tasks) and offers context for the funnel CTA. However, it does not directly name an alternative tool for other tasks.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
feature_search (A)
Free-text search across the marketing feature catalog, plan features, integrations, country features, compliance frameworks, competitor positioning, statutory deadlines, local payment methods, and published articles on hellobooks.ai. Queries like "vs Xero", "QuickBooks alternative", "GSTR-3B due", "UPI invoice", "1099 article", or "agentic accounting" surface the matching entry near the top.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results to return (default 20). | |
| query | Yes | Free-text query, e.g. "BAS lodgement", "multi-currency", "vs QuickBooks", "GSTR-3B due", or "UPI invoice cap". | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the behavioral disclosure burden. It mentions the scope and relevance ranking but does not clarify read-only nature, rate limits, or result structure. More detail on return format would improve transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph with no wasted words, front-loading the main action. It could be slightly better structured with sentence separations, but overall it's concise and informative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's broad scope and no output schema, the description should explain what the response looks like. It mentions 'matching entry near the top' but not the format of results (e.g., snippets, titles, URLs). This is a significant gap for a search tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description adds value by providing example query strings and enumerating content types, going beyond the schema's brief parameter descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly defines the tool's purpose: free-text search across a marketing feature catalog with specific content types. It uses specific verbs like 'search' and lists many resources, making it distinct from sibling tools that are more focused.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides examples of queries and mentions result ordering but does not explicitly state when to use this tool vs alternative list tools (e.g., list_features, list_articles) for targeted queries. No when-not or alternative guidance is given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
free_tier_eligibility (A)
Return the HelloBooks Free-plan annual-invoice-turnover thresholds for the 8 supported countries (IN ₹40 lakh / US $100K / GB £90K / AU A$75K / CA C$30K / NZ NZ$60K / SG S$500K / AE AED 187.5K). Free is unlimited features and unlimited AI credits subject to the monthly allowance, but per-entity invoice turnover above the country cap forces an upgrade to Pro or Business. Call with no args to get the full table, with country for one threshold, or with country AND annualInvoiceRevenue (in the country currency, NOT USD-equivalent) for a freeEligible verdict with headroom math. Bank-feed total, cash receipts, and gross transaction volume are explicitly NOT used.
| Name | Required | Description | Default |
|---|---|---|---|
| country | No | ISO country code of the entity. Omit to return the full 8-country threshold table. | |
| annualInvoiceRevenue | No | Annual invoice turnover for the entity, in its home currency (NOT USD-equivalent). Only used when `country` is also provided — the tool then returns a `freeEligible` verdict. Bank-feed total and cash receipts are explicitly NOT used by the gate. | |
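
The headroom math mentioned in the description can be sketched as follows. The thresholds are the ones listed above (₹40 lakh is 4,000,000), expressed in each country's own currency. The function name, the result keys other than `freeEligible`, and the treatment of turnover exactly at the cap as eligible are assumptions; the server's actual response shape is not documented on this page.

```python
# Free-plan annual invoice turnover caps, in home currency, from the description.
THRESHOLDS = {
    "IN": 4_000_000,  # ₹40 lakh
    "US": 100_000,
    "GB": 90_000,
    "AU": 75_000,
    "CA": 30_000,
    "NZ": 60_000,
    "SG": 500_000,
    "AE": 187_500,
}


def check_free_eligibility(country: str, annual_invoice_revenue: float) -> dict:
    """Compare home-currency invoice turnover against the country cap."""
    threshold = THRESHOLDS[country]
    return {
        "country": country,
        "threshold": threshold,
        # Assumption: only turnover strictly above the cap forces an upgrade.
        "freeEligible": annual_invoice_revenue <= threshold,
        # Positive headroom = room left under the cap; negative = amount over.
        "headroom": threshold - annual_invoice_revenue,
    }


under_cap = check_free_eligibility("GB", 72_000)   # £72K against a £90K cap
over_cap = check_free_eligibility("CA", 45_000)    # C$45K against a C$30K cap
```

Note that the inputs are never converted to USD: a Canadian entity is compared against C$30K directly, as the parameter description requires.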
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fully informs the agent that the tool is a read-only query returning thresholds or eligibility verdicts with math. It details the three behavioral modes and explicitly states what is not considered, ensuring no hidden side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose, followed by detailed parameter guidance. Every sentence conveys essential information without redundancy. It is appropriately sized for the complexity of the tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of an output schema, the description adequately explains the return values (thresholds table or eligibility verdict with headroom math). It covers all three call scenarios and explicitly states what is not used, leaving no gaps in understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already covers both parameters with descriptions, but the description adds critical semantics: currency unit (home currency not USD-equivalent), the logic for the eligibility verdict, and the explicit exclusion of other financial metrics. This adds significant value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns HelloBooks Free-plan annual-invoice-turnover thresholds for 8 countries. It specifies three distinct call patterns (no args, country only, country plus revenue) and explicitly distinguishes from unrelated concepts like bank-feed total. Sibling tools are all different financial analysis tools, so this tool's purpose is unique and well-defined.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives explicit instructions on how to call the tool with different parameters, including the condition that revenue must be in home currency. It also clarifies what inputs are not used (bank-feed total etc.). However, it does not explicitly compare to sibling tools or advise when to use alternatives, though the tool is self-contained and the context makes it clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
how_munimji_helps (A)
Explain how HelloBooks and Munimji (the in-app AI assistant) help a specific business — given a free-text description of the user's own operations. Returns a curated capability knowledge base: business-operation areas (sales, purchases, banking, tax, reports, inventory, payroll, multi-entity, setup), and for each AI capability WHO does the work — autonomous (Munimji does it on its own, e.g. OCR extraction, running reports), approval (Munimji prepares the entry and you one-click approve before it posts to the ledger, e.g. AI categorization, find-and-match, creating invoices/bills by chat), assist (co-pilot, e.g. guided onboarding, voice), or manual (a software feature you run yourself). Each capability links to the backing software features. Use this when a user describes their business and asks "how can HelloBooks help me?", "what can the AI do for my shop/practice/agency?", or "what can Munimji do on its own vs what do I approve?". Pass their description in businessDescription; optionally filter by area or autonomy. The AI never posts to a ledger without approval. For the full software catalog call list_features; for pricing call list_plans.
| Name | Required | Description | Default |
|---|---|---|---|
| area | No | Optional. Narrow to one business-operation area. | |
| autonomy | No | Optional. Filter Munimji capabilities by who does the work: `autonomous` (Munimji does it alone), `approval` (Munimji prepares, you approve before it posts), `assist` (co-pilot), `manual` (software feature you run yourself). | |
| businessDescription | No | Optional. The user describes their business and operations in their own words (industry, what they sell, how they get paid, who they pay, tax regime, pain points). It is echoed back as context; the calling assistant maps it to the returned areas + capabilities. No keyword scoring is done here; the LLM does the matching. | |
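
As a rough illustration of the parameters above, the sketch below builds an MCP `tools/call` request for this tool. The JSON-RPC envelope follows the MCP convention for tool calls; the helper name `build_how_munimji_helps_call`, the request id, and the sample business description are assumptions made for illustration and are not part of the server.

```python
import json

# Autonomy values documented in the parameter table above.
AUTONOMY_LEVELS = {"autonomous", "approval", "assist", "manual"}


def build_how_munimji_helps_call(business_description=None, area=None,
                                 autonomy=None, request_id=1):
    """Build a JSON-RPC `tools/call` payload (hypothetical helper)."""
    if autonomy is not None and autonomy not in AUTONOMY_LEVELS:
        raise ValueError(f"unknown autonomy level: {autonomy!r}")
    # All three parameters are optional, so only send the ones provided.
    arguments = {}
    if business_description is not None:
        arguments["businessDescription"] = business_description
    if area is not None:
        arguments["area"] = area
    if autonomy is not None:
        arguments["autonomy"] = autonomy
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "how_munimji_helps", "arguments": arguments},
    }


# Made-up sample input: ask only for capabilities that need user approval.
payload = build_how_munimji_helps_call(
    business_description="Two-person design agency, invoices clients monthly",
    autonomy="approval",
)
print(json.dumps(payload, indent=2))
```

Omitting unset parameters, rather than sending nulls, keeps the request aligned with a schema in which every field is optional.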
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fully discloses behavioral traits: it returns a knowledge base with specific structure, states that the AI never posts to ledger without approval, and explains how the business description is echoed back and mapped by the LLM. This gives the agent full understanding of the tool's operation and safety.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Although lengthy, the description is efficient and well-structured: it starts with the core purpose, then explains the return format, usage scenarios, parameter details, and alternative tools. Every sentence adds necessary value, and the information is front-loaded for quick scanning.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description fully covers what the tool does, how to use each parameter, the structure of the return, and related tools. There is no output schema, but the description compensates by detailing the response shape. Given the complexity (3 parameters, multi-field return), it is thoroughly complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, but the description adds significant meaning: for 'area' it explains the purpose of filtering, for 'autonomy' it defines each enum value with detailed examples, and for 'businessDescription' it clarifies that it is echoed back for context with no keyword scoring. This rich context goes well beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: it explains how HelloBooks and Munimji help a specific business by returning a curated capability knowledge base broken down by operational areas and autonomy levels. It explicitly lists example user queries to trigger the tool, making the purpose unmistakable and distinct from sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives explicit when-to-use guidance with example queries like 'how can HelloBooks help me?' and also states when not to use it: for full software catalog call list_features, for pricing call list_plans. This provides clear direction on alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_articles (grade A)
List published articles on hellobooks.ai — head-to-head compare pages and curated flagship blog posts. Filter by country, tag or free-text query. Use this when a user asks "do you have a blog/article about X?".
| Name | Required | Description | Default |
|---|---|---|---|
| tag | No | Single tag to filter on (case-insensitive substring match against the article tag list). e.g. "gst", "1099", "tally". | |
| limit | No | Max articles to return (default 20). | |
| query | No | Free-text query — substring-matched against the title, excerpt and tags of each article. e.g. "QuickBooks alternative" or "audit trail". | |
| country | No | ISO country code or "global". Returns articles whose countryRelevance matches OR is "global". Omit to return everything. | |
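
The filter semantics in the table above can be sketched client-side as follows. This is only an illustration of the documented matching rules, not the server's implementation: the function name `filter_articles`, the sample records, the assumption that `countryRelevance` is a single string, and the case-insensitive handling of `query` are all assumptions.

```python
def filter_articles(articles, tag=None, query=None, country=None, limit=20):
    """Client-side sketch of the documented filter semantics."""
    results = []
    for article in articles:
        tags = [t.lower() for t in article.get("tags", [])]
        # tag: case-insensitive substring match against the tag list.
        if tag is not None and not any(tag.lower() in t for t in tags):
            continue
        # query: substring match against title, excerpt and tags
        # (treating it as case-insensitive is an assumption).
        if query is not None:
            fields = [article.get("title", "").lower(),
                      article.get("excerpt", "").lower()] + tags
            if not any(query.lower() in field for field in fields):
                continue
        # country: keep exact matches plus anything marked "global".
        if country is not None:
            if article.get("countryRelevance") not in (country, "global"):
                continue
        results.append(article)
    # limit: the table documents a default of 20.
    return results[:limit]


# Made-up sample records, for illustration only.
articles = [
    {"title": "HelloBooks vs QuickBooks", "excerpt": "A QuickBooks alternative",
     "tags": ["compare", "quickbooks"], "countryRelevance": "global"},
    {"title": "GST filing basics", "excerpt": "Monthly returns",
     "tags": ["GST"], "countryRelevance": "IN"},
    {"title": "1099 checklist", "excerpt": "Year-end forms",
     "tags": ["1099"], "countryRelevance": "US"},
]

print([a["title"] for a in filter_articles(articles, country="IN")])
```

Note that a country filter still returns the "global" article, which matches the "matches OR is global" rule in the table.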
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It correctly states that the tool lists published articles and supports filtering by country, tag, or query. However, it does not disclose behavior such as pagination (though the limit parameter exists), default ordering, or what happens when no results are found.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences: the first states the purpose and scope, the second lists the filters, and the third gives a usage hint. It is front-loaded, efficient, and contains no unnecessary text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that this is a straightforward list tool with no output schema, the description is reasonably complete. It covers the tool's function, the resource, supported filters, and a typical use case. It could mention return format or examples, but it is adequate for the complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, meaning all four parameters have descriptions in the schema. The tool description adds minimal extra meaning beyond the schema (e.g., mentioning the types of articles). With high coverage, a baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'list' and resource 'published articles' on hellobooks.ai, and specifies the scope as head-to-head compare pages and curated flagship blog posts. This distinguishes it from sibling tools like list_competitors or list_features.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use this when a user asks "do you have a blog/article about X?"'. This provides clear context for when to use the tool, though it does not say when not to use it or which alternatives exist.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_competitors (grade A)
Return competitor positioning entries (QuickBooks, Xero, FreshBooks, Wave, Zoho Books, Tally) with where HelloBooks wins, where the competitor wins, and pricing notes. Optional country, tier (primary / secondary), and id filters.
| Name | Required | Description | Default |
|---|---|---|---|
| id | No | Return a single competitor by id (e.g. "quickbooks", "xero", "tally"). | |
| tier | No | Filter to head-on rivals (primary) or adjacent / segment-specific overlaps (secondary). | |
| country | No | Only return competitors whose primary market is this country, or who are also evaluated in this market. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must cover behavioral traits. It states what is returned (wins, pricing notes) but does not mention if the data is static, paginated, or if any authentication is needed. For a simple list/read tool, this is adequate but not thorough.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences that front-load the main purpose and quickly cover optional filters. Every sentence adds value with no redundancy or filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description explains the return shape (wins, pricing notes) and lists the competitors. This is fairly complete for a list tool with simple filters. The only minor gap is not clarifying if the response is a list of objects or a single entry when id is used.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds value by listing the competitor names and indicating that 'id' matches those names, but it does not add significant meaning beyond what the schema already provides for each parameter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Return') and resource ('competitor positioning entries') and lists the exact competitors and returned fields. This clearly distinguishes it from sibling tools like 'analyze_balance_sheet' or 'feature_search', which serve different purposes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions optional filters (country, tier, id), giving context on how to narrow down results. However, it does not explicitly state when to use this tool over others or provide negative guidance (e.g., 'Do not use for pricing calculations'). The context from sibling names implies its scope, but explicit usage guidelines would improve it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_credit_packs (grade A)
List HelloBooks AI credit packs — one-time pay-as-you-go top-ups (Boost 5,000, Power 15,000, Mega 50,000, Ultra 150,000 credits) priced in 8 regional currencies (USD, INR, CAD, GBP, AUD, AED, SGD, NZD). Credit packs stack on any plan, including Free. Use this when a user asks how to buy more AI credits or top up after exhausting a plan allowance. Filter by id (boost / power / mega / ultra) or country (ISO code).
| Name | Required | Description | Default |
|---|---|---|---|
| id | No | Restrict the response to a single credit pack. | |
| country | No | ISO country code. Filters prices to one country. Omit to return all 8 markets. | |
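
To show how an agent might use the pack sizes quoted in the description, the sketch below picks the smallest single pack that covers a credit shortfall and turns it into the `id` argument for this tool. The pack sizes come from the description; the helper name `smallest_pack_covering` and the example shortfall are assumptions for illustration.

```python
# Pack sizes as listed in the tool description.
CREDIT_PACKS = {"boost": 5_000, "power": 15_000, "mega": 50_000, "ultra": 150_000}


def smallest_pack_covering(credits_needed):
    """Return the id of the smallest pack with at least `credits_needed` credits.

    Returns None when no single pack is large enough (hypothetical helper).
    """
    if credits_needed <= 0:
        raise ValueError("credits_needed must be positive")
    for pack_id, size in sorted(CREDIT_PACKS.items(), key=lambda item: item[1]):
        if size >= credits_needed:
            return pack_id
    return None


pack_id = smallest_pack_covering(12_000)
# Arguments for a list_credit_packs call restricted to that pack.
arguments = {"id": pack_id} if pack_id else {}
print(arguments)
```

Pricing is deliberately left out of the sketch; it should come from the tool's response for the user's country rather than from hard-coded values.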
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It explains the behavior: the tool lists credit packs, filters by id or country, and notes that packs stack on any plan. However, it does not detail the response format or any authentication requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
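Four sentences: the first states what the tool does along with the packs and currencies, the second notes that packs stack on any plan, the third gives usage guidance, and the fourth covers filtering. There is no redundant information, and the key details are front-loaded.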
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 2 parameters with 100% schema coverage and no output schema, the description provides sufficient context: purpose, usage scenario, filtering, and the fact that credit packs are pay-as-you-go top-ups. It is complete for an agent to invoke correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema already describes the parameters; the description adds value by listing the specific pack names and credit amounts and by explaining the currencies and how filtering is used. This enriches understanding beyond the enum values.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists HelloBooks AI credit packs with specific pack names, credit amounts, and pricing in multiple currencies. It also mentions filtering options, distinguishing it from sibling tools that list other entities like plans, tax rates, etc.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance: 'Use this when a user asks how to buy more AI credits or top up after exhausting a plan allowance.' While it does not explicitly state when not to use the tool, the context of sibling tools implies the alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_feature_categories (grade A)
List the 13 feature categories on the marketing site (Core Accounting, Invoicing, Banking, Reports, Tax & Compliance, Inventory, Warehouse, Manufacturing, AI, Integrations, Mobile, Operations, Industry Modules) with per-category counts by status.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the burden. It does not explicitly state that the operation is read-only or free of side effects, though the nature of listing categories implies it. A more explicit statement would improve transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that is concise and front-loaded with the main action. However, listing all 13 categories inline makes it slightly lengthy; a simplified version could still be clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (no parameters, no output schema, no annotations), the description fully explains what the tool does and what it returns. It is complete and sufficient for an agent to understand its purpose.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has zero parameters, and the input schema is empty with 100% coverage. The description correctly omits parameter details as none are needed. The baseline score for zero parameters is 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List'), the resource ('13 feature categories on the marketing site'), and the output ('per-category counts by status'). It explicitly lists all categories, distinguishing it from sibling tools like list_features or list_integrations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives, such as list_features or list_integrations. The description does not mention any prerequisites or scenarios where this tool is preferred.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_features (grade A)
List the full HelloBooks marketing feature catalog (145+ items). Filter by category, tier, status, marketedOnly, or substring query.
| Name | Required | Description | Default |
|---|---|---|---|
| tier | No | Filter by the minimum plan tier / add-on that unlocks the feature. | |
| limit | No | Max number of features to return. Default 200. | |
| query | No | Optional substring match against label + shortDescription. | |
| status | No | Filter by rollout status. Defaults to all. | |
| category | No | Filter to one feature category (core-accounting, invoicing-billing, etc.). | |
| marketedOnly | No | If true, only return features marketed on the public website. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It describes a read-only operation but lacks details on permissions, rate limits, or behavior when filters return no results. This is adequate but not rich.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two short sentences with the key information front-loaded: the first states the purpose and catalog size, the second lists the filters. There are no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description provides the catalog size (145+) and the list of filters, while the default limit (200) appears only in the schema. It lacks pagination details beyond the limit parameter and does not describe the return format, but it is sufficient for a listing tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description mentions filter types but adds no new meaning beyond what the schema already provides for each parameter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists the full feature catalog (145+ items) and specifies available filters (category, tier, status, marketedOnly, query). This distinguishes it from siblings like `feature_search` which likely does more targeted search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for listing with filters but does not explicitly state when to use this tool versus alternatives like `feature_search`. No exclusions or context on when not to use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_integrations (grade B)
List integrations (banks, payments, payroll, time tracking, shipping, accounting sync, ecommerce, CRM).
| Name | Required | Description | Default |
|---|---|---|---|
| status | No | Filter by rollout status. | |
| country | No | Only return integrations available in this country (or global). | |
| category | No | Filter to one integration category. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the burden of behavioral disclosure. It only mentions listing integrations and example categories, omitting details about pagination, rate limits, ordering, or response format. This is insufficient for understanding the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence with a parenthetical list. It is front-loaded and contains no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of integration categories and three optional filter parameters with no output schema or annotations, the description is too brief. It does not explain what the response looks like, the nature of the integrations listed, or any constraints. More context would be needed for an agent to use it effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear parameter descriptions for status, country, and category. The description adds the list of integration types as examples, which slightly complements the category enum. However, it does not add significant meaning beyond the schema, so a baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'List' and the resource 'integrations', followed by a parenthetical list of example categories. This distinguishes it from sibling tools, which are primarily analytical or compliance-related.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives is provided. The sibling tools are mostly different in purpose, so usage is implied, but the description lacks any exclusions or context about filtering.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_plans (grade A)
List HelloBooks pricing plans with monthly + annual prices in 8 regional currencies (USD, INR, CAD, GBP, AUD, AED, SGD, NZD). Covers four core tiers — Free / Pro / Business / Partner Program (the cpa plan id) — plus two per-entity stackable add-ons (Warehouse, Manufacturing). Returns AI credit allowance, feature bullets (AI auto-categorization, unlimited users, multi-entity, 3-way matching, API access, etc.), and the public signup URL. Filter by plan (one of free / pro / business / cpa) or country (ISO code). Pricing follows Doc 19 v3 (Web-Fire #514, 2026-06-12): Business re-introduced as a 4th tier sized ~4× Pro to match Partner Points; the retired "$59.99/mo + $4.99/client" CPA SKU is gone and the cpa plan id now resolves to the free Partner Program (call partner_program_info for the status ladder + points math). HelloCPA Practice Management is a separate product on practice.hellobooks.ai — call practice_management_info, NOT this tool.
| Name | Required | Description | Default |
|---|---|---|---|
| plan | No | Restrict the response to a single plan tier (`cpa` = free Partner Program). | |
| country | No | ISO country code. Filters prices to one country. Omit to return all 8 markets. | |
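
The description draws boundaries between this tool and its siblings. The sketch below shows one way a caller could encode those boundaries as a simple keyword router. In practice the calling LLM makes this choice from the descriptions; the function name `pick_pricing_tool` and the keyword lists are assumptions for illustration, while the tool names are the ones mentioned on this page.

```python
# (keywords, tool) pairs, checked in order; the keyword lists are illustrative.
ROUTES = [
    (("practice management", "hellocpa"), "practice_management_info"),
    (("partner program", "partner points", "status ladder"), "partner_program_info"),
    (("credit pack", "top up", "top-up"), "list_credit_packs"),
]


def pick_pricing_tool(question):
    """Return the tool name that best fits a pricing-related question."""
    text = question.lower()
    for keywords, tool in ROUTES:
        if any(keyword in text for keyword in keywords):
            return tool
    # Default: ordinary plan pricing questions go to list_plans.
    return "list_plans"


print(pick_pricing_tool("What does the Pro plan cost in GBP?"))
print(pick_pricing_tool("How is HelloCPA Practice Management priced?"))
```

Checking the more specific products first mirrors the description's "call practice_management_info, NOT this tool" instruction.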
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses the tool's read-only nature (listing), the data source (Doc 19 v3), and important behavioral details: that the 'cpa' plan id now resolves to the free Partner Program and that the old CPA SKU is retired. It accurately describes the output content without contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is somewhat lengthy (multiple sentences) and includes detailed asides (e.g., the pricing document reference and historical notes) that could be simplified or moved. However, it is front-loaded with the core purpose and structured logically. It earns credit for being informative but loses points for not being more concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (2 optional parameters, no required, no output schema), the description covers all necessary context: what is returned (prices, credits, features, URL), filter usage, and clear boundaries with sibling tools. It is fully sufficient for an agent to select and invoke this tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the baseline is 3. The schema itself already states that `cpa` means the free Partner Program and that omitting country returns all 8 markets. The description adds value beyond that by explaining why `cpa` resolves to the Partner Program (the old CPA SKU was retired) and by naming the 8 currencies, which aids correct invocation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists HelloBooks pricing plans with specific details (monthly+annual prices in 8 currencies, four core tiers plus add-ons, AI credit allowance, feature bullets, signup URL). It explicitly distinguishes from sibling tools like partner_program_info and practice_management_info by noting what those cover instead.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool (to list pricing plans) and when to use alternatives (e.g., call partner_program_info for the status ladder and points math, practice_management_info for the separate Practice Management product). It also explains how filtering works (by plan or country).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_tax_rates (grade A)
List statutory tax-rate slabs by jurisdiction — IN GST (5/12/18/28 + zero + exempt + composition trader/manufacturer/restaurant + compensation cess), UK VAT (20 / 5 / zero / exempt), AU GST (10 / GST-free), US sales-tax (state-administered summary, no federal rate), CA GST 5% + HST 13% ON / 15% Atlantic, SG GST 9%, NZ GST 15%, AE VAT 5%. Filter by country, taxType (GST/VAT/Sales-Tax/HST/IGST/CGST-SGST/TDS/TCS), or scheme (standard / reduced / zero / exempt / composition / cess / state-summary). Every entry carries an effective-from date and an authoritative source URL (CBIC, gov.uk, ATO, CRA, IRAS, IRD, FTA, Tax Foundation) — agents should confirm the rate against the source before quoting figures to a user. Use this when a user asks "what is the GST rate on X?", "what VAT band does Y fall into?", or "what are the composition slabs in India?". This is the public statutory reference — for an org-specific tax assignment use the authenticated books_classify_event tool.
| Name | Required | Description | Default |
|---|---|---|---|
| scheme | No | Filter by slab category — standard, reduced, zero, exempt, composition, cess. | |
| country | No | Filter to one jurisdiction. Omit to return every supported country. | |
| taxType | No | Filter by statutory tax type (GST, VAT, Sales-Tax, HST, etc.). | |
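
The description asks agents to confirm a rate against its source before quoting it. A minimal sketch of that habit is shown below: it refuses to format an entry that lacks an effective-from date or source URL. There is no output schema, so the field names (`country`, `taxType`, `scheme`, `rate`, `effectiveFrom`, `sourceUrl`) and the placeholder values are assumptions, not the tool's actual response shape.

```python
def format_rate_quote(entry):
    """Format one rate entry for quoting, always citing its source."""
    # Refuse to quote an entry that cannot be traced back to a source.
    missing = [key for key in ("effectiveFrom", "sourceUrl") if not entry.get(key)]
    if missing:
        raise ValueError(f"entry lacks {missing}; do not quote it")
    return (
        f"{entry['country']} {entry['taxType']} ({entry['scheme']}): "
        f"{entry['rate']}% effective from {entry['effectiveFrom']}. "
        f"Confirm against {entry['sourceUrl']} before relying on this figure."
    )


# Hypothetical entry: the 5% AE VAT rate is from the description above;
# the date and URL are placeholders, not real response values.
sample_entry = {
    "country": "AE",
    "taxType": "VAT",
    "scheme": "standard",
    "rate": 5,
    "effectiveFrom": "<effective-from date>",
    "sourceUrl": "<source URL>",
}

print(format_rate_quote(sample_entry))
```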
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses key behaviors: each entry carries an effective-from date and an authoritative source URL, agents are advised to confirm the rate against the source before quoting it, and the tool is a public statutory reference rather than an org-specific one. This is thorough and honest.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is lengthy but well-structured: opening with jurisdiction examples, then filtering options, then usage guidance. Every sentence adds value, though some could be slightly more concise. It's organized for quick scanning.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema, the description explains what the response contains (effective-from date, source URL) and provides comprehensive coverage of supported countries, tax types, and schemes. It also gives guidance on confirming sources, making it complete for the tool's purpose.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, but the description adds significant context beyond schema enums by listing example countries and tax types, and explaining the meaning of `scheme` values (e.g., composition, cess). This helps the agent understand parameter semantics beyond just the enum values.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists statutory tax-rate slabs by jurisdiction, provides concrete examples of rates and countries, and distinguishes from sibling tools like `lookup_tax_rate`. The verb 'list' and resource 'tax-rate slabs' is specific and informative.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit usage guidance: use when a user asks specific tax rate questions (e.g., 'what is the GST rate on X?'), and contrast with an alternative tool for org-specific tax assignment. This effectively helps the agent decide when to invoke this tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_videos (A)
List HelloBooks product videos curated on the marketing site (homepage demo + feature walkthroughs) and the official @hellobooksai YouTube channel link. Each video returns title, description, category, watch URL, embed URL and thumbnail. Filter by category (demo / features / overview), featuredOnly, or free-text query. Use this when a user asks for a demo, walkthrough or video. Note: this is the curated set, not a live mirror of every channel upload — the response includes the channel URL for the full catalog.
| Name | Required | Description | Default |
|---|---|---|---|
| query | No | Optional substring match against video title + description. | |
| category | No | Filter to one video category (demo, features, overview). | |
| featuredOnly | No | If true, only return videos flagged as featured on the marketing site. | |
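As a rough illustration of how the three optional filters might be passed, the sketch below wraps them in an MCP JSON-RPC `tools/call` request. The `build_tool_call` helper is hypothetical and not part of this server; only the tool name and parameter names come from the table above, and the transport (sending the request over Streamable HTTP) is left out.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 `tools/call` request.

    Hypothetical helper for illustration; an MCP client library would
    normally build this envelope for you.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Ask for featured demo videos only; filters that are not needed are left out.
request = build_tool_call("list_videos", {"category": "demo", "featuredOnly": True})
print(json.dumps(request, indent=2))
```

Because every parameter is optional, passing an empty `arguments` object should correspond to the unfiltered curated list.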
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, but the description discloses the curated nature, filter behavior, returned fields, and includes the channel URL for full catalog. Does not mention rate limits or auth, but sufficient given simplicity.
Is the description appropriately sized, front-loaded, and free of redundancy?
Five short sentences, front-loaded with the main purpose, no wasted words. Efficient and clear.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but description enumerates returned fields (title, description, category, watch URL, embed URL, thumbnail). Covers filtering, curated nature, and provides channel link for full catalog. Complete for a filtered list tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers all parameters with descriptions (100%). The description reiterates filter options but adds little beyond the schema. Baseline 3 is appropriate.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it lists HelloBooks product videos curated on the marketing site and YouTube channel, specifying returned fields and filter options. Distinguishes from a live mirror of all channel uploads.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use this when a user asks for a demo, walkthrough or video.' Notes it's a curated set, implying when not to use (for full catalog). No explicit alternatives but siblings are unrelated.
local_payment_methods (A)
List local bank-rail / wallet payment methods relevant to HelloBooks invoice collection (AR), B2B supplier payments (AP), and contractor payouts (UPI, RuPay, Razorpay, IMPS, NEFT, RTGS, BACS, FPS, CHAPS, Open Banking, Interac e-Transfer, EFT, PayID, PayTo, NPP, BPAY, ACH, Same Day ACH, Fedwire, RTP, Zelle, PayNow, FAST, GIRO, NZ Direct Credit, etc.). Returns rail (instant / same-day / next-day / multi-day), use-cases, issuing authority, HelloBooks support level, and operational notes (per-transaction caps, settlement windows, retirement timelines). Filter by country, useCase, rail, or id.
| Name | Required | Description | Default |
|---|---|---|---|
| id | No | Return a single payment method by id (e.g. "in-upi", "au-payid", "us-rtp"). | |
| rail | No | Filter by settlement rail (instant, same-day, next-day, multi-day). | |
| country | No | Filter to one country (IN, US, CA, GB, AU, AE, SG, NZ). | |
| useCase | No | Filter by payment use-case. Defaults to HelloBooks' invoice-collection + b2b-supplier + contractor-payout scope; pass an explicit value to widen. | |
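The filter values listed in the table can be checked on the client side before a call is made. The following is a minimal sketch under that assumption: `payment_method_args` is a hypothetical helper, the country and rail sets are copied from the parameter descriptions above, and `useCase` is passed through unchecked because its full set of values is not listed here.

```python
# Values copied from the parameter table; the server remains the source of truth.
RAILS = {"instant", "same-day", "next-day", "multi-day"}
COUNTRIES = {"IN", "US", "CA", "GB", "AU", "AE", "SG", "NZ"}


def payment_method_args(country=None, rail=None, use_case=None, method_id=None):
    """Build the arguments object for local_payment_methods (hypothetical helper)."""
    args = {}
    if method_id is not None:
        args["id"] = method_id
    if country is not None:
        if country not in COUNTRIES:
            raise ValueError(f"unsupported country: {country!r}")
        args["country"] = country
    if rail is not None:
        if rail not in RAILS:
            raise ValueError(f"unsupported rail: {rail!r}")
        args["rail"] = rail
    if use_case is not None:
        # Not validated: the allowed use-case values are not enumerated above.
        args["useCase"] = use_case
    return args


print(payment_method_args(country="IN", rail="instant"))
print(payment_method_args(method_id="au-payid"))
```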
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fully discloses return values: rail, use-cases, issuing authority, support level, and operational notes. It also mentions default filtering scope, providing sufficient behavioral insight.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph but covers purpose, usage, and outputs efficiently. It is front-loaded with the main action. Slightly verbose due to listing many payment methods, but still clear and focused.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 4 parameters, no output schema, and no annotations, the description is complete. It explains what the response includes (rail, use-cases, etc.), compensating for missing output schema. All necessary information for an AI agent to use the tool is present.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, baseline 3. The description adds value by providing examples for the 'id' parameter (e.g., 'in-upi', 'au-payid') and clarifying the default scope for 'useCase', enhancing parameter understanding beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'List' and the resource 'local bank-rail / wallet payment methods'. It specifies relevance to HelloBooks' invoice collection, B2B supplier payments, and contractor payouts, distinguishing it from sibling tools which are analysis-focused or list other resources.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description indicates the tool can be used to list payment methods and filter by country, useCase, rail, or id. It does not explicitly state when not to use or provide alternatives, but the sibling tools do not overlap in purpose, so the usage context is clear.
lookup_tax_rate (A)
Pick a single statutory tax-rate slab — either by exact id (e.g. IN-standard-18, GB-zero-0, CA-hst-13-on) for a deterministic lookup, or by country + free-text category (e.g. "office supplies", "restaurant", "exports", "domestic fuel") for a fuzzy best-match. Returns the matched rate, the match score, and the authoritative source URL. Use this when a user asks "what slab does X fall into in India?" or "what VAT rate applies to children's car seats?". For broader exploration (all slabs in a country / all rates of one scheme), use list_tax_rates. No customer data — public statutory reference only.
| Name | Required | Description | Default |
|---|---|---|---|
| id | No | Exact rate id, e.g. `IN-standard-18` or `GB-zero-0`. When set, country/category are ignored. | |
| country | No | Country to search within. Required when `id` is not provided. | |
| category | No | Free-text query — "office supplies", "restaurant", "exports", "domestic fuel". | |
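The interaction between the three parameters (an exact `id` wins, otherwise `country` is required and `category` is optional free text) can be sketched as a small argument builder. `lookup_tax_rate_args` is a hypothetical client-side helper, not part of the server; it only mirrors the rules stated in the table above.

```python
def lookup_tax_rate_args(rate_id=None, country=None, category=None):
    """Build arguments for lookup_tax_rate, mirroring the documented precedence."""
    if rate_id:
        # Deterministic mode: country/category are ignored when id is set,
        # so they are not sent at all.
        return {"id": rate_id}
    if not country:
        raise ValueError("country is required when id is not provided")
    # Fuzzy mode: search within one country, optionally narrowed by category.
    args = {"country": country}
    if category:
        args["category"] = category
    return args


print(lookup_tax_rate_args(rate_id="IN-standard-18"))
print(lookup_tax_rate_args(country="GB", category="children's car seats"))
```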
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden and excels: it discloses deterministic vs fuzzy behavior, return fields (matched rate, match score, source URL), and states 'No customer data — public statutory reference only.' No contradictions.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded paragraph with no wasted words. Every sentence serves a purpose: action, modes, examples, alternatives, and data handling note.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema, the description thoroughly explains return values and behavior. It covers both modes, provides usage examples, and addresses data sensitivity, making it fully self-contained for an agent to invoke correctly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds significant value by explaining the interaction between parameters (id overrides country/category), providing concrete examples for id format and category free text, and clarifying the logic of the two modes.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool picks a single statutory tax-rate slab and distinguishes between deterministic lookup by id and fuzzy match by country+category. It explicitly differentiates from the sibling tool list_tax_rates for broader exploration.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit when-to-use examples (e.g., 'what slab does X fall into in India?') and when-not-to-use guidance ('For broader exploration... use list_tax_rates'). It covers both modes and their appropriate contexts.
partner_program_info (A)
Return the HelloBooks Partner Program structure — free reseller channel for accountants/CPAs/bookkeepers who put their clients on Pro/Business plans. Earned wholesale discount model (no commission): Bronze 5% at 25 pts / Silver 10% at 75 / Gold 15% at 300 / Platinum 20% at 1,000+. Pro client = 1 pt/mo, Business client = 4 pts/mo. Call with no args for the full ladder + meta; with points for current status + how many points to next tier; with proClients and/or businessClients to derive points from the client book and then return the same status verdict. The Partner Program is the cpa plan id in list_plans (the retired $59.99 SKU is gone). HelloCPA Practice Management at practice.hellobooks.ai is a different product and NOT the Partner Program.
| Name | Required | Description | Default |
|---|---|---|---|
| points | No | Current Partner Points total. If supplied, the response includes the partner's status + how many more points are needed to reach the next tier. | |
| proClients | No | Active Pro-plan clients (1 point each per month). Combined with `businessClients` to derive points if `points` is not supplied directly. | |
| businessClients | No | Active Business-plan clients (4 points each per month). Combined with `proClients` to derive points. | |
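The tier arithmetic in the description can be worked through with a short sketch. The thresholds, discounts, and per-client point values are taken from the description above; everything else is an assumption for illustration: the function names are hypothetical, points are derived from a single month's client book, and a total below 25 points is treated as having no tier. The server's actual response shape may differ.

```python
# (tier name, points threshold, wholesale discount %) as quoted in the description.
TIERS = [("Bronze", 25, 5), ("Silver", 75, 10), ("Gold", 300, 15), ("Platinum", 1000, 20)]
PRO_POINTS = 1       # points per Pro client per month
BUSINESS_POINTS = 4  # points per Business client per month


def derive_points(pro_clients=0, business_clients=0):
    """Points for one month of the given client book (assumed snapshot)."""
    return pro_clients * PRO_POINTS + business_clients * BUSINESS_POINTS


def partner_status(points):
    """Return the current tier, its discount, and the gap to the next tier."""
    tier, discount = None, 0  # assumption: no tier below the Bronze threshold
    next_tier, points_to_next = None, None
    for name, threshold, pct in TIERS:
        if points >= threshold:
            tier, discount = name, pct
        else:
            next_tier, points_to_next = name, threshold - points
            break
    return {
        "tier": tier,
        "discount_pct": discount,
        "next_tier": next_tier,
        "points_to_next": points_to_next,
    }


points = derive_points(pro_clients=10, business_clients=5)  # 10*1 + 5*4 = 30
print(points, partner_status(points))
```

With 10 Pro and 5 Business clients this yields 30 points, which sits in Bronze (5%) and is 45 points short of Silver.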
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses behavior: it is a read operation that returns program structure, computes status based on points or client count, and explains the earning model. No destructive actions implied.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is comprehensive but slightly verbose. It front-loads the main purpose and then details parameter usage, making it well-structured. Could be tightened without losing clarity.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description fully covers the tool's functionality for all invocation modes, even without an output schema. It explains what the response contains for each case (full ladder, status with next tier, status verdict).
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds significant value by explaining the point system, tier thresholds, and how to derive points from client counts. This goes far beyond the schema's basic parameter descriptions.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it returns the Partner Program structure, with specific verbs ('Return') and resources ('HelloBooks Partner Program structure'). It distinguishes from sibling tool 'practice_management_info' by clarifying that tool is a different product, avoiding confusion.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly details when to use each parameter combination (no args for full ladder, 'points' for status, client counts to derive points). It also tells what the tool is NOT (the practice management product), providing clear usage context.
practice_management_info (A)
Return HelloCPA Practice Management info — the standalone product at practice.hellobooks.ai for running a CPA / CA / bookkeeping practice (proposals + CPQ, workflow, time tracking, billing, 6-role RBAC, Gmail/Outlook/Calendar sync, CSV migration from TaxDome / Karbon / Canopy). NOT the Partner Program and NOT a tier in list_plans. Per-user pricing model — US shipped at $9.99/user/month (free up to 2 users + 10 clients, 90-day trial, enterprise at 50+ users). 7 other markets (IN, GB, AU, CA, AE, SG, NZ) are roadmap as of 2026-06-12. Call with no args for the full 8-region matrix + features + meta, or with country for one region's status + pricing + competitor frame.
| Name | Required | Description | Default |
|---|---|---|---|
| country | No | ISO country code. Omit to get the full 8-region matrix. US is shipped; the other 7 are roadmap as of 2026-06-12. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses that the tool is for retrieving information (read-only), mentions pricing model, markets, and call behavior. While it doesn't explicitly state no side effects, the context implies it's safe. It adds behavioral context beyond the schema.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the key purpose and each sentence adds value. It could be slightly more concise (e.g., combining pricing and markets), but overall it's well-structured and information-dense.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains what the tool returns: 'full 8-region matrix + features + meta' or 'one region's status + pricing + competitor frame'. This is sufficient for an info retrieval tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description adds significant meaning: it explains the effect of omitting vs. providing the country parameter, lists the valid countries, and gives context (shipped vs. roadmap). This goes beyond the enum description.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns 'HelloCPA Practice Management info' and distinguishes it from sibling tools like partner_program_info and list_plans. It specifies the standalone product at a specific URL and lists features, providing a specific verb and resource.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly tells when to use the tool (to get info on the standalone product) and what it is NOT (Partner Program, tier in list_plans). It also provides guidance on calling with or without arguments ('Call with no args... or with `country`'), effectively guiding the agent on usage.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
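For server owners who prefer to generate the claim file rather than write it by hand, the sketch below produces the same structure shown above. `glama_claim_json` is a hypothetical helper; the schema URL and the `maintainers` layout are copied from the example, and how the output gets served at `/.well-known/glama.json` depends on your own hosting setup.

```python
import json

SCHEMA_URL = "https://glama.ai/mcp/schemas/connector.json"


def glama_claim_json(email):
    """Return the contents of a glama.json claim file for one maintainer."""
    if "@" not in email:
        raise ValueError(f"not an email address: {email!r}")
    document = {
        "$schema": SCHEMA_URL,
        "maintainers": [{"email": email}],
    }
    return json.dumps(document, indent=2)


# Replace the placeholder with the address tied to your Glama account.
print(glama_claim_json("your-email@example.com"))
```

Save the printed output as `glama.json` under your web root's `.well-known` directory.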
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.