agent-receipts
Server Quality Checklist
- Disambiguation: 4/5
Tools are generally distinct, though `create_receipt` and `track_action` overlap in purpose (both create receipts), with subtle differences in hashing control that require careful reading. The parallel completion workflows (`complete_receipt` vs `complete_judgment`) are clearly separated by domain (regular receipts vs judgments), minimizing confusion.
- Naming Consistency: 4/5
Most tools follow a consistent verb_noun pattern (e.g., `create_receipt`, `verify_receipt`, `generate_invoice`). Minor deviations exist: `cleanup` lacks a noun object (it should be `cleanup_receipts`), and `get_started` uses a participle instead of a resource name. Pluralization is consistent with CRUD conventions (singular for a single get, plural for list).
- Tool Count: 5/5
Fourteen tools is well-scoped for a cryptographic receipt auditing system. The set includes creation (two modes), completion, verification, querying (single, list, chain, judgment), evaluation workflows, billing, key management, and maintenance—each earning its place without bloat.
- Completeness: 5/5
Full lifecycle coverage for immutable audit receipts: creation (manual or automatic hashing), two-phase or atomic completion, cryptographic verification, chain tracking, judgment evaluation with criteria, TTL-based cleanup, and invoice generation. The absence of arbitrary update and delete operations is appropriate for the audit-trail domain.
Average 4.4/5 across the 14 tools scored.
See the tool scores section below for per-tool breakdowns.
- This repository includes a README.md file.
- This repository includes a LICENSE file.
- Latest release: v0.2.11
- Tools from this server were used 8 times in the last 30 days.
- This repository includes a glama.json configuration file.
- This server provides 14 tools.
- No known security issues or vulnerabilities have been reported.
- This server has been verified by its author.
Tool Scores
Each tool is scored 1–5 on six dimensions (each per-tool card below runs through them in order, starting at Behavior):
- Behavior: does the description disclose side effects, auth requirements, rate limits, or destructive behavior? Agents need to know what a tool does to the world before calling it; descriptions should go beyond structured annotations to explain consequences.
- Conciseness: is the description appropriately sized, front-loaded, and free of redundancy? Shorter descriptions cost fewer tokens and are easier for agents to parse; every sentence should earn its place.
- Completeness: given the tool's complexity, does the description cover enough for an agent to succeed on the first attempt? Complex tools with many parameters or behaviors need more documentation, simple tools need less; this dimension scales expectations accordingly.
- Parameters: does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides? Input schemas describe structure but not intent; descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose: does the description clearly state what the tool does and how it differs from similar tools? Agents choose between tools based on descriptions; a clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines: does the description explain when to use this tool, when not to, or what alternatives exist? Agents often have multiple tools that could apply; explicit guidance like "use X instead of Y when Z" prevents misuse.

- Behavior: 4/5
With no annotations provided, the description carries the full burden and successfully discloses cryptographic behavior ('re-signed with Ed25519'), the linkage mechanism ('linked to the original receipt via parent_receipt_id'), and return values ('Returns the judgment receipt ID, verdict, score, and chain ID'). It is missing only safety/destructiveness context.
Conciseness: 5/5
Five tightly constructed sentences with zero waste: action/scope, data recorded, cryptographic processing, return values, and usage timing. Information is front-loaded with the primary verb, and every sentence earns its place without redundancy.
Completeness: 4/5
For a 6-parameter mutation tool with complex nested array structures and no output schema, the description adequately covers the return payload, cryptographic side effects, and workflow prerequisites. A minor gap around idempotency and error-handling scenarios prevents a 5.
Parameters: 3/5
Schema description coverage is 100%, establishing a baseline of 3. The description lists the conceptual data being submitted (verdict, score, criteria_results, confidence) but adds no semantic meaning, validation rules, or format guidance beyond what the schema already provides.
Purpose: 5/5
The description opens with a specific action ('Submit evaluation results to finalize a pending judgment receipt') and explicitly names the sibling tool `judge_receipt` to establish the workflow relationship, clearly distinguishing this finalization step from receipt creation and other operations.
Usage Guidelines: 4/5
Provides explicit temporal guidance ('Use immediately after evaluating the prompt returned by judge_receipt') that establishes the sequence in the workflow. It lacks explicit 'when not to use' exclusions and alternative tools, but clearly positions the tool in the judgment lifecycle.
- Behavior: 4/5
No annotations are provided, so the description carries the full burden. It discloses critical behavioral traits: ordering ('timestamp ascending'), return structure ('complete receipt objects for every step'), and grouping semantics. It is missing an explicit read-only declaration and error handling (e.g., an invalid chain_id), but 'Retrieve' implies a safe read operation.
Conciseness: 5/5
Four sentences with zero waste: (1) core action plus ordering, (2) domain definition, (3) return format, (4) use cases. Front-loaded with the essential operation and ordering constraint. Every sentence earns its place.
Completeness: 4/5
Strong coverage for a tool with no output schema. It explains the return values ('complete receipt objects'), the domain concept (a chain as a workflow grouping), and practical applications. It would need error-handling or pagination notes for a 5, but is adequate for the complexity level.
Parameters: 3/5
Schema coverage is 100%, with a detailed format specification for chain_id. The description adds conceptual context that a chain represents a 'multi-step agent workflow', which helps agents understand what the ID refers to, but does not add syntax or constraint details beyond the schema.
Purpose: 5/5
The description uses the specific verb 'Retrieve' with a clear resource ('receipts belonging to a chain') and distinguishes itself from siblings (get_receipt for single receipts, list_receipts for ungrouped lists) by defining a chain as 'related receipts from a multi-step agent workflow' with explicit ordering ('timestamp ascending').
Usage Guidelines: 4/5
Provides explicit positive use cases: 'audit a complete workflow, calculate total chain cost and duration, or identify which step in a pipeline failed.' It lacks explicit 'when not to use' guidance and named alternatives, but the use cases clearly signal this is for sequential workflow analysis rather than single-receipt lookup.
- Behavior: 4/5
With no annotations provided, the description carries the full burden. It successfully discloses the destructive nature (Delete), explains the dry_run safety mechanism ('preview deletions without committing'), and documents the return values ('Returns count of deleted receipts and remaining total'). It could be improved by stating whether deletions are permanent or reversible.
Conciseness: 4/5
Five sentences total, with the main action front-loaded. There is slight redundancy between sentences 1 and 2 (both explain the expiration logic), but sentence 2 adds necessary technical specificity about the metadata field. Otherwise efficiently structured with no filler.
Completeness: 5/5
Despite no output schema, the description explicitly documents the return structure ('count of deleted receipts and remaining total'). It also explains the domain-specific expiration logic (the expires_at field) and operational context (TTL policies), making it complete for a single-parameter maintenance tool.
Parameters: 4/5
Schema coverage is 100%, with a complete technical description of the dry_run parameter. The description adds valuable conceptual framing ('preview deletions without committing') that complements the schema's technical definition, helping the agent understand the parameter's purpose beyond its boolean type.
Purpose: 5/5
The description clearly states the specific action (delete receipts), the target resource (expired receipts), and the condition (an expires_at field in metadata earlier than the current time). It clearly distinguishes itself from sibling tools like create_receipt, get_receipt, and verify_receipt by focusing on batch cleanup of expired data.
Usage Guidelines: 4/5
Provides explicit usage context ('Use periodically to manage storage and enforce TTL policies') and explains the dry_run safety mechanism. While it doesn't explicitly name alternatives or exclusions, it clearly frames when and why to invoke this tool (periodic maintenance). A sketch of the dry-run-then-commit pattern follows this card.
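The preview-then-commit pattern implied by dry_run can be sketched as follows. This is a minimal illustration, not the server's documented API surface: call_tool is a placeholder for your MCP client's tool invocation, and the shape of the returned objects is assumed.

```python
from typing import Any

def call_tool(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Placeholder: substitute your MCP client's tool-call method
    (e.g. ClientSession.call_tool in the official Python SDK)."""
    raise NotImplementedError

# Preview which expired receipts would be deleted, without committing.
preview = call_tool("cleanup", {"dry_run": True})

# Commit only after reviewing the preview. Per the description, the tool
# returns the count of deleted receipts and the remaining total.
result = call_tool("cleanup", {"dry_run": False})
```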
- Behavior: 4/5
With no annotations provided, the description carries the full burden and succeeds well: it explains the aggregation logic (by agent/action/day), the calculations performed (costs, tokens, counts), the output format options, and the cryptographic verification behavior. A minor gap remains around persistence (file vs. returned data) and side effects.
Conciseness: 5/5
Five well-structured sentences that are front-loaded (core action first) and waste-free. Each sentence earns its place: source data, aggregation logic, output formats, audit trail, and use-case guidance.
Completeness: 4/5
Comprehensive for a 14-parameter tool with complex domain logic (cryptographic billing). It covers the aggregation, filtering, and format options implied by the schema. A minor deduction because the missing output schema leaves the return value under-described, though the input complexity is well addressed.
Parameters: 4/5
With 100% schema coverage, the baseline is 3. The description adds valuable semantic context by mapping narrative elements to parameters: 'date range' (from/to), 'agent, action, or day' (the group_by enum), and output formats (the format enum). It explains the domain significance of include_receipts via the 'cryptographic proof' context.
Purpose: 5/5
States a specific verb (generate), resource (client invoice), and source data (cryptographically signed receipts). It clearly distinguishes itself from siblings like list_receipts and create_receipt by emphasizing aggregation into a billing document with cryptographic audit trails.
Usage Guidelines: 4/5
Explicitly states when to use the tool ('Use to bill clients for AI agent work with cryptographic proof'), providing clear context. It lacks explicit 'when not to use' statements and named alternatives (e.g., use list_receipts for simple querying without billing).
- Behavior: 4/5
With no annotations provided, the description carries the full burden and successfully discloses return behavior: 'Returns an array... ordered by timestamp', the specific fields included ('verdict, score, criteria results, and confidence'), and edge-case handling ('Returns empty array if no judgments exist'). It is missing only safety-profile hints (read-only status), which would make it a 5.
Conciseness: 5/5
Five tightly constructed sentences with zero waste. Front-loaded with the core action, followed by the linkage explanation, return structure, use cases, and edge case. Every sentence earns its place; no redundancy with the schema or title.
Completeness: 4/5
For a single-parameter retrieval tool without an output schema, the description comprehensively compensates by detailing the return array's structure, ordering, contents, and empty-state behavior. It also explains the domain-specific parent_receipt_id concept. A minor gap: it does not explicitly state the read-only/idempotent nature of the operation.
Parameters: 4/5
While the schema has 100% coverage describing the receipt_id parameter, the description adds valuable semantic context by explaining the relationship mechanism ('Judgment receipts are linked via parent_receipt_id'), clarifying that the input is the original receipt ID, not the judgment ID. This domain context exceeds the baseline of 3 for high-coverage schemas.
Purpose: 5/5
The description opens with a specific verb ('Retrieve') and resource ('judgment receipts'), clearly defining the scope as 'all judgment receipts associated with a given receipt ID'. It distinguishes itself from siblings like `get_receipt` (which gets a single receipt) and `judge_receipt` (which likely creates judgments) through explicit mention of the parent_receipt_id linkage and the retrieval action.
Usage Guidelines: 4/5
Provides clear positive use cases: 'Use to review the evaluation history of a receipt, compare multiple judgments, or audit AI quality assessments.' This contextualizes when to use the tool versus alternatives like `get_receipt` (for the receipt itself) or `judge_receipt` (for creating new judgments). It lacks explicit 'when not to use' exclusions, but the context is clear enough for selection.
- Behavior: 3/5
No annotations are provided, so the description carries the full burden. It adds valuable context about what the guide contains (usage examples for specific workflows), but does not disclose behavioral traits such as side effects, idempotency, authentication requirements, or the output format of the guide.
Conciseness: 5/5
The description consists of two efficiently structured sentences in which every clause earns its place. The first sentence defines the tool's function and content scope, while the second provides clear usage timing, with no redundant or wasteful language.
Completeness: 4/5
Given the tool's simplicity (zero parameters, no output schema) and its role as a discovery helper, the description provides appropriate completeness by explaining what the guide contains and when to invoke it. A minor gap remains in not specifying the output format of the displayed guide.
Parameters: 4/5
The input schema contains zero parameters (an empty object). Under the baseline rules, zero parameters warrants a baseline score of 4, as there are no parameter semantics to clarify beyond what the schema provides.
Purpose: 5/5
The description uses the specific verb 'Display' with the resource 'getting-started guide' and explicitly distinguishes this from operational siblings by positioning it as the setup/reference tool. It maps the guide's contents to specific sibling functions (record agent actions, verify receipts, use receipt chains, evaluate with constraints, generate invoices), clearly differentiating it from tools like track_action or verify_receipt.
Usage Guidelines: 5/5
The description provides explicit temporal guidance ('Call this tool first when setting up Agent Receipts') and situational guidance ('when you need a reference for available tools'), clearly indicating this is an onboarding/discovery tool to be used before or alongside operational tools rather than for actual receipt operations.
- Behavior: 4/5
With no annotations provided, the description carries the full disclosure burden. It effectively describes the state change (creates a 'pending judgment receipt'), the return value ('structured evaluation prompt'), security properties ('Ed25519-signed for auditability'), and the critical behavioral expectation that 'the host model (you) evaluates.' However, it lacks details on error conditions, idempotency, and the specific structure of the returned prompt.
Conciseness: 5/5
The description comprises four sentences with zero waste: sentence 1 defines the core action, sentence 2 establishes the workflow involving the host model and complete_judgment, sentence 3 specifies the usage context (beyond pass/fail), and sentence 4 notes security properties. Information is front-loaded and every sentence earns its place.
Completeness: 4/5
Given the complex nested schema and multi-step workflow, the description provides excellent context by explaining the two-phase process (start, then complete) and the auditability features. However, without an output schema, it could further clarify what the 'structured evaluation prompt' contains to fully prepare the agent for handling the response.
Parameters: 3/5
With 100% schema description coverage, the schema adequately documents all parameters (receipt_id, rubric, output_summary_for_review). The description references these in context ('provided rubric criteria,' 'receipt's output') but does not add significant semantic meaning beyond what the schema already provides, warranting the baseline score of 3.
Purpose: 5/5
The description explicitly states that the tool 'Start[s] an AI judgment evaluation' by creating a 'pending judgment receipt,' using specific verbs and resources. It clearly distinguishes itself from sibling tools by explicitly noting that the subsequent step requires calling 'complete_judgment with the results,' establishing its role in a multi-step workflow.
Usage Guidelines: 5/5
The description provides explicit workflow guidance: 'Use to assess output quality beyond simple pass/fail constraints' (when to use it versus simple verification alternatives), and explicitly names the required follow-up action, 'then calls complete_judgment.' This clearly maps the tool's place in the evaluation pipeline relative to siblings like verify_receipt or get_receipt. A sketch of the two-phase judgment flow follows this card.
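A sketch of the two-phase judgment flow described in this card and the complete_judgment card above. Only the tool names and the parameter names (receipt_id, rubric, output_summary_for_review, verdict, score, criteria_results, confidence) come from the cards; call_tool, the rubric shape, and every literal value are hypothetical.

```python
from typing import Any

def call_tool(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Placeholder for an MCP client tool call."""
    raise NotImplementedError

# Phase 1: create a pending judgment receipt and receive a structured
# evaluation prompt for the host model to answer.
prompt = call_tool("judge_receipt", {
    "receipt_id": "rcpt_abc123def456",                     # hypothetical ID
    "rubric": [{"criterion": "accuracy", "weight": 1.0}],  # hypothetical rubric shape
    "output_summary_for_review": "Summary of the agent's output...",
})

# ... the host model evaluates the prompt ...

# Phase 2: finalize the pending judgment receipt. Per the cards, it is
# re-signed with Ed25519 and linked to the original via parent_receipt_id.
verdict = call_tool("complete_judgment", {
    "receipt_id": "rcpt_abc123def456",
    "verdict": "pass",                                     # hypothetical values
    "score": 4.5,
    "criteria_results": [{"criterion": "accuracy", "passed": True}],
    "confidence": 0.9,
})
```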
- Behavior: 4/5
No annotations are provided, so the description carries the full burden. It discloses local SQLite storage (the persistence layer), pagination mechanics (default 50 per page, has_next/has_prev flags), and return structure (total count, page info). It is missing an explicit read-only safety declaration, though one is implied by 'Query'.
Conciseness: 5/5
Three sentences with zero waste: the first defines the core functionality and filters, the second covers pagination, defaults, and returns, and the third provides use cases. Information is front-loaded and dense.
Completeness: 4/5
For a 10-parameter tool with no annotations or output schema, the description compensates well by detailing pagination behavior, filter combinations, and return flags. A minor gap: it could specify which fields appear in the returned receipt objects.
Parameters: 4/5
Schema coverage is 100%, establishing a baseline of 3. The description adds value by enumerating the valid sort fields ('timestamp, cost, or latency') not explicitly listed in the schema's sort parameter description, helping agents construct valid sort strings.
Purpose: 5/5
Excellent specificity: it states the exact verb ('Query and paginate'), resource ('receipts from local SQLite storage'), and scope (optional filtering by seven dimensions). It distinguishes itself from the sibling get_receipt by emphasizing multiple results and filtering.
Usage Guidelines: 4/5
Provides clear positive guidance ('Use to audit agent activity, generate reports, or find specific receipts') establishing the appropriate contexts. However, it lacks explicit negative guidance and a contrast with the sibling get_receipt for single-record lookups.
- Behavior: 4/5
With no annotations provided, the description carries the full disclosure burden. It successfully explains the SHA-256 hashing behavior (privacy protection) and the Ed25519 signing mechanism, plus the return value structure ('signed receipt with receipt_id'). A minor deduction for omitting idempotency, failure modes, and persistence guarantees.
Conciseness: 5/5
Four tightly constructed sentences: purpose with technical specs, recorded metrics, return value, and usage guidance. Each sentence delivers unique information without redundancy. Front-loaded with the core action and cryptographic details.
Completeness: 4/5
For an 18-parameter tool with complex nested objects (constraints) and no output schema, the description adequately covers the return structure and primary workflow. A minor gap remains around the chaining mechanism (parent_receipt_id, chain_id) for multi-step receipts, though the single-step preference hint partially compensates.
Parameters: 3/5
Schema description coverage is 100%, establishing a baseline of 3. While the description mentions 'automatic SHA-256 hashing,' this largely repeats the schema's property descriptions ('automatically hashed with SHA-256'). It adds semantic grouping ('Records model usage, costs, latency') but no parameter-specific syntax beyond the schema.
Purpose: 5/5
The description opens with a specific verb ('Create') and resource ('completed Ed25519-signed receipt'), clearly marking this as a cryptographic audit tool for AI actions. It explicitly differentiates itself from siblings by naming 'create_receipt + complete_receipt' as the alternative multi-step approach.
Usage Guidelines: 5/5
Explicitly states 'Use this as the primary tool for recording agent actions' and provides clear when-not-to-use guidance ('prefer over create_receipt + complete_receipt for single-step actions'). This directly maps the tool to its optimal workflow context.
- Behavior: 4/5
With no annotations provided, the description carries the full behavioral disclosure burden. It successfully explains the internal mechanism (extracts a 12-field signable payload, canonicalizes it, and verifies against the stored public key) and the return value structure ('Returns verified: true if the signature is valid'). However, it omits error-handling behavior (what happens if the receipt_id doesn't exist) and doesn't explicitly state whether the operation is read-only or has side effects.
Conciseness: 5/5
Four sentences with zero waste. Front-loaded with the core action (cryptographic verification), followed by technical implementation details (12-field payload, canonicalization), the return value specification, and usage guidance. Every sentence earns its place by conveying distinct information about mechanism, output, or context.
Completeness: 4/5
Given a single-parameter tool with no output schema and no annotations, the description is remarkably complete. It explains the cryptographic process, return value structure, and usage contexts. However, without an output schema, it could have elaborated on error cases (invalid signature vs. missing receipt) and the full return object structure beyond the 'verified' field.
Parameters: 3/5
Schema description coverage is 100% (receipt_id is fully documented in the schema as 'The receipt ID to verify — must exist in local storage'). The description mentions 'stored receipt,' which aligns with the parameter but doesn't add semantic meaning, syntax guidance, or examples beyond what the schema already provides. The baseline of 3 is appropriate given that the schema does the heavy lifting.
Purpose: 5/5
The description explicitly states 'Cryptographically verify an Ed25519 signature on a stored receipt,' providing a specific verb (verify), resource (receipt), mechanism (Ed25519), and scope (tamper detection). It clearly distinguishes itself from siblings like create_receipt, get_receipt, and judge_receipt by emphasizing cryptographic integrity checking versus creation, retrieval, or content evaluation.
Usage Guidelines: 5/5
Explicitly states when to use the tool: 'Use to audit receipts before using them as evidence or before completing payments based on agent work.' This provides clear context for invocation (audit scenarios, payment completion) and implicitly distinguishes it from other receipt operations by focusing on tamper-evidence verification. A sketch of the underlying verification mechanism follows this card.
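For context on the mechanism this card describes, here is a generic canonicalize-then-verify sketch using the cryptography package. The server performs this internally via verify_receipt; the payload fields and canonicalization rules below are illustrative assumptions, not the server's actual 12-field signable payload.

```python
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def verify_signature(payload: dict, signature_hex: str, public_key_hex: str) -> bool:
    """Deterministically canonicalize a payload, then check its Ed25519
    signature against a public key exported as a 64-character hex string."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    public_key = Ed25519PublicKey.from_public_bytes(bytes.fromhex(public_key_hex))
    try:
        public_key.verify(bytes.fromhex(signature_hex), canonical)
        return True
    except InvalidSignature:
        return False
```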
- Behavior: 4/5
No annotations are provided, so the description carries the full disclosure burden. It successfully adds critical security context ('The private key never leaves your machine') and output format details ('64-character hex string') that would otherwise be unknown. It does not mention idempotency or performance characteristics.
Conciseness: 5/5
Four sentences with zero waste: (1) purpose and resource, (2) return format, (3) usage guidance, (4) security guarantee. Information is front-loaded and every sentence earns its place.
Completeness: 5/5
For a simple parameter-less getter, the description is comprehensive. It compensates for the missing output schema by specifying the return format (a hex string) and explains the cryptographic context (Ed25519, signing receipts) and security model sufficiently for correct invocation.
Parameters: 4/5
The input schema contains zero parameters. Under the scoring rules, zero parameters establishes a baseline of 4. The description correctly requires no additional parameter clarification.
Purpose: 5/5
The description uses a specific verb ('Export') and resource ('Ed25519 public key') and clearly identifies its role in the receipt-signing ecosystem. It distinguishes itself from sibling tools like create_receipt (which would use the private key) by specifying that this retrieves the public component only.
Usage Guidelines: 4/5
Provides clear guidance on when to use the tool ('Share this key with clients or third parties so they can independently verify receipt signatures'), establishing its purpose in external verification workflows. It lacks explicit 'when not to use' guidance and named alternatives (e.g., contrasting with verify_receipt for internal verification).
- Behavior: 5/5
With no annotations provided, the description carries the full behavioral burden. It excellently discloses the return structure ('full receipt object including all 27 fields,' with a categorical breakdown), error behavior ('Returns an error message if the receipt ID does not exist'), and storage context ('local SQLite storage').
Conciseness: 5/5
Four sentences, each earning its place: purpose declaration, return value details, error handling, and usage guidelines. Information is front-loaded with the core action, and no words are wasted.
Completeness: 5/5
Despite no output schema, the description compensates by enumerating the 27 field categories returned. Error cases are covered, the single parameter is well documented in the schema, and sibling relationships are implied. Fully complete for a single-parameter retrieval tool.
Parameters: 3/5
Schema description coverage is 100%, providing the receipt_id format ('rcpt_' + 12 alphanumeric characters). The description references the ID parameter ('by its ID') but does not add semantic meaning beyond the schema's detailed format specification. The baseline of 3 is appropriate given the schema's completeness.
Purpose: 5/5
The description opens with a specific verb (retrieve), resource (receipt), and scope (a single receipt by ID from local SQLite storage). It distinguishes itself from siblings like list_receipts by emphasizing 'single receipt by its ID' and relates to verify_receipt by mentioning retrieval 'before verification'.
Usage Guidelines: 4/5
Provides explicit positive guidance: 'Use to inspect a specific receipt or retrieve it before verification.' This establishes the primary use cases and hints at the relationship with verify_receipt. However, it lacks explicit negative guidance (when not to use it) and an explicit mention of list_receipts as the alternative for batch retrieval.
- Behavior: 4/5
With no annotations provided, the description carries the full disclosure burden. It successfully communicates the cryptographic re-signing behavior ('re-signs with Ed25519'), the state-transition mechanics (pending to completed/failed/timeout), and the return value ('updated signed receipt'). A minor gap: it does not specify error behavior when attempting to complete non-existent or already-completed receipts.
Conciseness: 5/5
Five sentences with zero redundancy: purpose, mechanism, usage context, constraint, and return value. Each sentence earns its place. The description is front-loaded with the core action and maintains high information density throughout.
Completeness: 4/5
For a complex 13-parameter stateful tool with cryptographic operations, the description adequately covers the workflow (two-phase tracking), state-machine constraints, and return value despite the absence of annotations and an output schema. It could be improved with error-condition specifics, but is sufficient for correct invocation.
Parameters: 4/5
Schema coverage is 100%, establishing a baseline of 3. The description adds value by semantically grouping the 13 parameters into 'execution results, costs, and output data,' providing conceptual structure that helps an agent understand what information to aggregate when calling the tool.
Purpose: 5/5
The description opens with a specific verb-resource combination ('Finalize a pending receipt') and scope ('recording execution results, costs, and output data'). It clearly distinguishes itself from the sibling tool create_receipt by positioning itself as the second phase of a two-phase tracking workflow.
Usage Guidelines: 5/5
Explicitly states when to use the tool ('Use after create_receipt... two-phase tracking') and critical constraints ('Cannot complete an already-completed receipt'). The temporal sequencing relative to create_receipt provides clear decision criteria for tool selection.
- Behavior: 4/5
With no annotations provided, the description carries the full burden and discloses the key behaviors: the signing algorithm (Ed25519), storage mechanism (SQLite, locally), return structure (a signed receipt object with receipt_id), and mutability (the receipt can be completed later). It lacks only an explicit mention of idempotency and error conditions.
Conciseness: 5/5
Four sentences with zero waste. Front-loaded with the essential technical characteristics (Ed25519, SHA-256), followed by usage conditions, an explicit alternative, and return/storage details. Every sentence earns its place.
Completeness: 4/5
Despite 20 parameters and no output schema, the description is complete. It compensates for the missing output schema by describing the return value, covers the complex parameter set by referencing the hash requirements, and explains the receipt lifecycle (creation → completion) by referencing the sibling complete_receipt.
Parameters: 4/5
Schema coverage is 100%, establishing a baseline of 3. The description adds crucial semantic context beyond the schema by emphasizing that hashes must be 'pre-computed' externally and specifying the cryptographic standard (SHA-256), helping the agent understand the expected parameter format and preparation requirements.
Purpose: 5/5
The description explicitly states the core action ('Create an Ed25519-signed receipt') and identifies the specific cryptographic resources involved (Ed25519 signatures, SHA-256 hashes). It clearly distinguishes itself from the sibling tool track_action by specifying that this version requires 'pre-computed' hashes.
Usage Guidelines: 5/5
Provides explicit when-to-use guidance ('when you have already hashed the input/output data externally or need full control') and explicitly names the alternative sibling tool track_action for automatic hashing. It also clarifies lifecycle integration with complete_receipt. A sketch of this two-phase flow follows this card.
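A sketch of the manual-hash, two-phase flow this card and the complete_receipt card describe: hash the data externally with SHA-256, open a pending receipt, then finalize it. call_tool is a placeholder and every field name except receipt_id is hypothetical; only the tool names, the SHA-256 requirement, and the pending → completed/failed/timeout lifecycle come from the cards.

```python
import hashlib
from typing import Any

def call_tool(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Placeholder for an MCP client tool call."""
    raise NotImplementedError

# Pre-compute SHA-256 hashes externally, as create_receipt expects.
input_hash = hashlib.sha256(b"prompt sent to the model").hexdigest()
output_hash = hashlib.sha256(b"model output").hexdigest()

# Phase 1: create a pending, Ed25519-signed receipt (field names hypothetical).
receipt = call_tool("create_receipt", {
    "action": "summarize_document",
    "input_hash": input_hash,
    "output_hash": output_hash,
})

# Phase 2: finalize with execution results, costs, and output data. Per the
# cards, an already-completed receipt cannot be completed again.
call_tool("complete_receipt", {
    "receipt_id": receipt["receipt_id"],
    "status": "completed",  # hypothetical field; lifecycle per the description
    "cost_usd": 0.0042,
})
```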
GitHub Badge
Glama performs regular codebase and documentation scans to:
- Confirm that the MCP server is working as expected.
- Confirm that there are no obvious security issues.
- Evaluate tool definition quality.
Our badge communicates server capabilities, safety, and installation instructions.
Two badge variants are available: a Card Badge and a Score Badge. Copy the embed snippet for either from the server's Glama page into your README.md.
How to claim the server?
If you are the author of the server, you simply need to authenticate using GitHub.
However, if the MCP server belongs to an organization, you need to first add glama.json to the root of your repository.
```json
{
  "$schema": "https://glama.ai/mcp/schemas/server.json",
  "maintainers": [
    "your-github-username"
  ]
}
```

Then, authenticate using GitHub.
Browse examples.
How to make a release?
A "release" on Glama is not the same as a GitHub release. To create a Glama release:
- Claim the server if you haven't already.
- Go to the Dockerfile admin page, configure the build spec, and click Deploy.
- Once the build test succeeds, click Make Release, enter a version, and publish.
This process allows Glama to run security checks on your server and enables users to deploy it.
How to add a LICENSE?
Please follow the instructions in the GitHub documentation.
Once GitHub recognizes the license, the system will automatically detect it within a few hours.
If the license does not appear on the server after some time, you can manually trigger a new scan using the MCP server admin interface.
How to sync the server with GitHub?
Servers are automatically synced at least once per day, but you can also sync manually at any time to instantly update the server profile.
To manually sync the server, click the "Sync Server" button in the MCP server admin interface.
How is the quality score calculated?
The overall quality score combines two components: Tool Definition Quality (70%) and Server Coherence (30%).
Tool Definition Quality measures how well each tool describes itself to AI agents. Every tool receives a Tool Definition Quality Score (TDQS) from 1–5, a weighted average of six dimensions: Purpose Clarity (25%), Usage Guidelines (20%), Behavioral Transparency (20%), Parameter Semantics (15%), Conciseness & Structure (10%), and Contextual Completeness (10%). The server-level definition quality score is 60% mean TDQS + 40% minimum TDQS, so a single poorly described tool pulls the score down.
Server Coherence evaluates how well the tools work together as a set, scoring four dimensions equally: Disambiguation (can agents tell tools apart?), Naming Consistency, Tool Count Appropriateness, and Completeness (are there gaps in the tool surface?).
Tiers are derived from the overall score: A (≥3.5), B (≥3.0), C (≥2.0), D (≥1.0), F (<1.0). B and above is considered passing.
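A minimal sketch of the published formula, using the dimension weights, the 60/40 blend, the 70/30 split, and the tier cutoffs stated above; the function and variable names are mine, not Glama's.

```python
from statistics import mean

# Dimension weights from the text above (per-tool TDQS).
DIMENSION_WEIGHTS = {
    "purpose": 0.25, "usage_guidelines": 0.20, "behavior": 0.20,
    "parameters": 0.15, "conciseness": 0.10, "completeness": 0.10,
}

def tool_tdqs(scores: dict[str, float]) -> float:
    """Weighted 1-5 Tool Definition Quality Score for one tool."""
    return sum(DIMENSION_WEIGHTS[dim] * score for dim, score in scores.items())

def overall_score(per_tool: list[dict[str, float]], coherence: float) -> float:
    """70% definition quality + 30% coherence, where definition quality
    blends 60% mean TDQS with 40% minimum TDQS."""
    tdqs = [tool_tdqs(scores) for scores in per_tool]
    definition_quality = 0.6 * mean(tdqs) + 0.4 * min(tdqs)
    return 0.7 * definition_quality + 0.3 * coherence

def tier(score: float) -> str:
    """A (>=3.5), B (>=3.0), C (>=2.0), D (>=1.0), otherwise F."""
    for cutoff, label in [(3.5, "A"), (3.0, "B"), (2.0, "C"), (1.0, "D")]:
        if score >= cutoff:
            return label
    return "F"
```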
MCP directory API
We provide all the information about MCP servers via our MCP API.
```bash
curl -X GET 'https://glama.ai/api/mcp/v1/servers/webaesbyamin/agent-receipts'
```
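The same lookup from Python; this assumes only that the endpoint returns JSON, since the response schema is not documented here.

```python
import json
import urllib.request

# Endpoint from the curl example above; treat the parsed object as opaque JSON.
url = "https://glama.ai/api/mcp/v1/servers/webaesbyamin/agent-receipts"
with urllib.request.urlopen(url) as resp:
    server = json.load(resp)

print(json.dumps(server, indent=2))
```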
If you have feedback or need assistance with the MCP directory API, please join our Discord server.