Glama
shahabazdev

Inxmail MCP

Server Quality Checklist

83%
Profile completion: A complete profile improves this server's visibility in search results.
  • Disambiguation: 4/5

    Tools are clearly distinguished by email pipeline (relay vs. event-triggered) and operation type, with excellent cross-referencing in descriptions (e.g., 'use X instead of Y'). Minor overlap exists between composite checkers (check_email_blocked, check_email_delivery) and granular getters, but descriptions explicitly guide preferred usage.

    Naming Consistency: 5/5

    Strict adherence to verb_noun snake_case throughout (add_to_blacklist, list_relay_bounces, mark_error_log_read). Vocabulary is consistent across parallel tool sets (relay vs. transactional variants use identical verb patterns), making the 29-tool surface predictable.

    Tool Count: 3/5

    At 29 tools, the surface exceeds the ideal range and feels dense, though justified by the dual-pipeline domain (event-triggered vs. SMTP relay). The consistent naming patterns help manage the complexity, but the count approaches the threshold where agent selection becomes challenging.

    Completeness: 4/5

    Excellent lifecycle coverage for blacklists/blocklists and comprehensive monitoring (bounces, complaints, reactions, deliveries, error logs) across both email pipelines. Minor gaps include lack of single-item getters for bounces/complaints and no event type management (create/update), which appear to be admin-only functions.

  • Average 4.5/5 across 29 of 29 tools scored.

    See the tool scores section below for per-tool breakdowns.

  • This repository includes a README.md file.

  • This repository includes a LICENSE file.

  • Latest release: v1.0.4

  • No tool usage detected in the last 30 days. Usage tracking helps demonstrate server value.

    Tip: use the "Try in Browser" feature on the server page to seed initial usage.

  • This repository includes a glama.json configuration file.

  • This server provides 29 tools.
  • No known security issues or vulnerabilities reported.


  • This server has been verified by its author.

  • Add related servers to improve discoverability.
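The strict verb_noun snake_case convention praised above lends itself to a mechanical check. A minimal sketch using only the tool names quoted in this review (the regex is an assumption about the convention, not something the server publishes; the remaining tools are assumed to follow suit):

```python
import re

# Tool names quoted in the review text; the rest of the 29-tool surface
# is assumed to follow the same pattern.
tool_names = [
    "add_to_blacklist",
    "list_relay_bounces",
    "mark_error_log_read",
    "check_email_blocked",
    "check_email_delivery",
    "trigger_event",
]

# verb_noun snake_case: lowercase segments joined by underscores, verb first.
pattern = re.compile(r"^[a-z]+(?:_[a-z]+)+$")

violations = [name for name in tool_names if not pattern.fullmatch(name)]
print(violations)  # -> []
```

A check like this is cheap to run in CI whenever a new tool is added, which is one way a 29-tool surface stays predictable.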

Tool Scores

  • Behavior: 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, so description carries full burden. Discloses read-only nature and return values ('state details including any error information'). Does not mention rate limits or specific error codes, but covers essential behavioral traits for a lookup operation.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences with zero waste: purpose (sentence 1), workflow context (2), return values (3), safety profile (4). Front-loaded with action verb and appropriately sized for tool complexity.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness: 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Complete for a single-parameter lookup tool without output schema. Compensates for missing output schema by describing return content ('state details including any error information'). Could further distinguish from check_email_delivery sibling, but adequately covers core functionality.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters: 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema coverage is 100% with transactionId fully described. The description reinforces the workflow relationship ('returned from trigger_event') but does not add technical details (format, length) beyond the schema definition. A baseline of 3 is appropriate given the high schema coverage.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose: 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Description provides specific verb ('Check') + resource ('processing state of a previously triggered event') and distinguishes from siblings by specifying this checks email states (accepted, sent, bounced, failed) and is used after trigger_event.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines: 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states when to use ('Use after trigger_event') and implies workflow sequence. Lacks explicit 'when not to use' or named alternatives like check_email_delivery, though the trigger_event reference provides clear contextual guidance.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
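The trigger-then-check sequence this tool's description encodes can be sketched end to end. Everything below is hypothetical: `call_tool`, the tool name `check_event_state`, and both response shapes are illustrative stand-ins, since the review quotes only the descriptions, not the schemas or the actual tool name.

```python
# Hypothetical stand-in for an MCP tool invocation; both tools are faked
# in memory purely to illustrate the workflow ordering.
def call_tool(name: str, arguments: dict) -> dict:
    if name == "trigger_event":
        # The description says the transactionId is returned from trigger_event.
        return {"transactionId": "tx-0001"}
    if name == "check_event_state":  # hypothetical name for the state checker
        return {"state": "sent", "errors": []}
    raise ValueError(f"unknown tool: {name}")

# 1. Fire the event; the response carries the transactionId.
fired = call_tool("trigger_event", {"eventTypeId": "welcome-mail"})

# 2. Poll with that id afterwards, per the 'Use after trigger_event' guidance.
state = call_tool("check_event_state", {"transactionId": fired["transactionId"]})
print(state["state"])  # -> sent
```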

  • Behavior: 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, so description carries full burden. It explicitly states 'Read-only' (critical safety info) and discloses return value behavior: 'Returns the blacklist entry with block date if found, or a 'not blacklisted' response if absent.' Could improve by mentioning auth requirements or rate limits, but covers essential behavioral traits well.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four efficient sentences: purpose, return values, usage guidelines, and read-only declaration. Every sentence earns its place. Information is front-loaded with the core action, followed by returns, then alternatives.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness: 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    No output schema exists, but description compensates by detailing return values ('blacklist entry with block date' vs 'not blacklisted' response). No annotations exist, but description states read-only status. Given the tool's simplicity (single parameter lookup), the description provides adequate context.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters: 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema has 100% description coverage ('Email address to check'), establishing baseline 3. The description mentions 'specific email' but adds no syntax details, format constraints, or examples beyond what the schema already provides.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose: 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description opens with 'Check if a specific email is on the explicit blacklist' — a specific verb (Check) + resource (blacklist) combination. It distinguishes from siblings by specifying 'explicit blacklist' versus the blocklist checked by sibling get_blocklist_entry, and explicitly contrasts with list_blacklist and check_email_blocked in the usage guidance.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines: 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use guidance: 'Use this for a single-email lookup; use list_blacklist to browse all entries; use check_email_blocked to check both blocklist and blacklist at once.' This clearly maps tools to their specific use cases and alternatives.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior: 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full burden and successfully discloses key behavioral traits: return format ('HAL+JSON'), structure ('_links to all available resources'), and safety profile ('Read-only, no side effects'). Could improve by mentioning rate limits or authentication requirements.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences efficiently structured: (1) core action, (2) return format/content, (3) usage context, (4) safety properties. Every sentence earns its place with zero redundancy or filler.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness: 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Despite lacking an output schema, the description compensates by detailing the return structure (HAL+JSON with _links) and enumerating example resources. Adequate for a simple discovery endpoint, though rate limit or caching details would strengthen it further.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters: 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    The input schema contains zero parameters. Per the baseline rule for zero-parameter tools, this earns a default score of 4. The description correctly omits parameter discussion as none exist.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose: 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Get the Inxmail Commerce API entry point') and distinguishes this discovery/entry-point tool from its operational siblings (add_to_blacklist, check_email_blocked, etc.) by positioning it as the root resource for API exploration.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines: 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states when to use the tool: 'to discover available endpoints or verify API connectivity.' This provides clear context for an agent to invoke it before attempting specific resource operations, though it doesn't explicitly name alternative tools to avoid.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
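HAL+JSON documents like the entry point's are navigated through their `_links` object. A minimal sketch of what such a response might look like and how an agent could discover endpoints from it (the link relations below are illustrative guesses, not the server's actual resources):

```python
# Illustrative HAL+JSON entry-point document; the link relations shown
# here are assumptions for the sketch, not Inxmail's real resource names.
entry_point = {
    "_links": {
        "self": {"href": "/"},
        "blacklist": {"href": "/blacklist"},
        "bounces": {"href": "/bounces"},
    }
}

# Discover available endpoints by walking _links, as a HAL client would.
available = {rel: link["href"] for rel, link in entry_point["_links"].items()}
print(sorted(available))  # -> ['blacklist', 'bounces', 'self']
```

This is why a zero-parameter discovery tool pairs naturally with the operational tools: the agent can verify connectivity and enumerate resources before committing to a specific call.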

  • Behavior: 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, so description carries full burden. Discloses read-only nature, return format (HAL+JSON), and contained fields (complaint type, email, timestamps). Defines what complaints are (spam marking). Lacks rate limits, auth requirements, or error behavior details.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences with zero waste: purpose (sentence 1), definition (sentence 2), use case (sentence 3), alternative+returns+constraints (sentence 4). Information is front-loaded and every clause earns its place.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness: 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Comprehensive for a list tool with 10 parameters: explains domain concept (complaints), distinguishes from sibling, discloses pagination and return format despite lack of output schema. Minor gap: could note that all filters are optional (0 required params) or explain pagination mechanics beyond 'paginated'.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters: 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema has 100% description coverage (all 10 params documented), establishing baseline of 3. Description adds no parameter-specific syntax, validation rules, or examples beyond what schema provides, but does not need to given comprehensive schema coverage.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose: 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Excellent specificity: 'List feedback loop complaints for transactional emails' provides exact verb, resource, and scope. Explicitly distinguishes from sibling 'list_relay_complaints' by limiting scope to transactional emails vs mail relay.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines: 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicit when-to-use guidance: 'Use this to identify recipients who reported transactional emails as unwanted' provides clear intent. Explicitly names alternative: 'For mail relay complaints, use list_relay_complaints instead' prevents tool selection errors.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior: 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Since no annotations are provided, the description carries full behavioral disclosure burden. It successfully discloses 'Read-only' status (safety profile), 'paginated' behavior, and return format 'HAL+JSON'. Deducting one point because it omits potential rate limits or specific filtering constraints, but coverage is strong for a list endpoint.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Five tightly constructed sentences with zero waste. Information is properly front-loaded with purpose first, followed by usage guidelines, alternative reference, return format, and safety profile. Every sentence earns its place with no redundancy.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness: 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    No output schema exists, and the description compensates by stating the return format (HAL+JSON) and content type (delivery details). With only two optional pagination parameters and clear sibling differentiation, the description provides sufficient context for invocation, though it could enumerate what specific delivery fields are returned.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters: 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100% for both 'size' and 'page' parameters, establishing a baseline of 3. The description mentions 'paginated' generally but does not add semantic detail about the pagination mechanics, cursor behavior, or specific validation rules beyond what the schema already documents.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose: 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description opens with the specific action 'List delivery status records' and the resource 'final delivery outcome for sent emails'. It clearly distinguishes from sibling tool 'check_email_delivery' by contrasting 'high-level overview' versus 'detailed per-recipient diagnostics', satisfying sibling differentiation requirements.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines: 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states when to use ('for a high-level overview of delivery success/failure across all emails') and explicitly names the alternative tool for different use cases ('For detailed per-recipient diagnostics, use check_email_delivery instead'). Provides clear guidance on tool selection.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
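The size/page parameters shared by the list tools suggest a straightforward pagination loop. A sketch over a faked paginated source (the `_embedded` response shape and the empty-page termination condition are assumptions for illustration, not documented behavior):

```python
# Fake data source standing in for a paginated HAL+JSON list endpoint.
RECORDS = [{"id": i, "status": "delivered"} for i in range(1, 6)]

def list_deliveries(size: int = 2, page: int = 0) -> dict:
    # Simulates offset pagination driven by the 'size' and 'page' parameters.
    start = page * size
    items = RECORDS[start:start + size]
    return {"_embedded": {"deliveries": items}, "page": {"number": page}}

# Walk pages until one comes back empty.
collected = []
page = 0
while True:
    body = list_deliveries(size=2, page=page)
    items = body["_embedded"]["deliveries"]
    if not items:
        break
    collected.extend(items)
    page += 1

print(len(collected))  # -> 5
```

A real client would more likely follow the HAL `_links.next` relation than count pages itself; the loop above only shows why the two optional parameters are enough for a complete traversal.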

  • Behavior: 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full burden and successfully discloses key traits: returns 'paginated HAL+JSON' format and declares 'Read-only' status. It also clarifies what error types are included vs excluded, though it omits rate limits or specific permission requirements.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Every sentence serves a distinct purpose: definition, scope inclusion/exclusion, use case, related workflow tools, return format, and safety property. Well-structured and front-loaded without redundancy.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness: 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Despite lacking an output schema, the description compensates by disclosing the return format (paginated HAL+JSON) and read-only nature. Given the tool's limited complexity (2 optional pagination params), the description provides sufficient context for invocation, though explicit auth requirements would improve it to a 5.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters: 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100% (size and page are fully documented with defaults and limits). The description adds no explicit parameter guidance, which aligns with the baseline score of 3 when schema coverage is high (>80%).

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose: 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description opens with the specific action and resource ('List system error log entries from Inxmail') and explicitly distinguishes from sibling tools by clarifying the scope includes 'API failures, rendering issues, and processing problems — not bounce/delivery errors (use list_bounces for those)'.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines: 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use ('Use this to diagnose server-side issues'), when-not-to-use (bounce/delivery errors), and named alternatives ('use list_bounces', 'Use get_error_log for full details', 'mark_error_log_read to acknowledge it').

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior: 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, description carries full disclosure burden. It declares 'Read-only' (safety), 'Returns paginated HAL+JSON' (format), and 'paginated' (behavioral trait). Minor gap: doesn't mention rate limits or error conditions, but covers essential behavioral traits well.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Five sentences, zero waste. Front-loaded with purpose ('List all...'), followed by domain context, usage workflow, return format, and safety declaration. Every sentence earns its place; no redundancy with structured fields.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness: 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    No output schema exists, but description compensates by stating 'Returns paginated HAL+JSON with event type details.' Covers key sibling relationship (trigger_event) and read-only nature. Adequate for a 3-parameter list tool with 100% schema coverage.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters: 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100% (all 3 params fully described). Description adds no explicit parameter guidance, but with complete schema coverage, baseline 3 is appropriate. The schema handles the heavy lifting for 'name', 'size', and 'page' semantics.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose: 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Specific verb ('List') + resource ('transactional event types') + scope ('in Inxmail'). Explicitly distinguishes from sibling 'trigger_event' by positioning this as the discovery step for IDs needed by that tool, and implies distinction from 'get_event_type' (singular) via 'List all'.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines: 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicit workflow guidance: 'Use this to discover available event type IDs before calling trigger_event.' This names the specific sibling tool and establishes the correct invocation sequence, functioning as both when-to-use and alternative identification.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior: 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, so description carries full burden. Discloses 'Read-only' status, pagination behavior ('paginated'), and output format ('HAL+JSON with reaction type, timestamp, and recipient details'). Could mention error conditions or rate limits for a 5.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Five sentences with zero waste: purpose, use case, filters, sibling alternative, and output format. Every sentence earns its place with optimal information density.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness: 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Compensates well for missing output schema by describing return format and pagination. Covers read-only nature without annotations. Adequate for 12-parameter tool, though could note that all parameters are optional (0 required).

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters: 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema has 100% description coverage, establishing baseline of 3. Description groups parameters into categories ('reaction type, recipient, event, sending, or date range') which aids comprehension but doesn't add critical semantic detail beyond schema.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose: 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Excellent specificity with 'List recipient reactions (opens and clicks) for transactional emails': a clear verb, resource, and scope. Distinguishes from siblings by specifying 'transactional' vs. relay.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines: 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicit usage context ('track engagement — who opened or clicked links in event-triggered emails') and clear sibling differentiation ('For mail relay reactions, use list_relay_reactions instead').

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior: 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, description carries full burden: discloses read-only safety status, explains domain semantics distinguishing hardbounces (permanent) from softbounces (temporary), and specifies return format (paginated HAL+JSON). Minor gap: no rate limit or auth details.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Six sentences covering purpose, domain concepts, usage boundaries, filtering scope, return format, and safety. No redundancy with schema details; every sentence adds distinct value. Well front-loaded with the core action.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness: 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given 10 parameters with full schema coverage and no output schema, description adequately covers pagination behavior, bounce type semantics, and tool boundaries. Minor gap: doesn't describe specific response fields, though 'HAL+JSON' provides structural context.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters: 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema has 100% description coverage, establishing baseline 3. Description categorizes filterable fields ('Filter by bounce type, recipient... date range') confirming capabilities without adding semantic depth (e.g., doesn't explain what correlation IDs represent or relay sending ID references).

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose: 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Description opens with specific verb and resource ('List bounced mail relay emails'), clearly distinguishing from sibling 'list_bounces' by specifying this is for 'relay-sent emails only' versus event-triggered transactional emails.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines: 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states scope constraint ('Use this for relay-sent emails only') and names the correct alternative tool for other use cases ('for event-triggered transactional bounces, use list_bounces instead'), providing clear when-to-use guidance.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior: 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full burden and successfully discloses: read-only nature ('Read-only'), return format ('Returns paginated HAL+JSON'), and scope limitations. It could improve by mentioning default sorting or time range limits if any exist.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Six sentences, zero waste. Front-loaded with purpose (List recipient reactions...), followed by usage context, sibling differentiation, parameter summary, and technical metadata. Every sentence earns its place.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness: 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given 11 parameters (all optional), no output schema, and no annotations, the description adequately covers the tool's purpose, filtering capabilities, output format, and safety profile. Minor gap: doesn't mention that all parameters are optional or explain correlation ID purpose.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, establishing a baseline of 3. The description summarizes available filters ('Filter by reaction type, recipient...') but adds no semantic depth beyond the schema's own descriptions (e.g., doesn't explain what correlation IDs represent or how to obtain a relaySendingId).

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description explicitly states the action ('List recipient reactions'), the specific resource ('mail relay emails sent via SMTP relay'), and the reaction types ('opens and clicks'). It clearly distinguishes from the sibling tool 'list_reactions' by specifying this is for 'relay-sent emails' vs 'event-triggered transactional emails'.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use guidance ('Use this to track engagement on relay-sent emails') and explicitly names the alternative tool for different contexts ('For reactions on event-triggered transactional emails, use list_reactions instead').

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

    Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, but description compensates well by declaring 'Read-only' status and return format 'paginated HAL+JSON'. Mentions pagination behavior. Could improve by mentioning rate limits or default behavior when no filters applied.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
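Because these listing tools return paginated HAL+JSON, an agent typically has to follow the `next` link relation to exhaust results. A minimal sketch, assuming the conventional HAL layout (items under `_embedded`, paging under `_links.next.href`) and a hypothetical `client.get` that fetches one page:

```python
def iter_hal_items(client, first_page: dict, embedded_key: str):
    """Yield items from a paginated HAL+JSON listing by following _links.next.
    'client' is any object with a get(href) method returning the next page dict."""
    page = first_page
    while page is not None:
        yield from page.get("_embedded", {}).get(embedded_key, [])
        next_href = page.get("_links", {}).get("next", {}).get("href")
        page = client.get(next_href) if next_href else None
```

The same walker works for any of the paginated listings here (bounces, reactions, complaints, sendings), given the right `_embedded` collection key.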

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four dense sentences with zero waste: scope definition, use case, alternatives guidance, and technical contract. Well front-loaded with the critical 'event-triggered only' constraint.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Strong compensation for missing output schema by describing return type (HAL+JSON) and pagination. All 8 parameters documented in schema. Sibling differentiation is clear. Minor gap: doesn't describe what 'sending details' includes in the response.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema coverage is 100%, establishing baseline 3. Description maps parameters to concepts ('Filter by recipient email, event type...') but adds minimal semantic detail beyond schema descriptions. Does not clarify relationship between 'event' and 'eventId' parameters.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Excellent specificity with 'List transactional email sendings' and clear scope delimitation '(event-triggered emails only, not mail relay)'. Explicitly distinguishes from sibling tool list_relay_sendings by domain.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use ('audit which transactional emails were sent') and names two specific alternatives: 'For mail relay sendings, use list_relay_sendings instead' and 'For a comprehensive delivery report... use check_email_delivery'.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
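The cross-references above amount to a small routing table over three tools. A hedged sketch (the tool names are the server's; the selection logic merely restates the descriptions' guidance):

```python
def sendings_tool(pipeline: str, comprehensive: bool = False) -> str:
    """Pick a listing tool per the descriptions' cross-references:
    a full delivery report, relay sendings, or event-triggered sendings."""
    if comprehensive:
        return "check_email_delivery"
    return "list_relay_sendings" if pipeline == "relay" else "list_sendings"
```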

    Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, so description carries full burden. It successfully discloses: (1) write operation nature, (2) specific field being updated (read status), and (3) return value format (empty response). Minor gap: doesn't mention idempotency or error conditions.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences with zero waste: purpose (sentence 1), behavior (sentence 2), usage guideline (sentence 3), return value (sentence 4). Well front-loaded and appropriately sized for tool complexity.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Appropriately complete for a single-parameter state-change tool. Compensates for missing output schema by explicitly stating 'Returns empty response on success.' Could improve by mentioning error handling or idempotency.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema has 100% description coverage for the single parameter (errorLogId). Description doesn't add parameter-specific semantics, but with complete schema documentation, baseline 3 is appropriate.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Clear specific verb ('Mark') + resource ('error log entry') + state ('read/acknowledged'). Distinctly differentiates from sibling tools like get_error_log (retrieval) and list_error_logs (listing).

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states when to use: 'after reviewing an error via get_error_log to track which errors have been addressed.' Names the sibling tool (get_error_log) as the prerequisite step, establishing clear workflow sequencing.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

    Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full disclosure burden and succeeds: it explicitly labels this a 'write operation,' states the permanent effect, notes reversibility via the sibling tool, and documents the return value ('Returns the created blacklist entry'). Minor gap: does not address idempotency or behavior when adding an already-blacklisted address.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences with zero waste: sentence 1 defines the action and effect, sentence 2 clarifies write-semantics and scope, sentence 3 provides usage context, and sentence 4 documents the return. Information is front-loaded and logically sequenced.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a single-parameter mutation tool without output schema, the description is complete: it explains the side effects, the return value, the permanent nature of the operation, and cross-references the sibling removal tool. No gaps remain for correct invocation.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Input schema has 100% description coverage ('Email address to blacklist'), so the baseline is 3. The description does not add syntax details, validation rules, or format requirements beyond what the schema already provides, but none are needed given the simple single string parameter.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description opens with the specific action ('Explicitly blacklist') and resource ('email address'), immediately clarifying scope with 'permanently preventing all future emails.' It distinguishes from siblings by referencing the inverse operation 'remove_from_blacklist' and differentiates from the separate 'blocklist' tool family by consistent terminology.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states when to use the tool ('Use this for opt-out requests or compliance blocks') and contrasts it with the removal workflow ('until removed via remove_from_blacklist'). It also clarifies the impact scope ('transactional or relay emails'), helping agents decide between this and delivery-checking tools.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
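The opt-out workflow these descriptions imply can be sketched with a hypothetical `client.call(name, args)` wrapper. The tool names and the `email` parameter come from the review; the `blocked` field in the check result is an assumption about the response shape:

```python
def handle_opt_out(client, email: str) -> dict:
    """Permanently blacklist an address for an opt-out request,
    skipping the write if it is already blocked or blacklisted."""
    status = client.call("check_email_blocked", {"email": email})
    if status.get("blocked"):          # assumed response field
        return status                  # nothing to do; already blocked
    # Write operation; reversible later via remove_from_blacklist
    return client.call("add_to_blacklist", {"email": email})
```

Checking first keeps the write idempotent from the agent's side even though the description does not state how duplicate additions behave.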

    Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full burden. It successfully discloses the read-only nature, distinguishes between auto-blocked (hard bounces) and manually blocked entries, and describes the return structure ('summary with blocked status... full entry details'). Minor deduction for not specifying behavior when an email is not blocked.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences with zero waste: (1) purpose and scope, (2) usage context, (3) return value, (4) preference/safety. Information is front-loaded and every clause earns its place. No redundant or filler text.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a single-parameter diagnostic tool without output schema, the description is complete. It compensates for missing output schema by detailing return contents, covers safety (read-only) absent annotations, and differentiates from relevant siblings. No significant gaps given the tool's simplicity.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema coverage is 100% with the 'email' parameter fully described as 'Email address to check.' The description implies this is the target for block checking but does not add validation rules, format constraints, or examples beyond the schema. Baseline 3 is appropriate given complete schema coverage.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description opens with a specific verb ('queries') and clearly identifies the dual resources accessed ('blocklist' and 'blacklist'). It explicitly distinguishes this consolidated check from sibling tools 'get_blocklist_entry' and 'get_blacklist_entry' by noting it queries both 'in a single call.'

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use guidance ('first step when investigating why an email isn't being delivered') and clear preference over alternatives ('Prefer this over separate get_blocklist_entry/get_blacklist_entry calls'). This eliminates ambiguity about tool selection order.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

    Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full burden, disclosing the read-only nature, return format (block date/email if found, 'not blocked' if not), and specific resource accessed (hard-bounce blocklist). Minor gap on rate limits or caching behavior.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Three well-structured sentences: purpose, return values, and usage guidelines. Zero redundancy—every sentence provides distinct value not available in structured fields.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a simple lookup tool with 1 parameter and no output schema, the description is complete. It compensates for missing output schema by detailing the response format, and adequately covers scope and siblings for proper agent selection.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema coverage is 100% with the email parameter fully documented as 'Email address to check'. The description references the email but does not add semantic details (format, case sensitivity) beyond the schema. Baseline 3 appropriate for high schema coverage.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description explicitly states the action ('Check'), resource ('hard-bounce blocklist'), and scope ('specific email'), clearly distinguishing it from sibling tools like get_blacklist_entry and list_blocklist.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use ('for a single-email lookup') and names two specific alternatives with their use cases ('use list_blocklist to browse all entries; use check_email_blocked to check both blocklist and blacklist at once').

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

    Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, so the description carries the full burden. It explicitly states 'Read-only' (a critical safety trait) and describes the return payload ('error message, timestamp, and context'), compensating for the missing output schema. Minor gap: doesn't mention idempotency or error conditions.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Five sentences, each earning its place: purpose, return values (compensating for no output schema), prerequisite workflow, post-action workflow, and behavioral safety. Front-loaded with the specific action.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Appropriately complete for a simple single-parameter tool. Compensates for missing annotations by stating 'Read-only' and for missing output schema by describing return fields. Sibling relationships clarify the complete error log workflow.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema coverage is 100% ('Error log ID'), establishing baseline 3. Description references 'by its ID' reinforcing the parameter usage but doesn't add format constraints, validation rules, or examples beyond what the schema already documents.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    States specific verb ('Get'), resource ('error log entry'), and scope ('full details by its ID'). Explicitly distinguishes from sibling 'list_error_logs' (finding entries vs getting details) and 'mark_error_log_read' (retrieving vs acknowledging).

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit workflow guidance: 'Use this after finding an entry via list_error_logs' establishes prerequisites, and 'Use mark_error_log_read to acknowledge it after review' defines the post-action step. Clear when-to-use relative to siblings.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
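The list → inspect → acknowledge sequencing spelled out here can be sketched end to end. The `errorLogId` parameter is from the schema as quoted in this review; the `client.call` wrapper and the `id` field on listed entries are assumptions:

```python
def triage_errors(client) -> list:
    """Walk the error-log workflow: list entries, fetch each entry's
    full details, then mark it read/acknowledged after review."""
    acknowledged = []
    for entry in client.call("list_error_logs", {}):
        client.call("get_error_log", {"errorLogId": entry["id"]})
        client.call("mark_error_log_read", {"errorLogId": entry["id"]})
        acknowledged.append(entry["id"])
    return acknowledged
```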

    Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries full behavioral burden. It discloses the read-only nature ('Read-only') and return payload structure ('Returns the event type name, description, and configuration'). Could improve by mentioning error cases if ID not found, but adequate for tool selection.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences, each earning its place: (1) purpose, (2) return value, (3) usage guidance, (4) safety property. Well-structured with high information density and no redundancy.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a simple single-parameter retrieval tool without output schema, the description adequately covers the return structure and operational constraints. No significant gaps given the tool's complexity level.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema coverage is 100% (eventTypeId is documented in schema), establishing baseline 3. The description implies the ID sourcing strategy ('use list_event_types to discover') but doesn't add parameter format constraints or examples beyond the schema.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description opens with a specific verb ('Get') + resource ('detailed configuration of a single event type') + scope constraint ('by its ID'). It clearly distinguishes from sibling list_event_types by contrasting single-item retrieval against list discovery.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states the precondition ('when you already have an event type ID') and names the exact alternative tool ('use list_event_types to discover available IDs'), providing clear when-to-use vs when-not-to-use guidance.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

    Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, so description carries full burden. Discloses safety profile ('Read-only') and return value structure ('recipient, event type, timestamps, and delivery status'). Does not mention error handling or rate limits, but covers essential behavioral traits for a lookup operation.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences structured as: purpose → returns → usage condition → alternative/safety. Zero redundancy; each sentence provides distinct value (what it does, what it returns, when to use it, what to use instead). Efficiently front-loaded with the core action.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Tool has low complexity (1 required parameter, no nested objects) and no output schema. Description compensates by listing specific return fields (recipient, event type, timestamps, delivery status). No gaps remain for this lookup tool's use case.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Input schema has 100% description coverage ('Sending ID'), establishing baseline of 3. Description reinforces with 'by its ID' and contextualizes origin ('from list_sendings or get_event_state'), but does not add syntax details, format specifications, or validation rules beyond the schema.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Description opens with specific verb ('Get') + resource ('transactional email sending') + scoping ('by its ID'). Explicitly distinguishes from sibling 'get_relay_sending' by specifying 'transactional' vs 'mail relay' sendings, preventing tool selection errors.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use ('when you have a specific sending ID from list_sendings or get_event_state') and clear alternative routing ('For mail relay sendings, use get_relay_sending instead'). This prevents incorrect usage for relay mail vs transactional mail lookups.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

    Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description discloses the safety profile ('Read-only'), return format ('paginated HAL+JSON with email and block date'), and business impact ('cannot receive any emails'). Lacks only error-state or rate-limit details.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences with zero waste: (1) core action, (2) sibling distinction + provenance, (3) business effect, (4) alternatives + return format + safety. Front-loaded and dense with signal.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Despite lacking annotations and output schema, the description fully compensates by specifying read-only status, return structure (HAL+JSON), pagination behavior, and sibling relationships. Complete for a filtered list operation.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema coverage is 100% (ISO 8601 dates, page size/number fully described), establishing baseline 3. The description adds pagination context ('Returns paginated'), linking the size/page params to behavior, but does not elaborate on date-range semantics beyond the schema.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
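A sketch of assembling the date-range and paging arguments such a listing expects. ISO 8601 dates and size/page pagination are stated in the schema per this review, but the exact parameter names (`startDate`, `endDate`, `size`, `page`) are assumptions for illustration:

```python
from datetime import datetime, timedelta, timezone

def list_page_args(days_back: int, page: int = 0, size: int = 50) -> dict:
    """Arguments for a paginated listing filtered to the last N days.
    Parameter names are hypothetical; dates use ISO 8601 as the schema requires."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(days=days_back)
    iso = "%Y-%m-%dT%H:%M:%SZ"
    return {
        "startDate": start.strftime(iso),
        "endDate": end.strftime(iso),
        "size": size,
        "page": page,
    }
```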

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Opens with specific verb+resource ('List email addresses...blacklisted'), explicitly distinguishes from sibling 'blocklist' via the contrast clause, and clarifies the manual vs. automated population distinction.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly names two alternatives with distinct use cases: 'Use get_blacklist_entry to check a specific email' and 'check_email_blocked for a combined blocklist+blacklist check', clearly delineating when to use this list tool versus single-entry lookups.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

    Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full disclosure burden and successfully states the read-only safety profile, paginated nature of results, and HAL+JSON return format. It also explains the domain concept (complaints = spam markings), though it omits rate limits or specific HAL structure details.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Five efficient sentences with zero waste: purpose statement, domain definition, usage guidelines, filtering capabilities, and behavioral constraints. Information is front-loaded with the core action first, followed by essential context and constraints.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the 100% schema coverage and lack of output schema, the description provides sufficient context by specifying the return format (paginated HAL+JSON) and read-only nature. It adequately covers the tool's scope relative to its sibling 'list_complaints', enabling correct agent invocation.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema coverage is 100%, so the structured schema already documents all 10 parameters. The description provides a high-level summary of filterable fields ('Filter by complaint type, recipient...') but adds no semantic detail, syntax guidance, or complex parameter relationships beyond the schema's own descriptions.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description opens with a specific verb ('List') and precise resource ('feedback loop complaints for mail relay emails'), clearly distinguishing it from generic complaint tools. It explicitly differentiates from sibling tool 'list_complaints' by specifying 'relay emails only'.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use guidance ('Use this for relay emails only') and names the specific alternative tool ('use list_complaints instead') for event-triggered transactional complaints, eliminating ambiguity between the two listing functions.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
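Several of the listing tools reviewed here return paginated HAL+JSON. As a rough, hedged sketch of what consuming such a collection looks like (the `_links.next.href` relation is the standard HAL pagination convention; the Inxmail-specific field layout and the `fetch` helper are assumptions, not part of this server's documented API):

```python
def iter_hal_pages(fetch, first_url):
    """Walk a paginated HAL+JSON collection by following the standard
    `_links.next.href` relation until no next page remains.

    `fetch` is any callable that takes a URL and returns the parsed
    JSON body as a dict (e.g. a thin wrapper around an HTTP client).
    """
    url = first_url
    while url:
        page = fetch(url)
        yield page
        # HAL convention: the link to the next page, when present,
        # lives under _links.next.href.
        url = page.get("_links", {}).get("next", {}).get("href")
```

A caller would typically pull the items of each page out of the HAL `_embedded` object, whose exact key names depend on the endpoint.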

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, so description carries full burden. Discloses return format ('paginated HAL+JSON'), safety ('Read-only'), and filtering capabilities. Deducting one point because it doesn't mention rate limits, auth requirements, or error handling behavior that would be critical for a listing operation with 8 optional parameters.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Six sentences, zero waste. Front-loaded with core purpose, followed by domain context, sibling differentiation, filtering capabilities, return format, and safety property. Every sentence earns its place with no redundancy.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Complete for complexity level. Distinguishes from sibling list_sendings clearly, compensates for missing annotations by stating 'Read-only' and return format, and covers the 8-parameter filtering surface adequately given schema coverage. No output schema exists, but description discloses HAL+JSON pagination format.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema has 100% description coverage (all 8 params documented), establishing baseline of 3. Description mentions 'Filter by recipient, correlation IDs, or date range' which maps to parameters but adds minimal semantic depth beyond the schema (e.g., doesn't explain what correlation IDs represent or pagination behavior beyond schema's max 500 note).

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Excellent clarity: 'List mail relay sendings' provides specific verb and resource, immediately distinguishes from event-triggered emails with parenthetical '(not event-triggered)', and explains the domain concept ('SMTP relay'). Explicitly contrasts with sibling tool list_sendings.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit alternative: 'Use list_sendings for event-triggered transactional emails instead.' Also explains when to use this tool: 'when your application sends emails through Inxmail as an SMTP gateway.' Clear when/when-not guidance.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and discloses: (1) mutation type ('This is a write operation'), (2) return value behavior ('Returns empty response on success'), and (3) scope limitations (manual blocks only). Deducting one point for not specifying error behavior when the email is not found, or whether the operation is idempotent.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences with zero waste. Front-loaded with action/purpose, followed by operation type, sibling differentiation with conditional logic, and return value. Each sentence earns its place.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a single-parameter deletion tool without output schema or annotations, the description is complete. It compensates for missing output schema by specifying the empty response, explains the write operation nature, and clarifies domain-specific distinctions (blacklist vs blocklist).

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Input schema has 100% description coverage ('Email address to remove'). The description references 'email' but does not add semantic details beyond the schema (e.g., format validation, case sensitivity). Baseline 3 is appropriate given the schema completeness.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Description specifies the exact action ('Remove an email'), target resource ('explicit blacklist'), and outcome ('allowing emails to be sent to this address again'). It clearly distinguishes from sibling tools by differentiating 'blacklist' (manual blocks) from 'blocklist' (hard bounces) and referencing remove_from_blocklist.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states when to use ('Only removes from the blacklist') and provides the exact alternative for different scenarios ('if the address is also on the blocklist... use remove_from_blocklist as well'). This prevents incorrect tool selection for hard bounces.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
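The blacklist-versus-blocklist guidance above can be sketched as an agent-side helper that composes the tools. This is a hypothetical composition, not part of the server; `client` and its tool-named methods are assumptions standing in for whatever MCP invocation layer the agent uses:

```python
def fully_unblock(client, email):
    """Clear an address from the manual blacklist and, if it is still
    blocked by a hard bounce, from the blocklist as well.

    `client` is a hypothetical wrapper whose methods mirror the MCP
    tool names (remove_from_blacklist, check_email_blocked,
    remove_from_blocklist).
    """
    client.remove_from_blacklist(email)      # manual blocks
    if client.check_email_blocked(email):    # still blocked elsewhere?
        client.remove_from_blocklist(email)  # hard-bounce entries
```

The check-then-remove step reflects the description's warning that removing from the blacklist alone is not enough when the address also sits on the blocklist.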

  • Behavior 5/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full burden excellently. It discloses write-operation status, warns about the historical failure context, explains the auto-re-add behavior ('may be re-added automatically if it bounces again'), and documents the return value ('Returns empty response on success').

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences with zero waste: sentence 1 defines purpose, sentence 2 warns about write semantics, sentence 3 discloses side effects, and sentence 4 documents return behavior. Front-loaded with the core action and appropriately sized for the tool complexity.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a single-parameter tool with no output schema and no annotations, the description is complete. It covers the action, safety implications, behavioral side effects, and return value, leaving no critical gaps for an agent to invoke this tool correctly.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema coverage is 100% (the 'email' parameter is described as 'Email address to unblock'), so the baseline is 3. The description implies the parameter purpose through the action description but does not add format constraints, validation rules, or examples beyond the schema.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description opens with a specific verb ('Remove') and resource ('email from the hard-bounce blocklist'), clarifies the outcome ('allowing transactional emails to be sent'), and distinguishes from the sibling 'remove_from_blacklist' by specifying 'hard-bounce blocklist'.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit cautionary guidance ('use with caution, as the address previously caused a permanent delivery failure') appropriate for a write operation. However, it does not explicitly contrast usage with sibling alternatives like 'check_email_blocked' or clarify when to prefer this over 'remove_from_blacklist'.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, so description carries full disclosure burden. Successfully declares read-only status and default time range behavior (last 30 days). Explains aggregation into a single report. Minor gap: no mention of error handling, rate limits, or report size limits.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences with zero waste. Front-loaded with purpose ('Full delivery diagnostic'), followed by data details, usage guidance, and defaults. Each sentence earns its place without redundancy.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    No output schema exists, but description compensates by detailing the report contents (aggregated metrics from multiple sources). Adequate for a 3-parameter diagnostic tool despite missing annotations. Slight gap: could clarify report format or pagination behavior.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema coverage is 100% with default values documented, establishing baseline 3. Description adds value by explaining the holistic date range behavior ('if no date range is specified') and contextualizing that parameters bound all aggregated data types (sendings, bounces, etc.) rather than describing parameters in isolation.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    States specific action (delivery diagnostic) and resource (email address) clearly. Explicitly enumerates aggregated data types (block status, sendings, bounces, reactions) and distinguishes from siblings by stating it consolidates functionality of list_sendings, list_bounces, list_reactions, and check_email_blocked.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use guidance ('primary tool for investigating delivery issues') and explicitly names four sibling tools that should not be called separately when using this one. Clear directive on tool selection vs alternatives.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries full safety disclosure burden and explicitly states 'This is a write operation — it dispatches a real email'. It mentions return values ('Returns the sending result'). Lacks rate limits or error handling specifics, but covers critical mutation warning and side effects.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Five sentences efficiently cover: purpose, input constraints, safety warning, usage guidance, and return value. No redundant phrases or tautologies. Information is front-loaded with the core action, followed by requirements and warnings.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a single-parameter tool with complex payload requirements, it adequately explains the RFC 5322 structure and encoding needs. Without an output schema, it acknowledges the return value ('sending result'). Could specify what the result object contains, but sufficient for invocation decisions.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    While the schema has 100% coverage describing the 'message' parameter, the description adds crucial constraints not in the schema: specific required headers (From, To, Subject, MIME) and the Base64 encoding requirement. This adds meaningful validation context beyond the structured schema.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description opens with a specific verb ('Send') and resource ('fully composed RFC 5322 email'), and explicitly scopes it to 'Inxmail's infrastructure'. It distinguishes from siblings by specifying 'custom one-off emails that don't fit an event type template', clearly contrasting with the trigger_event/get_event_type template-based workflow.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states when to use ('custom one-off emails that don't fit an event type template') and warns about real-world effects ('dispatches a real email'). The contrast with 'event type template' implicitly identifies trigger_event as the alternative for templated sends, providing clear selection criteria.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
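The RFC 5322 structure and Base64 encoding called out in this review can be produced with Python's standard library alone. A minimal sketch, assuming placeholder addresses and subject; the exact request field that carries the encoded payload is not shown here and depends on the tool's input schema:

```python
import base64
from email.message import EmailMessage

# Compose a minimal RFC 5322 message. EmailMessage emits the required
# MIME headers (MIME-Version, Content-Type) automatically when content
# is set, so only From/To/Subject need to be supplied explicitly.
msg = EmailMessage()
msg["From"] = "noreply@example.com"      # placeholder sender
msg["To"] = "customer@example.com"       # placeholder recipient
msg["Subject"] = "Your receipt"
msg.set_content("Thanks for your order!")

# Base64-encode the serialized message, matching the encoding
# requirement mentioned in the tool description.
payload = base64.b64encode(msg.as_bytes()).decode("ascii")
```

The `payload` string is what would be passed to the raw-send tool's message parameter.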

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries full disclosure burden. It explicitly states 'This is a write operation,' describes the deduplication semantics (72-hour window), and discloses return values (transaction ID and acceptance status) despite the lack of output schema. Minor gap on error handling or rate limits prevents a 5.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences with zero waste: 1) purpose/action, 2) operation type/side effects, 3) sibling tool reference, 4) deduplication/returns. Information is front-loaded and every clause provides actionable guidance.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Comprehensive for a write operation lacking annotations or output schema. Covers dispatch behavior, deduplication logic, return values, and post-action verification (get_event_state). Lacks explicit error scenarios or retry semantics, but adequately compensates for missing structured metadata.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    While schema has 100% coverage (baseline 3), the description adds crucial semantic context: it clarifies that payload 'must contain a valid email' and explains transactionId's purpose for 'deduplication within a 72-hour window,' adding behavioral meaning beyond the schema's 'Unique ID' description.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Description explicitly states the tool 'triggers a transactional email event' and 'dispatches an actual email' via Inxmail. It clearly distinguishes from siblings like send_raw_mail by emphasizing transactional events versus raw SMTP, and specifies the side effect (actual email dispatch).

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly directs users to 'Use get_event_state with the returned transactionId to check delivery outcome,' providing clear guidance on which sibling tool to use for status checking versus triggering. Also specifies the 72-hour deduplication window constraint.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
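The 72-hour deduplication window described above rewards a transactionId that is stable across retries. One way to get that is to derive the ID from the business event itself; the uuid5 scheme below is an assumption for illustration, not something the API prescribes, and any stable unique string would do:

```python
import uuid

def transaction_id(event_type: str, entity_id: str) -> str:
    """Derive a deterministic transactionId from the business event
    (e.g. an order ID), so a retried trigger within the 72-hour
    deduplication window reuses the same ID and the email is not
    dispatched twice. The namespace string is a made-up convention.
    """
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"inxmail:{event_type}:{entity_id}"))
```

The same returned ID can then be fed to get_event_state to check the delivery outcome.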

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, so description carries full burden. It successfully discloses read-only nature ('Read-only') and describes return payload structure ('Returns recipient, timestamps, correlation IDs, and delivery status'). Minor gap: no mention of error conditions or rate limits.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Five short sentences with zero waste. Front-loaded with core purpose, followed by return values, usage conditions, alternative tool, and safety hint. Every sentence earns its place.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a single-parameter retrieval tool without output schema, description is complete. It compensates for missing output schema by detailing return fields, and covers the lack of annotations with explicit 'Read-only' declaration.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema has 100% coverage (relaySendingId described), establishing baseline of 3. Description adds valuable context that the ID originates from 'list_relay_sendings', helping the agent understand the data flow.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    States specific action ('Get full details'), resource ('mail relay sending'), and scope ('by its ID'). Clearly distinguishes from sibling tool 'get_sending' by specifying 'relay sending' and contrasting with 'event-triggered sendings'.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states when to use ('when you have a relay sending ID from list_relay_sendings') and provides clear alternative ('For event-triggered sendings, use get_sending instead'), directly addressing sibling tool selection.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior 5/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, yet description fully discloses: management mechanism ('managed by Inxmail — addresses are added automatically'), safety profile ('Read-only'), and return format ('paginated HAL+JSON with email and block date'). Exceeds baseline requirements.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Five sentences cover: purpose, management logic, sibling distinction, alternatives, and return format. Zero filler. Front-loaded action verb. Each sentence advances understanding of tool behavior or selection criteria.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Despite missing output schema, description fully specifies return format (HAL+JSON structure, pagination, fields included). Covers domain concepts (hard bounces vs manual blacklisting), safety, and pagination behavior. Complete for a list operation with 4 parameters.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema has 100% coverage, establishing baseline 3. Description adds contextual value by stating 'Returns paginated' (clarifying size/page purpose) and implying date filtering scope without redundantly listing parameters. Slight boost for contextual framing.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Opens with specific verb+resource ('List email addresses automatically blocked due to hard bounces') and immediately distinguishes from sibling 'blacklist' tool ('Unlike the blacklist (manually managed)...'). Clearly defines scope as automatic hard-bounce blocks.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly names alternatives with specific use cases: 'Use get_blocklist_entry to check a specific email, or check_email_blocked for a combined check.' Also contrasts blocklist (automatic) vs blacklist (manual) to guide correct selection.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior 5/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, yet the description fully discloses critical behaviors: hard bounces are auto-added to the blocklist, soft bounces are temporary, results return as paginated HAL+JSON with specific fields, and read-only status is declared. Rich behavioral context compensates completely for missing annotations.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences with zero waste: defines action, explains domain concepts (hard/soft bounces), provides usage guidance with sibling reference, and specifies output format. Front-loaded with core purpose.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given 10 parameters, no output schema, and no annotations, description is remarkably complete: covers domain logic, side effects (blocklist auto-add), pagination format, return fields, and read-only nature. No gaps remain for agent operation.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema has 100% coverage (baseline 3). Description adds significant semantic value by explaining bounce type distinctions (hard=permanent/blocklist, soft=temporary) and return data structure, enriching understanding of bounceType and pagination parameters beyond schema labels.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Opens with specific verb+resource ('List bounced transactional emails') and immediately distinguishes scope from sibling tool list_relay_bounces by specifying 'transactional emails' vs 'mail relay bounces'.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use ('investigate delivery issues for event-triggered emails') and explicit alternative ('For mail relay bounces, use list_relay_bounces instead'), creating clear selection criteria against siblings.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

GitHub Badge

Glama performs regular codebase and documentation scans to:

  • Confirm that the MCP server is working as expected.
  • Confirm that there are no obvious security issues.
  • Evaluate tool definition quality.

Our badge communicates server capabilities, safety, and installation instructions.

Card Badge

Copy the Card Badge embed snippet into your README.md.

Score Badge

Copy the Score Badge embed snippet into your README.md.

How to claim the server?

If you are the author of the server, you simply need to authenticate using GitHub.

However, if the MCP server belongs to an organization, you need to first add glama.json to the root of your repository.

{
  "$schema": "https://glama.ai/mcp/schemas/server.json",
  "maintainers": [
    "your-github-username"
  ]
}

Then, authenticate using GitHub.

Browse examples.

How to make a release?

A "release" on Glama is not the same as a GitHub release. To create a Glama release:

  1. Claim the server if you haven't already.
  2. Go to the Dockerfile admin page, configure the build spec, and click Deploy.
  3. Once the build test succeeds, click Make Release, enter a version, and publish.

This process allows Glama to run security checks on your server and enables users to deploy it.

How to add a LICENSE?

Please follow the instructions in the GitHub documentation.

Once GitHub recognizes the license, the system will automatically detect it within a few hours.

If the license does not appear on the server after some time, you can manually trigger a new scan using the MCP server admin interface.

How to sync the server with GitHub?

Servers are automatically synced at least once per day, but you can also sync manually at any time to instantly update the server profile.

To manually sync the server, click the "Sync Server" button in the MCP server admin interface.

How is the quality score calculated?

The overall quality score combines two components: Tool Definition Quality (70%) and Server Coherence (30%).

Tool Definition Quality measures how well each tool describes itself to AI agents. Every tool receives a Tool Definition Quality Score (TDQS) of 1–5, weighted across six dimensions: Purpose Clarity (25%), Usage Guidelines (20%), Behavioral Transparency (20%), Parameter Semantics (15%), Conciseness & Structure (10%), and Contextual Completeness (10%). The server-level definition quality score is calculated as 60% mean TDQS + 40% minimum TDQS, so a single poorly described tool pulls the score down.

Server Coherence evaluates how well the tools work together as a set, scoring four dimensions equally: Disambiguation (can agents tell tools apart?), Naming Consistency, Tool Count Appropriateness, and Completeness (are there gaps in the tool surface?).

Tiers are derived from the overall score: A (≥3.5), B (≥3.0), C (≥2.0), D (≥1.0), F (<1.0). B and above is considered passing.
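The weighting described above can be sanity-checked in a few lines. This is a sketch of the stated formula only, not Glama's actual implementation; the dimension key names and rounding behavior are assumptions:

```python
# Dimension weights and the 60/40 mean-vs-minimum blend, as stated above.
DIMENSION_WEIGHTS = {
    "purpose": 0.25,
    "usage_guidelines": 0.20,
    "behavior": 0.20,
    "parameters": 0.15,
    "conciseness": 0.10,
    "completeness": 0.10,
}

def tool_tdqs(scores):
    """Weighted 1-5 score for one tool; `scores` maps dimension -> score."""
    return sum(DIMENSION_WEIGHTS[d] * s for d, s in scores.items())

def server_definition_quality(tool_scores):
    """60% mean + 40% minimum, so one poorly described tool drags it down."""
    mean = sum(tool_scores) / len(tool_scores)
    return 0.6 * mean + 0.4 * min(tool_scores)
```

For example, a server whose tools average 4.0 but whose weakest tool scores 2.0 lands at 0.6 × 4.0 + 0.4 × 2.0 = 3.2, which falls in the B tier.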

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/shahabazdev/inxmail-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.