meta-mcp
Server Quality Checklist
- Disambiguation 4/5
Tools follow a clear platform_prefix + verb + noun pattern that distinguishes Instagram, Threads, and Meta functions effectively. While most tools target distinct resources, the pair threads_get_posts and threads_get_user_threads are described as aliases, creating unnecessary confusion about which to use.
- Naming Consistency 5/5
All tools use consistent snake_case formatting with structured prefixes (ig_, threads_, meta_) followed by standard REST verbs (get, delete, publish, post, etc.). The naming convention is predictable and systematic throughout the entire surface.
- Tool Count 2/5
With 56 tools covering two distinct social platforms plus authentication, the surface far exceeds the practical threshold for agent usability (roughly 25 tools). While the scope is legitimately broad, the sheer volume strains LLM context windows and tool-selection accuracy.
- Completeness 4/5
The server provides comprehensive coverage for Instagram content publishing, comment moderation, DM management, and analytics, plus full Threads capabilities including search and insights. Minor gaps exist (e.g., no caption editing, limited webhook management), but core CRUD workflows for both platforms are well represented.
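The prefix + verb + noun convention noted under Naming Consistency can be checked mechanically. A minimal sketch, assuming the three prefixes listed above; the regex and helper are illustrative, not part of the server:

```python
import re

# Illustrative check for the ig_/threads_/meta_ prefix + verb + noun shape;
# the pattern is an assumption derived from the convention described above.
TOOL_NAME = re.compile(r"^(ig|threads|meta)_[a-z]+(?:_[a-z]+)*$")

def follows_convention(name: str) -> bool:
    """True if a tool name matches the snake_case prefix + verb + noun shape."""
    return TOOL_NAME.fullmatch(name) is not None
```

Under this sketch, `threads_get_posts` passes while a camelCase name like `getThreadsPosts` or a foreign prefix like `slack_post_message` fails.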
Average 3.4/5 across 56 of 56 tools scored. Lowest: 2.5/5.
See the tool scores section below for per-tool breakdowns.
This repository includes a README.md file.
This repository includes a LICENSE file.
Latest release: v3.1.0
No tool usage detected in the last 30 days. Usage tracking helps demonstrate server value.
Tip: use the "Try in Browser" feature on the server page to seed initial usage.
This repository includes a glama.json configuration file.
This server provides 57 tools.
No known security issues or vulnerabilities reported.
This server has been verified by its author.
Add related servers to improve discoverability.
Tool Scores
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden of behavioral disclosure. It fails to specify what 'information' is returned (since no output schema exists), whether this requires specific permissions, or if there are rate limits. The word 'Get' implies read-only, but this is not explicit.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 3/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief at 5 words. While no words are wasted, the description is arguably under-specified rather than optimally concise—it sacrifices necessary detail (return type, distinguishing features) for brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema and the presence of multiple similar sibling tools, the description is insufficient. It should explain what hashtag information is returned to help the agent determine if this meets its needs, and clarify the relationship to other hashtag endpoints.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage (hashtag_id with source hint). The description mentions 'by ID' which aligns with the parameter, but adds no additional semantics about the ID format, validation rules, or behavior when invalid. Baseline score applies since schema is self-documenting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 3/5
Does the description clearly state what the tool does and how it differs from similar tools?
States the basic action (Get hashtag information) and scope (by ID), but 'information' is vague and doesn't specify what data is returned (metadata, statistics, etc.). The 'by ID' qualifier helps distinguish from ig_search_hashtag, but doesn't clarify how this differs from ig_get_hashtag_recent or ig_get_hashtag_top.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus the three sibling hashtag tools (ig_search_hashtag, ig_get_hashtag_recent, ig_get_hashtag_top). The schema description mentions the ID comes from ig_search_hashtag, but the main description doesn't explain this workflow or when this specific endpoint is preferred.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
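Taken together, the per-dimension gaps above point to one fix: a longer description that names the return shape, the read-only nature, and the sibling tools. A hypothetical revision, sketched as a plain dict; the field names follow the MCP tool-definition shape, and the wording is illustrative, not the server's actual definition:

```python
# Hypothetical rewrite of the hashtag tool's metadata; everything here is
# illustrative wording, not the server's actual definition.
improved_tool = {
    "name": "ig_get_hashtag",
    "description": (
        "Get metadata (id, name) for an Instagram hashtag by ID. Read-only. "
        "Obtain the ID from ig_search_hashtag first; use ig_get_hashtag_recent "
        "or ig_get_hashtag_top to fetch media tagged with the hashtag."
    ),
    "annotations": {"readOnlyHint": True},
}
```

Two extra sentences and one annotation would address most of the Behavior, Purpose, and Usage Guidelines complaints above.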
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, yet the description discloses no behavioral traits. Missing: whether replies create threaded conversations, rate limiting, what happens if the parent comment is deleted, or required authentication scope.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief at only four words. While not verbose, it is appropriately structured and front-loaded; however, the extreme brevity contributes to under-specification rather than efficient communication.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the rich ecosystem of 40+ sibling tools with overlapping functionality (ig_post_comment, threads_reply, ig_send_message), the description fails to clarify unique aspects, return behavior, or selection criteria needed for correct tool invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage for both parameters. The description adds no additional semantic context (e.g., message length limits, comment_id format), meeting the baseline expectation when schema documentation is complete.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 3/5
Does the description clearly state what the tool does and how it differs from similar tools?
States the basic action (reply) and target (comment), but lacks specificity about the Instagram platform context and fails to distinguish from sibling tools like ig_post_comment (which posts top-level comments) or threads_reply.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this versus ig_post_comment or other communication tools. Does not mention prerequisites like obtaining the comment_id from ig_get_comments or permissions required.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but provides minimal information. It does not clarify read-only safety, rate limits, what 'published' implies (vs. drafts), or the return data structure. The agent must infer safety from the verb 'get'.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is efficiently worded with no redundancy. However, given the lack of annotations and the need to distinguish from numerous siblings, the description is arguably too minimal rather than appropriately concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Incomplete for a tool with 5 parameters, no annotations, and no output schema. Critical gaps remain regarding resource ownership (authenticated user's posts), pagination behavior (how cursors relate to time ranges), and relationship to sibling tools. The description does not compensate for missing structured metadata.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema adequately documents all five parameters (limit, since, until, after, before). The description adds no parameter-specific context, meeting the baseline expectation when the schema is self-documenting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 3/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the basic action (get/list) and resource (Threads posts), but fails to specify whose posts (authenticated user vs. arbitrary) or how it differs from siblings like 'threads_get_user_threads' or 'threads_get_post' (singular). The scope is ambiguous given the API context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus 'threads_search_posts', 'threads_get_user_threads', or 'threads_get_post'. No mention of pagination strategies (time-based vs. cursor) or prerequisites like authentication requirements.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. While 'pending' hints at a filter state, it omits: pagination behavior (despite cursor parameter), rate limiting, required permissions, or what data structure is returned. The 'Added in December 2025' metadata is irrelevant to behavioral understanding.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 3/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences total. The first is functional and front-loaded. The second ('Added in December 2025') wastes space with version metadata that doesn't aid tool selection or invocation, preventing a higher score.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks output schema and fails to describe return values, fields, or data structure of the invites. For a data retrieval tool, this omission leaves significant gaps in understanding what information will be available for downstream processing.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage with basic descriptions ('Number of results', 'Pagination cursor'). The description adds no parameter-specific context, but baseline 3 is appropriate since schema adequately documents the two parameters without need for elaboration.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action (Get) and resource (pending collaboration invites) with scope (Instagram account). The word 'pending' implicitly distinguishes it from sibling tool ig_respond_collaboration_invite, though it doesn't explicitly reference the workflow between them.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this versus alternatives, prerequisites for invocation, or expected workflow (e.g., 'use this before responding to invites').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, yet description fails to disclose read-only nature, permission requirements, rate limits, or error behavior (e.g., message not found). Only implies read operation via 'Get'.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 3/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence is front-loaded but undersized given lack of annotations and output schema. While not wasteful, it fails to earn completeness points due to omitted behavioral and return-value context.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, yet description doesn't describe what 'details' are returned (content, sender, timestamps, media). No error handling or side-effects mentioned despite zero annotation coverage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage for the single parameter. Description mentions 'specific' aligning with message_id but adds no semantic detail about ID format, source, or constraints beyond the schema itself.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
Uses specific verb 'Get' and resource 'DM message', with 'specific' distinguishing it from the sibling 'ig_get_messages' (plural). Clear scope but doesn't explicitly name the sibling alternative.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this single-message retrieval vs the plural 'ig_get_messages', nor prerequisites for obtaining a valid message_id. Lacks when-not-to-use or prerequisite context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Fails to disclose error behavior (e.g., comment not found), whether replies are included in the response, rate limits, or required permissions beyond the implicit read operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise single sentence with no redundancy. However, it may be overly terse given the lack of annotations and output schema, leaving significant gaps that could have been filled with one additional sentence.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema and no annotations, the description omits critical context: the structure/content of returned comment details, relationship to sibling tools, and error conditions. Insufficient for agent confidence in invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage ('Comment ID'), establishing a baseline of 3. The description adds no additional context about the parameter format, valid ID patterns, or that the ID typically comes from 'ig_get_comments' or webhooks.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
States a clear verb ('Get') and resource ('details of a specific comment'), but does not explicitly differentiate from sibling tool 'ig_get_comments' (plural) or clarify what 'details' encompasses.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this single-comment lookup versus 'ig_get_comments' for bulk retrieval, or versus 'ig_get_replies' for threaded conversations. No prerequisites (e.g., obtaining the comment_id) are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to deliver. It does not explain pagination mechanics, rate limits, comment ordering (chronological vs. ranked), or whether replies are included. The presence of 'ig_get_replies' as a sibling suggests this tool returns top-level comments only, but the description remains silent on this critical distinction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single 7-word sentence that is front-loaded and efficient. However, given the lack of annotations and output schema, this brevity becomes a liability rather than a virtue—it sacrifices necessary behavioral context for extreme conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 3-parameter tool with pagination capabilities ('after' cursor), no output schema, and no annotations, the description is insufficient. It fails to explain the pagination workflow, distinguish from related comment tools (ig_get_comment, ig_get_replies, ig_get_mentioned_comments), or describe the expected return structure.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description mentions 'specific Instagram media post' which aligns with the 'media_id' parameter, but adds no semantic context for 'limit' or 'after' beyond what the schema already provides (e.g., pagination patterns, max limits).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the core action ('Get') and resource ('comments') scoped to 'a specific Instagram media post.' However, it fails to distinguish from the singular sibling 'ig_get_comment' or clarify whether this retrieves top-level comments only (given 'ig_get_replies' exists separately).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'ig_get_comment' (singular) or 'ig_get_replies.' It omits pagination strategy despite having an 'after' cursor parameter, leaving agents to infer usage patterns.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
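The pagination workflow the description leaves implicit can be made concrete. A sketch of draining the 'after' cursor, assuming a generic call helper and a Graph-API-style response shape; both are assumptions, since the server documents neither:

```python
def fetch_all_comments(call, media_id, limit=50):
    """Drain cursor pagination: feed each returned cursor back as 'after'
    until no cursor (or no data) comes back. The 'call' helper and the
    response shape are assumptions, not documented by the server."""
    comments, after = [], None
    while True:
        resp = call("ig_get_comments", media_id=media_id, limit=limit, after=after)
        data = resp.get("data", [])
        comments.extend(data)
        after = resp.get("paging", {}).get("cursors", {}).get("after")
        if not after or not data:
            break
    return comments
```

A single sentence in the description ("pass the returned cursor as 'after' to fetch the next page") would let an agent derive this loop on its own.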
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. While 'Check' implies a read-only operation, the description doesn't confirm safety, idempotency, expected status values (e.g., 'FINISHED', 'IN_PROGRESS'), polling frequency limits, or error states. This is inadequate for a status-checking tool in an async workflow.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at nine words. The parenthetical '(useful for videos)' earns its place by clarifying scope. However, given the lack of annotations and output schema, the brevity becomes a liability—every sentence earns its place, but there aren't enough sentences to convey necessary behavioral context.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Insufficient for a tool that's part of a multi-step asynchronous publishing workflow. With no output schema and no annotations, the description should explain what status values are returned, that this tool is designed for polling after publishing media, and typical completion times. Currently agents must guess the return format and workflow integration.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage (container_id is described as 'Container ID to check'), so the baseline is 3. The description mentions 'media container' which aligns with the parameter, but adds no additional semantic context about where to obtain this ID or its format beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb (Check) and resource (processing status of a media container), with a helpful parenthetical clarifying it's useful for videos. However, it doesn't explicitly identify this as Instagram-specific (though the tool name does), nor does it distinguish from the sibling threads_get_container_status tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives, or prerequisites for invocation. Given this is likely part of an async publishing workflow (checking status after ig_publish_video), the description fails to mention when polling is appropriate or how it fits into the publishing sequence.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
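The polling loop the description omits is easy to sketch. Assuming Graph-API-style status values ('IN_PROGRESS', 'FINISHED', 'ERROR') and a generic status callable — both assumptions, since the tool documents neither:

```python
import time

def wait_for_container(check_status, container_id, timeout=300.0, interval=5.0):
    """Poll until the container reports a terminal status. The status
    strings mirror the Instagram Graph API and are assumptions here,
    as is the check_status callable."""
    deadline = time.monotonic() + timeout
    while True:
        status = check_status(container_id)
        if status in ("FINISHED", "ERROR"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"container {container_id} still {status} after {timeout}s")
        time.sleep(interval)
```

Documenting the terminal status values and a suggested polling interval in the description would let agents write exactly this loop instead of guessing.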
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to specify the time window for 'recent', whether the operation is read-only (implied but not confirmed), pagination behavior, or return format. It only confirms the action is safe.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words. However, given the lack of annotations and output schema, this extreme brevity results in underspecification rather than optimal conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no annotations and no output schema, the description is inadequate. It fails to explain what constitutes 'recent', what media fields are returned, or pagination mechanics, leaving significant gaps the agent must infer from parameter names.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage with basic explanations ('Hashtag ID', 'Number of results', 'Pagination cursor'). The description adds no additional semantic context about parameter formats or constraints, meeting the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb (Get), resource (recent media), and scope (tagged with specific hashtag). It implicitly distinguishes from sibling ig_get_hashtag_top via the word 'recent', though it could explicitly clarify the difference between recent and top content.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this versus ig_get_hashtag_top, or how to handle pagination beyond the 'after' parameter existing. There are no prerequisites, rate limit warnings, or timeframe explanations for 'recent'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. While 'Get' implies a read-only operation, the description does not confirm idempotency, safety, rate limits, or error conditions. It also fails to characterize the return value beyond the vague term 'details'.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The single-sentence description is efficiently structured with zero redundancy and the action verb front-loaded. However, given the absence of annotations and output schema, it errs on the side of under-specification rather than optimal information density.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a two-parameter tool with complete schema coverage, the description meets minimum viability but lacks context regarding the output structure (especially since no output schema exists) and does not clarify what constitutes 'details' beyond the fields parameter defaults.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, documenting both media_id and the fields parameter with its default values. The description adds no supplemental parameter context, but this aligns with the baseline expectation when the schema is self-documenting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a clear verb ('Get') and resource ('Instagram media post'), and the word 'specific' implies single-item retrieval by ID. However, it does not explicitly differentiate from sibling ig_get_media_insights (which retrieves analytics vs. basic metadata) or ig_get_tagged_media (which retrieves collections).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like ig_get_media_insights for analytics or ig_get_tagged_media for discovery. It also omits prerequisites such as required permissions or valid media_id formats.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
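A description revised along the lines suggested above might read like the following sketch. The wording and the before/after strings are illustrative assumptions, not the server's actual text:

```python
# Hypothetical before/after for the ig_get_media description.
# Illustrative wording only -- not the server's actual text.
before = "Get details of a specific Instagram media post."

after = (
    "Get metadata (caption, media type, permalink, timestamp) for a single "
    "Instagram media post by ID. Read-only. Use ig_get_media_insights for "
    "analytics and ig_get_tagged_media to list posts the account is tagged in."
)

# The revision front-loads the verb, names the returned fields, and adds the
# "use X instead of Y" guidance the Usage Guidelines dimension asks for.
```

The revised string still fits in one or two sentences, so the Conciseness score need not suffer to raise Completeness and Usage Guidelines.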
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Get' implies a read-only operation, the description does not confirm idempotency, disclose whether messages are marked as read when fetched, explain rate limits, or describe the return value structure (e.g., message content, timestamps, sender info).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no redundant words. However, given the lack of annotations and behavioral details, it borders on underspecification rather than optimal conciseness. Every word earns its place, but more words were needed.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool supports pagination (evident from 'after' and 'limit' parameters), requires a conversation identifier likely obtained from another tool, and returns message data (no output schema provided), the description is insufficient. It omits critical context about the data retrieval workflow, pagination handling, and response expectations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema adequately documents all three parameters ('conversation_id', 'limit', 'after'). The description adds minimal semantic value beyond the schema, though the phrase 'specific DM conversation' provides context for the conversation_id. It does not explain that conversation_id must be obtained from 'ig_get_conversations' or how pagination flows work.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get messages') and scope ('in a specific DM conversation'), identifying the resource being accessed. However, it does not explicitly distinguish this from sibling tools like 'ig_get_message' (singular) or 'ig_get_conversations', which would help the agent understand the plural vs. singular retrieval pattern.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, nor does it explain the prerequisite workflow (e.g., that one must first call 'ig_get_conversations' to obtain the required 'conversation_id'). There is no mention of pagination strategy using the 'after' parameter.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
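The prerequisite-plus-pagination workflow the review says the description should spell out can be sketched as follows. `call_tool` and the Graph-API-style response shape (`data`, `paging.cursors.after`) are assumptions about a typical cursor-paged endpoint, not confirmed behavior of this server:

```python
# Sketch of the workflow an agent must currently infer: obtain a
# conversation_id from ig_get_conversations, then page through
# ig_get_messages with the 'after' cursor until it is exhausted.
def fetch_all_messages(call_tool, conversation_id, limit=50):
    messages, after = [], None
    while True:
        args = {"conversation_id": conversation_id, "limit": limit}
        if after:
            args["after"] = after
        page = call_tool("ig_get_messages", args)
        messages.extend(page["data"])
        after = page.get("paging", {}).get("cursors", {}).get("after")
        if not after or not page["data"]:
            break
    return messages
```

Two sentences in the description ("obtain conversation_id from ig_get_conversations; pass the returned 'after' cursor to fetch the next page") would let an agent derive this loop on the first attempt.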
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It fails to indicate this is a read-only operation, does not explain pagination behavior despite having 'limit' and 'after' parameters, and omits error conditions (e.g., private accounts, deleted comments) or rate limit considerations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief single sentence with no filler. While appropriately front-loaded, it arguably errs on the side of under-specification given the lack of annotations and output schema, though it efficiently communicates the core action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Inadequate for a pagination-enabled tool with no output schema. The description omits critical operational details such as return value structure, pagination workflow (how to use 'after' from previous responses), and data limits, leaving significant gaps for proper invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema fully documents all three parameters (comment_id, limit, after). The description adds no supplemental context about parameter formats or relationships, warranting the baseline score for complete schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
States a specific action (Get) and resource (replies to a specific comment) with clear scope. However, it does not explicitly differentiate from sibling tools like 'ig_get_comments' (top-level vs replies) or 'ig_reply_to_comment' (read vs write).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives like 'ig_get_comments', no mention of prerequisites (e.g., needing a valid comment_id from 'ig_get_comments'), and no explanation of pagination workflow using the 'after' cursor.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full burden. It mentions the media count limits but fails to disclose critical write-operation details: whether publishing is synchronous, what errors might occur, rate limits, or whether the operation is reversible. The terminology is also slightly imprecise: it refers to 'image_url or video_url' while the schema uses a unified 'url' field with a type discriminator.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with no filler. Front-loaded with the core action and constraints. Every word serves a purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a publishing tool with no annotations or output schema, the description lacks operational context. Missing: return value details (post ID?), error conditions, authentication scope, and platform-specific behaviors (e.g., alt_text only valid for images).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing a baseline of 3. The description reinforces the 2-10 item constraint and URL requirement, but inaccurately describes the schema structure as having separate image_url/video_url fields rather than a single 'url' field with a type enum. It does not elaborate on the optional caption or location_id parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it publishes a carousel/album post and specifies the 2-10 media item constraint. However, it does not explicitly differentiate this from sibling publishing tools like ig_publish_photo or threads_publish_carousel.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
States the quantity constraint (2-10 items) but provides no guidance on when to use this versus single-image publishing or other platforms. No mention of prerequisites like account requirements or media specifications beyond count.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
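The unified item shape the Parameters note describes can be illustrated with a sketch of a well-formed call. The field names (`items`, `url`, `type`, `alt_text`, `caption`, `location_id`) follow the review's reading of the schema and are assumptions, not the live schema:

```python
# Illustrative ig_publish_carousel arguments, assuming the unified
# 'url' + 'type' item shape the review describes.
carousel_args = {
    "items": [
        {"url": "https://example.com/a.jpg", "type": "image",
         "alt_text": "Product photo"},          # alt_text: images only
        {"url": "https://example.com/b.mp4", "type": "video"},
    ],
    "caption": "Launch day!",
    "location_id": "12345",  # optional
}

def validate_carousel(args):
    """Enforce the 2-10 item constraint the description states."""
    n = len(args.get("items", []))
    if not 2 <= n <= 10:
        raise ValueError(f"carousel needs 2-10 items, got {n}")
    return True
```

A description that shows one such item inline would both fix the image_url/video_url imprecision and satisfy the Parameters dimension without lengthening the schema.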
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Enable or disable' implies a state change, the description fails to disclose whether existing comments remain visible when disabled, required permissions, reversibility, or whether this is a destructive operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence with no redundancy. It efficiently conveys the core function, though its brevity contributes to the lack of contextual guidance and behavioral details that would be helpful for an agent.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple 2-parameter schema with full coverage and no output schema, the description is minimally adequate. However, for a mutation tool without annotations, it should ideally disclose that this modifies post settings or mention the effect on existing comments to be fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the parameters are fully documented in the schema itself (media_id and enabled with clear descriptions). The description echoes the boolean nature of the operation but does not add syntax details, example values, or semantic nuances beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Enable or disable') and identifies the resource ('comments on an Instagram media post'). However, it does not explicitly distinguish this tool from siblings like ig_hide_comment or ig_delete_comment, which manage individual comments rather than the post-level commenting permission.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like ig_hide_comment (which hides specific existing comments) or prerequisites such as media ownership requirements. It does not clarify that this controls the ability to add new comments rather than managing existing ones.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden of behavioral disclosure. While 'Get' implies a read-only operation, the description does not clarify error handling (e.g., post not found), rate limits, authentication requirements, or what specific 'details' are returned.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The single-sentence description is appropriately brief and front-loaded with the core action. It contains no redundant or wasteful text, though its extreme brevity contributes to information gaps elsewhere.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple input schema (2 parameters, no nesting) and lack of output schema, the description is minimally viable but incomplete. It fails to hint at the structure or content of the returned 'details', which would help an agent determine if this tool meets its needs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema adequately documents both 'post_id' and 'fields' parameters. The description adds no additional semantic value (e.g., example post ID formats, available field options), warranting the baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('details of a specific Threads post'), making the basic purpose understandable. However, it fails to differentiate from the sibling tool 'threads_get_posts' (plural), which could confuse an agent about when to retrieve a single post versus multiple posts.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'threads_get_posts' or 'threads_search_posts'. There are no prerequisites, exclusions, or contextual triggers mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
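The singular/plural confusion flagged for Purpose could be resolved with cross-referencing descriptions like the following sketch. The wording is a hypothetical revision, not the server's actual text:

```python
# Hypothetical contrasting descriptions for the singular/plural pair.
# Illustrative wording only -- not the server's actual text.
descriptions = {
    "threads_get_post": (
        "Get one Threads post by ID. Use threads_get_posts to list a user's "
        "recent posts, or threads_search_posts to find posts by keyword."
    ),
    "threads_get_posts": (
        "List recent Threads posts for a user. For a single known post ID, "
        "use threads_get_post instead."
    ),
}
```

Mutual cross-references like these give an agent an explicit decision rule instead of forcing it to infer intent from a one-letter name difference.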
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to mention that this creates public content, potential rate limits, whether it triggers notifications, or what happens if reply_to_id references a deleted post. The write nature of the operation is only implied, not stated.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is efficient and front-loaded with the essential verb and resource. However, given the complexity of social media publishing and lack of annotations, the description may be overly terse rather than appropriately concise, leaving critical safety and usage information unstated.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
As a write operation with no annotations and no output schema, the description inadequately prepares the agent for invocation. It omits visibility implications (public reply), media restrictions beyond URL format, authentication requirements, and error handling scenarios essential for social media publishing tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, documenting all four parameters (reply_to_id, text, image_url, video_url) including the maxLength constraint on text. The description adds no additional semantic context beyond the schema, meeting the baseline expectation when schema coverage is high.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the core action (reply) and target resource (Threads post or reply), distinguishing it from sibling tools like threads_publish_text (original posts) and ig_reply_to_comment (Instagram platform). However, it lacks specificity about the operation type (write/mutation) and scope constraints.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided for when to use this versus alternatives like threads_publish_text for original content, or prerequisites such as needing to obtain a post ID first via threads_get_posts. There are no exclusions or failure mode warnings (e.g., cannot reply to private accounts).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full disclosure burden but offers minimal behavioral context. It mentions 'public' scope and filtering capabilities but fails to disclose pagination mechanics (how 'after' cursor works), result set limits, rate limits, or what data structure is returned (critical given no output schema exists).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of two efficient, front-loaded sentences with zero redundancy. Every word serves to define scope or capability, making it appropriately sized for quick comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 8 parameters including pagination cursors, no output schema, and zero annotations, the description is insufficient. It omits how pagination functions, what the return payload contains, and any constraints on search queries (length, special characters) necessary for successful invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description maps high-level concepts to parameters ('filtered by media type' and 'author') but adds no semantic detail beyond the schema (e.g., date format specifics for 'since'/'until' or pagination behavior for 'after').
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the action ('Search'), resource ('public Threads posts'), and key mechanisms ('by keyword or topic tag'). It implicitly distinguishes from sibling tools like threads_get_user_threads by specifying 'public' posts, though it does not explicitly contrast with specific alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description states capabilities ('can be filtered') but provides no guidance on when to use this tool versus alternatives like threads_get_posts or threads_get_user_threads. It does not advise on when to use keyword versus tag search, or when filtering by author is appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
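The filter ambiguity flagged above can be made concrete with a sketch of a search call. Parameter names follow the review's reading of the schema; whether `since`/`until` take ISO-8601 dates (as assumed here) or Unix timestamps is exactly the detail the description should confirm:

```python
from datetime import date, timedelta

# Illustrative threads_search_posts arguments; parameter names and the
# ISO-8601 date format are assumptions the description should confirm.
today = date(2024, 6, 30)
search_args = {
    "q": "meta connect",
    "search_type": "keyword",      # vs. topic tag -- assumption
    "media_type": "text",
    "since": (today - timedelta(days=7)).isoformat(),  # "2024-06-23"
    "until": today.isoformat(),                        # "2024-06-30"
    "limit": 25,
    # "after": "<cursor from previous response>",  # pagination, elided
}
```

One example like this in the description would pin down date formats, the keyword/tag distinction, and the cursor workflow in a few lines.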
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It mentions the return value ('media and comment details') but omits critical information such as pagination behavior, error handling for invalid comment IDs, rate limits, or required permissions/scopes for reading mentions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of two efficient sentences with zero redundancy: the first declares the action (getting mentioned comments) and the second declares the return value. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has only 2 parameters and no output schema, the description adequately covers the return structure. However, for a tool with zero annotations, it lacks important context such as pagination, error scenarios, or the specific Instagram Graph API endpoint behavior that would help an agent handle responses correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, documenting the comment_id as coming from a mention notification and fields as controlling the return payload. The description adds no specific parameter semantics beyond the schema, meeting the baseline expectation when schema coverage is complete.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves 'comments where the account was @mentioned' with the verb 'Get', distinguishing it from general comment retrieval tools like ig_get_comments. However, it does not explicitly clarify when to use this versus ig_get_comment (single comment lookup) or ig_get_tagged_media.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no explicit guidance on when to use this tool versus alternatives like ig_get_comments or ig_get_tagged_media. While the parameter schema implies a notification-based workflow ('Comment ID from a mention notification'), the description itself lacks when-to-use or prerequisite guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full disclosure burden. It fails to mention whether this retrieves the authenticated user's profile or supports lookups, what specific fields are returned, rate limits, or error conditions. Only hints at account type constraint ('Business/Creator').
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence that is immediately front-loaded with the core action and target. No redundant or filler text. Appropriate length for the simplicity of the tool interface.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks output schema and annotations, yet the description does not compensate by describing what profile information is returned (e.g., bio, follower count, profile picture). Does not clarify if this fetches the current authenticated account or supports arbitrary lookups. Insufficient for an API tool with 30+ siblings.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema contains zero parameters. Per the evaluation rules, an empty schema establishes a baseline score of 4. The description implies no configuration is needed (it likely retrieves the authenticated user's profile), which aligns with the empty schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the verb ('Get') and target resource ('Instagram Business/Creator account profile information'). The 'Business/Creator' qualifier helps distinguish from personal account tools, though it could explicitly differentiate from the sibling 'threads_get_profile' tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this versus 'threads_get_profile' or other account information tools. Does not mention prerequisites like requiring Business/Creator account authentication or permissions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. Fails to mention pagination behavior, reply nesting depth, rate limits, or authentication requirements. Only behavioral hint is '(conversation)' suggesting threaded structure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with zero waste. Action verb front-loaded ('Get replies'), parenthetical adds clarifying context without verbosity. Appropriately sized for the tool's scope.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Sufficient for a read-only retrieval tool with well-documented schema, but lacks output structure description (no output schema provided) and pagination guidance. Meets minimum viable threshold but leaves operational gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so descriptions for post_id, reverse, limit, and after are complete in the schema. The tool description implies the need for a specific post identifier but adds no semantic detail beyond the schema definitions. Baseline 3 appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action ('Get replies') and resource ('Threads post'), distinguishing from siblings like 'threads_get_post' (parent post) and 'ig_get_replies' (Instagram platform). The parenthetical '(conversation)' clarifies the threaded nature of the data.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use versus alternatives like 'threads_get_posts' or 'threads_get_user_threads'. No mention of pagination strategy despite the 'after' cursor parameter existing in the schema.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior: 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full responsibility for behavioral disclosure. While 'Publish' indicates a write operation, the description omits critical behavioral details: whether publishing is synchronous or asynchronous (relevant given the threads_get_container_status sibling), return value structure, error handling, or idempotency guarantees.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness: 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero redundancy. The constraint '2-20 images/videos' is front-loaded and essential for distinguishing this from other publish tools.
- Completeness: 2/5
Given this is a complex publishing operation with no output schema and no annotations, the description is insufficient. It fails to disclose what the tool returns (likely a container ID requiring status polling via threads_get_container_status), rate limits, or side effects like notifications generated.
- Parameters: 3/5
Schema description coverage is 100%, establishing a baseline of 3. The description reinforces the items array constraints (2-20) but adds no semantic nuance beyond the schema, such as explaining the relationship between quote_post_id and carousel items or alt_text best practices.
- Purpose: 4/5
The description clearly states the specific action ('Publish'), resource ('carousel post on Threads'), and unique constraint ('2-20 images/videos'). However, it does not explicitly differentiate from sibling tools like threads_publish_image or threads_publish_video, though the 2-20 constraint implies multi-media usage.
- Usage Guidelines: 3/5
The '2-20 images/videos' constraint implicitly signals when to use this tool versus single-media alternatives. However, there is no explicit guidance on when to choose this over threads_publish_image/video, nor any mention of prerequisites like authentication or media requirements.
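The one constraint the carousel description does state, 2-20 items, is also the easiest to enforce mechanically. A hypothetical pre-flight check an agent (or the server itself) could run before calling threads_publish_carousel:

```python
def validate_carousel_items(items):
    # Threads carousels require 2-20 images/videos per the tool description.
    # Hypothetical helper; the real tool's error behavior is undocumented.
    if not 2 <= len(items) <= 20:
        raise ValueError(f"carousel requires 2-20 media items, got {len(items)}")
    return items
```

Surfacing the failure mode in the description ("calls with fewer than 2 or more than 20 items are rejected") would let agents validate before spending a write call.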
- Behavior: 2/5
With no annotations provided, the description carries the full burden of behavioral disclosure. While 'unhide' implies reversibility, the description fails to explain visibility rules (who can see hidden comments), authentication requirements, or side effects of the operation.
- Conciseness: 5/5
The single-sentence description is perfectly sized and front-loaded. Every word serves a purpose: the dual verb ('Hide or unhide'), the resource ('comment'), and the scope ('on your post'). Zero redundancy.
- Completeness: 3/5
Given the simple 2-parameter schema and lack of output schema, the description is minimally viable. However, as a moderation tool without safety annotations, it should disclose more behavioral context (e.g., visibility impact, reversibility confirmation) to fully inform agent decision-making.
- Parameters: 3/5
Input schema has 100% description coverage with clear explanations for both parameters. The description adds minimal semantic value beyond the schema, merely framing the action contextually as operating on 'your post' without elaborating on comment_id format or validation rules.
- Purpose: 4/5
The description clearly states the dual action (hide/unhide) and resource (comment on your post). It implicitly distinguishes from sibling tools like ig_delete_comment (permanent removal) and ig_get_comment (reading) by emphasizing the hide/unhide toggle functionality.
- Usage Guidelines: 3/5
The description implies usage by mentioning both hide and unhide capabilities, suggesting this is for reversible content moderation. However, it lacks explicit guidance on when to choose hiding over deletion (ig_delete_comment) or whether hidden comments remain visible to the commenter.
- Behavior: 2/5
No annotations are provided, so the description carries the full burden of behavioral disclosure. It fails to indicate this is a write operation with side effects, mention required permissions (e.g., content publishing approval), rate limits, or what happens if the media_id is invalid.
- Conciseness: 5/5
The description is a single, efficient sentence with zero waste. It is front-loaded with the action and scope, earning its place without redundancy.
- Completeness: 3/5
Given the tool's low complexity (2 parameters, simple action) and 100% schema coverage, the description is minimally adequate. However, lacking annotations and an output schema, it misses the opportunity to clarify the return value or success indicators.
- Parameters: 3/5
With 100% schema description coverage, the schema already fully documents both parameters ('Media ID to comment on', 'Comment text'). The description aligns with these ('media post', 'comment') but adds no additional semantic context such as ID format, message length limits, or content restrictions.
- Purpose: 4/5
The description uses a specific verb ('Post') and resource ('comment') and clarifies scope with 'top-level' and 'media post'. The term 'top-level' effectively distinguishes this from the sibling tool 'ig_reply_to_comment', though it doesn't explicitly name the alternative.
- Usage Guidelines: 3/5
The phrase 'top-level comment' implies this is for commenting directly on media rather than replying to existing comments (which would use ig_reply_to_comment). However, there are no explicit when-to-use/when-not-to-use statements, prerequisites (e.g., needing the media_id first), or named alternatives.
- Behavior: 3/5
Discloses critical blocking behavior ('Waits for video processing') which is essential given no annotations indicate long-running operations. However, missing auth requirements, error handling behavior, and idempotency guarantees since annotations are absent.
- Conciseness: 5/5
Two sentences with zero waste: first defines scope, second discloses blocking behavior. Front-loaded with action verb. No redundant phrases.
- Completeness: 3/5
Minimal but viable for a publishing tool. Mentions processing wait time but omits return value structure (especially important given the missing output schema), success/failure indicators, or prerequisite validation that would help an agent handle the response correctly.
- Parameters: 3/5
Schema has 100% description coverage, so description appropriately relies on structured documentation rather than repeating parameter details. No additional semantic context (e.g., valid ranges for thumb_offset, URL requirements) added, warranting baseline score.
- Purpose: 4/5
States specific action ('Publish') and resource ('Reel'), with parenthetical clarification '(short video)'. Distinguishes from siblings like ig_publish_photo/story/carousel by naming the specific content type, though could explicitly clarify difference from ig_publish_video.
- Usage Guidelines: 2/5
No guidance on when to use this vs. ig_publish_video (standard video posts) or prerequisites like business account verification. No mention of alternative publishing flows or rate limit considerations.
- Behavior: 2/5
No annotations are provided, so the description carries full disclosure burden. While it states that a new token is returned, it fails to clarify whether the old token is invalidated, what error occurs if the token is already expired, rate limits, or whether the operation is idempotent.
- Conciseness: 5/5
Two sentences with zero redundancy. The action is front-loaded ('Refresh...'), and every word serves a purpose. Appropriate length for the tool's complexity.
- Completeness: 3/5
Given the lack of annotations and output schema, the description mentions the conceptual return value (new token), which helps. However, it omits error conditions, the fate of the old token, and output structure details that would help agents handle the response correctly.
- Parameters: 3/5
With 100% schema description coverage, the input parameter is fully documented in the schema itself. The description adds no additional semantic context about the parameter format or constraints, earning the baseline score for complete schema coverage.
- Purpose: 4/5
The description clearly states the action (refresh) and resource (long-lived token) with temporal scope ('before it expires'). However, it doesn't explicitly differentiate from the sibling 'meta_exchange_token', which could confuse agents about whether to exchange or refresh.
- Usage Guidelines: 3/5
It provides an implied usage condition ('before it expires'), but lacks explicit guidance on when NOT to use it (e.g., if already expired) or how to choose between this and 'meta_exchange_token'. No alternative workflows are mentioned.
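The "before it expires" condition is the only timing guidance the refresh description gives. A hypothetical helper showing the policy the description could encode, including the undocumented already-expired case the review flags (the one-week margin is an assumed value, not from the API docs):

```python
import time

REFRESH_MARGIN = 7 * 24 * 3600  # refresh within the final week; assumed value

def should_refresh(expires_at, now=None, margin=REFRESH_MARGIN):
    # Refresh a long-lived token when it is close to expiry, but not after
    # it has already expired (behavior the description leaves undocumented).
    now = time.time() if now is None else now
    if now >= expires_at:
        return False  # already expired: refresh may fail, re-authenticate instead
    return expires_at - now <= margin
```

A single sentence in the description distinguishing this from meta_exchange_token ("exchange converts a short-lived token; refresh extends a still-valid long-lived one") would close the Purpose and Usage Guidelines gaps together.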
- Behavior: 2/5
No annotations are provided, so the description carries full disclosure burden. It fails to explain side effects (what happens upon acceptance/decline), whether actions are reversible, or what the tool returns. The 'Added in December 2025' timestamp adds no behavioral clarity.
- Conciseness: 4/5
The description is appropriately brief and front-loaded with the functional statement. The second sentence (version date) adds minimal value but does not significantly detract from the overall conciseness.
- Completeness: 3/5
For a simple 2-parameter mutation tool with no output schema, the description covers the core operation adequately. However, it misses workflow context (linking to ig_get_collaboration_invites for invite discovery) that would help agents understand the full usage pattern.
- Parameters: 3/5
With 100% schema description coverage, the baseline is 3. The description adds no specific parameter guidance beyond what the schema already documents (invite_id and action enum).
- Purpose: 5/5
Description provides specific verbs ('Accept or decline') and clear resource ('collaboration invite'). It effectively distinguishes from sibling tool 'ig_get_collaboration_invites' by indicating this is a write/response action rather than a read operation.
- Usage Guidelines: 3/5
While the description implies usage context through the action verbs, it lacks explicit guidance on when to use this versus alternatives, and omits the prerequisite workflow step of obtaining the invite_id (likely via ig_get_collaboration_invites).
- Behavior: 3/5
With no annotations provided, the description carries the full burden. 'List' implies a read-only operation, but the description lacks details on return format, pagination, what 'current' means (active vs all), or required permissions.
- Conciseness: 5/5
Single sentence, front-loaded with the verb, and appropriately sized for a zero-parameter tool. No wasted words.
- Completeness: 3/5
Given the absence of an output schema and annotations, the description should ideally describe the return structure (e.g., subscription IDs, callback URLs). It provides the minimum viable context for a simple list operation but leaves significant gaps regarding the response format.
- Parameters: 4/5
The input schema has zero parameters, which per guidelines sets a baseline of 4. The description does not need to compensate for missing parameter documentation.
- Purpose: 4/5
The description provides a clear verb ('List') and resource ('webhook subscriptions') and scopes it to the 'Meta App', distinguishing it from Instagram/Threads siblings. However, it does not explicitly differentiate from the sibling 'meta_subscribe_webhook' (create vs list).
- Usage Guidelines: 2/5
No guidance provided on when to use this tool versus alternatives like 'meta_subscribe_webhook'. Missing prerequisites (e.g., app ID context) or conditions for use.
- Behavior: 3/5
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'verification status' as a specific returned field, hinting at the data structure. However, it fails to confirm this is a safe read-only operation, describe the response format, mention rate limits, or clarify that it returns the authenticated user's profile given the empty parameter schema.
- Conciseness: 5/5
The description is a single, efficient sentence of nine words. It is front-loaded with the action verb 'Get' and every word contributes meaning—specifying the platform (Threads), resource (user profile), and a key data point (verification status). No redundancy or filler is present.
- Completeness: 3/5
For a zero-parameter read operation with no output schema, the description provides the minimum viable context by mentioning one specific returned field (verification status). However, without annotations or output schema, it should ideally specify that it returns the authenticated user's profile object and note any API constraints (e.g., rate limits) to be considered complete.
- Parameters: 4/5
The input schema contains zero parameters (empty object). According to calibration rules, 0 parameters warrants a baseline score of 4. The description does not need to compensate for missing parameter documentation, though it could have explicitly noted that the tool uses the authenticated user's context.
- Purpose: 4/5
The description clearly states the action ('Get') and resource ('Threads user profile information'), distinguishing it from Instagram siblings (ig_*) by specifying 'Threads' and from content/analytics siblings (threads_get_user_threads, threads_get_user_insights) by focusing on 'profile'. The addition of 'including verification status' provides specific scope. However, it doesn't clarify whether this retrieves the authenticated user's profile or supports arbitrary user lookups.
- Usage Guidelines: 2/5
The description provides no guidance on when to use this tool versus alternatives like ig_get_profile (Instagram vs Threads platforms) or threads_get_user_insights (profile metadata vs analytics). It omits prerequisites such as authentication requirements or token scopes needed to access profile data.
- Behavior: 3/5
With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully documents authentication requirements (permission scope and API type) but omits other behavioral traits like pagination limits, rate limiting, or error conditions that would help an agent predict execution outcomes.
- Conciseness: 5/5
The description consists of exactly two sentences with zero redundancy: the first declares the operation, the second states prerequisites. Every word earns its place in this highly efficient structure.
- Completeness: 3/5
Given three simple parameters with complete schema documentation, the description adequately covers authentication requirements. However, lacking an output schema, it misses an opportunity to describe the conversation list structure or pagination behavior beyond the cursor parameter, leaving gaps in contextual understanding.
- Parameters: 3/5
Schema description coverage is 100%, with all three parameters (folder, limit, after) fully documented in the schema. The description provides minimal additional parameter context beyond implying a list operation, meeting the baseline expectation for high-coverage schemas.
- Purpose: 4/5
The description clearly states the tool retrieves 'Instagram DM conversations list' with specific verb (Get) and resource (DM conversations). It implicitly distinguishes from sibling 'ig_get_messages' by specifying 'conversations' rather than individual messages, though it doesn't explicitly highlight this distinction.
- Usage Guidelines: 3/5
The description provides prerequisite requirements (permission 'instagram_manage_messages' and Instagram Messaging API) but lacks explicit guidance on when to use this versus sibling alternatives like 'ig_get_messages' or 'ig_send_message'. It states when you can use it but not when you should prefer it over alternatives.
- Behavior: 3/5
No annotations provided, so description carries full burden. Mentions 'processing status' implying asynchronous behavior, but fails to disclose possible status values (e.g., FINISHED, EXPIRED), polling recommendations, or container expiration behavior.
- Conciseness: 5/5
Single front-loaded sentence of nine words with zero redundancy. Every word earns its place by conveying the exact operation and target resource.
- Completeness: 3/5
Adequate for a single-parameter tool, but lacking given no output schema exists. Should mention expected status values or the async polling pattern to be complete, though the core operation is understandable.
- Parameters: 3/5
Schema has 100% description coverage with 'Container ID to check'. Description adds context that this refers to a 'media container', but provides no additional format details, examples, or constraints beyond the schema.
- Purpose: 5/5
Clear specific verb ('Check') and resource ('processing status of a Threads media container'). Explicitly mentions 'Threads' to distinguish from sibling tool ig_get_container_status, and 'container' distinguishes from threads_get_post.
- Usage Guidelines: 2/5
Provides no guidance on when to use this tool (e.g., after creating a container with publishing tools) versus alternatives, nor does it mention the typical polling workflow for async processing.
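The polling workflow the review says is missing can be conveyed in two sentences plus a loop. A hypothetical sketch, where `check_status` stands in for a real threads_get_container_status call; FINISHED and EXPIRED come from the review's own examples of status values, and ERROR is an assumed additional terminal state:

```python
import time

def wait_for_container(check_status, container_id, timeout=60.0, interval=1.0):
    # Poll a Threads media container until it reaches a terminal state.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = check_status(container_id)
        if status in ("FINISHED", "ERROR", "EXPIRED"):
            return status
        time.sleep(interval)  # back off before polling again
    raise TimeoutError(f"container {container_id} did not finish in {timeout}s")

# Demo stub: two in-progress polls, then success.
_statuses = iter(["IN_PROGRESS", "IN_PROGRESS", "FINISHED"])
result = wait_for_container(lambda cid: next(_statuses), "container-1",
                            timeout=5.0, interval=0)
```

Naming the publishing tools that produce the container ID, and listing the terminal status values, would lift both the Usage Guidelines and Behavior scores here.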
- Behavior: 2/5
No annotations are provided, so the description must carry the full burden of behavioral disclosure. It fails to mention whether this creates a publishing container (suggested by existence of threads_get_container_status sibling), whether the operation is synchronous or asynchronous, or what success/failure looks like. Critical gaps for a mutation tool.
- Conciseness: 5/5
The description consists of two efficient sentences: the first establishes the core purpose, the second enumerates key capabilities. No filler text; every word earns its place. Appropriately sized for the complexity.
- Completeness: 3/5
Given the 100% schema coverage, basic parameter needs are met. However, for a publishing tool with no output schema and no annotations, the description lacks important context about return values (post ID vs container ID) and the asynchronous nature implied by sibling tools. Adequate but incomplete.
- Parameters: 3/5
With 100% schema description coverage, the schema adequately documents all parameters. The description lists supported features ('topic tag, quote post, alt text, and spoiler flag') but adds no semantic meaning beyond what the schema already provides. Baseline 3 is appropriate given schema completeness.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Publish') and resource type ('image post') for the Threads platform. It effectively distinguishes from siblings like threads_publish_text, threads_publish_video, and threads_publish_carousel by specifying 'image post'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines3/5Does the description explain when to use this tool, when not to, or what alternatives exist?
While the tool's purpose is clear from the name and description, there is no explicit guidance on when to use this versus threads_publish_carousel for single images, or prerequisites like checking publishing limits (threads_get_publishing_limit exists as a sibling). Usage is implied but not stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
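The asynchronous container workflow the review flags (publish creates a container; the threads_get_container_status sibling reports processing) can be sketched from the agent's side. Everything here is illustrative: `call_tool` is a stand-in for an MCP client, and the argument and response shapes are assumptions, not taken from the server's actual schemas.

```python
import time

def publish_image_and_wait(call_tool, image_url, text, attempts=10, delay=3):
    """Hypothetical agent-side flow: publish an image post, then poll the
    container-status sibling until processing completes. Tool names come
    from the review; payload shapes are assumptions."""
    result = call_tool("threads_publish_image", {"image_url": image_url, "text": text})
    container_id = result["id"]  # assumed: publish returns a container ID
    for _ in range(attempts):
        status = call_tool("threads_get_container_status", {"container_id": container_id})
        state = status.get("status")
        if state == "FINISHED":
            return status  # processing done; the post is live
        if state == "ERROR":
            raise RuntimeError(f"container processing failed: {status}")
        time.sleep(delay)  # still in progress; back off and retry
    raise TimeoutError("container never finished processing")
```

A description that named this loop explicitly (publish, then poll until FINISHED) would close the Behavior and Usage Guidelines gaps scored above.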
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fails to disclose critical behavioral traits: rate limits, that posts are public/create persistent content, authentication requirements, or immutability after publishing. It only lists functional capabilities.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Two well-structured sentences with zero waste: front-loaded with the core action, followed by optional capability enumeration. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 10-parameter publishing tool with no output schema or annotations, the description covers basic functionality but omits operational context like error scenarios, return values, or the existence of publishing limits that should be checked beforehand.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description summarizes optional features (link, poll, GIF, etc.) but adds no semantic depth beyond the schema's existing documentation of parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Publish') and resource ('text-only post on Threads'), effectively distinguishing it from sibling tools like threads_publish_image and threads_publish_video by emphasizing 'text-only'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
While 'text-only' implies when to use this versus media-focused siblings, the description lacks explicit guidance on when to use quote features versus threads_reply, or prerequisites like checking publishing limits via threads_get_publishing_limit.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
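The publishing-limit prerequisite mentioned above suggests a pre-flight check before posting. A minimal sketch of that flow, assuming a generic `call_tool` MCP client; the quota field names (`quota_usage`, `quota_total`) and the 250/day default are assumptions, not taken from the server's schemas:

```python
def publish_text_if_allowed(call_tool, text):
    """Hypothetical pre-flight check: consult the publishing-limit
    sibling named in the review before publishing a text post."""
    limit = call_tool("threads_get_publishing_limit", {})
    used = limit.get("quota_usage", 0)
    total = limit.get("quota_total", 250)  # assumed daily cap
    if used >= total:
        # Quota exhausted: skip the publish call entirely
        return {"published": False, "reason": "daily publishing quota exhausted"}
    post = call_tool("threads_publish_text", {"text": text})
    return {"published": True, "post": post}
```

Spelling out this "check limits, then publish" ordering in the tool description is exactly the kind of explicit guidance the Usage Guidelines dimension rewards.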
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully clarifies that results are ranked by popularity ('top (most popular)'), but omits other operational details such as pagination behavior, rate limits, authentication requirements, or data freshness that would be helpful for a read-only API endpoint.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words. It is appropriately front-loaded with the action and resource type. However, given the lack of annotations and output schema, it borders on too terse; one additional sentence explaining return format or pagination would improve utility without sacrificing clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers the core function adequately but is incomplete regarding return values (no output schema exists to compensate). It does not describe what constitutes 'media' objects, what fields are returned, or error conditions. For a tool with no annotations and no output schema, additional descriptive context would be necessary for full completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage (hashtag_id, limit, after are all documented). The description does not add additional semantic context about parameter formats, constraints, or relationships, but since the schema is comprehensive, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Get'), identifies the resource ('media'), and clarifies the scope with 'top (most popular)' and 'tagged with a specific hashtag'. The parenthetical '(most popular)' effectively distinguishes this tool from the sibling ig_get_hashtag_recent.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by specifying 'top (most popular)' media, suggesting it should be used when popularity ranking is desired versus recency. However, it does not explicitly name the sibling alternative (ig_get_hashtag_recent) or provide explicit when-to-use/when-not-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full burden. It mentions 'public info', indicating the data visibility level, but omits an explicit read-only classification, rate limits, or the requirement that the target must be a Business/Creator account.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Single 11-word sentence with the action front-loaded ('Look up'). Zero redundancy or filler content. Efficiently conveys core functionality without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 2-parameter lookup tool with a simple flat schema. Given no output schema or annotations, it could benefit from explicit mention of the Business/Creator account requirement and error behavior (e.g., private accounts), but it covers the basic contract sufficiently.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing baseline 3. The description reinforces 'by username', matching the required parameter, but adds no further semantic context like validation rules or field format details beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses the specific verb 'Look up' with the clear resource 'another Instagram Business/Creator account's public info' and distinguishes itself from the sibling ig_get_profile by emphasizing 'another' (third-party) lookup capability.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides implied usage context by specifying 'another' account (suggesting use for external lookups vs. own profile), but lacks explicit when-to-use guidance or named alternatives like ig_get_profile for authenticated user data.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. It successfully communicates API versioning behavior (v22.0 deprecation), but omits other behavioral traits like rate limits, authentication requirements, or return data structure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste. Front-loaded with the core action, followed immediately by a critical deprecation note. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only insights tool with 100% schema coverage, the description adequately covers the primary complexity (deprecated metrics). No output schema exists, so return value explanation is not required. Could improve by mentioning permission requirements.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
While schema has 100% coverage, the description adds critical semantic context about valid metric values. It explains that certain metrics were deprecated and suggests specific replacements ('views', 'reach', etc.), providing value beyond the schema's simple list of valid options.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action ('Get') and resource ('Instagram account insights') clearly. However, it does not explicitly differentiate from sibling tool 'ig_get_media_insights' which retrieves media-level insights rather than account-level.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides implied usage guidance through the deprecation warning, instructing users to avoid deprecated metrics like 'impressions' in favor of 'views' and 'reach'. Lacks explicit guidance on when to choose this tool over 'ig_get_media_insights' or 'threads_get_user_insights'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses the two-step internal process and HTTPS URL requirement, but fails to mention mutation side effects (creates public post), return value format (container ID?), reversibility, or error handling behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, zero waste. Purpose front-loaded ('Publish a photo'), followed by implementation detail ('Two-step process'), then critical requirement ('Requires image_url'). Efficient structure.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 5-parameter publishing tool with complete schema documentation, but gaps remain due to missing annotations and output schema. Description should disclose what the tool returns (post ID? container status?) and confirm the destructive/public nature of the mutation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing baseline 3. Description mentions image_url constraints but this merely echoes schema content ('Public HTTPS URL'). No additional semantics provided for optional parameters (caption, location_id, user_tags, alt_text) beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verb ('Publish') + resource ('photo to Instagram'). Explicitly names 'photo' which distinguishes from siblings like ig_publish_video, ig_publish_reel, ig_publish_story, and threads_publish_image.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage through naming and mentions internal 'two-step process' (container creation then publishing), but lacks explicit when-to-use guidance versus sibling publishing tools (e.g., when to choose photo vs carousel) or prerequisites like account permissions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
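The 'two-step process' the description discloses mirrors the Instagram Graph API's container flow: create a media container, then publish it by creation_id. A minimal sketch under those assumptions; the API version in the URL is a guess, and field names should be verified against Meta's current documentation before use.

```python
import requests

GRAPH = "https://graph.facebook.com/v21.0"  # version path is an assumption

def publish_photo(ig_user_id, access_token, image_url, caption=""):
    """Two-step container flow: create a media container, then publish
    it. Field names follow the Instagram Graph API's documented pattern;
    treat this as a sketch, not the server's actual implementation."""
    # Step 1: create the container (image_url must be a public HTTPS URL)
    r = requests.post(f"{GRAPH}/{ig_user_id}/media",
                      data={"image_url": image_url, "caption": caption,
                            "access_token": access_token})
    r.raise_for_status()
    container_id = r.json()["id"]
    # Step 2: publish the container
    r = requests.post(f"{GRAPH}/{ig_user_id}/media_publish",
                      data={"creation_id": container_id,
                            "access_token": access_token})
    r.raise_for_status()
    return r.json()["id"]  # ID of the published media
```

Because the tool wraps both steps, its description could earn a higher Behavior score simply by stating which of the two IDs (container or published media) it returns.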
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Successfully discloses the ephemeral 24-hour lifecycle trait, but omits operational details like whether media_url requires pre-upload to IG servers, rate limits, or async publishing status checks.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste. Front-loaded with the core action ('Publish a Story'), followed immediately by format parenthetical, then the critical behavioral constraint (24-hour expiry). Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 2-parameter publishing tool but missing prerequisites typical for IG APIs (e.g., media container requirements, Business Account eligibility) and error conditions. No output schema exists, and the description doesn't address what success or failure looks like.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage, establishing baseline 3. Description reinforces the image/video options but adds no Story-specific constraints (e.g., 9:16 aspect ratio, video duration limits) or format validation details beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action (Publish) + resource (Story) + format constraints (image or video). The 24-hour disappearance trait clearly distinguishes this ephemeral format from sibling publishing tools like ig_publish_photo or ig_publish_reel.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage context through the 24-hour disappearance note (ephemeral vs permanent content), but lacks explicit comparison to sibling publishing tools or guidance on when to choose Stories over feed posts, reels, or carousels.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds valuable behavioral detail that it 'waits for video processing before publishing' (critical given sibling ig_get_container_status exists), but omits auth requirements, error scenarios, and side effects since no annotations are provided.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste: first establishes purpose, second discloses key blocking behavior. Efficiently front-loaded with no filler content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 4-parameter publishing tool despite missing output schema details, though it could strengthen context by mentioning Instagram Business/Creator account requirements or distinguishing video specs from Reels.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage; description correctly omits redundant parameter details since schema fully documents video_url, caption, thumb_offset, and location_id.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Publish' with clear resource 'video' and scope 'Instagram feed' effectively distinguishes from siblings like ig_publish_reel, ig_publish_story, ig_publish_photo, and threads_publish_video.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage context through 'feed' specification but lacks explicit when-to-use guidance comparing to Reels/Stories alternatives or prerequisites like required Facebook Page connection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full disclosure burden. It provides the validity period (~60 days) and credential requirements, but omits the safety profile: whether the exchange invalidates the short-lived token, whether it is idempotent, and any rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste. Front-loaded with the core operation and validity period, followed immediately by prerequisites. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriate for a single-parameter utility function. Covers input requirements and operational prerequisites. Lacking output description (what the long-lived token looks like), but given the simple input schema and standard OAuth pattern, this is acceptable though not ideal.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
While schema has 100% coverage for the single parameter, the description adds crucial context about required external credentials (META_APP_ID and META_APP_SECRET) not present in the input schema. This compensates for schema limitations regarding authentication context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the core action (exchange) and resources (short-lived to long-lived token) with specific validity context (~60 days). Implicitly distinguishes from sibling `meta_refresh_token` by specifying 'short-lived' input, though explicit comparison would strengthen this.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides critical prerequisites (META_APP_ID and META_APP_SECRET requirements) but lacks explicit guidance on when to use this versus `meta_refresh_token` or initial authentication flows. No mention of error conditions or retry behavior.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
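The exchange described above follows Meta's standard `fb_exchange_token` OAuth grant. A hedged sketch of the underlying call; the API version in the URL is an assumption, and the endpoint details should be checked against Meta's access-token documentation:

```python
import requests

def exchange_for_long_lived(short_token, app_id, app_secret):
    """Exchange a short-lived user token for a long-lived one (~60 days),
    using Meta's fb_exchange_token grant. app_id/app_secret correspond to
    the META_APP_ID and META_APP_SECRET credentials the review cites."""
    r = requests.get("https://graph.facebook.com/v21.0/oauth/access_token",
                     params={"grant_type": "fb_exchange_token",
                             "client_id": app_id,
                             "client_secret": app_secret,
                             "fb_exchange_token": short_token})
    r.raise_for_status()
    payload = r.json()  # typically: access_token, token_type, expires_in
    return payload["access_token"]
```

Noting in the tool description whether the short-lived token remains valid after the exchange would close the main Behavior gap identified above.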
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden and successfully communicates the irreversible nature of the deletion. However, it omits other critical behavioral details such as whether this cascades to delete replies, required authentication/ownership levels, or rate limiting concerns typical for Instagram API write operations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste: the first establishes the operation and scope, the second provides the critical irreversibility warning. Information is front-loaded with the action verb immediately clarifying intent. No redundant or filler content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter destructive operation without output schema, the description is appropriately complete. The irreversibility warning adequately covers the primary risk. Minor gaps remain regarding cascade behavior (replies) and differentiation from ig_hide_comment, but these do not prevent correct tool invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (comment_id is fully described in the schema as 'Comment ID to delete'), establishing a baseline score of 3. The description does not add semantic details beyond the schema, such as ID format expectations or where to obtain valid comment IDs (e.g., from ig_get_comments).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the specific action (delete), resource (comment), and scope (from your media post). It effectively distinguishes from siblings like ig_hide_comment (hide vs delete permanence) and ig_delete_media (comment vs media deletion) through precise verb choice and object specification.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The irreversibility warning provides implicit guidance about the destructive nature, but the description lacks explicit comparison to ig_hide_comment (which preserves the comment for moderators) or prerequisites like media ownership requirements. No explicit when-not-to-use guidance is provided despite having a close sibling alternative.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It adds valuable context about photo tags versus @mentions, but fails to disclose safety characteristics (read-only status), rate limits, authentication requirements, or pagination behavior beyond the schema field names.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the primary action. The parenthetical clarification is precisely placed and every word earns its place. Zero waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simplicity of the tool (2 optional parameters, straightforward listing operation) and the absence of an output schema, the description adequately covers the essential domain context (photo tags vs mentions). It appropriately omits return value details since no output schema is provided.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage (both 'limit' and 'after' are documented in the schema), the baseline is 3. The description does not add semantic details about valid limit ranges, cursor format, or parameter interaction beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Get') and resource ('media where the account is tagged'). The parenthetical '(photo tags, not @mentions)' effectively distinguishes this tool from sibling ig_get_mentioned_comments and clarifies the exact scope of the operation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The phrase 'not @mentions' provides implicit guidance on when NOT to use this tool (implying @mentions are handled elsewhere), but it does not explicitly name the alternative tool (ig_get_mentioned_comments) or state prerequisites for use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
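The `limit`/`after` parameters discussed above imply standard cursor pagination. A sketch of how an agent might drain all pages; the tool name comes from the review, but the response shape (`data` plus `paging.cursors.after`, following the Graph API convention) is an assumption here:

```python
def fetch_all_tagged(call_tool, page_size=25):
    """Hypothetical cursor-pagination loop over a tool that accepts
    `limit` and an opaque `after` cursor, as the schema described in
    the review suggests."""
    items, after = [], None
    while True:
        args = {"limit": page_size}
        if after:
            args["after"] = after  # resume from the previous page's cursor
        page = call_tool("ig_get_tagged_media", args)
        items.extend(page.get("data", []))
        after = page.get("paging", {}).get("cursors", {}).get("after")
        if not after:  # no cursor means the last page was reached
            return items
```

Documenting the cursor's source ("pass the `after` value from the previous response") is the kind of parameter semantics the baseline-3 score says is missing.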
- Behavior3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully identifies what data points are retrieved (validity, expiration, scopes, user) but omits whether this is a safe read-only operation, what response format to expect, or how it handles malformed tokens versus expired ones.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, 12-word sentence that is front-loaded with the action verb. Every word earns its place by specifying either the operation, target resource, or the specific attributes inspected. Zero redundancy or filler content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (single string parameter) and high schema coverage, the description is functionally complete. It compensates for the missing output schema by listing the inspection targets. A minor gap remains in not describing the return structure or error cases.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage ('Access token to inspect'), establishing a baseline of 3. The description mentions 'access token' but adds no format constraints (e.g., JWT vs. opaque), validation rules, or examples beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Debug/inspect') with a clear resource ('access token') and enumerates exactly what aspects are validated: 'validity, expiration, scopes and associated user.' This effectively distinguishes the tool from its 40+ siblings, which are predominantly content management operations (posting, commenting) rather than authentication utilities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the description implies usage through the 'Debug/inspect' framing, it lacks explicit guidance on when to prefer this over siblings like meta_exchange_token or meta_refresh_token, or prerequisites such as needing a token to inspect. The usage is intuitive but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
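The "malformed versus expired" gap flagged above is easy to illustrate. `debug_token` is the real Graph API endpoint for inspecting a token; the classification helper below is our own sketch of the distinction the description leaves undocumented, with field semantics assumed from the public Graph API:

```python
import time

def debug_token_request(input_token, app_token):
    """Build the Graph API debug_token call; note it needs a *second*
    (app or admin) token to authorize the inspection."""
    return ("https://graph.facebook.com/v21.0/debug_token",
            {"input_token": input_token, "access_token": app_token})

def classify_token(response, now=None):
    """Collapse a debug_token payload into 'valid' / 'expired' / 'invalid'."""
    data = response.get("data", {})
    now = time.time() if now is None else now
    expires = data.get("expires_at", 0)  # unix seconds; 0 means never expires
    if expires and expires <= now:
        return "expired"
    if not data.get("is_valid", False):
        return "invalid"  # malformed or revoked; details in data["error"]["message"]
    return "valid"
```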
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full disclosure burden. It successfully lists the available metric types (views, likes, etc.), but omits other behavioral traits like whether data is real-time or cached, rate limits, or authentication requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the action and resource. Every element serves a purpose: the verb ('Get'), the resource ('insights/analytics'), the scope ('specific Threads post'), and the exemplar metrics. Zero redundancy or waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low complexity (2 simple parameters, no nested objects) and lack of output schema, the description is appropriately complete. It identifies the return values implicitly through the metric list, though explicitly stating the return structure would elevate this to a 5.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already fully documents both parameters. The description reinforces the metric options by listing them parenthetically, but does not add syntax details, format constraints, or semantic context beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action (Get insights/analytics), resource (Threads post), and distinguishes from siblings like 'threads_get_user_insights' by emphasizing 'post' level granularity. The parenthetical list of metrics further clarifies the specific data returned.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the description implies usage through specificity ('for a specific Threads post'), it lacks explicit guidance on when to choose this over 'threads_get_user_insights' or 'threads_get_posts'. No when-not-to-use or alternative recommendations are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
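A sketch of the post-level call the description implies. The endpoint path follows the public Threads API; the metric whitelist is inferred from the description's examples and may not match the live API exactly:

```python
# Post-level metric names, inferred from the description's examples; the
# live API's accepted set may differ.
POST_METRICS = {"views", "likes", "replies", "reposts", "quotes", "shares"}

def post_insights_request(post_id, metrics):
    """Build the per-post insights call, rejecting unknown metrics up front
    rather than spending an API round-trip on a 400 response."""
    unknown = set(metrics) - POST_METRICS
    if unknown:
        raise ValueError(f"unsupported metric(s): {sorted(unknown)}")
    return (f"https://graph.threads.net/v1.0/{post_id}/insights",
            {"metric": ",".join(sorted(metrics))})
```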
- Behavior 4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It successfully discloses the critical behavioral trait that hiding is cosmetic (replies remain accessible via direct link) rather than a security deletion. It lacks disclosure of reversibility, auth requirements, or error conditions, but covers the most important behavioral quirk.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences totaling 13 words. The first states the action; the second explains the critical behavioral consequence. No filler or redundant information. Perfectly front-loaded and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter mutation tool without output schema, the description is nearly complete. It explains the domain-specific meaning of 'hide' on Threads. It could improve by mentioning reversibility (via threads_unhide_reply) or success indicators, but the core behavior is documented.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (reply_id is documented), establishing a baseline of 3. The description does not add additional semantic context about the parameter format, how to obtain the ID, or validation rules beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the action (hide) and target (reply on Threads post). The second sentence distinguishes this from delete operations by explaining visibility semantics. However, it does not explicitly differentiate from sibling 'threads_unhide_reply' or state when to prefer hiding over deleting.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The explanation that 'hidden replies are still visible if directly accessed' provides implicit guidance about the tool's moderation use case (soft moderation vs removal). However, there is no explicit 'when to use this' guidance, prerequisites, or mention of the 'threads_unhide_reply' alternative for reversing the action.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully indicates ownership requirements ('your Threads post') and the restorative nature of the operation, but omits details about permission scopes, error cases (e.g., attempting to unhide an already-visible reply), and idempotency guarantees.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence of seven words with no redundancy. It front-loads the action verb and immediately qualifies the scope and target, earning its place without waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (single parameter, 100% schema coverage, boolean-like toggle behavior), the description adequately covers the essential contract. It could be improved by mentioning error conditions or confirming idempotency, but it is sufficient for correct invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (reply_id is documented as 'Reply ID to unhide'), so the baseline applies. The description does not add additional semantic context about the parameter format or constraints beyond what the schema provides, which is appropriate given the high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Unhide') with a clear resource ('previously hidden reply on your Threads post'). It effectively distinguishes from sibling tools like threads_hide_reply (inverse operation) and threads_delete_post (permanent removal) by specifying the restorative nature of the action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The phrase 'previously hidden reply' implies this tool reverses a prior hide operation, providing implicit context for when to use it. However, it lacks explicit guidance on when to choose this over deletion or other moderation actions, and does not mention the sibling threads_hide_reply tool by name.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
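The hide/unhide pair reduces to one endpoint with a boolean, which makes the reversibility relationship the two descriptions leave implicit easy to see. A sketch; the `manage_reply` path follows the public Threads API and should be treated as an assumption:

```python
def manage_reply_request(reply_id, hide):
    """One endpoint, one boolean: hide=True hides the reply, hide=False
    restores it. This symmetry is why the operation is reversible,
    unlike a delete."""
    return (f"https://graph.threads.net/v1.0/{reply_id}/manage_reply",
            {"hide": "true" if hide else "false"})
```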
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Adds critical auth requirements (APP_ID/SECRET) not in schema. However, fails to disclose that this creates a persistent subscription (not one-time), whether it's idempotent, or what happens if the callback_url is unreachable.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, zero waste. First sentence establishes purpose and scope; second sentence states prerequisites. Front-loaded with essential information, no redundant or filler text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 4-parameter webhook tool with complete input schema. However, it lacks an output description (no output schema exists) and doesn't indicate whether it returns a subscription ID, a success boolean, or error details for invalid callback URLs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage with good descriptions. Description adds valuable concrete examples ('instagram', 'page') for the 'object' parameter that help agents map intent to enum values. Enhances understanding beyond schema baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Subscribe' + resource 'webhook notifications' + scope 'for an object'. Examples ('instagram', 'page') clarify target resources. Clearly distinguishes from sibling 'meta_get_webhook_subscriptions' (read vs write operation).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
States prerequisite credentials (META_APP_ID and META_APP_SECRET) needed for invocation. However, lacks explicit guidance on when to use this vs 'meta_get_webhook_subscriptions' or when to update existing subscriptions vs create new ones.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
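The META_APP_ID/META_APP_SECRET prerequisite exists because app-level subscriptions authenticate with an app access token, which is simply the two values joined by a pipe. A hedged sketch of the request shape, with path and version assumed from the public Graph API:

```python
def subscribe_request(app_id, app_secret, obj, callback_url, fields, verify_token):
    """Build the app-level webhook subscription call. The app access token
    is just '<app_id>|<app_secret>', which is why both credentials are
    listed as prerequisites."""
    return (f"https://graph.facebook.com/v21.0/{app_id}/subscriptions",
            {"object": obj,                  # e.g. "instagram" or "page"
             "callback_url": callback_url,   # must be publicly reachable HTTPS
             "fields": ",".join(fields),     # e.g. ["comments", "mentions"]
             "verify_token": verify_token,   # echoed in Meta's GET verification ping
             "access_token": f"{app_id}|{app_secret}"})
```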
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses the aliasing behavior and 'user context' scoping, but lacks details on pagination behavior, rate limiting, authentication requirements, or error conditions that would help an agent handle edge cases.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, information-dense sentence that front-loads the core action and immediately clarifies the relationship to the sibling tool. Every word serves a purpose with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple retrieval purpose and complete input schema, the description is adequate but lacks mention of the return value structure (no output schema exists to compensate). For a tool with no annotations, additional context about the response format would improve agent utility.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage for all 4 parameters (limit, since, until, after). The description does not add semantic context beyond the schema (e.g., explaining that 'since'/'until' filter publication dates or that 'after' handles pagination), warranting the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Get all threads published by the user') and explicitly distinguishes this tool from its sibling 'threads_get_posts' by noting it is an alias with 'user context', making the scope distinction immediately clear.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description identifies the sibling relationship ('alias for threads_get_posts'), implying this tool adds user-specific scoping. However, it stops short of explicitly stating when to prefer this over 'threads_get_posts' or other alternatives like 'threads_search_posts'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
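A sketch of how the four documented parameters combine in practice. The `since`/`until` and cursor semantics noted above as missing from the description are assumptions based on Graph API conventions, as is the `/me/threads` path:

```python
def user_threads_request(since=None, until=None, limit=25, after=None):
    """List the authenticated user's threads. since/until bound the publish
    date (unix timestamp or ISO date) and 'after' is the pagination cursor."""
    params = {"limit": limit}
    if since is not None:
        params["since"] = since
    if until is not None:
        params["until"] = until
    if after is not None:
        params["after"] = after
    return "https://graph.threads.net/v1.0/me/threads", params
```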
- Behavior 4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Explicitly discloses irreversibility ('This action is irreversible') and authentication requirements, providing critical safety context for a destructive operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, front-loaded with action. Every sentence earns its place: defines scope, warns of irreversibility, states permission requirement. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a single-parameter destructive tool with no output schema. Covers critical aspects (scope, irreversibility, permissions) though could mention error cases (e.g., media not found).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with media_id fully described. Description adds semantic context about what types of media can be deleted but does not elaborate on parameter format or validation beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Delete' with resource 'Instagram media post' and explicit scope enumeration '(posts, carousels, reels, stories)' distinguishes from sibling ig_delete_comment (comments) and threads_delete_post (Threads content).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage through scope definition but lacks explicit when-to-use guidance vs alternatives. Notes permission requirement ('Requires instagram_manage_contents permission') which aids usage but doesn't constitute full decision guidelines.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
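For destructive tools like this one, a common client-side pattern is to gate the call behind explicit confirmation, so an agent cannot trigger an irreversible delete by accident. A sketch only: the `delete_media` name, the injected `fetch` callable, and the path shape are hypothetical, not this server's API:

```python
def delete_media(media_id, fetch, confirmed=False):
    """Guard an irreversible call behind an explicit flag; a client-side
    safety pattern for tools whose descriptions disclose destructive behavior."""
    if not confirmed:
        raise PermissionError("refusing irreversible delete without confirmed=True")
    # DELETE on the media node; path shape is an assumption.
    return fetch("DELETE", f"https://graph.facebook.com/v21.0/{media_id}")
```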
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. Adds critical deprecation warning (v22.0) and valid metric enumeration. Missing: error handling, rate limits, auth requirements, or data retention details. Does not contradict annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with purpose front-loaded, followed by high-value deprecation warning, then metric enumeration. Slightly redundant with schema's metric description, but the standalone list improves scannability. No filler text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriate for 2-parameter tool with complete schema coverage. Deprecation warning addresses API version complexity. Gap: no output schema exists, and description doesn't clarify return structure (object, array, or scalar values), which would help the agent handle the response.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage, establishing baseline 3. Description adds value by warning about deprecated parameter values ('impressions', 'video_views') not mentioned in schema, and reinforces valid metric options. Deprecation context changes how agent should construct the 'metric' parameter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verb ('Get') + resource ('insights/analytics') + scope ('specific media post'). Effectively distinguishes from sibling 'ig_get_account_insights' (account-level) and Threads tools by specifying 'media post' granularity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides specific guidance on deprecated metrics ('impressions'/'video_views') and directs to 'views' instead. Includes REEL-specific metric guidance (likes, comments, reposts). Lacks explicit contrast with account-level insights tool but implies scope through specificity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
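The deprecation guidance above translates directly into a metric-normalization step an agent (or the server itself) could apply before building the insights call. A sketch; the mapping reflects only what the reviewed description states about v22.0:

```python
# v22.0 deprecations called out in the description: both old metrics
# collapse into 'views'.
DEPRECATED = {"impressions": "views", "video_views": "views"}

def normalize_metrics(metrics):
    """Rewrite deprecated metric names before building the insights call,
    de-duplicating when several old names map to the same replacement."""
    seen, out = set(), []
    for m in metrics:
        m = DEPRECATED.get(m, m)
        if m not in seen:
            seen.add(m)
            out.append(m)
    return out
```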
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully lists expected return fields (name, category, namespace), but omits operational details such as authentication requirements, rate limiting, or whether this queries a cache vs live API.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence with no extraneous text. Every word serves a purpose: the verb specifies the action, 'Meta App' scopes the resource, and the parenthetical lists the specific data fields returned.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (no parameters) and lack of output schema, the description adequately compensates by listing the expected return fields. It is complete enough for an agent to understand what will be returned, though operational context would strengthen it further.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has zero parameters and the input schema is empty. With no parameters requiring semantic clarification, the baseline score applies as per rubric guidelines for parameterless tools.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Get') and resource ('Meta App basic information'), distinguishing it from the numerous Instagram/Threads content siblings by focusing on application metadata rather than media, comments, or posts.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the description implies usage context (retrieve app metadata), it lacks explicit guidance on when to use this versus the other Meta-specific tools like meta_debug_token or meta_exchange_token, or any prerequisites like authentication.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
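A sketch of the parameterless read the description implies, using Graph API field selection so the response carries exactly the three promised fields. Path and version are assumptions; the access token would come from server configuration:

```python
def app_info_request(app_id, access_token):
    """Field-selected read of the App node; the field list mirrors what the
    description promises (name, category, namespace)."""
    return (f"https://graph.facebook.com/v21.0/{app_id}",
            {"fields": "name,category,namespace", "access_token": access_token})
```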
- Behavior 4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, description carries full burden and successfully discloses critical behavioral traits: destructive nature ('irreversible') and operational constraints ('Rate limited to 100 deletions per 24 hours'). Lacks auth requirements or error state details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences with zero waste: purpose declaration, safety warning, and rate limit—each earning its place. Information is front-loaded with the action clearly stated first.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter destructive operation without output schema, description adequately covers essential safety and operational context. Could be improved by mentioning return value (success indicator) or permission requirements, but sufficiently complete for agent selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (post_id fully documented), establishing baseline 3. Description implies the target resource but does not add syntax details, examples, or constraints beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
Description opens with specific verb 'Delete' and resource 'Threads post', clearly distinguishing from Instagram tools (ig_*) and other Threads operations like replies or publishing. Scope is unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides implicit guidance through 'irreversible' warning, cautioning against casual use, but lacks explicit when-to-use criteria or comparisons to siblings like threads_hide_reply (though no direct alternative for post deletion exists).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
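A disclosed quota like "100 deletions per 24 hours" is most useful if enforced client-side before the request is ever sent. A sliding-window sketch; the limit comes from the reviewed description, everything else is illustrative:

```python
import collections
import time

class DeletionBudget:
    """Client-side guard for the disclosed limit: 100 deletions per rolling
    24 hours. Refusing locally is cheaper than a server-side quota error."""
    def __init__(self, limit=100, window=24 * 3600):
        self.limit, self.window = limit, window
        self.stamps = collections.deque()

    def try_spend(self, now=None):
        now = time.time() if now is None else now
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()  # forget deletions older than the window
        if len(self.stamps) >= self.limit:
            return False
        self.stamps.append(now)
        return True
```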
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It successfully enumerates available metrics (views, likes, etc.), but omits operational details like rate limits, authentication requirements, time granularity, or return value format. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with optimal front-loading: action verb first, scope qualifier second, specific capabilities in parentheses. Every element earns its place with zero redundancy or filler text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequately complete for a 3-parameter tool with full schema coverage. Description covers the 'what' (available metrics) effectively. Minor gap: no output schema exists, and description doesn't clarify return format (object vs array) or time-series granularity, though metric enumeration compensates partially.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing baseline 3. Description adds value by surfacing the valid metric options (views, likes, etc.) in the main text where they're immediately visible, reinforcing the schema's comma-separated list and making the tool's capabilities scannable at a glance.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose5/5
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the specific action (Get) and resource (account-level Threads insights), and explicitly distinguishes from sibling 'threads_get_post_insights' by specifying 'account-level' scope. Lists all available metrics to clarify what insights are retrievable.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides implicit guidance through 'account-level' designation, suggesting use for aggregate metrics rather than individual post analysis. However, lacks explicit when-to-use guidance or comparison to siblings like 'threads_get_post_insights' or 'ig_get_account_insights'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden and successfully discloses the critical blocking behavior ('Waits for video processing'). However, it lacks details on failure modes, timeouts, or what the tool returns upon completion.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences total with zero waste: sentence 1 states purpose, sentence 2 discloses blocking behavior, and sentence 3 lists key features. Perfectly front-loaded and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the rich input schema (7 params, 100% coverage) and lack of output schema, the description provides adequate completeness by covering purpose, blocking behavior, and feature support. Minor gap: does not describe return values or error conditions.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description mentions four specific parameters (topic tag, quote post, alt text, spoiler flag) but does not add semantic meaning beyond the schema's descriptions (e.g., explaining how quote_post_id functions or reply_control options).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb ('Publish') + resource ('video post') + platform ('Threads'), clearly distinguishing it from sibling tools like threads_publish_image and threads_publish_text.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The phrase 'Waits for video processing' provides implicit behavioral guidance (synchronous blocking vs. async), but there is no explicit 'when to use this vs. alternatives' guidance or prerequisites mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses critical behavioral constraints: the required permission ('instagram_manage_messages') and policy restrictions (messaging window limitations). It lacks disclosure of return values, idempotency, or error conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences with zero waste: purpose front-loaded, followed by permission requirement, then usage constraints. Every sentence provides distinct, high-value information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the 2-parameter input with 100% schema coverage and absence of output schema, the description adequately covers the complex policy requirements (permission and messaging windows). Minor gap regarding return value documentation, but constraint coverage is the critical need here.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (both recipient_id and message fully documented). Description does not add parameter-specific semantics beyond what the schema provides, meeting the baseline expectation for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose5/5
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb ('Send') + resource ('DM') clearly stated. Distinguishes from sibling read operations (ig_get_messages, ig_get_conversations) by identifying itself as the write/mutation action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear constraints on when the tool can be used (24hr/7-day window, requires prior incoming message). However, it does not explicitly name complementary tools (e.g., ig_get_conversations to check eligibility) or offer 'when not to use' alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully reveals the rate limit constraint (250/day) and time window behavior. It could improve by mentioning authentication requirements or return value structure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. Critical information (action, time window, hard limit) is front-loaded and densely packed without verbosity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (no parameters, single purpose) and lack of output schema, the description adequately covers the essential behavioral contract. It could marginally improve by indicating the return value format (count vs. object).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has zero parameters (100% coverage by default). Per evaluation rules, zero-parameter tools receive a baseline score of 4. The description appropriately requires no parameter clarification.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Check') and clearly identifies the resource (remaining publishing quota) and scope (current 24-hour window). It effectively distinguishes from sibling publishing tools by focusing on rate limits rather than content creation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear contextual guidance by stating the 250 posts/day maximum, implicitly signaling this should be checked before publishing. However, it lacks explicit 'when to use' wording (e.g., 'Use before threads_publish_*') or failure mode guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden and successfully discloses the 7-day rolling rate limit and workflow position. It could improve by mentioning what happens when a hashtag is not found (error vs null response), but the rate limit disclosure is excellent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste: first establishes purpose, second covers workflow dependency and rate limits. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness5/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter lookup tool with no output schema, the description is complete. It covers purpose, prerequisites, rate limits, and implicitly the return value (hashtag ID). No additional context is needed for correct invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage ('Hashtag name to search (without #)'), so the description appropriately focuses on higher-level concerns rather than repeating parameter details. It meets the baseline for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the specific action (search for hashtag ID), the input method (by name), and distinguishes itself from siblings by clarifying it returns an ID rather than media content (unlike ig_get_hashtag_recent/top).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines5/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states the prerequisite relationship: 'Required before querying hashtag media,' guiding the agent to use this before ig_get_hashtag_recent/top. Also discloses the critical rate limit constraint (30 unique hashtags per 7-day window) that governs when it should be used.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
GitHub Badge
Glama performs regular codebase and documentation scans to:
- Confirm that the MCP server is working as expected.
- Confirm that there are no obvious security issues.
- Evaluate tool definition quality.
Our badge communicates server capabilities, safety, and installation instructions.
Card Badge
Copy to your README.md:
Score Badge
Copy to your README.md:
How to claim the server?
If you are the author of the server, you simply need to authenticate using GitHub.
However, if the MCP server belongs to an organization, you need to first add glama.json to the root of your repository.
{
"$schema": "https://glama.ai/mcp/schemas/server.json",
"maintainers": [
"your-github-username"
]
}
Then, authenticate using GitHub.
Browse examples.
How to make a release?
A "release" on Glama is not the same as a GitHub release. To create a Glama release:
- Claim the server if you haven't already.
- Go to the Dockerfile admin page, configure the build spec, and click Deploy.
- Once the build test succeeds, click Make Release, enter a version, and publish.
This process allows Glama to run security checks on your server and enables users to deploy it.
How to add a LICENSE?
Please follow the instructions in the GitHub documentation.
Once GitHub recognizes the license, the system will automatically detect it within a few hours.
If the license does not appear on the server after some time, you can manually trigger a new scan using the MCP server admin interface.
How to sync the server with GitHub?
Servers are automatically synced at least once per day, but you can also sync manually at any time to instantly update the server profile.
To manually sync the server, click the "Sync Server" button in the MCP server admin interface.
How is the quality score calculated?
The overall quality score combines two components: Tool Definition Quality (70%) and Server Coherence (30%).
Tool Definition Quality measures how well each tool describes itself to AI agents. Every tool is scored 1–5 across six dimensions: Purpose Clarity (25%), Usage Guidelines (20%), Behavioral Transparency (20%), Parameter Semantics (15%), Conciseness & Structure (10%), and Contextual Completeness (10%). The server-level definition quality score is calculated as 60% mean TDQS + 40% minimum TDQS, so a single poorly described tool pulls the score down.
Server Coherence evaluates how well the tools work together as a set, scoring four dimensions equally: Disambiguation (can agents tell tools apart?), Naming Consistency, Tool Count Appropriateness, and Completeness (are there gaps in the tool surface?).
Tiers are derived from the overall score: A (≥3.5), B (≥3.0), C (≥2.0), D (≥1.0), F (<1.0). B and above is considered passing.
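The weighting above can be sketched in code. This is an illustrative reconstruction from the percentages stated here, not Glama's actual implementation: the dimension key names are hypothetical, and treating Server Coherence as a plain mean of its four equally weighted dimensions is an assumption.

```python
# Illustrative sketch of the scoring formula described above.
# Weights and blend ratios are taken from the text; key names are hypothetical.

DIMENSION_WEIGHTS = {
    "purpose": 0.25,          # Purpose Clarity
    "usage_guidelines": 0.20, # Usage Guidelines
    "behavior": 0.20,         # Behavioral Transparency
    "parameters": 0.15,       # Parameter Semantics
    "conciseness": 0.10,      # Conciseness & Structure
    "completeness": 0.10,     # Contextual Completeness
}

def tool_tdqs(scores: dict) -> float:
    """Weighted 1-5 score for a single tool across the six dimensions."""
    return sum(DIMENSION_WEIGHTS[d] * scores[d] for d in DIMENSION_WEIGHTS)

def definition_quality(tool_scores: list) -> float:
    """Server-level definition quality: 60% mean TDQS + 40% minimum TDQS."""
    return 0.6 * (sum(tool_scores) / len(tool_scores)) + 0.4 * min(tool_scores)

def overall_score(def_quality: float, coherence: float) -> float:
    """Overall quality: 70% Tool Definition Quality + 30% Server Coherence."""
    return 0.7 * def_quality + 0.3 * coherence

def tier(score: float) -> str:
    """Map an overall score to its letter tier."""
    if score >= 3.5:
        return "A"
    if score >= 3.0:
        return "B"
    if score >= 2.0:
        return "C"
    if score >= 1.0:
        return "D"
    return "F"
```

Note how the 40% weight on the minimum TDQS makes a single poorly described tool disproportionately costly: a server with tools scoring [4, 2] gets 0.6 × 3.0 + 0.4 × 2.0 = 2.6, well below its mean of 3.0.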
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/exileum/meta-mcp'
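The same lookup can be done from Python with the standard library. This is a minimal sketch: the URL pattern is taken from the curl example above, and since the JSON response shape is not documented here, the parsed body is returned as-is.

```python
# Hypothetical helper around the MCP directory API endpoint shown above.
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"

def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for a server, e.g. exileum/meta-mcp."""
    return f"{API_BASE}/{owner}/{name}"

def fetch_server(owner: str, name: str) -> dict:
    """GET the server record and parse the JSON body (requires network)."""
    with urllib.request.urlopen(server_url(owner, name)) as resp:
        return json.load(resp)
```

For example, `fetch_server("exileum", "meta-mcp")` requests the same URL as the curl command above.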
If you have feedback or need assistance with the MCP directory API, please join our Discord server.