
Planner (proof-of-done)

Ownership verified

Server Details

Evidence-gated task verification for AI agents. Decompose goals into acceptance criteria and attach proof (screenshot, curl, file); an independent LLM judge then accepts or rejects it. 24 tools. Hosted remote MCP (streamable-http, OAuth 2.1 + DCR).

Status: Healthy
Transport: Streamable HTTP

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: B

Average 3.8/5 across 24 of 24 tools scored. Lowest: 2.2/5.

Server Coherence: A

Disambiguation: 4/5

Most tools target distinct actions, but attach-evidence-file and request-evidence-upload share a close relationship that may confuse an agent. Overall, each tool has a clear purpose with detailed descriptions.

Naming Consistency: 4/5

Tool names follow a consistent verb-noun pattern with hyphens, except for 'todo', which is a noun and breaks the pattern. The rest are consistent.

Tool Count: 3/5

24 tools is in the borderline heavy range (16-25). While each tool seems justified for a planning system, the count feels slightly high for coherence without clear modular grouping.

Completeness: 3/5

The set covers core operations for goals, projects, and evidence, but lacks update-project, update-criterion, and list-project-dependencies. Evidence removal is present, but criterion deletion is missing.

Available Tools

24 tools
add-criterion: A

Add an acceptance criterion to a goal. Grove: only while the goal is in backlog (AC frozen after start), with a blocking quality linter. Standard: criteria stay editable until the goal is closed, and the linter is advisory.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| text | Yes | Wording of the criterion | |
| goalId | Yes | UUID of the goal | |
| position | No | Position (default = append to the end) | |
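
As an illustration of how these parameters travel over MCP, the sketch below wraps add-criterion arguments in a JSON-RPC 2.0 `tools/call` request. The helper names and the all-zero UUID are hypothetical, and the criterion text is only an example. Nothing here is sent to the server.

```python
import json

def build_tool_call(request_id, tool_name, arguments):
    """Wrap tool arguments in an MCP JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

def add_criterion_args(goal_id, text, position=None):
    """Build add-criterion arguments; leaving `position` out appends the criterion."""
    args = {"goalId": goal_id, "text": text}
    if position is not None:
        args["position"] = position
    return args

# Placeholder UUID, for illustration only.
GOAL_ID = "00000000-0000-0000-0000-000000000000"

request = build_tool_call(
    1, "add-criterion", add_criterion_args(GOAL_ID, "GET /health returns HTTP 200")
)
print(json.dumps(request, indent=2))
```

In Grove mode the call is expected to succeed only while the goal is still in backlog, so a client would normally send it before starting the goal.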
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate a write operation (readOnlyHint=false). Description adds behavioral constraints: in Grove, criteria are frozen after goal start and a blocking quality-linter applies; in Standard, editable until closed and linter is advisory. This goes beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is two sentences, front-loaded with the main action, and each sentence adds necessary context without waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers environment-specific behavior and parameter defaults (position appends). Lacks details on error conditions, prerequisites, or return value, but is adequate for a simple add operation given no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All parameters have descriptions in the schema (100% coverage). The description does not add additional meaning beyond what is already in the schema, so baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'Add acceptance criterion to goal' with a specific verb and resource. Distinguishes from sibling tools as no other sibling adds criteria.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides context about when to use the tool based on environment (Grove vs Standard) and associated constraints (frozen after start, linter behavior). Does not explicitly mention alternatives or when not to use, but the context is helpful.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

add-evidence-plain: A

SUBORDINATE / supplementary path — does NOT close an acceptance criterion. Adds a text-only note (URL to a permanent external source like CI run / GitHub commit / issue, or a description of a manual scenario) as extra context alongside the real proof. The path that actually covers an AC and closes a Grove goal is attach-evidence-file — use that one for every criterion. Plain evidence NEVER counts toward AC coverage no matter how many you add; it is only a complement to an attached file. NOT for bytes — screenshots, logs, API responses, exports all go through attach-evidence-file. NOT for filesystem paths — those need attach-evidence-file with the actual file.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| text | Yes | Text payload: URL to a permanent external source or a description of a manual scenario | |
| createdBy | No | Identifier of the uploading agent | |
| criterionId | Yes | UUID of the acceptance criterion | |
Behavior: 5/5

Despite minimal annotations (readOnlyHint false, idempotentHint false), the description thoroughly explains that the tool does not close acceptance criteria and does not count toward coverage, adding crucial behavioral context beyond annotations.

Conciseness: 5/5

The description is front-loaded with the key point 'SUBORDINATE / supplementary path' and uses clear bullet-like dashes. Every sentence adds value with no wasted words.

Completeness: 5/5

The description fully covers the tool's purpose, limitations, and relationship to siblings. For a simple tool with no output schema, it provides complete contextual understanding.

Parameters: 3/5

Schema coverage is 100% and the description adds little beyond the schema's parameter descriptions. It clarifies that 'text' can be a URL or description, but this is already implied by the schema's description. Baseline 3 is appropriate.

Purpose: 5/5

The description explicitly states the tool adds a text-only note as supplementary evidence and distinguishes it from attach-evidence-file, which closes acceptance criteria. The verb 'adds' and resource 'text-only note' are clear.

Usage Guidelines: 5/5

The description provides explicit guidance: use attach-evidence-file for every criterion, plain evidence never counts, and specifies what not to use it for (bytes, filesystem paths). This clearly differentiates from the sibling tool.

add-project-dependency: A (Idempotent)

Add a dependency of one project on another (depends-on). Accepts a UUID or a slug.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| projectId | Yes | UUID or slug of the dependent project | |
| dependsOnProjectId | Yes | UUID or slug of the project it depends on | |
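
Because both parameters accept either form, a client may want to check a reference before sending it. The sketch below is a hypothetical pre-flight check, not the server's validation: it assumes the canonical 8-4-4-4-12 UUID layout and the slug alphabet stated in the create-project schema (a-z, 0-9, hyphen).

```python
import re

# Assumption: UUIDs arrive in the canonical hyphenated form.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}", re.IGNORECASE
)
# Slug alphabet taken from the create-project schema: a-z, 0-9, hyphen.
SLUG_RE = re.compile(r"[a-z0-9-]+")

def classify_project_ref(value):
    """Return 'uuid' or 'slug'; raise ValueError for anything else."""
    if UUID_RE.fullmatch(value):  # checked first, since a lowercase UUID also fits the slug alphabet
        return "uuid"
    if SLUG_RE.fullmatch(value):
        return "slug"
    raise ValueError(f"not a UUID or slug: {value!r}")

def add_project_dependency_args(project_ref, depends_on_ref):
    """Validate both references locally, then build the tool arguments."""
    for ref in (project_ref, depends_on_ref):
        classify_project_ref(ref)
    return {"projectId": project_ref, "dependsOnProjectId": depends_on_ref}

print(add_project_dependency_args("web-frontend", "billing-api"))
```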
Behavior: 3/5

Annotations indicate mutation (readOnlyHint=false) and idempotency (idempotentHint=true). The description confirms the mutation nature ('add dependency') but adds no further behavioral context, such as what happens if the dependency already exists or if it creates circular dependencies. With annotations covering mutability and safe retry, the description provides no additional value beyond stating the action.

Conciseness: 5/5

The description is a single sentence that front-loads the key action and resource. Every word contributes meaning, with no redundancy or filler. It is optimally concise for the information it conveys.

Completeness: 3/5

Given the tool's simplicity (two string parameters, no output schema), the description is minimally adequate. However, it does not explain return values, error cases, or prerequisites. While the idempotentHint annotation helps, the description lacks contextual completeness for an agent to fully anticipate the tool's behavior without additional assumptions.

Parameters: 3/5

The input schema already provides full descriptions for both parameters (100% coverage). The description only restates that the tool accepts UUID or slug, which is redundant with the schema. No additional semantic details are given about the parameters beyond what the schema provides.

Purpose: 5/5

The description clearly states the action (add dependency) and the resource (project dependency) with the specific type 'depends-on'. It also mentions acceptable input formats (UUID or slug). This effectively distinguishes from sibling tools like remove-project-dependency.

Usage Guidelines: 3/5

The description implies usage for adding a dependency but lacks explicit guidance on when to use this tool versus alternatives. No context is given about prerequisites, side effects, or scenarios where this tool is appropriate. It is adequate but minimal.

attach-evidence-file: A

PRIMARY path to close a Grove goal: this is the ONLY tool that covers an acceptance criterion. Attach binary evidence (screenshot, log dump, API response, export) to an AC — call it once per criterion to satisfy the close gate. The subordinate add-evidence-plain only adds context for proofs with NO bytes (URLs to permanent external sources, manual repro descriptions) and does NOT cover an AC. Caption is optional but strongly recommended: state what the file captures and the reproduction conditions (URL/commit/session/inputs) so a third reviewer can reproduce.

⚠ PICK THE RIGHT TRANSPORT BEFORE YOU CALL THIS TOOL ⚠

• BEST for ANY file > ~1 KB raw, and the ONLY no-token path, so use it in a claude.ai / hosted-agent session that has no raw X-Auth-Token → call the sibling MCP tool request-evidence-upload with this same criterionId. It returns a one-time {uploadUrl, expiresAt}; then stream the raw bytes with a single PUT:

    curl -sS --fail --upload-file "/abs/path/to/file.png" "<uploadUrl>"

  (optionally add -H "X-Content-Sha256: " so corruption fails fast). No base64, no token: the signed ?t= ticket in the URL is the only credential, single-use, criterion-scoped. The PUT response is the same evidence JSON this tool returns.

• ALTERNATIVELY, if you DO have the raw X-Auth-Token in your shell → the planner-attach.sh helper (zero-install bash, binary-safe).

The MCP base64 path below is unreliable for non-trivial files: long string arguments get truncated or whitespace-corrupted on the agent side BEFORE the JSON-RPC request is sent. Measured 2026-05-20 on prod: a 4 KB PNG arrived at the server as 1874 decoded bytes (file_hash_mismatch); a 2 KB payload arrived with stray whitespace (failed base64_decode). The server itself accepts up to 25 MiB raw; the bottleneck is the agent-side serialisation of contentBase64, NOT the server.

planner-attach.sh COPY-PASTE RECIPE (replace 3 placeholders, run in your shell):

    curl -sS https://planner.monopoly-gold.com/api/cli/planner-attach.sh \
      | PLANNER_TOKEN="" bash -s -- \
        --criterion-id "" \
        --file "/abs/path/to/file.png" \
        --caption "what is captured and the repro conditions" \
        --created-by ""

Where to get each value:

  • PLANNER_TOKEN: the very same token that is already in your MCP config under the X-Auth-Token header for the planner server. NOT a separate credential.

  • CRITERION_UUID: the AC id you got from get-goal / list-goals. Same UUID you would pass to this MCP tool.

  • file path: absolute path on YOUR (agent) machine — the script reads it locally and streams multipart. The planner server never sees your filesystem.

The helper computes SHA-256 itself and ships it as contentSha256, so any in-flight corruption fails fast with HTTP 400 instead of poisoning the evidence row. Output on stdout is the same JSON shape this MCP tool returns; non-zero exit means HTTP ≥ 400 (stderr explains).

Without curl/bash? Fall back to raw multipart: POST https://planner.monopoly-gold.com/api/criteria//evidence/file, header X-Auth-Token, form fields file=@..., contentSha256=..., caption, createdBy.

• File ≤ ~1 KB raw → this MCP tool is fine. ALWAYS pass contentSha256 (hex SHA-256 of raw bytes BEFORE base64). Without it, a silently truncated PNG looks valid to the MIME sniffer; the server cannot distinguish a truncated 4 KB PNG from a valid 1 KB one and the vision judge burns ~30s on broken bytes. With the hash, the server fast-fails with error=file_hash_mismatch and points back here at the multipart endpoint.

Validates MIME whitelist (png/jpeg/webp/gif/pdf/txt/json/zip), per-file size cap (ATTACHMENTS_MAX_FILE_BYTES, default 25 MiB), per-project attachments quota. Returns evidence record + file URL + serverSha256.
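
The transport guidance above can be read as a small decision rule. The sketch below is one hypothetical encoding of it. The names are invented for illustration, "~1 KB" is taken as 1024 bytes, and the 25 MiB cap is the stated default, which a deployment may override.

```python
from enum import Enum

class Transport(Enum):
    SIGNED_UPLOAD = "request-evidence-upload, then PUT to uploadUrl"
    HELPER_SCRIPT = "planner-attach.sh"
    MCP_BASE64 = "attach-evidence-file with contentBase64"

# Assumption: the description says "~1 KB"; 1024 bytes is one reading of it.
INLINE_LIMIT_BYTES = 1024
# Stated default of ATTACHMENTS_MAX_FILE_BYTES; the real value is server configuration.
MAX_FILE_BYTES = 25 * 1024 * 1024

def choose_transport(size_bytes, has_raw_token=False, prefer_helper=False):
    """Pick an upload path for a file of `size_bytes` raw bytes."""
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError("file exceeds the default per-file size cap")
    if size_bytes <= INLINE_LIMIT_BYTES:
        return Transport.MCP_BASE64
    # The helper script needs the raw X-Auth-Token; the signed upload does not.
    if has_raw_token and prefer_helper:
        return Transport.HELPER_SCRIPT
    return Transport.SIGNED_UPLOAD

print(choose_transport(4096).value)
```

The signed upload is the default for larger files because it is the only path that works without a token.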

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| kind | No | Optional evidence kind override. The only accepted value is `session_history`, which marks this attachment as the goal-level «full Claude session transcript» artifact required by the close gate (I4-session-history). Such evidence does NOT cover any AC and is NOT sent to the evidence judge. Omit for normal per-AC proof (kind is derived from MIME). NOTE: transcripts are usually > 1 KB → use request-evidence-upload (pass kind=session_history) or the multipart helper, not this base64 path. | |
| caption | No | Optional human-readable description, stored in evidence.payload | |
| filename | Yes | Original filename (used to derive MIME). Path components are stripped. | |
| mimeType | No | MIME type; if omitted, derived from filename extension; must be in whitelist | |
| createdBy | No | Identifier of the uploading agent | |
| criterionId | Yes | UUID acceptance criterion the file will be evidence for | |
| contentBase64 | Yes | File payload, base64-encoded (RFC 4648 §4 standard alphabet, padding optional) | |
| contentSha256 | No | Hex-encoded SHA-256 of the raw bytes (before base64). When provided, the server recomputes the hash on the decoded payload and rejects with error=file_hash_mismatch if they diverge; primary defence against MCP base64 truncation. | |
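
To make the contentBase64 / contentSha256 pairing concrete, here is a minimal sketch of building both fields from the same raw bytes. `hash_matches` is a local stand-in for the server-side check as the schema describes it, not the server's code, and the UUID and payload are placeholders.

```python
import base64
import hashlib

def build_attach_args(criterion_id, filename, raw, caption=None):
    """Build attach-evidence-file arguments; the hash is taken over the RAW bytes."""
    args = {
        "criterionId": criterion_id,
        "filename": filename,
        "contentBase64": base64.b64encode(raw).decode("ascii"),
        "contentSha256": hashlib.sha256(raw).hexdigest(),
    }
    if caption:
        args["caption"] = caption
    return args

def hash_matches(args):
    """Local stand-in for the described check: decode, rehash, compare."""
    decoded = base64.b64decode(args["contentBase64"])
    return hashlib.sha256(decoded).hexdigest() == args["contentSha256"]

# Placeholder UUID and payload, for illustration only.
CRITERION_ID = "00000000-0000-0000-0000-000000000000"
raw = b"hello evidence\n" * 10

args = build_attach_args(CRITERION_ID, "note.txt", raw, caption="sample log excerpt")
print(hash_matches(args))

# Simulate agent-side truncation of the base64 string: it still decodes, but the hash differs.
truncated = dict(args, contentBase64=args["contentBase64"][:40])
print(hash_matches(truncated))
```

The truncated payload still decodes cleanly, which is why the hash, rather than the decoder, is what catches the corruption.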
Behavior: 5/5

The description discloses behavioral traits beyond annotations: it validates MIME whitelist, enforces per-file size cap (25 MiB) and per-project quota, and warns about base64 truncation issues. It explains the importance of contentSha256 for detecting corruption. It also describes the special kind 'session_history' marking evidence as an artifact. Annotations already indicate mutation (readOnlyHint false) and non-idempotency, which the description confirms.

Conciseness: 4/5

The description is lengthy but information-dense. It is structured with clear sections: primary purpose, a prominent warning about transport, copy-paste recipes, and fallback instructions. Every sentence adds value given the complexity (8 parameters, multiple alternatives, safety warnings). A slightly tighter edit could remove redundancy, but overall it is well-organized and the length is justified by the need to prevent common errors.

Completeness: 5/5

Given the tool's complexity (mutation, no output schema, 8 params, sibling tools, and safety considerations), the description is exceptionally complete. It covers behavior, validation, alternatives, return values (evidence record + file URL + serverSha256), and relationships to request-evidence-upload and planner-attach.sh. It addresses potential pitfalls like base64 truncation and missing contentSha256. No significant gaps remain.

Parameters: 5/5

Despite 100% schema coverage, the description adds significant meaning beyond the schema. For contentBase64, it warns about truncation and notes padding optional. For contentSha256, it explains its role in fast-failing on corruption. For kind, it details the 'session_history' semantics. For filename, it says path components are stripped. For mimeType, it notes derivation from extension. The description adds practical context that helps an agent use parameters correctly.

Purpose: 5/5

The description clearly states the tool attaches binary evidence to a Grove goal acceptance criterion. It explicitly distinguishes from sibling 'add-evidence-plain' by noting that tool only adds context for proofs without bytes and does not cover an AC. The verb 'attach' and resource 'evidence-file' with the context of closing a goal make the purpose specific and actionable.

Usage Guidelines: 5/5

The description provides extensive guidance on when to use this tool versus alternatives. It states this is the PRIMARY path to close a goal and the ONLY tool covering an AC. It warns against using the base64 path for files > ~1 KB and provides detailed alternatives: request-evidence-upload with PUT, or the planner-attach.sh script. It includes copy-paste recipes and explains when the MCP tool is appropriate (files ≤ ~1 KB).

block-goal: B

Add a blocker to a goal. Optionally create a resolver goal or link an existing one.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| goalId | Yes | UUID of the goal being blocked | |
| description | Yes | Description of the blocker: what is in the way | |
| linkedGoalId | No | UUID of an existing resolver goal (mutually exclusive with resolverTitle) | |
| resolverTitle | No | Title of a new resolver goal (mutually exclusive with linkedGoalId) | |
| resolverParentId | No | UUID of the resolver's parent (default = goalId, i.e. a child goal) | |
| resolverDescription | No | Description of the resolver goal | |
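
The schema marks linkedGoalId and resolverTitle as mutually exclusive, which is easy to get wrong when arguments are assembled dynamically. The sketch below is a hypothetical client-side guard. Treating resolverParentId and resolverDescription as meaningful only alongside resolverTitle is an assumption, not something the schema states.

```python
def block_goal_args(goal_id, description, *, linked_goal_id=None, resolver_title=None,
                    resolver_parent_id=None, resolver_description=None):
    """Build block-goal arguments, enforcing the documented mutual exclusion."""
    if linked_goal_id and resolver_title:
        raise ValueError("linkedGoalId and resolverTitle are mutually exclusive")
    # Assumption: the resolver* extras only apply when a new resolver goal is created.
    if (resolver_parent_id or resolver_description) and not resolver_title:
        raise ValueError("resolverParentId/resolverDescription need resolverTitle")

    args = {"goalId": goal_id, "description": description}
    if linked_goal_id:
        args["linkedGoalId"] = linked_goal_id
    if resolver_title:
        args["resolverTitle"] = resolver_title
        if resolver_parent_id:
            args["resolverParentId"] = resolver_parent_id
        if resolver_description:
            args["resolverDescription"] = resolver_description
    return args

# Placeholder UUID, for illustration only.
GOAL_ID = "00000000-0000-0000-0000-000000000000"
print(block_goal_args(GOAL_ID, "Staging database is down",
                      resolver_title="Restore staging database"))
```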
Behavior: 2/5

Annotations already indicate mutability (readOnlyHint=false). Description adds no further behavioral details such as whether existing blockers are overwritten, prerequisites (goal existence), or consequences (e.g., status changes). No output schema specified.

Conciseness: 5/5

Single sentence in Russian conveying core action and optional variants. No unnecessary words or redundancy.

Completeness: 2/5

Given 6 parameters and absence of output schema, the description is too brief. It omits context like what a blocker represents, how it interacts with goal state, or error conditions (e.g., invalid goalId). Sibling tools like 'remove-blocker' exist but no cross-referencing.

Parameters: 3/5

Schema covers 100% of parameters with descriptions. The tool description adds minimal extra meaning beyond synchronizing the intent of creating vs linking a resolver, which is already stated in the schema. Baseline score of 3 is appropriate as description does not significantly enhance understanding.

Purpose: 5/5

Description uses a specific verb ('add a blocker') and resource (goal), clearly distinguishing from sibling tool remove-blocker. It also specifies optional sub-actions (create or link resolver), making the tool's purpose unambiguous.

Usage Guidelines: 3/5

Description implies when to use optional resolver parameters but lacks explicit when-to-use or when-not-to-use guidance. No comparison with similar tools like add-project-dependency or add-criterion is provided.

create-goal: C

Create a goal/task/milestone in the planning tree.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| mode | No | Mode: grove (default) or standard. Grove enforces I_start on in_progress and I₃ on done; standard carries the same AC but without the evidence gate | |
| tags | No | Tags | |
| type | No | goal / milestone / task / habit (default: task) | |
| title | Yes | Goal title (max. 500) | |
| status | No | backlog (default) / in_progress | |
| deadline | No | Deadline, ISO 8601 | |
| estimate | No | Estimate | |
| parentId | No | UUID of the parent (null = root) | |
| priority | No | Priority 1–5 (default: 3) | |
| projectId | No | UUID of the project | |
| description | No | Description | |
| acceptanceCriteria | No | List of AC statements. Grove: optional on create, required before the transition to in_progress, with a blocking quality linter. Standard: descriptive, linter is advisory | |
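
Since the tool description itself says little, the schema constraints are the main guide for a caller. The sketch below collects them in a hypothetical argument builder. It assumes the title limit of 500 is counted in characters, and that a Grove goal created directly in in_progress needs acceptance criteria up front (inferred from "required before the transition to in_progress").

```python
GOAL_MODES = {"grove", "standard"}
GOAL_TYPES = {"goal", "milestone", "task", "habit"}
GOAL_STATUSES = {"backlog", "in_progress"}

def create_goal_args(title, *, mode="grove", goal_type="task", status="backlog",
                     priority=3, acceptance_criteria=None):
    """Validate against the documented schema limits and build create-goal arguments."""
    if not title or len(title) > 500:  # assumption: the limit is in characters
        raise ValueError("title is required and limited to 500 characters")
    if mode not in GOAL_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if goal_type not in GOAL_TYPES:
        raise ValueError(f"unknown type: {goal_type!r}")
    if status not in GOAL_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    if not 1 <= priority <= 5:
        raise ValueError("priority must be between 1 and 5")

    criteria = list(acceptance_criteria or [])
    # Assumption: Grove rejects a goal that starts in in_progress without any AC.
    if mode == "grove" and status == "in_progress" and not criteria:
        raise ValueError("Grove goals need acceptance criteria before in_progress")

    args = {"title": title, "mode": mode, "type": goal_type,
            "status": status, "priority": priority}
    if criteria:
        args["acceptanceCriteria"] = criteria
    return args

print(create_goal_args("Ship health endpoint",
                       acceptance_criteria=["GET /health returns HTTP 200"]))
```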
Behavior: 2/5

Annotations already indicate this is a mutation (readOnlyHint=false). The description adds no extra behavioral context beyond stating 'create.' It does not disclose any side effects, permissions needed, or behavior nuances beyond what annotations provide.

Conciseness: 4/5

The description is a single, concise sentence. It is appropriately short and to the point, though it could be slightly more informative without losing conciseness.

Completeness: 2/5

Given the tool has 12 parameters and no output schema, the description is too minimal. It does not explain the planning tree context, the meaning of different modes (grove vs standard), or the relationship to parentId/projectId. More detail is needed for a complete understanding.

Parameters: 3/5

Schema description coverage is 100%, so each parameter has a description. The tool description adds no additional meaning beyond the schema. Baseline 3 is appropriate as the schema handles parameter semantics adequately.

Purpose: 4/5

The description states 'create a goal/task/milestone in the planning tree,' which clearly identifies the action (create) and the resource (goal/task/milestone) in its context. It distinguishes from siblings like update-goal or delete-goal.

Usage Guidelines: 2/5

No explicit guidance on when to use this tool versus alternatives. It does not mention when not to use it or provide context for choosing between create-goal and other tools like update-goal or delete-goal.

create-project: C

Create a new project.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| icon | No | Icon (emoji, max. 10 characters) | |
| slug | Yes | Unique slug (only a-z, 0-9, hyphen) | |
| tags | No | Tags | |
| title | Yes | Project title | |
| status | No | Status: active / archived / paused (default: active) | |
| isDefault | No | Make this the default project | |
| description | No | Project description | |
| repositoryUrl | No | Repository URL | |
| repositoryPath | No | Path to the repository on the server | |
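
The slug is the only parameter with a stated character set, so it is the most likely one to be rejected. A minimal sketch, assuming a client derives the slug from the title: `slugify` and `create_project_args` are hypothetical helpers, and uniqueness can only be checked by the server.

```python
import re

SLUG_RE = re.compile(r"[a-z0-9-]+")  # documented alphabet: a-z, 0-9, hyphen
PROJECT_STATUSES = {"active", "archived", "paused"}

def slugify(title):
    """Lowercase the title and collapse every other character run into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def create_project_args(slug, title, *, status="active", icon=None):
    """Check the documented constraints locally and build create-project arguments."""
    if not SLUG_RE.fullmatch(slug):
        raise ValueError(f"slug may only contain a-z, 0-9 and hyphen: {slug!r}")
    if status not in PROJECT_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    # Assumption: "max. 10 characters" is read here as 10 Unicode code points.
    if icon is not None and len(icon) > 10:
        raise ValueError("icon is limited to 10 characters")
    args = {"slug": slug, "title": title, "status": status}
    if icon is not None:
        args["icon"] = icon
    return args

title = "Billing API v2"
print(create_project_args(slugify(title), title))
```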
Behavior: 2/5

Annotations indicate readOnlyHint=false and idempotentHint=false, signaling a non-idempotent write. The description adds no further behavioral details such as permissions, side effects, or constraints. It does not contradict annotations.

Conciseness: 4/5

The description is extremely concise, consisting of a single phrase. It is front-loaded and efficient, though it could benefit from a bit more detail without becoming verbose.

Completeness: 2/5

With 9 parameters (2 required), no output schema, and no behavioral context, the description is insufficient. It does not explain return values, error conditions, or the effect of the operation, leaving the agent with gaps.

Parameters: 3/5

Schema coverage is 100% with descriptions for all 9 parameters. The description adds no additional meaning beyond what the schema provides, so baseline score of 3 is appropriate.

Purpose: 4/5

The description 'Create a new project' clearly indicates the verb (create) and resource (project). It is specific and unambiguous, though it does not differentiate from sibling tools like 'create-goal' or 'add-criterion', as they target different resources.

Usage Guidelines: 2/5

No usage guidelines are provided. The description does not specify when to use this tool, prerequisites, or alternatives. Siblings like 'create-goal' exist but no guidance is given on choosing between them.

delete-account (grade A)
Destructive

IRREVERSIBLY delete your account and ALL data (projects, goals, evidence, history). Two-step barrier: call with no arguments to receive a warning and a challenge; then call again with the confirmations. Do NOT call without an explicit request from the user.

Parameters
Name | Required | Description | Default
reason | No | Why you are deleting (free text, ≥10 characters) | -
acknowledge | No | Exact confirmation phrase | -
objects_total | No | projects + goals from the inventory (for call 2) | -
reasoning_answer | No | Answer to the reasoning riddle from call 1 | -
confirm_passphrase | No | Phrase from call 1 (for call 2) | -
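
The two-step barrier can be sketched as two argument payloads. This is a minimal illustration, not the server's implementation: only the five argument names come from the parameter table, while the shape of the call-1 response (`passphrase`, `inventory`) and the helper `build_confirmation_args` are hypothetical.

```python
from typing import Any

# Call 1: no arguments. Per the description, the server answers with a
# warning and a challenge instead of deleting anything.
FIRST_CALL_ARGS: dict[str, Any] = {}


def build_confirmation_args(
    challenge: dict[str, Any],
    reason: str,
    acknowledge: str,
    reasoning_answer: str,
) -> dict[str, Any]:
    """Assemble the arguments for call 2 from the call-1 challenge.

    Assumed (hypothetical) challenge shape:
    {"passphrase": str, "inventory": {"projects": int, "goals": int}}
    """
    if len(reason) < 10:
        # The schema asks for free text of at least 10 characters.
        raise ValueError("reason must be at least 10 characters long")
    inventory = challenge["inventory"]
    return {
        "reason": reason,
        "acknowledge": acknowledge,
        # Documented as projects + goals from the inventory.
        "objects_total": inventory["projects"] + inventory["goals"],
        "reasoning_answer": reasoning_answer,
        "confirm_passphrase": challenge["passphrase"],
    }
```

Keeping the first call empty and deriving every second-call field from the challenge makes it hard to confirm a deletion without having read the warning first.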
Behavior 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already mark destructiveHint=true, but the description adds critical details: irreversibility, precise list of deleted data, and the two-step barrier. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: the first states the purpose, the second explains the two-step usage pattern, and the third warns against calling without a user request. No fluff, front-loaded, every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a destructive action with 5 parameters and no output schema, the description adequately covers the two-step workflow and data scope. It could mention expected return values (e.g., success message or error), but the primary usage guidance is sufficient for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptive parameter names and descriptions, so baseline is 3. The description adds context by mentioning the two-step flow (first call without args, second with confirmations), which helps the agent understand the parameter roles (e.g., reasoning_answer from first call, confirm_passphrase for second). This extra context justifies a 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool irreversibly deletes the account and all associated data (projects, goals, evidence, history). It distinguishes itself from sibling tools like delete-goal and delete-project by specifying that the scope is the entire account.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly describes a two-step process: first call without arguments to receive a warning and challenge, then a second call with confirmation fields. Also warns not to call without explicit user request, providing clear when-to-use and when-not guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete-goal (grade A)
Destructive, Idempotent

Delete a goal and all of its subgoals recursively. IRREVERSIBLE.

Parameters
Name | Required | Description | Default
goalId | Yes | UUID of the goal | -
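
Recursive deletion means the whole subtree under the goal disappears with it. A minimal in-memory sketch of that behavior, assuming goals are stored as a mapping from id to a record with a `parent_id` field; the storage layout and the function name `delete_goal_recursive` are hypothetical.

```python
def delete_goal_recursive(goals: dict[str, dict], goal_id: str) -> list[str]:
    """Remove goal_id and every descendant; return the removed ids."""
    if goal_id not in goals:
        # Assumption: deleting a missing goal is a no-op, which would be
        # consistent with the Idempotent hint on this tool.
        return []
    removed = []
    stack = [goal_id]
    while stack:
        current = stack.pop()
        # Queue the direct children of the goal being visited.
        stack.extend(gid for gid, g in goals.items() if g["parent_id"] == current)
        removed.append(current)
    for gid in removed:
        del goals[gid]
    return removed
```

Collecting the ids first and deleting afterwards avoids mutating the mapping while it is being scanned.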
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate destructiveness (destructiveHint: true) and non-read-only behavior. The description adds valuable context: the deletion is recursive and irreversible. This goes beyond annotations and helps the agent understand the full impact.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: two short sentences with no superfluous words. The key action and warning are front-loaded, making it efficient for an AI agent to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple deletion tool with one parameter and good annotations, the description covers the core behavior (recursive deletion, irreversibility). It does not mention return values, but no output schema is expected, so this is acceptable. It says little about side effects on related data (e.g., evidence), although the recursive subgoal deletion implies them.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the description does not need to add parameter meaning. It does not elaborate on goalId beyond what the schema provides. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb (delete), resource (goal), and scope (recursively including all subgoals). It distinguishes itself from sibling tools like delete-account or delete-project by naming the goal as its resource and specifying recursive deletion.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives (e.g., block-goal, update-goal). It only warns about irreversibility but does not state prerequisites or when to prefer this over other actions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete-project (grade A)
Destructive, Idempotent

Delete a project. If the project has goals, deletes them too (force).

Parameters
Name | Required | Description | Default
force | No | Delete together with all goals (default: false, refuses if the project has goals) | -
projectId | Yes | UUID of the project | -
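
The interaction between `force` and existing goals can be sketched in a few lines. The description reads as if goals are always deleted, while the schema says `force` defaults to false and the call is refused when goals exist; this sketch follows the schema wording. The in-memory layout, `ProjectNotEmptyError`, and `delete_project` are hypothetical names used only for illustration.

```python
class ProjectNotEmptyError(Exception):
    """Raised when a project still has goals and force is not set."""


def delete_project(
    projects: dict[str, list[str]], project_id: str, force: bool = False
) -> int:
    """Delete a project; return how many goals were deleted with it.

    `projects` maps a project id to the ids of its goals (hypothetical layout).
    """
    goal_ids = projects.get(project_id)
    if goal_ids is None:
        return 0  # already gone; nothing to do
    if goal_ids and not force:
        # Schema: the default (false) refuses when the project has goals.
        raise ProjectNotEmptyError(
            f"project {project_id} has {len(goal_ids)} goals"
        )
    del projects[project_id]
    return len(goal_ids)
```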
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate destructiveness and idempotency. The description adds value by detailing the conditional deletion of goals with the force parameter, which goes beyond annotation information.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short sentences that efficiently convey the core purpose and key behavior, with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description omits return value information. It adequately covers the action's effect but lacks what is returned upon success or failure.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description largely repeats the parameter semantics from the schema without adding new context or syntax details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action 'delete a project' and adds the conditional behavior regarding goals, effectively distinguishing it from sibling tools like 'delete-goal'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use the tool but does not explicitly provide when-not-to-use or alternative tools. It gives context via the force parameter but lacks direct sibling differentiation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get-goal (grade A)
Read-only

Get the full data for a goal: description, status, children, blockers, notes, history

Parameters
Name | Required | Description | Default
goalId | Yes | UUID of the goal | -
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds context beyond the readOnlyHint annotation by specifying the exact fields returned (children, blockers, notes, history). This helps the agent understand the scope of data. However, it does not mention any potential size limits or real-time behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that effectively conveys the tool's action and output fields. Every phrase adds value, with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple retrieval tool with no output schema, the description adequately lists the returned fields. It does not mention if the response is paginated or if there are limits, but that is not critical for this type of tool. Overall, it is sufficiently complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% for the single parameter 'goalId', which the schema already describes as the goal's UUID. The description does not add any meaning or format details beyond what the schema provides, so a baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: to retrieve full data of a goal, listing specific fields (description, status, children, blockers, notes, history). This distinguishes it from sibling tools like 'get-project' (different resource) and 'list-goals' (which returns a summary list).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when full goal details are needed, but does not provide explicit guidance on when to use this tool versus alternatives (e.g., 'get-summary' for a summary, 'list-goals' for listing). No exclusions or prerequisites are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get-project (grade A)
Read-only

Get project data with goal counts by status

Parameters
Name | Required | Description | Default
projectId | Yes | UUID of the project | -
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description aligns with the readOnlyHint annotation by indicating a data retrieval operation. It adds context about returning goal counts by status, but does not disclose other behaviors like response structure or authentication needs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very short (one sentence) and front-loaded with the action and key output details, though it could benefit from a more structured format.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read tool with one parameter and no output schema, the description provides sufficient context about the returned data (project details with goal counts), and annotations cover read-only behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% for the single parameter, and the description does not add meaning beyond the schema's description of it as the project's UUID, so the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (get) and the specific resource (project data with goal counts by status), distinguishing it from sibling tools like get-goal or get-summary.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies a read operation but does not explicitly state when to use this tool versus alternatives such as get-goal or list-projects, nor does it provide exclusions or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get-summary (grade A)
Read-only

Get a summary: goal counts by status, blocked goals, overdue goals

Parameters
Name | Required | Description | Default
projectId | No | UUID of the project | -
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true. The description adds behavioral context about what the summary includes (counts by status, blocked, overdue), which is useful but could elaborate on the exact return format or edge cases.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single, concise sentence that conveys the tool's purpose without unnecessary words. Front-loaded with the action and resource.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read-only aggregation tool with one optional parameter and no output schema, the description is fairly complete. It specifies the content of the summary, though it could mention the output structure or that it returns counts.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema documents the parameter. The description does not add meaning beyond the parameter name and the schema's description of it as the project's UUID. The baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it retrieves a summary of goal counts by status, blocked, and overdue. The verb 'get' and resource 'summary' are specific, and it differentiates from sibling tools like get-goal or list-goals by focusing on aggregated counts.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Usage is implied: it's for getting aggregated counts rather than individual goal details. However, there is no explicit guidance on when to use this tool versus alternatives like list-goals, nor any exclusions or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get-tree (grade A)
Read-only

Get the goal tree: the full hierarchy with nested children

Parameters
Name | Required | Description | Default
rootId | No | UUID of the subtree root | -
projectId | No | UUID of the project (default: all projects) | -
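
A "full hierarchy with nested children" can be pictured as flat goal records folded under their parents. The sketch below is an assumption about the output shape, since the tool publishes no output schema: the `id`, `parentId`, and `children` field names and the function `build_tree` are hypothetical.

```python
from typing import Optional


def build_tree(goals: list[dict], root_id: Optional[str] = None) -> list[dict]:
    """Nest flat goal records under their parents.

    Each record is assumed to look like {"id": str, "parentId": str or None}.
    With root_id set, only that goal's subtree is returned.
    """
    nodes = {g["id"]: {**g, "children": []} for g in goals}
    roots = []
    for node in nodes.values():
        parent = nodes.get(node["parentId"])
        if parent is not None:
            parent["children"].append(node)
        else:
            # No known parent: treat the goal as a top-level root.
            roots.append(node)
    if root_id is not None:
        return [nodes[root_id]] if root_id in nodes else []
    return roots
```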
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, so the description does not need to reiterate safety. The description adds that the output is a 'full hierarchy with children', which provides some behavioral context beyond annotations. However, it omits details like error handling or pagination behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that directly conveys the tool's core function with no redundant words. It is front-loaded and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple retrieval tool with only two optional parameters and no output schema, the description provides sufficient context about what is returned (tree hierarchy). It could be slightly improved by outlining the structure, but is adequate as-is.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, with both 'rootId' and 'projectId' having clear descriptions. The description does not add parameter-level meaning, so the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves the complete hierarchy of goals with nested children. The name 'get-tree' aligns with this and distinguishes it from siblings like 'get-goal' (single goal) and 'list-goals' (flat list).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when the full tree structure is needed, but does not explicitly state when to prefer it over alternatives or mention any exclusions. This is acceptable as the purpose is clear enough for an agent to infer context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list-goals (grade B)
Read-only

Get a list of goals filtered by status, type, or project

Parameters
Name | Required | Description | Default
type | No | Type: goal, milestone, task, habit | -
limit | No | Max number of results (default: 50) | -
status | No | Status: backlog, in_progress, blocked, done, cancelled | -
parentId | No | UUID of the parent (direct children only) | -
projectId | No | UUID of the project (filter) | -
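
To make the filter parameters concrete, here is a sketch that builds an MCP `tools/call` request for list-goals. The JSON-RPC envelope follows the MCP convention; the local validation and the helper name `list_goals_request` are assumptions added for illustration, and whether the server rejects unknown values in the same way is not documented here.

```python
import json

# Allowed values as listed in the parameter descriptions.
ALLOWED_TYPES = {"goal", "milestone", "task", "habit"}
ALLOWED_STATUSES = {"backlog", "in_progress", "blocked", "done", "cancelled"}
KNOWN_FILTERS = {"type", "limit", "status", "parentId", "projectId"}


def list_goals_request(request_id: int, **filters) -> str:
    """Build a JSON-RPC `tools/call` request body for list-goals."""
    unknown = set(filters) - KNOWN_FILTERS
    if unknown:
        raise ValueError(f"unknown filters: {sorted(unknown)}")
    if "type" in filters and filters["type"] not in ALLOWED_TYPES:
        raise ValueError(f"invalid type: {filters['type']}")
    if "status" in filters and filters["status"] not in ALLOWED_STATUSES:
        raise ValueError(f"invalid status: {filters['status']}")
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "list-goals", "arguments": filters},
    })
```

For example, `list_goals_request(1, status="blocked", limit=10)` asks for at most ten blocked goals; omitting `limit` leaves the documented default of 50 to the server.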
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, indicating no mutation. The description adds that it returns a filtered list but does not disclose any additional behavioral traits like pagination, sorting, or rate limits. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, which is concise, but it is incomplete as it does not cover all parameters or usage details. It front-loads the purpose but sacrifices completeness for brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With five parameters, no output schema, and only a read-only annotation, the description is minimal. It does not explain return values, pagination behavior (despite a limit parameter), or ordering. It leaves gaps for a listing tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description only mentions three of the five parameters (status, type, project), omitting limit and parentId. It adds minimal meaning beyond the schema and even underrepresents the available filtering capabilities.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb (get list) and resource (goals) and specifies filtering by status, type, project. It distinguishes from single-goal retrieval (get-goal) and goal creation (create-goal).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for listing with filters but does not explicitly mention when to use this tool versus alternatives like get-tree (hierarchy) or get-summary. No exclusion criteria or context provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list-projects (grade C)
Read-only

List of projects

Parameters
Name | Required | Description | Default
status | No | Status: active / archived / paused | -
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, so the safety profile is covered. The description adds no additional behavioral context such as pagination, ordering, or rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is overly terse (two words in the original Russian) and lacks any structured information. It is under-specified rather than concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has no output schema, and the description does not indicate what fields or records are returned. For a simple list tool, additional context like pagination or default behavior is missing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% (the status parameter has a description listing valid values). The description adds no further parameter details beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description, 'List of projects', merely restates the tool name 'list-projects' without adding any distinct meaning or differentiating it from siblings like 'list-goals'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives (e.g., get-project, get-summary). There is no context about filtering or scope.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

move-goal (grade A)
Idempotent

Move a goal to a different parent in the tree (null = make it a root goal)

Parameters
Name | Required | Description | Default
goalId | Yes | UUID of the goal being moved | -
newParentId | No | UUID of the new parent (null = make it a root) | -
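
The null-means-root rule can be illustrated with a small re-parenting function over an in-memory map. This is a sketch under stated assumptions: the `parents` layout and the name `move_goal` are hypothetical, and the cycle check is a safeguard added here, not behavior the listing documents.

```python
from typing import Optional


def move_goal(
    parents: dict[str, Optional[str]], goal_id: str, new_parent_id: Optional[str]
) -> None:
    """Re-parent goal_id; new_parent_id=None makes it a root goal.

    `parents` maps each goal id to its parent id (None for roots).
    """
    if goal_id not in parents:
        raise KeyError(goal_id)
    if new_parent_id is not None and new_parent_id not in parents:
        raise KeyError(new_parent_id)
    # Walk up from the new parent; reaching goal_id would create a cycle.
    cursor = new_parent_id
    while cursor is not None:
        if cursor == goal_id:
            raise ValueError("cannot move a goal under itself or its descendant")
        cursor = parents[cursor]
    parents[goal_id] = new_parent_id
```

Repeating the same move leaves the map unchanged, which matches the Idempotent hint on the tool.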
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate the tool mutates (readOnlyHint=false) and is idempotent (idempotentHint=true). The description adds clarity on the null behavior for newParentId but does not disclose potential side effects or validation behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that conveys the core functionality upfront with no extraneous text. It is optimally concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple two-parameter tool with no output schema, the description is sufficient. It explains the key behavior with null. However, it could mention error conditions or prerequisites for completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds minimal value beyond the schema's parameter descriptions, which already explain the newParentId parameter including the null meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'move' and resource 'goal', and specifies the context of moving to another parent in a tree structure. It distinguishes from sibling tools like create-goal, delete-goal, and update-goal by focusing on the move operation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies the tool is for moving goals in the hierarchy but does not explicitly state when to use it over siblings like reorder-goals or update-goal. It lacks when-not conditions or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

remove-blocker (grade A)
Idempotent

Remove a blocker from a goal by UUID. If it was the last active blocker, the goal leaves status=blocked and returns to its previous status (in_progress / backlog).

Parameters
Name | Required | Description | Default
goalId | Yes | UUID of the goal | -
blockerId | Yes | UUID of the blocker | -
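
The status transition described above can be sketched as a small state update. The goal record layout (`status`, `previous_status`, `blockers`) and the function `remove_blocker` are hypothetical; in particular, marking a blocker inactive rather than deleting it is an assumption, since the listing does not say which the server does.

```python
def remove_blocker(goal: dict, blocker_id: str) -> dict:
    """Deactivate one blocker and restore the goal's status if none remain.

    Assumed goal shape:
    {"status": str, "previous_status": str, "blockers": {blocker_id: is_active}}
    """
    if blocker_id in goal["blockers"]:
        goal["blockers"][blocker_id] = False
    any_active = any(goal["blockers"].values())
    if goal["status"] == "blocked" and not any_active:
        # Last active blocker removed: go back to in_progress or backlog.
        goal["status"] = goal["previous_status"]
    return goal
```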
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate idempotency (idempotentHint: true) and write operation (readOnlyHint: false). The description adds valuable behavioral context: the goal status transitions from blocked to previous status if the removed blocker was the last active one.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences are concise and front-loaded with the action, followed by a critical behavioral note. Every word is necessary, no fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

While the description explains the status transition logic, it lacks details about what happens to the blocker resource itself (deleted or just disassociated) and does not mention the return value or output format. For a simple tool, it is moderately complete but leaves some gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for both parameters (goalId and blockerId) already provided. The description does not add additional meaning beyond what the schema offers, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (remove a blocker from a goal by UUID) and the resource (goal). It distinguishes from siblings like block-goal by indicating it's the inverse operation. The behavioral effect of status transition is also specified.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context by describing the effect on goal status when removing the last blocker. It does not explicitly state when to use or provide alternatives, but the behavioral detail helps the agent decide applicability.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

remove-evidence A
Idempotent

Delete an evidence record. Forbidden if the goal is already done (frozen after close).

Parameters (JSON Schema)
Name | Required | Description | Default
evidenceId | Yes | UUID of the evidence record |
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds a behavioral constraint about the goal state, which is not covered by annotations. The annotations (readOnlyHint=false, idempotentHint=true) are consistent with removal, and the description enhances transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: one sentence with the action followed by a condition. Every part earns its place with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has one required parameter and no output schema, the description is fairly complete. It covers the core action and a key constraint. It could mention return behavior, but the simplicity makes this adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% for the single parameter evidenceId. The description does not add any additional meaning beyond what the schema already provides, so it meets the baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Delete an evidence record') and the resource. It distinguishes itself from siblings like 'add-evidence-plain', which add records.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states a when-not-to-use condition: 'Forbidden if the goal is already done (frozen after close)'. This provides clear context, though it does not name alternative tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

remove-project-dependency A
Idempotent

Remove a project dependency. Accepts either dependency_id or the pair projectId + dependsOnProjectId (UUID or slug).

Parameters (JSON Schema)
Name | Required | Description | Default
projectId | No | UUID or slug of the dependent project (if dependencyId is not set) |
dependencyId | No | UUID of the dependency link itself |
dependsOnProjectId | No | UUID or slug of the target project (if dependencyId is not set) |
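
The two addressing modes for remove-project-dependency can be sketched as a small client-side argument builder. The function name `build_remove_dependency_args` is hypothetical, and preferring `dependencyId` when both forms are supplied is an assumption; the page does not say how the server resolves that case.

```python
from typing import Optional


def build_remove_dependency_args(
    dependency_id: Optional[str] = None,
    project_id: Optional[str] = None,
    depends_on_project_id: Optional[str] = None,
) -> dict:
    """Return tool arguments using one of the two addressing modes."""
    if dependency_id is not None:
        # Mode 1: address the link directly by its own UUID.
        return {"dependencyId": dependency_id}
    if project_id is not None and depends_on_project_id is not None:
        # Mode 2: address the link by the project pair (UUID or slug).
        return {"projectId": project_id, "dependsOnProjectId": depends_on_project_id}
    raise ValueError("pass dependencyId, or both projectId and dependsOnProjectId")


print(build_remove_dependency_args(project_id="api", depends_on_project_id="auth"))
```

Rejecting a half-specified pair before the call avoids a round trip that can only fail.
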
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate the tool is not read-only (readOnlyHint=false) and is idempotent (idempotentHint=true). The description adds the parameter flexibility (either dependencyId or pair) but does not disclose additional behaviors like what happens if the dependency does not exist or any side effects. With annotations covering the safety profile, the description adds moderate value.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, concise sentence that front-loads the action and immediately provides key usage information. Every word is necessary and there is no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity, full schema coverage, and the idempotent annotation, the description provides sufficient context for a typical usage. It lacks details on failure handling or permissions, but these are not critical for basic invocation. The description is adequate for an agent to understand what to do.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema descriptions cover 100% of parameters, so baseline is 3. The tool description adds value by explaining the logical relationship between parameters (either dependencyId or projectId+dependsOnProjectId), which aids correct invocation beyond the schema's individual field descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Remove a project dependency') and the resource. It also specifies the two modes of specifying the dependency (by ID or by project pair), which distinguishes it from the sibling 'add-project-dependency' tool and other removal tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use it (to remove a project dependency) and, by naming the sibling tool, hints at the alternative for adding. However, it does not explicitly state when not to use it or provide detailed usage context, such as prerequisites (e.g., dependency must exist).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

reorder-goals A
Idempotent

Set the order of goals within a single parent and priority. Takes an array of UUIDs; each goal is assigned position = its index in the array.

Parameters (JSON Schema)
Name | Required | Description | Default
ids | Yes | Goal UUIDs in the new order (all must share the same parent and the same priority) |
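
The "position = index in the array" rule for reorder-goals can be shown in a few lines. This is a sketch only: `assign_positions` is a hypothetical helper, zero-based indexing is an assumption (the description does not say where counting starts), and the duplicate check is an added safeguard rather than documented server behavior.

```python
def assign_positions(ids: list[str]) -> dict[str, int]:
    """Map each goal UUID to its new position (its index in the list)."""
    if len(set(ids)) != len(ids):
        # Assumed safeguard: a goal cannot occupy two positions at once.
        raise ValueError("duplicate goal id in ordering")
    return {goal_id: position for position, goal_id in enumerate(ids)}


print(assign_positions(["goal-c", "goal-a", "goal-b"]))
# {'goal-c': 0, 'goal-a': 1, 'goal-b': 2}
```

Because the result depends only on the submitted order, sending the same array twice yields the same positions, consistent with the idempotent annotation.
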
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotent behavior. The description adds the key behavioral detail that each goal gets position equal to its index in the array, going beyond annotations. Does not mention permissions or reversibility, but is adequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences, front-loaded with action, no wasted words. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description explains input and effect completely. Could mention return value but acceptable for a simple reordering operation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema already describes the ids parameter well (coverage 100%). The description adds the semantic detail that position equals array index, which adds meaning beyond the schema's constraint description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool sets the order of goals within one parent and priority, which distinguishes it from siblings like move-goal. The verb 'set the order' and the resource 'goals' are specific.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use (when reordering goals with same parent and priority) but does not explicitly state alternatives or when not to use. The schema adds constraints but the description could mention moving vs reordering.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

request-evidence-upload A

PREFERRED path to attach a LARGE binary evidence file (screenshot, log dump, PDF, session transcript — anything > ~1 KB) to an acceptance criterion. Returns a one-time {uploadUrl, expiresAt} scoped to this criterion. Then STREAM the raw file to it with a single PUT — no base64, no token:

curl -sS --fail --upload-file "/abs/path/to/file.png" "<uploadUrl>"

Optionally pass the hex SHA-256 of the file so the server fast-fails on any in-flight corruption:

curl -sS --fail -H "X-Content-Sha256: <hex-sha256>" --upload-file "/abs/path/to/file.png" "<uploadUrl>"

The PUT response is the same evidence JSON that attach-evidence-file returns (evidence id, serverSha256, judge verdict, criterion evidenceCount). A non-2xx PUT means the upload was rejected (expired/already-used/wrong-criterion/hash-mismatch) and NO evidence was created — request a fresh URL and retry.

Use this instead of attach-evidence-file for any non-trivial file. Use add-evidence-plain only for byte-less context (external URLs, manual repro notes) — it does NOT cover an AC.

Parameters (JSON Schema)
Name | Required | Description | Default
kind | No | Optional evidence kind override. Only accepted value is `session_history` (goal-level transcript artifact; does NOT cover an AC). |
caption | No | Optional human-readable description, stored on the evidence |
createdBy | No | Identifier of the uploading agent |
criterionId | Yes | UUID acceptance criterion the file will be evidence for |
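
For agents that cannot shell out to curl, the same single PUT can be sketched with Python's standard library. This is an illustration under assumptions: `build_upload_request` is a hypothetical helper, the URL is a placeholder rather than a real upload endpoint, and the explicit `Content-Type` is an assumption added only because urllib would otherwise default to a form-encoded content type when a body is present.

```python
import hashlib
import urllib.request


def build_upload_request(
    upload_url: str, data: bytes, with_hash: bool = True
) -> urllib.request.Request:
    """Build (but do not send) the single PUT that carries the raw file bytes."""
    # Assumption: a generic binary content type; the description does not specify one.
    headers = {"Content-Type": "application/octet-stream"}
    if with_hash:
        # Optional integrity header: hex SHA-256 of the exact bytes being sent.
        headers["X-Content-Sha256"] = hashlib.sha256(data).hexdigest()
    return urllib.request.Request(upload_url, data=data, headers=headers, method="PUT")


# Placeholder URL for illustration; a real one comes from request-evidence-upload.
req = build_upload_request("https://example.invalid/upload/one-time", b"fake png bytes")
# urllib stores header names in capitalized form, hence the lookup key below.
print(req.get_method(), req.get_header("X-content-sha256"))
```

Sending the request with `urllib.request.urlopen(req)` raises `urllib.error.HTTPError` on a non-2xx status, which per the description means no evidence was created and a fresh URL is needed.
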
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description fully discloses behavioral traits beyond annotations: it returns a one-time scoped URL with expiry, requires streaming PUT, explains non-2xx responses mean rejection and no evidence creation, and mentions optional SHA-256 for corruption detection. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with purpose, usage, examples, and alternatives. While long, every sentence adds value; the curl command is a helpful practical detail. Minor reduction possible but overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description explains the PUT response format, error handling, and retry logic, covering all key aspects of the tool's usage flow. Context is complete for an upload tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and already describes each parameter adequately. The description adds minimal extra parameter-specific meaning beyond the schema, but provides workflow context for 'criterionId'. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'PREFERRED path to attach a LARGE binary evidence file' and explicitly distinguishes it from siblings 'attach-evidence-file' and 'add-evidence-plain', making the specific verb+resource scope unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit when-to-use guidance ('non-trivial file') and when-not-to-use ('Use add-evidence-plain only for byte-less context'), names alternatives, and gives concrete examples with curl commands, enabling correct tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

todo B

Quickly record a task or improvement idea in a project.

Parameters (JSON Schema)
Name | Required | Description | Default
tags | No | Tags (the "suggestion" tag is added automatically) |
title | Yes | Task title |
project | No | Project slug (if omitted, the default project is used) |
priority | No | Priority 1–5 (default: 4) |
description | No | Description |
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=false and idempotentHint=false. The description adds no behavioral context beyond saying 'quickly write,' which aligns with mutation but doesn't disclose side effects, auth needs, or other behaviors.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence that is front-loaded with the action. It is efficient and wastes no words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 5 parameters, no output schema, and no explanation of return values or auto-tagging ('suggestion' tag is mentioned only in the schema but not in the description), the description is incomplete for a creation tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with all parameters having descriptions. The tool description does not add meaning beyond the schema, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Quickly record a task or improvement idea in a project'. It uses specific verbs and resources, distinguishing it from siblings that focus on goals, evidence, etc.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like create-goal or add-criterion. There is no mention of context, prerequisites, or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update-goal B
Idempotent

Update a goal's fields: status, priority, title, description, etc.

Parameters (JSON Schema)
Name | Required | Description | Default
mode | No | Goal mode: standard or grove. Immutable after mode_locked_at |
tags | No | New tags |
type | No | New type: goal, milestone, task, habit |
title | No | New title (max. 500) |
goalId | Yes | UUID of the goal |
status | No | New status: backlog, in_progress, blocked, done, cancelled |
deadline | No | New deadline (ISO 8601) |
estimate | No | New estimate |
priority | No | New priority (1–5) |
description | No | New description |
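
Since the update-goal schema enumerates several allowed values, a client can check them before calling the tool. The sketch below is hypothetical throughout: `build_update_goal_args` is an invented helper, sending only the fields being changed assumes the server treats omitted fields as unchanged (not confirmed on this page), and reading the title limit as 500 characters is also an assumption.

```python
# Allowed values copied from the parameter descriptions in the schema.
ALLOWED = {
    "status": {"backlog", "in_progress", "blocked", "done", "cancelled"},
    "type": {"goal", "milestone", "task", "habit"},
    "mode": {"standard", "grove"},
}
KNOWN_FIELDS = {
    "mode", "tags", "type", "title", "status",
    "deadline", "estimate", "priority", "description",
}


def build_update_goal_args(goal_id: str, **fields) -> dict:
    """Validate the supplied fields and return arguments for update-goal."""
    unknown = set(fields) - KNOWN_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    for name, allowed in ALLOWED.items():
        if name in fields and fields[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    if "title" in fields and len(fields["title"]) > 500:
        raise ValueError("title exceeds 500 characters")  # assumed unit: characters
    if "priority" in fields and not 1 <= fields["priority"] <= 5:
        raise ValueError("priority must be between 1 and 5")
    # Assumption: only the listed fields are sent; others are left untouched.
    return {"goalId": goal_id, **fields}


print(build_update_goal_args("goal-uuid", status="in_progress", priority=2))
```

Validating locally does not replace server-side rules such as the mode lock, which the client cannot see.
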
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate mutation (readOnlyHint false) and idempotency (true). Description adds no extra behavioral details (e.g., auth needs, side effects, mode immutability).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence front-loads the action and key examples with no extraneous text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Ten parameters are present but no output schema or explanation of update behavior (e.g., partial updates, returned object). Description insufficient for a complex mutation tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so each parameter already has a description. The tool description merely lists some fields (status, priority, etc.) without adding new semantics, meeting the baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool updates goal fields (status, priority, name, description, etc.), which distinguishes it from siblings like create-goal, delete-goal, or block-goal.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like move-goal or block-goal; lacks context about prerequisites or scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Resources