Server Details

Enterprise-safe agentic AI design doctrine. Read-only MCP, UK/EU residency, zero-training policy.

Status: Healthy
Transport: Streamable HTTP
Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: A

Average 4.1/5 across 21 of 21 tools scored. Lowest: 3.3/5.

Server Coherence: A
Disambiguation: 5/5

Every tool targets a distinct function: validation, asset listing, cluster/principle/example/guide access, various handoff types, user progress tracking, feedback, and team summarization. No two tools are ambiguous.

Naming Consistency: 4/5

Tools follow a domain.action pattern (e.g., architect.validate, clusters.list). Most are verb_noun, but a few like handoffs.agency use a noun, and me.coaching_context uses a noun phrase. Overall consistent but with minor deviations.

Tool Count: 4/5

21 tools is on the higher side but justified by the broad scope covering principles, examples, guides, handoffs, user progress, and feedback. Each tool serves a clear purpose, though some consolidation could reduce count.

Completeness: 4/5

The tool surface covers the main domain well: CRUD-like operations for principles/examples/guides, multiple handoff types, user evidence tracking, and feedback. Minor gaps exist (e.g., no update/delete for evidence or handoffs) but these are acceptable for the intended use.

Available Tools

23 tools
architect.certify: Certify Production-Ready Architecture (Grade A)

Pro/Teams — second-pass adversarial certification of an architect.validate run that scored production_ready (A or B first-pass tier). Mints the certified production_ready badge when both reviewers sign off; caps the run to C/emerging when the second pass surfaces a missed production_blocker. ATOMIC ONE-SHOT, RECOVERABLE: single LLM call typically runs 60-150s server-side (empirical, on real third-party code at high reasoning effort — small payloads finish faster). This exceeds the standard MCP-client tool-call idle budget (~60s in Claude Code), so the FIRST notifications/progress event fires at t=0 and carries the same run_id you passed in. If your client closes the tool-call early, recover the cert verdict via me.validation_history(run_id=<that-id>) once the server-side LLM call lands — same pattern as architect.validate. The run is atomic by contract — no in_progress lifecycle, no cancellation, no resume. If the cert call fails outright (provider error, persistence error), a fresh architect.certify is the recovery path (eligibility gate enforces the retry budget). For long-running cert workflows the answer is to re-validate, not to make this tool stateful. Eligibility gate (typed rejection enum on failure): caller must own the run, run must be tier=production_ready, less than 24h old, not already certified, and within the cert retry budget (max 3 attempts per run). Reads first-pass findings from the persisted run; the caller must re-send the code (the architect persists findings + recommendations, never code, by design — privacy-preserving). The cert outcome updates the persisted run's result_json so the public review URL + me.validation_history(run_id=...) reflect it. ENTERPRISE-SAFE: code is processed transiently by the LLM provider (OpenAI, no-training-on-API-data) and dropped; JSON-escaped + delimited as inert untrusted data so prompt-injection inside payloads is ignored. UK/EU data residency (Cloud Run europe-west2). Auth: Bearer .

Parameters (JSON Schema)
- code (required): The same code that was sent to architect.validate to produce this run_id. Sent verbatim — the cert reviewer needs the actual code to surface production_blockers the first pass missed. Running cert against findings alone (no code) cannot catch what the first pass didn't evidence. Sent under the same enterprise-safety envelope as architect.validate (transient processing, no training, JSON-escaped + delimited).
- run_id (required): The run_id from a prior architect.validate call. Returned in the validate response when persistence_status='saved'. Must be owned by the caller (per-user authorisation, same gate as me.validation_history).
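The eligibility gate above (caller-owned run, production_ready tier, under 24h old, not yet certified, within the 3-attempt retry budget) can be sketched as a pure check. This is illustrative only; the field names (owner, tier, created_at, certified, cert_attempts) and rejection labels are assumptions, not the server's actual schema or enum values.

```python
from datetime import datetime, timedelta, timezone

def certify_eligibility(run: dict, caller: str, now: datetime) -> str:
    """Return 'eligible' or a typed rejection reason (names illustrative)."""
    if run["owner"] != caller:
        return "not_owner"                      # per-user authorisation gate
    if run["tier"] != "production_ready":
        return "tier_not_production_ready"      # only A/B first-pass runs qualify
    if now - run["created_at"] > timedelta(hours=24):
        return "run_expired"                    # runs older than 24h are rejected
    if run["certified"]:
        return "already_certified"              # cert is one-shot per run
    if run["cert_attempts"] >= 3:
        return "retry_budget_exhausted"         # max 3 attempts per run
    return "eligible"
```

A caller that receives a rejection re-validates rather than retrying indefinitely, per the description's "re-validate, not stateful" guidance.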
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses key traits: bounded synchronous one-shot (20-40s), atomicity, no in_progress/cancel/resume, failure recovery via fresh cert, retry budget, enterprise-safe data handling (transient, no training, JSON-escaped), UK/EU residency, auth method. Adds context beyond annotations (readOnlyHint=false, openWorldHint=true). No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is dense but well-structured, front-loaded with core purpose and behavior. Some redundancy (e.g., repeated safety details) but overall efficiently conveys all necessary information given complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Comprehensive coverage: purpose, usage guidelines, behavior, parameter semantics, failure recovery, data handling, privacy, auth, eligibility gate, and effect on persistence (updates result_json). No output schema but return behavior implied. Fully complete given tool complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (baseline 3). Description adds meaning: for 'code', explains why it must be verbatim, ties to enterprise-safety; for 'run_id', specifies ownership and retrieval from validate response. This extra context justifies a 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as 'second-pass adversarial certification of an architect.validate run' and specifies it 'Mints the certified production_ready badge' or caps to 'C/emerging'. This distinctively differentiates it from sibling tools like architect.validate.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use (after architect.validate returns production_ready A/B) and when not (re-validate instead of stateful cert), along with eligibility gate conditions (caller owns run, tier production_ready, <24h old, not certified, retry budget). Provides alternative recovery path on failure.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

architect.validate: Validate Agent Architecture (Grade A)

Pro/Teams — single-pass doctrine review of agent code, workflow, or architecture against the 10-principle Agentic AI Blueprint. Returns a structured assessment with code_classification (autonomous_agentic_workflow vs non_agentic_component), per-principle findings (verdict, severity_score 0-100, confidence, evidence_quality, code-cited evidence, recommendation), a severity-weighted readiness block (score, grade A-F, tier), recommended examples, processing latency, reproducibility fingerprints (model, seed, system_fingerprint, doctrine_fingerprint, prompt_template_fingerprint), and a persistence_status with shareable run_id / badge_url / review_url when saved. Typed failures: timed_out, rate_limited, dependency_unavailable, schema_mismatch (each carries retryable + next_action). Returns in ~30-50s; fits the ~60s MCP-client tool budget. TWO-STEP CERTIFY: when first-pass scores production_ready (A or B tier) the response carries certification_status='not_evaluated'. To mint the certified production_ready badge, call architect.certify(run_id, code) next — a separate Pro/Teams tool with its own eligibility gate (caller-owned, tier=production_ready, less than 24h old, not already certified, retry budget). Cert is kept off the validate hot path so first-pass review always returns inside the tool budget; cert itself runs as a single LLM call (~20-40s) and also fits cleanly. Call when the user wants a governance audit, a readiness score, or a public badge for an agent or workflow they have just built or changed. Pass repository="" to group runs into a project trend. INPUT FIDELITY: when reviewing existing code, send the FULL file contents verbatim as implementation_context. Do NOT truncate, compress, summarise, or paraphrase the code; the architect's verdict relies on per-line evidence (variable names, structural choices, exact branch ordering). If the file is large, split into multiple architect.validate calls scoped by file rather than condensing one call. 
Architecture summaries are valid ONLY when no code exists yet (greenfield review). ENTERPRISE-SAFE: payloads are processed transiently by the LLM provider (OpenAI, no-training-on-API-data) and dropped; user code + context are JSON-escaped and delimited as inert untrusted data so prompt-injection inside payloads is ignored. Pass private_session=true to bypass server-side logging — enforced in code. TIMEOUT RECOVERY: this tool typically returns within the client tool budget. If your MCP client closes the tool-call early (transport reset, etc.), the run still completes server-side — the first notifications/progress event fires at t=0 carrying the run_id, and you can recover the result via me.validation_history(run_id=...) once the run completes. Recovery requires the same Bearer token (per-run authorisation) and is unavailable when private_session=true (nothing persists). UK/EU data residency (Cloud Run europe-west2). Auth: Bearer .

Parameters (JSON Schema)
- task (optional): What the agent or workflow is trying to accomplish. Adds evaluation context.
- files (optional): List of file paths relevant to the implementation context.
- goals (optional): Specific safety or quality goals to evaluate against (e.g. 'prevent irreversible actions', 'explicit approvals').
- language (optional): Programming language of the code being evaluated (e.g. 'python', 'typescript').
- focus_area (optional): Narrow the evaluation to a specific principle cluster or slug (e.g. 'delegation-and-scope').
- repository (optional): Repository name or path for additional context.
- example_limit (optional): Maximum number of curated examples to include in recommendations.
- private_session (optional): Set to true to disable all logging for this validation call.
- implementation_context (required): The artifact under review. When reviewing existing code, pass the FULL file contents verbatim — no truncation, no whitespace compression, no condensing of multi-line statements, no summarisation. The architect's findings cite specific identifiers, branch ordering, and structural choices that get destroyed by compression, so a summarised submission produces a degraded verdict that does not reflect the actual code. If a single file is too large, split into multiple architect.validate calls scoped by file/module rather than condensing one call. Architecture summaries (high-level prose) are accepted ONLY when no code exists yet (greenfield review); never as a substitute for code that already exists.
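The two-step validate-then-certify flow described above can be sketched against a generic call_tool(name, arguments) callable, keeping it transport-agnostic. The response field names (persistence_status, run_id, certification_status, readiness.tier) follow the description, but the exact response shapes are assumptions for this sketch.

```python
def validate_then_certify(call_tool, code: str) -> dict:
    # First pass: single-pass doctrine review of the full, verbatim code.
    result = call_tool("architect.validate", {"implementation_context": code})
    tier = result.get("readiness", {}).get("tier")
    # Only saved, production_ready (A/B tier) runs with an unevaluated
    # certification status are eligible for the second adversarial pass.
    if (result.get("persistence_status") == "saved"
            and tier == "production_ready"
            and result.get("certification_status") == "not_evaluated"):
        # Second pass: re-send the SAME code verbatim with the run_id.
        return call_tool("architect.certify",
                         {"run_id": result["run_id"], "code": code})
    return result
```

If the client drops the call mid-flight, the same run_id is recoverable via me.validation_history(run_id=...), per the timeout-recovery contract above; that fallback is deliberately omitted from this sketch.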
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint=false, openWorldHint=true), the description details transient processing, privacy controls (private_session), data residency, authentication, and that it calls an external LLM. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded with purpose, then returns, then privacy details. Each sentence adds value, but the privacy section is somewhat long. Still well-organized and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description explains what the tool returns (coverage, findings, recommendations) but lacks detail on output structure. Given no output schema, more specifics would help. Privacy details are good, but 'Blueprint doctrine' is not defined or linked to sibling resources.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage, so parameters are already well-documented in schema. The description does not add significant meaning beyond what the schema provides, warranting baseline 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description starts with a clear verb and resource: 'evaluate code, a workflow, or an architecture description against the Blueprint doctrine.' It distinguishes itself from sibling tools that provide reference data (e.g., principles.list) by focusing on evaluation and recommendations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states what to evaluate but does not explicitly mention when to avoid this tool or suggest alternatives among siblings. The context is adequate for basic usage but lacks exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

assets.list: List Agent Assets (Grade A)
Read-only · Idempotent

List downloadable doctrine and agent asset artifacts.

Parameters (JSON Schema)

No parameters

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and idempotentHint as true, so the description does not need to restate safety. It adds no additional behavioral insight beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single, clear sentence with no superfluous words. It conveys the essential information efficiently.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple list tool with no parameters, the description adequately conveys the result type. However, it does not mention potential omissions like pagination or sorting, though these may not be applicable.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has zero parameters, and the input schema coverage is 100%. The description adds no parameter details, but with no parameters, a baseline of 4 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies the action ('List') and the resource ('downloadable doctrine and agent asset artifacts'), which distinguishes it from other list tools like clusters.list or guides.list.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use for listing specific asset types but does not explicitly state when to use it versus alternatives. However, the resource name ('assets') and context from siblings make the usage context reasonably clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clusters.get: Get Cluster (Grade A)
Read-only · Idempotent

Get full detail for one cluster: description, member principles, and shared rationale. Use this when you already know the exact cluster slug. Prefer clusters.list when you need to discover which clusters exist.

Parameters (JSON Schema)
- slug (required): Stable slug of the principle cluster (e.g. 'delegation-and-scope').
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and idempotentHint=true, which handle safety. The description adds no additional behavioral context beyond the core action, but does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence with no extraneous information. Every word contributes to the purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity of the tool (one parameter, no output schema, clear annotations), the description is largely complete. It could benefit from briefly noting what a principle cluster is, but it is sufficient for an agent with domain knowledge.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a well-described 'slug' parameter. The description's 'stable slug' is redundant with the schema. No new meaning is added beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action 'Get' and the resource 'principle cluster' by its 'stable slug', distinguishing it from the sibling clusters.list which returns a list.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this vs alternative tools (e.g., clusters.list). The description implies it's for fetching a single cluster, but lacks exclusionary or comparative context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clusters.list: List Clusters (Grade A)
Read-only · Idempotent

List all principle clusters with their stable slugs and linked principle titles. Use this to discover which clusters exist before drilling in with clusters.get or filtering principles.list by cluster. Prefer clusters.get when you already know the cluster slug and need full detail.

Parameters (JSON Schema)

No parameters

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and idempotentHint. Description adds that it returns 'stable slugs and linked principle titles', providing useful behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, no wasted words, front-loaded with the action and key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple listing tool with no parameters and no output schema, the description is adequate. Could mention if there is pagination or ordering, but likely not needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters exist, so baseline is 4. Description does not need to add parameter info, and it does not.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool lists principle clusters and specifies the returned fields (stable slugs and linked principle titles). This distinguishes it from siblings like clusters.get.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use over alternatives like clusters.get or principles.list. Usage is implied but not elaborated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

examples.get: Get Example (Grade A)
Read-only · Idempotent

Get full detail for one curated example: code links, principle coverage, difficulty, library, and notes. Use this when you already have the slug from examples.search or a guide cross-link. Prefer examples.search to find examples by topic, principle ID, difficulty, or library.

Parameters (JSON Schema)
- slug (required): Stable slug of the curated example (e.g. 'agents-building-blocks-5-control').
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint and idempotentHint. Description adds only 'curated' qualifier, no new behavioral traits beyond what annotations convey.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, perfectly front-loaded, no redundant words. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple get-by-slug tool with good annotations and full schema, description is adequate. Missing output schema but not critical for this action.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with clear example for 'slug'. Description adds no additional meaning; baseline 3 appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Get a curated example by its stable slug', specifying verb (Get), resource (curated example), and unique identifier (slug). Distinguishes from sibling 'examples.search' which would be for searching.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implied usage: when you have a slug. No explicit when-not or alternatives mentioned. Context is clear but lacks depth.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

examples.search: Search Examples (Grade A)
Read-only · Idempotent

Search curated examples ranked by relevance, with optional filters for principle coverage, difficulty, and library. Use this when the user describes a use case, technique, or library and wants matching examples. Prefer examples.get when you already have the example slug and need full detail.

Parameters (JSON Schema)
- query (required): Free-text search query matched against example title, summary, and metadata.
- limit (optional): Maximum number of results to return. Capped at server maximum.
- library (optional): Filter by library or framework name (e.g. 'langgraph', 'openai', 'anthropic').
- difficulty (optional): Filter by difficulty level.
- principle_ids (optional): Filter to examples that cover these principle IDs.
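Assembling a search payload from these parameters can be sketched as below. The server-side maximum for limit is not documented here, so the cap of 20 is a placeholder assumption for illustration; the server enforces its own cap regardless.

```python
def build_search_args(query, library=None, difficulty=None,
                      principle_ids=None, limit=None, cap=20):
    """Build an examples.search arguments dict; only `query` is required."""
    args = {"query": query}
    if library:
        args["library"] = library                    # e.g. 'langgraph'
    if difficulty:
        args["difficulty"] = difficulty
    if principle_ids:
        args["principle_ids"] = list(principle_ids)
    if limit is not None:
        args["limit"] = min(limit, cap)              # client-side cap (assumed)
    return args
```

A typical flow is build_search_args("human approval gates", library="langgraph"), pass the result to examples.search, then fetch detail for a chosen hit via examples.get with its slug.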
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint and idempotentHint, which are consistent with the 'Search' operation. The description does not contradict annotations and adds no misleading behavior. It could mention sorting or result ranking, but the annotations already assure safety, so the bar is lower. Score 4 for consistency and sufficient transparency given the tool's simplicity.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that is front-loaded with the essential action and resource, followed by filter dimensions. Every word earns its place; no redundancy or vagueness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters, 1 required, no output schema, and sibling tools, the description covers the search dimensions but omits return behavior (e.g., pagination, ranking, default sort). For a search tool, this is a notable gap. Without output schema, completeness is lower than ideal.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the input schema fully documents each parameter. The description only lists the filter dimensions without adding new meaning (e.g., syntax, format, or behavior). Per guidelines, baseline is 3, and no extra value is provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses precise language: 'Search curated examples by text, principle coverage, difficulty, and library.' It clearly identifies the action (search) and resource (curated examples), and distinguishes from sibling tools like examples.get (presumably fetch by ID) and guides.search (different resource).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for searching examples but does not explicitly state when to use this tool versus alternatives (e.g., examples.get, guides.search) or provide exclusions. The context is clear but lacks explicit guidance, leaving the agent to infer from tool names.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

guides.get: Get Application Guide (Grade A)
Read-only · Idempotent

Get a full application guide by its stable slug (e.g. 'security-application', 'observable-evaluation'). Returns sections, action items, and linked principles. Use this when you already have the guide slug from guides.list or guides.search. Prefer guides.search when the user describes a topic in natural language; prefer guides.list when you need the full inventory.

Parameters (JSON Schema)
- slug (required): Stable slug of the application guide (e.g. 'security-application', 'observable-evaluation').
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and idempotentHint=true; description adds no behavioral context beyond repeating the purpose.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, front-loaded with action and resource, efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for a simple get operation; it lacks return-format details, but no output schema is expected.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema covers slug fully (100% coverage); the description adds example slugs but no new meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it retrieves a full application guide by stable slug, differentiating from siblings like guides.list and guides.search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies use when slug is known, but no explicit guidance on when to use list/search instead.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

guides.list · List Application Guides · Grade B
Read-only · Idempotent

List application guides that show how Blueprint principles apply to engineering challenges (security, evaluation, observability, etc.). Use this to discover which guides exist before drilling in. Prefer guides.search when the user describes a topic or failure mode in natural language. Prefer guides.get when you already know the guide slug and need full detail.

Parameters (JSON Schema):

No parameters
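The routing guidance in the description (list to discover, search for natural-language topics, get for a known slug) can be sketched as a small dispatch rule. The function below is an illustrative client-side sketch, not part of the server.

```python
# Illustrative dispatch mirroring the description's routing guidance.
def pick_guides_tool(slug=None, topic_query=None):
    if slug:
        return "guides.get"      # known slug: fetch full detail
    if topic_query:
        return "guides.search"   # natural-language topic: search
    return "guides.list"         # otherwise: enumerate the inventory
```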

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint and idempotentHint, so the description does not need to restate safety. The description adds minor context about the content of the guides but does not disclose output format or pagination behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that is front-loaded with the verb 'List' and provides essential information without any wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Without an output schema, the description should specify what the returned data contains (e.g., guide titles, IDs). It only mentions the thematic focus but omits details about the structure of the list, leaving the agent uncertain about how to use the result.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are no parameters, so the description cannot add parameter semantics beyond what the schema provides. The baseline score for zero parameters is 4, and the description does not detract from that.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists application guides and specifies the thematic content (blueprint principles applied to engineering challenges). While it distinguishes from siblings like 'guides.get' and 'guides.search' by being a list operation, it could be more explicit about what 'application guides' are.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives such as 'guides.search' or 'guides.get'. The description lacks explicit when-to-use or when-not-to-use instructions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

guides.search · Search Application Guides · Grade A
Read-only · Idempotent

Search application guides by free-text query, matched against section answers and action items. Use this when the user describes an engineering challenge (security review, evaluation harness, observability) and wants matching guides. Prefer guides.get when you already have the guide slug; prefer guides.list when you need the full inventory.

Parameters (JSON Schema):
- limit (optional): Maximum number of results to return. Capped at server maximum.
- query (required): Free-text search query matched against all guide content including section answers and action items.
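Since the limit cap is described only as "server maximum", a client may still clamp defensively. The sketch below is hypothetical; SERVER_MAX is an assumed placeholder, not a documented value for this tool.

```python
SERVER_MAX = 50  # assumption: the actual cap for guides.search is undocumented

def build_search_args(query, limit=None):
    # query is required; limit is optional and clamped client-side
    if not query:
        raise ValueError("query is required")
    args = {"query": query}
    if limit is not None:
        args["limit"] = min(limit, SERVER_MAX)  # server enforces its own cap too
    return args
```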
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint and idempotentHint, so the tool's safety profile is clear. The description adds value by explaining the scope of the search (all guide content including sections and action items), which is not inferable from annotations alone.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no extraneous words, front-loading the action and scope. It efficiently conveys the tool's purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

While the description explains what the search covers, it omits information about the return format (e.g., guide IDs, scores, snippets). Since there is no output schema, the description could be more complete by hinting at the response structure.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%; the schema already documents both parameters. The description restates that 'query' is a free-text search matching all content, but does not add significant new semantic information beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description specifies a verb ('Search'), a resource ('application guides'), and the scope ('by text query'). It also details that matching covers all content including section answers and action items, which clearly distinguishes it from sibling tools like guides.get or guides.list.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states the tool's function but provides no guidance on when to use it versus alternatives such as guides.get (for a specific guide) or other search tools (examples.search, principles.search). No when-not-to-use instructions are given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

handoffs.agency · Request Agency Handoff · Grade A

Submit an agency engagement enquiry on behalf of the authenticated user for a founder-led discovery call. Agency engagements cover four scopes: workflow sprint (rapid agentic workflow implementation), proof-of-concept (validate a specific agent design in a bounded timeframe), pilot support (co-design and validate a production-ready pilot), and advisory (ongoing architectural guidance across a product team). Use this when the user has identified a need for hands-on expert support beyond self-service learning. Requires a Firebase Bearer token.

Parameters (JSON Schema):
- role (optional): Role or title of the person submitting the agency inquiry.
- locale (optional, default: en): Response locale for the acknowledgment.
- reason (required): Description of the engagement need: workflow sprint, proof-of-concept, pilot support, or advisory.
- company (optional): Company or team name submitting the agency inquiry.
- website (optional): Website or relevant URL for the team or project.
- agent_name (optional, default: mcp-client): Name of the agent or client triggering the handoff.
- support_type (optional): Type of support needed.
- trace_summary (optional): Optional agent trace summary for operator context.
- agent_platform (optional): Platform or runtime the agent is running on.
- workflow_stage (optional): Current workflow stage.
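The documented defaults (locale 'en', agent_name 'mcp-client') and the single required field can be mirrored in a client-side argument builder. The helper below is an illustrative sketch; its name is not part of the server API.

```python
def build_agency_handoff(reason, **optional):
    # reason is the single required field; the other defaults
    # match the documented parameter defaults for handoffs.agency
    if not reason:
        raise ValueError("reason is required")
    args = {"reason": reason, "locale": "en", "agent_name": "mcp-client"}
    args.update({k: v for k, v in optional.items() if v is not None})
    return args
```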
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=false, idempotentHint=false, and openWorldHint=true. The description adds that a Firebase Bearer token is required, which is extra security context beyond the annotations. However, it doesn't describe side effects, persistence, or what happens after submission (e.g., confirmation, call scheduling).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences long, front-loaded with the core purpose. Each sentence adds necessary information: purpose, scope details, usage guidance, and authentication requirement. No redundant or filler content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 10 parameters and no output schema, the description should clarify the expected outcome (e.g., acknowledgment, call initiation). It mentions 'discovery call' but does not explain what the agent will receive after submission. This missing information reduces completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 10 parameters with 100% description coverage. The description adds value by explaining the meaning of the four engagement scopes in detail (workflow sprint, proof-of-concept, pilot support, advisory), which helps the agent choose appropriate values for 'reason' and 'support_type'. This goes beyond the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool submits an agency engagement enquiry for a founder-led discovery call. It lists four specific scopes (workflow sprint, proof-of-concept, pilot support, advisory), and the name 'handoffs.agency' differentiates it from siblings like 'handoffs.operator' and 'handoffs.partnership'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description says 'Use this when the user has identified a need for hands-on expert support beyond self-service learning.' This gives a clear condition for use. It doesn't explicitly specify when not to use or mention alternatives, but the context implies that other handoff tools handle different scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

handoffs.operator · Request Operator Handoff · Grade A

Creates a support handoff when an agent needs human review, escalation, or account-specific follow-up. Auth: Bearer <token>.

Parameters (JSON Schema):
- topic (optional, default: agent): Topic category for routing (e.g. 'agent', 'billing', 'access', 'general').
- locale (optional, default: en): Response locale for the handoff acknowledgment.
- reason (required): Clear description of why a human operator review is needed.
- page_url (optional): URL of the page or context where the handoff was triggered.
- agent_name (optional, default: mcp-client): Name of the agent or client triggering the handoff.
- trace_summary (optional): Optional summary of the agent's recent actions or trace for operator context.
- agent_platform (optional): Platform or runtime the agent is running on (e.g. 'claude-code', 'cursor', 'copilot').
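The "Auth: Bearer" requirement boils down to attaching an Authorization header. A minimal sketch, assuming the token is obtained from the client's own Firebase auth flow (which is outside this sketch):

```python
def auth_headers(token):
    # The Firebase Bearer token comes from the client's auth flow;
    # this only builds the header the handoff tools require.
    if not token:
        raise ValueError("handoffs.operator requires a Bearer token")
    return {"Authorization": "Bearer " + token}
```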
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate the tool is mutating (readOnlyHint=false) and not idempotent. The description adds an authentication requirement ('Auth: Bearer <token>') but does not disclose other behavioral details such as rate limits or side effects like ticket creation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence plus an auth note, making it concise and front-loaded. While it could be expanded slightly, it avoids unnecessary text and effectively communicates the core purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 7 parameters fully documented in the schema and no output schema, the description provides minimal additional context. It lacks explanation of return values, differentiation from sibling tools, and post-call effects, making it adequate but incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% parameter description coverage, so the schema already explains each parameter. The description does not add extra meaning beyond the schema, resulting in a baseline score.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool creates a support handoff for human review, escalation, or account-specific follow-up, using a specific verb and resource. It implicitly distinguishes from sibling tools (handoffs.agency and handoffs.partnership) by specifying the operator context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on when to use the tool: when an agent needs human review, escalation, or account-specific follow-up. However, it does not explicitly exclude other scenarios or mention alternatives like sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

handoffs.partnership · Request Partnership Handoff · Grade A

Creates a partnerships handoff for design partner, ecosystem, training, or advisory conversations needing human review. Auth: Bearer <token>.

Parameters (JSON Schema):
- role (optional): Role or title of the person submitting the partnership inquiry.
- topic (optional, default: ecosystem): Partnership topic category.
- locale (optional, default: en): Response locale for the handoff acknowledgment.
- reason (required): Clear description of the partnership opportunity or inquiry.
- website (optional): Website of the organization for additional context.
- agent_name (optional, default: mcp-client): Name of the agent or client triggering the handoff.
- organization (optional): Name of the organization or company making the partnership inquiry.
- trace_summary (optional): Optional agent trace summary for operator context.
- agent_platform (optional): Platform or runtime the agent is running on.
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=false and idempotentHint=false. The description adds the auth requirement (Bearer token) which is helpful, but does not elaborate on side effects or other behavioral nuances beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences with no redundant information. It front-loads the primary action and includes essential auth info. Every sentence contributes value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 9-parameter tool with no output schema, the description provides a basic overview and auth info but lacks details about return behavior, error conditions, or operational context (e.g., what happens after handoff). The schema fills some gaps, but completeness is moderate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% and each parameter has a description. The tool description summarizes the purpose but adds no extra meaning beyond what the schema already provides, so baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool creates a partnerships handoff for specific conversation types (design partner, ecosystem, training, advisory) needing human review. It uses a specific verb and resource, and the context is distinct from siblings like handoffs.agency.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for partnership inquiries needing human review but does not explicitly contrast with sibling tools or specify when not to use it. No exclusions or alternatives are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

me.add_evidence · Add Evidence Note · Grade A

Append a free-text evidence note to a specific stage in the authenticated user's active course. Evidence notes record concrete implementation observations, decisions, or artefacts that demonstrate progress through a Blueprint principle — for example, describing how a delegation boundary was implemented or what approval flow was chosen and why. Call this after the user has articulated something they have built or observed, not to capture intent. Requires a Firebase Bearer token.

Parameters (JSON Schema):
- note (required): Evidence note to append to the delegation boundary notes for this stage.
- stage_id (required): ID of the stage to append the evidence note to.
- course_slug (required): Slug of the course the stage belongs to (e.g. 'agentic-fundamentals').
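All three parameters are required, so a client can fail fast before the round trip. A minimal validation sketch (the helper name is hypothetical):

```python
REQUIRED = ("note", "stage_id", "course_slug")

def validate_evidence_args(args):
    # Fail fast client-side: me.add_evidence requires all three fields.
    missing = [name for name in REQUIRED if not args.get(name)]
    if missing:
        raise ValueError("missing required arguments: %s" % ", ".join(missing))
    return args
```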
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate non-readOnly and non-idempotent. The description adds that it requires a Firebase Bearer token, which is useful authentication context. It also implies mutation by saying 'append.' However, it doesn't describe side effects (e.g., whether notes are appended each time) or return behavior, which would enhance transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured paragraph with no redundant information. It front-loads the core action, provides context, includes an example, and ends with a requirement. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (3 required params, no output schema), the description covers the main action, usage context, and authentication. It lacks details on return values or error conditions, but for an append operation, it is mostly complete. The absence of output schema is not a major gap here.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, with clear parameter descriptions in the schema (e.g., 'Evidence note to append for this stage'). The description adds an example of content (e.g., 'how a delegation boundary was implemented') but does not substantially extend the meaning beyond what the schema provides. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verbs ('append', 'record') and identifies the resource ('evidence note to a specific stage'). It distinguishes between capturing intent and recording observations, which clarifies purpose and differentiates from potential sibling tools that might deal with intent.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to call: 'after the user has articulated something they have built or observed, not to capture intent.' This provides clear guidance. However, it does not explicitly name alternative tools or scenarios when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

me.coaching_context · Get My Coaching Context · Grade A
Read-only · Idempotent

Returns stages where evidence is thin relative to principle requirements. Auth: Bearer <token>.

Parameters (JSON Schema):

No parameters

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and idempotentHint. The description adds authentication details (Bearer <token>) and clarifies the output type (stages with thin evidence). No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise, with two sentences containing no fluff. Every sentence adds value: one defines the purpose, the second specifies authentication.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no parameters and good annotations, the description covers the core functionality and auth requirement. However, it does not explain what 'stages' or 'principle requirements' refer to, which could be inferred from sibling tool names but is a minor gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With zero parameters and 100% schema coverage, the description adds no parameter information, which is acceptable. The baseline for 0 params is 4, as the schema fully covers parameter semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns stages where evidence is thin relative to principle requirements. This is a specific verb-resource combination that distinguishes it from siblings like me.add_evidence or me.learning_path.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for identifying weak evidence stages, but does not explicitly specify when to use it versus alternatives or provide any exclusion criteria. No guidance on prerequisites or context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

me.learning_path · Get My Learning Path · Grade A
Read-only · Idempotent

Returns your learning path state: course progress, certification status, and Capstone track eligibility. Auth: Bearer <token>.

Parameters (JSON Schema):

No parameters

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and idempotentHint=true. The description adds that it returns state and requires authentication, which is useful but not extensive. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: two sentences, zero wasted words. The most important information (purpose) is front-loaded, followed by auth requirement. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no parameters and no output schema, the description provides a reasonable outline of what is returned (course progress, certification status, Capstone track eligibility). It could be more detailed about the structure, but it is sufficient for a simple tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has no parameters, so schema description coverage is 100%. The description does not add parameter details because none exist. Baseline for zero parameters is 4, and the description does not need to compensate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'returns' and the resource 'learning path state' and lists specific components (course progress, certification status, Capstone track eligibility), making the purpose unambiguous. It distinguishes this tool from siblings by focusing on the user's personal learning path, which is unique among the listed tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes authentication requirements ('Auth: Bearer <token>'), providing clear context for use. While it does not explicitly mention when not to use or alternatives, the tool is singular in its domain (no sibling tools for learning paths), so the context is sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

me.validation_history · My Architect Agent Validation History · Grade A
Read-only · Idempotent

Pro/Teams — return the authenticated user's architect.validate run history with the Blueprint Readiness Score (0-100), letter grade (A-F), and tier (draft, emerging, production_ready). Three lookup modes: (1) run_id=<id> returns a SINGLE run with the full persisted result_json — use this to RECOVER a result when your MCP client tool-call timed out before architect.validate returned. The run completes server-side and persists; the run_id is surfaced in the first progress notification of every architect.validate call so you have the recovery handle even when your client gives up early. (2) repository=<name> returns the full per-run trend for that repository plus a regression diff between the latest two runs. (3) No arguments returns one summary per repository the user has validated, sorted by most recent. Use modes (2) or (3) BEFORE calling architect.validate again on the same repository — they tell you which principles regressed since the last run, so you can focus the new review on what is actually changing. Auth: Bearer <token>. Pro or Teams plan required.

Parameters (JSON Schema)
- limit (optional): Maximum number of runs to return when scoped to a single repository. Capped at 50. Ignored when `run_id` is provided.
- run_id (optional): Single-run lookup by run_id (UUID). Returns the persisted result_json verbatim — the same payload architect.validate would have returned if your client hadn't timed out. Use this to recover a result when your MCP tool-call closed before the server returned. Per-run authorisation: returns only runs owned by the calling user.
- repository (optional): Repository name or path to scope the history to. Pass the same value you would pass to architect.validate. Omit to get one summary per repository. Mutually exclusive with `run_id` — if both are passed, `run_id` wins.
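The three lookup modes and their precedence rules can be sketched as a plain JSON-RPC `tools/call` payload builder. The helper name and the client-side checks below are illustrative, not part of the server; only the argument names and the documented rules (run_id wins over repository, limit capped at 50) come from the tool description.

```python
def validation_history_request(run_id=None, repository=None, limit=None, request_id=1):
    """Build a JSON-RPC tools/call payload for me.validation_history.

    Mirrors the documented rules: run_id takes precedence over repository,
    and limit only applies to single-repository history (capped at 50).
    """
    args = {}
    if run_id is not None:
        args["run_id"] = run_id          # mode (1): recover a single persisted run
    elif repository is not None:
        args["repository"] = repository  # mode (2): per-run trend + regression diff
        if limit is not None:
            args["limit"] = min(limit, 50)  # server caps limit at 50
    # mode (3): no arguments -> one summary per validated repository
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "me.validation_history", "arguments": args},
    }
```

A typical recovery flow would pass the run_id captured from the first progress notification of a timed-out architect.validate call, yielding mode (1).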
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint, idempotentHint), description adds plan requirements (Pro/Teams), auth method (Bearer token), behavior with and without repository argument, and return details including regression diff. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is multi-sentence but each sentence adds value (plan, auth, behavior, usage advice). Could be slightly more concise, but well-structured and front-loaded with critical info.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description fully explains returned fields (score, grade, tier, regression diff). Covers all three lookup modes, auth, plan requirement, and best practice usage. Highly complete for a tool of this complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% coverage and good descriptions. Description adds context: repository should match architect.validate argument, limit capped at 50, and explains the effect of omitting repository. Enhances semantic understanding beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description states specific verb ('return'), resource ('architect.validate run history'), and adds scope ('with repository or without') that distinguishes it from sibling architect.validate. Clearly describes what the tool does and how it differs.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises using this before architect.validate to avoid re-flagging fixed issues. Provides context on when to call (before a re-run), but does not explicitly mention when not to use it, though that is implied by the sibling tool's purpose.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

principles.get · Get Principle · Grade A
Read-only · Idempotent

Get full detail for one principle: definition, rationale, risk, heuristics, and linked example slugs. Use this when you already have the exact slug from principles.list or principles.search. Prefer principles.search when the user describes a topic or failure mode in natural language. Prefer principles.list when you need every principle or every principle within a cluster.

Parameters (JSON Schema)
- slug (required): Stable slug of the principle (e.g. 'establish-trust-through-inspectability').
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and idempotentHint=true, so the tool is safe and repeatable. The description adds that the slug is 'stable', meaning the identifier does not change, which is valuable behavioral context beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that clearly conveys the action, object, and identifier. No wasted words, and the most critical information is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple single-retrieval tool with one required parameter and comprehensive annotations, the description is fully adequate. It doesn't need to describe return values given the tool's straightforward nature.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema already covers the slug parameter with a description (100% coverage). The description adds that the slug is 'stable' and provides an example, which adds slight additional meaning to the parameter's semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description uses specific verb 'Get' and resource 'doctrine principle', with the identifier 'stable slug'. This clearly distinguishes it from sibling tools like 'principles.list' (multiple) and 'principles.search' (filtered), leaving no ambiguity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states when to use the tool (when you already have the exact slug) and names principles.search and principles.list as alternatives for other situations, though it stops short of an explicit when-not-to-use warning.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

principles.list · List Principles · Grade A
Read-only · Idempotent

List all 12 Blueprint principles with stable slugs, titles, and clusters. Use this when you need the full inventory or want every principle in one cluster (pass cluster slug to filter). Prefer principles.search when the user describes a topic, failure mode, or keyword in natural language. Prefer principles.get when you already know the exact slug and need full detail.

Parameters (JSON Schema)
- cluster (optional): Cluster slug to filter by (e.g. 'delegation-and-scope'). Omit to return all principles.
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds context beyond readOnlyHint and idempotentHint by noting 'stable slugs' and implying no side effects; no contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single, front-loaded sentence with no wasted words; efficiently conveys purpose and key option.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Complete for a simple list tool with one optional param; no missing info given annotations and schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%; description adds example and clarifies behavior of omitting the parameter, exceeding schema details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool lists doctrine principles with stable slugs and optional filtering, distinguishing it from get and search siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Guidance is given via the filtering option and explicit pointers to principles.search and principles.get; it lacks a dedicated when-not-to-use note but is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

principles.search · Search Principles · Grade A
Read-only · Idempotent

Search Blueprint principles by free-text query and return the closest matches ranked by relevance. Use this to find principles related to a specific design challenge, failure mode, or keyword (e.g. 'reversibility', 'approval flow', 'delegation boundary'). Returns principle title, cluster, definition, rationale, and implementation heuristics. Prefer this over principles.list when you have a specific topic in mind rather than wanting all principles.

Parameters (JSON Schema)
- limit (optional): Maximum number of results to return. Capped at server maximum.
- query (required): Free-text search query matched against principle title, definition, rationale, and cluster.
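The selection guidance that the three principles.* descriptions repeat (exact slug means get, natural-language topic means search, inventory or cluster means list) amounts to a small routing rule. The helper below is an illustrative client-side sketch, not server code; the tool names and argument keys come from the descriptions above.

```python
def pick_principles_tool(slug=None, topic=None, cluster=None):
    """Route to the right principles.* tool per the descriptions' guidance."""
    if slug:                        # exact slug known -> full detail
        return ("principles.get", {"slug": slug})
    if topic:                       # natural-language topic or failure mode
        return ("principles.search", {"query": topic})
    if cluster:                     # every principle within one cluster
        return ("principles.list", {"cluster": cluster})
    return ("principles.list", {})  # full inventory of all 12 principles
```

For example, a user asking about "approval flow" routes to principles.search, while a follow-up on a slug returned by that search routes to principles.get.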
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and idempotentHint=true, so the agent knows it's safe and repeatable. The description adds behavioral context: results are 'closest matches ranked by relevance' and lists return fields (title, cluster, definition, rationale, implementation heuristics). This goes beyond what annotations provide, earning a high score.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences: the first states the core function, the second gives usage guidance and preference over a sibling. Every sentence is necessary and impactful, with the main action front-loaded. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite having no output schema, the description explicitly lists all return fields (title, cluster, definition, rationale, implementation heuristics) and mentions ranking. For a search tool with only two parameters, this provides complete context for an agent to understand what it will receive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema already documents both parameters. The description adds value for the query parameter by specifying it is matched against 'principle title, definition, rationale, and cluster,' which clarifies search scope. For limit, no extra info is added, but the description overall provides meaningful context that supplements the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description starts with 'Search Blueprint principles by free-text query and return the closest matches ranked by relevance,' which is a specific verb+resource. It clearly distinguishes from sibling tools like principles.list (all principles) and principles.get (specific principle) by mentioning the use case of finding principles related to a design challenge or keyword.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states 'Prefer this over principles.list when you have a specific topic in mind rather than wanting all principles,' providing direct guidance on when to use this tool versus an alternative. This is exceptional for helping an agent choose correctly.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

signals.feedback · Submit Feedback · Grade A

Records explicit user feedback — open to all callers, no auth required. Call ONLY when the user explicitly says they want to give feedback; never proactively. contact_email stored only when permission_to_follow_up is true — confirmed in response.

Parameters (JSON Schema)
- surface (optional): Which Blueprint surface the feedback is about. Use 'mcp' if the session was via Claude Code or another MCP client. Use 'principles', 'examples', 'guides', 'coaching', or 'validation' based on what the user interacted with.
- task_type (optional): What the user was doing when they decided to give feedback. Use plain English — e.g. 'code-review', 'architecture-design', 'agent-setup', 'onboarding', 'validation'. Infer from context.
- what_helped (optional): Ask the user: 'What was most helpful?' Record their answer verbatim or paraphrased in plain English. Max 1000 chars. No code snippets, no proprietary content.
- what_missing (optional): Ask the user: 'What was missing or could be improved?' Record their answer verbatim or paraphrased. Max 1000 chars.
- contact_email (optional): Only ask for this if the user explicitly says they want a follow-up response. Never prompt for email unprompted. Only stored when permission_to_follow_up=true.
- rating_clarity (optional): Ask the user: 'How clear was the Blueprint guidance? Rate 1–5.' 1 = very unclear, 5 = very clear. Only set if the user gives an explicit number.
- would_use_again (optional): Ask the user: 'Would you use the Blueprint again for a similar task?' Set true/false based on their answer. Only set if they answer explicitly.
- rating_usefulness (optional): Ask the user: 'How useful was the Blueprint for this task? Rate 1–5.' 1 = not useful, 5 = very useful. Only set if the user gives an explicit number.
- permission_to_follow_up (optional): Set to true only if the user explicitly said they want a follow-up. Must be confirmed before storing contact_email.
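The consent rule (contact_email is only stored when permission_to_follow_up is true) and the 1000-char caps are the kind of constraints a careful client can enforce before calling the tool. This is a hypothetical client-side guard, not server behavior; only the field names and limits come from the schema above.

```python
def build_feedback_arguments(what_helped=None, what_missing=None,
                             contact_email=None, permission_to_follow_up=False,
                             **extra):
    """Assemble signals.feedback arguments, enforcing the documented rules:
    free-text answers are capped at 1000 chars, and contact_email is only
    included when the user explicitly granted follow-up permission."""
    args = dict(extra)  # e.g. surface, task_type, rating_clarity
    if what_helped:
        args["what_helped"] = what_helped[:1000]
    if what_missing:
        args["what_missing"] = what_missing[:1000]
    if permission_to_follow_up and contact_email:
        args["permission_to_follow_up"] = True
        args["contact_email"] = contact_email
    return args
```

Dropping contact_email client-side when permission is absent keeps the consent check from depending solely on the server.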
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=false and idempotentHint=false. The description states 'Records explicit user feedback', confirming mutation. It also adds behavioral detail that 'contact_email stored only when permission_to_follow_up is true — confirmed in response.' This goes beyond annotations. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with two sentences, front-loaded with purpose and usage guidance. Every word serves a function, and there is no redundancy or unnecessary information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 9 optional parameters and no output schema, the description covers the main behavior and usage constraint adequately. It could mention return values or confirmation, but the context is sufficient for an experienced agent. Minor gap in not describing what the response contains beyond 'confirmed'.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description does not add additional meaning beyond what is in the schema's parameter descriptions. The only extra is the condition linking contact_email to permission_to_follow_up, but that is also noted in the parameter description. Thus, no significant value added over schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool records explicit user feedback, specifies it is open to all callers with no auth required, and emphasizes the verb 'Records' and resource 'user feedback'. It also distinguishes itself from sibling tools by focusing on feedback submission and not other signals.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit when to use: 'Call ONLY when the user explicitly says they want to give feedback; never proactively.' This is strong guidance. However, it does not name alternative tools (e.g., signals.report) for when not to use, but the directive is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

signals.report · Report Value Event · Grade A

Pro/Teams — records a value moment after a successful validation or design session. Only structured fields stored — no prompts or code. Do not include proprietary content in brief_context. Offer once per session after a clear success signal; never silently. Requires active Pro or Teams Bearer token. Auth: Bearer <token>.

Parameters (JSON Schema)
- team_size (optional): If the user mentions their team size during the session, record it here. Do not ask for it explicitly — only capture if volunteered.
- event_type (required): Pick the type that best matches what just happened: 'review_confidence' — architect.validate returned aligned; 'runtime_risk_found' — architect.validate found violations; 'workflow_clarity' — principles/examples clarified a design decision; 'agent_setup_success' — user successfully wired up an agent or MCP tool; 'onboarding_helped' — user understood how to start using the Blueprint; 'research_time_saved' — user found relevant doctrine faster than expected; 'team_alignment' — Blueprint helped align a team on agentic design; 'other' — use only if none of the above fit.
- surface_used (optional): Where the value was experienced. Use 'mcp' when called from Claude Code, Cursor, Windsurf, or any MCP client. Use 'principles' if the user was browsing or searching principles. Use 'examples' if the user was reading implementation examples. Use 'for-agents' if the user came via the /for-agents page. Use 'learn' or 'certification' for course-related sessions.
- brief_context (optional): 1–2 plain-English sentences summarising what was helpful. Example: 'Validation identified a missing approval gate before email send.' No code snippets, no proprietary content, no user PII. Max 500 chars.
- workflow_stage (optional): Infer from what the user was doing: 'exploring' — reading doctrine, browsing principles; 'designing' — planning architecture or agent flows; 'implementing' — writing or refactoring code; 'reviewing' — running architect.validate on existing code; 'shipping' — preparing for production or deployment.
- perceived_value (optional): Ask the user: 'On a scale of 1–5, how valuable was this session?' Map their answer directly: 1=low, 5=high. Do not guess — only set this if the user gave an explicit score.
- would_recommend (optional): Ask the user: 'Would you recommend the Blueprint to a colleague?' Set true/false based on their answer. Only set if asked — do not assume.
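Since event_type is the only required field and takes a closed set of values, a client can validate it before the call and enforce the 500-char brief_context cap. This is a hedged client-side sketch; the enumerated values and the cap come from the schema above, while the helper itself is hypothetical.

```python
# Enumerated event_type values from the signals.report schema.
EVENT_TYPES = {
    "review_confidence", "runtime_risk_found", "workflow_clarity",
    "agent_setup_success", "onboarding_helped", "research_time_saved",
    "team_alignment", "other",
}

def build_report_arguments(event_type, brief_context=None, **optional):
    """Assemble signals.report arguments: event_type must be one of the
    documented values, and brief_context is capped at 500 chars."""
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event_type: {event_type!r}")
    args = {"event_type": event_type, **optional}
    if brief_context:
        args["brief_context"] = brief_context[:500]
    return args
```

Rejecting unknown event types locally saves a round trip and keeps 'other' as an explicit last resort rather than a silent fallback.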
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations, the description reveals that it is a write operation (mutates state), requires a Pro/Teams Bearer token, and logs structured fields only (no prompts/code). It also specifies that the tool should be offered only after explicit success signals, not silently.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at four sentences, front-loading the purpose and key constraints. It avoids unnecessary detail, though a more structured format could improve readability slightly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 7 parameters and no output schema, the description covers essential aspects: when to use, auth, data privacy, and behavior. It omits error handling or success feedback, but these are not critical for basic usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema already thoroughly documents each parameter. The description adds little additional parameter-level meaning beyond the schema, meeting the baseline expectation but not exceeding it.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool records a value moment after a successful session. It specifies the exact resource ('value moment') and action ('records'), and distinguishes it from other tools like signals.feedback by focusing on success signals and structured fields.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear guidance on when to use the tool ('after a clear success signal' and 'offer once per session'), what not to include ('no proprietary content'), and authentication requirements. However, it does not explicitly mention alternatives or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

team.summarize · Summarize Team Usage · Grade A
Read-only · Idempotent

Pro/Teams — summarises your tool usage patterns and value signals from log data. Offer when user asks how the Blueprint is helping or what to explore next; not proactively. ENTERPRISE-SAFE: pass private_session=true to bypass all server-side logging for this summary call. UK/EU data residency (Cloud Run europe-west2). Auth: Bearer <token>.

Parameters (JSON Schema)
- days_back (optional): Number of days of usage history to include in the summary.
- private_session (optional): Set to true to skip logging this summary call.
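The enterprise-safe path (private_session=true bypasses server-side logging for the call) can be made explicit in the request payload. A minimal sketch, assuming the standard MCP JSON-RPC tools/call shape; the helper name is illustrative.

```python
def summarize_request(days_back=None, private_session=False, request_id=1):
    """Build a tools/call payload for team.summarize. Setting
    private_session=True asks the server to skip logging this call."""
    args = {}
    if days_back is not None:
        args["days_back"] = days_back   # window of usage history to summarise
    if private_session:
        args["private_session"] = True  # enterprise-safe: no server-side log
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "team.summarize", "arguments": args},
    }
```

A privacy-sensitive team would default private_session to True in their connector configuration rather than rely on per-call discipline.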
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint and idempotentHint. Description adds crucial context: enterprise-safe with private_session flag to bypass logging, UK/EU data residency (Cloud Run europe-west2), and Bearer auth. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with clear front-loading of purpose, then usage, then security details. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers purpose, usage, security, and residency. However, no output schema and description does not specify what the summary contains (e.g., format or metrics), leaving slight ambiguity for the agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with both parameters having descriptions. Description adds context about private_session bypassing server-side logging, but this is largely consistent with the schema description. Baseline 3 applies since schema already covers semantics adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'summarises your tool usage patterns and value signals from log data', which is a specific verb+resource. It distinguishes from sibling tools (e.g., assets.list, clusters.list) that do not perform summarization.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit guidance: 'Offer when user asks how the Blueprint is helping or what to explore next; not proactively.' This tells the agent when to use it, though it lacks explicit mention of alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
