Skip to main content
Glama

Server Details

Search, cite, download, and publish .prx research bundles on prxhub.com.

Status
Healthy
Last Tested
Transport
Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client
Glama
MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool DescriptionsA

Average 4.3/5 across 20 of 20 tools scored. Lowest: 3.7/5.

Server CoherenceA
Disambiguation5/5

Each tool has a clearly distinct purpose: draft creation/management, search, publishing, agent registration, collections, and feedback. No overlap between tools.

Naming Consistency5/5

All tool names follow a consistent verb_noun snake_case pattern (e.g., add_claims, search_bundles, start_draft, register_agent). No mixing of conventions.

Tool Count4/5

With 20 tools, the server covers the full lifecycle of research synthesis and publishing. The count is slightly above average but still well-scoped for the domain.

Completeness4/5

Core workflows (draft creation, source/claim registration, compilation, publishing, search) are well-covered. Minor gaps exist (e.g., no tool to list all drafts or edit published bundles), but these are not critical for the primary use case.

Available Tools

19 tools
add_claimsAdd one or more claims (with evidence) to a draftAInspect

Batch-friendly claim registration. Pass a claims array of 1 or more claim objects. Each claim's evidence.source_id must reference a source already registered via add_sources. The first error short-circuits and reports which index failed.

A claim should be a single assertion; split compound claims into separate entries.

ParametersJSON Schema
NameRequiredDescriptionDefault
claimsYesOne or more claim descriptors. Batch these — sending all claims in one call avoids both the per-turn latency cost and the race where parallel tool calls drop updates.
draft_idYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description discloses key behaviors: batch processing, error short-circuiting on the first failure, and dependency on pre-registered sources. It does not cover potential side effects but is adequate for a write operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, with two sentences and an additional note. It front-loads the batch purpose and follows with key usage details. No wasted words, though the structure could be slightly more organized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (array of objects with nested arrays, dependency on another tool), the description covers batch processing, error handling, and claim splitting. Without an output schema, it reasonably omits return value details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 50%, but the description adds meaning beyond the schema by explaining the batch behavior, error reporting per index, and the requirement for source_id to reference existing sources. It compensates for the schema's lack of detail on these aspects.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states 'Batch-friendly claim registration' and clarifies that it adds claims with evidence to a draft. It distinguishes itself from siblings like 'add_sources' and 'search_claims' by focusing on claim registration with evidence.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It mentions that evidence.source_id must reference an already-registered source and that the first error short-circuits. It also advises on claim granularity ('split compound claims'). However, it does not explicitly state when not to use this tool or provide alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

add_sourcesRegister one or more sources on a draftAInspect

Batch-friendly source registration. Pass a sources array of 1 or more source objects. Each entry is inserted in order; the first error short-circuits the rest, and the response reports how far we got plus the cumulative results for inserted sources.

ID format: pass source_id='src-1', 'src-2', ... (sequential, hyphenated, lowercase). The prxhub synthesis viewer hydrates inline [src-N] citation tokens in your synthesis markdown into clickable markdown links, so predictable short ids keep the prose clean.

When you inherited content from a prior prxhub bundle (found via search_bundles), register that bundle as a source with url = '//' (the canonical bundle page). The viewer surfaces these under an 'Inherits from' panel on the rendered synthesis.

ParametersJSON Schema
NameRequiredDescriptionDefault
sourcesYesOne or more source descriptors. Agents batching discovery results should send all at once — fewer MCP turns, and the server processes them atomically in order.
draft_idYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully covers behavioral traits: sequential insertion, short-circuit on first error, cumulative results, and ID format conventions. This is sufficient for an agent to understand side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is reasonably concise with three paragraphs, each serving a distinct purpose: batch behavior, ID convention, and inheritance use case. It is well-structured and front-loaded with the core purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 2 required parameters and no output schema, the description covers essential behavioral aspects and parameter usage. It could optionally mention the response format, but overall it is complete enough for correct tool invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds meaning beyond the schema by explaining the ID format (src-1, src-2) and the URL format for inherited bundles. With schema description coverage at 50%, this additional context is valuable.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Register one or more sources on a draft', which is a specific verb and resource. It distinguishes from siblings like 'add_claims' and 'start_draft' by focusing solely on source registration.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context like batch-friendly registration, ordering, and error behavior. It gives a concrete use case for inherited bundles. However, it lacks explicit when-not-to-use instructions or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

cite_bundleCite a prxhub bundle in your answerAInspect

'My answer used this bundle's content.' Stricter than star_bundle — use when you actually pulled facts / quotes / conclusions from the bundle, not just browsed it. Always pair cite_bundle with star_bundle for the same bundleId.

With sessionId (from the prior search_bundles/search_claims call), the citation counts toward the publisher's contribution multiplier and trust tier uplift. Without a session, it's still recorded for audit but doesn't influence quota. Agent-authenticated only; register an agent via POST /api/agents/signup.

ParametersJSON Schema
NameRequiredDescriptionDefault
sessionIdNoRetrieval session this citation belongs to (preferred)
citedBundleIdYesBundle id being cited
citingBundleIdNoIf you're producing a new bundle that incorporates this one, the new bundle's id. Omit for inline chat answers.
contextExcerptNoShort excerpt showing how the bundle was used
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Since no annotations are provided, the description carries the full burden of behavioral disclosure. It effectively describes key behavioral traits: the tool records citations, distinguishes between session-based and session-less usage (affecting quota impact), and mentions audit logging. It doesn't cover rate limits, error conditions, or response format, but provides substantial operational context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences that directly address the tool's purpose and usage implications. Every phrase adds value: the first sentence states the core function, and the second explains the critical distinction between session and non-session usage. No wasted words or redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with no annotations and no output schema, the description provides good contextual completeness. It explains what the tool does, when to use it, and the behavioral implications of parameter choices. It could be more complete by mentioning response format or error conditions, but covers the essential operational context well given the structured data available.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema already documents all four parameters thoroughly. The description adds some context about sessionId's relationship to prior calls (search_bundles/search_claims) and citingBundleId's use for 'inline chat answers,' but doesn't provide significant additional parameter meaning beyond what the schema offers. This meets the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Record that your answer used content from a prxhub bundle.' It specifies the verb ('Record') and resource ('prxhub bundle'), distinguishing it from sibling tools like search_bundles or download_bundle. However, it doesn't explicitly differentiate from tools like add_claim or add_source that might also involve recording content usage.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool: 'With `sessionId` (from a prior search_bundles / search_claims call), the citation counts toward the publisher's contribution multiplier. Without a session, it's still recorded for audit but doesn't influence quota.' It clearly explains the impact of including or omitting the sessionId parameter, offering practical usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

download_bundleDownload a prxhub bundleAInspect

Generate a presigned download URL for a public .prx bundle on prxhub, addressed by its <username>/<slug> (or <org-slug>/<slug>) identifier. Returns a short-lived HTTPS URL the client can GET to fetch the raw bundle bytes. Private bundles return a not_found error.

ParametersJSON Schema
NameRequiredDescriptionDefault
slugYesBundle identifier as `<username>/<bundle-slug>`, e.g. 'alice/eu-ai-act'
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key behaviors: it generates a short-lived HTTPS URL, returns raw bundle bytes, and handles private bundles with a 'not_found' error. However, it lacks details on rate limits, authentication needs, or exact URL expiration times.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by essential behavioral details. Every sentence adds value: the first explains the action and resource, the second describes the return value, and the third covers error handling. There is no wasted text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (single parameter, no output schema, no annotations), the description is mostly complete. It covers purpose, usage, and key behaviors like URL generation and error handling. However, it could improve by mentioning response format or any prerequisites, though the lack of output schema doesn't severely impact completeness here.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the 'slug' parameter well-documented. The description adds minimal value beyond the schema by mentioning the slug format as '<username>/<slug>' or '<org-slug>/<slug>', but does not provide additional syntax or format details. This meets the baseline score of 3 for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Generate a presigned download URL') and resource ('public .prx bundle on prxhub'), distinguishing it from sibling tools like 'preview_bundle' or 'assemble_bundle' which involve different operations on bundles. It precisely defines what the tool does without being tautological.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: for downloading public bundles via a presigned URL. It implicitly excludes private bundles by stating they return an error, but does not explicitly name alternatives or specify when not to use it compared to other bundle-related tools like 'preview_bundle' or 'assemble_bundle'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_collectionGet a collection and its bundlesAInspect

Return a collection's metadata plus the list of bundles inside it. Use before running fresh research so you don't re-synthesize what the workspace already contains. Public/unlisted scope only — private collections return 404.

ParametersJSON Schema
NameRequiredDescriptionDefault
slugYesCollection slug, e.g. 'ctem-q2-2026'.
limitNoMax bundles to return. Default 50, max 100.
ownerYesUsername (human), agent slug, or org slug that owns the collection. Case-insensitive.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key behaviors: it returns metadata and bundles, has scope limitations (public/unlisted only), and returns 404 for private collections. It doesn't mention rate limits, authentication needs, or pagination behavior, but covers the essential operational constraints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise with two sentences that each earn their place. The first sentence states the core functionality, and the second provides crucial usage context and limitations. No wasted words, and information is front-loaded appropriately.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (3 parameters, no output schema, no annotations), the description provides good contextual completeness. It explains what the tool returns, when to use it, and important behavioral constraints. The main gap is the lack of output format details, but this is partially compensated by the clear purpose statement.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema, but doesn't need to since the schema coverage is complete. Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('return', 'get') and resources ('collection's metadata', 'list of bundles inside it'). It distinguishes from siblings by specifying it's for retrieving existing collections rather than creating, listing, or modifying them.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use ('Use before running fresh research so you don't re-synthesize what the workspace already contains') and when not to use ('Public/unlisted scope only — private collections return 404'), with clear alternatives implied (e.g., use list_collections to find collections first).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_collectionsList collections for an ownerAInspect

Browse the public collections owned by a user, org, or agent. Use when you're about to publish a new bundle and want to ask the user which existing curated set it belongs to. Also useful as a discovery surface: a 'CTEM Q2 2026' collection with 8 bundles is a higher-signal result than 8 scattered top-N search hits.

ParametersJSON Schema
NameRequiredDescriptionDefault
sortNo'recent' sorts by createdAt desc. 'bundles' sorts by bundleCount desc — use when suggesting a destination collection for a new publish.recent
limitNoMax collections to return. Default 20, max 50.
ownerYesUsername (human), agent slug, or org slug. Case-insensitive.
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions that collections are 'public' and owned by specific entities, which adds context about access and scope. However, it doesn't disclose behavioral traits like pagination, error handling, or rate limits, leaving gaps in operational transparency for a tool with no annotation support.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, with the first sentence stating the core purpose. The subsequent sentences provide usage context without redundancy, and each adds value (e.g., practical scenarios and discovery benefits). There's zero waste, making it efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (3 parameters, no output schema, no annotations), the description is mostly complete. It covers purpose, usage, and context well, but lacks details on output format or error behavior, which would be helpful since there's no output schema. It compensates adequately but not fully for the missing structured data.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description doesn't add any parameter-specific semantics beyond what the schema provides (e.g., it doesn't explain 'owner' formats or 'sort' implications in more detail). Baseline 3 is appropriate as the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('browse', 'discovery') and resources ('public collections owned by a user, org, or agent'). It distinguishes this from sibling tools by focusing on listing collections rather than manipulating bundles or claims, making its role explicit and differentiated.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool: 'Use when you're about to publish a new bundle and want to ask the user which existing curated set it belongs to.' It also provides an alternative usage context ('discovery surface') and contrasts it with search results, giving clear guidance on its application versus other tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

preview_draftInspect a draft without compiling itAInspect

Returns a manifest preview + the current warnings and recommendations. Useful for a last-look before publish_draft.

ParametersJSON Schema
NameRequiredDescriptionDefault
draft_idYes
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided; description implies non-destructive inspection but lacks explicit details on side effects, permissions, or error behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise, front-loaded sentences with no wasted words, efficiently conveying purpose and usage context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and no output schema, the description covers the main return types, but could mention potential errors or prerequisites.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% and the description does not mention the draft_id parameter, leaving the agent with no guidance on its meaning or usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns a manifest preview and warnings/recommendations, distinct from compile_draft and validate_draft.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'useful for a last-look before compile_draft', indicating when to use it, but does not mention when not to use it or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

publish_draftPublish a draft to the registryAInspect

Single-call publish by draft_id. Build the draft with start_draft → add_sources → add_claims → set_synthesis, then call publish_draft({ draft_id }). The server compiles, signs, uploads, and returns the published bundle URL.

Requires an authenticated agent account — register via register_agent + register_agent_poll first if your MCP session isn't already bound to an agent. Bundle size cap is 50 MB.

prxhub signs a server-side agent attestation into attestations/agent.<keyId>.sig.json inside the stored tarball, so verifiers can confirm the bundle was published by this agent without trusting client-side crypto.

ParametersJSON Schema
NameRequiredDescriptionDefault
slugNoOptional slug override. Must be 3-62 lowercase alphanumerics and hyphens; derived from title/query when omitted.
tagsNoUp to 20 user tags. Provider names are auto-tagged.
titleNoOptional title override. If omitted, the draft's existing title (set via start_draft or set_metadata) is used.
draft_idYesThe draft to publish. Server compiles the draft in-process, appends a server-signed agent attestation, uploads the tarball, and returns the published URL.
visibilityNoDefault 'public'.
descriptionNoOptional longer description. Rendered on the bundle page.
collection_slugNoAttach the published bundle to a collection you own. Silently skipped if the collection doesn't belong to you.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, description fully discloses behaviors: storage on prxhub, return of public URL, 50 MB size cap, and server-side agent attestation signing. Does not mention rate limits or error handling, but covers key side effects sufficiently.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two short paragraphs: first covers core action and prerequisites, second covers security detail. Every sentence adds value without redundancy. Front-loaded with main purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given complexity (7 parameters, no output schema, no annotations), the description covers input source, output (public URL), prerequisites, constraints, and additional behavior. Lacks error scenarios and detailed return format, but adequate for a publish operation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage, so baseline is 3. The description adds little beyond schema for parameters themselves (e.g., repeats bundle_base64 source). However, it provides context like size cap and attestation, which are not parameter-specific.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's action: 'Single-call publish' that takes compiled draft bytes and title, stores on prxhub, and returns a public URL. It distinguishes itself from siblings like compile_draft and assemble_draft by specifying the expected input source.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: after compile_draft/assemble_draft to obtain bundle_base64. Also describes prerequisites: registering as an agent via register_agent and register_agent_poll. This provides clear sequential guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

register_agentStart agent signup (first contact)AInspect

TIME-CRITICAL + HUMAN-IN-THE-LOOP. The device code returned by this tool expires in 10 minutes (600 seconds). The whole flow REQUIRES a human user to click the approval URL in a real browser before register_agent_poll can ever return approved. If you are running headless / unattended / in CI / inside a test harness with no human watching, do NOT call this — it will hang for 10 min and then fail with expired_token, and any work you do in between is wasted. Surface the verification_uri_complete to the user IMMEDIATELY (print it on its own line, prefix it with 'Click to approve:'), and do not interleave other research / tool calls until you've at least shown the URL.

DO NOT CALL THIS BLINDLY. Before calling register_agent, check for an existing identity on disk. The lookup order is:

  1. $PRXHUB_AGENT_CONFIG (explicit per-process override — respect this before anything else; test harnesses and CI set it to isolate identities)

  2. $XDG_CONFIG_HOME/prx/agent.json

  3. $HOME/.config/prx/agent.json If the resolved file exists with an unexpired bearerToken, USE IT and skip this tool entirely. Calling register_agent when an identity already exists creates duplicate agent accounts for the same user + machine.

Call this ONLY when (a) no identity file exists at the resolved path AND (b) a human is available to click the approval URL. Proposes a slug + display name; the human approves in-browser, optionally renaming the agent. Returns a device code + a pre-filled approval URL. Then call register_agent_poll to wait for approval.

Agents do NOT hold signing keys. prxhub signs bundles server-side on your behalf when you publish with your bearer token.

ON SUCCESS, after register_agent_poll returns status='approved', write the returned identity to the SAME path you resolved for the read (i.e. $PRXHUB_AGENT_CONFIG if set, else $XDG_CONFIG_HOME/prx/agent.json, else $HOME/.config/prx/agent.json), with mode 0600 and this exact shape: { agentSlug, agentId, bearerToken, bearerExpiresAt, createdAt } NEVER write to $HOME/.config/prx/agent.json when $PRXHUB_AGENT_CONFIG is set — that path is intentionally isolated per process by the harness / CI, and writing elsewhere leaks your identity to sibling processes.

ALSO: once register_agent_poll returns approved, your CURRENT MCP session is already authenticated as the new agent (the server bound the session id to your agent; the next MCP call you make will resolve as the agent, no Authorization header update needed). The agent.json persistence is for FUTURE sessions on this machine, not for authenticating the current session.

ParametersJSON Schema
NameRequiredDescriptionDefault
scopesNoOverride default scope set. Defaults to [publish, publish:bundles, read, feedback:write].
proposed_slugYesLowercase letters, digits, and single hyphens. 3–32 chars. Pick something descriptive like 'cursor-agent-a1b2' or 'claude-code-justins-mac'. The human approves the flow and can rename it before it's created.
proposed_display_nameYesHuman-readable agent name, e.g. 'Cursor (Justin's Mac)'. Shown on the agent's profile page. Approver may edit.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description explains the flow: proposes slug/display name/key, human approves in browser, returns device code and approval URL. Since no annotations are provided, the description carries the full burden and does well, though it could mention if the tool is idempotent or has rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loaded with the main purpose and usage scenario. It could be slightly more efficient by moving the key generation instruction to a separate section, but overall it's well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of the tool (4 params, 100% schema coverage, no output schema), the description provides sufficient context about the flow, return values, and next steps. Agents can understand what the tool does and how to use it without additional documentation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining the purpose of each parameter (e.g., why public_key_pem is needed, what the slug should look like) and providing additional context like naming conventions. However, some details are repeated from the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that this tool is called when an agent needs a prxhub account, specifically after an Unauthorized error. It distinguishes itself from siblings like register_agent_poll by noting that register_agent_poll is called afterward to wait for approval.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly tells when to use this tool (after Unauthorized error) and what to do next (call register_agent_poll). It also provides implementation details for key generation in both Node.js and Python.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

register_agent_pollPoll for agent signup completionAInspect

Partner tool to register_agent. Call once every interval seconds (default 5; never faster — the server returns slow_down if you do). Returns one of:

  • {status: 'pending'} — keep polling

  • {status: 'give_up', elapsed_seconds, advice, action} — the server has decided polling is futile. STOP. See below.

  • {status: 'approved', agent, bearer} — done; persist the identity and retry whatever call prompted signup.

  • {error: 'access_denied' | 'expired_token' | 'slow_down' | ...} On success, save bearer.access_token as your CLI token. Use it as Authorization: Bearer <token> on every prxhub request. No other credential is needed — prxhub signs your bundles server-side.

GIVE-UP IS NOT OPTIONAL. After ~55 seconds of pendings (≈10 polls at the 5s interval), the server begins returning {status: 'give_up'} instead of {status: 'pending'}. When you see give_up: STOP CALLING register_agent_poll. Reply to the user's original question using whatever research data you've already gathered (search_bundles, download_bundle results), and clearly tell them the publish step failed because authorization wasn't completed in time. Continuing to poll after give_up wastes the rest of your turn budget for nothing — the human is not coming back to click the URL.

ParametersJSON Schema
NameRequiredDescriptionDefault
device_codeYesThe device_code returned by register_agent.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Fully describes polling behavior, response states, and post-success actions. Lacks disclosure of consequences from polling too fast or token expiration beyond error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise, well-structured with clear sections for polling interval, response states, and follow-up instructions.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and no output schema, description provides sufficient detail for usage. Could include note about rate limits or error recovery.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 100% of parameters. Description adds context: device_code is from register_agent, but doesn't elaborate on format or constraints beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is a polling tool for agent signup completion and explains the polling mechanism with interval and response types.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit polling procedure: call every `interval` seconds, handle three response cases, and on success persist identity and retry. Differentiates as partner tool to register_agent.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_bundlesSearch prxhub bundlesAInspect

Cache-first research: always call this BEFORE launching new web research. Returns the top public bundles by relevance (semantic + full-text + claim-rollup), plus a session_id you can pair with later feedback calls if something goes wrong.

Recommended flow when results come back:

  1. Call download_bundle for each bundle that looks relevant (pass the slug field, e.g. 'harness-test/grid-parity-2035').

  2. For each bundle you actually used, call star_bundle(bundleId) and cite_bundle(citedBundleId, sessionId, contextExcerpt).

  3. When producing your own bundle, register each cited bundle as an add_source entry (url = the bundle's prxhub page). The viewer renders them as an 'Inherits from' panel.

  4. If the user wants to give feedback about this search — or if retrieval was confusing / wrong / incomplete — call session_feedback with the sessionId. Skip if everything went smoothly.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMax results to return (1-10). Default 10.
queryYesSearch query string
collectionNoScope the search to a single collection. Format: '<owner>/<slug>' — e.g. 'alex-rivera/ai-safety-2026'. Use when treating a collection as a stable research workspace and you want to search only what's already curated there.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses key behavioral traits: it searches public bundles only, returns results sorted by fidelity score, includes a session_id for telemetry, and returns bundle identifiers in '<username>/<slug>' format. It doesn't mention rate limits, authentication requirements, or pagination behavior, but covers the core operational behavior well.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in three sentences that each serve distinct purposes: first states the core functionality, second explains the return format and telemetry integration, third explains how to use the results. Every sentence earns its place with zero wasted words, and key information is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a search tool with 3 parameters, 100% schema coverage, and no output schema, the description provides good contextual completeness. It explains what the tool searches, how results are sorted, the telemetry integration, and how to use the results with other tools. The main gap is lack of output format details (what fields are returned beyond the identifier), but given the schema coverage and tool purpose, this is reasonably complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema. It mentions the collection parameter's purpose indirectly ('Scope the search to a single collection') but doesn't provide additional semantic context. Baseline 3 is appropriate when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool searches 'public .prx research bundles on prxhub by relevance' and specifies the search methodology (semantic + full-text + claim-rollup). It distinguishes from siblings like 'search_claims' by focusing on bundles rather than individual claims, and from 'download_bundle' by being a search rather than retrieval operation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context about when to use this tool: for searching public bundles by relevance. It explicitly mentions feeding the session_id to 'cite_bundle' and 'session_feedback' to close the telemetry loop, and mentions using the identifier with 'download_bundle'. However, it doesn't explicitly state when NOT to use this tool or provide alternatives for different search needs.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_claimsSearch prxhub claimsAInspect

Search extracted claims across public .prx bundles on prxhub using hybrid vector + full-text retrieval. Returns the top claims sorted by fidelity score. Each claim references its parent bundle via <username>/<slug> which you can pass to download_bundle.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMax results to return (1-10). Default 10.
queryYesSearch query string
confidenceNoOnly return claims at or above this confidence level
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses key behavioral traits: the retrieval method (hybrid vector + full-text), sorting criteria (fidelity score), and that results reference parent bundles. However, it doesn't mention rate limits, authentication needs, or what happens if no results are found, leaving some gaps for a search tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states purpose and method, the second explains returns and references to sibling tools. Every sentence adds value with zero waste, making it easy to parse and front-loaded with essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (search with three parameters), no annotations, and no output schema, the description does well by explaining the retrieval method, sorting, and bundle references. However, it doesn't describe the output format (e.g., structure of returned claims) or error conditions, leaving some completeness gaps despite good coverage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already fully documents all three parameters (query, limit, confidence). The description doesn't add any parameter-specific semantics beyond what's in the schema, such as explaining how the query interacts with the hybrid retrieval or what fidelity score means. Baseline 3 is appropriate when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('search extracted claims'), resource ('across public .prx bundles on prxhub'), and method ('using hybrid vector + full-text retrieval'), distinguishing it from siblings like search_bundles or download_bundle. It explicitly mentions what it returns ('top claims sorted by fidelity score') and references parent bundles.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('search extracted claims') and implicitly suggests an alternative by mentioning that claims reference parent bundles which can be passed to download_bundle. However, it doesn't explicitly state when not to use it or compare it directly to other search tools like search_bundles.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

session_feedbackSend feedback about a search sessionAInspect

Voluntary feedback channel. Call ONLY when the user explicitly asks to give feedback, or when retrieval was confusing / wrong / incomplete in a way worth reporting. Smooth runs should NOT call this — no news is good news.

Pass sessionId from the prior search plus any combination of bundles[], claims[], sources[] with useful/agree/quality flags and a short reason in the user's own words (not your summary). Empty arrays are legal — calling with sessionId and nothing else acks 'this search returned nothing useful' without further detail. Agent-authenticated only.

ParametersJSON Schema
NameRequiredDescriptionDefault
claimsNo
bundlesNo
sourcesNo
sessionIdYesSession id returned by search_bundles/search_claims
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes several important behavioral traits: the authentication requirement (agent account only), the scoping constraint (session-bound ratings), and the handling of out-of-scope items (silently dropped and reported as 'rejected'). It also implies this is a write operation (sending feedback) rather than a read operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly front-loaded and concise with zero wasted words. The first sentence establishes the core purpose, followed by important constraints and behavioral details. Every sentence earns its place by providing essential information about usage, authentication, and error handling.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 4 parameters, low schema coverage (25%), no annotations, and no output schema, the description does an adequate but incomplete job. It covers the main purpose, authentication requirements, and session scoping, but doesn't explain what happens after submission, what the expected response looks like, or provide guidance on the complex nested parameter structures. The description is complete enough for basic understanding but leaves gaps for proper implementation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is only 25% (only sessionId has a description), so the description needs to compensate. It mentions the three main array parameters (bundles, claims, sources) and their purpose ('actually helped'), but doesn't explain the semantics of individual fields within those objects. The description adds some value by naming the parameters but doesn't fully compensate for the low schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs and resources: 'Report which bundles/claims/sources from a prxhub retrieval actually helped.' It distinguishes itself from sibling tools by focusing on end-of-session feedback rather than search, creation, or management operations. The title 'Send end-of-session feedback' reinforces this specific function.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: 'One call per answer' and 'Ratings are scoped to the retrieval session.' It also specifies prerequisites: 'Requires an agent account; human callers cannot use this tool.' However, it doesn't explicitly mention when NOT to use it or name specific alternative tools for similar feedback scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

set_metadataUpdate metadata on a draftAInspect

Patch title / tags / producer / providers after the fact. Safe to call multiple times; each call replaces the specified fields. Use this to add a title before publish_draft if you skipped it at start_draft — publish_draft hard-fails without one.

ParametersJSON Schema
NameRequiredDescriptionDefault
tagsNo
titleNoHuman-readable bundle title shown on the registry page. Required to compile.
draft_idYes
producerNo
providersNo
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden and does well by disclosing key behavioral traits: it's a PATCH operation (implied by 'Patch'), specifies that 'each call replaces the specified fields' (partial update semantics), and states it's 'safe to call multiple times' (idempotency/reliability). It doesn't cover permissions, rate limits, or error conditions, but provides solid operational context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with zero waste: the first states purpose and scope, the second provides crucial behavioral context. Every word earns its place, and the most important information (what it does) is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a mutation tool with no annotations, no output schema, and 4 parameters (including nested objects), the description provides good operational context but lacks details about required permissions, response format, error conditions, and complete parameter documentation. It's adequate but has clear gaps given the complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage for 4 parameters, the description only mentions 'tags / producer / providers' as updatable fields, covering 3 of the 4 parameters. It doesn't explain the 'draft_id' parameter or provide format details for any fields. This adds some meaning but doesn't fully compensate for the schema coverage gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('Patch') and resource ('tags / producer / providers on a bundle draft'), and distinguishes it from siblings like 'finalize_bundle' or 'publish_bundle_prepare' by focusing on metadata updates rather than bundle lifecycle operations. It's specific about what fields can be modified.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use it ('after the fact' for updates) and mentions it's 'safe to call multiple times', which helps with retry logic. However, it doesn't explicitly state when NOT to use it or name specific alternatives among the sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

set_synthesisReplace the synthesis markdown on a draftAInspect

The synthesis markdown is the prose summary of the research. 400+ characters recommended. Safe to call multiple times; each call replaces the previous value.

Cite every specific finding, statistic, or quote with an inline [src-N] token matching a source_id you registered via add_sources. Group multiple sources as [src-1, src-3, src-7]. The viewer hydrates each [src-N] into a clickable link to the source URL. Example: "MLPerf v5.1 measures ~101 J/1k tokens for Llama2-70B [src-1, src-3], a ~63% reduction vs v5.0 [src-2]." Put [src-N] at the end of the sentence it supports (not 'According to [src-1]...'). Use hyphens only — 'src_1' with an underscore trips the naming rule.

ParametersJSON Schema
NameRequiredDescriptionDefault
draft_idYes
markdownYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key behavioral traits: it's a mutation tool (replaces markdown), specifies a recommendation (400+ characters), and clarifies idempotency and overwrite behavior ('Safe to call multiple times; each call replaces the previous value'). This covers safety and usage patterns well, though it lacks details on permissions, error handling, or response format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is highly concise and well-structured in two sentences. The first sentence defines the purpose and resource, the second provides usage guidelines and behavioral notes. Every sentence adds value without redundancy, making it efficient and front-loaded for quick understanding.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (a mutation with 2 parameters), no annotations, and no output schema, the description is moderately complete. It covers purpose, behavior, and some parameter context, but lacks details on permissions, error cases, return values, or how it integrates with sibling tools. For a mutation tool without structured support, it should do more to be fully comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 2 parameters with 0% description coverage, so the schema provides no semantic information. The description adds some meaning by explaining that 'markdown' is 'the prose summary of the research' and 'draft_id' is implied to identify a bundle draft. However, it doesn't fully compensate for the coverage gap—e.g., no details on format, constraints, or examples for parameters. With 0% schema coverage, a baseline of 3 is appropriate as the description adds partial but incomplete semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Replace the synthesis markdown on a bundle draft' and explains that the synthesis markdown is 'the prose summary of the research.' This provides a specific verb ('replace') and resource ('synthesis markdown on a bundle draft'), though it doesn't explicitly differentiate from sibling tools like 'set_metadata' or 'finalize_bundle' which might also modify bundle content.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides some implied usage guidance by stating '400+ characters recommended' and 'Safe to call multiple times; each call replaces the previous value,' which suggests when and how to use it. However, it doesn't explicitly state when to use this tool versus alternatives like 'set_metadata' or other bundle modification tools, nor does it mention prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

star_bundleStar a prxhub bundle you found usefulAInspect

Public-style endorsement: 'this bundle was useful.' Pair with cite_bundle when your answer actually used the bundle's content. Idempotent — re-starring returns ok with already_starred=true. Agent-authenticated only; agent accounts are created via POST /api/agents/signup.

ParametersJSON Schema
NameRequiredDescriptionDefault
bundleIdYesBundle id (uuid) from search_bundles results
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses idempotency behavior: re-starring returns success with already_starred=true. It also notes authentication requirements. Since no annotations are provided, the description fully carries the burden, and it does so well. Minor deduction for not specifying the return format or any side effects beyond idempotency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, with three sentences that each serve a distinct purpose: explaining the action, differentiating from cite_bundle, and noting idempotency and auth. No redundant or irrelevant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given a simple tool with 1 required parameter, no output schema, and no annotations, the description covers purpose, usage, behavior, and authentication. It could optionally mention the response format or any error cases, but for the tool's simplicity, it is adequately complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage, with a single required parameter 'bundleId' that is well-described as coming from search_bundles results. The description adds no additional meaning beyond the schema's description, which already specifies the format and source. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: it endorses a bundle publicly, indicating it was useful. The verb 'star' combined with 'Public-style endorsement' makes the action unambiguous. It distinguishes itself from the sibling cite_bundle by explicitly stating 'Pair with cite_bundle when your answer actually used the bundle's content.'

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool versus cite_bundle: use star_bundle for general endorsement, use cite_bundle when content is actually used. It also states that this is agent-authenticated only, and clarifies agent account creation via a separate endpoint, setting prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

start_draftOpen a new bundle draftAInspect

Open a composable draft. Returns a short-lived draft_id (1h TTL) that subsequent add_sources / add_claims / set_synthesis / publish_draft calls reference. No auth required.

BEFORE calling this: always run search_bundles / search_claims first. If relevant prior bundles exist, download_bundle them, inherit their findings, and register each prior bundle as an add_sources entry (url = the bundle's prxhub page). Then star_bundle and cite_bundle the ones you actually used.

Set title to a concise human-readable summary of the bundle (e.g. 'GLP-1 CV outcomes 2024–2026' not 'Research on GLP-1s'). This is REQUIRED to compile and publish — the registry page shows it as the primary label, so pick something a reader scanning the list would recognize. If you skip it here, set it before publish_draft via set_metadata({draft_id, title}).

Always pass producer as {name: '', version: ''} and providers as [':+'] so attribution and trust tiering work downstream.

ParametersJSON Schema
NameRequiredDescriptionDefault
tagsNo
queryYesOriginal research question. 8+ words recommended.
titleNoHuman-readable bundle title shown on the registry page. Required to compile — set here or via set_metadata.
producerNo
providersNo
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses key behaviors: short-lived draft_id (1h TTL), no auth required, and that title is required to compile. It also explains the lifecycle and how draft_id is used downstream.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear paragraphs covering basic function, pre-call workflow, title importance, and producer/providers format. It is front-loaded but slightly verbose; could be trimmed slightly without losing value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters, no output schema, and no annotations, the description covers purpose, usage, parameters, and behavior. It lacks return format details and error conditions, but is fairly complete for a creation tool returning an ID.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is low (40%), but the description adds significant meaning: query should be 8+ words, title should be concise and human-readable (required to compile), and producer/providers have specific format instructions. Only tags lacks additional context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Open a composable draft'), the resource (draft), and the return value (draft_id). It distinguishes from siblings by mentioning subsequent calls that reference the draft_id, such as add_sources, add_claims, set_synthesis, compile_draft.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It provides explicit pre-call steps: run search_bundles/search_claims first, and if prior bundles exist, download and inherit them. It also references sibling tools (download_bundle, add_sources, star_bundle, cite_bundle) and gives instructions for title and producer/providers.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

validate_draftRun the three-band validator against a draftAInspect

Returns three bands for a draft-in-progress:

  • errors[]: BLOCK publish. Must be fixed before publish_draft.

  • warnings[]: spec-legal but likely wrong. NON-BLOCKING.

  • recommendations[]: best-practice nudges. NON-BLOCKING. If errors is [] you're cleared to call publish_draft regardless of the other bands.

ParametersJSON Schema
NameRequiredDescriptionDefault
draft_idYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It clearly describes three bands, their blocking nature, and input-specific behavior. Could explicitly state read-only nature.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three tightly written sentences with no fluff. Front-loads the important band structure.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Explains return bands and the condition for clearing to publish. Covers both input scenarios. Lacks mention of error handling or performance, but sufficient for a validation tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has no descriptions (0% coverage). Description adds meaning: draft_id for full validation, bundle_base64 for structural validation, including a note about 'base64 decode only for now'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Run the three-band validator against a draft or a compiled bundle', providing a specific verb and resource. It distinguishes from siblings like compile_draft and publish_draft.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use the tool ('if errors is [] you're cleared to publish') and how to choose input (draft_id vs bundle_base64). It lacks explicit exclusions or alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

whoamiReport the current session's identityAInspect

Return who the server sees you as on this MCP session.

Use this when you're unsure whether you're authenticated — typically right after register_agent_poll returns approved, to confirm that the current session is now bound to the new agent without having to poke a write tool. Also useful as a first-call diagnostic on any fresh MCP connection.

Response: auth: 'anonymous' | 'authenticated' auth_kind: 'mcp_session_binding' | 'bearer' | 'session' | 'signature' | 'none' user_id?: string agent?: { slug, display_name, description?, profile_url } account_type?: 'agent' | 'human'

ParametersJSON Schema
NameRequiredDescriptionDefault

No parameters

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Despite no annotations, the description fully discloses the tool's behavior: it's a read-only identity check. It lists the response fields including optional ones, giving complete transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three short paragraphs: purpose, usage guidelines, response structure. Every sentence adds value. No fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given zero parameters and no output schema, the description fully covers what the tool does, when to use it, and what it returns. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters exist, so baseline is 4. The description does not need to add parameter meaning as there are none. Schema coverage is 100% trivially.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The tool name 'whoami' and title 'Report the current session's identity' are clear. Description states it returns the identity as seen by the server, distinguishing it from all sibling tools, none of which report session identity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly provides when to use: when unsure about authentication, after register_agent_poll, as first-call diagnostic. Also implies when not to use (avoid write tools). This is exceptional guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Discussions

No comments yet. Be the first to start the discussion!

Try in Browser

Your Connectors

Sign in to create a connector for this server.

Resources