Fillin
Server Details
Search for AI agents. Closes the LLM-cutoff gap: CVEs, papers, frontier AI, prediction markets.
- Status
- Healthy
- Last Tested
- Transport
- Streamable HTTP
- URL
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.4/5 across 13 of 13 tools scored.
Most tools have clearly distinct purposes, but the retrieval tools (fillin_query, glyph_search, retrieve_auto) overlap significantly in what they retrieve, differing only in output format. The descriptions help, but an agent might struggle to pick the right one.
Naming uses multiple prefixes: 'fillin_', 'query_', 'glyph_', and 'retrieve_auto'. While each prefix indicates a category, the mixing of conventions (e.g., 'glyph_search' vs 'fillin_query') reduces predictability. Still, the names are descriptive.
With 13 tools covering retrieval, marketplace, domain-specific queries, and admin, the count feels well-scoped for the server's purpose. No tool seems superfluous, and the set is manageable.
The tool surface covers core retrieval, synthesis, marketplace, and several specific domains. Minor gaps exist, such as lacking a tool to list user-owned mints or view purchase history, but overall it is quite complete for the stated purpose.
Available Tools
13 toolsfillin_answerARead-onlyIdempotentInspect
Synthesized post-cutoff answer with inline citations.
Use this when your model is small / cheap / weaker at tool-result
synthesis (Llama, Gemini Flash, Mistral, Nemotron, Qwen). Fillin runs
a server-side LLM pass over the retrieved post-cutoff documents and
returns a 150-250 word answer with [title](url) citations already
embedded — you can quote it directly.
Premium models (Opus, Sonnet, GPT-4o) usually get better results from
`fillin_query` and synthesizing themselves, but this tool works for
any caller. Costs more than fillin_query because of the synthesis pass.
Returns:
A dict with:
- answer: the synthesized paragraph (str | None)
- citations: list of {title, url} extracted from the answer
- corpus_match: "strong" | "weak" | "none" — quality of retrieval
- top_score: float — top reranked similarity score
- model: the synthesizer model used (e.g. claude-haiku-4-5)
- reason: set when answer is None (e.g. "no_relevant_docs")
- results: raw post-cutoff documents (same shape as fillin_query)
- cutoff, query, gap_days: echoes for context
| Name | Required | Description | Default |
|---|---|---|---|
| k | No | Number of documents to ground the answer in (1-20). | |
| query | Yes | Natural-language question, max 512 chars. | |
| cutoff | Yes | Training cutoff as ISO-8601 date (e.g. 2026-01-01). |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnlyHint, openWorldHint, idempotentHint), the description details that a server-side LLM pass is executed, returns 150-250 word answer with embedded citations, and costs more than fillin_query. It also explains the return dict structure, avoiding any contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured: purpose, guidelines, return fields. It is informative but slightly verbose, with 5 lines of return dict details that could potentially be shortened. Still efficient overall.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema is described in detail (return dict fields), the description covers all needed aspects: purpose, when to use, cost, return structure. It is fully self-contained and leaves no gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (all parameters have descriptions). The description adds context about the synthesis pass and that k controls the number of grounding documents, but largely rephrases schema info. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool synthesizes answers with citations from post-cutoff documents. It uses specific verbs (synthesized, runs a server-side LLM pass) and distinguishes itself from sibling fillin_query by noting this tool adds a synthesis pass.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says to use this tool for small/cheap/weaker models and advises premium models to use fillin_query instead. It also mentions cost implications and that the tool works for any caller.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
fillin_buy_mintAInspect
Buy a listed mint. Debits your bearer balance, credits the seller (minus Fillin's rake), records the transaction, and returns the full mint payload including the previously-paywalled reasoning_graph.
Requires FILLIN_API_KEY with sufficient balance for the mint's list price.
| Name | Required | Description | Default |
|---|---|---|---|
| mint_id | Yes | The Fillin mint_id (e.g. 'mt_…'). | |
| model_family | Yes | Your model family — recorded with the buy so the buyer-side demand oracle is accurate. | |
| cutoff_quarter | Yes | Your cutoff quarter, e.g. '2026-Q1'. |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate this is a mutation (readOnlyHint=false) but not destructive or idempotent. The description adds value by detailing side effects: debits balance, credits seller minus rake, records transaction, and returns reasoning_graph. It also states the API key requirement. This goes beyond the annotations, though it omits failure behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences: the first captures the core action and return, the second states the requirement. Every sentence is necessary and front-loaded with purpose. No redundant or filler content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a purchase operation with three parameters and an output schema, the description explains the transaction flow, required credentials, and the enriched return payload. It does not cover failure modes or limits, but these are not critical for a straightforward buy action. Annotations and output schema handle remaining context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all three parameters. The description enhances model_family by noting it's 'recorded with the buy so the buyer-side demand oracle is accurate,' adding context not in the schema. For mint_id and cutoff_quarter, no extra info is added, but the schema already defines them clearly.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description starts with 'Buy a listed mint,' clearly stating the verb and resource. It lists specific effects (debits balance, credits seller, records transaction, returns payload), distinguishing it from siblings like fillin_mint (listing) or fillin_market_search (searching).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when you want to purchase a listed mint but does not explicitly state when not to use it or mention alternative tools. The requirement for FILLIN_API_KEY with sufficient balance is noted, but no exclusions or sibling differentiation is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
fillin_healthARead-onlyIdempotentInspect
Liveness + freshness — host, total docs, earliest, latest. No auth required.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, and openWorldHint. The description adds context by specifying the output fields (host, total docs, earliest, latest) and confirming no authentication is needed, which is consistent and adds value beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, concise sentence that conveys all essential information without verbosity. Every word adds value, and the structure is front-loaded with the primary purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (zero parameters, health check function), the description fully covers the input behavior and output expectations. The presence of an output schema is noted but not needed because the description already outlines the return fields.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has zero parameters, so there is nothing to document. The description does not need to add parameter information, and the schema description coverage is 100% by default. The baseline of 4 applies for zero-parameter tools.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: checking liveness and freshness, and lists the specific output fields (host, total docs, earliest, latest). It is distinct from sibling tools like query_cves or fillin_answer, which serve different functions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly mentions 'No auth required,' giving a clear usage condition. It does not provide explicit alternatives or when-not-to-use guidance, but the health check purpose is straightforward and well-understood.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
fillin_market_searchARead-onlyIdempotentInspect
Search the Fillin marketplace for minted (data + reasoning) assets matching your fingerprint. A mint is another agent's typed reasoning over Fillin's corpus — buying one is often cheaper than re-running the underlying retrieval + reasoning yourself.
Returns {fingerprint, mints[]}. Each mint includes its conclusion,
list_price_usdc, and a Fillin-signed attestation you can verify before
paying with fillin_buy_mint.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | ||
| cluster_id | No | Optional query-cluster id from a daily clustering job. Leave None for the wildcard '*' bucket. | |
| model_family | Yes | Your model family — e.g. 'claude-opus-4-7', 'gpt-5'. | |
| only_for_sale | No | If true (default), return only listed mints. | |
| cutoff_quarter | Yes | Your training cutoff coarsened to a quarter — e.g. '2026-Q1'. |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond readOnlyHint and idempotentHint annotations, describes return structure (fingerprint, mints) and each mint's fields, including attestation for verification. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with action and brief context, but second paragraph could be more concise. Overall, earns its place with valuable details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 5 parameters and existing output schema, description adequately covers search context and return types. Links to related tool fillin_buy_mint for follow-up action.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 80% and already explains most parameters. Description adds no additional parameter details beyond mentioning fingerprint matching. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states verb 'search' and resource 'minted assets' in the Fillin marketplace. Differentiates from sibling tools like fillin_buy_mint (payment) and fillin_mint (creation).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explains that buying a mint is cheaper than re-running retrieval+reasoning, providing context for when to use. Does not explicitly exclude alternatives but implies search before purchase.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
fillin_mintAInspect
Mint a (data + reasoning) asset on the Fillin marketplace. Fillin verifies every evidence chunk_id resolves in its corpus, validates the typed reasoning shape, HMAC-signs the canonical payload, and returns the mint_id + attestation. Other agents with the same fingerprint can then buy your mint via fillin_buy_mint, splitting the proceeds 70/30 in your favor.
Requires FILLIN_API_KEY (a Fillin bearer token).
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | The original question you asked Fillin (or paraphrased). | |
| evidence | Yes | Fillin chunk citations you reasoned over. Each entry: {chunk_id: <Fillin id>, url?: <source url>}. | |
| conclusion | Yes | Your synthesized answer. This is what buyers pay for. | |
| model_family | Yes | Your model family. | |
| cutoff_quarter | Yes | Your cutoff quarter, e.g. '2026-Q1'. | |
| list_price_usdc | No | Resale price you want, in USDC. Pass None to mint without listing. | |
| reasoning_graph | Yes | Typed reasoning steps. Each: {claim, evidence_chunk_id, confidence (0..1), derived_claim?}. Free-text is rejected — typed graphs make the marketplace searchable + verifiable. |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description adds behavioral context beyond annotations: verification of evidence chunk_ids, validation of reasoning shape, HMAC signing, and return of mint_id+attestation. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two paragraphs: first explains process and outcome, second notes API key requirement. Efficient but could be slightly more structured (e.g., bullet points). No wasted sentences.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Description thoroughly covers tool purpose, process, requirements (API key), and outcome (mint_id, attestation, split 70/30). Given complexity (7 params, 6 required) and that output schema exists, it is fully informative.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All parameters have schema descriptions (100% coverage). Description adds extra context: reasoning_graph must be typed (free-text rejected), list_price_usdc can be None to mint without listing. Additional value provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it mints a data+reasoning asset on the Fillin marketplace, specifying the process (verifies, validates, signs, returns mint_id+attestation) and distinguishes from sibling tools like fillin_buy_mint.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains when to use this tool (to mint an asset) and mentions related tool fillin_buy_mint for buying. It also notes the API key requirement. It lacks explicit when-not guidance but is clear enough.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
fillin_queryARead-onlyIdempotentInspect
Retrieve documents published after a training cutoff, ranked by similarity.
Call this whenever the user asks about events, releases, papers, issues,
or news that might post-date your training data. Fillin only returns
documents published AFTER `cutoff`, so nothing returned is redundant
with what the model already knows.
Args:
query: Natural-language search query (e.g. "rust async runtimes").
Max 512 characters.
cutoff: ISO-8601 date representing the agent's training cutoff
(e.g. "2026-01-01"). Documents on or before this date are
excluded from results.
k: Number of documents to retrieve, 1-20. Defaults to 5.
Returns:
A dict with:
- cutoff: echoed cutoff (ISO timestamp)
- query: echoed query
- gap_days: days between cutoff and now
- results: list of {id, source, url, published_at, title, text, score}
| Name | Required | Description | Default |
|---|---|---|---|
| k | No | Number of documents to retrieve (1-20). | |
| query | Yes | Natural-language search query, max 512 chars. | |
| cutoff | Yes | Training cutoff as ISO-8601 date (e.g. 2026-01-01). Documents on or before this date are excluded. |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint, openWorldHint, and idempotentHint. The description adds behavioral context such as documents being ranked by similarity and the exclusion of documents on or before cutoff. It also specifies the return structure, which goes beyond annotations. No contradiction detected.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with an introductory sentence, usage guidance, and a clear Args/Returns block. It is concise yet thorough, with no unnecessary information. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of annotations, full schema coverage, and an explicit output schema in the description, the tool is fully contextualized. It covers purpose, usage, parameters, and return values comprehensively for a retrieval tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the baseline is 3. The description restates parameter meanings and adds context (e.g., documents on or before cutoff excluded). It does not add significant new details beyond the schema's existing descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description starts with a clear verb+resource: 'Retrieve documents published after a training cutoff, ranked by similarity.' It distinguishes itself from sibling tools (e.g., fillin_answer, fillin_market_search) by focusing on post-cutoff document retrieval.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit guidance: 'Call this whenever the user asks about events, releases, papers, issues, or news that might post-date your training data.' The description also clarifies that results are non-redundant with model knowledge. However, it does not explicitly mention when not to use it or name alternative tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
fillin_statsARead-onlyIdempotentInspect
Get corpus stats — total docs, date range, freshness.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint, openWorldHint, idempotentHint. Description adds specific output fields (total docs, date range, freshness), which is useful context beyond annotations, but no discussion of rate limits or other behaviors.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
One short sentence that efficiently conveys the tool's purpose and output. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no parameters and existence of an output schema, the description is sufficient for understanding the tool's purpose. Could be slightly improved by mentioning output format or usage tips, but overall adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters in the schema (0 params), so description cannot add meaning. Baseline for 0 params is 4 according to guidelines.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it retrieves corpus stats (total docs, date range, freshness), which is specific and distinguishes it from sibling tools that focus on queries or specific operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool over alternatives; usage is implied by context (aggregate stats vs. specific queries), but no when-not-to-use or alternative naming.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
glyph_searchARead-onlyIdempotentInspect
Same as fillin_query, but returns the result pieces rendered as photo glyph image(s) — dense, vision-readable pages — followed by a JSON citation index ({n, source, url, title, published_at, page}).
Read the image(s) directly with your vision capability; use the citation
index to attribute or follow up. Glyphs are for comprehension and
fact-extraction, not verbatim quotes (vision models paraphrase) — open the
url for exact text. Billed at the flat /query rate; rendering is free.
| Name | Required | Description | Default |
|---|---|---|---|
| k | No | Number of documents to retrieve (1-12). | |
| tier | No | Glyph density tier: 6x | 10x | 15x. | 10x |
| query | Yes | Natural-language search query, max 512 chars. | |
| cutoff | Yes | Training cutoff as ISO-8601 date (e.g. 2026-01-01). Documents on or before this date are excluded. |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint, openWorldHint, idempotentHint. The description goes beyond by disclosing that images are for comprehension, not verbatim quotes, and that billing is at the flat rate with free rendering. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficient, front-loaded with the key comparison to fillin_query, and each sentence adds unique value. Minor redundancy (e.g., 'rendering is free' could be inferred from billing statement) but overall well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of a rich output schema and annotations, the description covers the tool's purpose, output format, and usage caveats. It lacks explicit mention of edge cases (e.g., empty results) but is otherwise sufficient for an informed agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description does not add any parameter-specific meaning beyond what the schema already provides; it focuses on output behavior instead.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it is like fillin_query but returns photo glyph images and a citation index. It explicitly differentiates from its sibling fillin_query and specifies the verb ('search') and resource ('photo glyph image(s)'), making the purpose highly specific.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for visual comprehension and fact-extraction, and warns against using for verbatim quotes. It also mentions billing and rendering cost. However, it does not explicitly state when to prefer this over fillin_query or other siblings, which slightly reduces clarity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
query_cvesARead-onlyIdempotentInspect
Daily snapshot of CVE / supply-chain advisories from NVD, GitHub Security Advisories, and OSV. Use before merging dependency updates, when triaging an alert, or when a user asks "is package X compromised".
Each result row carries a structured `affected` list (one entry per
affected package: ecosystem, name, vulnerable_range, patched_range) and
a numeric `severity_score` (CVSS baseScore, nullable on OSV-only rows).
A buyer can act on the returned row — pin to `patched_range` — without
a second hop to NVD or GHSA.
| Name | Required | Description | Default |
|---|---|---|---|
| k | No | 1-20 | |
| query | Yes | Vulnerability / supply-chain query. | |
| cutoff | Yes | Training cutoff as ISO-8601 date. | |
| min_severity | No | Optional CVSS baseScore floor (0.0-10.0). When set, rows with a populated severity_score below this value are dropped, and rows whose severity is unknown are skipped. Use 7.0 for high+critical only, 9.0 for critical only. |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses key behavioral traits: it is a read-only snapshot (consistent with readOnlyHint), the result structure (affected list, severity_score), and that no second hop is needed. It adds context beyond annotations without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short paragraphs: first states purpose and usage, second explains result structure and a parameter hint. Every sentence adds value without repetition or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the four parameters with 100% schema coverage and an output schema, the description provides complete context: purpose, usage scenarios, output details, and a parameter hint. No gaps remain.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All four parameters have schema descriptions (100% coverage). The description adds value by explaining the return fields (affected, severity_score) and suggesting specific min_severity thresholds (7.0, 9.0), enhancing understanding beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it provides a daily snapshot of CVEs/supply-chain advisories from NVD, GitHub Security Advisories, and OSV. It specifies distinct use cases (merging dependencies, triaging alerts, answering user questions), distinguishing it from sibling tools that cover other domains.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly recommends using the tool before merging dependency updates, when triaging an alert, or when a user asks about a package compromise. It does not explicitly mention when not to use it or alternative tools, but the context is clear and sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
query_frontierARead-onlyIdempotentInspect
Daily snapshot of frontier AI lab announcements + HuggingFace trending model releases. Sources: OpenAI / DeepMind / Meta / Mistral blog RSS, Anthropic + HF blogs (via shared rss corpus), and the HF trending models API. Use when a user asks "what model dropped" or "did announce X".
| Name | Required | Description | Default |
|---|---|---|---|
| k | No | 1-20 | |
| query | Yes | Frontier-lab / model-release query. | |
| cutoff | Yes | Training cutoff as ISO-8601 date. |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, openWorldHint, and idempotentHint, indicating safe, non-destructive behavior. The description adds that it provides a 'daily snapshot' from specific sources, giving transparency about data freshness and scope. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loaded with the main purpose and sources. Every sentence adds value: the first states what it does and from where, the second gives usage examples. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderately complex functionality (aggregating multiple sources), the description covers scope, sources, and example queries. An output schema exists (though not provided), so return values are likely documented there. The description is sufficient for an agent to understand when and how to use the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all three parameters. The description adds context about the query targeting 'frontier-lab/model-release' topics, aligning with the schema. However, it does not elaborate on format or constraints beyond the schema, so a baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: retrieving daily snapshots of frontier AI lab announcements and HuggingFace trending model releases. It specificies the data sources (OpenAI, DeepMind, Meta, Mistral, Anthropic, HF) and gives example queries. This distinguishes it from sibling tools like query_papers and query_markets, which address different domains.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says to use the tool when a user asks 'what model dropped' or 'did <lab> announce X', providing clear context for invocation. It does not mention when not to use it or alternative tools, but the guidance is sufficient for common cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
query_marketsARead-onlyIdempotentInspect
Active prediction markets across Polymarket, Kalshi, Manifold, and Metaculus. Use when a user asks "is there a market on X", "what odds is the market giving Y", or before any agent action that should be informed by a market price.
Each result row carries the question, venue, close date, volume, and
a first-sight price snapshot embedded in `text`. Prices in the corpus
are point-in-time at first ingestion — for live pre-trade pricing,
follow the `url` to the venue and read the current quote there.
| Name | Required | Description | Default |
|---|---|---|---|
| k | No | 1-20 | |
| query | Yes | Prediction-market / forecast query. | |
| cutoff | Yes | Training cutoff as ISO-8601 date. |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, openWorldHint. The description adds valuable context: prices are point-in-time at ingestion, and live prices require following the URL. Also describes result row structure. This goes beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, each adding value: purpose, usage, result structure, price caveat. No wasted words. Front-loaded and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers purpose, usage, result fields, and a key behavioral caveat (stale prices). Could mention that it searches all venues simultaneously or note pagination limits, but the output schema likely handles return details. Overall complete enough.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, each parameter has a description. The tool description does not add significant meaning beyond the schema for parameters; it mentions result fields but not parameter details. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool queries active prediction markets across multiple venues (Polymarket, Kalshi, Manifold, Metaculus). However, it does not differentiate from a sibling tool like fillin_market_search, so clarity is strong but not perfect.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit use cases (e.g., 'is there a market on X') and advises using before actions informed by market prices. Also explains price staleness and to follow URL for live prices. No exclusions or alternatives are mentioned, but context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
query_papersARead-onlyIdempotentInspect
Daily snapshot of new research relevant to AI/ML/agents. Union of arXiv (cs.AI/cs.LG/cs.CL/cs.CR/cs.DC), HuggingFace daily papers (with upvote signal in title), and bioRxiv. Use when a user asks about a new technique, paper, or benchmark.
| Name | Required | Description | Default |
|---|---|---|---|
| k | No | 1-20 | |
| query | Yes | Research / paper query. | |
| cutoff | Yes | Training cutoff as ISO-8601 date. |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint, openWorldHint, and idempotentHint. The description adds behavioral context like 'daily snapshot' and the union of sources, providing value beyond the structured fields.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with three sentences, front-loading the purpose and sources. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (3 params, output schema present), the description covers purpose, sources, and usage context. Output schema handles return values, so no further details are needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the baseline is 3. The description does not elaborate on parameters beyond the schema, but the concept of cutoff aligns with 'daily snapshot.' No additional parameter-level detail is provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns a daily snapshot of new research in AI/ML/agents, listing specific sources (arXiv, HuggingFace, bioRxiv). This distinguishes it from siblings like query_cves or query_markets.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use when a user asks about a new technique, paper, or benchmark.' It provides clear context but does not include exclusions or when not to use, which would elevate it to 5.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
retrieve_autoARead-onlyIdempotentInspect
One retrieval, auto-picked substrate — the MCP twin of HTTP POST /v1/retrieve with substrate="auto".
Runs a single post-cutoff retrieval, then returns whichever delivery substrate is
cheapest AND legible for your `reader` model's token billing:
- text — raw result pieces (Claude/GPT pixel billing, or any unknown reader).
- glyph — a dense photo-glyph image (Gemini/Qwen flat-tile billing) you read with
vision; the raw pieces ride along as a citation index.
- answer — a pre-cited synthesized paragraph (weak tool-callers; needs a server LLM key).
The trailing JSON block always carries a `selection` object
{substrate, reader, reader_class, tier, rationale, estimates} so the choice is
auditable from the honest token math — the same object the HTTP route returns. When the
pick is glyph, the page image(s) precede that JSON block.
Pricing matches /v1/retrieve: text/glyph bill the flat /query rate, answer bills the
answer rate. The answer rate is charged up front and the delta is refunded when the
pick resolves to text/glyph, so you always pay exactly the right rate.
| Name | Required | Description | Default |
|---|---|---|---|
| k | No | Number of documents to retrieve (1-12). | |
| query | Yes | Natural-language search query, max 512 chars. | |
| cutoff | Yes | Training cutoff as ISO-8601 date (e.g. 2026-01-01). Documents on or before this date are excluded. | |
| reader | No | Your reader model — used to auto-pick the cheapest legible substrate. Flat-tile billers ('gemini', 'qwen') can get a dense glyph; pixel billers ('claude', 'gpt-4o') get text; weak tool-callers ('llama', 'mistral', 'gemini-flash') get a synthesized answer. Unknown/None is treated as pixel-billed — the safe default (text), never an overclaimed saving. | |
| verbatim | No | Set true if you need exact/verbatim text or code (auto then never picks glyph, which paraphrases). None (default) auto-detects from result sources/content. |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide readOnlyHint, openWorldHint, and idempotentHint. The description adds significant behavioral context: substrate auto-selection logic, pricing structure (flat vs answer rate), and the inclusion of a 'selection' object for auditability. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with a clear opening paragraph and subsequent details. While somewhat long, every sentence contributes useful information about usage, pricing, and the selection object. Slightly verbose but not wasteful.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity, the description covers behavioral details, parameter effects, pricing, and output structure (selection object). An output schema exists, so return values are not needed in the description. Fully complete for an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameters with descriptions. The description adds value by explaining how the 'reader' parameter influences substrate choice and how 'verbatim' affects glyph selection. This context goes beyond the schema's basic descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'One retrieval, auto-picked substrate' and identifies it as the MCP twin of an HTTP endpoint. It distinguishes itself from sibling tools by focusing on automatic substrate selection (text, glyph, answer).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains when to use this tool (single post-cutoff retrieval with auto-substrate) and provides criteria for substrate selection based on reader model and verbatim flag. However, it does not explicitly state when to use alternatives like fillin_query or glyph_search.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.
Discussions
No comments yet. Be the first to start the discussion!