Paper Search (arXiv + Semantic Scholar + OpenAlex)

Server Details

Search arXiv, Semantic Scholar & OpenAlex, read full arXiv text, plus image-to-LaTeX OCR.

Status: Healthy
Last Tested
Transport: Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

[Diagram: MCP client, Glama, MCP server]

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: C

Average 3.3/5 across 37 of 37 tools scored. Lowest: 2.3/5.

Server Coherence: A
Disambiguation: 4/5

Most tools are distinct and well-described, with source prefixes (e.g., get_openalex_, search_semantic) helping to differentiate. However, the multiple search functions (search_papers, search_all, search_papers_bulk, search_openalex_works) could still cause confusion for an agent, especially since their differences are subtle.

Naming Consistency: 5/5

Tool names follow a consistent verb_noun pattern (get_, search_, list_, recognize_) with clear source prefixes where needed. There's no mixing of conventions or chaotic naming. All names are predictable and readable.

Tool Count: 4/5

With 37 tools, the server covers three major APIs plus utilities, which is ambitious. While each tool serves a specific purpose, the count is slightly high; some tools like search_papers and search_all are close in function. Nevertheless, the scope justifies the number without being excessive.

Completeness: 5/5

The tool surface is remarkably complete for academic paper research: searching, fetching, citations, authors, recommendations, reading full text, and even formula/table recognition. It covers the full lifecycle of discovering and analyzing papers with no obvious dead ends.

Available Tools

37 tools
autocomplete_papers: B

Semantic Scholar: autocomplete paper titles for a partial query (fast type-ahead).

Parameters
Name | Required | Description | Default
query | Yes | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions 'fast type-ahead' but does not disclose behavioral traits such as rate limits, result limits, or whether it is read-only. This leaves significant gaps for an agent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, very concise and front-loaded with the purpose. However, it is somewhat terse and omits important details, preventing a perfect score.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one parameter, no output schema, no annotations), the description covers the basic purpose but lacks details about response format, limits, or error conditions. It is minimally viable but not fully complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'query' has no description in the schema (0% coverage). The description only says 'partial query (fast type-ahead)', adding minimal meaning beyond the schema. It does not explain input format or any constraints.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
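
The "0% coverage" figure can be read as the share of schema properties that carry a `description`. The following is a minimal sketch of that calculation, not the scoring code used by this listing. The `type` values and the description wording in `described_schema` are hypothetical, since the page shows only the parameter name and its required flag.

```python
def description_coverage(schema):
    """Return the fraction of schema properties that have a non-empty description."""
    properties = schema.get("properties", {})
    if not properties:
        return 1.0  # nothing to describe
    described = sum(1 for spec in properties.values() if spec.get("description"))
    return described / len(properties)


# The schema as the listing implies it: one required, undescribed parameter.
# The "string" type is an assumption.
bare_schema = {
    "type": "object",
    "properties": {"query": {"type": "string"}},
    "required": ["query"],
}

# Hypothetical improved schema; the description text is illustrative only.
described_schema = {
    "type": "object",
    "properties": {
        "query": {
            "type": "string",
            "description": "Partial paper title typed so far.",
        }
    },
    "required": ["query"],
}

print(description_coverage(bare_schema))       # 0.0
print(description_coverage(described_schema))  # 1.0
```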

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it autocompletes paper titles for a partial query, specifying it's a fast type-ahead. This distinguishes it from sibling tools like search_papers which likely return full results.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for type-ahead scenarios, but it does not explicitly state when to use this tool vs. alternatives like search_papers or match_paper_title. More guidance on context would improve the score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
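
As a rough illustration of how a client might invoke this tool, the sketch below builds a JSON-RPC 2.0 `tools/call` request of the kind MCP clients send. The helper name `build_tool_call` and the sample query are assumptions made for illustration. The listing does not document the response shape, so none is shown.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Return a JSON-RPC 2.0 'tools/call' request body as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "query" is the tool's only (required) parameter; the value is a made-up example.
request = build_tool_call("autocomplete_papers", {"query": "attention is all"})
print(json.dumps(request, indent=2))
```

The same envelope would apply to any other tool on this page; only `name` and `arguments` change.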

get_author: B

Semantic Scholar: a single author's profile by id.

Parameters
Name | Required | Description | Default
author_id | Yes | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided. Description does not disclose what the profile includes, authentication needs, rate limits, or any constraints beyond fetching a profile by ID.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise (eight words) and front-loaded with 'Semantic Scholar'. However, the brevity sacrifices detail needed for clarity.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and no output schema, the description is minimally adequate but lacks specifics on author_id format and return value.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage for the only parameter. The description adds 'by id' but does not explain the format, example, or constraints for author_id.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves a single author's profile by ID from Semantic Scholar, with a specific verb and resource. It distinguishes from siblings like get_author_papers and search_authors.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., get_author_papers, search_authors). No context on prerequisites or exclusions.

get_author_papers: C

Semantic Scholar: all papers by a given author id, newest first.

Parameters
Name | Required | Description | Default
start | No | - | -
author_id | Yes | - | -
max_results | No | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry the full behavioral burden. It only reveals sorting order (newest first) but omits pagination behavior, error responses, rate limits, or data freshness. For a tool with no annotations, this is insufficient.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, front-loaded with the core purpose. No extraneous content or repetition. It efficiently communicates the tool's main function.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of annotations and output schema, the description should elaborate on return value structure, field contents, and error conditions. It only states 'all papers' with no indication of what fields are returned, leaving the agent underinformed.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description does not explain any of the three parameters (author_id, start, max_results) beyond the redundant 'author id'. The agent gains no additional insight into parameter meaning, defaults, or constraints.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool fetches all papers by a given author ID, sorted newest first. This is a specific verb-resource pair with distinct functionality. While siblings like search_papers or get_paper are not directly differentiated, the purpose is unambiguous.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool vs alternatives like search_by_author or get_paper_authors. The context implies use cases for known author ID, but no when-not-to-use or prerequisites are provided.
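
Because the pagination behavior is undocumented, the sketch below only shows one plausible way a client could page through results. It assumes `start` is a zero-based offset and `max_results` is a page size; neither is confirmed by the listing. The author id is a placeholder and `page_arguments` is a hypothetical helper.

```python
def page_arguments(author_id, page_size, total_wanted):
    """Yield argument dicts for consecutive get_author_papers calls."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    for start in range(0, total_wanted, page_size):
        yield {
            "author_id": author_id,
            "start": start,  # assumed: zero-based offset into the result list
            "max_results": min(page_size, total_wanted - start),
        }


# Placeholder author id; request 120 papers in pages of 50.
pages = list(page_arguments("1234567", page_size=50, total_wanted=120))
for page in pages:
    print(page)
```

If `start` turns out to be one-based or a page index rather than an offset, only the `range` arithmetic would need to change.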

get_authors_batch: B

Semantic Scholar: fetch many authors at once by id.

Parameters
Name | Required | Description | Default
ids | Yes | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It does not disclose behavioral traits such as rate limits, result size constraints, error handling, or read-only nature beyond the implied 'fetch'.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is one short sentence, front-loaded with the domain identifier. It is efficient with no wasted words, but could be expanded slightly without losing conciseness.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and one parameter, the description is too sparse. It does not explain the output format, max number of IDs allowed, or error handling, leaving significant gaps for a batch operation tool.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has one parameter (ids) with 0% description coverage. The description does not mention the parameter, its format, constraints, or examples, adding no value beyond the schema's bare structure.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool fetches many authors at once by ID from Semantic Scholar. The verb 'fetch' and resource 'authors' are specific, and the 'batch' aspect distinguishes it from the singular 'get_author' sibling.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for retrieving multiple authors by ID, but lacks explicit guidance on when to use this tool versus alternatives like get_author. No exclusions or context provided.
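
Since the maximum number of ids per call is not stated, a cautious client might split a long id list into smaller batches. The sketch below assumes `ids` is a list of strings and uses an arbitrary batch size; both are assumptions, and `chunk_ids` is a hypothetical helper rather than part of the server.

```python
def chunk_ids(ids, batch_size):
    """Split a list of ids into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]


# Placeholder author ids; the real id format is not documented on this page.
author_ids = ["a1", "a2", "a3", "a4", "a5"]

# One argument dict per get_authors_batch call.
calls = [{"ids": chunk} for chunk in chunk_ids(author_ids, batch_size=2)]
print(calls)
```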

get_dataset_diffs: B

Semantic Scholar Datasets: incremental diff (added/updated/deleted) for a dataset between two releases. Needs the key.

Parameters
Name | Required | Description | Default
end_release | No | - | latest
dataset_name | Yes | - | -
start_release | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description is the sole source of behavioral info. It indicates a read-only diff operation (not destructive), which is helpful. However, it lacks details about response format, pagination, and what the 'key' refers to (authentication vs field).

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short sentences that convey the core purpose and a key constraint. No unnecessary words, front-loaded with the function.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 3 parameters, no output schema, no annotations, and no parameter descriptions, the description is too sparse. It fails to explain key concepts like 'release', 'diff' format, or what the 'key' is. The agent lacks sufficient context to use this tool correctly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 0% description coverage, so the description must compensate. It implies start_release and end_release as 'between two releases' but does not explain dataset_name or the optional end_release default. The 'key' phrase adds confusion as it is not a parameter.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it provides an incremental diff (added/updated/deleted) for a dataset between two releases. This distinguishes it from siblings like get_dataset_release (full dataset) and get_dataset_download_links. However, it does not explicitly mention the dataset_name parameter, and the term 'key' is ambiguous.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description only mentions 'Needs the key' as a prerequisite, but gives no guidance on when to use this tool versus other dataset-related siblings (e.g., get_dataset_release, list_dataset_releases). No alternatives or exclusions are provided.
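
The parameter table does show which arguments are required and that `end_release` defaults to `latest`. A minimal client-side sketch of applying that information before a call might look like the following. `prepare_arguments` is a hypothetical helper, the release value is a made-up placeholder, and whether the server fills in the default itself is not stated on this page.

```python
# Required names and defaults as shown in the parameter table above.
DIFF_SCHEMA = {
    "required": ["dataset_name", "start_release"],
    "defaults": {"end_release": "latest"},
}


def prepare_arguments(arguments, schema=DIFF_SCHEMA):
    """Check required parameters and fill in documented defaults."""
    missing = [name for name in schema["required"] if name not in arguments]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    # Caller-supplied values override defaults.
    return {**schema["defaults"], **arguments}


# "papers" is one of the dataset names mentioned on this page;
# the release id is a placeholder, not a verified release.
args = prepare_arguments({"dataset_name": "papers", "start_release": "2024-01-02"})
print(args)
```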

get_dataset_release: B

Semantic Scholar Datasets: which datasets a release contains (papers, abstracts, citations, embeddings, s2orc, tldrs…). release_id defaults to 'latest'.

Parameters
Name | Required | Description | Default
release_id | No | - | latest

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It only states what the tool returns (list of dataset types) and a default parameter value. It does not disclose behavioral traits such as read-only nature, error handling for invalid release_id, rate limits, or any side effects. For a tool with no annotations, this is insufficient.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two concise sentences that front-load the tool's purpose and include key context (the default parameter value). Every word earns its place with no redundancy or unnecessary detail.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one optional parameter, no output schema or annotations), the description is minimally adequate. It tells what the tool does and the default parameter. However, it lacks details about the return format, error scenarios, and what exactly 'datasets' includes. Because several sibling tools also deal with datasets and releases, more detail would help.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has no description for the parameter release_id (0% coverage). The description adds only that it defaults to 'latest', which is already in the schema default field. It does not explain the format, possible values, or criteria for valid release IDs. Since schema coverage is low, the description should compensate, but it barely adds value.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'get', the resource 'dataset release', and what it returns ('which datasets a release contains') with examples like papers, abstracts, citations. It distinguishes from sibling list_dataset_releases by focusing on contents of a single release. However, it could be more precise about the output format.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Usage is implied: use when you need to know what datasets are in a specific release. It mentions the default value for release_id but provides no explicit guidance on when to use this tool versus alternatives like list_dataset_releases. No when-not-to-use or exclusion criteria are given.

get_openalex_citations: A

OpenAlex: papers that CITE this work (forward citation graph), most-cited first.

Parameters
Name | Required | Description | Default
start | No | - | -
work_id | Yes | - | -
max_results | No | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations are absent, so the description must cover behavioral traits. It discloses the citation direction and sort order but lacks details on pagination behavior, rate limits, authentication needs, or result structure. This is adequate but not comprehensive.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured sentence that conveys the core purpose without extraneous information. It is front-loaded and efficiently uses simple words.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the moderate complexity (3 parameters, no output schema, no annotations), the description is incomplete. It does not explain the output format, how pagination works via start and max_results, or any OpenAlex-specific details (e.g., API limits, fields returned). Sibling tools suggest more context is available.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds no meaning to the three parameters (start, work_id, max_results). The required work_id is not explained, and the optional parameters lack context beyond their names. Schema or description should clarify the format of work_id.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that this tool returns forward citations (papers that cite the given work) sorted by most-cited first. It uses specific language ('papers that CITE this work') and distinguishes itself from siblings like get_openalex_references (backward citations).

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for obtaining citing papers but does not explicitly state when to use vs. alternatives like get_openalex_references or get_paper_citations. No exclusions or usage conditions are provided, leaving the agent to infer context from the tool name and siblings.

get_openalex_references: C

OpenAlex: the works this one REFERENCES (its bibliography).

Parameters
Name | Required | Description | Default
work_id | Yes | - | -
max_results | No | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden but is too brief. It does not disclose authentication needs, rate limits, error handling, or what happens if work_id is invalid. Only the basic purpose is stated.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely short (eight words), which is concise but omits critical details. It could include parameter info without being verbose; currently it is under-specified.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of output schema and annotations, the description is incomplete. It does not explain the return value, pagination, or how max_results is used, leaving the agent without sufficient context for correct invocation.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds no meaning beyond the input schema. Schema coverage is 0%; the parameters work_id and max_results are not explained. The agent gets no clue about the work_id format or how max_results affects results.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves references (bibliography) of a work, distinguishing it from the sibling tool get_openalex_citations which retrieves citing works. However, it does not explicitly name the verb 'get' or 'list', and the context is implicit.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like get_openalex_citations or get_paper_references. The agent must infer usage from the description alone.

get_openalex_work: A

OpenAlex: fetch one work's full record (316M-work, all-field corpus). id accepts OpenAlex Wxxxx, a DOI, or an arXiv id.

Parameters
Name | Required | Description | Default
work_id | Yes | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must fully disclose behavioral traits. It does not mention any side effects, authentication requirements, rate limits, or that it is a read-only operation. The description only covers input format, not what happens on execution.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences with no redundant information. It front-loads the purpose and immediately clarifies ID formats, making it efficient for an agent to parse.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description mentions the scale (316M-work corpus) but does not describe what the returned record contains or any limitations. With no output schema, the description could be more complete by hinting at the response structure.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage, but the description adds essential meaning to the 'work_id' parameter by specifying accepted formats (OpenAlex Wxxxx, DOI, arXiv ID). This compensates well for the schema gap, though it could provide more format detail.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool fetches one work's full record from OpenAlex, specifying the accepted ID formats (OpenAlex Wxxxx, DOI, arXiv ID). This is specific and distinguishes it from sibling tools like search_openalex_works or get_openalex_citations.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for fetching a single work record but does not provide explicit guidance on when to use this tool versus alternatives (e.g., for batch fetching or citations). No when-not-to-use or prerequisite information is given.
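
The three accepted identifier kinds (OpenAlex Wxxxx, DOI, arXiv id) can be told apart by shape. The sketch below is one possible client-side check, not the server's own parsing logic, which the listing does not describe. The patterns are simplified assumptions: only new-style arXiv ids such as 1706.03762 are recognized, and `classify_work_id` is a hypothetical helper.

```python
import re

# Simplified patterns; real-world identifiers have more variants than these cover.
OPENALEX_ID = re.compile(r"^W\d+$")
DOI = re.compile(r"^10\.\d{4,9}/\S+$")
ARXIV_ID = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")


def classify_work_id(work_id):
    """Return 'openalex', 'doi', 'arxiv', or 'unknown' for a candidate work_id."""
    value = work_id.strip()
    if OPENALEX_ID.match(value):
        return "openalex"
    if DOI.match(value):
        return "doi"
    if ARXIV_ID.match(value):
        return "arxiv"
    return "unknown"


for candidate in ["W2741809807", "10.1038/nature14539", "1706.03762", "not-an-id"]:
    print(candidate, "->", classify_work_id(candidate))
```

A check like this could let an agent reject obviously malformed ids before spending a tool call on them.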

get_paper: B

Fetch one paper by id, with full abstract and PDF link.

Parameters
Name | Required | Description | Default
source | No | - | arxiv
paper_id | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the burden. It mentions output includes abstract and PDF link, which is helpful. However, it does not disclose error behavior (e.g., if paper_id not found) or any side effects. For a simple fetch, this is acceptable but minimal.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence of 11 words, very concise. However, it omits important details about parameters, which slightly reduces efficiency. It is front-loaded but under-specified.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple 2-parameter tool with no output schema, the description is mostly adequate. It would benefit from explaining the source parameter and possibly the return format. It covers the primary use case but lacks completeness regarding parameters.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has two parameters (source, paper_id) with 0% description coverage. The description only mentions 'by id' but does not explain the 'source' parameter or its default value. This leaves ambiguity for the agent.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that it fetches one paper by id with full abstract and PDF link. However, it does not differentiate itself from the sibling 'read_paper', which may have similar functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

There is no guidance on when to use this tool versus alternatives like 'read_paper' or 'search_papers'. No when-to-use or when-not-to-use information is provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
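
As an illustration of how an agent might invoke get_paper, the sketch below builds an MCP `tools/call` JSON-RPC request by hand. The helper name build_tool_call and the sample id 2401.01234 are assumptions for illustration only; this page does not confirm which id forms get_paper accepts. The source argument is left out so that the server default (arxiv) applies.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Return an MCP `tools/call` JSON-RPC request as a plain dict.

    Hypothetical helper: a real MCP client library would normally
    construct this envelope for you.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# paper_id is the only required parameter; the id value is a placeholder.
request = build_tool_call("get_paper", {"paper_id": "2401.01234"})
print(json.dumps(request, indent=2))
```

Writing the envelope out this way only makes the two parameters explicit; it says nothing about the shape of the response, which the tool does not document.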

get_paper_authors: C

Semantic Scholar: the authors of a paper (with h-index, paper/citation counts).

Parameters
Name | Required | Description | Default
start | No | - | -
paper_id | Yes | - | -
max_results | No | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, and description does not disclose behavioral traits such as pagination behavior, error handling, authentication requirements, or rate limits. Only minimal information about returned data.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence with clear front-loading of source (Semantic Scholar) and resource. Efficient but could be expanded to improve other dimensions.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given high complexity (3 params, 33+ siblings, no output schema, no annotations), the description is too brief. It omits pagination details, return structure, and comparison to similar tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description fails to explain the usage of 'start' and 'max_results' parameters. The description only mentions author fields but does not clarify parameter purposes beyond the schema itself.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description states it retrieves authors of a paper with h-index, paper, and citation counts, which clearly defines the resource and key fields. However, it could be more explicit about the verb (e.g., 'get' or 'retrieve').

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus siblings like get_author (single author) or search_authors. No mention of prerequisites, filtering, or alternative use cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_paper_citations: B

Semantic Scholar: papers that CITE this one (forward citation graph). id accepts S2 id / DOI: / ARXIV: / CorpusId:.

Parameters
Name | Required | Description | Default
start | No | - | -
paper_id | Yes | - | -
max_results | No | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must cover behavioral traits. It only mentions ID formats and basic purpose, missing details on authentication, rate limits, or behavior on invalid IDs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two short sentences, efficient and front-loaded. No wasted words, but the description could benefit from slightly more detail without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, no annotations, and only 1 of 3 parameters described. The description is too incomplete for an agent to understand the return format or pagination behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Only paper_id gets some explanation (accepted formats). start and max_results are not described; since schema coverage is 0%, the description fails to add meaning for these parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool retrieves papers that cite the given paper (forward citation graph) and specifies accepted ID formats. It distinguishes itself from siblings like get_paper_references.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use for forward citations and lists ID formats, but does not explicitly state when NOT to use or compare with alternatives like get_paper_references.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
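
The id formats listed in the description above (S2 id, DOI:, ARXIV:, CorpusId:) can be made explicit with a small helper. This is a minimal sketch, not part of the server: the function name to_paper_id, the kind labels, and the sample identifiers are all hypothetical, and it assumes that a bare S2 id is passed without any prefix.

```python
# Prefixes taken from the tool description; the dictionary keys are
# labels invented for this sketch.
PREFIXES = {"s2": "", "doi": "DOI:", "arxiv": "ARXIV:", "corpus": "CorpusId:"}


def to_paper_id(kind, value):
    """Prefix a raw identifier so its type is unambiguous."""
    if kind not in PREFIXES:
        raise ValueError(f"unknown id kind: {kind!r}")
    return f"{PREFIXES[kind]}{str(value).strip()}"


print(to_paper_id("arxiv", "2401.01234"))     # ARXIV:2401.01234
print(to_paper_id("doi", "10.1000/example"))  # DOI:10.1000/example (placeholder DOI)
print(to_paper_id("corpus", 12345))           # CorpusId:12345 (placeholder id)
```

Normalizing ids before the call keeps the paper_id argument consistent across the Semantic Scholar tools that share these formats.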

get_paper_references: C

Semantic Scholar: papers this one REFERENCES (its bibliography). id accepts S2 id / DOI: / ARXIV: / CorpusId:.

Parameters
Name | Required | Description | Default
start | No | - | -
paper_id | Yes | - | -
max_results | No | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description must disclose behavioral traits. It only mentions the source (Semantic Scholar) and accepted ID formats. It does not state that the operation is read-only, any rate limits, required permissions, or what happens if the ID is invalid. The minimal disclosure is insufficient given the lack of annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two concise sentences that front-load the core purpose, with no redundant words. It could be structured to list parameters or usage scenarios, but it remains efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 3 parameters, no output schema, and many sibling tools, the description is incomplete. It lacks information on pagination, return structure, or filtering. The agent would need to infer too much to use this tool correctly in complex scenarios.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It only adds meaning for paper_id (acceptable formats). Parameters start and max_results are not explained at all. An agent would not understand pagination or the effect of these parameters without additional context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool retrieves papers referenced by a given paper (its bibliography). It specifies accepted identifier formats, which adds specificity. It does not explicitly distinguish itself from sibling tools like get_paper_citations or get_openalex_references, but the mention of 'references' is sufficient to imply the direction.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No usage guidance is provided. The description does not indicate when to use this tool over alternatives (e.g., get_paper_citations for citations, or get_openalex_references for OpenAlex data). An agent would lack context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
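
Because start and max_results are undocumented, any paging strategy is guesswork. The sketch below shows one plausible reading, under the assumption (not confirmed anywhere on this page) that start is a zero-based offset and max_results is a page size. The function name page_arguments, the page size of 100, and the sample id are hypothetical.

```python
def page_arguments(paper_id, total_wanted, page_size=100):
    """Yield argument dicts that cover `total_wanted` results page by page.

    Assumes `start` is a zero-based offset and `max_results` a page size;
    the tool description does not confirm either.
    """
    for start in range(0, total_wanted, page_size):
        yield {
            "paper_id": paper_id,
            "start": start,
            "max_results": min(page_size, total_wanted - start),
        }


pages = list(page_arguments("ARXIV:2401.01234", total_wanted=250))
for args in pages:
    print(args)
# Three pages: offsets 0, 100, 200 with sizes 100, 100, 50.
```

If the server instead treats start as a page number or caps max_results at a lower value, the loop would need adjusting, which is exactly the ambiguity the Parameters score reflects.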

get_papers_batch: A

Semantic Scholar: fetch many papers at once by id (S2/DOI:/ARXIV:/CorpusId:), up to ~500 per call.

Parameters
Name | Required | Description | Default
ids | Yes | - | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description covers important behavioral traits: the batch limit (~500) and accepted ID formats. However, it does not disclose behavior on invalid IDs, partial results, or error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

One sentence with no wasted words. Front-loaded with the service name and key action ('fetch many papers at once by id'). Efficient and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema or annotations, the description covers the ID formats and batch limit adequately. Lacks details on output format or error behavior, which would be helpful for a batch operation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, but the description adds significant meaning: it specifies the exact ID formats (S2, DOI, ARXIV, CorpusId) and the batch limit, which are not present in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool fetches many papers at once by ID, specifying supported ID formats (S2, DOI, ARXIV, CorpusId) and a batch limit of ~500. This distinguishes it from siblings like get_paper (single) and search_papers.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use when fetching multiple papers by ID, but does not explicitly state when to use this tool versus alternatives like get_paper or search_papers. No exclusion criteria or alternative recommendations are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
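
The "up to ~500 per call" limit means a longer id list has to be split before calling get_papers_batch. The following sketch shows one way to do that; the helper name chunk_ids and the sample CorpusId values are hypothetical, and 500 is used as a conservative ceiling because the description gives the limit only approximately.

```python
def chunk_ids(ids, batch_size=500):
    """Split a sequence of paper ids into lists of at most `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ids = list(ids)
    return [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]


# 1200 placeholder ids; each batch would be sent as the `ids` argument
# of a separate get_papers_batch call.
ids = [f"CorpusId:{n}" for n in range(1, 1201)]
batches = chunk_ids(ids)
print([len(batch) for batch in batches])  # [500, 500, 200]
```

Since the description does not say how invalid ids or partial results are reported, a caller may want to compare the number of returned records against the size of each batch.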

list_categories: C

List common subject category codes for filtering/recent.

Parameters
Name | Required | Description | Default
source | No | - | arxiv

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It only states the action without disclosing any behavioral traits like read-only nature, authentication needs, or side effects, which is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, concise but lacking structure and clarity; it uses informal phrasing ('for filtering/recent') that could be more precise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity and lack of output schema, the description covers the basic purpose but omits details on output format, parameter behavior, and potential edge cases, leaving gaps for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, and the description does not mention the 'source' parameter or explain its meaning beyond the schema default, adding no value over the structured fields.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that it lists subject category codes for filtering or recent-paper listings, with a specific verb and resource. It does not differentiate itself from siblings, although no other sibling tool lists categories.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage context ('for filtering/recent') but provides no explicit guidance on when to use this tool vs alternatives, nor any when-not-to-use conditions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_dataset_releases: A

Semantic Scholar Datasets: list all available release ids (dated snapshots of the full corpus).

Parameters

No parameters

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry full behavioral disclosure. It only states the basic operation, without mentioning any behavioral traits like pagination, rate limits, authentication, or what a release id entails.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence that is concise and front-loaded with 'Semantic Scholar Datasets:' for context. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a parameterless list tool with no output schema, the description is complete enough. It conveys the purpose clearly, though it could optionally indicate the format or ordering of release ids.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has 0 parameters, so schema coverage is trivially 100% and the guideline baseline of 4 applies. The description adds no parameter info, but none is needed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists all available release ids (dated snapshots of the full corpus), using a specific verb and resource. It is easily distinguishable from sibling tools, which focus on papers, authors, searches, etc.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives, such as get_dataset_release or other list tools. There is no mention of prerequisites or context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_ocr_models: A

List the OCR models available for recognize_formula / recognize_table.

Parameters

No parameters

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present; the description does not disclose behavioral traits such as output format, rate limits, or side effects, which is minimal for a read-only listing tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single, non-redundant sentence that conveys the core purpose efficiently, with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (no params, no output schema), the description is adequate but could be improved by specifying what the list contains (e.g., model IDs, names).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With zero parameters and 100% schema coverage, the description adds no param-specific info but baseline is 4 as per guidelines.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists OCR models and specifies they are for use with recognize_formula and recognize_table, distinguishing it from sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context (before recognition tools) but does not explicitly state when to use or provide alternatives, leaving some ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_openalex_topics: B

OpenAlex: search the topic taxonomy (~4500 topics) to find the right subject term for filtering or recent-work queries.

Parameters
Name | Required | Description | Default
query | Yes | - | -
max_results | No | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full responsibility for behavioral disclosure. The description only states the purpose and does not mention any behavioral traits such as read-only nature, pagination, rate limits, or output format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, front-loaded sentence that efficiently conveys the core functionality. While concise, it could benefit from additional context without being verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (2 parameters, no output schema), the description provides minimal but adequate context for an AI agent to infer basic usage. However, it lacks details on parameter format and behavior, which may lead to suboptimal invocations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description partially compensates by indicating the query parameter is for searching topic names. However, it does not describe the max_results parameter or provide details on expected input formats.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool searches the OpenAlex topic taxonomy (~4500 topics) to find subject terms, which is a specific verb and resource. However, it does not explicitly differentiate from sibling search tools like search_authors or search_openalex_works.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for filtering or recent-work queries, providing context for when to use this tool. However, it lacks explicit guidance on when not to use it or comparisons to alternatives among the many sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_paper_sources: A

List available paper corpora.

Parameters

No parameters

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the burden. 'List' implies a non-destructive read operation, which is transparent. No hidden behaviors or contradictions are present.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, zero wasted words. Perfectly concise for a simple tool with no parameters.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description does not explain what a 'paper corpus' is or what the output format looks like. For a simple list tool, additional context could be helpful but not critical.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters exist, so baseline is 4. The description adds no parameter info, but schema coverage is 100%, so no gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb 'list' and resource 'available paper corpora', clearly indicating what the tool does. It distinguishes itself from sibling tools that list other entities, such as categories or dataset releases.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

While no explicit when-to-use or when-not-to-use guidance is given, the name and description clearly indicate its purpose, and among siblings it is unique enough to infer when to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_recent: B

List the latest papers in a subject category, newest first.

Parameters
Name | Required | Description | Default
start | No | - | -
source | No | - | arxiv
category | Yes | - | -
max_results | No | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It only discloses that results are 'newest first'. It does not mention pagination, default source, or what fields are returned. Lacks essential behavioral detail for an agent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, very concise. However, it is slightly under-specified; a bit more detail on parameters could be added without losing conciseness. Still, it follows the principle of being front-loaded and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 4 parameters, no output schema, and no annotations, the description is incomplete. It fails to explain parameters, return format, or behavior like pagination and default source. An agent would have insufficient context to use it reliably.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so description must compensate. However, it only implies the 'category' parameter via the phrase 'subject category', and does not explain 'start', 'source', or 'max_results'. Adds almost no meaning beyond the schema names.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'List', the resource 'papers', and the scope 'latest', 'newest first', and by 'subject category'. This differentiates it from sibling tools like search_papers.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for listing recent papers in a category, but provides no explicit guidance on when to use this tool versus alternatives like search_papers or list_categories. No when-not-to-use or alternative mentions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

match_paper_title: A

Semantic Scholar: find the single paper whose title best matches the given text (exact-match lookup).

Parameters
Name | Required | Description | Default
title | Yes | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry the full behavioral burden. However, it contains an internal contradiction: 'best matches the given text' suggests fuzzy matching while 'exact-match lookup' suggests exact string matching. This ambiguity undermines transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence with no wasted words. It front-loads the key purpose and constraints.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and no annotations, the description should provide more behavioral and return-value context. It states the tool returns a single paper but does not describe the output format or fields, making it incomplete for an agent to understand results.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description must explain the parameter. It only says 'given text', which barely adds meaning beyond the parameter name 'title'. It does not specify case sensitivity, formatting, or whether partial matches are allowed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool finds a single paper whose title best matches the given text via exact-match lookup. It uses a specific verb ('find') and resource ('paper title match'), and distinguishes itself from sibling tools like search_papers by emphasizing 'single paper' and 'exact-match'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when needing an exact or best match to a title text. It contrasts with search siblings by specifying 'single paper', but does not explicitly exclude alternatives or mention when not to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

read_paper: A

Read a paper's full text. format='markdown' (default, body with formulas as $LaTeX$), 'html' (raw LaTeXML HTML), or 'latex' (the original LaTeX manuscript from the e-print source). arXiv only; id like 2401.01234.

Parameters
Name | Required | Description | Default
format | No | - | markdown
source | No | - | arxiv
paper_id | Yes | - | -
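
As a rough illustration of the parameters above, the sketch below assembles and checks an arguments object before it is sent to read_paper. The helper name `build_read_paper_args` is hypothetical, and the id pattern is an assumption: it accepts only new-style arXiv ids such as 2401.01234 (optionally with a version suffix), whereas the server may accept other forms.

```python
import re

# The three formats named in the tool description.
ALLOWED_FORMATS = {"markdown", "html", "latex"}

# Assumption: new-style arXiv ids (YYMM.NNNNN, optional vN suffix) only.
ARXIV_ID = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")


def build_read_paper_args(paper_id, fmt="markdown"):
    """Return an arguments dict for read_paper, or raise ValueError on bad input."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}, got {fmt!r}")
    if not ARXIV_ID.match(paper_id):
        raise ValueError(f"not a new-style arXiv id: {paper_id!r}")
    # The description says the tool is arXiv only, so source is fixed here.
    return {"paper_id": paper_id, "format": fmt, "source": "arxiv"}


print(build_read_paper_args("2401.01234", fmt="latex"))
# {'paper_id': '2401.01234', 'format': 'latex', 'source': 'arxiv'}
```

Validating on the client side is a design choice for the example only; it avoids a round trip for inputs the description already rules out.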
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry the burden of behavioral transparency. It discloses the available formats and the arXiv-only constraint, but does not mention potential errors, rate limits, or size limitations. As a read operation, safety is implied but not stated. The description is adequate but not thorough.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three short sentences: the first states the core function, the second lists the format options, and the third adds the source constraint and an example id. Every word serves a purpose, and the most important information is front-loaded. No redundancy or fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of three parameters and no output schema, the description covers the format options, the paper_id format, and the source limitation. It is mostly complete but lacks details about the return structure or error behavior. It sufficiently addresses the main user needs for a tool that reads paper text.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 0% description coverage, so the description must compensate. It explains the 'format' parameter in detail (markdown with $LaTeX$ formulas, raw HTML, original LaTeX), and gives an example for 'paper_id'. The 'source' parameter is implicitly described as arXiv-only, but not explicitly explained in the parameter context. Overall, it adds significant meaning beyond the bare schema, especially for format options.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's verb ('Read'), resource ('paper's full text'), and specifics such as available formats (markdown, html, latex) and the source (arXiv only). It distinguishes itself from sibling tools like 'get_paper' by focusing on full text retrieval rather than metadata. The example ID clarifies the required format.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives an example of the paper_id format and limits usage to arXiv, implying when to use the tool. However, it does not explicitly contrast with alternatives like 'get_paper' or 'search_papers', nor does it state when not to use it. Usage is implied but lacks explicit guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recognize_formula: A

Recognize a math formula from an image and return LaTeX. Provide image_url (downloaded server-side) OR image_base64. model: deepseek-ocr (default), paddleocr-vl, or texify. Returns {latex, model, elapsed_ms}.

Parameters
Name | Required | Description | Default
model | No | - | deepseek-ocr
image_url | No | - | -
image_base64 | No | - | -
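
The either/or input rule can be sketched as a small argument builder. This is an illustration under stated assumptions, not the server's validation logic: the helper name `build_recognize_args` is hypothetical, "OR" is read as "exactly one", and image_base64 is assumed to be plain standard base64 without a data-URI prefix.

```python
import base64

# The three model names listed in the tool description.
MODELS = ("deepseek-ocr", "paddleocr-vl", "texify")


def build_recognize_args(image_url=None, image_bytes=None, model="deepseek-ocr"):
    """Return an arguments dict carrying exactly one image input."""
    # Assumption: supplying both inputs, or neither, is treated as an error.
    if (image_url is None) == (image_bytes is None):
        raise ValueError("provide exactly one of image_url or image_bytes")
    if model not in MODELS:
        raise ValueError(f"model must be one of {MODELS}, got {model!r}")
    args = {"model": model}
    if image_url is not None:
        args["image_url"] = image_url
    else:
        # Assumption: plain standard base64, no "data:image/...;base64," prefix.
        args["image_base64"] = base64.b64encode(image_bytes).decode("ascii")
    return args


print(build_recognize_args(image_bytes=b"abc", model="texify"))
# {'model': 'texify', 'image_base64': 'YWJj'}
```

The same shape would apply to recognize_table, which shares these three parameters.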
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses the return format including elapsed_ms and that image_url is downloaded server-side. Lacking annotations, the description carries the burden but omits details like supported image formats, size limits, or error behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at four short sentences, front-loading the purpose and then covering input, model, and output details without unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description covers return format and input options. It implies one of image_url or image_base64 must be provided, but does not clarify that both are optional in schema, nor does it discuss error conditions. Still, it is fairly complete for a simple tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema coverage, the description adds meaning by explaining the model parameter with a list of options and default, and clarifies the OR relationship between image_url and image_base64, which is not evident from the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the verb ('Recognize'), the resource ('math formula from an image'), and the output ('return LaTeX'). It distinguishes itself from sibling tools like recognize_table by specifying 'math formula' and listing the return format as LaTeX.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides instructions on how to provide input (image_url or image_base64) and lists model options. However, it does not explicitly differentiate from sibling tools like recognize_table or indicate when not to use this tool, leaving room for confusion.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recognize_table: A

Recognize a table from an image and return LaTeX tabular code. Provide image_url OR image_base64. model: deepseek-ocr (default), paddleocr-vl, or texify. Returns {latex, model, elapsed_ms}.

Parameters
Name | Required | Description | Default
model | No | - | deepseek-ocr
image_url | No | - | -
image_base64 | No | - | -
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses the output format {latex, model, elapsed_ms} and the input constraints (one of image_url or image_base64). However, it does not mention behavior when both inputs are provided, error conditions, or any side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: four short sentences covering the core function and output, the input options, the model choices, and the return shape. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (3 params, no output schema, no annotations), the description covers the essential aspects: function, input methods, model choices, and output format. It could elaborate on error handling or model differences, but is largely sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, but the description explains the 'model' parameter with valid values and states the either/or requirement for image_url and image_base64 (though not what happens if both are supplied). This adds significant meaning beyond the schema's type/default fields.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Recognize a table from an image' and the output 'return LaTeX tabular code'. This verb+resource+output combination is specific and distinguishes it from siblings like 'recognize_formula'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for table recognition and specifies input options (image_url or image_base64) and model choices. However, it does not explicitly state when to use this tool over siblings like 'recognize_formula' or provide exclusions for specific scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recommend_papers_for_paper: A

Semantic Scholar: recommend papers similar to one paper. pool='recent' (last open corpus) or 'all-cs' (all of CS). If the 'recent' pool yields nothing (common for older papers), it automatically retries the 'all-cs' pool.

Parameters
Name | Required | Description | Default
pool | No | - | recent
paper_id | Yes | - | -
max_results | No | - | -
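
The fallback described above can be sketched as follows. This is a hedged illustration of the stated behavior, not the server's code: `recommend_with_fallback` and `fake_fetch` are hypothetical names, the `max_results=10` default is an assumption (the schema shows no default), and the stub data is made up.

```python
def recommend_with_fallback(fetch, paper_id, pool="recent", max_results=10):
    """Call fetch(paper_id, pool, max_results); retry once with 'all-cs' if 'recent' is empty."""
    results = fetch(paper_id, pool, max_results)
    if pool == "recent" and not results:
        # Mirrors the description: an empty 'recent' pool triggers one retry.
        pool = "all-cs"
        results = fetch(paper_id, pool, max_results)
    return pool, results


def fake_fetch(paper_id, pool, max_results):
    # Stand-in for the real recommendation call: 'recent' has nothing for this paper.
    catalogue = {"recent": [], "all-cs": ["paper-x", "paper-y", "paper-z"]}
    return catalogue[pool][:max_results]


print(recommend_with_fallback(fake_fetch, "example-id", max_results=2))
# ('all-cs', ['paper-x', 'paper-y'])
```

Returning the pool that actually produced the results lets a caller see that the fallback fired; whether the real tool reports this is not stated in the description.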
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses the retry logic when 'recent' pool yields no results, but omits details on rate limits, authorization, or output format. Given no annotations, this is partial transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, each adding value: purpose, pool options, and fallback behavior. No wasted words, front-loaded with the core action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

While the description covers purpose and pool behavior, it lacks details on return format, error handling, and other parameters. With no output schema, this leaves gaps for agent usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema coverage, the description adds meaning for the 'pool' parameter (explaining valid values and retry behavior), but provides no information for 'paper_id' or 'max_results', leaving the agent to infer their semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool recommends papers similar to a given paper, with explicit pool options. It distinguishes itself from siblings like recommend_papers_from_examples by specifying that it takes a single paper as input.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear guidance on pool selection and the automatic retry behavior. However, it does not explicitly contrast with sibling recommendation tools, so usage context is clear but not exhaustive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recommend_papers_from_examples: B

Semantic Scholar: recommend papers from positive (and optional negative) example paper ids.

Parameters
Name | Required | Description | Default
max_results | No | - | -
negative_ids | No | - | -
positive_ids | Yes | - | -
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must convey behavior. It states that papers are recommended based on examples, but lacks details on algorithm, limitations, or error handling. Adequate but minimal.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence, front-loaded with the tool's source and function. Could be slightly more detailed without losing conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has 3 parameters and no output schema. The description fails to explain parameters or return value, leaving the agent with insufficient information for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description does not explain the meaning or constraints of parameters like 'max_results', 'positive_ids', or 'negative_ids'. No value added over parameter names.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies the action (recommend papers) and the resource (positive and optional negative example paper IDs). It distinguishes itself from sibling tools like 'recommend_papers_for_paper', which takes a single paper.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. The description does not mention when-not or provide any context for selection among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_all: A

Aggregated search across arXiv, Semantic Scholar and OpenAlex at once. Fans out concurrently, de-duplicates the same work across corpora (by DOI or title) and re-ranks with Reciprocal Rank Fusion, so papers found by several sources rank highest. Each hit lists which sources found it and an ids map ({source: id}) you can pass to get_paper / read_paper / the citation tools. Prefer this over search_papers for a broad lookup.

Parameters
Name | Required | Description | Default
query | Yes | - | -
sources | No | - | arxiv,semanticscholar,openalex
per_source | No | - | -
max_results | No | - | -
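
The de-duplication and re-ranking described above can be sketched in a few lines. This is a minimal illustration of Reciprocal Rank Fusion, where each work scores the sum of 1 / (k + rank) over the sources that returned it; it is not the server's implementation. The constant k=60 is the conventional choice and an assumption here, as are the title normalization, the hit field names (`id`, `title`, `doi`), and the made-up sample data.

```python
import re
from collections import defaultdict


def candidate_keys(hit):
    """Keys under which a hit may match the same work from another source."""
    keys = []
    doi = (hit.get("doi") or "").strip().lower()
    if doi:
        keys.append("doi:" + doi)
    # Titles are compared case-insensitively with punctuation collapsed.
    title = re.sub(r"[^a-z0-9]+", " ", hit["title"].lower()).strip()
    keys.append("title:" + title)
    return keys


def fuse(results_by_source, k=60):
    """Merge per-source ranked lists and order works by RRF score."""
    alias = {}                   # any known key -> canonical key
    merged = {}                  # canonical key -> merged record
    scores = defaultdict(float)  # canonical key -> RRF score
    for source, hits in results_by_source.items():
        for rank, hit in enumerate(hits, start=1):
            keys = candidate_keys(hit)
            canonical = next((alias[key] for key in keys if key in alias), keys[0])
            for key in keys:
                alias.setdefault(key, canonical)
            entry = merged.setdefault(
                canonical, {"title": hit["title"], "sources": [], "ids": {}}
            )
            if source in entry["ids"]:
                continue  # count each source at most once per work
            entry["sources"].append(source)
            entry["ids"][source] = hit["id"]
            scores[canonical] += 1.0 / (k + rank)
    ranked = sorted(merged, key=lambda key: scores[key], reverse=True)
    return [dict(merged[key], score=scores[key]) for key in ranked]


# Made-up sample data: "Paper A" appears in both sources, once without a DOI.
sample = {
    "arxiv": [
        {"id": "ax-1", "title": "Paper A"},
        {"id": "ax-2", "title": "Paper B"},
    ],
    "openalex": [
        {"id": "oa-9", "title": "Paper C", "doi": "10.0000/c"},
        {"id": "oa-1", "title": "paper a", "doi": "10.0000/a"},
    ],
}
for work in fuse(sample):
    print(work["title"], work["sources"], work["ids"])
```

In this sample, Paper A is ranked first because it collects a contribution from both sources, which matches the description's claim that papers found by several sources rank highest.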
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations were provided, but the description discloses key behavioral traits: concurrent fan-out, de-duplication, and Reciprocal Rank Fusion. It also describes the output structure (sources, ids map). No information on rate limits or performance, but it adds value beyond schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, dense paragraph that conveys all necessary information efficiently. Each sentence adds value, and the key recommendation is placed at the end. Could be slightly improved by structuring, but overall well-written.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 4 parameters, no output schema, and no annotations, the description explains the high-level behavior well but lacks detail on parameter semantics and error handling. The tool has moderate complexity, and the description provides sufficient context for broad understanding but not full completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description does not explain individual parameters like query, sources, per_source, or max_results. Only hints about output fields are given. The agent must infer parameter meaning from context, which is insufficient.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs aggregated search across arXiv, Semantic Scholar, and OpenAlex, with de-duplication and RRF re-ranking. It also differentiates from sibling tool search_papers by explicitly recommending this tool for broad lookups.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly advises to prefer this over search_papers for broad lookups. It also explains the output format and how to use the ids for other tools. However, it does not mention when not to use it or any prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_authors: B

Semantic Scholar: search for authors by name; returns profiles with h-index and paper/citation counts.

Parameters
Name | Required | Description | Default
query | Yes | - | -
start | No | - | -
max_results | No | - | -
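
Neither the schema nor the description says how start and max_results interact. The sketch below shows one plausible reading, purely as an assumption: start is a zero-based offset and max_results is the page size. The helper name `page_args` is hypothetical, and the real tool may page differently.

```python
def page_args(query, total, page_size=10):
    """Yield argument dicts that would cover `total` results page by page.

    Assumption: `start` is a zero-based offset and `max_results` a page size.
    """
    for start in range(0, total, page_size):
        yield {
            "query": query,
            "start": start,
            "max_results": min(page_size, total - start),
        }


for args in page_args("Ada Lovelace", total=25, page_size=10):
    print(args)
# {'query': 'Ada Lovelace', 'start': 0, 'max_results': 10}
# {'query': 'Ada Lovelace', 'start': 10, 'max_results': 10}
# {'query': 'Ada Lovelace', 'start': 20, 'max_results': 5}
```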
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It partially discloses behavior by stating the output includes h-index and counts, but it does not mention any safety traits (e.g., read-only), pagination limits, authentication needs, or whether results are sorted. For a search tool, this is adequate but incomplete.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that efficiently conveys the source (Semantic Scholar), action, and output. It is front-loaded and contains no extraneous words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 3 parameters and no output schema, the description is serviceable but lacks guidance on pagination, result limits, or how to handle large result sets. It does not fully orient the agent on the tool's capabilities relative to siblings.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It explains the 'query' parameter implicitly via 'search for authors by name', but does not add meaning for 'start' or 'max_results' beyond their schema titles. The schema titles 'Start' and 'Max Results' are somewhat self-explanatory, but the description offers no additional context on defaults or behavior.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (search), resource (authors), and method (by name). It also specifies the return fields (h-index, paper/citation counts). However, it does not differentiate from sibling tools like 'get_author' or 'search_by_author', which could cause confusion about when to prefer this tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for searching authors by name but provides no guidance on when not to use this tool, nor does it mention alternatives among the many sibling author-related tools. There is no context about prerequisites or edge cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_by_author: C

Find papers by a specific author, newest first.

Parameters
Name | Required | Description | Default
start | No | - | -
author | Yes | - | -
source | No | - | arxiv
max_results | No | - | -
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description carries full burden. It only reveals the sorting order but does not mention read-only nature, rate limits, authentication needs, or any side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely short (8 words) but lacks necessary details. While concise, it under-specifies the tool, making it less helpful than it could be.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 4 parameters, no output schema, no annotations, and only a terse description, the tool definition is severely incomplete. An agent would lack sufficient context to use it correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds no explanation for the four parameters (author, start, source, max_results). The tool name implies author is the filter, but no details are provided about defaults or expected formats.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Find papers'), resource ('papers'), filter ('by a specific author'), and sort order ('newest first'). However, it does not differentiate from the sibling tool 'get_author_papers', which may have similar functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like 'search_papers' or 'get_author_papers'. The description lacks any context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_openalex_authors: C

OpenAlex: search authors; returns profiles with h-index, i10-index, works/citation counts and institutions.

Parameters
Name | Required | Description | Default
query | Yes | - | -
start | No | - | -
max_results | No | - | -
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description lists the output fields, providing some behavioral insight, but fails to disclose pagination behavior, sorting, authentication needs, or any side effects. Since annotations are absent, the description should cover more behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, making it concise, but it omits crucial details about parameters and usage. Brevity without completeness reduces effectiveness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a search tool with three parameters and no output schema, the description is insufficient. It does not explain query semantics, result format, pagination, or how it differs from similar tools, leaving major gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description offers no explanation of the three input parameters (query, start, max_results). With 0% schema description coverage, the description was needed to clarify parameter meaning, but it provides none, leaving the agent to infer from names only.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it searches authors and specifies the returned fields (h-index, i10-index, citations, institutions), distinguishing from sibling tools that search works or institutions. However, it does not differentiate from the sibling 'search_authors' which may have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like 'get_author' for specific author IDs or 'search_authors' which might have different features. The description lacks any contextual usage instructions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_openalex_institutions: B

OpenAlex: search institutions (universities, labs) with ROR id, country, works/citation counts.

Parameters
Name | Required | Description | Default
query | Yes | - | -
max_results | No | - | -
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It hints at filtering criteria but does not disclose behavior like pagination, rate limits, or read-only nature.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence is concise and front-loaded with source ('OpenAlex') and resource ('institutions'). Could include more structure but is efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema and 0% schema coverage; description only partially covers what is returned (ROR id, country, counts) but lacks details on structure, pagination, or edge cases.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so description must compensate. It mentions possible search fields but does not explain how they relate to the 'query' parameter or provide format/syntax details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it searches institutions (universities, labs) in OpenAlex, with specific fields like ROR id, country, and citation counts. Differentiates from sibling tools like search_authors or search_openalex_works.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus others, but the name and description imply it's for institution searches. Lacks exclusions or alternative tool mentions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_openalex_works: B

OpenAlex: advanced filtered work search. Filters: from_year, to_year, is_oa (open access only), min_citations, institution_id. sort_by: relevance|newest|cited.

Parameters
Name | Required | Description | Default
is_oa | No | - | -
query | No | - | -
sort_by | No | - | relevance
to_year | No | - | -
from_year | No | - | -
max_results | No | - | -
min_citations | No | - | -
institution_id | No | - | -
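The filters listed in the description could be assembled along the following lines. This is a minimal sketch, not the server's own validation: the parameter names and the three `sort_by` values come from the listing, while the helper `works_arguments`, the value types (integer years, boolean `is_oa`), the year-order check, and the example values are assumptions.

```python
ALLOWED_SORT = {"relevance", "newest", "cited"}  # values listed in the tool description


def works_arguments(query=None, from_year=None, to_year=None, is_oa=None,
                    min_citations=None, institution_id=None,
                    sort_by="relevance", max_results=None):
    """Assemble arguments for search_openalex_works, omitting unset filters.

    Hypothetical helper: the types and checks below are assumptions,
    since the listing shows no schema descriptions.
    """
    if sort_by not in ALLOWED_SORT:
        raise ValueError(f"sort_by must be one of {sorted(ALLOWED_SORT)}")
    if from_year is not None and to_year is not None and from_year > to_year:
        raise ValueError("from_year must not be later than to_year")
    candidate = {
        "query": query,
        "from_year": from_year,
        "to_year": to_year,
        "is_oa": is_oa,
        "min_citations": min_citations,
        "institution_id": institution_id,
        "sort_by": sort_by,
        "max_results": max_results,
    }
    # Leave out filters the caller did not set instead of sending nulls.
    return {key: value for key, value in candidate.items() if value is not None}


args = works_arguments(query="protein folding", from_year=2020, to_year=2024,
                       is_oa=True, min_citations=50, sort_by="cited")
print(args)
```

Rejecting an unknown `sort_by` on the client side is a design choice made here because the listing does not say how the server reacts to invalid values.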
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must fully disclose behavior. It mentions filters and sort options but omits critical details like pagination, rate limits, empty result handling, or the role of the 'query' parameter. The description is insufficiently transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: a single sentence followed by a compact list. It is front-loaded with the tool's purpose. However, it could be more structured by grouping filters or adding brief usage notes.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 8 parameters and no output schema, the description lacks completeness. It does not explain the 'query' or 'max_results' parameters, nor does it describe the return value or result limits. A user cannot fully understand the tool's behavior from this description.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds some meaning by explaining filters (e.g., 'is_oa (open access only)', 'sort_by: relevance|newest|cited'), but it covers only 5 of 8 parameters and omits 'query' and 'max_results'. With 0% schema coverage, the description partially compensates but remains incomplete.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is an 'advanced filtered work search' for OpenAlex, listing specific filters and sort options. This distinguishes it from siblings like get_openalex_work (single work retrieval) and search_papers (general paper search).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage through its title and filter listing, but it does not explicitly state when to use this tool over alternatives or provide any exclusions. The agent is left to infer context from the sibling list.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_papers: C

Search academic papers. Returns normalized hits with a short abstract preview; call get_paper for the full record.

Parameters
Name | Required | Description | Default
query | Yes | - | -
start | No | - | -
source | No | - | arxiv
sort_by | No | - | relevance
max_results | No | - | -
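Since the listing leaves `start` and `max_results` undocumented, the sketch below shows one plausible way to page through results. It assumes that `start` is a zero-based offset and that `max_results` is a per-call page size; neither is confirmed by the listing. The generator `page_arguments` and the example query are hypothetical; the defaults `arxiv` and `relevance` are the ones shown in the parameter table.

```python
def page_arguments(query, total, page_size=10, source="arxiv", sort_by="relevance"):
    """Yield one search_papers argument dict per page (hypothetical helper).

    Assumption: `start` is a zero-based offset and `max_results` is the
    page size. The listing does not document either parameter.
    """
    for start in range(0, total, page_size):
        yield {
            "query": query,
            "start": start,
            "max_results": min(page_size, total - start),
            "source": source,
            "sort_by": sort_by,
        }


pages = list(page_arguments("graph neural networks", total=25, page_size=10))
for arguments in pages:
    print(arguments["start"], arguments["max_results"])
```

Each hit returned by such a call carries only a short abstract preview, so a follow-up `get_paper` call would be needed for the full record, as the description states.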
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose behavior. It states returns are 'normalized hits with a short abstract preview', which is useful but lacks details on any side effects, authentication, or error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise with two sentences, front-loading the purpose. However, the extreme brevity sacrifices necessary detail for a tool with multiple parameters.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters, no output schema, and no schema descriptions, the description is incomplete. It omits critical details like pagination, source selection, and sorting, which are essential for correct usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% for 5 parameters, but the description does not explain any parameter. It fails to add semantics beyond the schema, leaving all parameters undocumented in plain language.
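To make the "schema coverage" figure concrete, here is a small sketch that computes it under one assumed definition: the share of parameters whose JSON Schema entry carries a non-empty `description`. The listing does not define the metric, so this definition, the function `schema_coverage`, the parameter types, and the example descriptions are all assumptions.

```python
def schema_coverage(schema):
    """Return the fraction of properties that have a non-empty description.

    Assumed definition of "schema coverage"; the listing does not define it.
    """
    properties = schema.get("properties", {})
    if not properties:
        return 0.0  # no parameters at all; treated as 0.0 here by convention
    documented = sum(
        1 for spec in properties.values()
        if str(spec.get("description", "")).strip()
    )
    return documented / len(properties)


# The five search_papers parameters with no descriptions (types are assumed).
bare = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "start": {"type": "integer"},
        "source": {"type": "string", "default": "arxiv"},
        "sort_by": {"type": "string", "default": "relevance"},
        "max_results": {"type": "integer"},
    },
    "required": ["query"],
}

# Same schema with two hypothetical descriptions added for illustration.
partly_documented = {
    **bare,
    "properties": {
        **bare["properties"],
        "query": {"type": "string", "description": "Free-text search terms."},
        "source": {"type": "string", "default": "arxiv",
                   "description": "Which index to search."},
    },
}

print(schema_coverage(bare), schema_coverage(partly_documented))
```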

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Search academic papers' with a specific verb and resource. It distinguishes itself from 'get_paper' by mentioning it returns previews, but doesn't explicitly differentiate from other sibling search tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description suggests using 'get_paper' for full records, providing a clear alternative. However, it doesn't specify when to use this tool vs. other sibling search tools or any prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_papers_bulk: B

Semantic Scholar: bulk paper search (up to 1000 hits, sortable e.g. 'citationCount:desc' or 'publicationDate:desc', with a continuation token). Filters: fields_of_study, year (e.g. '2020-2024'), venue, publication_types, open_access_pdf.

Parameters
Name | Required | Description | Default
sort | No | - | -
year | No | - | -
query | Yes | - | -
token | No | - | -
venue | No | - | -
max_results | No | - | -
fields_of_study | No | - | -
open_access_pdf | No | - | -
publication_types | No | - | -
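The continuation token mentioned in the description suggests a fetch loop like the one sketched below. The parameter names (`query`, `token`, `sort`, `year`) and the example sort and year strings come from the listing. The response shape is an assumption: the keys `papers` and `token` are hypothetical, as are `collect_bulk`, the injected `call_tool` callable, and the stub used to exercise the loop without a network call.

```python
def collect_bulk(call_tool, query, limit=1000, **filters):
    """Follow continuation tokens until `limit` hits are gathered or none remain.

    Assumption: each response is a dict with a "papers" list and an optional
    "token" for the next page. The listing does not document the output.
    """
    hits, token = [], None
    while len(hits) < limit:
        arguments = {"query": query, **filters}
        if token:
            arguments["token"] = token
        page = call_tool("search_papers_bulk", arguments)
        papers = page.get("papers", [])
        hits.extend(papers)
        token = page.get("token")
        if not token or not papers:
            break  # no further pages, or an empty page: stop to avoid looping forever
    return hits[:limit]


def fake_call_tool(name, arguments):
    """Stand-in for a real MCP client call; returns two canned pages."""
    canned = {
        None: {"papers": ["p1", "p2"], "token": "t1"},
        "t1": {"papers": ["p3"], "token": None},
    }
    return canned[arguments.get("token")]


result = collect_bulk(fake_call_tool, "diffusion models",
                      sort="citationCount:desc", year="2020-2024")
print(result)
```

Passing the client call in as a parameter keeps the loop testable and independent of any particular MCP client library.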
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses key behaviors: returns up to 1000 hits, supports sorting and continuation token. However, with no annotations, it fails to mention whether it's read-only, rate limits, or authorization needs. Partially transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence efficiently conveys core functionality and filters. Front-loads key info. Could be slightly more structured for quick scanning, but no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 9 parameters, no output schema, and no annotations, the description covers main purpose and filter options but lacks details on parameter formats, output structure, and pagination behavior. Adequate but leaves gaps for autonomous agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Adds meaning beyond pure schema by listing filter parameters and giving sort examples. But it doesn't explain format for all parameters (e.g., venue, publication_types) or constraints. Incomplete but adds value above baseline (0% schema coverage).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it's a bulk paper search from Semantic Scholar with specific capabilities (up to 1000 hits, sorting, continuation token). Mentioning filters and sortable fields distinguishes it from sibling tools like search_papers and other search variants.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool vs alternatives. It implies bulk usage but doesn't contrast with search_papers (likely for fewer results) or other search tools. Missing when-not-to-use and prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_snippets: B

Semantic Scholar: search INSIDE paper full text and return matching text snippets (not just titles/abstracts).

Parameters
Name | Required | Description | Default
query | Yes | - | -
max_results | No | - | -
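One way an agent might choose between this tool and its sibling search tools is sketched below. The routing rule is an inference from the three tool descriptions (full-text snippets, bulk results of up to 1000 hits, and general search); it is not guidance published by the server, and `pick_search_tool` is a hypothetical helper.

```python
def pick_search_tool(match_in_full_text, many_results=False):
    """Pick a search tool name from two coarse needs (hypothetical routing rule).

    Inferred from the tool descriptions only; the server itself gives
    no "use X instead of Y" guidance.
    """
    if match_in_full_text:
        return "search_snippets"      # searches inside paper full text
    if many_results:
        return "search_papers_bulk"   # up to 1000 hits with a continuation token
    return "search_papers"            # normalized hits with abstract previews


print(pick_search_tool(match_in_full_text=True))
print(pick_search_tool(match_in_full_text=False, many_results=True))
print(pick_search_tool(match_in_full_text=False))
```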
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must cover behavioral traits. It only states basic function (search and return snippets) without details on result format, limits, or pagination. This is insufficient for a tool with no additional metadata.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, clear sentence with no extraneous words. It is front-loaded with the key differentiator.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity of the tool and the lack of output schema or annotations, the description is adequate but has gaps: it does not mention result structure, error handling, or how to interpret snippets. More detail would improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, yet the description does not explain the parameters (query, max_results) beyond the implied search context. For a low-coverage schema, the description should add meaning, but it falls short.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool searches inside paper full text and returns snippets, explicitly distinguishing it from title/abstract searches. This differentiation from siblings like search_papers is strong.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use when needing in-text snippets, but does not explicitly state when not to use or compare with alternatives. Sibling context suggests broader searches use other tools, but no direct guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
