semantic-scholar-mcp-server

by SMABoundless

paper_search_bulk

Conduct bulk paper searches with cursor-based pagination to collect large result sets (up to 10 million). Filter by year, venue, fields, and citation count for systematic literature gathering.

Instructions

Bulk-search papers with cursor-based pagination for retrieving large result sets (up to 10M results). Returns a continuation token to fetch subsequent pages. Use for systematic literature collection.
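
The continuation-token loop described above could be sketched roughly as follows. This is a minimal illustration, not the server's documented client: `call_tool` is a hypothetical callable standing in for whatever MCP client invokes the tool, and the response shape (a `data` list plus an optional `token` key when `response_format` is `'json'`) is an assumption, since the listing publishes no output schema.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical signature: invokes an MCP tool by name with an arguments dict
# and returns the parsed JSON response as a dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def iter_bulk_results(
    call_tool: ToolCaller,
    query: str,
    max_pages: int = 10,
    **filters: Any,
) -> Iterator[Dict[str, Any]]:
    """Yield papers page by page until no continuation token comes back."""
    token: Optional[str] = None
    for _ in range(max_pages):  # hard cap so a large result set cannot loop forever
        args: Dict[str, Any] = {"query": query, "response_format": "json", **filters}
        if token is not None:
            # Pass the token from the previous response to fetch the next page.
            args["token"] = token
        page = call_tool("paper_search_bulk", args)
        # ASSUMPTION: the response holds a "data" list and an optional "token".
        yield from page.get("data", [])
        token = page.get("token")
        if not token:
            break
```

The `max_pages` cap is a design choice for this sketch: with result sets of up to 10M, an unbounded loop could issue thousands of requests, so the caller opts in to deeper pagination explicitly.
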

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| sort | No | Sort order: 'citationCount' (desc), 'publicationDate' (desc), or 'paperId' (asc). Default: relevance. | |
| year | No | Year filter, e.g. '2020' or '2018-2023'. | |
| limit | No | Results per page (1-1000, default: 100). | |
| query | Yes | Search query string. | |
| token | No | Continuation token from a previous bulk search response to get next page. | |
| venue | No | Venue filter, comma-separated. | |
| fields | No | Comma-separated fields to return, overriding defaults. Paper fields: paperId, title, abstract, authors, year, citationCount, referenceCount, influentialCitationCount, isOpenAccess, openAccessPdf, fieldsOfStudy, externalIds, url, venue, publicationVenue, publicationTypes, publicationDate, journal, citations, references. Author fields: authorId, name, affiliations, homepage, paperCount, citationCount, hIndex. | |
| fieldsOfStudy | No | Fields of study, comma-separated. | |
| openAccessPdf | No | Only open access papers. | |
| response_format | No | Output format: 'markdown' for human-readable text (default), 'json' for raw structured data. | markdown |
| minCitationCount | No | Minimum citation count. | |
| publicationTypes | No | Publication types, comma-separated. | |
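
As an illustration of how the parameters listed above fit together, the sketch below assembles an arguments dict and checks the constraints the schema states (the `limit` range, the allowed `sort` values, and the two `response_format` options). The helper name `build_bulk_search_args` is hypothetical, and passing `minCitationCount` as an integer is an assumption, because the listing does not give parameter types.

```python
from typing import Any, Dict, Optional, Sequence

# Sort values named in the schema; omitting sort falls back to relevance.
ALLOWED_SORTS = {"citationCount", "publicationDate", "paperId"}


def build_bulk_search_args(
    query: str,
    *,
    limit: int = 100,
    sort: Optional[str] = None,
    year: Optional[str] = None,
    venue: Optional[Sequence[str]] = None,
    fields: Optional[Sequence[str]] = None,
    min_citation_count: Optional[int] = None,
    response_format: str = "markdown",
) -> Dict[str, Any]:
    """Build a paper_search_bulk arguments dict (hypothetical helper)."""
    if not query:
        raise ValueError("query is required")
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    if sort is not None and sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}")
    if response_format not in ("markdown", "json"):
        raise ValueError("response_format must be 'markdown' or 'json'")

    args: Dict[str, Any] = {
        "query": query,
        "limit": limit,
        "response_format": response_format,
    }
    if sort is not None:
        args["sort"] = sort
    if year is not None:
        args["year"] = year  # e.g. '2020' or '2018-2023'
    if venue:
        args["venue"] = ",".join(venue)  # schema expects comma-separated text
    if fields:
        args["fields"] = ",".join(fields)
    if min_citation_count is not None:
        # ASSUMPTION: the server accepts an integer here.
        args["minCitationCount"] = min_citation_count
    return args
```

Validating on the client side is optional; it simply surfaces an out-of-range `limit` or a misspelled `sort` value before a request is made.
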

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose all behavioral traits. It mentions cursor-based pagination and a continuation token, and sets an upper limit of 10M results. However, it lacks details on rate limits, authentication, cost, or what happens if the result set exceeds 10M. The basic pagination behavior is covered, but further transparency is needed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences: the first states the purpose and key feature, the second adds pagination details, and the third gives usage context. Each sentence is concise and valuable, with no redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (12 parameters, many siblings, no output schema), the description is adequate but incomplete. It explains pagination and token usage but does not describe the response format beyond the token, nor does it offer guidance on using filters. A more complete description would include a brief outline of the response structure or a short usage step.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% for all 12 parameters, so the baseline is 3. The description does not add new meaning beyond the schema; it reiterates the token parameter's purpose but does not enhance understanding of the other parameters. No additional semantics are provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Bulk-search papers with cursor-based pagination for retrieving large result sets (up to 10M results)'. It uses specific verbs and resources, and distinguishes the tool from its sibling (e.g., paper_search) by emphasizing large-scale retrieval and systematic literature collection.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context by stating 'Use for systematic literature collection'. It implies usage for large result sets, though it does not explicitly mention when not to use it or name alternative tools. A score of 4 is given for clear guidance without exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/SMABoundless/semanticscholar-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.