openalex-mcp-server
Server Details
Access the OpenAlex academic research catalog — 270M+ publications.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
- Repository: cyanheads/openalex-mcp-server
- GitHub Stars: 7
- Server Listing: @cyanheads/openalex-mcp-server
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.5/5 across 5 of 5 tools scored.
Each tool has a clearly distinct purpose: search, name resolution, citation graph walking, trend aggregation, and field discovery. No functional overlap.
All tools follow the consistent pattern `openalex_verb_noun` using snake_case. Examples: `openalex_search_entities`, `openalex_resolve_name`.
Five tools is ideal for the server's scope—covering search, resolution, citation analysis, trend analysis, and field metadata without being too sparse or overwhelming.
The tool set covers the core operations for OpenAlex: searching all entity types, resolving names to IDs, walking citation graphs, aggregating trends, and discovering valid fields. For a read-only API, this is comprehensive.
Available Tools
5 tools
openalex_analyze_trends (Openalex Analyze Trends) · A · Read-only · Idempotent
Aggregate OpenAlex entities into groups and count them. Use for trend analysis (group works by publication_year), distribution analysis (group by oa_status, type, country), and comparative analysis (group by institution or topic). Combine with filters to scope the analysis. Returns up to 200 groups per page — use cursor pagination for fields with many distinct values.
| Name | Required | Description | Default |
|---|---|---|---|
| order | No | Sort order for groups. Omit or pass "count" (default) to return the top-N groups by count descending — no further pages. Pass "key" to enumerate all distinct values in key-ascending order with cursor pagination. Use "key" only when you need a full traversal; most analysis calls want "count". | |
| cursor | No | Pagination cursor from a previous response. Only relevant when order is "key" — count-descending results have no next page. Pass the next_cursor from the previous response to advance. | |
| filters | No | Filter criteria (same syntax as openalex_search_entities filters). Narrows the population before aggregation. For full-text within filters, use abstract.search, title.search, or default.search — there is no bare 'search' filter key. Example: group works by year filtered to a specific topic. | |
| group_by | Yes | Field to group by. Works examples: "publication_year", "type", "oa_status", "primary_topic.field.id", "authorships.institutions.country_code", "is_retracted". Authors: "last_known_institutions.country_code", "has_orcid". Sources: "type", "is_oa", "country_code". Not all fields support group_by — check entity docs if unsure. | |
| per_page | No | Maximum groups per page (1-200). Default 200 (the upstream cap). A real top-N knob when order is count (the default) — reduce to return only the highest-count groups. | |
| entity_type | Yes | Entity type to aggregate. | |
| include_unknown | No | Include a group for entities with no value for the grouped field. Hidden by default. | |
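As a rough illustration of how these parameters fit together, the sketch below builds a JSON-RPC `tools/call` request for this tool and checks the constraints stated in the table (the `per_page` range, and `cursor` only making sense with `order="key"`). The helper name `build_analyze_trends_call`, the `"works"` value for `entity_type`, and the example filter value are assumptions for illustration; the listing does not show the server's accepted enum values or any client code.

```python
import json


def build_analyze_trends_call(entity_type, group_by, *, filters=None,
                              order="count", per_page=200, cursor=None,
                              include_unknown=False, request_id=1):
    """Build a JSON-RPC `tools/call` request for openalex_analyze_trends.

    Hypothetical helper: it only mirrors the constraints described in the
    parameter table and does not contact any server.
    """
    if not 1 <= per_page <= 200:
        raise ValueError("per_page must be between 1 and 200")
    if order not in ("count", "key"):
        raise ValueError('order must be "count" or "key"')
    if cursor is not None and order != "key":
        # Count-descending results have no next page, so a cursor is meaningless.
        raise ValueError('cursor is only relevant when order is "key"')

    arguments = {
        "entity_type": entity_type,
        "group_by": group_by,
        "order": order,
        "per_page": per_page,
        "include_unknown": include_unknown,
    }
    if filters:
        arguments["filters"] = filters
    if cursor is not None:
        arguments["cursor"] = cursor

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "openalex_analyze_trends", "arguments": arguments},
    }


# Top 10 publication years for works whose title matches a phrase (assumed values).
request = build_analyze_trends_call(
    "works", "publication_year",
    filters={"title.search": "graph neural network"},
    per_page=10,
)
print(json.dumps(request, indent=2))
```

Keeping the checks client-side is optional; the server presumably validates the same constraints, so this mainly saves a round trip.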
Output Schema
| Name | Required | Description |
|---|---|---|
| echo | Yes | Compact echo of the input criteria (entity_type, group_by, filters) — surfaces what was actually requested when no groups are returned. |
| meta | Yes | Aggregation metadata. |
| groups | Yes | Aggregation groups with counts. |
| notice | No | Guidance notice. Set when no groups are returned (recovery suggestions) or when the page is full and more groups likely exist (truncation signal with narrowing advice). Absent otherwise. |
| totalCount | Yes | Total entities matching the filters before grouping (across all pages). |
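A full traversal with `order="key"` could be sketched as a small generator that keeps passing the cursor back until none is returned. This is a sketch under assumptions: the listing says to pass `next_cursor` from the previous response but does not say where it lives, so `meta.next_cursor` is a guess, as is the `key`/`count` shape of each group. `call_tool` is a hypothetical callable standing in for whatever MCP client is in use, and a fake version is used here so the example runs offline.

```python
from typing import Any, Callable, Dict, Iterator


def iter_all_groups(call_tool: Callable[[Dict[str, Any]], Dict[str, Any]],
                    base_args: Dict[str, Any],
                    max_pages: int = 50) -> Iterator[Dict[str, Any]]:
    """Yield every group by following cursors in key-ascending order."""
    args = dict(base_args, order="key")
    cursor = None
    for _ in range(max_pages):  # hard stop so a bad cursor cannot loop forever
        if cursor is not None:
            args["cursor"] = cursor
        response = call_tool(dict(args))
        yield from response.get("groups", [])
        # Assumed location of the cursor; adjust to the real response shape.
        cursor = response.get("meta", {}).get("next_cursor")
        if not cursor:
            break


def fake_call_tool(args: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for a real MCP client call, returning two canned pages."""
    pages = {
        None: {"groups": [{"key": "2021", "count": 3}],
               "meta": {"next_cursor": "page-2"}},
        "page-2": {"groups": [{"key": "2022", "count": 5}],
                   "meta": {"next_cursor": None}},
    }
    return pages[args.get("cursor")]


groups = list(iter_all_groups(
    fake_call_tool, {"entity_type": "works", "group_by": "publication_year"}))
print(groups)
```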
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, idempotent, and open-world behavior. The description adds valuable context: returns up to 200 groups per page, cursor pagination only for key order, count order returns top-N with no further pages, and include_unknown hidden by default. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a concise single paragraph of ~70 words, front-loaded with the core action. Every sentence serves a purpose: defining the function, listing use cases, mentioning filters, and noting pagination limits. No redundant or extraneous content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's 7 parameters (2 required), full schema coverage, presence of output schema, and annotations, the description covers key behaviors (pagination, filter reference, grouping examples). It does not detail return format, but that is handled by the output schema. It is sufficiently complete for an agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description enhances understanding by providing concrete examples of valid group_by values across entity types (e.g., publication_year, type, oa_status) and clarifying filter syntax (abstract.search, title.search). This adds value beyond the schema's generic descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool aggregates entities into groups and counts them, with specific use cases like trend, distribution, and comparative analysis. It distinguishes from siblings such as openalex_search_entities, which returns individual entities rather than aggregates.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly lists when to use the tool (trend, distribution, comparative analysis) and mentions combining with filters. However, it does not provide explicit when-not-to-use guidance or direct comparisons with sibling tools, though the context makes the distinction clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
openalex_describe_fields (Openalex Describe Fields) · A · Read-only · Idempotent
List valid field names for an OpenAlex entity type and context (filter, group_by, or select). Use proactively before constructing a filter or group_by to avoid invalid-field 400 errors. Pass query to narrow the results by name similarity — useful when you have a partial or guessed field name.
| Name | Required | Description | Default |
|---|---|---|---|
| query | No | Optional partial or guessed field name to rank results by similarity. Pass the field you tried (e.g. "funder") to get the closest matches first. Omit to return all fields for the entity_type + context. | |
| context | Yes | Field usage context. "filter": fields accepted in the filter param. "group_by": fields accepted in group_by (same valid set as filter). "select": fields accepted in select. | |
| entity_type | Yes | OpenAlex entity type to list fields for. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| total | Yes | Total number of valid fields for this entity_type + context. |
| fields | Yes | Valid field names, ranked by similarity to query when provided. |
| context | Yes | Context queried (filter, group_by, or select). |
| entity_type | Yes | Entity type queried. |
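To make "ranked by similarity to query" concrete, here is a minimal sketch that orders a list of field names by string similarity to a guessed name. The listing does not say which similarity measure the server uses; Python's `difflib.SequenceMatcher` is swapped in purely for illustration, and the field list is a small sample taken from the `group_by` examples above rather than real tool output.

```python
from difflib import SequenceMatcher


def rank_fields(fields, query=None):
    """Return fields ordered by similarity to query (most similar first).

    With no query, the list is returned unchanged, mirroring the
    "omit to return all fields" behaviour described in the table.
    """
    if not query:
        return list(fields)
    needle = query.lower()

    def score(name):
        # ratio() is 1.0 for identical strings and approaches 0.0 for unrelated ones.
        return SequenceMatcher(None, needle, name.lower()).ratio()

    return sorted(fields, key=score, reverse=True)


sample_fields = [
    "publication_year",
    "oa_status",
    "primary_topic.field.id",
    "authorships.institutions.country_code",
    "is_retracted",
]
print(rank_fields(sample_fields, "retract")[0])
```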
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint and idempotentHint, so the safety profile is clear. The description adds context about using query to narrow results but doesn't add significant behavioral details beyond that. No contradictions.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences, front-loaded with purpose and usage guidance. Every sentence adds value without redundancy. Highly efficient.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists and parameters are fully covered, the description is complete. It covers what the tool does, when to use it, why, and how the query parameter works, with no missing information.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the baseline is 3. The description explains the query parameter's purpose and the context enum values, but these are already described in the schema. It adds marginal value over the schema descriptions.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists valid field names for an OpenAlex entity type and context, using a specific verb ('list') and resource. It distinguishes from siblings by mentioning proactive use to avoid 400 errors, which sets it apart from other tools like search or analysis.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly instructs to use proactively before constructing filters or group_by to avoid errors, and explains when to use the query parameter for partial field names. This provides clear guidance on when and why to invoke the tool.
openalex_get_citation_graph (Openalex Get Citation Graph) · A · Read-only · Idempotent
Walk the citation graph one hop from a seed work. Direction picks the edge: incoming citations (cites), the seed's own references (cited_by), or OpenAlex's algorithmically-related works (related_to). Note: direction follows OpenAlex's filter convention, which inverts the common English reading — cites returns works that cite the seed; cited_by returns works the seed cites. Results use the works schema; combine with filters/sort to narrow further.
| Name | Required | Description | Default |
|---|---|---|---|
| sort | No | Sort field. Prefix with "-" for descending. Common: "cited_by_count", "-publication_date". Default is OpenAlex relevance. | |
| cursor | No | Pagination cursor from a previous response. Pass to get the next page. | |
| select | No | OpenAlex work field names to return. Always returned: id, display_name. Defaults to the curated works select if omitted. | |
| filters | No | Additional filters to narrow the graph, same syntax as openalex_search_entities. Example: publication_year=">2020", is_oa="true". Do not include cites/cited_by/related_to — those are set by the `direction` parameter. | |
| seed_id | Yes | Seed work identifier. Accepts OpenAlex ID ("W2741809807"), DOI ("10.1038/nature12373" or full URL), PMID, or PMCID. Use openalex_resolve_name first if you only have a title. | |
| per_page | No | Results per page (1-100). Default 25. | |
| direction | Yes | "cites": works that cite seed_id (incoming citations). "cited_by": works that seed_id cites (its reference list). "related_to": OpenAlex algorithmically-related works (~8-30 typical, may be empty for less-cited seeds). | |
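Because the direction names read backwards in plain English, a small lookup can help keep them straight. The sketch below maps each `direction` to its meaning as given in the table and composes the kind of upstream filter string that OpenAlex's `cites`, `cited_by`, and `related_to` work filters use. How the server actually builds its upstream request is not shown in this listing, so `citation_filter` is a hypothetical helper, not the server's code.

```python
# Meaning of each direction, following OpenAlex's filter convention.
DIRECTION_MEANING = {
    "cites": "works that cite the seed (incoming citations)",
    "cited_by": "works the seed cites (its reference list)",
    "related_to": "works OpenAlex considers algorithmically related",
}


def citation_filter(seed_id, direction, extra_filters=None):
    """Compose a comma-separated OpenAlex-style filter string for one hop."""
    if direction not in DIRECTION_MEANING:
        raise ValueError(f"unknown direction: {direction!r}")
    extra = dict(extra_filters or {})
    clash = set(extra) & set(DIRECTION_MEANING)
    if clash:
        # The table says these keys are set by `direction`, not by `filters`.
        raise ValueError(f"do not pass {sorted(clash)} in filters")
    parts = [f"{direction}:{seed_id}"]
    parts += [f"{key}:{value}" for key, value in extra.items()]
    return ",".join(parts)


print(DIRECTION_MEANING["cites"])
print(citation_filter("W2741809807", "cites", {"publication_year": ">2020"}))
# prints: cites:W2741809807,publication_year:>2020
```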
Output Schema
| Name | Required | Description |
|---|---|---|
| echo | Yes | Compact echo of seed_id, direction, filters, sort — surfaces what was actually queried when no edges are returned. |
| meta | Yes | Result metadata including pagination. |
| notice | No | Recovery guidance when no edges are returned — suggests verifying the seed_id, broadening filters, or trying a different direction. Absent when results are present. |
| results | Yes | Works on the citation graph in this direction. |
| totalCount | Yes | Total edges from seed_id in this direction across all pages. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint, openWorldHint, and idempotentHint. The description adds valuable context: the inverted direction naming, that 'related_to' may be empty for less-cited seeds, and that results use the works schema. This goes beyond what annotations convey.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two paragraphs, front-loaded with the core purpose. Every sentence is informative: the first paragraph covers the action and direction, the second explains the naming nuance and composability. No wasted words.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema, the description does not need to detail return values. It covers purpose, all parameter behaviors, caveats (direction inversion, related_to emptiness), and usage hints (combine with filters). This is complete for a graph-walking tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, baseline is 3. The description adds meaning beyond the schema by clarifying the direction inversion ('cites returns works that cite the seed') and the typical size of related_to results. This helps avoid common misinterpretations.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb 'walk' and resource 'citation graph from a seed work', clearly distinguishing it from siblings like openalex_search_entities (free-text search) and openalex_analyze_trends (trend analysis). It precisely defines the scope as one-hop traversal.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains the three direction options and their inverted convention, helping the user choose correctly. It also mentions combining with filters/sort. However, it does not explicitly state when to avoid this tool (e.g., for multi-hop traversal) or compare directly to alternatives.
openalex_resolve_name (Openalex Resolve Name) · A · Read-only · Idempotent
Resolve a name or partial name to an OpenAlex ID. Returns up to 10 matches with disambiguation hints. ALWAYS use this before filtering by entity — names are ambiguous, IDs are not. Also accepts DOIs directly for quick lookup.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Name or partial name to resolve. Also accepts DOIs for quick lookup. | |
| filters | No | Narrow autocomplete results with filters. Example: restrict to a specific country or publication year range. | |
| entity_type | No | Entity type to search. Omit for cross-entity search (useful when entity type is unknown). | |
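The parameter descriptions refer to "autocomplete results", which suggests the tool sits on top of OpenAlex's autocomplete endpoint. Assuming that is the case (the listing does not confirm it), the upstream request would look roughly like the URL built below. `autocomplete_url` is a hypothetical helper that only constructs the URL and makes no network call.

```python
from urllib.parse import urlencode

AUTOCOMPLETE_BASE = "https://api.openalex.org/autocomplete"


def autocomplete_url(query, entity_type=None, filters=None):
    """Build an OpenAlex autocomplete URL for a name or partial name.

    Omitting entity_type targets the cross-entity endpoint, matching the
    "omit for cross-entity search" behaviour described in the table.
    """
    path = f"{AUTOCOMPLETE_BASE}/{entity_type}" if entity_type else AUTOCOMPLETE_BASE
    params = {"q": query}
    if filters:
        params["filter"] = ",".join(f"{k}:{v}" for k, v in filters.items())
    return f"{path}?{urlencode(params)}"


print(autocomplete_url("harvard", "institutions"))
# prints: https://api.openalex.org/autocomplete/institutions?q=harvard
print(autocomplete_url("harvard univ", "institutions", {"country_code": "us"}))
```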
Output Schema
| Name | Required | Description |
|---|---|---|
| notice | No | Recovery guidance when no matches were found — echoes the query and suggests corrections. Absent when results are present. |
| results | Yes | Autocomplete matches, up to 10. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint, openWorldHint, and idempotentHint. The description adds that the tool returns up to 10 matches with disambiguation hints and accepts DOIs, providing useful behavioral context beyond the annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, each adding distinct value: core function, return details, usage guidance, and DOI lookup. No wasted words, well-structured.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers return limit, disambiguation, DOIs, and usage context. It doesn't detail output schema, but that is provided separately. Could mention behavior on no matches, but overall complete for a resolution tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, baseline is 3. The description adds meaningful guidance for the query parameter by advising to use it before entity filtering and noting DOI acceptance. This enhances semantics beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool resolves a name or partial name to an OpenAlex ID, specifies return limit and disambiguation hints, and mentions DOI acceptance. This is distinct from sibling tools like searching entities or analyzing trends.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises using this tool before filtering by entity, emphasizing that names are ambiguous while IDs are not. It also mentions DOI acceptance. However, it does not explicitly exclude scenarios where other tools like openalex_search_entities should be used instead.
openalex_search_entities (Openalex Search Entities) · A · Read-only · Idempotent
Search, filter, sort, or retrieve by ID. Covers all OpenAlex entity types (works, authors, sources, institutions, topics, keywords, publishers, funders). Pass id to retrieve a single entity. Otherwise, use query and/or filters for discovery. Supports keyword search with boolean operators, exact phrase matching, and AI semantic search. Use openalex_resolve_name to resolve names to IDs before filtering. Searches and ID lookups return a curated set of fields by default; pass select to override with specific fields, or ["*"] for the full record.
| Name | Required | Description | Default |
|---|---|---|---|
| id | No | Retrieve a single entity by ID. Supports: OpenAlex ID ("W2741809807"), DOI ("10.1038/nature12373"), ORCID ("0000-0002-1825-0097"), ROR ("https://ror.org/00hx57361"), PMID ("12345678"), PMCID ("PMC1234567"), ISSN ("1234-5678"). When provided, other search/filter/sort params are ignored — but `select` still applies: the curated per-entity-type default is returned unless you pass `select` (use `["*"]` for the complete record). Use openalex_resolve_name to find the ID if unknown. | |
| seed | No | Deterministic seed for `sample`. Same seed + same filters = same results — pass when reproducibility matters. Has no effect (and is rejected) without `sample`. | |
| sort | No | Sort field. Prefix with "-" for descending. Common: "cited_by_count", "-publication_date", "-relevance_score" (default when query present). Note: when combined with a keyword query, an explicit sort overrides relevance ranking entirely — top results may be highly cited but only tangentially on-topic. Use "-relevance_score" or omit sort to keep the most relevant results first. "-relevance_score" requires an active search via "query" or a "filter:search" filter — passing it without one will fail. | |
| query | No | Text search query. Supports boolean operators (AND, OR, NOT), quoted phrases ("exact match"), wildcards (machin*), fuzzy matching (machin~1), and proximity ("climate change"~5). Omit for filter-only queries. | |
| cursor | No | Pagination cursor from a previous response. Pass to get the next page. | |
| sample | No | Return a random sample of this many entities matching the filters (1-100). Single page only — pagination via `cursor` is not supported with sampling. Overrides `per_page`. Useful for unbiased exploration: spot-checking filter correctness, stratified review prompts, or generating exploration sets without bias toward most-cited. | |
| select | No | OpenAlex top-level field names to return. Always returned: `id`, `display_name` — additional fields you list are appended. A curated default per entity type applies to both searches and single-entity (`id`) lookups; pass field names to override it, or `["*"]` to retrieve the complete record (every field). Invalid field names produce an error identifying the rejected field. Example: ["doi", "authorships", "primary_topic"]. | |
| filters | No | Filter criteria as field:value pairs. AND across fields (multiple keys). OR within field: pipe-separate ("us|gb"). NOT: prefix "!" ("!us"). Range: "2020-2024". Comparison: ">100", "<50". AND within same field: "+"-separate. Use OpenAlex IDs (not names) for entity filters — resolve names first. Common keys: `openalex` (filter by entity ID, e.g. {"openalex": "W123|W456"}), `cites` (works citing a given work), `publication_year` (range "2020-2024"), `authorships.author.id`, `type`, `is_oa`. | |
| per_page | No | Results per page (1-100). Default 25. Semantic search caps at 50 — when search_mode="semantic", set per_page ≤ 50 (also subject to a 1 req/sec rate limit upstream). | |
| entity_type | Yes | Type of scholarly entity to search. | |
| search_mode | No | Search strategy. "keyword": stemmed full-text (default). "exact": no stemming, matches individual words (use quoted phrases for multi-word exact match). "semantic": AI embedding similarity (max 50 results, 1 req/sec). | keyword |
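The value syntax described for `filters` (pipe for OR, `!` for NOT, `+` for AND within a field, ranges, comparisons) is easy to mistype by hand. A few tiny helpers, sketched below, can compose those values. The helper names are hypothetical, and `cited_by_count` and the `"article"` type value are assumed examples of OpenAlex works filters that this listing does not itself mention.

```python
def any_of(*values):
    """OR within a field: pipe-separated values."""
    return "|".join(str(v) for v in values)


def all_of(*values):
    """AND within the same field: plus-separated values."""
    return "+".join(str(v) for v in values)


def not_(value):
    """Negate a single value."""
    return f"!{value}"


def between(low, high):
    """Inclusive range, e.g. a span of publication years."""
    return f"{low}-{high}"


def greater_than(number):
    return f">{number}"


def less_than(number):
    return f"<{number}"


# Multiple keys are ANDed together across fields.
filters = {
    "authorships.institutions.country_code": any_of("us", "gb"),
    "publication_year": between(2020, 2024),
    "cited_by_count": greater_than(100),  # assumed field name
    "type": "article",                    # assumed value
}
print(filters)
print(not_("us"))
```

Entity filters such as `authorships.author.id` still need OpenAlex IDs rather than names, so values passed to these helpers should come from openalex_resolve_name where applicable.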
Output Schema
| Name | Required | Description |
|---|---|---|
| echo | Yes | Compact echo of the input criteria (entity_type, query, filters, sort, search_mode) — surfaces what was actually searched when results are empty. |
| meta | Yes | Result metadata including pagination. |
| notice | No | Recovery guidance when results are empty — echoes the criteria and suggests how to broaden. Absent on successful result pages. |
| results | Yes | OpenAlex entity objects passed through unchanged. Additional fields depend on entity_type and select. |
| totalCount | Yes | Total results matching the query/filters across all pages. |
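Several of the input parameters for this tool interact: `seed` is rejected without `sample`, `sample` cannot be combined with `cursor`, semantic search caps `per_page` at 50, and `-relevance_score` needs an active search. A client could check these before calling, along the lines of the sketch below. `check_search_args` is a hypothetical helper, and treating any filter key ending in `.search` as a search filter is an assumption based on the keys named elsewhere on this page (`title.search`, `abstract.search`, `default.search`).

```python
def check_search_args(args):
    """Return a list of problems found in openalex_search_entities arguments."""
    problems = []
    sample = args.get("sample")

    if sample is None:
        if "seed" in args:
            problems.append("seed is rejected without sample")
        # per_page only matters when not sampling (sample overrides it).
        limit = 50 if args.get("search_mode") == "semantic" else 100
        per_page = args.get("per_page", 25)
        if not 1 <= per_page <= limit:
            problems.append(f"per_page must be between 1 and {limit}")
    else:
        if not 1 <= sample <= 100:
            problems.append("sample must be between 1 and 100")
        if "cursor" in args:
            problems.append("cursor pagination is not supported with sample")

    if args.get("sort") == "-relevance_score":
        # Assumption: a search filter is any key ending in ".search".
        has_search_filter = any(
            key.endswith(".search") for key in args.get("filters", {}))
        if not args.get("query") and not has_search_filter:
            problems.append("-relevance_score needs a query or a search filter")

    return problems


print(check_search_args({"entity_type": "works", "seed": 7}))
print(check_search_args({"entity_type": "works", "sample": 10, "seed": 7}))
```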
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnly, openWorld, and idempotent. Description adds context: id retrieval ignores other params, sort overrides relevance, semantic search has rate limits and caps. No contradictions. Could mention pagination default but already addressed via cursor param.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured and front-loaded. Every sentence adds value, though slightly verbose. Could be tightened but still effective for a complex tool.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 11 parameters, 1 required, output schema exists, and nested objects, the description thoroughly covers all aspects: search modes, filtering syntax, pagination, sampling, and edge cases. No gaps.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, baseline is 3. Description adds significant value: explains sort's relevance trade-off, per_page's semantic cap, various ID formats, seed's role with sample, and select's behavior with default vs full record. Exceeds baseline.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Search, filter, sort, or retrieve by ID.' It covers all OpenAlex entity types and distinguishes from sibling tools like openalex_resolve_name, which is explicitly referenced for name resolution.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit guidance: 'Use openalex_resolve_name to resolve names to IDs before filtering.' Also specifies when to use different search modes (keyword, exact, semantic) and that id lookups ignore other params. This helps the agent choose between this tool and others.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
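For servers that serve static files from a directory, the claim file could be generated with a short script like the sketch below. `write_glama_claim` and the idea of a `site_root` directory are assumptions for illustration; only the file path and JSON structure come from the instructions above. The example writes into a temporary directory so it can be run safely.

```python
import json
import tempfile
from pathlib import Path


def write_glama_claim(site_root, email):
    """Write .well-known/glama.json under site_root and return its path."""
    target = Path(site_root) / ".well-known" / "glama.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    payload = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email}],
    }
    target.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
    return target


# Demonstration against a throwaway directory instead of a real web root.
with tempfile.TemporaryDirectory() as tmp:
    path = write_glama_claim(tmp, "your-email@example.com")
    print(path.read_text(encoding="utf-8"))
```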
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
- Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
- Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
- Centralized credential management – store and rotate API keys and OAuth tokens in one place
- Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
- Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
- Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
- Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
- The server is experiencing an outage
- The URL of the server is wrong
- Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.