unpaywall_fetch_pdf_text
Extract text content from open access research papers using a DOI or PDF URL to enable analysis of academic literature.
Instructions
Download and extract text from the best open access (OA) PDF for a DOI, or from a provided PDF URL.
Input Schema
Name | Required | Description | Default |
---|---|---|---|
doi | No | DOI string or DOI URL. Used if pdf_url is not provided. | |
email | No | Email to identify requests to Unpaywall (required when resolving via DOI). | |
pdf_url | No | Direct PDF URL to download and parse (takes precedence over DOI). | |
truncate_chars | No | Max characters of extracted text to return (default 20000). | |
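The precedence rules in the table can be sketched in Python as follows. This is only an illustration of how the parameters interact, not the tool's actual implementation: the helper names `normalize_doi` and `plan_fetch` are hypothetical, the prefix-stripping is one assumed way to accept a "DOI string or DOI URL", and the lookup URL follows the public Unpaywall v2 REST endpoint, which this tool may or may not call in exactly this form.

```python
from urllib.parse import quote

DEFAULT_TRUNCATE_CHARS = 20000  # default stated in the truncate_chars description

# Assumed set of prefixes to strip; the tool's real normalization is not documented.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: str) -> str:
    """Reduce a DOI URL or 'doi:' string to the bare DOI (hypothetical helper)."""
    doi = doi.strip()
    lowered = doi.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            return doi[len(prefix):]
    return doi


def plan_fetch(args: dict) -> dict:
    """Decide which input wins and which URL would be requested first."""
    limit = args.get("truncate_chars", DEFAULT_TRUNCATE_CHARS)

    # pdf_url takes precedence over doi, so check it first.
    if args.get("pdf_url"):
        return {"source": "pdf_url", "url": args["pdf_url"], "truncate_chars": limit}

    if args.get("doi"):
        # email is optional in the schema but required for the DOI path.
        if not args.get("email"):
            raise ValueError("email is required when resolving via DOI")
        doi = normalize_doi(args["doi"])
        lookup = (
            "https://api.unpaywall.org/v2/"
            f"{quote(doi, safe='/')}?email={quote(args['email'])}"
        )
        return {"source": "doi", "url": lookup, "truncate_chars": limit}

    raise ValueError("provide either pdf_url or doi")
```

Calling `plan_fetch` with both `pdf_url` and `doi` set returns the `pdf_url` branch, and a `doi` without an `email` raises a `ValueError` before any request would be made.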
Input Schema (JSON Schema)
{
  "additionalProperties": false,
  "properties": {
    "doi": {
      "description": "DOI string or DOI URL. Used if pdf_url is not provided.",
      "type": "string"
    },
    "email": {
      "description": "Email to identify requests to Unpaywall (required when resolving via DOI).",
      "type": "string"
    },
    "pdf_url": {
      "description": "Direct PDF URL to download and parse (takes precedence over DOI).",
      "type": "string"
    },
    "truncate_chars": {
      "description": "Max characters of extracted text to return (default 20000).",
      "minimum": 1000,
      "type": "integer"
    }
  },
  "required": [],
  "type": "object"
}
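A client may want to check arguments locally before calling the tool. The sketch below hand-codes the three constraints visible in the schema (no additional properties, the declared types, and `minimum: 1000` for `truncate_chars`). The function name `validate_args` is hypothetical, and this is a simplified stand-in for a full JSON Schema validator rather than a complete one.

```python
# Property names and Python types mirroring the schema's "properties" block.
ALLOWED = {"doi": str, "email": str, "pdf_url": str, "truncate_chars": int}
MIN_TRUNCATE_CHARS = 1000  # "minimum" from the schema


def validate_args(args: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments pass."""
    errors = []
    for key, value in args.items():
        if key not in ALLOWED:
            # "additionalProperties": false rejects unknown keys.
            errors.append(f"unexpected property: {key}")
        elif isinstance(value, bool) or not isinstance(value, ALLOWED[key]):
            # bool is a subclass of int in Python, so exclude it explicitly.
            errors.append(f"{key} must be of type {ALLOWED[key].__name__}")

    limit = args.get("truncate_chars")
    if isinstance(limit, int) and not isinstance(limit, bool):
        if limit < MIN_TRUNCATE_CHARS:
            errors.append(f"truncate_chars must be >= {MIN_TRUNCATE_CHARS}")
    return errors
```

Because `required` is empty, an empty argument object passes this check even though the tool needs either `pdf_url` or `doi` (plus `email`) to do anything useful; that rule lives in the parameter descriptions, not in the schema.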