arxiv:get
Retrieve arXiv research papers by ID to access metadata, abstracts, or full text for legal research and citation purposes.
Instructions
Retrieve an arXiv paper by ID (e.g., "2501.02725"). By default, the tool returns metadata and the abstract. When `section` or `save_path` is given, it fetches the HTML full text (available for papers from roughly 2024 onward). Older papers without HTML return metadata, the abstract, and a PDF link.
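The rule above (either optional argument switches the tool to a full text fetch) can be sketched as follows. This is a minimal illustration, not the tool's own code: `GetArgs`, `FetchMode`, and `fetchMode` are hypothetical names, and the save path is an arbitrary example.

```typescript
// Hypothetical names for illustration; only the argument names come from the tool.
interface GetArgs {
  id: string;
  section?: string;
  save_path?: string;
}

type FetchMode = 'metadata' | 'full-text';

// Full text is requested as soon as either optional argument is present.
function fetchMode(args: GetArgs): FetchMode {
  return args.section || args.save_path ? 'full-text' : 'metadata';
}

// Three typical argument shapes.
const metadataOnly = fetchMode({ id: '2501.02725' });
const oneSection = fetchMode({ id: '2501.02725', section: 'Introduction' });
const savedCopy = fetchMode({ id: '2501.02725v5', save_path: './papers/2501.02725.md' });
```

An empty string counts as "not provided" in this sketch, because the check relies on truthiness.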
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| id | Yes | arXiv ID (e.g., "2501.02725", "2501.02725v5") | |
| section | No | Section heading or "lines:100-200". Triggers full text fetch. | |
| save_path | No | Save full text to file. Triggers full text fetch. | |
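The `section` argument accepts either a heading or a `lines:100-200` range. One way to tell the two forms apart is sketched below. The tool's own `extractSection` helper is not shown in this document, so `parseSectionSelector` and `sliceLines` are hypothetical, and the 1-indexed, inclusive line range is an assumption.

```typescript
// Hypothetical helpers: the tool's real section extraction is not shown here.
type SectionSelector =
  | { kind: 'lines'; start: number; end: number }
  | { kind: 'heading'; heading: string };

// Treat "lines:<start>-<end>" as a line range; anything else is a heading.
function parseSectionSelector(section: string): SectionSelector {
  const trimmed = section.trim();
  const match = /^lines:(\d+)-(\d+)$/.exec(trimmed);
  if (match) {
    const start = Number(match[1]);
    const end = Number(match[2]);
    if (start >= 1 && start <= end) return { kind: 'lines', start, end };
  }
  return { kind: 'heading', heading: trimmed };
}

// Assumption: line ranges are 1-indexed and inclusive on both ends.
function sliceLines(text: string, start: number, end: number): string {
  return text.split('\n').slice(start - 1, end).join('\n');
}
```

A malformed range such as `lines:200-100` falls back to heading lookup in this sketch; a stricter implementation might reject it instead.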
Implementation Reference
- src/providers/arxiv/tools/get.ts:8-45 (handler) The `handleGet` function performs the actual logic for the `arxiv:get` tool: it fetches paper metadata via `ArxivClient`, checks whether full text is requested, and optionally converts and saves the content.
```typescript
// Excerpt: mkdirSync and writeFileSync come from Node's fs module, dirname from Node's path module.
// htmlToMarkdown and extractSection are project helpers defined outside this range.
export async function handleGet(client: ArxivClient, args: Record<string, unknown>): Promise<ToolResult> {
  const { id, section, save_path } = args as { id: string; section?: string; save_path?: string };

  // Always fetch metadata from Atom API
  const { entries } = await client.search({ id_list: id, max_results: 1 });
  if (!entries.length) return { content: [{ type: 'text', text: `Paper ${id} not found.` }], isError: true };

  const entry = entries[0];
  // Header labels are emitted in German (Autoren = authors, Datum = date, Kategorien = categories)
  const header = [
    `# ${entry.title}`,
    `\n**Autoren:** ${entry.authors.join(', ')}`,
    `**Datum:** ${entry.published} | **Kategorien:** ${entry.categories.join(', ')}`,
    entry.doi ? `**DOI:** ${entry.doi}` : '',
    entry.journalRef ? `**Journal:** ${entry.journalRef}` : '',
    `**PDF:** ${entry.pdfUrl}`,
  ].filter(Boolean).join('\n');

  // Full text only when section or save_path requested
  if (!section && !save_path) {
    return { content: [{ type: 'text', text: `${header}\n\n## Abstract\n\n${entry.summary}` }] };
  }

  const html = await client.getHtml(entry.id);
  if (!html) {
    const msg = `${header}\n\n## Abstract\n\n${entry.summary}\n\n---\n*Full HTML text not available for this paper (pre-2024). Use the PDF link above.*`;
    return { content: [{ type: 'text', text: msg }] };
  }

  const markdown = `${header}\n\n---\n\n${htmlToMarkdown(html)}`;

  // Saving takes precedence over section extraction
  if (save_path) {
    mkdirSync(dirname(save_path), { recursive: true });
    writeFileSync(save_path, markdown, 'utf-8');
    return { content: [{ type: 'text', text: `Saved to ${save_path} (${markdown.length} chars)` }] };
  }

  return { content: [{ type: 'text', text: extractSection(markdown, section!) }] };
}
```
- Defines the schema and description for the `arxiv:get` tool, including inputs `id`, `section`, and `save_path`.
```typescript
{
  name: 'arxiv:get',
  description:
    'Retrieve an arXiv paper by ID (e.g., "2501.02725"). ' +
    'Default: metadata + abstract. With `section` or `save_path`: fetches HTML full text (available for papers from ~2024+). ' +
    'Older papers without HTML return metadata + abstract + PDF link.',
  inputSchema: z.object({
    id: z.string().describe('arXiv ID (e.g., "2501.02725", "2501.02725v5")'),
    section: z.string().optional().describe('Section heading or "lines:100-200". Triggers full text fetch.'),
    save_path: z.string().optional().describe('Save full text to file. Triggers full text fetch.'),
  }),
},
```
- src/providers/arxiv/provider.ts:13-16 (registration) The tool is registered and dispatched within `ArxivProvider.handleToolCall` by mapping `arxiv:get` to `handleGet`.
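```typescript
async handleToolCall(name: string, args: Record<string, unknown>): Promise<ToolResult> {
  switch (name) {
    case 'arxiv:search': return handleSearch(this.client, args);
    case 'arxiv:get': return handleGet(this.client, args);
    // (excerpt ends here; any remaining cases and the closing braces are outside the quoted range)
```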
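The dispatch pattern above can be tried in isolation with stub handlers. The following is a simplified sketch under stated assumptions: the `ToolResult` shape is reduced to the fields visible in the handler excerpt, the stubs stand in for `handleSearch` and `handleGet`, the `query` argument is invented for the search stub, and the error result for unknown tool names is an assumption, since the excerpt does not show a default case.

```typescript
// Simplified, hypothetical types; the real ToolResult lives in the project source.
interface ToolResult {
  content: { type: 'text'; text: string }[];
  isError?: boolean;
}

type Handler = (args: Record<string, unknown>) => Promise<ToolResult>;

// Small helper to build a text result.
const textResult = (text: string): ToolResult => ({ content: [{ type: 'text', text }] });

// Stub handlers standing in for handleSearch and handleGet.
const handlers: Record<string, Handler> = {
  'arxiv:search': async (args) => textResult(`search: ${String(args['query'])}`),
  'arxiv:get': async (args) => textResult(`get: ${String(args['id'])}`),
};

// Assumption: unknown tool names yield an error result rather than throwing.
async function handleToolCall(name: string, args: Record<string, unknown>): Promise<ToolResult> {
  const handler = handlers[name];
  if (!handler) return { ...textResult(`Unknown tool: ${name}`), isError: true };
  return handler(args);
}
```

A lookup table is used here instead of a `switch` only to keep the sketch short; both forms map a tool name to exactly one handler.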