Skip to main content
Glama
metaneutrons

German Legal MCP Server

by metaneutrons

arxiv:get

Retrieve arXiv research papers by ID to access metadata, abstracts, or full text for legal research and citation purposes.

Instructions

Retrieve an arXiv paper by ID (e.g., "2501.02725"). Default: metadata + abstract. With section or save_path: fetches HTML full text (available for papers from ~2024+). Older papers without HTML return metadata + abstract + PDF link.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
idYesarXiv ID (e.g., "2501.02725", "2501.02725v5")
sectionNoSection heading or "lines:100-200". Triggers full text fetch.
save_pathNoSave full text to file. Triggers full text fetch.

Implementation Reference

  • The `handleGet` function performs the actual logic for the `arxiv:get` tool, fetching paper metadata via `ArxivClient`, checking if full text is requested, and optionally converting and saving the content.
    export async function handleGet(client: ArxivClient, args: Record<string, unknown>): Promise<ToolResult> {
      const { id, section, save_path } = args as { id: string; section?: string; save_path?: string };
    
      // Always fetch metadata from Atom API
      const { entries } = await client.search({ id_list: id, max_results: 1 });
      if (!entries.length) return { content: [{ type: 'text', text: `Paper ${id} not found.` }], isError: true };
    
      const entry = entries[0];
      const header = [
        `# ${entry.title}`,
        `\n**Autoren:** ${entry.authors.join(', ')}`,
        `**Datum:** ${entry.published} | **Kategorien:** ${entry.categories.join(', ')}`,
        entry.doi ? `**DOI:** ${entry.doi}` : '',
        entry.journalRef ? `**Journal:** ${entry.journalRef}` : '',
        `**PDF:** ${entry.pdfUrl}`,
      ].filter(Boolean).join('\n');
    
      // Full text only when section or save_path requested
      if (!section && !save_path) {
        return { content: [{ type: 'text', text: `${header}\n\n## Abstract\n\n${entry.summary}` }] };
      }
    
      const html = await client.getHtml(entry.id);
      if (!html) {
        const msg = `${header}\n\n## Abstract\n\n${entry.summary}\n\n---\n*Full HTML text not available for this paper (pre-2024). Use the PDF link above.*`;
        return { content: [{ type: 'text', text: msg }] };
      }
    
      const markdown = `${header}\n\n---\n\n${htmlToMarkdown(html)}`;
    
      if (save_path) {
        mkdirSync(dirname(save_path), { recursive: true });
        writeFileSync(save_path, markdown, 'utf-8');
        return { content: [{ type: 'text', text: `Saved to ${save_path} (${markdown.length} chars)` }] };
      }
    
      return { content: [{ type: 'text', text: extractSection(markdown, section!) }] };
    }
  • Defines the schema and description for the `arxiv:get` tool, including inputs `id`, `section`, and `save_path`.
    {
      name: 'arxiv:get',
      description:
        'Retrieve an arXiv paper by ID (e.g., "2501.02725"). ' +
        'Default: metadata + abstract. With `section` or `save_path`: fetches HTML full text (available for papers from ~2024+). ' +
        'Older papers without HTML return metadata + abstract + PDF link.',
      inputSchema: z.object({
        id: z.string().describe('arXiv ID (e.g., "2501.02725", "2501.02725v5")'),
        section: z.string().optional().describe('Section heading or "lines:100-200". Triggers full text fetch.'),
        save_path: z.string().optional().describe('Save full text to file. Triggers full text fetch.'),
      }),
    },
  • The tool is registered and dispatched within `ArxivProvider.handleToolCall` by mapping `arxiv:get` to `handleGet`.
    async handleToolCall(name: string, args: Record<string, unknown>): Promise<ToolResult> {
      switch (name) {
        case 'arxiv:search': return handleSearch(this.client, args);
        case 'arxiv:get': return handleGet(this.client, args);
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full behavioral disclosure burden. It effectively communicates the conditional output formats (metadata vs HTML full text vs PDF link), the date-based availability limitation for HTML content, and the fallback behavior for older papers. Lacks mention of rate limits or error conditions, but covers the critical behavioral variations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two efficiently structured sentences with zero redundancy. Front-loaded with core functionality (ID retrieval), followed by conditional behavior and constraints. Every clause conveys necessary information about output variations or availability limits.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 3-parameter tool with simple flat schema and no output schema, the description is comprehensive. It compensates for the missing output schema by explicitly documenting the three possible return variations (metadata/abstract, HTML full text, or PDF link) and the conditions that trigger each.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 100% schema coverage (baseline 3), the description adds value by embedding the ID example in context, reinforcing the trigger relationship between optional parameters and full-text mode, and—most importantly—providing the temporal constraint (~2024+) that affects parameter utility. This contextual limitation is essential for correct parameter usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action (Retrieve) and resource (arXiv paper by ID), with an example ID format. It implicitly distinguishes from sibling arxiv:search by emphasizing ID-based retrieval versus search queries, and differentiates from unrelated domain siblings (dip, legis, etc.) by explicitly naming the arXiv domain.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear conditional guidance: default returns metadata+abstract, while section/save_path parameters trigger full text fetch. Explains availability constraints (~2024+ papers only for HTML) and fallback behavior for older papers. Does not explicitly name arxiv:search as the alternative for finding IDs, though the distinction is clear from context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/metaneutrons/german-legal-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server