get_article_url
Retrieve direct PDF URLs for arXiv articles using article titles or arXiv IDs to access scholarly papers.
Instructions
Retrieve the direct PDF URL of an article on arXiv.org by title or arXiv ID.
Args:
- title: Article title.
- arxiv_id: arXiv ID (e.g., 1706.03762 or arXiv:1706.03762v7).
Returns: URL that can be used to retrieve the article, or structured error JSON.
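Because the tool returns either a URL or structured error JSON in the same string, a caller has to tell the two apart. The following is a minimal sketch of one way to do that. The helper name `split_tool_result` is hypothetical, and the error payload used in the usage lines is a made-up example, since the exact fields of the error JSON are not documented here. The `https://arxiv.org/pdf/` prefix is the one built by the server's `resolve_article` helper.

```python
import json
from typing import Any, Optional, Tuple


def split_tool_result(result: str) -> Tuple[Optional[str], Optional[Any]]:
    """Return (url, None) for a PDF URL, or (None, error_payload) otherwise.

    Hypothetical client-side helper, not part of the server.
    """
    text = result.strip()
    if text.startswith("https://arxiv.org/pdf/"):
        return text, None
    try:
        # Anything that is not a PDF URL is assumed to be the error JSON.
        return None, json.loads(text)
    except json.JSONDecodeError:
        # Not JSON either: hand back the raw text so the caller can log it.
        return None, text


url, error = split_tool_result("https://arxiv.org/pdf/1706.03762")
# The payload below is illustrative only; the real field names may differ.
no_url, payload = split_tool_result('{"code": "INVALID_ID"}')
```

Checking the URL prefix first keeps the common success path free of JSON parsing.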
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| title | No | Article title. | None |
| arxiv_id | No | arXiv ID (e.g., 1706.03762 or arXiv:1706.03762v7). | None |
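Neither parameter is marked as required, but the `resolve_article` helper listed under Implementation Reference returns a MISSING_PARAM error when both are absent and prefers `arxiv_id` when both are given. A client can mirror that rule before making the call. The sketch below assumes a hypothetical `build_arguments` helper on the client side; it is not part of the server.

```python
from typing import Dict, Optional


def build_arguments(title: Optional[str] = None, arxiv_id: Optional[str] = None) -> Dict[str, str]:
    """Build the arguments object for a get_article_url call (hypothetical helper)."""
    if not title and not arxiv_id:
        # Same condition under which the server reports MISSING_PARAM.
        raise ValueError("Provide either 'arxiv_id' or 'title'.")
    if arxiv_id:
        # The server prefers arxiv_id over title, so title is dropped here.
        return {"arxiv_id": arxiv_id.strip()}
    return {"title": title}


by_id = build_arguments(arxiv_id=" 1706.03762 ")
by_title = build_arguments(title="Attention Is All You Need")
```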
Implementation Reference
- src/arxiv_server/server.py:157-173 (handler): The primary handler function for the 'get_article_url' tool, registered via the @mcp.tool() decorator. It delegates to resolve_article and returns the direct PDF URL or an error.

  ```python
  @mcp.tool()
  async def get_article_url(title: Optional[str] = None, arxiv_id: Optional[str] = None) -> str:
      """
      Retrieve the direct PDF URL of an article on arXiv.org by title or arXiv ID.

      Args:
          title: Article title.
          arxiv_id: arXiv ID (e.g., 1706.03762 or arXiv:1706.03762v7).

      Returns:
          URL that can be used to retrieve the article, or structured error JSON.
      """
      result = await resolve_article(title=title, arxiv_id=arxiv_id)
      if isinstance(result, str):
          return result
      article_url, _ = result
      return article_url
  ```
- src/arxiv_server/server.py:123-142 (helper): Key helper function that resolves the article identifier (title or arXiv ID) to a direct PDF URL and ID, handling validation, API calls for title search, and error conditions.

  ```python
  async def resolve_article(title: Optional[str] = None, arxiv_id: Optional[str] = None) -> Tuple[str, str] | str:
      """
      Resolve to a direct PDF URL and arXiv ID using either a title or an arXiv ID.
      Preference order: arxiv_id > title.
      """
      if arxiv_id:
          m = ARXIV_ID_RE.match(arxiv_id.strip())
          if not m:
              return _error("INVALID_ID", f"Not a valid arXiv ID: {arxiv_id}")
          vid = m.group("id")
          return (f"https://arxiv.org/pdf/{vid}", vid)
      if not title:
          return _error("MISSING_PARAM", "Provide either 'arxiv_id' or 'title'.")
      info = await fetch_information(title)
      if isinstance(info, str):
          return _error("NOT_FOUND", str(info))
      resolved_id = info.id.split("/abs/")[-1]
      direct_pdf_url = f"https://arxiv.org/pdf/{resolved_id}"
      return (direct_pdf_url, resolved_id)
  ```
- src/arxiv_server/server.py:98-122 (helper): Helper that fetches arXiv article information by title using the arXiv API, parses the feed, and selects the best title match.

  ```python
  async def fetch_information(title: str):
      """Get information about the article."""
      formatted_title = format_text(title)
      url = f"{ARXIV_API_BASE}/query"
      params = {
          "search_query": f"ti:{formatted_title}",
          "start": 0,
          "max_results": 25,
      }
      data = await make_api_call(url, params=params)
      if data is None:
          return "Unable to retrieve data from arXiv.org."
      feed = feedparser.parse(data)
      error_msg = (
          "Unable to extract information for the provided title. "
          "This issue may stem from an incorrect or incomplete title, "
          "or because the work has not been published on arXiv."
      )
      if not feed.entries:
          return error_msg
      best_match = find_best_match(target_title=formatted_title, entries=feed.entries)
      if best_match is None:
          return str(error_msg)
      return best_match
  ```
- src/arxiv_server/server.py:83-96 (helper): Utility helper that finds the best matching arXiv entry by title similarity using difflib.

  ```python
  def find_best_match(target_title: str, entries: list, threshold: float = 0.8):
      """Find the entry whose title best matches the target title."""
      target_title_lower = target_title.lower()
      best_entry = None
      best_score = 0.0
      for entry in entries:
          entry_title_lower = entry.title.lower()
          score = difflib.SequenceMatcher(None, target_title_lower, entry_title_lower).ratio()
          if score > best_score:
              best_score = score
              best_entry = entry
      if best_score >= threshold:
          return best_entry
      return None
  ```
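The two resolution paths in the helpers above (ID validation and title matching) can be tried offline with a rough sketch like the one below. Several pieces are assumptions, not the server's code: `ARXIV_ID_SKETCH` is a hypothetical stand-in for `ARXIV_ID_RE`, whose actual pattern is not shown (this one only covers new-style IDs and keeps any version suffix), and the feed entries are hand-built `SimpleNamespace` objects rather than parsed arXiv API results. The matching function repeats the logic of `find_best_match` so the block runs on its own.

```python
import difflib
import re
from types import SimpleNamespace

# Hypothetical stand-in for ARXIV_ID_RE (the real pattern is not shown).
# Accepts new-style IDs with an optional "arXiv:" prefix and "vN" suffix.
ARXIV_ID_SKETCH = re.compile(
    r"^(?:arXiv:)?(?P<id>\d{4}\.\d{4,5}(?:v\d+)?)$", re.IGNORECASE
)


def pdf_url_from_id(raw_id: str):
    """Return the direct PDF URL for a valid ID, or None if it does not match."""
    m = ARXIV_ID_SKETCH.match(raw_id.strip())
    if m is None:
        return None
    return f"https://arxiv.org/pdf/{m.group('id')}"


def best_title_match(target_title: str, entries: list, threshold: float = 0.8):
    """Same selection rule as find_best_match: highest ratio, if at least threshold."""
    target = target_title.lower()
    best_entry, best_score = None, 0.0
    for entry in entries:
        score = difflib.SequenceMatcher(None, target, entry.title.lower()).ratio()
        if score > best_score:
            best_entry, best_score = entry, score
    return best_entry if best_score >= threshold else None


# Hand-built entries shaped like feed entries (a title plus an /abs/ style id).
# The second entry and its id are made up for illustration.
entries = [
    SimpleNamespace(title="Attention Is All You Need",
                    id="http://arxiv.org/abs/1706.03762v7"),
    SimpleNamespace(title="Attention Is Not All You Need",
                    id="http://arxiv.org/abs/0000.00000v1"),
]

hit = best_title_match("attention is all you need", entries)
miss = best_title_match("Deep Residual Learning for Image Recognition", entries)

# Title path: take the part after "/abs/" and build the PDF URL from it.
title_url = f"https://arxiv.org/pdf/{hit.id.split('/abs/')[-1]}"
# ID path: validate the ID, then build the URL directly.
id_url = pdf_url_from_id("arXiv:1706.03762v7")
```

With the default threshold of 0.8, a title that is only loosely related to every returned entry yields no match, which the server reports as NOT_FOUND. Lowering the threshold tolerates typos and partial titles at the cost of more false matches.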