get_article_url
Retrieve the arXiv.org URL for a scholarly article by searching with its title. Use this tool to obtain direct links to academic papers for access or citation.
Instructions
Retrieve the URL of an article hosted on arXiv.org based on its title. Use this tool only for retrieving the URL. This tool searches for the article based on its title, and then fetches the corresponding URL from arXiv.org.
Args: title: Article title.
Returns: URL that can be used to retrieve the article.
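Because the tool returns a plain string on both success and failure, a caller has to tell a URL from an error message itself. The sketch below shows one way to do that. It assumes successful results always start with `https://arxiv.org/pdf/`, which is the prefix built by the server's helper; the name `is_article_url` is hypothetical and not part of the server.

```python
# Prefix assumed from the server's URL-building helper.
ARXIV_PDF_PREFIX = "https://arxiv.org/pdf/"


def is_article_url(result: str) -> bool:
    """Return True if a tool result looks like an arXiv PDF URL, not an error message."""
    # Require the prefix plus at least one character of identifier after it.
    return result.startswith(ARXIV_PDF_PREFIX) and len(result) > len(ARXIV_PDF_PREFIX)


# The identifier below is a made-up placeholder, not a real paper.
print(is_article_url("https://arxiv.org/pdf/0000.00001v1"))
print(is_article_url("Unable to retrieve data from arXiv.org."))
```

A prefix check is deliberately simple; a stricter caller might parse the identifier as well.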
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| title | Yes | Article title. | |
Implementation Reference
- src/arxiv_server/server.py:108-125 (handler) The handler function for the 'get_article_url' tool, registered via the @mcp.tool() decorator. It calls get_url_and_arxiv_id to fetch the direct PDF URL for the given article title and returns either that URL or an error message.

      @mcp.tool()
      async def get_article_url(title: str) -> str:
          """
          Retrieve the URL of an article hosted on arXiv.org based on its title. Use this tool only
          for retrieving the URL. This tool searches for the article based on its title, and then fetches
          the corresponding URL from arXiv.org.

          Args:
              title: Article title.

          Returns:
              URL that can be used to retrieve the article.
          """
          result = await get_url_and_arxiv_id(title)
          if isinstance(result, str):
              return result
          article_url, _ = result
          return article_url

- src/arxiv_server/server.py:85-92 (helper) Helper function that fetches the article information for a title, then constructs the direct PDF URL and the arXiv ID.

      async def get_url_and_arxiv_id(title: str) -> tuple[str, str] | str:
          """Get URL of the article hosted on arXiv.org."""
          info = await fetch_information(title)
          if isinstance(info, str):
              return info
          arxiv_id = info.id.split("/abs/")[-1]
          direct_pdf_url = f"https://arxiv.org/pdf/{arxiv_id}"
          return (direct_pdf_url, arxiv_id)

- src/arxiv_server/server.py:60-83 (helper) Helper function that queries the arXiv API by title, parses the returned feed, and selects the best-matching entry using fuzzy matching.

      async def fetch_information(title: str):
          """Get information about the article."""
          formatted_title = format_text(title)
          url = f"{ARXIV_API_BASE}/query"
          params = {
              "search_query": f'ti:{formatted_title}',
              "start": 0,
              "max_results": 25
          }
          data = await make_api_call(url, params=params)
          if data is None:
              return "Unable to retrieve data from arXiv.org."
          feed = feedparser.parse(data)
          error_msg = (
              "Unable to extract information for the provided title. "
              "This issue may stem from an incorrect or incomplete title, "
              "or because the work has not been published on arXiv."
          )
          if not feed.entries:
              return error_msg
          best_match = find_best_match(target_title=formatted_title, entries=feed.entries)
          if best_match is None:
              return str(error_msg)
          return best_match

- src/arxiv_server/server.py:46-58 (helper) Helper utility that finds the best-matching arXiv entry by title similarity using difflib. The excerpt starts at the docstring; the def line, which takes the target_title, entries, and threshold parameters, is not shown.

          """Find the entry whose title best matches the target title."""
          target_title_lower = target_title.lower()
          best_entry = None
          best_score = 0.0
          for entry in entries:
              entry_title_lower = entry.title.lower()
              score = difflib.SequenceMatcher(None, target_title_lower, entry_title_lower).ratio()
              if score > best_score:
                  best_score = score
                  best_entry = entry
          if best_score >= threshold:
              return best_entry
          return None

- src/arxiv_server/server.py:94-106 (helper) Helper function that cleans and formats the search title text for the arXiv API query.

      def format_text(text: str) -> str:
          """Clean a given text string by removing escape sequences and leading and trailing whitespaces."""
          # Remove common escape sequences
          text_without_escapes = re.sub(r'\\[ntr]', ' ', text)
          # Replace colon with space
          text_without_colon = text_without_escapes.replace(':', ' ')
          # Remove both single quotes and double quotes
          text_without_quotes = re.sub(r'[\'"]', '', text_without_colon)
          # Collapse multiple spaces into one
          text_single_spaced = re.sub(r'\s+', ' ', text_without_quotes)
          # Trim leading and trailing spaces
          cleaned_text = text_single_spaced.strip()
          return cleaned_text

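
The helpers listed above compose into a pipeline: clean the title, query arXiv with a `ti:` search, pick the closest title, and build the PDF URL. The following is a minimal offline sketch of the cleaning, matching, and URL-building steps, not the server's code. The `Entry` class is a hypothetical stand-in for a feedparser entry, the sample entries and identifiers are made up for illustration, the `pdf_url` helper name is invented here, and the `threshold=0.8` default is an assumption because the excerpt does not show the real value.

```python
import difflib
import re
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for a feedparser entry (only the fields used here)."""
    id: str
    title: str


def format_text(text: str) -> str:
    """Apply the same cleaning steps as the server's format_text helper."""
    text = re.sub(r'\\[ntr]', ' ', text)   # literal \n, \t, \r sequences become spaces
    text = text.replace(':', ' ')          # colons become spaces
    text = re.sub(r'[\'"]', '', text)      # drop single and double quotes
    return re.sub(r'\s+', ' ', text).strip()


def find_best_match(target_title, entries, threshold=0.8):
    """Return the entry with the highest title similarity, or None below threshold.

    The 0.8 default is an assumed value for illustration only.
    """
    best_entry, best_score = None, 0.0
    for entry in entries:
        score = difflib.SequenceMatcher(
            None, target_title.lower(), entry.title.lower()
        ).ratio()
        if score > best_score:
            best_entry, best_score = entry, score
    return best_entry if best_score >= threshold else None


def pdf_url(entry) -> str:
    """Build the direct PDF URL from an entry id of the form .../abs/<arxiv_id>."""
    arxiv_id = entry.id.split("/abs/")[-1]
    return f"https://arxiv.org/pdf/{arxiv_id}"


# Made-up entries; the identifiers are placeholders, not real papers.
entries = [
    Entry(id="http://arxiv.org/abs/0000.00001v1", title="A Survey of Graph Methods"),
    Entry(id="http://arxiv.org/abs/0000.00002v2", title="Graph Methods: A Practical Guide"),
]

query = format_text('Graph Methods:\\n "A Practical Guide"')
match = find_best_match(query, entries)
print(query)
print(pdf_url(match) if match else "no match above threshold")
```

With these sample entries the cleaned query is `Graph Methods A Practical Guide`, and the second entry is selected because its title differs from the query only by the removed colon. A title that resembles neither entry returns `None`, which corresponds to the error-message path in `fetch_information`.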