get_papers_by_date
Retrieve HuggingFace research papers published on a specific date by providing a YYYY-MM-DD formatted date parameter.
Instructions
Get HuggingFace daily papers for a specific date
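As a rough illustration of how a client might invoke this tool, the sketch below builds a request assuming the standard MCP JSON-RPC `tools/call` message shape. The helper name `build_tool_call`, the request id, and the example date are illustrative assumptions, not part of this server; in practice an MCP client SDK would normally construct this message for you.

```python
import json


def build_tool_call(date: str, request_id: int = 1) -> dict:
    """Build an MCP `tools/call` request for get_papers_by_date (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_papers_by_date",
            # The only argument the tool declares is the required `date` string.
            "arguments": {"date": date},
        },
    }


# Print the request as it would be serialized on the wire.
print(json.dumps(build_tool_call("2024-01-15"), indent=2))
```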
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| date | Yes | Date in YYYY-MM-DD format | |
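The registered schema constrains `date` with the pattern `^\d{4}-\d{2}-\d{2}$`, which checks only the shape of the string, so a value such as `2024-02-30` would still pass. A caller that wants to reject impossible dates before sending a request could add a calendar check, roughly as sketched here. The function name `parse_paper_date` is a hypothetical helper and not part of the server.

```python
import re
from datetime import date, datetime

# Same pattern as the tool's input schema: checks shape only.
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def parse_paper_date(value: str) -> date:
    """Return a date for a real YYYY-MM-DD calendar date, else raise ValueError."""
    if not DATE_PATTERN.fullmatch(value):
        raise ValueError(f"Expected YYYY-MM-DD, got {value!r}")
    # strptime rejects shape-valid but impossible dates such as 2024-02-30.
    return datetime.strptime(value, "%Y-%m-%d").date()
```

Calling `parse_paper_date("2024-01-15")` returns a `datetime.date`, while inputs like `"2024-1-5"` or `"2024-02-30"` raise `ValueError`.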
Implementation Reference
- scraper.py:19-39 (handler) Core implementation of the get_papers_by_date tool. Fetches papers from HuggingFace for the given date, parses the HTML, and optionally fetches additional details such as abstracts and authors from individual paper pages and ArXiv.

      def get_papers_by_date(self, date: str, fetch_details: bool = True) -> List[Dict]:
          url = f"{self.base_url}/{date}"
          try:
              response = self.session.get(url)
              response.raise_for_status()
              papers = self._parse_papers(response.text)

              if fetch_details and papers:
                  # Fetch details for every paper, including individual author names
                  for i, paper in enumerate(papers):
                      if paper.get('url'):
                          details = self._fetch_paper_details(paper['url'])
                          if details:
                              paper.update(details)
                          time.sleep(1)  # Avoid sending requests too quickly

              return papers

          except requests.RequestException as e:
              logging.error(f"Failed to fetch papers for {date}: {e}")
              return []

- main.py:60-74 (registration) MCP tool registration defining the tool name, description, and input schema (date parameter with YYYY-MM-DD pattern).

      types.Tool(
          name="get_papers_by_date",
          description="Get HuggingFace daily papers for a specific date",
          inputSchema={
              "type": "object",
              "properties": {
                  "date": {
                      "type": "string",
                      "description": "Date in YYYY-MM-DD format",
                      "pattern": r"^\d{4}-\d{2}-\d{2}$"
                  }
              },
              "required": ["date"]
          },
      ),

- main.py:63-73 (schema) Input schema definition for the get_papers_by_date tool, specifying the required 'date' parameter.

      inputSchema={
          "type": "object",
          "properties": {
              "date": {
                  "type": "string",
                  "description": "Date in YYYY-MM-DD format",
                  "pattern": r"^\d{4}-\d{2}-\d{2}$"
              }
          },
          "required": ["date"]
      },

- main.py:100-132 (handler) MCP server @server.call_tool() dispatch logic for get_papers_by_date: validates the input, calls scraper.get_papers_by_date, handles empty results, and returns the papers as formatted text content.

      if name == "get_papers_by_date":
          if not arguments or "date" not in arguments:
              raise ValueError("Date is required")

          date = arguments["date"]
          papers = scraper.get_papers_by_date(date)

          if not papers:
              return [
                  types.TextContent(
                      type="text",
                      text=f"No papers found for {date}. Please check if the date is correct and has published papers."
                  )
              ]

          return [
              types.TextContent(
                  type="text",
                  text=f"Found {len(papers)} papers for {date}:\n\n" +
                  "\n".join([
                      f"Title: {paper['title']}\n"
                      f"Authors: {', '.join(paper['authors'])}\n"
                      f"Abstract: {paper['abstract']}\n"
                      f"URL: {paper['url']}\n"
                      f"PDF: {paper['pdf_url']}\n"
                      f"Votes: {paper['votes']}\n"
                      f"Submitted by: {paper['submitted_by']}\n" +
                      "-" * 50
                      for paper in papers
                  ])
              )
          ]
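
The flow described in the references above (enrich each paper with details, pause between lookups, then render text) can be sketched without network access. This is a simplified illustration under stated assumptions, not the server's actual code: `enrich_papers` and `format_papers` are hypothetical names, the detail fetcher is injected as a stand-in for `_fetch_paper_details`, only a subset of the output fields is rendered, and the `.get()` defaults assume that some fields may be missing when a detail lookup returns nothing.

```python
import time
from typing import Callable, Dict, List, Optional


def enrich_papers(
    papers: List[Dict],
    fetch_details: Callable[[str], Optional[Dict]],  # hypothetical stand-in for _fetch_paper_details
    delay: float = 1.0,
) -> List[Dict]:
    """Merge per-paper details into each entry, pausing between lookups."""
    for paper in papers:
        url = paper.get("url")
        if not url:
            continue  # nothing to look up for this entry
        details = fetch_details(url)
        if details:
            paper.update(details)
        time.sleep(delay)  # throttle requests to the upstream site
    return papers


def format_papers(papers: List[Dict], date: str) -> str:
    """Render papers as plain text, tolerating missing fields."""
    if not papers:
        return f"No papers found for {date}."
    blocks = []
    for paper in papers:
        blocks.append(
            f"Title: {paper.get('title', 'N/A')}\n"
            f"Authors: {', '.join(paper.get('authors', [])) or 'N/A'}\n"
            f"Abstract: {paper.get('abstract', 'N/A')}\n"
            f"URL: {paper.get('url', 'N/A')}\n"
            + "-" * 50
        )
    return f"Found {len(papers)} papers for {date}:\n\n" + "\n".join(blocks)


if __name__ == "__main__":
    # Placeholder data only; no real papers or URLs are implied.
    sample = [
        {"title": "Example paper", "url": "https://example.invalid/p/1"},
        {"title": "Entry without a link"},
    ]
    fake_fetch = lambda url: {"authors": ["A. Author"], "abstract": "Placeholder abstract."}
    print(format_papers(enrich_papers(sample, fake_fetch, delay=0), "2024-01-15"))
```

Injecting the fetcher and the delay keeps the sketch testable offline; the second sample entry shows how a paper with no detail data would still render instead of raising a `KeyError`.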