calculate_statistics
Compute publication statistics from a list of results: the total number of publications, the range of publication years, and authors and venues ranked by publication count.
Instructions
Calculate statistics from a list of publication results.
Arguments:
- results (array, required): An array of publication objects, each with at least 'title', 'authors', 'venue', and 'year'.
Returns a dictionary with:
- total_publications: Total count.
- time_range: Dictionary with 'min' and 'max' publication years.
- top_authors: List of tuples (author, count) sorted by count.
- top_venues: List of tuples (venue, count) sorted by count (empty venue is treated as '(empty)').
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| results | Yes | Array of publication objects, each with at least 'title', 'authors', 'venue', and 'year'. | |
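As a rough illustration of what a conforming `arguments` object might look like, the sketch below builds one in Python and serializes it to JSON. The publication entries are invented sample data, not real DBLP records, and the variable names `arguments` and `payload` are only for this example.

```python
import json

# Hypothetical sample data; titles, authors, and venues are made up for illustration.
arguments = {
    "results": [
        {"title": "Paper A", "authors": ["Alice", "Bob"], "venue": "VLDB", "year": 2020},
        {"title": "Paper B", "authors": ["Alice"], "venue": "", "year": "2022"},
    ]
}

# The input schema only requires that "results" is present and is an array;
# it does not describe the shape of the individual items.
assert "results" in arguments
assert isinstance(arguments["results"], list)

# Serialize to JSON, as the arguments would typically travel in a tool call.
payload = json.dumps(arguments, indent=2)
print(payload)
```

Because the schema leaves the item shape open, fields such as 'year' may arrive as either numbers or strings, which is why the second sample entry uses a string year.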
Implementation Reference
- src/mcp_dblp/dblp_client.py:498-533 (handler): The core handler function that processes a list of publication results to compute statistics including total count, publication time range, top authors by count, and top venues by count.

  ```python
  def calculate_statistics(results: list[dict[str, Any]]) -> dict[str, Any]:
      """
      Calculate statistics from publication results.
      (Documentation omitted for brevity)
      """
      logger.info(f"Calculating statistics for {len(results)} results")
      authors = Counter()
      venues = Counter()
      years = []

      for result in results:
          for author in result.get("authors", []):
              authors[author] += 1

          venue = result.get("venue", "")
          # Handle venue as list or string
          if isinstance(venue, list):
              venue = ", ".join(venue) if venue else ""
          if venue:
              venues[venue] += 1
          else:
              venues["(empty)"] += 1

          year = result.get("year")
          if year:
              with contextlib.suppress(ValueError, TypeError):
                  years.append(int(year))

      stats = {
          "total_publications": len(results),
          "time_range": {"min": min(years) if years else None, "max": max(years) if years else None},
          "top_authors": sorted(authors.items(), key=lambda x: x[1], reverse=True),
          "top_venues": sorted(venues.items(), key=lambda x: x[1], reverse=True),
      }

      return stats
  ```
- src/mcp_dblp/server.py:196-213 (registration): Tool registration in the MCP server's list_tools() method, defining the tool name, description, and input schema for calculate_statistics.

  ```python
  types.Tool(
      name="calculate_statistics",
      description=(
          "Calculate statistics from a list of publication results.\n"
          "Arguments:\n"
          "  - results (array, required): An array of publication objects, each with at least 'title', 'authors', 'venue', and 'year'.\n"
          "Returns a dictionary with:\n"
          "  - total_publications: Total count.\n"
          "  - time_range: Dictionary with 'min' and 'max' publication years.\n"
          "  - top_authors: List of tuples (author, count) sorted by count.\n"
          "  - top_venues: List of tuples (venue, count) sorted by count (empty venue is treated as '(empty)')."
      ),
      inputSchema={
          "type": "object",
          "properties": {"results": {"type": "array"}},
          "required": ["results"],
      },
  ),
  ```
- src/mcp_dblp/server.py:208-212 (schema): Input schema definition specifying that the tool requires a 'results' array parameter.

  ```python
  inputSchema={
      "type": "object",
      "properties": {"results": {"type": "array"}},
      "required": ["results"],
  },
  ```
- src/mcp_dblp/server.py:388-400 (registration): Dispatch handler in the server's call_tool method that invokes the calculate_statistics function with provided arguments and formats the response.

  ```python
  case "calculate_statistics":
      if "results" not in arguments:
          return [
              types.TextContent(
                  type="text", text="Error: Missing required parameter 'results'"
              )
          ]
      result = calculate_statistics(results=arguments.get("results"))
      return [
          types.TextContent(
              type="text", text=f"Statistics calculated:\n\n{format_dict(result)}"
          )
      ]
  ```
- src/mcp_dblp/server.py:24-24 (registration): Import of the calculate_statistics function from dblp_client module.

  ```python
  calculate_statistics,
  ```
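To make the handler's behavior easier to try out in isolation, here is a minimal standalone sketch that follows the same logic as the handler listed above. It is an approximation, not the project's code: the name `calculate_statistics_sketch` is hypothetical, logging is omitted, the imports are assumed from the calls the handler uses, and the sample records are invented.

```python
import contextlib
from collections import Counter
from typing import Any


def calculate_statistics_sketch(results: list[dict[str, Any]]) -> dict[str, Any]:
    """Standalone approximation of the handler (hypothetical name, no logging)."""
    authors: Counter = Counter()
    venues: Counter = Counter()
    years: list[int] = []

    for result in results:
        # Count one publication per listed author.
        for author in result.get("authors", []):
            authors[author] += 1

        # A venue may be a string or a list of strings; lists are joined.
        venue = result.get("venue", "")
        if isinstance(venue, list):
            venue = ", ".join(venue) if venue else ""
        venues[venue if venue else "(empty)"] += 1

        # Years that cannot be converted to int are skipped silently.
        year = result.get("year")
        if year:
            with contextlib.suppress(ValueError, TypeError):
                years.append(int(year))

    return {
        "total_publications": len(results),
        "time_range": {
            "min": min(years) if years else None,
            "max": max(years) if years else None,
        },
        "top_authors": sorted(authors.items(), key=lambda x: x[1], reverse=True),
        "top_venues": sorted(venues.items(), key=lambda x: x[1], reverse=True),
    }


# Invented sample records covering a list venue, an empty venue, and a bad year.
sample = [
    {"title": "Paper A", "authors": ["Alice", "Bob"], "venue": "VLDB", "year": 2020},
    {"title": "Paper B", "authors": ["Alice"], "venue": ["SIGMOD", "Demo"], "year": "2022"},
    {"title": "Paper C", "authors": ["Carol"], "venue": "", "year": "n/a"},
]
stats = calculate_statistics_sketch(sample)
print(stats["total_publications"])  # 3
print(stats["time_range"])          # {'min': 2020, 'max': 2022}
print(stats["top_authors"][0])      # ('Alice', 2)
```

Two details are worth noting. The unparseable year "n/a" is dropped from the time range but the record still counts toward `total_publications`. And despite the "top" in their names, `top_authors` and `top_venues` contain every author and venue, sorted by count in descending order, so a caller wanting only the first few entries can slice the list, for example `stats["top_authors"][:5]`.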