fuzzy_title_search
Find DBLP publications whose titles approximately match a query. A similarity threshold is required; a year range and a venue filter are optional. Results can include BibTeX entries, and the maximum number of results is configurable.
Instructions
Search DBLP for publications with fuzzy title matching. Arguments:
title (string, required): Full or partial title of the publication (case-insensitive).
similarity_threshold (number, required): A float between 0 and 1 where 1.0 means an exact match.
max_results (number, optional): Maximum number of publications to return. Default is 10.
year_from (number, optional): Lower bound for publication year.
year_to (number, optional): Upper bound for publication year.
venue_filter (string, optional): Case-insensitive substring filter for publication venues.
include_bibtex (boolean, optional): Whether to include BibTeX entries in the results. Default is false.
Returns a list of publication objects sorted by title similarity score.
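To make the argument list concrete, here is a minimal sketch of assembling an arguments object for this tool. The helper `build_arguments` is hypothetical and not part of mcp-dblp; it only mirrors the parameter names and defaults listed above and leaves unset optional filters out of the payload.

```python
from typing import Any, Optional


def build_arguments(
    title: str,
    similarity_threshold: float,
    max_results: int = 10,
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
    venue_filter: Optional[str] = None,
    include_bibtex: bool = False,
) -> dict[str, Any]:
    """Hypothetical helper: assemble arguments for fuzzy_title_search."""
    args: dict[str, Any] = {
        "title": title,
        "similarity_threshold": similarity_threshold,
        "max_results": max_results,
        "include_bibtex": include_bibtex,
    }
    # Optional filters are sent only when the caller actually set them.
    optional = {
        "year_from": year_from,
        "year_to": year_to,
        "venue_filter": venue_filter,
    }
    args.update({key: value for key, value in optional.items() if value is not None})
    return args


example = build_arguments(
    "Attention is All You Need", 0.8, year_from=2017, year_to=2017
)
```

Here `example` carries the two required fields, the two defaulted fields, and the year bounds; `venue_filter` is absent because it was never set.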
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
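| include_bibtex | No | Whether to include BibTeX entries in the results. | false |
| max_results | No | Maximum number of publications to return. | 10 |
| similarity_threshold | Yes | A float between 0 and 1 where 1.0 means an exact match. | |
| title | Yes | Full or partial title of the publication (case-insensitive). | |
| venue_filter | No | Case-insensitive substring filter for publication venues. | |
| year_from | No | Lower bound for publication year. | |
| year_to | No | Upper bound for publication year. | |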
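The tool's JSON schema (listed in the implementation reference) types `similarity_threshold` only as a number and does not constrain its range, so a caller may want to check arguments before sending them. The following is a sketch of such a check; `validate_arguments` and its error messages are hypothetical and are not part of mcp-dblp.

```python
from typing import Any, Optional


def validate_arguments(arguments: dict[str, Any]) -> Optional[str]:
    """Hypothetical client-side check; returns an error message or None."""
    for name in ("title", "similarity_threshold"):
        if name not in arguments:
            return f"Error: Missing required parameter '{name}'"

    title = arguments["title"]
    if not isinstance(title, str) or not title.strip():
        return "Error: 'title' must be a non-empty string"

    threshold = arguments["similarity_threshold"]
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(threshold, bool) or not isinstance(threshold, (int, float)):
        return "Error: 'similarity_threshold' must be a number"
    if not 0.0 <= threshold <= 1.0:
        return "Error: 'similarity_threshold' must be between 0 and 1"

    return None
```

Returning a message instead of raising keeps the sketch close to how the server reports a missing parameter as text, though the exact wording here is an assumption.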
Implementation Reference
- src/mcp_dblp/dblp_client.py:237-327 (handler)
  Core handler function implementing fuzzy title search logic: performs multiple DBLP searches, computes title similarity using difflib.SequenceMatcher, filters by threshold, sorts by similarity score, optionally fetches BibTeX.

      def fuzzy_title_search(
          title: str,
          similarity_threshold: float,
          max_results: int = 10,
          year_from: int | None = None,
          year_to: int | None = None,
          venue_filter: str | None = None,
          include_bibtex: bool = False,
      ) -> list[dict[str, Any]]:
          """
          Search DBLP for publications with fuzzy title matching.

          Uses multiple search strategies to improve recall:
          1. Search with "title:" prefix
          2. Search without prefix (broader matching)
          3. Calculate similarity scores and rank by best match

          Note: DBLP's search ranking may not prioritize the exact paper you're looking for.
          For best results, include author name or year in the title parameter
          (e.g., "Attention is All You Need Vaswani" or use the regular search() function).

          Parameters:
              title (str): Full or partial title of the publication (case-insensitive).
              similarity_threshold (float): A float between 0 and 1 where 1.0 means an exact match.
              max_results (int, optional): Maximum number of publications to return. Default is 10.
              year_from (int, optional): Lower bound for publication year.
              year_to (int, optional): Upper bound for publication year.
              venue_filter (str, optional): Case-insensitive substring filter for publication venues.
              include_bibtex (bool, optional): Whether to include BibTeX entries. Default is False.

          Returns:
              List[Dict[str, Any]]: A list of publication objects sorted by title similarity score.
          """
          logger.info(f"Searching for title: '{title}' with similarity threshold {similarity_threshold}")

          candidates = []
          seen_titles = set()

          # Strategy 1: Search with title prefix
          title_query = f"title:{title}"
          results = search(
              title_query,
              max_results=max_results * 3,
              year_from=year_from,
              year_to=year_to,
              venue_filter=venue_filter,
          )
          for pub in results:
              t = pub.get("title", "")
              if t not in seen_titles:
                  candidates.append(pub)
                  seen_titles.add(t)

          # Strategy 2: Search without prefix
          results = search(
              title,
              max_results=max_results * 2,
              year_from=year_from,
              year_to=year_to,
              venue_filter=venue_filter,
          )
          for pub in results:
              t = pub.get("title", "")
              if t not in seen_titles:
                  candidates.append(pub)
                  seen_titles.add(t)

          # Calculate similarity scores
          filtered = []
          for pub in candidates:
              pub_title = pub.get("title", "")
              ratio = difflib.SequenceMatcher(None, title.lower(), pub_title.lower()).ratio()
              if ratio >= similarity_threshold:
                  pub["similarity"] = ratio
                  filtered.append(pub)

          # Sort by similarity score (highest first)
          filtered = sorted(filtered, key=lambda x: x.get("similarity", 0), reverse=True)
          filtered = filtered[:max_results]

          # Fetch BibTeX entries if requested
          if include_bibtex:
              for pub in filtered:
                  if "dblp_key" in pub and pub["dblp_key"]:
                      bibtex = fetch_bibtex_entry(pub["dblp_key"])
                      if bibtex:
                          pub["bibtex"] = bibtex

          return filtered
- src/mcp_dblp/server.py:127-155 (registration)
  Tool registration in list_tools() with name, description, and input schema definition.

      types.Tool(
          name="fuzzy_title_search",
          description=(
              "Search DBLP for publications with fuzzy title matching.\n"
              "Arguments:\n"
              " - title (string, required): Full or partial title of the publication (case-insensitive).\n"
              " - similarity_threshold (number, required): A float between 0 and 1 where 1.0 means an exact match.\n"
              " - max_results (number, optional): Maximum number of publications to return. Default is 10.\n"
              " - year_from (number, optional): Lower bound for publication year.\n"
              " - year_to (number, optional): Upper bound for publication year.\n"
              " - venue_filter (string, optional): Case-insensitive substring filter for publication venues.\n"
              " - include_bibtex (boolean, optional): Whether to include BibTeX entries in the results. Default is false.\n"
              "Returns a list of publication objects sorted by title similarity score."
          ),
          inputSchema={
              "type": "object",
              "properties": {
                  "title": {"type": "string"},
                  "similarity_threshold": {"type": "number"},
                  "max_results": {"type": "number"},
                  "year_from": {"type": "number"},
                  "year_to": {"type": "number"},
                  "venue_filter": {"type": "string"},
                  "include_bibtex": {"type": "boolean"},
              },
              "required": ["title", "similarity_threshold"],
          },
      ),
- src/mcp_dblp/server.py:141-154 (schema)
  JSON schema defining input parameters and requirements for the fuzzy_title_search tool.

      inputSchema={
          "type": "object",
          "properties": {
              "title": {"type": "string"},
              "similarity_threshold": {"type": "number"},
              "max_results": {"type": "number"},
              "year_from": {"type": "number"},
              "year_to": {"type": "number"},
              "venue_filter": {"type": "string"},
              "include_bibtex": {"type": "boolean"},
          },
          "required": ["title", "similarity_threshold"],
      },
- src/mcp_dblp/server.py:310-341 (handler)
  Server-side dispatch handler for fuzzy_title_search tool calls: validates arguments, invokes the core fuzzy_title_search function, formats and returns results.

      case "fuzzy_title_search":
          if "title" not in arguments or "similarity_threshold" not in arguments:
              return [
                  types.TextContent(
                      type="text",
                      text="Error: Missing required parameter 'title' or 'similarity_threshold'",
                  )
              ]
          include_bibtex = arguments.get("include_bibtex", False)
          result = fuzzy_title_search(
              title=arguments.get("title"),
              similarity_threshold=arguments.get("similarity_threshold"),
              max_results=arguments.get("max_results", 10),
              year_from=arguments.get("year_from"),
              year_to=arguments.get("year_to"),
              venue_filter=arguments.get("venue_filter"),
              include_bibtex=include_bibtex,
          )
          if include_bibtex:
              return [
                  types.TextContent(
                      type="text",
                      text=f"Found {len(result)} publications with similar titles:\n\n{format_results_with_similarity_and_bibtex(result)}",
                  )
              ]
          else:
              return [
                  types.TextContent(
                      type="text",
                      text=f"Found {len(result)} publications with similar titles:\n\n{format_results_with_similarity(result)}",
                  )
              ]
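
The scoring step of the core handler can be tried in isolation, without contacting DBLP. The sketch below applies the same `difflib.SequenceMatcher` ratio on lowercased strings, then filters by threshold, sorts, and truncates. The names `title_similarity` and `rank_titles` and the sample titles are illustrative assumptions, not part of mcp-dblp.

```python
import difflib


def title_similarity(query: str, candidate: str) -> float:
    """Case-insensitive similarity ratio in [0, 1], as in the core handler."""
    return difflib.SequenceMatcher(None, query.lower(), candidate.lower()).ratio()


def rank_titles(
    query: str,
    titles: list[str],
    similarity_threshold: float,
    max_results: int = 10,
) -> list[tuple[float, str]]:
    """Hypothetical helper: keep titles at or above the threshold, best first."""
    scored = [(title_similarity(query, title), title) for title in titles]
    kept = [pair for pair in scored if pair[0] >= similarity_threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:max_results]


# Made-up candidate list standing in for DBLP search results.
candidates = [
    "Deep Residual Learning for Image Recognition.",
    "Attention is All you Need.",
]
ranked = rank_titles("Attention is all you need", candidates, similarity_threshold=0.8)
```

With these inputs only the second candidate survives the 0.8 threshold. Its score falls just short of 1.0 because the candidate has a trailing period that the query lacks, which is one reason a threshold of exactly 1.0 can miss a title that looks identical.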