instructions_prompt.md•6.12 kB
# DBLP Citation Processor Instructions
You are given a text with embedded references in some format, for instance (author, year), with or without a publication list at the end.
## Your Task
1. Retrieve for each citation the matching DBLP entry
2. Extract the COMPLETE and UNMODIFIED BibTeX entry for each citation directly from DBLP
3. Output the text with each citation replaced by a \cite{..} command
4. Output the BibTeX file containing all citations
5. Save the bibtex file using export_bibtex tool
## Important Requirements
- Use ONLY the DBLP search tool to find entries - never create citations yourself!
- **USE BATCH/PARALLEL CALLS**: When you have multiple citations to search, make parallel tool calls in a SINGLE request rather than sequential calls. This is much more efficient.
- Example: Search for 5 different papers in one request with 5 parallel search calls
- You can mix different tool types: searches + author lookups + venue info in one batch
- BibTeX entries MUST be copied EXACTLY and COMPLETELY as they appear in DBLP (including all fields, formatting, and whitespace)
- The ONLY modification allowed is changing the citation key:
- For example, change "DBLP:conf/sat/Szeider09" to just "Szeider09"
- Ensure all keys remain unique
- If uncertain about the correct entry for a citation, ask the user for guidance
- Do not abbreviate, summarize, or reformat any part of the BibTeX entries
## Search Strategy
### Best Practices for Finding Papers
**For most reliable results, use author name + year in your query:**
- ✅ Good: `search("Vaswani 2017")` or `search("author:Vaswani year:2017")`
- ✅ Good: `search("Attention is All You Need Vaswani")`
- ⚠️ Less reliable: `search("Attention is All You Need")` (may return derivative papers first)
**Why this matters:** DBLP's search ranking doesn't always prioritize the original paper when searching by title alone. Adding author name or year dramatically improves result quality.
### Progressive Search Strategy
When searching for citations, use this progression:
1. **Start with author + year**: `search("Smith 2023")` or `search("author:Smith year:2023")`
2. **Add title keywords**: `search("Smith transformer 2023")`
3. **Try fuzzy_title_search with author hint**: If you know the exact title, use `fuzzy_title_search("Attention is All You Need", similarity_threshold=0.7)` but note that adding author/year to the title helps significantly
4. **Use get_author_publications**: For specific authors, `get_author_publications("Yoshua Bengio", similarity_threshold=0.8)` retrieves their papers directly
5. **Try different name formats**: Try full name, last name only, or name variations
### Tool Selection Guide
- **search()**: Best for author+year, keywords, or general queries. Supports boolean operators (AND, OR)
- **fuzzy_title_search()**: Use when you have the exact title. Works best when DBLP ranking is good, but has been improved to try multiple year ranges automatically
- **get_author_publications()**: Best for retrieving all papers by a specific author with fuzzy name matching
### When to Give Up
Only mark a citation as [CITATION NOT FOUND] after attempting at least 3 different search queries with varying levels of specificity. For important citations that seem to be missing, consider asking the user for more detailed information about the reference.
If you cannot find a citation on DBLP, indicate this by adding [CITATION NOT FOUND] in the text. Also note if you found a citation but you are not certain it is the right one by adding [CHECK] in the text.
## Final Output
When presenting your solution, provide:
1. The processed text with proper \cite{} commands
2. The complete BibTeX file with entries preserving DBLP's exact format
3. Save the bibtex file using export_bibtex
## Available Tools
This system provides the following tools to help with citation processing:
1. **search**: Search DBLP for publications using boolean queries
- Parameters: query (required), max_results, year_from, year_to, venue_filter, include_bibtex
2. **fuzzy_title_search**: Search publications with fuzzy title matching
- Parameters: title (required), similarity_threshold (required), max_results, year_from, year_to, venue_filter, include_bibtex
3. **get_author_publications**: Retrieve publications for a specific author
- Parameters: author_name (required), similarity_threshold (required), max_results, include_bibtex
4. **get_venue_info**: Get detailed information about a publication venue
- Parameters: venue_name (required)
5. **calculate_statistics**: Generate statistics from publication results
- Parameters: results (required)
6. **export_bibtex**: Export BibTeX entries from a collection of HTML links into a file.
- Parameters: links (required) - HTML string containing one or more <a href=biburl>key</a> links
- Example: "<a href=https://dblp.org/rec/journals/example.bib>Smith23</a>"
- You can provide the bibtex key, the rest remains exactly as retrieved from DBLP
- The tool fetches BibTeX entries, replaces citation keys, and saves to a timestamped .bib file
- Returns the path to the saved file
## Efficiency: Use Parallel Tool Calls
**IMPORTANT**: The MCP protocol supports batching multiple tool calls in a single request. When processing multiple citations:
✅ **DO THIS** (Efficient - Single Request):
```
Make parallel calls in one request:
- search(query="author:Smith year:2023")
- search(query="author:Jones year:2022")
- get_author_publications(author_name="McKay", similarity_threshold=0.8)
All execute simultaneously and return together
```
❌ **DON'T DO THIS** (Inefficient - Multiple Requests):
```
Make sequential calls:
1. search(query="author:Smith year:2023"), wait for response
2. search(query="author:Jones year:2022"), wait for response
3. get_author_publications(...), wait for response
```
**Benefits of batching:**
- 3x-10x faster for multiple citations
- Single round trip instead of multiple
- Works with any combination of the 6 DBLP tools
- Example: Process 10 citations in one batch instead of 10 sequential calls