Glama

find_duplicates

Identify duplicate items in your Zotero library by title or DOI, or scan the entire library or a single collection, to keep your research materials organized.

Instructions

Find duplicate items by title or DOI, or scan entire library

Input Schema

Name            Required  Description  Default
title           No
doi             No
collection_key  No
scan_all        No
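
As a rough illustration of how these four optional parameters combine, the sketch below mirrors the dispatch order visible in the implementation reference on this page: `scan_all` takes precedence, and a call with neither `title` nor `doi` returns nothing. The `FindDuplicatesArgs` type and the `describe_mode` helper are hypothetical names introduced only for this example, and the DOI and collection key values are placeholders.

```python
from typing import TypedDict


class FindDuplicatesArgs(TypedDict, total=False):
    """Optional arguments accepted by find_duplicates (names from the schema)."""
    title: str
    doi: str
    collection_key: str
    scan_all: bool


def describe_mode(args: FindDuplicatesArgs) -> str:
    """Hypothetical helper: report which lookup the arguments would trigger."""
    if args.get("scan_all", False):
        # scan_all takes precedence; collection_key only narrows the scan.
        return "scan collection" if args.get("collection_key") else "scan library"
    modes = [name for name in ("doi", "title") if args.get(name)]
    # With neither title nor doi, the implementation returns an empty list.
    return " + ".join(modes) if modes else "no-op"


# Placeholder values, not real library data.
by_doi: FindDuplicatesArgs = {"doi": "10.1000/example"}
by_title: FindDuplicatesArgs = {"title": "An Example Title"}
full_scan: FindDuplicatesArgs = {"scan_all": True}
one_collection: FindDuplicatesArgs = {"scan_all": True, "collection_key": "ABCD1234"}
```

Note that, in the implementation shown on this page, `collection_key` is only consulted when `scan_all` is true; passing it alone has no effect.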

Implementation Reference

  • The helper that scans the whole library, or a single collection, and groups items by normalized title.
    def _scan_duplicates(self, collection_key: str = "") -> list[list[dict]]:
        """Scan library or collection for all duplicate groups."""
        if collection_key:
            items = self.zot.everything(
                self.zot.collection_items(collection_key, itemType="-attachment || note")
            )
        else:
            items = self.zot.everything(
                self.zot.items(itemType="-attachment || note")
            )
    
        by_title: dict[str, list] = defaultdict(list)
        for item in items:
            title = item["data"].get("title", "")
            if title:
                norm = self._normalize_title(title)
                by_title[norm].append(item)
    
        return [
            [self._format_item_summary(i) for i in group]
            for group in by_title.values()
            if len(group) > 1
        ]
  • The main method in ZoteroClient that handles the search for duplicates by title, DOI, or scanning all.
    def find_duplicates(
        self,
        title: str = "",
        doi: str = "",
        collection_key: str = "",
        scan_all: bool = False,
    ) -> list[list[dict]]:
        """Find duplicate items by title/DOI or scan entire library."""
        if scan_all:
            return self._scan_duplicates(collection_key)
    
        if not title and not doi:
            return []
    
        results = []
        if doi:
            items = self.zot.items(q=doi, limit=50)
            # DOIs are case-insensitive, so compare lowercased values.
            items = [i for i in items if i["data"].get("DOI", "").strip().lower() == doi.strip().lower()]
            if len(items) > 1:
                results.append([self._format_item_summary(i) for i in items])
        if title:
            items = self.zot.items(q=title, limit=50)
            norm = self._normalize_title(title)
            matches = [
                i for i in items
                if self._normalize_title(i["data"].get("title", "")) == norm
            ]
            if len(matches) > 1:
                group = [self._format_item_summary(i) for i in matches]
                if not results or {i["key"] for i in group} != {i["key"] for i in results[0]}:
                    results.append(group)
        return results
  • Tool registration and handler wrapper for `find_duplicates`.
    @mcp.tool(description="Find duplicate items by title or DOI, or scan entire library")
    def find_duplicates(
        title: str = "",
        doi: str = "",
        collection_key: str = "",
        scan_all: bool = False,
    ) -> str:
        """Find potential duplicates. Use scan_all=True to scan the whole library."""
        results = _get_client().find_duplicates(title, doi, collection_key, scan_all)
        return json.dumps(results, ensure_ascii=False)
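
The grouping step used by the scan can be tried without a Zotero connection. The following is a minimal sketch under stated assumptions: the page does not show `_normalize_title`, so `normalize_title` here is a stand-in that lowercases, drops punctuation, and collapses whitespace, which may differ from the real helper. The sample items are made up and only imitate the `{"key": ..., "data": {...}}` shape the excerpts above rely on.

```python
import re
from collections import defaultdict


def normalize_title(title: str) -> str:
    """Assumed normalization: lowercase, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", title.lower())
    return " ".join(cleaned.split())


def group_by_title(items: list[dict]) -> list[list[dict]]:
    """Return groups of two or more items whose normalized titles match."""
    by_title: dict[str, list[dict]] = defaultdict(list)
    for item in items:
        title = item["data"].get("title", "")
        if title:  # untitled items are never grouped
            by_title[normalize_title(title)].append(item)
    return [group for group in by_title.values() if len(group) > 1]


# Made-up items for illustration only.
sample = [
    {"key": "AAA", "data": {"title": "Deep Learning: A Review"}},
    {"key": "BBB", "data": {"title": "deep learning  a review."}},
    {"key": "CCC", "data": {"title": "Something Else"}},
    {"key": "DDD", "data": {}},
]
groups = group_by_title(sample)
duplicate_keys = [[item["key"] for item in group] for group in groups]
```

With this sample, only the first two items end up in a group, because their titles differ just in case, punctuation, and spacing. Matching on an exact normalized string keeps the scan to a single pass over the items, at the cost of missing near-duplicates such as titles with a typo or a differing subtitle.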

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/BirdInTheTree/zotero-mcp'
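
For scripts, the same GET request can be issued from Python's standard library. This is a sketch only: the page does not document the response format, so decoding the body as JSON is an assumption, and `build_request` and `fetch_server_info` are illustrative names rather than part of any Glama client.

```python
import json
import urllib.request

API_URL = "https://glama.ai/api/mcp/v1/servers/BirdInTheTree/zotero-mcp"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Build a GET request for the same URL the curl command uses."""
    return urllib.request.Request(url, method="GET")


def fetch_server_info(url: str = API_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming the API returns JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server_info()` performs the network request; `build_request()` alone can be used to inspect the request without sending it.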

If you have feedback or need assistance with the MCP directory API, please join our Discord server.