zk_find_similar_notes
Retrieves notes similar to a specified reference note, filtered by a similarity threshold and capped at a result limit, to aid Zettelkasten knowledge management.
Instructions
Find notes similar to a given note.

Args:
- note_id: ID of the reference note
- threshold: Similarity threshold (0.0-1.0)
- limit: Maximum number of results to return
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
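| limit | No | Maximum number of results to return | 5 |
| note_id | Yes | ID of the reference note | |
| threshold | No | Similarity threshold (0.0-1.0) | 0.3 |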
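As a rough illustration of the schema above, the following sketch assembles an arguments payload for the tool. The helper name `build_arguments`, the sample note ID, and the validation checks are assumptions made for this example; the defaults (0.3 and 5) are taken from the handler signature shown under Implementation Reference, and the handler itself does not perform these checks.

```python
from typing import Any, Dict


def build_arguments(
    note_id: str, threshold: float = 0.3, limit: int = 5
) -> Dict[str, Any]:
    """Assemble arguments for zk_find_similar_notes (hypothetical helper)."""
    # Validation below is an assumption of this sketch, not part of the tool.
    if not note_id:
        raise ValueError("note_id is required")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return {"note_id": note_id, "threshold": threshold, "limit": limit}


# "20240101T120000" is a made-up note ID used only for illustration.
args = build_arguments("20240101T120000", threshold=0.5)
print(args)
```

Only `note_id` is required; omitting the other two falls back to the handler defaults.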
Implementation Reference
- MCP tool handler implementation for zk_find_similar_notes, including decorator registration, input parameters (note_id, threshold, limit), logic to fetch similar notes from zettel_service, apply limit, and format output string.

          name="zk_find_similar_notes",
          description="Discover notes with similar content using semantic similarity analysis.",
          annotations={
              "readOnlyHint": True,
              "destructiveHint": False,
              "idempotentHint": True,
          },
      )
      def zk_find_similar_notes(
          note_id: str, threshold: float = 0.3, limit: int = 5
      ) -> str:
          """Discover notes with similar content using semantic similarity analysis.

          Args:
              note_id: The unique ID of the reference note to compare against
              threshold: Minimum similarity score from 0.0 (unrelated) to 1.0 (identical) (default: 0.3)
              limit: Maximum number of similar notes to return (default: 5)
          """
          try:
              # Get similar notes
              similar_notes = self.zettel_service.find_similar_notes(
                  str(note_id), threshold
              )
              # Limit results
              similar_notes = similar_notes[:limit]
              if not similar_notes:
                  return f"No similar notes found for {note_id} with threshold {threshold}."
              # Format results
              output = f"Found {len(similar_notes)} similar notes for {note_id}:\n\n"
              for i, (note, similarity) in enumerate(similar_notes, 1):
                  output += f"{i}. {note.title} (ID: {note.id})\n"
                  output += f" Similarity: {similarity:.2f}\n"
                  if note.tags:
                      output += (
                          f" Tags: {', '.join(tag.name for tag in note.tags)}\n"
                      )
                  # Add a snippet of content (first 100 chars)
                  content_preview = note.content[:100].replace("\n", " ")
                  if len(note.content) > 100:
                      content_preview += "..."
                  output += f" Preview: {content_preview}\n\n"
              return output
          except Exception as e:
              return self.format_error_response(e)
- Core helper method implementing similarity calculation based on tag overlap (40%), shared outgoing links (20%), incoming links to the reference note (20%), and direct outgoing links (20%). Compares against all notes, filters by threshold, sorts by score.

      def find_similar_notes(self, note_id: str, threshold: float = 0.5) -> List[Tuple[Note, float]]:
          """Find notes similar to the given note based on shared tags and links."""
          note = self.repository.get(note_id)
          if not note:
              raise ValueError(f"Note with ID {note_id} not found")

          # Get all notes
          all_notes = self.repository.get_all()
          results = []

          # Set of this note's tags and links
          note_tags = {tag.name for tag in note.tags}
          note_links = {link.target_id for link in note.links}

          # Add notes linked to this note
          incoming_notes = self.repository.find_linked_notes(note_id, "incoming")
          note_incoming = {n.id for n in incoming_notes}

          # For each note, calculate similarity
          for other_note in all_notes:
              if other_note.id == note_id:
                  continue

              # Calculate tag overlap
              other_tags = {tag.name for tag in other_note.tags}
              tag_overlap = len(note_tags.intersection(other_tags))

              # Calculate link overlap (outgoing)
              other_links = {link.target_id for link in other_note.links}
              link_overlap = len(note_links.intersection(other_links))

              # Check if other note links to this note
              incoming_overlap = 1 if other_note.id in note_incoming else 0

              # Check if this note links to other note
              outgoing_overlap = 1 if other_note.id in note_links else 0

              # Calculate similarity score
              # Weight: 40% tags, 20% outgoing links, 20% incoming links, 20% direct connections
              total_possible = (
                  max(len(note_tags), len(other_tags)) * 0.4 +
                  max(len(note_links), len(other_links)) * 0.2 +
                  1 * 0.2 +  # Possible incoming link
                  1 * 0.2  # Possible outgoing link
              )

              # Avoid division by zero
              if total_possible == 0:
                  similarity = 0.0
              else:
                  similarity = (
                      (tag_overlap * 0.4) +
                      (link_overlap * 0.2) +
                      (incoming_overlap * 0.2) +
                      (outgoing_overlap * 0.2)
                  ) / total_possible

              if similarity >= threshold:
                  results.append((other_note, similarity))

          # Sort by similarity (descending)
          results.sort(key=lambda x: x[1], reverse=True)
          return results
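To make the scoring in find_similar_notes easier to follow, here is a minimal standalone sketch of the same formula applied to plain sets. The function name `similarity_score`, its parameters, and the sample tags and link IDs are assumptions made for illustration; the sketch does not use the repository or Note objects from the actual service.

```python
from typing import Set


def similarity_score(
    note_tags: Set[str],
    note_links: Set[str],
    other_tags: Set[str],
    other_links: Set[str],
    other_links_to_note: bool,
    note_links_to_other: bool,
) -> float:
    """Weighted overlap score, mirroring the formula in find_similar_notes."""
    tag_overlap = len(note_tags & other_tags)
    link_overlap = len(note_links & other_links)
    incoming = 1 if other_links_to_note else 0
    outgoing = 1 if note_links_to_other else 0

    # The two constant 0.2 terms keep the denominator at 0.4 or more,
    # so no zero check is needed in this sketch.
    total_possible = (
        max(len(note_tags), len(other_tags)) * 0.4
        + max(len(note_links), len(other_links)) * 0.2
        + 0.2
        + 0.2
    )
    earned = (
        tag_overlap * 0.4
        + link_overlap * 0.2
        + incoming * 0.2
        + outgoing * 0.2
    )
    return earned / total_possible


# Hypothetical data: two shared tags, one shared link target, and the
# candidate links to the reference note but not the other way around.
score = similarity_score(
    note_tags={"python", "mcp", "search"},
    note_links={"a", "b"},
    other_tags={"python", "search"},
    other_links={"b", "c"},
    other_links_to_note=True,
    note_links_to_other=False,
)
print(f"{score:.2f}")  # 0.60
```

Here the earned weight is 2 × 0.4 + 1 × 0.2 + 0.2 = 1.2 and the maximum possible is 3 × 0.4 + 2 × 0.2 + 0.4 = 2.0, so the candidate scores about 0.6. That clears both the handler default threshold (0.3) and the service default (0.5).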