docgraph_similar
Find documents topically similar to a given document using TF-IDF term overlap, shared references, and tag overlap. Returns a similarity score and the signal components.
Instructions
Find documents topically similar to a given document using TF-IDF term overlap, shared references, and tag overlap. This is the default engine (engine=auto or engine=tfidf); it is always on and needs no flags.

Zero results are normal for a topically unique document: a broad README or changelog commonly has no similar_to edges even when the index is fully built and the engine is working. A 0-result response does NOT mean the engine is off, embeddings are disabled, or the index is broken.

Neural similarity is an OPTIONAL add-on layered on top: neural scores are added only if embeddings were stored via docgraph_embeddings action=store (engine=neural). Disabled embeddings never cause a TF-IDF 0-result. For explicit link tracking, use docgraph_graph.

Accepts document paths only; heading anchors (doc.md#heading) return empty.

The score is a 0-to-1 weighted blend (TF-IDF cosine 50% + shared-reference Jaccard 30% + tag Jaccard 20%); it is NOT a percentage. Each result shows the three signal components that drove its score. No per-vocabulary-term breakdown is available: the engine does not retain individual term contributions, so you cannot identify which specific terms, phrases, or mentions made a score high OR low, and any per-term explanation of the TF-IDF component is fabricated. Scores are corpus-relative; 0.4-0.5 can mean near-identical in a corpus with high shared vocabulary.
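
The weighted blend described above can be illustrated with a minimal sketch. This is not the engine's implementation: the function names jaccard and blend_score are hypothetical, and returning 0.0 for two empty sets is an assumption made here for illustration. Only the 50/30/20 weights come from the description.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |A & B| / |A | B|."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumption: two empty sets contribute no signal rather than 1.0.
        return 0.0
    return len(set_a & set_b) / len(union)


def blend_score(tfidf_cosine, ref_jaccard, tag_jaccard):
    """Combine the three signal components using the documented weights."""
    return 0.5 * tfidf_cosine + 0.3 * ref_jaccard + 0.2 * tag_jaccard


# Shared references: one of three distinct targets is common to both documents.
refs = jaccard(["api.md", "setup.md"], ["setup.md", "faq.md"])
# No tags in common.
tags = jaccard(["guide"], ["reference"])
score = blend_score(0.6, refs, tags)
print(round(score, 2))
```

Because every component lies between 0 and 1 and the weights sum to 1, the blended score also lies between 0 and 1. A document pair with strong term overlap but no shared references or tags can therefore never exceed 0.5, which is one reason a mid-range score should not be read as a percentage match.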
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| document | Yes | Document name or path (document paths only; heading anchors return empty) | |
| engine | No | Similarity engine: auto, tfidf, or neural. The neural engine requires --enable-embeddings and returns an error if the server was not started with that flag. To check whether neural is available BEFORE querying, call docgraph_status and inspect the docgraph_embeddings field. | auto |
| limit | No | Max results | 10 |
| project | No | Workspace mode only: scope results to a single project by name (the directory name shown in docgraph_status). Omit to query all projects. No-op in single-store mode. | |
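
As a rough sketch of how a caller might assemble arguments matching the schema above, the helper below builds a plain dictionary. The name build_similar_args is hypothetical, and stripping a heading anchor on the client side is an assumption about what a caller may want, since anchors return empty; how the dictionary is sent to the server depends on the client and is not shown.

```python
VALID_ENGINES = ("auto", "tfidf", "neural")


def build_similar_args(document, engine="auto", limit=10, project=None):
    """Build an argument dictionary for docgraph_similar (hypothetical helper)."""
    if engine not in VALID_ENGINES:
        raise ValueError(f"engine must be one of {VALID_ENGINES}, got {engine!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")

    # Assumption: drop any heading anchor, because the tool accepts
    # document paths only and doc.md#heading returns empty.
    path = document.split("#", 1)[0]

    args = {"document": path, "engine": engine, "limit": limit}
    if project is not None:
        # Only meaningful in workspace mode; omitted to query all projects.
        args["project"] = project
    return args


print(build_similar_args("docs/guide.md#setup"))
```

Leaving project out of the dictionary when it is not set mirrors the schema's "omit to query all projects" behaviour, rather than sending an empty value whose handling is unspecified.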