Find Unused Attachments

find_unused_attachments
Read-only · Idempotent

Find attachments in an Obsidian vault that are not referenced by any note, aiding vault cleanup before archiving or sync.

Instructions

Locate attachments that no note references — neither via ![[file]] embeds nor [text](file) markdown links. Useful for vault hygiene before archiving or before running a sync. Pair the output with delete operations from your shell, since this tool deliberately doesn't unlink files.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| limit | No | Maximum number of unused-attachment paths to return (1-10000). Total counts are still reported. | 200 |
| includeBytes | No | If true, also stat each unused attachment and report total reclaimable bytes. | false |
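To make the interaction between `limit` and the reported totals concrete, here is a minimal, self-contained sketch (with hypothetical data and a stand-in `summarize` function, not part of the tool's actual API): truncation only shortens the listed paths, while the counts in the summary line always reflect the full scan.

```typescript
// Hypothetical sketch: `limit` truncates the listing, never the totals.
function summarize(attachments: string[], referenced: Set<string>, limit: number): string[] {
  const unused = attachments.filter((p) => !referenced.has(p));
  const shown = unused.slice(0, limit);
  const header =
    `Found ${unused.length} unused attachment(s) of ${attachments.length} total` +
    (unused.length > limit ? ` (showing first ${limit})` : "") +
    ":";
  return [header, "", ...shown.map((p) => `- ${p}`)];
}
```

With three attachments of which one is referenced and `limit: 1`, the header still reports "2 unused of 3 total" even though only one path is listed.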

Implementation Reference

  • The main handler function for the 'find_unused_attachments' tool. It lists all attachments and notes, scans each note for references (wikilink embeds and markdown links), then returns the set of attachments that are not referenced by any note. Supports optional includeBytes to report reclaimable storage.
    async ({ limit, includeBytes }, extra) => {
      try {
        const reportProgress = makeProgressReporter(extra);
        const attachments = await listAttachments(vaultPath);
        if (attachments.length === 0) {
          return textResult("No attachments in this vault — nothing to check.");
        }
        const attachmentSet = new Set(attachments);
        const basenameIndex = new Map<string, string[]>();
        for (const p of attachments) {
          const base = path.basename(p).toLowerCase();
          const list = basenameIndex.get(base);
          if (list) list.push(p);
          else basenameIndex.set(base, [p]);
        }
    
        const notes = await listNotes(vaultPath);
        await reportProgress(0, notes.length, "Reading notes…");
        const { contents } = await readAllCached(vaultPath, notes, (note, err) => {
          log.warn("find_unused_attachments: note read failed", { note, err });
        });
    
        const referenced = new Set<string>();
        let scanned = 0;
        for (const notePath of notes) {
          const content = contents.get(notePath);
          if (content !== undefined) {
            const { resolved } = collectReferencedAttachments(content, attachmentSet, basenameIndex);
            for (const r of resolved) referenced.add(r);
          }
          scanned++;
          await reportProgress(scanned, notes.length, `Scanned ${scanned}/${notes.length} notes`);
        }
    
        const unused = attachments.filter((p) => !referenced.has(p));
        if (unused.length === 0) {
          return textResult(
            `All ${attachments.length} attachment(s) are referenced — nothing to clean up.`,
          );
        }
    
        const truncated = unused.slice(0, limit);
        const lines: string[] = [
          `Found ${unused.length} unused attachment(s) of ${attachments.length} total${unused.length > limit ? ` (showing first ${limit})` : ""}:`,
          "",
        ];
    
        if (includeBytes) {
          let totalBytes = 0;
          const sizes = new Map<string, number>();
          for (const p of truncated) {
            try {
              const stat = await getAttachmentStats(vaultPath, p);
              sizes.set(p, stat.size);
              totalBytes += stat.size;
            } catch {
              // skip — file may have been removed mid-scan
            }
          }
          lines.push(`Total reclaimable: ${totalBytes.toLocaleString()} bytes`);
          lines.push("");
          for (const p of truncated) {
            const sz = sizes.get(p);
            lines.push(sz !== undefined ? `- ${p}  (${sz.toLocaleString()} bytes)` : `- ${p}`);
          }
        } else {
          for (const p of truncated) lines.push(`- ${p}`);
        }
    
        return textResult(lines.join("\n"));
      } catch (err) {
        log.error("find_unused_attachments failed", {
          tool: "find_unused_attachments",
          err: err as Error,
        });
        return errorResult(`Error finding unused attachments: ${sanitizeError(err)}`);
      }
    },
  • Input schema for 'find_unused_attachments': limit (1-10000, default 200) for max paths returned, and includeBytes (boolean, default false) to also stat files and report total reclaimable bytes.
    inputSchema: {
      limit: z
        .number()
        .int()
        .min(1)
        .max(10000)
        .optional()
        .default(200)
        .describe("Maximum number of unused-attachment paths to return (1-10000, default: 200). Total counts are still reported."),
      includeBytes: z
        .boolean()
        .optional()
        .default(false)
        .describe("If true, also stat each unused attachment and report total reclaimable bytes."),
    },
  • Registration of the 'find_unused_attachments' tool via server.registerTool inside the registerAttachmentTools function, which is exported and called from src/index.ts.
    server.registerTool(
      "find_unused_attachments",
      {
        title: "Find Unused Attachments",
        description:
          "Locate attachments that no note references — neither via `![[file]]` embeds nor `[text](file)` markdown links. Useful for vault hygiene before archiving or before running a sync. Pair the output with `delete` operations from your shell, since this tool deliberately doesn't unlink files.",
        annotations: {
          readOnlyHint: true,
          idempotentHint: true,
          openWorldHint: false,
        },
        inputSchema: {
          // identical to the input schema shown above (limit, includeBytes)
        },
      },
      async ({ limit, includeBytes }, extra) => {
        // identical to the handler implementation shown above
      },
    );
  • src/index.ts:26 (registration)
    Import of registerAttachmentTools from src/tools/attachments.js, used to register all attachment tools including find_unused_attachments.
    import { registerAttachmentTools } from "./tools/attachments.js";
  • Helper function collectReferencedAttachments which resolves the set of attachment paths referenced by a single note's content, considering ![[wikilink]] embeds and [markdown links](files). Uses exact relative-path match then basename match.
    function collectReferencedAttachments(
      noteContent: string,
      attachmentSet: ReadonlySet<string>,
      basenameIndex: ReadonlyMap<string, string[]>,
    ): { resolved: Set<string>; unresolved: string[] } {
      const resolved = new Set<string>();
      const unresolved: string[] = [];
    
      const consider = (rawTarget: string): void => {
        const t = rawTarget.split("#")[0].split("^")[0].trim();
        if (!t) return;
    
        // 1) Exact relative-path match (case-insensitive on case-insensitive FS,
        //    but we lowercase consistently to keep cross-platform behavior stable).
        const lower = t.toLowerCase();
        for (const att of attachmentSet) {
          if (att.toLowerCase() === lower) {
            resolved.add(att);
            return;
          }
        }
    
        // 2) Basename match. Obsidian also allows missing extensions on
        //    attachment links, but only for image/PDF formats — we stay strict
        //    and require the extension to keep this code small.
        const base = path.basename(t).toLowerCase();
        const candidates = basenameIndex.get(base);
        if (candidates && candidates.length > 0) {
          for (const c of candidates) resolved.add(c);
          return;
        }
    
        unresolved.push(t);
      };
    
      for (const span of extractWikilinkSpans(noteContent)) {
        if (!span.isEmbed) continue;
        consider(span.target);
      }
      for (const span of extractMarkdownLinkSpans(noteContent)) {
        // Markdown embed: `![text](url.png)`. The `isEmbed` flag captures `!`.
        // Plain `[text](url)` to a file is also a reference, even without `!`.
        consider(span.urlPath);
      }
      return { resolved, unresolved };
    }
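The helper above depends on `extractWikilinkSpans` and `extractMarkdownLinkSpans`, which are not shown in the excerpt. The following is a hypothetical, simplified regex-based sketch of what those extractors plausibly do, covering the two reference shapes the description names: `![[file]]` wikilink embeds (with optional `|alias`) and `[text](file)` markdown links (with a leading `!` marking an embed).

```typescript
// Hypothetical stand-ins for the extractors used by collectReferencedAttachments.
interface WikilinkSpan { target: string; isEmbed: boolean }
interface MarkdownLinkSpan { urlPath: string; isEmbed: boolean }

function extractWikilinkSpans(text: string): WikilinkSpan[] {
  const spans: WikilinkSpan[] = [];
  // ![[file.png]] (embed) or [[file.png]] (link); an alias after "|" is dropped.
  const re = /(!?)\[\[([^\]|]+)(?:\|[^\]]*)?\]\]/g;
  for (const m of text.matchAll(re)) {
    spans.push({ target: m[2], isEmbed: m[1] === "!" });
  }
  return spans;
}

function extractMarkdownLinkSpans(text: string): MarkdownLinkSpan[] {
  const spans: MarkdownLinkSpan[] = [];
  // ![alt](path.png) (embed) or [text](path.png) (plain link).
  const re = /(!?)\[[^\]]*\]\(([^)\s]+)\)/g;
  for (const m of text.matchAll(re)) {
    spans.push({ urlPath: m[2], isEmbed: m[1] === "!" });
  }
  return spans;
}
```

One design note on the helper itself: the exact-match step in `consider` scans the entire attachment set for every link target. For large vaults, precomputing a single `Map` from lowercased path to original path would make each resolution a constant-time lookup instead of a linear scan.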
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and idempotentHint. The description adds key behavioral context: 'deliberately doesn't unlink files' — which reinforces the read-only nature and clarifies the tool's scope. It also implies the output is a list of paths. This extra safety guidance goes beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: three sentences that front-load the core purpose, then provide usage context and a user guidance note. Every sentence earns its place with no fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (2 parameters, no output schema, clear annotations), the description covers purpose, use cases, and a caveat. It doesn't explicitly describe the output format (e.g., list of file paths), but that is reasonably implied. Minor omission, but overall sufficient for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema already fully documents both parameters (limit, includeBytes) with clear descriptions and defaults. The tool description adds no additional parameter semantics. With 100% schema coverage, a baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool locates attachments not referenced by any note, with specific definition of 'unused' (no embeds or links). The verb 'Locate' and resource 'attachments' are precise, and it distinguishes this tool from siblings like find_orphans (which finds notes) and find_broken_links (which finds links).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives explicit use cases: 'vault hygiene before archiving or before running a sync.' It also advises pairing output with shell delete commands since the tool doesn't unlink. However, it doesn't explicitly compare with sibling tools (e.g., when to use find_orphans vs this), so a minor gap prevents a 5.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
