maxkuminov

Obsidian MCP (pgvector + Ollama, self-hosted)

find_orphans

Find notes with no resolved inbound or outbound links in an Obsidian vault. Optionally scopes the search to a folder and caps the number of results, for focused cleanup decisions.

Instructions

Notes with zero incoming AND zero outgoing resolved links — useful for vault hygiene ("what's disconnected?") and cleanup decisions.

Args:
  folder: Optional vault-relative folder prefix to scope the search (e.g. "Cards/").
  limit: Maximum results (default 50, hard cap 500).

Input Schema

Name | Required | Description | Default
folder | No | |
limit | No | |

Output Schema

Name | Required | Description | Default
result | Yes | |

Implementation Reference

  • Core implementation of find_orphans. Queries NoteMetadata for notes whose ID does not appear in any NoteLink source or target column (i.e., zero incoming and zero outgoing links). Supports optional folder filter and limit, returns a formatted string of orphan note paths.
    @_tracked("find_orphans", ["folder", "limit"])
    async def find_orphans_impl(folder: str | None = None, limit: int = 50) -> str:
        """Notes with zero incoming AND zero outgoing resolved links."""
        from sqlalchemy import select, union
        from src.models.db import NoteLink, NoteMetadata
    
        uid = current_user_id.get()
        limit = max(1, min(limit, 500))
    
        async with async_session() as session:
            # The "connected" subquery collects every NoteLink endpoint id.
            # Since `note_links` has no `user_id`, scoping happens implicitly:
            # the outer `NoteMetadata` query filters to this user's notes, so
            # only those rows are candidates for orphan-ness. Any cross-user
            # NoteLink rows (which would only exist on a corrupted state)
            # would still appear in `connected` and exclude the corresponding
            # note id — that's the safe direction (false negatives, not
            # false orphans).
            sources = select(NoteLink.source_note_id.label("nid")).where(
                NoteLink.source_note_id.isnot(None)
            )
            targets = select(NoteLink.target_note_id.label("nid")).where(
                NoteLink.target_note_id.isnot(None)
            )
            connected = union(sources, targets).subquery()
            stmt = select(NoteMetadata).where(NoteMetadata.id.notin_(select(connected.c.nid)))
            stmt = apply_note_filters(stmt, folder=folder, user_id=uid)
            stmt = stmt.order_by(NoteMetadata.modified_at.desc().nullslast()).limit(limit)
            notes = (await session.execute(stmt)).scalars().all()
    
        if not notes:
            scope = f" in `{folder}`" if folder else ""
            return f"No orphan notes{scope}"
        lines = [f"Found {len(notes)} orphan notes:\n"]
        for n in notes:
            mod = n.modified_at.strftime("%Y-%m-%d") if n.modified_at else "unknown"
            tags_str = f" [{', '.join(n.tags)}]" if n.tags else ""
            lines.append(f"- `{n.file_path}` — {n.title}{tags_str} (modified {mod})")
        return "\n".join(lines)
  • MCP tool registration for find_orphans. Decorated with @mcp.tool() which registers it with the FastMCP server. The function just delegates to find_orphans_impl.
    @mcp.tool()
    async def find_orphans(folder: str | None = None, limit: int = 50) -> str:
        """Notes with zero incoming AND zero outgoing resolved links — useful for
        vault hygiene ("what's disconnected?") and cleanup decisions.
    
        Args:
            folder: Optional vault-relative folder prefix to scope the search
                (e.g. "Cards/").
            limit: Maximum results (default 50, hard cap 500).
        """
        return await find_orphans_impl(folder=folder, limit=limit)
  • Shared helper function used by find_orphans_impl to apply optional folder, tags, frontmatter, and user_id filters to the SQL query.
    def apply_note_filters(
        stmt: Select,
        *,
        folder: str | None = None,
        tags: list[str] | None = None,
        frontmatter: dict | None = None,
        user_id: int | None = None,
    ) -> Select:
        """Append optional `folder`, `tags`, `frontmatter`, `user_id` predicates
        to a select over NoteMetadata.
    
        - `folder`: prefix match on `file_path`. LIKE wildcards (`%`, `_`, `\\`) are escaped.
        - `tags`: ARRAY containment (`notes_metadata.tags @> ARRAY[...]`). AND semantics.
        - `frontmatter`: JSONB containment (`notes_metadata.frontmatter @> :json`). Strict types.
        - `user_id`: scope to one user. `None` (single-user mode / unset) means no
          filter is appended, so existing NULL-user rows are returned. `int` adds
          `.where(NoteMetadata.user_id == user_id)`.
    
        None or empty argument means "no filter" — the predicate is not appended.
        """
        if folder:
            escaped = _escape_like(folder)
            stmt = stmt.where(NoteMetadata.file_path.like(f"{escaped}%", escape="\\"))
        if tags:
            stmt = stmt.where(NoteMetadata.tags.contains(tags))
        if frontmatter:
            stmt = stmt.where(NoteMetadata.frontmatter.contains(frontmatter))
        if user_id is not None:
            stmt = stmt.where(NoteMetadata.user_id == user_id)
        return stmt
  • Import of find_orphans_impl from the tools module, used by the MCP tool registration in server.py.
    from src.mcp_server.tools import (
        create_note_impl,
        delete_note_impl,
        edit_note_impl,
        find_orphans_impl,
        find_related_impl,
        get_backlinks_impl,
        get_links_impl,
        get_neighborhood_impl,
        get_recent_impl,
        get_tags_impl,
        get_vault_guide_impl,
        list_notes_impl,
        move_note_impl,
        read_note_impl,
        search_notes_impl,
        semantic_search_impl,
        set_frontmatter_impl,
    )
  • Decorator used by find_orphans_impl to log usage metrics (timing, params, response size) to the usage_logs table.
    def _tracked(tool_name: str, param_keys: list[str]):
        """Decorator that times the call and logs it to usage_logs."""
        def decorator(fn):
            @wraps(fn)
            async def wrapper(*args, **kwargs):
                start = time.monotonic()
                result = await fn(*args, **kwargs)
                duration_ms = int((time.monotonic() - start) * 1000)
                params = {}
                for i, key in enumerate(param_keys):
                    if i < len(args):
                        params[key] = args[i]
                    elif key in kwargs:
                        params[key] = kwargs[key]
                await _log_usage(tool_name, _truncate_params(params), duration_ms, len(str(result)))
                return result
            return wrapper
        return decorator
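
The `apply_note_filters` docstring says LIKE wildcards in `folder` are escaped, but the body of `_escape_like` is not shown. The sketch below is a guess at what such a helper might look like, not the project's actual code: `escape_like` and `like_to_regex` are hypothetical names, and the tiny LIKE emulator exists only to show why escaping matters for a folder such as `Cards_2024/`.

```python
import re


def escape_like(value: str, escape: str = "\\") -> str:
    """Hypothetical stand-in for `_escape_like`: make `%`, `_` and the
    escape character itself match literally inside a LIKE pattern."""
    return (
        value.replace(escape, escape + escape)  # escape char first, so it is not doubled later
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


def like_to_regex(pattern: str, escape: str = "\\") -> re.Pattern:
    """Minimal emulation of SQL LIKE with an ESCAPE character (illustration only)."""
    out, i = [], 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            out.append(re.escape(pattern[i + 1]))  # escaped char is taken literally
            i += 2
            continue
        out.append(".*" if ch == "%" else "." if ch == "_" else re.escape(ch))
        i += 1
    return re.compile("".join(out), re.DOTALL)


folder = "Cards_2024/"
naive = like_to_regex(folder + "%")              # `_` acts as a single-character wildcard
safe = like_to_regex(escape_like(folder) + "%")  # `_` matches only a literal underscore

print(bool(naive.fullmatch("CardsX2024/a.md")))  # True  (unintended match)
print(bool(safe.fullmatch("CardsX2024/a.md")))   # False
print(bool(safe.fullmatch("Cards_2024/a.md")))   # True
```

Without escaping, a folder name containing `_` or `%` would silently widen the prefix match and could pull notes from sibling folders into the result.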
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description bears full responsibility. It discloses the condition of 'resolved links' but does not elaborate on behavioral traits such as performance, authentication needs, or what happens with no results. It is adequate but not comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: two sentences for purpose and usage, followed by parameter documentation. No redundant words, and the key information is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given an output schema exists, the description does not need to detail return format. It covers purpose, usage, and parameters well. A minor gap is the lack of explanation about what constitutes a 'resolved link,' but overall it is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 0% schema coverage, the description fully documents both parameters: folder as a vault-relative scope prefix and limit with default and hard cap. This adds essential meaning beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool finds notes with zero incoming and zero outgoing resolved links, providing a specific verb+resource. It distinguishes from siblings like get_links and get_backlinks which operate on individual notes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says it is useful for vault hygiene and cleanup decisions, giving clear context for when to use it. However, it does not provide explicit when-not-to-use scenarios or mention alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/maxkuminov/obsidian-mcp'
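
The same request can be made from Python with only the standard library. This is a minimal sketch: `server_url` and `fetch_server` are hypothetical helper names, and it assumes the endpoint returns a JSON object, which this page does not document.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one MCP server."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> dict:
    """GET the server record (assumes a JSON object in the response body)."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout) as resp:
        return json.load(resp)


# Example call (performs a network request):
# info = fetch_server("maxkuminov", "obsidian-mcp")
```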

If you have feedback or need assistance with the MCP directory API, please join our Discord server.