codebrain_scan_repo

Scan a repository's source files to generate or refresh .brain files, skipping unchanged files via hash comparison. Filters by file extension and excludes directories like .git and node_modules.

Instructions

Scan every source file under root and generate/refresh its .brain file.

Walks the directory tree, filters by file extension, prunes excluded directories, and runs codebrain_scan_file on each match. Hash-gated: unchanged files skip the model call. Per-file failures do not abort the batch — they are reported at the end.

Defaults:

extensions: .py .js .ts .tsx .jsx .java .go .rs
exclude_dirs: .git .venv venv node_modules __pycache__ dist build target

Args:

root: Directory to scan recursively.
force: If true, regenerate every brain file even when source hash matches.
extensions: Override default source extensions (e.g. [".py", ".rb"]).
exclude_dirs: Override default directory-name exclusion list.
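As a rough illustration of how a client might invoke this tool, the sketch below builds a JSON-RPC 2.0 `tools/call` request, the message shape MCP uses for tool invocation. The helper name `build_scan_repo_request`, the example path, and the request id are assumptions made for illustration; they are not part of CodeBrain.

```python
import json


def build_scan_repo_request(
    root, force=False, extensions=None, exclude_dirs=None, request_id=1
):
    """Build a `tools/call` request for codebrain_scan_repo (hypothetical helper)."""
    arguments = {"root": root, "force": force}
    # Send overrides only when provided, so the server applies its own defaults.
    if extensions is not None:
        arguments["extensions"] = list(extensions)
    if exclude_dirs is not None:
        arguments["exclude_dirs"] = list(exclude_dirs)
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "codebrain_scan_repo", "arguments": arguments},
    }


# Example: scan a (placeholder) repository, limiting the scan to Python and Ruby files.
request = build_scan_repo_request("/path/to/repo", extensions=[".py", ".rb"])
print(json.dumps(request, indent=2))
```

Omitting `extensions` and `exclude_dirs` from the arguments leaves both as `None` on the server side, which is what triggers the default lists above.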

Input Schema

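Name	Required	Description	Default
`root`	Yes	Directory to scan recursively.	
`force`	No	If true, regenerate every brain file even when source hash matches.	`False`
`extensions`	No	Override default source extensions (e.g. [".py", ".rb"]).	`None` (use the default extensions)
`exclude_dirs`	No	Override default directory-name exclusion list.	`None` (use the default exclusion list)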

Output Schema

Name	Required	Description	Default
`result`	Yes	Plain-text scan summary, or a `[codebrain error]` message when root is missing or not a directory.	
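Because `result` is a plain string, a caller that needs the counts has to parse them. The sketch below assumes the first line follows the "Scanned N files: ..." format produced by `scan_repo` in the implementation reference; `ScanSummary` and `parse_scan_summary` are hypothetical names, not part of CodeBrain.

```python
from __future__ import annotations

import re
from typing import NamedTuple


class ScanSummary(NamedTuple):
    total: int
    generated: int
    skipped: int
    failed: int


# Matches the first line of a successful result string.
_SUMMARY_RE = re.compile(
    r"^Scanned (\d+) files: (\d+) generated, (\d+) skipped, (\d+) failed\."
)


def parse_scan_summary(result: str) -> ScanSummary | None:
    """Return the counts from a result string, or None if it is not a summary."""
    match = _SUMMARY_RE.match(result)
    if match is None:
        return None  # e.g. a "[codebrain error] ..." message
    return ScanSummary(*(int(group) for group in match.groups()))


example = (
    "Scanned 3 files: 1 generated, 2 skipped, 0 failed.\n"
    "\nGenerated:\n  - src/app.py"
)
print(parse_scan_summary(example))
print(parse_scan_summary("[codebrain error] root not found: /nope"))
```

Parsing free text like this is brittle if the message format changes, so it is best treated as a convenience rather than a contract.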

Implementation Reference

codebrain/server.py:301-327 (handler)

MCP tool handler for codebrain_scan_repo. Decorated with @mcp.tool(), it delegates to brain_scanner.scan_repo().

@mcp.tool()
async def codebrain_scan_repo(
    root: str,
    force: bool = False,
    extensions: list[str] | None = None,
    exclude_dirs: list[str] | None = None,
) -> str:
    """Scan every source file under `root` and generate/refresh its `.brain` file.

    Walks the directory tree, filters by file extension, prunes excluded
    directories, and runs `codebrain_scan_file` on each match. Hash-gated:
    unchanged files skip the model call. Per-file failures do not abort the
    batch — they are reported at the end.

    Defaults:
      - extensions: .py .js .ts .tsx .jsx .java .go .rs
      - exclude_dirs: .git .venv venv node_modules __pycache__ dist build target

    Args:
        root: Directory to scan recursively.
        force: If true, regenerate every brain file even when source hash matches.
        extensions: Override default source extensions (e.g. [".py", ".rb"]).
        exclude_dirs: Override default directory-name exclusion list.
    """
    return await brain_scanner.scan_repo(
        root, force=force, extensions=extensions, exclude_dirs=exclude_dirs
    )

codebrain/brain_scanner.py:353-397 (handler)

Core logic implementing scan_repo. Walks source files via iter_source_files(), calls scan_file() for each, and reports generated/skipped/failed counts.

async def scan_repo(
    root: str,
    force: bool = False,
    extensions: list[str] | None = None,
    exclude_dirs: list[str] | None = None,
) -> str:
    """Scan every source file under `root` and generate/refresh its `.brain` file.

    Hash-gated: files whose source hash matches the existing `.brain` are
    skipped without invoking the model. Use `force=True` to override.
    Per-file failures do not abort the batch — they are reported at the end.
    """
    root_path = Path(root)
    if not root_path.exists():
        return f"[codebrain error] root not found: {root}"
    if not root_path.is_dir():
        return f"[codebrain error] root is not a directory: {root}"

    generated: list[str] = []
    skipped: list[str] = []
    failed: list[tuple[str, str]] = []

    for source in iter_source_files(root_path, extensions, exclude_dirs):
        display = resolve_display_path(source)
        result = await scan_file(str(source), force=force)
        if result.startswith("generated:"):
            generated.append(display)
        elif result.startswith("skipped"):
            skipped.append(display)
        else:
            failed.append((display, result))

    total = len(generated) + len(skipped) + len(failed)
    lines = [
        f"Scanned {total} files: {len(generated)} generated, "
        f"{len(skipped)} skipped, {len(failed)} failed."
    ]
    if generated:
        lines.append("\nGenerated:")
        lines.extend(f"  - {p}" for p in generated)
    if failed:
        lines.append("\nFailed:")
        lines.extend(f"  - {p} — {reason}" for p, reason in failed)
    return "\n".join(lines)

codebrain/brain_scanner.py:67-69 (helper)

Helper: computes the SHA-256 hash of a source file's raw bytes, used to skip unchanged files.

def compute_source_hash(content: bytes) -> str:
    """Return `sha256:<hex>` digest of raw file bytes."""
    return "sha256:" + hashlib.sha256(content).hexdigest()
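To make the hash gate concrete, here is a minimal sketch of the comparison it implies. How CodeBrain actually stores and reads the recorded hash inside a `.brain` file is not shown on this page, so `stored_hash` is simply passed in, and `needs_regeneration` is a hypothetical name.

```python
from __future__ import annotations

import hashlib


def compute_source_hash(content: bytes) -> str:
    """Return `sha256:<hex>` digest of raw file bytes (copied from the helper above)."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def needs_regeneration(
    content: bytes, stored_hash: str | None, force: bool = False
) -> bool:
    """Decide whether a brain file should be rebuilt (hypothetical gate logic)."""
    # Rebuild when forced, or when no hash was recorded previously.
    if force or stored_hash is None:
        return True
    # Otherwise rebuild only if the source bytes changed.
    return compute_source_hash(content) != stored_hash


source = b"print('hello')\n"
recorded = compute_source_hash(source)

print(needs_regeneration(source, recorded))               # unchanged source
print(needs_regeneration(source + b"# edit\n", recorded))  # modified source
print(needs_regeneration(source, recorded, force=True))   # forced rebuild
```

Hashing raw bytes rather than decoded text means any change, including whitespace or line-ending differences, counts as a modification.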

codebrain/brain_scanner.py:332-350 (helper)

Helper: iterates source files under root matching extensions, pruning excluded dirs via os.walk.

def iter_source_files(
    root: Path,
    extensions: list[str] | None = None,
    exclude_dirs: list[str] | None = None,
) -> Iterator[Path]:
    """Yield source files under `root` matching `extensions`, pruning `exclude_dirs`.

    Walks the tree with `os.walk` and mutates the dirs list in-place to prune
    excluded directories before descending. Does NOT yield `.brain` files
    (the extension whitelist takes care of that implicitly).
    """
    ext_set = _normalise_extensions(extensions)
    exclude_set = frozenset(exclude_dirs) if exclude_dirs is not None else DEFAULT_EXCLUDE_DIRS

    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in exclude_set]
        for fname in filenames:
            if Path(fname).suffix.lower() in ext_set:
                yield Path(dirpath) / fname
