index_local
Indexes a local folder of documentation files, parsing content by heading hierarchy into sections for efficient retrieval. Supports embeddings and AI summaries when configured.
Instructions
Index a local folder containing documentation files (.md, .txt, .rst). Files are parsed by heading hierarchy into sections for efficient retrieval. Embeddings auto-enable when a provider is configured: GOOGLE_API_KEY, OPENAI_API_KEY, openai-compatible (with both JDOCMUNCH_OPENAI_COMPAT_URL and JDOCMUNCH_OPENAI_COMPAT_MODEL), or sentence-transformers.
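
As a rough illustration of the auto-enable rule above, the provider check can be sketched as follows. This is not the tool's actual implementation: the function name `detect_embedding_provider`, the returned labels, and the precedence order are assumptions made for the example. Only the environment variable names come from the description above, and the sketch assumes the openai-compatible option needs nothing beyond its two variables.

```python
import importlib.util
import os
from typing import Mapping, Optional


def detect_embedding_provider(
    env: Optional[Mapping[str, str]] = None,
    st_installed: Optional[bool] = None,
) -> Optional[str]:
    """Return a label for the first configured provider, or None.

    Hypothetical helper: the precedence order below is an assumption.
    """
    env = os.environ if env is None else env
    if env.get("GOOGLE_API_KEY"):
        return "google"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    # Assumed: the openai-compatible option needs both variables set.
    if env.get("JDOCMUNCH_OPENAI_COMPAT_URL") and env.get("JDOCMUNCH_OPENAI_COMPAT_MODEL"):
        return "openai-compatible"
    if st_installed is None:
        # Check whether the sentence-transformers package is importable.
        st_installed = importlib.util.find_spec("sentence_transformers") is not None
    return "sentence-transformers" if st_installed else None
```

Under this sketch, `use_embeddings="auto"` would turn embeddings on exactly when the helper returns something other than `None`.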
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Path to local folder (absolute or relative, supports ~ for home directory) | |
| use_ai_summaries | No | Use AI to generate section summaries (requires ANTHROPIC_API_KEY or GOOGLE_API_KEY). When false, the heading text is used as the summary. | |
| use_embeddings | No | Generate semantic embeddings for each section, enabling hybrid (BM25+semantic) search. true/false/"auto". "auto" (default) enables embeddings when an embedding provider is configured (GOOGLE_API_KEY, OPENAI_API_KEY, openai-compatible + JDOCMUNCH_OPENAI_COMPAT_URL + JDOCMUNCH_OPENAI_COMPAT_MODEL, or sentence-transformers installed). | auto |
| extra_ignore_patterns | No | Additional gitignore-style patterns to exclude from indexing | |
| follow_symlinks | No | Whether to follow symlinks. Disabled by default for security. | false |
| max_files | No | Maximum number of doc files to index. When the cap is hit, the response includes `truncated: true`, `discovered: <total found>`, and `indexed: <max_files>` so the caller can detect data loss programmatically. Raise this for very large corpora. | 10000 |
| sort_by | No | Order in which files are truncated when discovered > max_files. 'newest' (default) keeps the most recently-edited files so a fresh edit is always in the index. 'walk_order' preserves filesystem-walk order for deterministic reproducible builds. No effect when corpus fits under the cap. | newest |
| name | No | Optional repo identifier override. Use this when two folders share the same name (e.g. both named 'docs'). If omitted, the folder name is used. Example: 'requests-docs', 'flask-docs'. | |
| incremental | No | When true, only re-index files that changed since the last index. Set to false to force a full re-index. | true |
| autotune | No | Available in v1.29+. When true, runs tune_weights against accumulated ranking events at the end of indexing. No-op when telemetry isn't enabled. | |
| paths | No | Optional list of explicit paths to index. When provided, the directory walk is skipped; only these files (and the contents of any directories in the list) are indexed. Entries may be absolute or relative to `path`. Useful for batch-indexing exactly the files an agent already knows about, e.g. the doc files git just touched. | |
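
The parameters above can be assembled into an arguments object, and the truncation fields checked on the way back. The sketch below only builds and inspects plain dictionaries and does not call the tool. The helper names `build_index_local_args` and `check_truncation` are hypothetical. The response field names (`truncated`, `discovered`, `indexed`) are those described for `max_files`, and the sketch assumes the response arrives as a dictionary.

```python
from typing import Any, Dict, Optional


def build_index_local_args(path: str, **overrides: Any) -> Dict[str, Any]:
    """Build an arguments dict for index_local (hypothetical helper).

    Only documented defaults are spelled out; overrides replace them.
    """
    args: Dict[str, Any] = {
        "path": path,
        "use_embeddings": "auto",
        "follow_symlinks": False,
        "max_files": 10000,
        "sort_by": "newest",
        "incremental": True,
    }
    args.update(overrides)
    return args


def check_truncation(response: Dict[str, Any]) -> Optional[str]:
    """Return a warning if the file cap was hit, else None."""
    if not response.get("truncated"):
        return None
    discovered = response["discovered"]
    indexed = response["indexed"]
    return (
        f"indexed {indexed} of {discovered} files; "
        f"raise max_files to at least {discovered}"
    )


# Example: index only the files git just touched, forcing a full re-index.
args = build_index_local_args(
    "~/projects/flask/docs",
    name="flask-docs",
    paths=["quickstart.rst", "api.rst"],
    incremental=False,
)
```

Spelling out the defaults makes the request self-describing in logs. Passing only `path` and the fields that differ from the defaults would be equivalent.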