Index Documents for RAG

rag_index
Idempotent

Embed local files or directories and index them into a semantic search store for later retrieval. Re-indexing replaces old chunks.

Instructions

Embed and index a local file or directory into the semantic search store so you can retrieve relevant passages later with rag_search. Re-indexing the same source replaces its old chunks.
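
The replace-on-re-index behavior described above can be pictured with a toy in-memory store. This is only a sketch: the `ChunkStore` class, its fixed-size character chunking, and the `chunk_size` parameter are assumptions made for illustration, not the tool's actual storage layer or chunking rule.

```python
class ChunkStore:
    """Toy in-memory store mapping each source path to its chunks (hypothetical)."""

    def __init__(self) -> None:
        self._chunks_by_source: dict = {}

    def index(self, source: str, text: str, chunk_size: int = 20) -> int:
        # Fixed-size character chunks; an assumed rule, chosen only for simplicity.
        chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
        # Assigning (not appending) discards chunks from any earlier run of the
        # same source, which is what makes re-indexing idempotent.
        self._chunks_by_source[source] = chunks
        return len(chunks)

    def chunk_count(self) -> int:
        # Total number of chunks across all indexed sources.
        return sum(len(chunks) for chunks in self._chunks_by_source.values())
```

Indexing the same source twice leaves the chunk count unchanged, and indexing a shorter version of the file shrinks it, because stale chunks are dropped rather than accumulated.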

Requires an embedding server (LM Studio or llama.cpp) running with an embedding model loaded.

Args:

  • path (string): File or directory to index. Directories are walked for text/code files.

  • max_files (number): Cap on files when indexing a directory, 1-2000 (default 200).

Returns the number of files and chunks indexed.

Example: { "path": "~/project/docs" }
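
A client might validate arguments before calling the tool. The sketch below builds the arguments object from the two documented parameters and enforces the documented 1-2000 range for max_files. The helper name `build_rag_index_args` is hypothetical, and how the object is actually sent to the server depends on the MCP client in use.

```python
import json
from typing import Optional

# Range documented for max_files in the tool description.
MAX_FILES_MIN = 1
MAX_FILES_MAX = 2000


def build_rag_index_args(path: str, max_files: Optional[int] = None) -> dict:
    """Build the arguments object for a rag_index call (hypothetical helper)."""
    if not path:
        raise ValueError("path is required")
    args: dict = {"path": path}
    if max_files is not None:
        if not MAX_FILES_MIN <= max_files <= MAX_FILES_MAX:
            raise ValueError(
                f"max_files must be between {MAX_FILES_MIN} and {MAX_FILES_MAX}"
            )
        args["max_files"] = max_files
    # When max_files is omitted, the server-side default (200) applies.
    return args


print(json.dumps(build_rag_index_args("~/project/docs")))
print(json.dumps(build_rag_index_args("~/project/docs", max_files=500)))
```

Leaving max_files out of the object, rather than sending the default explicitly, keeps the request identical to the documented example.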

Input Schema

Name       Required  Description                  Default
path       Yes       File or directory to index
max_files  No        Max files for a directory
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds context beyond annotations: re-indexing replaces old chunks (explaining idempotentHint), and requires an external server. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Efficiently structured: front-loaded purpose, then behavior, prerequisite, arg details with example. Every sentence serves a purpose; no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers core functionality, parameters, and example. Lacks error handling details, but sufficient for a straightforward indexing tool with no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Adds value beyond the schema: directories are walked for text/code files, and max_files is a cap with a stated range (1-2000). Schema coverage is 100%, so the baseline score is 3; the description earns more by adding file-type context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states that the tool embeds and indexes local files into a semantic search store, and explicitly points to rag_search for retrieval, which distinguishes it from sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear context: prerequisite of an embedding server, re-indexing behavior, and links to rag_search. Does not explicitly exclude alternatives but offers sufficient guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/highercomve/mcptools'
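
The same request can be made from Python with the standard library. This is a minimal sketch: it assumes the endpoint returns a JSON body, and the `Accept` header, the timeout, and the function names `build_request` and `fetch_server_info` are illustrative choices rather than anything the API documentation prescribes.

```python
import json
import urllib.request

# URL taken verbatim from the curl command above.
SERVER_URL = "https://glama.ai/api/mcp/v1/servers/highercomve/mcptools"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Create the GET request; the Accept header is an assumption."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = SERVER_URL) -> dict:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=10) as response:
        return json.load(response)


# Usage (performs a network call):
#   info = fetch_server_info()
#   print(json.dumps(info, indent=2))
```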

If you have feedback or need assistance with the MCP directory API, please join our Discord server.