stats

Aggregate file counts, untagged items, favorites, and top tags across your vault or project. Use it as a dashboard to spot cleanup opportunities.

Instructions

Aggregate counts for a scope: file_count, untagged_count, favorite_count, top_tags. With project_id omitted (everything), also returns by_project breakdown. include_token_total: true stat()s every matching file on disk to compute a body-size estimate — measurably slower on large vaults; default false. project_id: null = KB only; omit = all. Read-only; no side effects, auth, or rate limits. Use as a cheap dashboard or to spot untagged content for cleanup; for live disk-vs-index drift use diff_against_disk.
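The difference between omitting project_id and passing null is easy to lose when building arguments by hand. The sketch below is illustrative only: the StatsArgs type and describeScope helper are hypothetical names, and the scope labels are the ones the handler in the Implementation Reference returns.

```typescript
// Hypothetical argument type mirroring the three documented inputs.
type StatsArgs = {
  project_id?: number | null;
  top_tags?: number;
  include_token_total?: boolean;
};

// Resolve the scope label: omitted = everything, null = KB only,
// a number = that single project.
function describeScope(args: StatsArgs): string {
  if (args.project_id === undefined) return "all";
  if (args.project_id === null) return "knowledge_base";
  return `project:${args.project_id}`;
}

const everything: StatsArgs = {};
const kbOnly: StatsArgs = { project_id: null };
const oneProject: StatsArgs = { project_id: 7, top_tags: 5 };

console.log(describeScope(everything)); // "all"
console.log(describeScope(kbOnly));     // "knowledge_base"
console.log(describeScope(oneProject)); // "project:7"

// The distinction survives JSON serialization: an omitted key stays
// absent, while null is written out explicitly.
console.log(JSON.stringify(everything)); // {}
console.log(JSON.stringify(kbOnly));     // {"project_id":null}
```

Only the call with project_id omitted is expected to include the by_project breakdown in the response.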

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| project_id | No | Filter to a single project. Pass null for KB-only. Omit for everything. | - |
| top_tags | No | How many top tags to return (default 10) | 10 |
| include_token_total | No | If true, stat every matching file on disk to compute total est_tokens. Default false (cheap). | false |
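As a rough illustration of the constraints and defaults in the schema, the following sketch validates raw arguments by hand. It does not use Zod and is not the server's code; the function names are hypothetical, and the checks only approximate what the published schema (positive integer up to 100, optional boolean, nullable number) would accept.

```typescript
// project_id: a number, null (KB only), or omitted (everything).
function parseProjectId(value: unknown): number | null | undefined {
  if (value === undefined) return undefined;
  if (value === null) return null;
  if (typeof value === "number" && !Number.isNaN(value)) return value;
  throw new Error("project_id must be a number, null, or omitted");
}

// top_tags: positive integer no greater than 100; defaults to 10.
function parseTopTags(value: unknown): number {
  if (value === undefined) return 10;
  if (
    typeof value === "number" &&
    Number.isInteger(value) &&
    value > 0 &&
    value <= 100
  ) {
    return value;
  }
  throw new Error("top_tags must be a positive integer no greater than 100");
}

// include_token_total: boolean; defaults to false (the cheap path).
function parseIncludeTokenTotal(value: unknown): boolean {
  if (value === undefined) return false;
  if (typeof value === "boolean") return value;
  throw new Error("include_token_total must be a boolean");
}

console.log(parseTopTags(undefined));           // 10
console.log(parseTopTags(25));                  // 25
console.log(parseIncludeTokenTotal(undefined)); // false
console.log(parseProjectId(null));              // null
```

Keeping the three parsers separate makes it clear that each parameter is independent: none of them changes the validity of another.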

Implementation Reference

  • The primary handler for the 'stats' tool. It accepts optional project_id (null=KB, omit=everything), top_tags limit, and include_token_total boolean. Queries the SQLite database for file_count, untagged_count, favorite_count, top_tags, and optionally by_project breakdown. If include_token_total is true, it stats every matching file on disk to compute total estimated tokens. Returns a JSON response with scope, counts, and token estimate.
    server.tool(
      "stats",
      "Aggregate counts for a scope: `file_count`, `untagged_count`, `favorite_count`, `top_tags`. With `project_id` omitted (everything), also returns `by_project` breakdown. `include_token_total: true` stat()s every matching file on disk to compute a body-size estimate — measurably slower on large vaults; default false. `project_id: null` = KB only; omit = all. Read-only; no side effects, auth, or rate limits. Use as a cheap dashboard or to spot untagged content for cleanup; for live disk-vs-index drift use `diff_against_disk`.",
      {
        project_id: z.number().nullable().optional().describe("Filter to a single project. Pass null for KB-only. Omit for everything."),
        top_tags: z.number().int().positive().max(100).optional().describe("How many top tags to return (default 10)"),
        include_token_total: z.boolean().optional().describe("If true, stat every matching file on disk to compute total est_tokens. Default false (cheap)."),
      },
      async ({ project_id, top_tags, include_token_total }) => {
        try {
          const db = getDatabase();
          const limit = top_tags ?? 10;
    
          // Build the scope filter: null = KB only, number = one project,
          // undefined (omitted) = no filter at all.
          let scopeWhere = "";
          const scopeParams: number[] = [];
          if (project_id === null) {
            scopeWhere = "WHERE files.project_id IS NULL";
          } else if (typeof project_id === "number") {
            scopeWhere = "WHERE files.project_id = ?";
            scopeParams.push(project_id);
          }
    
          const fileCount = (db
            .prepare(`SELECT COUNT(*) AS n FROM files ${scopeWhere}`)
            .get(...scopeParams) as { n: number }).n;
    
          const untaggedCount = (db
            .prepare(
              `SELECT COUNT(*) AS n FROM files ${scopeWhere}${scopeWhere ? " AND" : "WHERE"} files.id NOT IN (SELECT DISTINCT file_id FROM file_tags)`
            )
            .get(...scopeParams) as { n: number }).n;
    
          const favoriteCount = (db
            .prepare(
              `SELECT COUNT(*) AS n FROM files ${scopeWhere}${scopeWhere ? " AND" : "WHERE"} files.id IN (SELECT file_id FROM favorites)`
            )
            .get(...scopeParams) as { n: number }).n;
    
          const topTags = db
            .prepare(
              `SELECT t.name, COUNT(*) AS count
               FROM file_tags ft
               JOIN tags t ON t.id = ft.tag_id
               JOIN files ON files.id = ft.file_id
               ${scopeWhere}
               GROUP BY t.id
               ORDER BY count DESC, t.name ASC
               LIMIT ?`
            )
            .all(...scopeParams, limit) as { name: string; count: number }[];
    
          const byProject =
            project_id === undefined
              ? (db
                  .prepare(
                    `SELECT p.id, p.name, COUNT(files.id) AS files
                     FROM projects p
                     LEFT JOIN files ON files.project_id = p.id
                     GROUP BY p.id
                     ORDER BY files DESC, p.name ASC`
                  )
                  .all() as { id: number; name: string; files: number }[])
              : null;
    
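          // Optional and comparatively expensive: one stat() call per matching file.
          let totalEstTokens: number | null = null;
          if (include_token_total) {
            const rows = db
              .prepare(`SELECT path FROM files ${scopeWhere}`)
              .all(...scopeParams) as { path: string }[];
            let total = 0;
            for (const r of rows) {
              try {
                // Size-based heuristic: file size divided by 4, with a floor of 1 per file.
                const sz = statSync(r.path).size;
                total += Math.max(1, Math.ceil(sz / 4));
              } catch {
                // The file is indexed but could not be stat()ed on disk; skip it.
              }
            }
            totalEstTokens = total;
          }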
    
          return {
            content: [
              {
                type: "text",
                text: JSON.stringify(
                  {
                    scope:
                      project_id === undefined
                        ? "all"
                        : project_id === null
                          ? "knowledge_base"
                          : `project:${project_id}`,
                    file_count: fileCount,
                    untagged_count: untaggedCount,
                    favorite_count: favoriteCount,
                    top_tags: topTags,
                    ...(byProject ? { by_project: byProject } : {}),
                    ...(totalEstTokens !== null ? { total_est_tokens: totalEstTokens } : {}),
                  },
                  null,
                  2
                ),
              },
            ],
          };
        } catch (e: any) {
          return {
            isError: true,
            content: [{ type: "text", text: JSON.stringify({ error: e?.message ?? String(e) }, null, 2) }],
          };
        }
      }
    );
  • The Zod schema for the stats tool defines three optional parameters: project_id (number | null), top_tags (positive int up to 100), and include_token_total (boolean). The tool description explains it provides aggregate counts for a scope.
      {
        project_id: z.number().nullable().optional().describe("Filter to a single project. Pass null for KB-only. Omit for everything."),
        top_tags: z.number().int().positive().max(100).optional().describe("How many top tags to return (default 10)"),
        include_token_total: z.boolean().optional().describe("If true, stat every matching file on disk to compute total est_tokens. Default false (cheap)."),
      },
  • The 'stats' tool is categorized under 'Discovery' in the web UI's tool category mapping.
    stats: "Discovery",
  • The tool is registered on the MCP server via server.tool(...) at line 1428; every tool defined in this file uses the same registration call pattern. The McpServer instance is created at line 134.
    server.tool(
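Two pieces of the handler's logic can be isolated as pure functions: the scope filter that every query shares, and the size-based token estimate. The sketch below restates them under hypothetical names (ScopeFilter, buildScopeFilter, withExtraCondition, estimateTokens); it mirrors the handler's logic but is not the project's code.

```typescript
type ScopeFilter = { where: string; params: number[] };

// null = KB only, number = one project, undefined = no filter.
function buildScopeFilter(projectId: number | null | undefined): ScopeFilter {
  if (projectId === null) {
    return { where: "WHERE files.project_id IS NULL", params: [] };
  }
  if (typeof projectId === "number") {
    return { where: "WHERE files.project_id = ?", params: [projectId] };
  }
  return { where: "", params: [] };
}

// Append one more condition, starting a WHERE clause if none exists yet.
function withExtraCondition(scope: ScopeFilter, condition: string): string {
  return scope.where ? `${scope.where} AND ${condition}` : `WHERE ${condition}`;
}

// File size divided by 4, rounded up, with a floor of 1 per file,
// so even an empty file contributes one estimated token.
function estimateTokens(sizeInBytes: number): number {
  return Math.max(1, Math.ceil(sizeInBytes / 4));
}

const untagged = "files.id NOT IN (SELECT DISTINCT file_id FROM file_tags)";
console.log(withExtraCondition(buildScopeFilter(undefined), untagged));
console.log(withExtraCondition(buildScopeFilter(3), untagged));

console.log(estimateTokens(0));    // 1
console.log(estimateTokens(10));   // 3
console.log(estimateTokens(4096)); // 1024
```

Because the project id travels as a bound parameter rather than being spliced into the SQL string, the only interpolated text is one of two fixed WHERE fragments.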
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries all behavioral disclosure: the tool is read-only, with no side effects, auth requirements, or rate limits. It also explains the performance impact of include_token_total (it stats every matching file, which is slower on large vaults).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Six sentences, front-loaded with purpose and key details. No wasted words; each sentence provides valuable information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all necessary aspects: scope, parameters, performance, usage advice. No output schema, but return values are implied. Slightly more detail on output format would be beneficial, but not critical.
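Since no output schema is published, the shape below is inferred from the handler in the Implementation Reference and should be read as a sketch. The StatsResult interface and untaggedRatio helper are hypothetical names, and the sample values are made up for illustration.

```typescript
// Inferred from the handler's JSON.stringify call; not an official schema.
interface StatsResult {
  scope: string; // "all", "knowledge_base", or "project:<id>"
  file_count: number;
  untagged_count: number;
  favorite_count: number;
  top_tags: { name: string; count: number }[];
  // Present only when project_id was omitted.
  by_project?: { id: number; name: string; files: number }[];
  // Present only when include_token_total was true.
  total_est_tokens?: number;
}

// Share of files without tags, as a quick cleanup signal.
function untaggedRatio(result: StatsResult): number {
  return result.file_count === 0
    ? 0
    : result.untagged_count / result.file_count;
}

// Made-up payload standing in for the text content of a tool response.
const sampleText = JSON.stringify({
  scope: "all",
  file_count: 100,
  untagged_count: 25,
  favorite_count: 4,
  top_tags: [{ name: "notes", count: 12 }],
  by_project: [{ id: 1, name: "demo", files: 60 }],
});

const statsResult = JSON.parse(sampleText) as StatsResult;
console.log(untaggedRatio(statsResult));                  // 0.25
console.log(statsResult.total_est_tokens === undefined);  // true
```

Treating by_project and total_est_tokens as optional matters in practice, because the handler omits the keys entirely rather than sending null.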

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds context for project_id (null vs. omit semantics) and include_token_total (performance trade-off) beyond what the schema provides. top_tags is documented only in the schema, which is sufficient.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it aggregates counts for a scope, listing specific metrics (file_count, untagged_count, etc.) and distinguishes behavior based on project_id omission (returns by_project breakdown). This is specific and differentiates from sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says to use as a cheap dashboard or for spotting untagged content, and for live disk-vs-index drift to use diff_against_disk instead. Provides clear context for when and when not to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/safiyu/ctxnest'