stats
Aggregate file counts, untagged items, favorites, and top tags across your vault or project. Use it as a dashboard to spot cleanup opportunities.
Instructions
Aggregate counts for a scope: file_count, untagged_count, favorite_count, top_tags. With project_id omitted (everything), also returns by_project breakdown. include_token_total: true stat()s every matching file on disk to compute a body-size estimate — measurably slower on large vaults; default false. project_id: null = KB only; omit = all. Read-only; no side effects, auth, or rate limits. Use as a cheap dashboard or to spot untagged content for cleanup; for live disk-vs-index drift use diff_against_disk.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| project_id | No | Filter to a single project. Pass null for KB-only. Omit for everything. | |
| top_tags | No | How many top tags to return (default 10) | |
| include_token_total | No | If true, stat every matching file on disk to compute total est_tokens. Default false (cheap). |
Implementation Reference
- apps/mcp/src/index.ts:1428-1539 (handler)The primary handler for the 'stats' tool. It accepts optional project_id (null=KB, omit=everything), top_tags limit, and include_token_total boolean. Queries the SQLite database for file_count, untagged_count, favorite_count, top_tags, and optionally by_project breakdown. If include_token_total is true, it stats every matching file on disk to compute total estimated tokens. Returns a JSON response with scope, counts, and token estimate.
server.tool( "stats", "Aggregate counts for a scope: `file_count`, `untagged_count`, `favorite_count`, `top_tags`. With `project_id` omitted (everything), also returns `by_project` breakdown. `include_token_total: true` stat()s every matching file on disk to compute a body-size estimate — measurably slower on large vaults; default false. `project_id: null` = KB only; omit = all. Read-only; no side effects, auth, or rate limits. Use as a cheap dashboard or to spot untagged content for cleanup; for live disk-vs-index drift use `diff_against_disk`.", { project_id: z.number().nullable().optional().describe("Filter to a single project. Pass null for KB-only. Omit for everything."), top_tags: z.number().int().positive().max(100).optional().describe("How many top tags to return (default 10)"), include_token_total: z.boolean().optional().describe("If true, stat every matching file on disk to compute total est_tokens. Default false (cheap)."), }, async ({ project_id, top_tags, include_token_total }) => { try { const db = getDatabase(); const limit = top_tags ?? 10; let scopeWhere = ""; const scopeParams: any[] = []; if (project_id === null) { scopeWhere = "WHERE files.project_id IS NULL"; } else if (typeof project_id === "number") { scopeWhere = "WHERE files.project_id = ?"; scopeParams.push(project_id); } const fileCount = (db .prepare(`SELECT COUNT(*) AS n FROM files ${scopeWhere}`) .get(...scopeParams) as { n: number }).n; const untaggedCount = (db .prepare( `SELECT COUNT(*) AS n FROM files ${scopeWhere}${scopeWhere ? " AND" : "WHERE"} files.id NOT IN (SELECT DISTINCT file_id FROM file_tags)` ) .get(...scopeParams) as { n: number }).n; const favoriteCount = (db .prepare( `SELECT COUNT(*) AS n FROM files ${scopeWhere}${scopeWhere ? " AND" : "WHERE"} files.id IN (SELECT file_id FROM favorites)` ) .get(...scopeParams) as { n: number }).n; const topTags = db .prepare( `SELECT t.name, COUNT(*) AS count FROM file_tags ft JOIN tags t ON t.id = ft.tag_id JOIN files ON files.id = ft.file_id ${scopeWhere} GROUP BY t.id ORDER BY count DESC, t.name ASC LIMIT ?` ) .all(...scopeParams, limit) as { name: string; count: number }[]; const byProject = project_id === undefined ? (db .prepare( `SELECT p.id, p.name, COUNT(files.id) AS files FROM projects p LEFT JOIN files ON files.project_id = p.id GROUP BY p.id ORDER BY files DESC, p.name ASC` ) .all() as { id: number; name: string; files: number }[]) : null; let totalEstTokens: number | null = null; if (include_token_total) { const rows = db .prepare(`SELECT path FROM files ${scopeWhere}`) .all(...scopeParams) as { path: string }[]; let total = 0; for (const r of rows) { try { const sz = statSync(r.path).size; total += Math.max(1, Math.ceil(sz / 4)); } catch {} } totalEstTokens = total; } return { content: [ { type: "text", text: JSON.stringify( { scope: project_id === undefined ? "all" : project_id === null ? "knowledge_base" : `project:${project_id}`, file_count: fileCount, untagged_count: untaggedCount, favorite_count: favoriteCount, top_tags: topTags, ...(byProject ? { by_project: byProject } : {}), ...(totalEstTokens !== null ? { total_est_tokens: totalEstTokens } : {}), }, null, 2 ), }, ], }; } catch (e: any) { return { isError: true, content: [{ type: "text", text: JSON.stringify({ error: e?.message ?? String(e) }, null, 2) }], }; } } ); - apps/mcp/src/index.ts:1428-1539 (schema)The Zod schema for the stats tool defines three optional parameters: project_id (number | null), top_tags (positive int up to 100), and include_token_total (boolean). The tool description explains it provides aggregate counts for a scope.
server.tool( "stats", "Aggregate counts for a scope: `file_count`, `untagged_count`, `favorite_count`, `top_tags`. With `project_id` omitted (everything), also returns `by_project` breakdown. `include_token_total: true` stat()s every matching file on disk to compute a body-size estimate — measurably slower on large vaults; default false. `project_id: null` = KB only; omit = all. Read-only; no side effects, auth, or rate limits. Use as a cheap dashboard or to spot untagged content for cleanup; for live disk-vs-index drift use `diff_against_disk`.", { project_id: z.number().nullable().optional().describe("Filter to a single project. Pass null for KB-only. Omit for everything."), top_tags: z.number().int().positive().max(100).optional().describe("How many top tags to return (default 10)"), include_token_total: z.boolean().optional().describe("If true, stat every matching file on disk to compute total est_tokens. Default false (cheap)."), }, async ({ project_id, top_tags, include_token_total }) => { try { const db = getDatabase(); const limit = top_tags ?? 10; let scopeWhere = ""; const scopeParams: any[] = []; if (project_id === null) { scopeWhere = "WHERE files.project_id IS NULL"; } else if (typeof project_id === "number") { scopeWhere = "WHERE files.project_id = ?"; scopeParams.push(project_id); } const fileCount = (db .prepare(`SELECT COUNT(*) AS n FROM files ${scopeWhere}`) .get(...scopeParams) as { n: number }).n; const untaggedCount = (db .prepare( `SELECT COUNT(*) AS n FROM files ${scopeWhere}${scopeWhere ? " AND" : "WHERE"} files.id NOT IN (SELECT DISTINCT file_id FROM file_tags)` ) .get(...scopeParams) as { n: number }).n; const favoriteCount = (db .prepare( `SELECT COUNT(*) AS n FROM files ${scopeWhere}${scopeWhere ? " AND" : "WHERE"} files.id IN (SELECT file_id FROM favorites)` ) .get(...scopeParams) as { n: number }).n; const topTags = db .prepare( `SELECT t.name, COUNT(*) AS count FROM file_tags ft JOIN tags t ON t.id = ft.tag_id JOIN files ON files.id = ft.file_id ${scopeWhere} GROUP BY t.id ORDER BY count DESC, t.name ASC LIMIT ?` ) .all(...scopeParams, limit) as { name: string; count: number }[]; const byProject = project_id === undefined ? (db .prepare( `SELECT p.id, p.name, COUNT(files.id) AS files FROM projects p LEFT JOIN files ON files.project_id = p.id GROUP BY p.id ORDER BY files DESC, p.name ASC` ) .all() as { id: number; name: string; files: number }[]) : null; let totalEstTokens: number | null = null; if (include_token_total) { const rows = db .prepare(`SELECT path FROM files ${scopeWhere}`) .all(...scopeParams) as { path: string }[]; let total = 0; for (const r of rows) { try { const sz = statSync(r.path).size; total += Math.max(1, Math.ceil(sz / 4)); } catch {} } totalEstTokens = total; } return { content: [ { type: "text", text: JSON.stringify( { scope: project_id === undefined ? "all" : project_id === null ? "knowledge_base" : `project:${project_id}`, file_count: fileCount, untagged_count: untaggedCount, favorite_count: favoriteCount, top_tags: topTags, ...(byProject ? { by_project: byProject } : {}), ...(totalEstTokens !== null ? { total_est_tokens: totalEstTokens } : {}), }, null, 2 ), }, ], }; } catch (e: any) { return { isError: true, content: [{ type: "text", text: JSON.stringify({ error: e?.message ?? String(e) }, null, 2) }], }; } } ); - apps/web/src/lib/mcp-tool-categories.ts:66-66 (registration)The 'stats' tool is categorized under 'Discovery' in the web UI's tool category mapping.
stats: "Discovery", - apps/mcp/src/index.ts:141-141 (registration)The tool is registered on the MCP server via server.tool(...) at line 1428 (though the registration call pattern is consistent across all tools defined in this file). The McpServer instance is created at line 134.
server.tool(