verify_index

Read-only · Idempotent

Checks the local SQLite index for structural integrity, foreign-key violations, and embedding consistency. Returns a report with status and suggested repairs. Use as a preflight before reindexing.

Instructions

Read-only structural check of the local SQLite index: SQLite integrity_check, foreign-key violations, required-table presence, FTS5 integrity-check, embedding dimension consistency, and orphan embedding detection. Returns a check-by-check report with status (ok/warn/error) and a suggested repair mode for any non-ok finding. Never writes. Use as a preflight before reindex/embed_repo or when search is misbehaving. Returns JSON: { ok, status, checks: [{ name, status, detail, count?, suggested_repair? }] }.
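
As a rough illustration of how a caller might act on the report shape described above, the sketch below groups every non-ok check by its `suggested_repair` value so that each repair mode appears once. The `planRepairs` helper and the sample report are hypothetical and not part of the server; only the field names come from the description.

```typescript
// Shapes follow the JSON described above:
// { ok, status, checks: [{ name, status, detail, count?, suggested_repair? }] }
type VerifyCheckStatus = 'ok' | 'warn' | 'error';

interface VerifyCheck {
  name: string;
  status: VerifyCheckStatus;
  detail: string;
  count?: number;
  suggested_repair?: string;
}

interface VerifyReport {
  ok: boolean;
  status: VerifyCheckStatus;
  checks: VerifyCheck[];
}

// Hypothetical helper (an assumption, not the server's API): map each
// suggested repair mode to the names of the non-ok checks that asked for it.
function planRepairs(report: VerifyReport): Record<string, string[]> {
  const plan: Record<string, string[]> = {};
  for (const check of report.checks) {
    // Skip passing checks and findings that carry no repair suggestion.
    if (check.status === 'ok' || check.suggested_repair === undefined) continue;
    const names = plan[check.suggested_repair] ?? [];
    names.push(check.name);
    plan[check.suggested_repair] = names;
  }
  return plan;
}

// Illustrative report; the values are made up for this example.
const sampleReport: VerifyReport = {
  ok: false,
  status: 'error',
  checks: [
    { name: 'sqlite_integrity', status: 'ok', detail: 'PRAGMA integrity_check: ok' },
    { name: 'foreign_keys', status: 'error', detail: '2 foreign-key violation(s)', count: 2, suggested_repair: 'drop-orphans' },
    { name: 'orphan_embeddings', status: 'warn', detail: '5 embedding row(s) reference deleted symbols', count: 5, suggested_repair: 'drop-orphans' },
    { name: 'fts_integrity', status: 'warn', detail: 'symbols_fts integrity-check failed', suggested_repair: 'rebuild-fts' },
  ],
};

const repairPlan = planRepairs(sampleReport);
```

Grouping by repair mode means a single `drop-orphans` run can be scheduled for both findings that request it, instead of once per check.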

Input Schema

Name | Required | Description | Default

No arguments
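
Because the tool takes no arguments, a call reduces to the standard MCP `tools/call` JSON-RPC envelope with an empty `arguments` object. The sketch below only builds that request payload. The `buildVerifyIndexRequest` helper and the request id are illustrative assumptions, and transport details (stdio, HTTP) are left out.

```typescript
// JSON-RPC 2.0 envelope for an MCP `tools/call` request.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper (not part of the server): verify_index has an empty
// input schema, so `arguments` is always an empty object.
function buildVerifyIndexRequest(id: number): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'verify_index', arguments: {} },
  };
}

// Serialized form that would be written to the transport.
const verifyRequest = buildVerifyIndexRequest(1);
const verifyRequestWire = JSON.stringify(verifyRequest);
```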

Implementation Reference

  • The core handler function `verifyIndex` that executes the read-only structural checks: SQLite integrity_check, foreign-key violations, required-table presence, FTS5 integrity-check, embedding dimension consistency, and orphan embedding detection. Returns a VerifyReport.
    export function verifyIndex(db: Database.Database): VerifyReport {
      const checks: VerifyCheck[] = [];
    
      // ── 1. SQLite integrity_check ─────────────────────────────────────────
      try {
        const rows = db.prepare('PRAGMA integrity_check').all() as IntegrityRow[];
        if (rows.length === 1 && rows[0].integrity_check === 'ok') {
          checks.push({ name: 'sqlite_integrity', status: 'ok', detail: 'PRAGMA integrity_check: ok' });
        } else {
          checks.push({
            name: 'sqlite_integrity',
            status: 'error',
            detail: `PRAGMA integrity_check returned ${rows.length} issues: ${rows
              .slice(0, 3)
              .map((r) => r.integrity_check)
              .join('; ')}${rows.length > 3 ? '; …' : ''}`,
            suggested_repair: 'reindex --force',
          });
        }
      } catch (e) {
        checks.push({
          name: 'sqlite_integrity',
          status: 'error',
          detail: `PRAGMA integrity_check failed: ${e instanceof Error ? e.message : String(e)}`,
          suggested_repair: 'reindex --force',
        });
      }
    
      // ── 2. Foreign-key check ──────────────────────────────────────────────
      try {
        db.pragma('foreign_keys = ON');
        const violations = db.prepare('PRAGMA foreign_key_check').all() as FkViolationRow[];
        if (violations.length === 0) {
          checks.push({ name: 'foreign_keys', status: 'ok', detail: 'No foreign-key violations' });
        } else {
          checks.push({
            name: 'foreign_keys',
            status: 'error',
            detail: `${violations.length} foreign-key violation(s): ${violations
              .slice(0, 3)
              .map((v) => `${v.table}#${v.rowid} → ${v.parent}`)
              .join(', ')}${violations.length > 3 ? ', …' : ''}`,
            count: violations.length,
            suggested_repair: 'drop-orphans',
          });
        }
      } catch (e) {
        checks.push({
          name: 'foreign_keys',
          status: 'warn',
          detail: `foreign_key_check probe failed: ${e instanceof Error ? e.message : String(e)}`,
        });
      }
    
      // ── 3. Required tables ────────────────────────────────────────────────
      const missing: string[] = [];
      for (const t of REQUIRED_TABLES) {
        if (!tableExists(db, t)) missing.push(t);
      }
      if (missing.length === 0) {
        checks.push({
          name: 'required_tables',
          status: 'ok',
          detail: `All ${REQUIRED_TABLES.length} required tables present`,
        });
      } else {
        checks.push({
          name: 'required_tables',
          status: 'error',
          detail: `Missing tables: ${missing.join(', ')}`,
          suggested_repair: 'reindex --force',
        });
      }
    
      // ── 4. FTS5 integrity probe ───────────────────────────────────────────
      // symbols_fts uses external-content (content='symbols'), so COUNT(*) on it
      // always matches the source table — it isn't a useful drift signal. The
      // FTS5 'integrity-check' command, on the other hand, walks the inverted
      // index and reports physical corruption.
      if (tableExists(db, 'symbols_fts')) {
        try {
          db.prepare(`INSERT INTO symbols_fts(symbols_fts) VALUES ('integrity-check')`).run();
          checks.push({
            name: 'fts_integrity',
            status: 'ok',
            detail: 'symbols_fts integrity-check passed',
          });
        } catch (e) {
          checks.push({
            name: 'fts_integrity',
            status: 'warn',
            detail: `symbols_fts integrity-check failed: ${e instanceof Error ? e.message : String(e)}`,
            suggested_repair: 'rebuild-fts',
          });
        }
      }
    
      // ── 5. Embedding dimension consistency ────────────────────────────────
      if (tableExists(db, 'symbol_embeddings') && tableExists(db, 'embedding_meta')) {
        let expectedDim: number | null = null;
        try {
          const meta = db.prepare('SELECT dim FROM embedding_meta WHERE id = 1').get() as
            | { dim: number }
            | undefined;
          expectedDim = meta?.dim ?? null;
        } catch {
          expectedDim = null;
        }
        const total = tableRowCount(db, 'symbol_embeddings');
        if (total === 0) {
          checks.push({ name: 'embedding_dim', status: 'ok', detail: 'No embeddings yet' });
        } else if (expectedDim === null) {
          checks.push({
            name: 'embedding_dim',
            status: 'warn',
            detail: `${total} embeddings but no embedding_meta.dim — dimension cannot be verified`,
            count: total,
            suggested_repair: 'drop-vec',
          });
        } else {
          const expectedBytes = expectedDim * 4; // Float32
          const stmt = db.prepare(
            'SELECT COUNT(*) AS c FROM symbol_embeddings WHERE LENGTH(embedding) != ?',
          );
          const r = stmt.get(expectedBytes) as { c: number };
          const wrong = r?.c ?? 0;
          if (wrong === 0) {
            checks.push({
              name: 'embedding_dim',
              status: 'ok',
              detail: `${total} embeddings × ${expectedDim}d match`,
            });
          } else {
            checks.push({
              name: 'embedding_dim',
              status: 'error',
              detail: `${wrong} of ${total} embeddings have a wrong byte length (expected ${expectedBytes} for ${expectedDim}d)`,
              count: wrong,
              suggested_repair: 'drop-vec',
            });
          }
        }
      }
    
      // ── 6. Orphan embeddings ──────────────────────────────────────────────
      if (tableExists(db, 'symbol_embeddings') && tableExists(db, 'symbols')) {
        try {
          const r = db
            .prepare(
              'SELECT COUNT(*) AS c FROM symbol_embeddings e LEFT JOIN symbols s ON s.id = e.symbol_id WHERE s.id IS NULL',
            )
            .get() as { c: number };
          const orphans = r?.c ?? 0;
          if (orphans === 0) {
            checks.push({ name: 'orphan_embeddings', status: 'ok', detail: 'No orphan embeddings' });
          } else {
            checks.push({
              name: 'orphan_embeddings',
              status: 'warn',
              detail: `${orphans} embedding row(s) reference deleted symbols`,
              count: orphans,
              suggested_repair: 'drop-orphans',
            });
          }
        } catch (e) {
          checks.push({
            name: 'orphan_embeddings',
            status: 'warn',
            detail: `Orphan probe failed: ${e instanceof Error ? e.message : String(e)}`,
          });
        }
      }
    
      const status: VerifyCheckStatus = checks.some((c) => c.status === 'error')
        ? 'error'
        : checks.some((c) => c.status === 'warn')
          ? 'warn'
          : 'ok';
      return { ok: status === 'ok', status, checks };
    }
  • Type definitions for `VerifyCheck` (individual check result with name, status, detail, count, suggested_repair) and `VerifyReport` (aggregate ok/status/checks). Also `VerifyCheckStatus` union type.
    export type VerifyCheckStatus = 'ok' | 'warn' | 'error';
    
    export interface VerifyCheck {
      name: string;
      status: VerifyCheckStatus;
      detail: string;
      /** Optional row-count or delta surfaced for the human reader. */
      count?: number;
      /** Suggested repair mode to clear this check (when applicable). */
      suggested_repair?: string;
    }
    
    export interface VerifyReport {
      ok: boolean;
      /** Highest severity surfaced. */
      status: VerifyCheckStatus;
      checks: VerifyCheck[];
    }
  • MCP tool registration for 'verify_index' using `server.tool()`. Describes the tool as a read-only structural check of the SQLite index. Calls `verifyIndex(store.db)` and returns JSON report.
    server.tool(
      'verify_index',
      'Read-only structural check of the local SQLite index: SQLite integrity_check, foreign-key violations, required-table presence, FTS5 integrity-check, embedding dimension consistency, and orphan embedding detection. Returns a check-by-check report with status (ok/warn/error) and a suggested repair mode for any non-ok finding. Never writes. Use as a preflight before reindex/embed_repo or when search is misbehaving. Returns JSON: { ok, status, checks: [{ name, status, detail, count?, suggested_repair? }] }.',
      {},
      async () => {
        const report = verifyIndex(store.db);
        return {
          content: [{ type: 'text', text: j(report) }],
          isError: report.status === 'error',
        };
      },
    );
  • Imports: `verifyIndex` is imported from `../../db/verify.js` at line 7. The registration function `registerCoreTools` is exported at line 19 and called from the server setup.
    import path from 'node:path';
    import type { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
    import { z } from 'zod';
    import { optionalNonEmptyString } from './_zod-helpers.js';
    import { EmbeddingPipeline } from '../../ai/embedding-pipeline.js';
    import { repairIndex, type RepairMode } from '../../db/repair.js';
    import { verifyIndex } from '../../db/verify.js';
    import { LOCKS_DIR, projectHash } from '../../global.js';
    import { IndexingPipeline } from '../../indexer/pipeline.js';
    import { buildProjectContext } from '../../indexer/project-context.js';
    import { shouldSkipRecentReindex } from '../../indexer/recent-reindex-cache.js';
    import { logger } from '../../logger.js';
    import type { ServerContext } from '../../server/types.js';
    import { LockError, withLock } from '../../utils/pid-lock.js';
    import { checkFileForDuplicates } from '../analysis/duplication.js';
    import { getMinimalContext } from '../project/minimal-context.js';
    import { getIndexHealth, getProjectMap } from '../project/project.js';
    
    export function registerCoreTools(server: McpServer, ctx: ServerContext): void {
      const {
        store,
        registry,
        config,
        projectRoot,
        guardPath,
  • Helper utilities used by `verifyIndex`: `rowExists`, `tableExists`, and `tableRowCount` for database introspection.
    function rowExists(db: Database.Database, sql: string): boolean {
      try {
        const r = db.prepare(sql).get();
        return r !== undefined;
      } catch {
        return false;
      }
    }
    
    function tableExists(db: Database.Database, name: string): boolean {
      return rowExists(
        db,
        `SELECT name FROM sqlite_master WHERE type IN ('table','view') AND name = '${name.replace(/'/g, "''")}'`,
      );
    }
    
    function tableRowCount(db: Database.Database, name: string): number {
      try {
        const r = db.prepare(`SELECT COUNT(*) AS c FROM ${name}`).get() as { c: number };
        return r?.c ?? 0;
      } catch {
        return 0;
      }
    }
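
The embedding dimension check in `verifyIndex` relies on each vector being stored as a Float32 blob, so a row is valid only when its byte length equals `dim * 4`. The sketch below reproduces that arithmetic in memory, without a database. The helper names and the 384-dimension figure are assumptions chosen for illustration.

```typescript
// Float32 components occupy 4 bytes each.
const BYTES_PER_FLOAT32 = Float32Array.BYTES_PER_ELEMENT;

// Expected blob size for an embedding with `dim` Float32 components.
function expectedEmbeddingBytes(dim: number): number {
  return dim * BYTES_PER_FLOAT32;
}

// Hypothetical in-memory counterpart of the SQL filter
// `LENGTH(embedding) != ?`: count blobs whose size does not match.
function countWrongLength(blobs: Uint8Array[], dim: number): number {
  const expected = expectedEmbeddingBytes(dim);
  return blobs.filter((blob) => blob.byteLength !== expected).length;
}

// A 384-d vector serialized as Float32 occupies 384 * 4 = 1536 bytes;
// a 256-d vector occupies 1024 bytes and is flagged as a mismatch.
const goodBlob = new Uint8Array(new Float32Array(384).buffer);
const truncatedBlob = new Uint8Array(new Float32Array(256).buffer);
const wrongCount = countWrongLength([goodBlob, goodBlob, truncatedBlob], 384);
```

A length comparison of this kind detects vectors written with a different model dimension, but it cannot detect a blob of the right size that holds corrupted values.
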
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description adds significant behavioral context beyond annotations: lists specific checks (SQLite integrity, foreign keys, FTS5, embeddings, etc.), describes the return format (check-by-check report with status and suggested repair), and confirms read-only nature ('Never writes'). Annotations already indicate readOnlyHint, but description enriches with exact verifications.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five short sentences: the first lists all checks, and the rest cover the report format, the read-only guarantee, usage context, and the JSON shape. Front-loaded with purpose. Every sentence adds value; no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given zero parameters, no output schema, and a simple read-only check, the description is thorough. It lists all checks, specifies the output format, gives usage guidance, and states that the tool never writes. It adequately covers the tool's role among 70+ siblings.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has zero parameters, so baseline is 4. Description adds no parameter info (none exist) but provides output structure details, which is helpful. No further parameter semantics needed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description states a specific verb-resource (verify index) and enumerates exact checks: integrity, foreign keys, table presence, FTS5, embedding dimension, orphan detection. It distinguishes from siblings by positioning as a preflight before reindex/embed_repo and explicitly noting it never writes, contrasting with mutation tools like repair_index.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use as a preflight before reindex/embed_repo or when search is misbehaving', which gives clear context. Also states 'Never writes', which signals when not to use it (when a write is needed) and indirectly points to the repairing or writing siblings for those cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/nikolai-vysotskyi/trace-mcp'
