gia_retrieve

Read-only, Idempotent

Retrieve documents with governance-enforced semantic search, including permission checks, audit logging, and charter-based access restrictions.

Instructions

Governed semantic search — hash-verified, permission-checked, TTL-enforced document retrieval with full audit trail. Every retrieval is logged: what was retrieved, was it authorized, was it tampered with. When charter_id is provided, the retrieval is bound to that charter's contextAccess enforcement (domain allow/denyList, trust floor, classification floor, max chunks per query) and refusals are recorded with full provenance. classification_floor (MANDATORY|ADVISORY|INFORMATIONAL) lets callers demand the stricter of their own floor vs the charter's — chunks below the effective floor are denied with CLASSIFICATION_BELOW_FLOOR. Classification: ADVISORY — read-only search, results are permission-gated.
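
The stricter-wins rule for classification_floor can be illustrated with a minimal sketch. It assumes the ordering MANDATORY > ADVISORY > INFORMATIONAL given in the ingestion tool's schema; the helper names `effectiveFloor` and `meetsFloor` are hypothetical, and the server's actual enforcement code is not shown on this page.

```typescript
// Hypothetical helpers: an illustration of stricter-wins, not the server's code.
type Classification = 'MANDATORY' | 'ADVISORY' | 'INFORMATIONAL';

// Higher rank means stricter, following the documented ordering
// MANDATORY > ADVISORY > INFORMATIONAL.
const RANK: Record<Classification, number> = {
  INFORMATIONAL: 0,
  ADVISORY: 1,
  MANDATORY: 2,
};

// Stricter-wins: the effective floor is the higher-ranked of the two floors.
// If only one floor is set, that floor applies; if neither is set, there is none.
function effectiveFloor(
  callerFloor?: Classification,
  charterFloor?: Classification,
): Classification | undefined {
  if (!callerFloor) return charterFloor;
  if (!charterFloor) return callerFloor;
  return RANK[callerFloor] >= RANK[charterFloor] ? callerFloor : charterFloor;
}

// A chunk passes when its classification ranks at or above the effective floor.
function meetsFloor(chunk: Classification, floor?: Classification): boolean {
  return floor === undefined || RANK[chunk] >= RANK[floor];
}

// A caller asking for INFORMATIONAL cannot relax a charter floor of ADVISORY.
const floorForCall = effectiveFloor('INFORMATIONAL', 'ADVISORY');
const informationalChunkPasses = meetsFloor('INFORMATIONAL', floorForCall);
```

Under this reading, a caller can only raise the floor above the charter's, never lower it, which matches the statement that callers cannot relax charter authority.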

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| query | Yes | Search query — what information to find | |
| domain | Yes | Domain to search within (e.g., va-claims, finance, eu-ai-act) | |
| max_results | No | Maximum chunks to return (default 5) | 5 |
| similarity_threshold | No | Minimum similarity score 0-1 (default 0.7) | 0.7 |
| agent_id | Yes | Agent performing the retrieval | |
| run_id | No | Current run/pipeline ID | |
| charter_id | No | Optional charter ID — when set, retrieval is enforced against the charter's contextAccess block (domain allow/denyList, trust floor, classification floor, max chunks). Refusals are audit-logged with refusedAtCharterGate marker. | |
| classification_floor | No | Minimum chunk classification to return. Combined with charter floor via stricter-wins — callers cannot relax charter authority. | |
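
As a rough illustration of how the parameters listed above reach the backend, the sketch below mirrors the snake_case to camelCase mapping performed by the handler in the Implementation Reference, including wrapping the single domain in a `domains` array. The helper `toSearchBody` and the example values are hypothetical; the defaults (5 and 0.7) come from the schema.

```typescript
// Argument shape taken from the input schema.
interface GiaRetrieveArgs {
  query: string;
  domain: string;
  agent_id: string;
  max_results?: number;
  similarity_threshold?: number;
  run_id?: string;
  charter_id?: string;
  classification_floor?: 'MANDATORY' | 'ADVISORY' | 'INFORMATIONAL';
}

// Hypothetical helper (not in the server code): applies the schema defaults
// and builds the camelCase body that the handler POSTs to /api/retrieval/search.
function toSearchBody(args: GiaRetrieveArgs) {
  return {
    query: args.query,
    domains: [args.domain], // the API takes a list; the tool accepts one domain
    maxResults: args.max_results ?? 5,
    similarityThreshold: args.similarity_threshold ?? 0.7,
    agentId: args.agent_id,
    runId: args.run_id,
    charterId: args.charter_id,
    classificationFloor: args.classification_floor,
  };
}

// Example values are illustrative only.
const searchBody = toSearchBody({
  query: 'appeal filing deadlines',
  domain: 'va-claims',
  agent_id: 'agent-123',
  classification_floor: 'ADVISORY',
});
```

Only query, domain, and agent_id are required; omitted optional fields are passed through as undefined, except for the two with schema defaults.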

Implementation Reference

  • The MCP tool handler for 'gia_retrieve' registered via server.tool(). Performs governed semantic search by making a POST request to /api/retrieval/search with query, domain, max_results, similarity_threshold, agent_id, run_id, charter_id, and classification_floor parameters. Returns results or structured error payload.
    server.tool(
      'gia_retrieve',
      'Governed semantic search — hash-verified, permission-checked, TTL-enforced document retrieval with full audit trail. Every retrieval is logged: what was retrieved, was it authorized, was it tampered with. When charter_id is provided, the retrieval is bound to that charter\'s contextAccess enforcement (domain allow/denyList, trust floor, classification floor, max chunks per query) and refusals are recorded with full provenance. classification_floor (MANDATORY|ADVISORY|INFORMATIONAL) lets callers demand the stricter of their own floor vs the charter\'s — chunks below the effective floor are denied with CLASSIFICATION_BELOW_FLOOR. Classification: ADVISORY — read-only search, results are permission-gated.',
      {
        query: z.string().describe('Search query — what information to find'),
        domain: z.string().describe('Domain to search within (e.g., va-claims, finance, eu-ai-act)'),
        max_results: z.number().optional().default(5).describe('Maximum chunks to return (default 5)'),
        similarity_threshold: z.number().optional().default(0.7).describe('Minimum similarity score 0-1 (default 0.7)'),
        agent_id: z.string().describe('Agent performing the retrieval'),
        run_id: z.string().optional().describe('Current run/pipeline ID'),
        charter_id: z.string().optional().describe('Optional charter ID — when set, retrieval is enforced against the charter\'s contextAccess block (domain allow/denyList, trust floor, classification floor, max chunks). Refusals are audit-logged with refusedAtCharterGate marker.'),
        classification_floor: z.enum(['MANDATORY', 'ADVISORY', 'INFORMATIONAL']).optional().describe('Minimum chunk classification to return. Combined with charter floor via stricter-wins — callers cannot relax charter authority.'),
      },
      {
        title: 'Governed Document Retrieval',
        readOnlyHint: true,
        idempotentHint: true,
        destructiveHint: false,
        openWorldHint: false,
      },
      async (args) => {
        try {
          const result = await apiCall<unknown>('/api/retrieval/search', 'POST', {
            query: args.query,
            domains: [args.domain],
            maxResults: args.max_results,
            similarityThreshold: args.similarity_threshold,
            agentId: args.agent_id,
            runId: args.run_id,
            charterId: args.charter_id,
            classificationFloor: args.classification_floor,
          });
    
          return {
            content: [{
              type: 'text' as const,
              text: JSON.stringify(result, null, 2),
            }],
          };
        } catch (err: unknown) {
          return {
            content: [{
              type: 'text' as const,
              text: JSON.stringify(
                errorPayload(err, 'gia_retrieve', {
                  query: args.query,
                  domain: args.domain,
                  charter_id: args.charter_id,
                  classification_floor: args.classification_floor,
                }),
                null,
                2,
              ),
            }],
            isError: true,
          };
        }
      }
    );
  • Zod schema definitions for the input parameters of gia_retrieve: query (string), domain (string), max_results (optional number, default 5), similarity_threshold (optional number, default 0.7), agent_id (string), run_id (optional string), charter_id (optional string), classification_floor (optional enum of MANDATORY|ADVISORY|INFORMATIONAL).
      query: z.string().describe('Search query — what information to find'),
      domain: z.string().describe('Domain to search within (e.g., va-claims, finance, eu-ai-act)'),
      max_results: z.number().optional().default(5).describe('Maximum chunks to return (default 5)'),
      similarity_threshold: z.number().optional().default(0.7).describe('Minimum similarity score 0-1 (default 0.7)'),
      agent_id: z.string().describe('Agent performing the retrieval'),
      run_id: z.string().optional().describe('Current run/pipeline ID'),
      charter_id: z.string().optional().describe('Optional charter ID — when set, retrieval is enforced against the charter\'s contextAccess block (domain allow/denyList, trust floor, classification floor, max chunks). Refusals are audit-logged with refusedAtCharterGate marker.'),
      classification_floor: z.enum(['MANDATORY', 'ADVISORY', 'INFORMATIONAL']).optional().describe('Minimum chunk classification to return. Combined with charter floor via stricter-wins — callers cannot relax charter authority.'),
    },
  • Registration of gia_retrieve via registerGovernedRetrievalTools(), called at the tenant visibility tier. Uses GOVERNED_RETRIEVAL_TIER='tenant' and is registered outside the standard TOOL_REGISTRY loop because it doesn't require an engine parameter.
    // Governed retrieval (special: no engine param)
    if (allowedTiers.has(GOVERNED_RETRIEVAL_TIER)) {
      registerGovernedRetrievalTools(instrumentedServer);
      registeredCount++;
    }
  • The exported function registerGovernedRetrievalTools, which registers gia_retrieve (and gia_ingest_document) with the MCP server. It is called from server.ts with the instrumented server.
    export function registerGovernedRetrievalTools(server: McpServer): void {
    
      // =========================================================================
      // gia_retrieve — Governed semantic search
      // =========================================================================
      server.tool(
        'gia_retrieve',
        'Governed semantic search — hash-verified, permission-checked, TTL-enforced document retrieval with full audit trail. Every retrieval is logged: what was retrieved, was it authorized, was it tampered with. When charter_id is provided, the retrieval is bound to that charter\'s contextAccess enforcement (domain allow/denyList, trust floor, classification floor, max chunks per query) and refusals are recorded with full provenance. classification_floor (MANDATORY|ADVISORY|INFORMATIONAL) lets callers demand the stricter of their own floor vs the charter\'s — chunks below the effective floor are denied with CLASSIFICATION_BELOW_FLOOR. Classification: ADVISORY — read-only search, results are permission-gated.',
        {
          query: z.string().describe('Search query — what information to find'),
          domain: z.string().describe('Domain to search within (e.g., va-claims, finance, eu-ai-act)'),
          max_results: z.number().optional().default(5).describe('Maximum chunks to return (default 5)'),
          similarity_threshold: z.number().optional().default(0.7).describe('Minimum similarity score 0-1 (default 0.7)'),
          agent_id: z.string().describe('Agent performing the retrieval'),
          run_id: z.string().optional().describe('Current run/pipeline ID'),
          charter_id: z.string().optional().describe('Optional charter ID — when set, retrieval is enforced against the charter\'s contextAccess block (domain allow/denyList, trust floor, classification floor, max chunks). Refusals are audit-logged with refusedAtCharterGate marker.'),
          classification_floor: z.enum(['MANDATORY', 'ADVISORY', 'INFORMATIONAL']).optional().describe('Minimum chunk classification to return. Combined with charter floor via stricter-wins — callers cannot relax charter authority.'),
        },
        {
          title: 'Governed Document Retrieval',
          readOnlyHint: true,
          idempotentHint: true,
          destructiveHint: false,
          openWorldHint: false,
        },
        async (args) => {
          try {
            const result = await apiCall<unknown>('/api/retrieval/search', 'POST', {
              query: args.query,
              domains: [args.domain],
              maxResults: args.max_results,
              similarityThreshold: args.similarity_threshold,
              agentId: args.agent_id,
              runId: args.run_id,
              charterId: args.charter_id,
              classificationFloor: args.classification_floor,
            });
    
            return {
              content: [{
                type: 'text' as const,
                text: JSON.stringify(result, null, 2),
              }],
            };
          } catch (err: unknown) {
            return {
              content: [{
                type: 'text' as const,
                text: JSON.stringify(
                  errorPayload(err, 'gia_retrieve', {
                    query: args.query,
                    domain: args.domain,
                    charter_id: args.charter_id,
                    classification_floor: args.classification_floor,
                  }),
                  null,
                  2,
                ),
              }],
              isError: true,
            };
          }
        }
      );
    
      // =========================================================================
      // gia_ingest_document — Governed document ingestion
      // =========================================================================
      server.tool(
        'gia_ingest_document',
        'Governed document ingestion — upload text content for governed retrieval. Content is chunked, embedded, hash-verified, and stored with full audit trail. Each chunk gets SHA-256 integrity hash. Classification: ADVISORY — creates governed content, audited.',
        {
          title: z.string().describe('Document title'),
          content: z.string().describe('Full text content to ingest'),
          domain: z.string().describe('Domain classification (e.g., va-claims, finance, eu-ai-act)'),
          trust_level: z.enum(['SYSTEM', 'ORG', 'CASE', 'EPHEMERAL']).default('CASE').describe('Trust level (SYSTEM > ORG > CASE > EPHEMERAL)'),
          classification: z.enum(['MANDATORY', 'ADVISORY', 'INFORMATIONAL']).optional().describe('Document classification. Controls which retrievals can surface it when a classification floor is in force (MANDATORY > ADVISORY > INFORMATIONAL). Defaults to ADVISORY.'),
          allowed_roles: z.array(z.string()).optional().describe('Roles allowed to retrieve this document'),
          ttl_hours: z.number().optional().describe('Time-to-live in hours (auto-expires)'),
        },
        {
          title: 'Governed Document Ingestion',
          readOnlyHint: false,
          idempotentHint: false,
          destructiveHint: false,
          openWorldHint: false,
        },
        async (args) => {
          try {
            const result = await apiCall<unknown>('/api/retrieval/ingest', 'POST', {
              title: args.title,
              content: args.content,
              filename: `${args.title.toLowerCase().replace(/\s+/g, '-')}.txt`,
              domain: args.domain,
              trustLevel: args.trust_level,
              classification: args.classification,
              allowedRoles: args.allowed_roles,
              ttlHours: args.ttl_hours,
            });
    
            return {
              content: [{
                type: 'text' as const,
                text: JSON.stringify(result, null, 2),
              }],
            };
          } catch (err: unknown) {
            return {
              content: [{
                type: 'text' as const,
                text: JSON.stringify(
                  errorPayload(err, 'gia_ingest_document', {
                    title: args.title,
                    domain: args.domain,
                    classification: args.classification,
                  }),
                  null,
                  2,
                ),
              }],
              isError: true,
            };
          }
        }
      );
    }
  • The errorPayload helper function that formats errors for gia_retrieve (and gia_ingest_document) failures. Handles RetrievalApiError instances specially to preserve structured error codes returned by the API.
    function errorPayload(
      err: unknown,
      tool: string,
      context: Record<string, unknown>,
    ): { error: true; code: string; message: string; httpStatus?: number; details?: Record<string, unknown>; tool: string } & Record<string, unknown> {
      if (err instanceof RetrievalApiError) {
        return {
          error: true,
          code: err.code,
          message: err.message,
          httpStatus: err.httpStatus,
          details: err.details,
          tool,
          ...context,
        };
      }
      return {
        error: true,
        code: 'INTERNAL_ERROR',
        message: errMsg(err),
        tool,
        ...context,
      };
    }
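
The errorPayload helper depends on RetrievalApiError and errMsg, neither of which is shown in this listing. The sketch below supplies hypothetical stand-ins for both, plus a condensed copy of errorPayload, so that the two error paths can be exercised. The class fields are inferred from how errorPayload reads them; the real definitions may differ, and the 403 status is illustrative only.

```typescript
// Hypothetical stand-in: fields inferred from how errorPayload uses the class.
class RetrievalApiError extends Error {
  code: string;
  httpStatus: number | undefined;
  details: Record<string, unknown> | undefined;

  constructor(
    message: string,
    code: string,
    httpStatus?: number,
    details?: Record<string, unknown>,
  ) {
    super(message);
    this.name = 'RetrievalApiError';
    this.code = code;
    this.httpStatus = httpStatus;
    this.details = details;
  }
}

// Hypothetical stand-in: turns any thrown value into a message string.
function errMsg(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

// Condensed copy of the helper above, with a simplified return type.
function errorPayload(
  err: unknown,
  tool: string,
  context: Record<string, unknown>,
): Record<string, unknown> {
  if (err instanceof RetrievalApiError) {
    return {
      error: true,
      code: err.code,
      message: err.message,
      httpStatus: err.httpStatus,
      details: err.details,
      tool,
      ...context,
    };
  }
  return { error: true, code: 'INTERNAL_ERROR', message: errMsg(err), tool, ...context };
}

// Structured API error: the code returned by the API is preserved.
const refused = errorPayload(
  new RetrievalApiError('chunk below floor', 'CLASSIFICATION_BELOW_FLOOR', 403),
  'gia_retrieve',
  { domain: 'finance' },
);

// Any other failure collapses to INTERNAL_ERROR.
const unknownFailure = errorPayload(new Error('socket hang up'), 'gia_retrieve', {
  domain: 'finance',
});
```

Because the context object is spread last, a context key named `code` or `tool` would overwrite the structured field of the same name. The handlers above only pass keys such as query, domain, and charter_id, so no collision occurs there.
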
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond the annotations (readOnlyHint, destructiveHint, idempotentHint), the description reveals hash verification, permission checks, TTL enforcement, audit logging, charter context enforcement with domain allow/denyLists, trust floor, classification floor, and max chunks, and the 'stricter-wins' rule for classification floors. It does not contradict the annotations.
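
To make the charter controls named here more concrete, the following sketch shows one way a contextAccess gate could be evaluated. Everything in it is an assumption: the page names the controls (domain allow/denyList, trust floor, max chunks per query) but not their schema, so the field names, the deny-over-allow precedence, and the helpers are hypothetical. The trust ordering SYSTEM > ORG > CASE > EPHEMERAL is taken from the ingestion tool's schema.

```typescript
// Hypothetical shapes: field names are assumptions, not the server's schema.
type TrustLevel = 'SYSTEM' | 'ORG' | 'CASE' | 'EPHEMERAL';

interface ContextAccess {
  allowList?: string[];
  denyList?: string[];
  trustFloor?: TrustLevel;
  maxChunksPerQuery?: number;
}

interface RetrievedChunk {
  id: string;
  trustLevel: TrustLevel;
}

// Documented ordering: SYSTEM > ORG > CASE > EPHEMERAL.
const TRUST_RANK: Record<TrustLevel, number> = {
  EPHEMERAL: 0,
  CASE: 1,
  ORG: 2,
  SYSTEM: 3,
};

// Domain gate. Assumed precedence: a deny entry always wins, and an allow
// list, when present, must contain the domain.
function domainPermitted(access: ContextAccess, domain: string): boolean {
  if (access.denyList && access.denyList.indexOf(domain) !== -1) return false;
  if (access.allowList && access.allowList.indexOf(domain) === -1) return false;
  return true;
}

// Chunk gate: drop chunks under the trust floor, then cap the count at the
// smaller of the requested size and the charter's per-query maximum.
function applyChunkLimits(
  access: ContextAccess,
  chunks: RetrievedChunk[],
  requested: number,
): RetrievedChunk[] {
  const floor = access.trustFloor;
  const trusted = floor
    ? chunks.filter((c) => TRUST_RANK[c.trustLevel] >= TRUST_RANK[floor])
    : chunks;
  const cap = Math.min(requested, access.maxChunksPerQuery ?? requested);
  return trusted.slice(0, cap);
}

// Illustrative data only.
const sampleChunks: RetrievedChunk[] = [
  { id: 'a', trustLevel: 'SYSTEM' },
  { id: 'b', trustLevel: 'CASE' },
  { id: 'c', trustLevel: 'ORG' },
];
const limited = applyChunkLimits({ trustFloor: 'ORG', maxChunksPerQuery: 1 }, sampleChunks, 5);
```

In the real service, each refusal is also written to the audit log; the sketch leaves logging out.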

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with a strong lead sentence that summarizes the tool's core purpose. The following sentences add critical behavioral details without redundancy. It is slightly verbose but every sentence contributes meaning.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers key governance features and parameter-specific behavior, which is sufficient given the lack of an output schema. It explains refusal logging and charter enforcement well, though it could briefly mention what the return format or error cases might look like.
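
The return format can be read off the handler: both paths return a single text content item holding pretty-printed JSON, and the failure path also sets isError: true with a payload carrying code and message. A minimal client-side sketch of telling the two apart is shown below. The helper `parseRetrieveResult` is hypothetical, and the shape of a successful payload is left as unknown because it is not documented on this page.

```typescript
// Result envelope as returned by the handler: one text item holding JSON,
// plus isError on the failure path.
interface ToolResult {
  content: { type: 'text'; text: string }[];
  isError?: boolean;
}

type ParsedResult =
  | { ok: true; data: unknown }
  | { ok: false; code: string; message: string };

// Hypothetical client-side helper, not part of the server.
function parseRetrieveResult(result: ToolResult): ParsedResult {
  const first = result.content[0];
  if (!first) {
    throw new Error('tool result has no content item');
  }
  const payload = JSON.parse(first.text);
  if (result.isError) {
    // Failure path: the payload was built by errorPayload.
    return { ok: false, code: String(payload.code), message: String(payload.message) };
  }
  // Success path: whatever /api/retrieval/search returned.
  return { ok: true, data: payload };
}

// A refusal as it would arrive from the handler's catch branch.
const denied = parseRetrieveResult({
  content: [{
    type: 'text',
    text: JSON.stringify({
      error: true,
      code: 'CLASSIFICATION_BELOW_FLOOR',
      message: 'denied',
      tool: 'gia_retrieve',
    }),
  }],
  isError: true,
});

// Placeholder success payload; the real shape is not shown in the source.
const succeeded = parseRetrieveResult({
  content: [{ type: 'text', text: JSON.stringify({ placeholder: true }) }],
});
```

Checking isError before interpreting the payload lets an agent branch on codes such as CLASSIFICATION_BELOW_FLOOR or INTERNAL_ERROR without guessing at the success schema.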

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds value for 'charter_id' and 'classification_floor' by explaining enforcement and the 'stricter-wins' behavior. Other parameters, such as 'query' and 'domain', are sufficiently described in the schema, so the description does not need to add more.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as a governed semantic search with hash verification, permission checks, TTL enforcement, and audit trail, distinguishing it from sibling tools by focusing on governance and charter enforcement. The classification 'ADVISORY — read-only search' further clarifies its non-destructive purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use charter_id and classification_floor, and how they affect enforcement and refusal logging. However, it does not explicitly compare to sibling retrieval tools like 'gia_apply_pack' or 'gia_dry_run_pack', leaving the agent to infer usage boundaries.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/knowledgepa3/gia-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.