gia_ingest_document
Ingest text content into a governed retrieval system with integrity verification and audit trail. Documents are chunked, embedded, and hash-verified for trustworthy retrieval.
Instructions
Governed document ingestion: upload text content for governed retrieval. Content is chunked, embedded, hash-verified, and stored with a full audit trail. Each chunk gets a SHA-256 integrity hash. Classification: ADVISORY (creates governed content; audited).
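The chunk-and-hash step described above can be illustrated with a minimal sketch. The service's actual chunking strategy, chunk size, and digest encoding are not documented here, so `chunkText`, `hashChunk`, `chunkAndHash`, `verifyChunk`, the fixed-size character window, and hex encoding are all assumptions made for illustration only.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape for a stored chunk; not the service's real data model.
interface HashedChunk {
  index: number;
  text: string;
  sha256: string; // hex-encoded SHA-256 digest of the chunk text (UTF-8)
}

// Split text into fixed-size character windows (an assumed strategy).
function chunkText(content: string, size = 1000): string[] {
  if (size <= 0) throw new RangeError("size must be positive");
  const chunks: string[] = [];
  for (let i = 0; i < content.length; i += size) {
    chunks.push(content.slice(i, i + size));
  }
  return chunks;
}

// Compute the integrity hash for a single chunk.
function hashChunk(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Chunk the content and attach a hash to every chunk.
function chunkAndHash(content: string, size = 1000): HashedChunk[] {
  return chunkText(content, size).map((text, index) => ({
    index,
    text,
    sha256: hashChunk(text),
  }));
}

// Integrity check: recompute the hash and compare it with the stored one.
function verifyChunk(chunk: HashedChunk): boolean {
  return hashChunk(chunk.text) === chunk.sha256;
}
```

Under these assumptions, a retrieval layer could call `verifyChunk` before surfacing a chunk and refuse any chunk whose text no longer matches its stored digest.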
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| title | Yes | Document title | |
| content | Yes | Full text content to ingest | |
| domain | Yes | Domain classification (e.g., va-claims, finance, eu-ai-act) | |
| trust_level | No | Trust level (SYSTEM > ORG > CASE > EPHEMERAL) | CASE |
| classification | No | Document classification. Controls which retrievals can surface it when a classification floor is in force (MANDATORY > ADVISORY > INFORMATIONAL). | ADVISORY |
| allowed_roles | No | Roles allowed to retrieve this document | |
| ttl_hours | No | Time-to-live in hours (auto-expires) | |
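To make the table concrete, here is a hedged sketch of the argument shape with the documented defaults applied. The names `IngestArgs`, `EffectiveArgs`, `withDefaults`, and `meetsFloor` are hypothetical, and the floor semantics (a document is eligible when its classification ranks at or above the floor) are an assumption inferred from the ordering MANDATORY > ADVISORY > INFORMATIONAL, not confirmed behavior.

```typescript
type TrustLevel = "SYSTEM" | "ORG" | "CASE" | "EPHEMERAL";
type Classification = "MANDATORY" | "ADVISORY" | "INFORMATIONAL";

// Mirrors the input table: three required fields, four optional ones.
interface IngestArgs {
  title: string;
  content: string;
  domain: string;
  trust_level?: TrustLevel;
  classification?: Classification;
  allowed_roles?: string[];
  ttl_hours?: number;
}

type EffectiveArgs = IngestArgs &
  Required<Pick<IngestArgs, "trust_level" | "classification">>;

// Higher number = higher rank, following the ordering in the table.
const CLASSIFICATION_RANK: Record<Classification, number> = {
  INFORMATIONAL: 0,
  ADVISORY: 1,
  MANDATORY: 2,
};

// Fill in the documented defaults (CASE and ADVISORY) for omitted fields.
function withDefaults(args: IngestArgs): EffectiveArgs {
  return {
    ...args,
    trust_level: args.trust_level ?? "CASE",
    classification: args.classification ?? "ADVISORY",
  };
}

// Assumed floor check: eligible when the document ranks at or above the floor.
function meetsFloor(doc: Classification, floor: Classification): boolean {
  return CLASSIFICATION_RANK[doc] >= CLASSIFICATION_RANK[floor];
}

// Example arguments: only the required fields plus a 72-hour expiry.
const example = withDefaults({
  title: "Eligibility Guide",
  content: "Full text of the guide goes here.",
  domain: "va-claims",
  ttl_hours: 72,
});
```

With only the required fields supplied, `example` carries `trust_level: "CASE"` and `classification: "ADVISORY"`. Under the assumed floor semantics it would be eligible for a retrieval with an ADVISORY floor but not for one with a MANDATORY floor.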
Implementation Reference
- The async handler function for gia_ingest_document. It POSTs to /api/retrieval/ingest with the document's title, content, a filename derived from the title (lowercased, whitespace runs replaced with hyphens, .txt appended), domain, trustLevel, classification, allowedRoles, and ttlHours. On success it returns the API result as pretty-printed JSON text; on failure it returns an errorPayload as JSON text with isError: true.

      async (args) => {
        try {
          const result = await apiCall<unknown>('/api/retrieval/ingest', 'POST', {
            title: args.title,
            content: args.content,
            filename: `${args.title.toLowerCase().replace(/\s+/g, '-')}.txt`,
            domain: args.domain,
            trustLevel: args.trust_level,
            classification: args.classification,
            allowedRoles: args.allowed_roles,
            ttlHours: args.ttl_hours,
          });
          return {
            content: [{
              type: 'text' as const,
              text: JSON.stringify(result, null, 2),
            }],
          };
        } catch (err: unknown) {
          return {
            content: [{
              type: 'text' as const,
              text: JSON.stringify(
                errorPayload(err, 'gia_ingest_document', {
                  title: args.title,
                  domain: args.domain,
                  classification: args.classification,
                }),
                null,
                2,
              ),
            }],
            isError: true,
          };
        }
      }

- Zod schema for gia_ingest_document: title, content, domain, trust_level (enum SYSTEM/ORG/CASE/EPHEMERAL, default CASE), classification (optional enum MANDATORY/ADVISORY/INFORMATIONAL), allowed_roles (optional string array), ttl_hours (optional number).

      {
        title: z.string().describe('Document title'),
        content: z.string().describe('Full text content to ingest'),
        domain: z.string().describe('Domain classification (e.g., va-claims, finance, eu-ai-act)'),
        trust_level: z.enum(['SYSTEM', 'ORG', 'CASE', 'EPHEMERAL']).default('CASE').describe('Trust level (SYSTEM > ORG > CASE > EPHEMERAL)'),
        classification: z.enum(['MANDATORY', 'ADVISORY', 'INFORMATIONAL']).optional().describe('Document classification. Controls which retrievals can surface it when a classification floor is in force (MANDATORY > ADVISORY > INFORMATIONAL). Defaults to ADVISORY.'),
        allowed_roles: z.array(z.string()).optional().describe('Roles allowed to retrieve this document'),
        ttl_hours: z.number().optional().describe('Time-to-live in hours (auto-expires)'),
      },

- src/mcp/tools/governed-retrieval.ts:186-242 (registration) Registration of gia_ingest_document via server.tool() with name, description, schema, and metadata hints (readOnlyHint, idempotentHint, destructiveHint, and openWorldHint, all false).

      server.tool(
        'gia_ingest_document',
        'Governed document ingestion — upload text content for governed retrieval. Content is chunked, embedded, hash-verified, and stored with full audit trail. Each chunk gets SHA-256 integrity hash. Classification: ADVISORY — creates governed content, audited.',
        {
          title: z.string().describe('Document title'),
          content: z.string().describe('Full text content to ingest'),
          domain: z.string().describe('Domain classification (e.g., va-claims, finance, eu-ai-act)'),
          trust_level: z.enum(['SYSTEM', 'ORG', 'CASE', 'EPHEMERAL']).default('CASE').describe('Trust level (SYSTEM > ORG > CASE > EPHEMERAL)'),
          classification: z.enum(['MANDATORY', 'ADVISORY', 'INFORMATIONAL']).optional().describe('Document classification. Controls which retrievals can surface it when a classification floor is in force (MANDATORY > ADVISORY > INFORMATIONAL). Defaults to ADVISORY.'),
          allowed_roles: z.array(z.string()).optional().describe('Roles allowed to retrieve this document'),
          ttl_hours: z.number().optional().describe('Time-to-live in hours (auto-expires)'),
        },
        {
          title: 'Governed Document Ingestion',
          readOnlyHint: false,
          idempotentHint: false,
          destructiveHint: false,
          openWorldHint: false,
        },
        async (args) => {
          try {
            const result = await apiCall<unknown>('/api/retrieval/ingest', 'POST', {
              title: args.title,
              content: args.content,
              filename: `${args.title.toLowerCase().replace(/\s+/g, '-')}.txt`,
              domain: args.domain,
              trustLevel: args.trust_level,
              classification: args.classification,
              allowedRoles: args.allowed_roles,
              ttlHours: args.ttl_hours,
            });
            return {
              content: [{
                type: 'text' as const,
                text: JSON.stringify(result, null, 2),
              }],
            };
          } catch (err: unknown) {
            return {
              content: [{
                type: 'text' as const,
                text: JSON.stringify(
                  errorPayload(err, 'gia_ingest_document', {
                    title: args.title,
                    domain: args.domain,
                    classification: args.classification,
                  }),
                  null,
                  2,
                ),
              }],
              isError: true,
            };
          }
        }
      );

- src/mcp/server.ts:214-217 (registration) Registration call in the MCP server setup: registerGovernedRetrievalTools(instrumentedServer) is called when the 'tenant' tier is allowed.

      // Governed retrieval (special: no engine param)
      if (allowedTiers.has(GOVERNED_RETRIEVAL_TIER)) {
        registerGovernedRetrievalTools(instrumentedServer);
        registeredCount++;
      }

- The errorPayload helper function used by the handler to structure error responses with code, message, httpStatus, details, and tool name. The caller's context fields (here title, domain, and classification) are spread into the payload as well.

      function errorPayload(
        err: unknown,
        tool: string,
        context: Record<string, unknown>,
      ): {
        error: true;
        code: string;
        message: string;
        httpStatus?: number;
        details?: Record<string, unknown>;
        tool: string;
      } & Record<string, unknown> {
        if (err instanceof RetrievalApiError) {
          return {
            error: true,
            code: err.code,
            message: err.message,
            httpStatus: err.httpStatus,
            details: err.details,
            tool,
            ...context,
          };
        }
        return {
          error: true,
          code: 'INTERNAL_ERROR',
          message: errMsg(err),
          tool,
          ...context,
        };
      }
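The errorPayload helper depends on RetrievalApiError and errMsg, neither of which is shown above. The sketch below supplies hypothetical stand-ins for both so the two branches can be exercised in isolation. The class shape is inferred only from the fields the helper reads, the error code `FORBIDDEN_DOMAIN` is invented for the example, and the helper's return type is simplified to `Record<string, unknown>` so that the sketch compiles under strict compiler options.

```typescript
// Hypothetical stand-in: the real RetrievalApiError is not shown in the source.
class RetrievalApiError extends Error {
  readonly code: string;
  readonly httpStatus: number | undefined;
  readonly details: Record<string, unknown> | undefined;

  constructor(
    message: string,
    code: string,
    httpStatus?: number,
    details?: Record<string, unknown>,
  ) {
    super(message);
    this.name = "RetrievalApiError";
    this.code = code;
    this.httpStatus = httpStatus;
    this.details = details;
    // Keeps instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, RetrievalApiError.prototype);
  }
}

// Hypothetical stand-in for errMsg: extract a readable message from anything.
function errMsg(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

// Same logic as the helper above, with a simplified return type.
function errorPayload(
  err: unknown,
  tool: string,
  context: Record<string, unknown>,
): Record<string, unknown> {
  if (err instanceof RetrievalApiError) {
    return {
      error: true,
      code: err.code,
      message: err.message,
      httpStatus: err.httpStatus,
      details: err.details,
      tool,
      ...context,
    };
  }
  return { error: true, code: "INTERNAL_ERROR", message: errMsg(err), tool, ...context };
}

const context = { title: "Eligibility Guide", domain: "va-claims", classification: undefined };

// Branch 1: a typed API error keeps its code and HTTP status.
const apiFailure = errorPayload(
  new RetrievalApiError("Domain not permitted", "FORBIDDEN_DOMAIN", 403),
  "gia_ingest_document",
  context,
);

// Branch 2: any other thrown value falls back to INTERNAL_ERROR.
const unknownFailure = errorPayload(new Error("socket hang up"), "gia_ingest_document", context);

// What the handler would emit as text: undefined fields are dropped by JSON.stringify.
const serialized = JSON.stringify(apiFailure, null, 2);
```

Because the handler serializes the payload with JSON.stringify, fields whose value is undefined (such as an omitted classification or a missing details object) do not appear in the text returned to the client.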