memory_ingest
Ingest large documents by automatically chunking, embedding, and storing content with provenance for structured memory retrieval.
Instructions
Ingests a full document: the tool chunks it automatically according to its content type (text, markdown, code, or legal), embeds each chunk, and stores the chunks with provenance. Use this for large documents.
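The chunking step can be pictured with a small sketch. This is not the tool's actual implementation, which is not documented here. It assumes the simplest possible strategy, fixed-size character windows with overlap, and the function name `chunk_text` and its default sizes are hypothetical. The real tool selects a strategy per content type.

```python
def chunk_text(text: str, chunk_size: int = 2000, chunk_overlap: int = 200) -> list[dict]:
    """Split text into overlapping character windows (illustrative only).

    The defaults are hypothetical; the tool's real defaults are not documented.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        # Record character offsets so each chunk can be traced back to its source.
        chunks.append({"index": len(chunks), "start": start, "end": end, "text": text[start:end]})
        if end == len(text):
            break
        start += step
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1):
        print(chunk["index"], chunk["start"], chunk["end"], chunk["text"])
```

Each chunk repeats the last `chunk_overlap` characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.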
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| content | Yes | Full document content to ingest | |
| title | No | Document title | |
| source | No | Origin of the content (e.g., file path, URL, system name) | |
| document_type | No | Type of document (e.g., contract, policy, code, incident, decision) | |
| scope | No | Memory scope for isolation | global |
| namespace | No | Namespace within scope (e.g., project name, team name) | |
| department | No | Department (e.g., legal, engineering, hr, sales, finance) | |
| author | No | Who created this content | |
| access_level | No | Access classification level | |
| tags | No | Tags for categorization | |
| metadata | No | Domain-specific metadata (e.g., {contract_type: 'NDA', parties: ['A','B']}) | |
| content_type | No | Content type, which determines the chunking strategy (text, markdown, code, legal) | text |
| chunk_size | No | Target chunk size in characters (~4 chars per token) | |
| chunk_overlap | No | Overlap between chunks in characters for context preservation | |
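
As an illustration of how the parameters above fit together, the following sketch assembles an arguments object for `memory_ingest`. The helper `build_ingest_arguments` is hypothetical, as are all example values. The value types are assumptions too (for instance, `tags` as a list of strings and `metadata` as an object), since the schema above does not state them. How the arguments are sent depends on the client in use and is not shown.

```python
import json

# Optional parameter names taken from the input schema table.
OPTIONAL_PARAMS = {
    "title", "source", "document_type", "scope", "namespace", "department",
    "author", "access_level", "tags", "metadata", "content_type",
    "chunk_size", "chunk_overlap",
}


def build_ingest_arguments(content: str, **optional) -> dict:
    """Build an arguments dict for memory_ingest (hypothetical helper)."""
    if not content:
        raise ValueError("content is required")
    unknown = set(optional) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    arguments = {"content": content}
    # Leave out unset values so the tool applies its own defaults.
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return arguments


if __name__ == "__main__":
    arguments = build_ingest_arguments(
        "# Remote Work Policy\n\nEmployees may work remotely...",
        title="Remote Work Policy",
        source="policies/remote-work.md",   # example path, not a real file
        document_type="policy",
        department="hr",
        tags=["policy", "remote-work"],     # list of strings is an assumption
        content_type="markdown",
        chunk_size=2000,                    # roughly 500 tokens at ~4 chars per token
        chunk_overlap=200,
    )
    print(json.dumps(arguments, indent=2))
```

Omitting `scope` and `content_type` leaves them at the defaults listed in the table (`global` and `text`).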