raw_ingest
Ingest raw source documents into a knowledge base by adding local files, downloading from URLs, or importing Confluence pages and Jira issues.
Instructions
Set mode to choose one of four ingestion methods:
- add: Add a local file or content string (immutable, SHA-256 verified). Supports directory imports. Single images (<10MB) returned inline; you MUST immediately call wiki_write to describe them.
- fetch: Download a file from a URL into raw/ (arXiv abstract URLs auto-converted to PDF). Single images returned inline; you MUST immediately call wiki_write to describe them.
- import_confluence: Recursively import Confluence pages with attachments and hierarchy. Supports both Cloud (*.atlassian.net/wiki/...) and Server / Data Center ({host}/spaces/...). Defaults to reading the CONFLUENCE_API_TOKEN env var; pass auth_env to point at any other variable. Token format accepted: email:api-token (Cloud Basic), Bearer <pat> (explicit), or a bare PAT (Bearer prefix added automatically).
- import_jira: Import a Jira issue with comments, attachments, and linked issues. Supports both Cloud and Server / Data Center; auto-falls-back to REST API v2 on older Server / DC. Defaults to reading the JIRA_API_TOKEN env var; pass auth_env to point at any other variable. Token format accepted: email:api-token (Cloud Basic), Bearer <pat> (explicit), or a bare PAT (Bearer prefix added automatically).
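
The three accepted token formats can be pictured as a mapping from the env var's value to an HTTP Authorization header. The sketch below is only an illustration of that mapping, not the tool's actual implementation: the function name `authorization_header` is hypothetical, and the order in which the formats are detected (explicit `Bearer ` prefix first, then a colon for `email:api-token`, otherwise a bare PAT) is an assumption.

```python
import base64


def authorization_header(token: str) -> str:
    """Map a token string to an Authorization header value (illustrative only)."""
    token = token.strip()
    if token.startswith("Bearer "):
        # Explicit form: already a complete header value, pass it through.
        return token
    if ":" in token:
        # email:api-token form: HTTP Basic auth, base64 of the whole pair.
        encoded = base64.b64encode(token.encode("utf-8")).decode("ascii")
        return f"Basic {encoded}"
    # Bare PAT: add the Bearer prefix.
    return f"Bearer {token}"
```

Under these assumptions, a bare PAT and its explicit `Bearer <pat>` form produce the same header, so either can be stored in CONFLUENCE_API_TOKEN, JIRA_API_TOKEN, or whichever variable auth_env names.
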
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| mode | Yes | Ingestion mode: add (local file/content), fetch (URL download), import_confluence (Confluence pages), import_jira (Jira issues) | |
| filename | No | [add] Filename in raw/ (e.g. 'paper.pdf'). For directory imports, becomes subdirectory prefix (e.g. 'my-docs'). | |
| content | No | [add] File content as string. Either content or source_path is required. | |
| source_path | No | [add] Absolute path to local file or directory to copy into raw/. If directory, all files imported recursively. Either content or source_path is required. | |
| source_url | No | [add/fetch] Original URL the document was downloaded from | |
| description | No | [add/fetch] Brief description of what this source contains | |
| tags | No | [add/fetch] Tags for categorization | |
| auto_version | No | [add] If true and file already exists, create a versioned copy (e.g. report_v2.xlsx) instead of failing. Default: false. | |
| pattern | No | [add] File pattern filter for directory imports (e.g. '*.html', '*.{html,css}'). Ignored for single files. | |
| url | No | [fetch] URL to download from. arXiv abs URLs auto-converted to PDF links. [import_confluence] Confluence page URL. [import_jira] Jira issue URL. | |
| recursive | No | [import_confluence] Import child pages recursively (default: false) | |
| depth | No | [import_confluence] Max recursion depth (-1 = unlimited, default: 50 when recursive=true) | |
| auth_env | No | [import_confluence] Auth env var name (default: CONFLUENCE_API_TOKEN). [import_jira] Auth env var name (default: JIRA_API_TOKEN) | |
| include_comments | No | [import_jira] Include issue comments (default: true) | |
| include_attachments | No | [import_jira] Download attachments (default: true) | |
| include_links | No | [import_jira] Import linked issues (default: true) | |
| link_depth | No | [import_jira] Levels of linked issues to follow (default: 1) | |
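
To make the schema concrete, here is a sketch of one argument payload per mode, together with a small pre-flight check. Several things here are assumptions rather than documented behavior: the helper `check_args` is hypothetical, the file paths and the example.atlassian.net URLs are placeholders, and the rule that the non-add modes need url is inferred from the parameter descriptions (the table only marks mode as required). How the payload is actually sent to raw_ingest depends on the client and is not shown.

```python
from typing import Any

MODES = {"add", "fetch", "import_confluence", "import_jira"}


def check_args(args: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a raw_ingest argument dict."""
    mode = args.get("mode")
    if mode not in MODES:
        return [f"mode must be one of {sorted(MODES)}"]
    problems = []
    if mode == "add":
        # Documented rule: either content or source_path is required.
        if "content" not in args and "source_path" not in args:
            problems.append("add requires content or source_path")
    elif "url" not in args:
        # Assumption: the other three modes cannot run without url.
        problems.append(f"{mode} requires url")
    return problems


# Directory import: filename becomes the subdirectory prefix in raw/.
add_args = {
    "mode": "add",
    "filename": "my-docs",
    "source_path": "/home/me/site",  # placeholder path
    "pattern": "*.html",
}

# arXiv abstract URL; the tool is documented to convert it to the PDF link.
fetch_args = {"mode": "fetch", "url": "https://arxiv.org/abs/1706.03762"}

# Recursive Confluence import, limited to two levels of child pages.
confluence_args = {
    "mode": "import_confluence",
    "url": "https://example.atlassian.net/wiki/spaces/ENG/pages/123456",
    "recursive": True,
    "depth": 2,
}

# Jira import that skips attachments and reads the token from a custom env var.
jira_args = {
    "mode": "import_jira",
    "url": "https://example.atlassian.net/browse/PROJ-42",
    "include_attachments": False,
    "auth_env": "MY_JIRA_TOKEN",  # hypothetical variable name
}
```

Calling `check_args` on each of the four payloads returns an empty list, while `check_args({"mode": "add", "filename": "x.txt"})` reports the missing content or source_path.
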