akai_flash_qla
Accelerates inference by running chunked prefill in a native C kernel, which reduces memory overhead when processing large contexts.
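To illustrate the general idea of chunked prefill, here is a minimal Python sketch. It is not the akai_flash_qla kernel: the recurrence (a simple decayed running sum), the function names `scan_tokens` and `chunked_prefill`, and the `decay` and `chunk_size` parameters are all assumptions made for illustration. The point it shows is that a long input can be processed one fixed-size chunk at a time, carrying only a small state between chunks, and still produce the same result as a single pass.

```python
from typing import List, Tuple

Vector = List[float]


def scan_tokens(
    tokens: List[Vector], state: Vector, decay: float = 0.9
) -> Tuple[List[float], Vector]:
    """Run a toy decayed recurrence over `tokens`, starting from `state`.

    Hypothetical stand-in for a real prefill kernel:
    state_t = decay * state_{t-1} + x_t, output_t = sum(state_t).
    """
    outputs: List[float] = []
    for x in tokens:
        state = [decay * h + xi for h, xi in zip(state, x)]
        outputs.append(sum(state))
    return outputs, state


def chunked_prefill(
    tokens: List[Vector], dim: int, chunk_size: int, decay: float = 0.9
) -> Tuple[List[float], Vector]:
    """Process `tokens` in chunks of `chunk_size`, carrying state across chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    state: Vector = [0.0] * dim
    outputs: List[float] = []
    for start in range(0, len(tokens), chunk_size):
        # Only this slice is handed to the per-chunk routine at a time.
        chunk = tokens[start:start + chunk_size]
        chunk_out, state = scan_tokens(chunk, state, decay)
        outputs.extend(chunk_out)
    return outputs, state
```

In this sketch the per-chunk working set is bounded by `chunk_size` rather than by the full input length, which is the property the description refers to. Because the state is carried forward unchanged, any positive `chunk_size` yields the same outputs as one pass over the whole input.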
Instructions
akai-flashqla: AkaiGDN Chunked Prefill (native C kernel). Category: inference.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| args | No | CLI arguments to pass to the operator | |
| stdin | No | Optional stdin data | |