llama_start
Launch a local llama-server instance as a child process using a specified GGUF model, with configurable port, context size, GPU layers, and CPU threads.
Instructions
Start `llama-server` as a child process, loading the GGUF model at the specified path.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model | Yes | Path to GGUF model file | |
| port | No | Port to listen on | |
| ctx_size | No | Context size | |
| n_gpu_layers | No | GPU layers (-1 = all) | |
| threads | No | CPU threads | |
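
The sketch below illustrates one way the inputs above could be translated into a `llama-server` command line and launched as a child process. It is not taken from this tool's implementation: the helper names `build_llama_server_args` and `start_llama_server`, the `binary` parameter, and the model path in the usage example are hypothetical. It assumes `llama-server` is on `PATH` and accepts the `--model`, `--port`, `--ctx-size`, `--n-gpu-layers`, and `--threads` flags. Parameters left unset are omitted, so the server's own defaults apply.

```python
import subprocess
from typing import List, Optional


def build_llama_server_args(
    model: str,
    port: Optional[int] = None,
    ctx_size: Optional[int] = None,
    n_gpu_layers: Optional[int] = None,
    threads: Optional[int] = None,
    binary: str = "llama-server",  # assumption: executable name found on PATH
) -> List[str]:
    """Map the tool inputs to a llama-server argument list (hypothetical helper)."""
    args = [binary, "--model", model]
    optional_flags = [
        ("--port", port),
        ("--ctx-size", ctx_size),
        ("--n-gpu-layers", n_gpu_layers),  # the value is passed through unchanged
        ("--threads", threads),
    ]
    for flag, value in optional_flags:
        # Skip unset parameters so llama-server falls back to its defaults.
        if value is not None:
            args += [flag, str(value)]
    return args


def start_llama_server(**params) -> subprocess.Popen:
    """Launch llama-server as a child process and return its process handle."""
    return subprocess.Popen(build_llama_server_args(**params))


if __name__ == "__main__":
    # Hypothetical model path; printing the arguments avoids needing the binary.
    print(build_llama_server_args("models/example.gguf", port=8080, n_gpu_layers=-1))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the model path contains spaces.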