ollama_generate
Generate a one-shot text completion from an Ollama model (code generation, summarization, translation, or text transformation) without conversation history.
Instructions
Generate a single-turn text completion from an Ollama model without conversation history. Use this tool for one-shot tasks: code generation, text transformation, summarization, translation, or any prompt that does not require prior context. Do not use this for multi-turn conversations where message history matters; use ollama_chat instead.

Prerequisites: the 'model' must already be installed. Call ollama_list_models to verify; use ollama_pull_model to download it if missing.

Behavior: read-only but not idempotent; each call produces a different generation even with identical inputs. No authentication required and no rate limits. Network-dependent: response time varies with model size and prompt length. Safe to retry on timeout. On a model-not-found error, returns an error object instead of throwing.
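The retry guidance above can be sketched as a small wrapper. This is a minimal sketch, assuming a hypothetical `call_tool(name, args)` function for invoking the tool; per the doc, timeouts are safe to retry, while model-not-found surfaces as an `error` field rather than an exception, so it should not be retried blindly.

```python
import time

def generate_with_retry(call_tool, args, retries=3, delay=1.0):
    # ollama_generate is documented as safe to retry on timeout.
    # A model-not-found failure comes back as an 'error' field in the
    # result (not raised), so it falls through to the caller unchanged.
    last_exc = None
    for attempt in range(retries):
        try:
            return call_tool("ollama_generate", args)
        except TimeoutError as exc:
            last_exc = exc
            if attempt < retries - 1:
                time.sleep(delay)  # simple fixed backoff between attempts
    raise last_exc
```

A caller would inspect the returned object for an `error` key before using `response`.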
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model | Yes | Exact Ollama model identifier. Must match a 'name' from ollama_list_models (e.g., 'llama3.1:8b', 'codellama:13b'). If unsure, call ollama_list_models first. | |
| prompt | Yes | The input text prompt to generate a completion from. Can be any length — the model's context window is the only limit. | |
| temperature | No | Sampling temperature. 0.0 = deterministic, 2.0 = maximum randomness. Default is model-dependent. Use low values for factual/code tasks. | |
| max_tokens | No | Maximum tokens to generate. Maps to Ollama 'num_predict'. Use -1 for unlimited (model stops at its natural end token). | |
| system | No | System prompt to set model behavior, persona, or output format constraints for this generation. | |
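The table notes that `max_tokens` maps to Ollama's `num_predict`. A sketch of how these inputs might be assembled into a request body for Ollama's `POST /api/generate` endpoint (the field names `options`, `num_predict`, `system`, and `stream` are from the Ollama REST API; how this particular tool wires them internally is an assumption):

```python
def build_generate_payload(model, prompt, temperature=None,
                           max_tokens=None, system=None):
    # Base body for Ollama's POST /api/generate; stream=False asks for a
    # single JSON response instead of a token stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    options = {}
    if temperature is not None:
        options["temperature"] = temperature
    if max_tokens is not None:
        options["num_predict"] = max_tokens  # -1 = run to natural end token
    if options:
        payload["options"] = options
    if system is not None:
        payload["system"] = system
    return payload
```

Only the keys the caller actually sets are included, so model-dependent defaults (e.g. temperature) remain in effect when omitted.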
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| response | No | The generated text completion. | |
| model | Yes | The model that generated the response. | |
| total_duration | No | Total time in nanoseconds including load and inference. | |
| eval_count | No | Number of tokens generated. | |
| error | No | Error message if the request failed. Only present on failure. | |
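Two details of this schema are easy to trip over: `error` is present only on failure, and `total_duration` is in nanoseconds. A minimal sketch of consuming a result, including deriving a rough tokens-per-second figure (the `summarize_result` helper and its return shape are illustrative, not part of the tool):

```python
def summarize_result(result):
    # 'error' appears only on failure, so its presence is the failure check.
    if "error" in result:
        return {"ok": False, "error": result["error"]}
    summary = {"ok": True, "model": result["model"],
               "text": result.get("response", "")}
    # total_duration is nanoseconds and includes model load time, so this
    # rate is a lower bound on pure inference throughput.
    if result.get("eval_count") and result.get("total_duration"):
        seconds = result["total_duration"] / 1e9  # ns -> s
        summary["tokens_per_sec"] = result["eval_count"] / seconds
    return summary
```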