generate
Run a one-shot text completion against a local Ollama model. Returns the full response, timing, and tokens per second.
Instructions
Run a one-shot text completion against a local model (non-streaming). Returns the full response text plus timing and tokens/second.
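The timing figures can be illustrated with a small worked example. This is a minimal sketch, assuming Ollama reports `eval_duration` and `total_duration` in nanoseconds. The helper name `summarizeTiming` and the sample numbers are hypothetical and only mirror the arithmetic described for this tool.

```javascript
// Hypothetical helper (not part of server.js): converts Ollama-style
// nanosecond durations into milliseconds and tokens per second.
function summarizeTiming(d) {
  // Convert nanoseconds to whole milliseconds; missing values become null.
  const nsToMs = (ns) => (ns ? Math.round(ns / 1e6) : null);
  return {
    eval_duration_ms: nsToMs(d.eval_duration),
    total_duration_ms: nsToMs(d.total_duration),
    // tokens / seconds, rounded to two decimal places.
    tokens_per_second:
      d.eval_count && d.eval_duration
        ? Math.round((d.eval_count / (d.eval_duration / 1e9)) * 100) / 100
        : null,
  };
}

// Made-up sample: 120 tokens generated in 2.5 s, 3.1 s end to end.
const sample = { eval_count: 120, eval_duration: 2.5e9, total_duration: 3.1e9 };
console.log(summarizeTiming(sample));
```

With these sample values the rate works out to 120 / 2.5 = 48 tokens per second. When either count or duration is missing, the sketch reports `null` rather than a misleading zero.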
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model | Yes | Model name (e.g. "llama3.1:8b"). | |
| prompt | Yes | Prompt text. | |
| system | No | Optional system prompt. | |
| options | No | Ollama sampling/decoding options, e.g. {"temperature": 0.7, "num_predict": 100, "top_p": 0.9}. | |
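To make the table concrete, here is a minimal sketch of how a set of arguments could map onto a non-streaming request body. The helper `buildGenerateBody` and its error messages are hypothetical; the real handler uses its own validation helper, whose exact behavior is not shown here.

```javascript
// Hypothetical helper (an assumption, not the server's code): checks the two
// required string arguments and copies the optional ones when present.
function buildGenerateBody(args) {
  for (const key of ['model', 'prompt']) {
    if (typeof args[key] !== 'string' || args[key] === '') {
      throw new TypeError(`"${key}" is required and must be a non-empty string`);
    }
  }
  // stream: false asks for a single, complete response.
  const body = { model: args.model, prompt: args.prompt, stream: false };
  if (args.system && typeof args.system === 'string') body.system = args.system;
  if (args.options && typeof args.options === 'object') body.options = args.options;
  return body;
}

// Example arguments using every field from the table above.
const exampleArgs = {
  model: 'llama3.1:8b',
  prompt: 'Summarize the plot of Hamlet in one sentence.',
  system: 'You are a concise assistant.',
  options: { temperature: 0.7, num_predict: 100, top_p: 0.9 },
};
console.log(JSON.stringify(buildGenerateBody(exampleArgs), null, 2));
```

Omitting `system` or `options` simply leaves those keys out of the body, so the model's own defaults would apply.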
Implementation Reference
- server.js:179-208 (handler): The async function 'generate' executes the tool logic. It validates the required 'model' and 'prompt' string arguments, constructs a request body (with optional 'system' and 'options' fields), sends a POST to '/api/generate' via httpRequest, and returns a textResult with the model's response plus timing metrics (eval_count, eval_duration_ms, tokens_per_second, etc.).

      async function generate(args) {
        // Reject the call early if either required string argument is missing or invalid.
        const badModel = requireString(args, 'model');
        if (badModel) return errorResult(badModel);
        const badPrompt = requireString(args, 'prompt');
        if (badPrompt) return errorResult(badPrompt);

        const body = {
          model: args.model,
          prompt: args.prompt,
          stream: false,
        };
        if (args.system && typeof args.system === 'string') body.system = args.system;
        if (args.options && typeof args.options === 'object') body.options = args.options;

        const r = await httpRequest('POST', '/api/generate', body);
        if (r.error) return errorResult(r.error);
        const d = r.data || {};

        // Durations are divided by 1e6 to report milliseconds and by 1e9 to get seconds.
        return textResult({
          model: d.model || args.model,
          response: d.response || '',
          done_reason: d.done_reason || null,
          eval_count: d.eval_count || null,
          eval_duration_ms: d.eval_duration ? Math.round(d.eval_duration / 1e6) : null,
          prompt_eval_count: d.prompt_eval_count || null,
          total_duration_ms: d.total_duration ? Math.round(d.total_duration / 1e6) : null,
          tokens_per_second: d.eval_count && d.eval_duration
            ? Math.round((d.eval_count / (d.eval_duration / 1e9)) * 100) / 100
            : null,
        });
      }

- server.js:307-326 (schema): Input schema and description for the 'generate' tool. Defines the required properties 'model' (string) and 'prompt' (string), plus the optional 'system' (string) and 'options' (object for Ollama sampling parameters).

      {
        name: 'generate',
        description: 'Run a one-shot text completion against a local model (non-streaming). Returns the full response text plus timing and tokens/second.',
        annotations: { title: 'Generate text', readOnlyHint: false, destructiveHint: false, openWorldHint: true },
        inputSchema: {
          type: 'object',
          properties: {
            model: { type: 'string', description: 'Model name (e.g. "llama3.1:8b").' },
            prompt: { type: 'string', description: 'Prompt text.' },
            system: { type: 'string', description: 'Optional system prompt.' },
            options: {
              type: 'object',
              description: 'Ollama sampling/decoding options — e.g. {"temperature": 0.7, "num_predict": 100, "top_p": 0.9}.',
              additionalProperties: true,
            },
          },
          required: ['model', 'prompt'],
          additionalProperties: false,
        },
      },

- server.js:385-394 (registration): The HANDLERS mapping object registers the 'generate' function (line 390) along with all other tool handlers. The JSON-RPC dispatch logic uses it to look up handlers.

      const HANDLERS = {
        ollama_status: ollamaStatus,
        list_models: listModels,
        list_running: listRunning,
        show_model: showModel,
        generate: generate,
        chat: chat,
        pull_model: pullModel,
        delete_model: deleteModel,
      };
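
The dispatch code itself is not shown here, so the following is only a sketch of how a name-to-function map like HANDLERS could be consulted. The `dispatch` function, the stub handlers, and the error shape are all assumptions made for illustration.

```javascript
// Stub handlers standing in for the real ones in server.js (assumption).
const STUB_HANDLERS = {
  generate: async (args) => ({ tool: 'generate', model: args.model }),
  chat: async (args) => ({ tool: 'chat', model: args.model }),
};

// Hypothetical dispatcher: look up the handler by tool name and await it.
async function dispatch(handlers, name, args) {
  // Own-property check so inherited keys such as "toString" are not treated as tools.
  if (!Object.prototype.hasOwnProperty.call(handlers, name)) {
    return { error: `Unknown tool: ${name}` };
  }
  return handlers[name](args || {});
}

dispatch(STUB_HANDLERS, 'generate', { model: 'llama3.1:8b' }).then((result) => {
  console.log(result);
});
```

Keeping the registry as a plain object means adding a tool is a single new entry, while the own-property check guards against names that exist only on the object prototype.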