chat
Run non-streaming chat completions with message history using local models. Returns the assistant's reply and timing.
Instructions
Run a chat completion against a local model with message history (non-streaming). Returns the assistant's reply plus timing.
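To make "message history" concrete, here is a minimal sketch of how a caller might carry a conversation from one call to the next. It assumes the `message` field of a previous result has the same `{role, content}` shape the tool accepts as input; `nextTurn` and the sample texts are hypothetical and not part of the server.

```javascript
// Hypothetical helper: returns a new history with the assistant's reply
// and the next user turn appended. The input array is not mutated.
function nextTurn(turns, assistantMessage, userText) {
  return [
    ...turns,
    { role: 'assistant', content: assistantMessage.content },
    { role: 'user', content: userText },
  ];
}

const turns = [{ role: 'user', content: 'What is the capital of France?' }];

// Stand-in for the `message` field of an earlier `chat` result (made-up content).
const reply = { role: 'assistant', content: 'Paris.' };

// This array would be passed as `messages` on the next call.
const followUp = nextTurn(turns, reply, 'And of Italy?');
console.log(followUp.map((m) => m.role).join(' -> ')); // user -> assistant -> user
```

Because the tool is non-streaming and takes the full history on every call, the caller is the one who keeps the conversation state in this sketch.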
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model | Yes | Model name. | |
| messages | Yes | Chat history. Each item: {role: "system"|"user"|"assistant", content: string}. | |
| options | No | Ollama sampling/decoding options. | |
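The parameters above can be wrapped in a JSON-RPC `tools/call` request. The following is a sketch of one way a client might build that request; `buildChatCall` is a hypothetical helper, and the model name and option value are examples borrowed from the other tool descriptions rather than required values.

```javascript
// Hypothetical helper: wraps `chat` arguments in a JSON-RPC 2.0 `tools/call` request.
function buildChatCall(id, model, messages, options) {
  const args = { model, messages };
  // `options` is optional, so include it only when the caller supplies an object.
  if (options && typeof options === 'object') args.options = options;
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'chat', arguments: args },
  };
}

const request = buildChatCall(
  1,
  'llama3.1:8b', // example model name
  [
    { role: 'system', content: 'You are a concise assistant.' },
    { role: 'user', content: 'Name one prime number.' },
  ],
  { temperature: 0.7 }, // example sampling option
);

console.log(JSON.stringify(request, null, 2));
```

Only `model` and `messages` are required, so the fourth argument can be left out.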
Implementation Reference
- server.js:211-245 (handler) The 'chat' tool handler function. Validates inputs (model string, messages array with role/content), sends a POST /api/chat to Ollama (non-streaming), and returns the assistant's reply with timing metadata (eval_count, eval_duration_ms, tokens_per_second, etc.).

      async function chat(args) {
        const badModel = requireString(args, 'model');
        if (badModel) return errorResult(badModel);
        if (!Array.isArray(args.messages) || !args.messages.length) {
          return errorResult('messages is required (non-empty array of {role, content} objects)');
        }
        for (const m of args.messages) {
          if (!m || typeof m !== 'object' || typeof m.role !== 'string' || typeof m.content !== 'string') {
            return errorResult('each message must be {role: "system"|"user"|"assistant", content: string}');
          }
        }
        const body = {
          model: args.model,
          messages: args.messages,
          stream: false,
        };
        if (args.options && typeof args.options === 'object') body.options = args.options;
        const r = await httpRequest('POST', '/api/chat', body);
        if (r.error) return errorResult(r.error);
        const d = r.data || {};
        return textResult({
          model: d.model || args.model,
          message: d.message || null,
          done_reason: d.done_reason || null,
          eval_count: d.eval_count || null,
          eval_duration_ms: d.eval_duration ? Math.round(d.eval_duration / 1e6) : null,
          prompt_eval_count: d.prompt_eval_count || null,
          total_duration_ms: d.total_duration ? Math.round(d.total_duration / 1e6) : null,
          tokens_per_second: d.eval_count && d.eval_duration
            ? Math.round((d.eval_count / (d.eval_duration / 1e9)) * 100) / 100
            : null,
        });
      }

- server.js:327-355 (schema) The input schema for the 'chat' tool registration. Defines required parameters: 'model' (string), 'messages' (array of {role: enum[system|user|assistant], content: string}), and optional 'options' (object for sampling/decoding settings).

      {
        name: 'chat',
        description: 'Run a chat completion against a local model with message history (non-streaming). Returns the assistant\'s reply plus timing.',
        annotations: { title: 'Chat completion', readOnlyHint: false, destructiveHint: false, openWorldHint: true },
        inputSchema: {
          type: 'object',
          properties: {
            model: { type: 'string', description: 'Model name.' },
            messages: {
              type: 'array',
              description: 'Chat history. Each item: {role: "system"|"user"|"assistant", content: string}.',
              items: {
                type: 'object',
                properties: {
                  role: { type: 'string', enum: ['system', 'user', 'assistant'] },
                  content: { type: 'string' },
                },
                required: ['role', 'content'],
              },
            },
            options: {
              type: 'object',
              description: 'Ollama sampling/decoding options.',
              additionalProperties: true,
            },
          },
          required: ['model', 'messages'],
          additionalProperties: false,
        },

- server.js:385-394 (registration) The HANDLERS map that registers the 'chat' function under the key 'chat', enabling dispatch from the JSON-RPC 'tools/call' handler.

      const HANDLERS = {
        ollama_status: ollamaStatus,
        list_models: listModels,
        list_running: listRunning,
        show_model: showModel,
        generate: generate,
        chat: chat,
        pull_model: pullModel,
        delete_model: deleteModel,
      };

- server.js:275-383 (registration) The TOOLS array containing all tool definitions exposed via 'tools/list'. Entry at index 5 (lines 327-356) defines the 'chat' tool with name, description, annotations, and inputSchema.

      const TOOLS = [
        {
          name: 'ollama_status',
          description: 'Health check: whether the Ollama server is reachable and its version. Use this as a precondition before other tools if you\'re unsure whether Ollama is running.',
          annotations: { title: 'Ollama server status', readOnlyHint: true, destructiveHint: false, openWorldHint: false },
          inputSchema: { type: 'object', properties: {}, additionalProperties: false },
        },
        {
          name: 'list_models',
          description: 'List locally-installed models: name, size in bytes, digest, modified timestamp, family (e.g. llama), parameter size (e.g. 8.0B), and quantization level (e.g. Q4_K_M).',
          annotations: { title: 'List installed models', readOnlyHint: true, destructiveHint: false, openWorldHint: false },
          inputSchema: { type: 'object', properties: {}, additionalProperties: false },
        },
        {
          name: 'list_running',
          description: 'List models currently loaded into VRAM with their size, VRAM footprint, and expiry timestamp. Empty list means Ollama is idle.',
          annotations: { title: 'List running models', readOnlyHint: true, destructiveHint: false, openWorldHint: false },
          inputSchema: { type: 'object', properties: {}, additionalProperties: false },
        },
        {
          name: 'show_model',
          description: 'Show detailed information for a specific model: modelfile excerpt, parameters, template, capabilities, architecture details, quantization level.',
          annotations: { title: 'Show model details', readOnlyHint: true, destructiveHint: false, openWorldHint: false },
          inputSchema: {
            type: 'object',
            properties: {
              name: { type: 'string', description: 'Model name (e.g. "llama3.1:8b" or "forge:b6c1").' },
            },
            required: ['name'],
            additionalProperties: false,
          },
        },
        {
          name: 'generate',
          description: 'Run a one-shot text completion against a local model (non-streaming). Returns the full response text plus timing and tokens/second.',
          annotations: { title: 'Generate text', readOnlyHint: false, destructiveHint: false, openWorldHint: true },
          inputSchema: {
            type: 'object',
            properties: {
              model: { type: 'string', description: 'Model name (e.g. "llama3.1:8b").' },
              prompt: { type: 'string', description: 'Prompt text.' },
              system: { type: 'string', description: 'Optional system prompt.' },
              options: {
                type: 'object',
                description: 'Ollama sampling/decoding options — e.g. {"temperature": 0.7, "num_predict": 100, "top_p": 0.9}.',
                additionalProperties: true,
              },
            },
            required: ['model', 'prompt'],
            additionalProperties: false,
          },
        },
        {
          name: 'chat',
          description: 'Run a chat completion against a local model with message history (non-streaming). Returns the assistant\'s reply plus timing.',
          annotations: { title: 'Chat completion', readOnlyHint: false, destructiveHint: false, openWorldHint: true },
          inputSchema: {
            type: 'object',
            properties: {
              model: { type: 'string', description: 'Model name.' },
              messages: {
                type: 'array',
                description: 'Chat history. Each item: {role: "system"|"user"|"assistant", content: string}.',
                items: {
                  type: 'object',
                  properties: {
                    role: { type: 'string', enum: ['system', 'user', 'assistant'] },
                    content: { type: 'string' },
                  },
                  required: ['role', 'content'],
                },
              },
              options: {
                type: 'object',
                description: 'Ollama sampling/decoding options.',
                additionalProperties: true,
              },
            },
            required: ['model', 'messages'],
            additionalProperties: false,
          },
        },
        {
          name: 'pull_model',
          description: 'Download a model from the Ollama registry. Blocks until complete — can take a long time for multi-GB models. For very large pulls, prefer `ollama pull` in a terminal where you can watch progress.',
          annotations: { title: 'Pull model', readOnlyHint: false, destructiveHint: false, openWorldHint: true },
          inputSchema: {
            type: 'object',
            properties: {
              name: { type: 'string', description: 'Model name to pull (e.g. "llama3.1:8b").' },
            },
            required: ['name'],
            additionalProperties: false,
          },
        },
        {
          name: 'delete_model',
          description: 'Delete a locally-installed model. Does not affect the remote registry copy. Free the disk space of a model you no longer need.',
          annotations: { title: 'Delete model', readOnlyHint: false, destructiveHint: true, openWorldHint: false },
          inputSchema: {
            type: 'object',
            properties: {
              name: { type: 'string', description: 'Model name to delete.' },
            },
            required: ['name'],
            additionalProperties: false,
          },
        },
      ];

- server.js:109-114 (helper) The requireString helper used by the chat handler to validate that the 'model' argument is a non-empty string.

      function requireString(args, field) {
        if (typeof args[field] !== 'string' || !args[field].trim()) {
          return `${field} is required (non-empty string)`;
        }
        return null;
      }
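
The timing fields in the handler's result are derived from Ollama's nanosecond durations. The sketch below isolates that arithmetic so the unit conversions are easier to follow; `summarizeTiming` is a hypothetical name, and the sample numbers are made up for illustration rather than measured.

```javascript
// Mirrors the handler's conversions: nanoseconds -> milliseconds,
// and tokens/second rounded to two decimal places.
function summarizeTiming(d) {
  return {
    eval_duration_ms: d.eval_duration ? Math.round(d.eval_duration / 1e6) : null,
    total_duration_ms: d.total_duration ? Math.round(d.total_duration / 1e6) : null,
    tokens_per_second: d.eval_count && d.eval_duration
      ? Math.round((d.eval_count / (d.eval_duration / 1e9)) * 100) / 100
      : null,
  };
}

// Made-up sample: 100 tokens generated in 2 s, 2.5 s total.
const sample = { eval_count: 100, eval_duration: 2e9, total_duration: 2.5e9 };

console.log(summarizeTiming(sample));
// { eval_duration_ms: 2000, total_duration_ms: 2500, tokens_per_second: 50 }

// Missing fields fall back to null, as in the handler.
console.log(summarizeTiming({}));
// { eval_duration_ms: null, total_duration_ms: null, tokens_per_second: null }
```

Note that the `|| null` and truthiness checks treat a value of 0 the same as a missing field, so a zero count or duration is reported as null.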