chat

Send a chat history to a local Ollama model and receive the assistant's reply along with timing details. Supports system, user, and assistant messages.

Instructions

Run a chat completion against a local model with message history (non-streaming). Returns the assistant's reply plus timing.

Input Schema

Name      Required  Description
model     Yes       Model name.
messages  Yes       Chat history. Each item: {role: "system"|"user"|"assistant", content: string}.
options   No        Ollama sampling/decoding options.
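
For illustration, a well-formed call might pass arguments shaped like the following. The model name is only an example (any locally pulled model works), and the options keys mirror Ollama's documented sampling parameters:

    {
      "model": "llama3.2",
      "messages": [
        { "role": "system", "content": "You are a terse assistant." },
        { "role": "user", "content": "Name three prime numbers." }
      ],
      "options": { "temperature": 0.2, "num_predict": 128 }
    }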

Implementation Reference

  • The main handler function for the 'chat' tool. Validates args (model string, messages array with {role,content}), calls Ollama's POST /api/chat endpoint, and returns assistant reply with timing metrics.
    // ─── Tool: chat ───────────────────────────────────────────────────────────
    async function chat(args) {
      const badModel = requireString(args, 'model');
      if (badModel) return errorResult(badModel);
      if (!Array.isArray(args.messages) || !args.messages.length) {
        return errorResult('messages is required (non-empty array of {role, content} objects)');
      }
      for (const m of args.messages) {
        if (!m || typeof m !== 'object' || typeof m.role !== 'string' || typeof m.content !== 'string') {
          return errorResult('each message must be {role: "system"|"user"|"assistant", content: string}');
        }
      }
    
      const body = {
        model: args.model,
        messages: args.messages,
        stream: false,
      };
      if (args.options && typeof args.options === 'object') body.options = args.options;
    
      const r = await httpRequest('POST', '/api/chat', body);
      if (r.error) return errorResult(r.error);
      const d = r.data || {};
      return textResult({
        model: d.model || args.model,
        message: d.message || null,
        done_reason: d.done_reason || null,
        eval_count: d.eval_count || null,
        eval_duration_ms: d.eval_duration ? Math.round(d.eval_duration / 1e6) : null,
        prompt_eval_count: d.prompt_eval_count || null,
        total_duration_ms: d.total_duration ? Math.round(d.total_duration / 1e6) : null,
        tokens_per_second: d.eval_count && d.eval_duration
          ? Math.round((d.eval_count / (d.eval_duration / 1e9)) * 100) / 100
          : null,
      });
    }
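    A hedged usage sketch follows; the model name and the numbers in the commented payload are illustrative only, showing the shape of the textResult built above.

    // Sketch: invoking the handler directly (assumes 'llama3.2' is pulled locally).
    const result = await chat({
      model: 'llama3.2',
      messages: [{ role: 'user', content: 'Say hello in five words.' }],
      options: { temperature: 0 },
    });
    // The textResult payload has this shape (values illustrative):
    // {
    //   model: 'llama3.2',
    //   message: { role: 'assistant', content: '...' },
    //   done_reason: 'stop',
    //   eval_count: 9,
    //   eval_duration_ms: 150,
    //   prompt_eval_count: 16,
    //   total_duration_ms: 420,
    //   tokens_per_second: 60
    // }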
  • Schema registration for the 'chat' tool in the TOOLS array. Defines name, description, annotations, and inputSchema requiring 'model' (string) and 'messages' (array of {role: system|user|assistant, content: string}) with optional 'options' object.
    {
      name: 'chat',
      description: 'Run a chat completion against a local model with message history (non-streaming). Returns the assistant\'s reply plus timing.',
      annotations: { title: 'Chat completion', readOnlyHint: false, destructiveHint: false, openWorldHint: true },
      inputSchema: {
        type: 'object',
        properties: {
          model: { type: 'string', description: 'Model name.' },
          messages: {
            type: 'array',
            description: 'Chat history. Each item: {role: "system"|"user"|"assistant", content: string}.',
            items: {
              type: 'object',
              properties: {
                role: { type: 'string', enum: ['system', 'user', 'assistant'] },
                content: { type: 'string' },
              },
              required: ['role', 'content'],
            },
          },
          options: {
            type: 'object',
            description: 'Ollama sampling/decoding options.',
            additionalProperties: true,
          },
        },
        required: ['model', 'messages'],
        additionalProperties: false,
      },
    },
  • server.js:385-394 (registration): the HANDLERS map that binds the tool name 'chat' to the chat function, enabling JSON-RPC dispatch at line 414.
    const HANDLERS = {
      ollama_status: ollamaStatus,
      list_models: listModels,
      list_running: listRunning,
      show_model: showModel,
      generate: generate,
      chat: chat,
      pull_model: pullModel,
      delete_model: deleteModel,
    };
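    The dispatch site at server.js:414 is not excerpted above; given the standard MCP tools/call parameters ({name, arguments}), it plausibly reduces to a table lookup plus invocation, roughly:

    // Sketch only; the actual code at line 414 may differ in detail.
    const handler = HANDLERS[params.name];
    if (!handler) return errorResult(`unknown tool: ${params.name}`);
    return await handler(params.arguments || {});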
  • The httpRequest helper used by the chat handler to make the POST request to Ollama's /api/chat endpoint.
    function httpRequest(method, path, body) {
      return new Promise((resolve) => {
        let url;
        try {
          url = new URL(path, OLLAMA_URL);
        } catch (e) {
          resolve({ error: `invalid URL: ${e.message}` });
          return;
        }
        const lib = url.protocol === 'https:' ? https : http;
        const opts = {
          method,
          hostname: url.hostname,
          port: url.port || (url.protocol === 'https:' ? 443 : 80),
          path: url.pathname + url.search,
          headers: { 'accept': 'application/json' },
        };
        let bodyBuf = null;
        if (body !== undefined) {
          bodyBuf = Buffer.from(JSON.stringify(body), 'utf8');
          opts.headers['content-type'] = 'application/json';
          opts.headers['content-length'] = bodyBuf.length;
        }
        const req = lib.request(opts, (res) => {
          let chunks = Buffer.alloc(0);
          res.on('data', (d) => { chunks = Buffer.concat([chunks, d]); });
          res.on('end', () => {
            const text = chunks.toString('utf8');
            if (res.statusCode >= 400) {
              resolve({ status: res.statusCode, error: `HTTP ${res.statusCode}: ${text.slice(0, 500)}` });
              return;
            }
            // Some endpoints return text/plain (e.g. GET /); try JSON first, fall back to text.
            try { resolve({ status: res.statusCode, data: JSON.parse(text) }); }
            catch (_) { resolve({ status: res.statusCode, data: null, text }); }
          });
        });
        req.setTimeout(REQUEST_TIMEOUT_MS, () => {
          req.destroy(new Error(`request timed out after ${REQUEST_TIMEOUT_MS}ms`));
        });
        req.on('error', (e) => {
          // Give a friendly connection-refused message.
          const msg = /ECONNREFUSED|ENOTFOUND/.test(e.code || e.message)
            ? `cannot reach Ollama at ${OLLAMA_URL} — is the server running? Start it with \`ollama serve\` or open the Ollama app.`
            : e.message;
          resolve({ error: msg });
        });
        if (bodyBuf) req.write(bodyBuf);
        req.end();
      });
    }
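    Note that httpRequest always resolves and folds failures into an { error } field, so callers branch on r.error instead of using try/catch. A short sketch against Ollama's GET /api/tags endpoint (the real model-listing route):

    // Sketch: list locally installed models through the same helper.
    const r = await httpRequest('GET', '/api/tags');
    if (r.error) {
      console.error(r.error); // e.g. the friendly connection-refused message above
    } else {
      for (const m of (r.data && r.data.models) || []) console.log(m.name);
    }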
  • The requireString helper used by the chat handler to validate the 'model' argument.
    function requireString(args, field) {
      if (typeof args[field] !== 'string' || !args[field].trim()) {
        return `${field} is required (non-empty string)`;
      }
      return null;
    }
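    Its contract, illustrated with hypothetical inputs:

    requireString({ model: 'llama3.2' }, 'model'); // → null (argument is valid)
    requireString({ model: '   ' }, 'model');      // → 'model is required (non-empty string)'
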
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds behavioral details beyond annotations: it specifies non-streaming, includes timing in the return, and uses message history. However, it does not describe the return format in detail (e.g., structure of 'reply' and 'timing'), which is needed since no output schema is provided. There is no contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, front-loaded sentence that communicates purpose, scope, and key behavioral details without redundancy or unnecessary words. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has three parameters (including a complex nested object) and no output schema, the description is somewhat minimal. It mentions return content (reply and timing) but lacks specifics on timing structure or options behavior. It covers the core purpose adequately but could be more comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema already fully describes each parameter. The description adds no additional meaning to parameters beyond stating that 'messages' includes chat history with roles. This meets the baseline but does not exceed it.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('run a chat completion'), the resource ('local model'), and key differentiators ('with message history', 'non-streaming', 'returns reply plus timing'). This clearly distinguishes it from siblings like 'generate' (likely single-turn) and file manipulation tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description does not provide explicit guidance on when to use this tool versus alternatives like 'generate' or streaming options. It fails to mention prerequisites (e.g., model must exist) or when not to use it, leaving the agent to infer usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
