# chat_completion
Generate AI responses from local Ollama models through an OpenAI-compatible chat completion API.
## Instructions
OpenAI-compatible chat completion API
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model | Yes | Name of the Ollama model to use | |
| messages | Yes | Array of messages in the conversation | |
| temperature | No | Sampling temperature (0-2) | |
| timeout | No | Timeout in milliseconds | 60000 |
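For illustration, a complete arguments object might look like this (the model name `llama3.2` is a placeholder; any locally pulled Ollama model works):

```json
{
  "model": "llama3.2",
  "messages": [
    { "role": "system", "content": "You are a concise assistant." },
    { "role": "user", "content": "Why is the sky blue?" }
  ],
  "temperature": 0.7,
  "timeout": 60000
}
```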
## Implementation Reference
- src/index.ts:471-536 (handler) — The handler function that implements the core logic of the chat_completion tool: it converts the messages array to a single prompt, calls the Ollama generate API, and returns a formatted OpenAI-compatible response (see the prompt-flattening example after this list).

  ```typescript
  private async handleChatCompletion(args: any) {
    try {
      // Convert chat messages to a single prompt
      const prompt = args.messages
        .map((msg: any) => {
          switch (msg.role) {
            case 'system':
              return `System: ${msg.content}\n`;
            case 'user':
              return `User: ${msg.content}\n`;
            case 'assistant':
              return `Assistant: ${msg.content}\n`;
            default:
              return '';
          }
        })
        .join('');

      // Make request to Ollama API with configurable timeout and raw mode
      const response = await axios.post<OllamaGenerateResponse>(
        `${OLLAMA_HOST}/api/generate`,
        {
          model: args.model,
          prompt,
          stream: false,
          temperature: args.temperature,
          raw: true, // Add raw mode for more direct responses
        },
        {
          timeout: args.timeout || DEFAULT_TIMEOUT,
        }
      );

      return {
        content: [
          {
            type: 'text',
            text: JSON.stringify(
              {
                id: 'chatcmpl-' + Date.now(),
                object: 'chat.completion',
                created: Math.floor(Date.now() / 1000),
                model: args.model,
                choices: [
                  {
                    index: 0,
                    message: {
                      role: 'assistant',
                      content: response.data.response,
                    },
                    finish_reason: 'stop',
                  },
                ],
              },
              null,
              2
            ),
          },
        ],
      };
    } catch (error) {
      if (axios.isAxiosError(error)) {
        throw new McpError(
          ErrorCode.InternalError,
          `Ollama API error: ${error.response?.data?.error || error.message}`
        );
      }
      throw new McpError(ErrorCode.InternalError, `Unexpected error: ${formatError(error)}`);
    }
  }
  ```
- src/index.ts:210-248 (schema) — Input schema definition for the chat_completion tool, specifying the model, messages, temperature, and timeout parameters (see the invalid-payload example after this list).

  ```typescript
  inputSchema: {
    type: 'object',
    properties: {
      model: {
        type: 'string',
        description: 'Name of the Ollama model to use',
      },
      messages: {
        type: 'array',
        items: {
          type: 'object',
          properties: {
            role: {
              type: 'string',
              enum: ['system', 'user', 'assistant'],
            },
            content: {
              type: 'string',
            },
          },
          required: ['role', 'content'],
        },
        description: 'Array of messages in the conversation',
      },
      temperature: {
        type: 'number',
        description: 'Sampling temperature (0-2)',
        minimum: 0,
        maximum: 2,
      },
      timeout: {
        type: 'number',
        description: 'Timeout in milliseconds (default: 60000)',
        minimum: 1000,
      },
    },
    required: ['model', 'messages'],
    additionalProperties: false,
  },
  ```
- src/index.ts:207-249 (registration) — Registration of the chat_completion tool in the ListTools response, including its name, description, and input schema. The inputSchema body repeats the schema listing above verbatim, so it is elided here.

  ```typescript
  {
    name: 'chat_completion',
    description: 'OpenAI-compatible chat completion API',
    inputSchema: {
      /* identical to the schema at src/index.ts:210-248, shown above */
    },
  },
  ```
- src/index.ts:274-275 (dispatch) — The case in the CallToolRequest handler's switch statement that routes chat_completion calls to the handler function.

  ```typescript
  case 'chat_completion':
    return await this.handleChatCompletion(request.params.arguments);
  ```
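To make the handler's message-to-prompt conversion concrete, here is what the map/join in handleChatCompletion produces for a short conversation (the content strings are illustrative):

```typescript
const messages = [
  { role: 'system', content: 'You are a concise assistant.' },
  { role: 'user', content: 'Why is the sky blue?' },
];
// The handler flattens this into one raw prompt for /api/generate:
// "System: You are a concise assistant.\nUser: Why is the sky blue?\n"
```

Because the request sets raw: true, Ollama skips the model's chat template and receives these role-prefixed lines verbatim.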
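The schema above sets additionalProperties: false and bounds temperature to the 0-2 range, so a payload like the following violates it (temperature exceeds the maximum, and top_p is not a declared property):

```json
{
  "model": "llama3.2",
  "messages": [{ "role": "user", "content": "Hello" }],
  "temperature": 3.5,
  "top_p": 0.9
}
```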
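End to end, the tool is invoked over MCP rather than over HTTP directly. Below is a minimal client-side sketch using the official TypeScript SDK; the server entry point build/index.js and the model name llama3.2 are assumptions, not part of the listings above:

```typescript
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

// Spawn the MCP server over stdio; the command and path are placeholders.
const transport = new StdioClientTransport({
  command: 'node',
  args: ['build/index.js'],
});

const client = new Client({ name: 'example-client', version: '1.0.0' });
await client.connect(transport);

// Invoke chat_completion with OpenAI-style arguments.
const result = await client.callTool({
  name: 'chat_completion',
  arguments: {
    model: 'llama3.2', // any locally pulled Ollama model
    messages: [{ role: 'user', content: 'Hello!' }],
    temperature: 0.7,
  },
});

// content[0].text holds the OpenAI-compatible chat.completion JSON.
console.log(result.content);
```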