chat
Engage in real-time conversations with LibreModel (Gigi) using configurable parameters such as the message, temperature, and token limits. Ideal for interactive chat with a local LLM instance via the LibreModel MCP Server.
Instructions
Have a conversation with LibreModel (Gigi)
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| max_tokens | No | Maximum tokens to generate | |
| message | Yes | Your message to LibreModel | |
| system_prompt | No | Optional system prompt to prefix the conversation | |
| temperature | No | Sampling temperature (0.0-2.0) | |
| top_k | No | Top-k sampling parameter | |
| top_p | No | Nucleus sampling parameter | |
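
For orientation, a tool call might pass arguments like the following. This is a minimal sketch: only `message` is required, the specific values are illustrative, and omitted optional fields fall back to the server's configured defaults.

```typescript
// Illustrative arguments for the "chat" tool; values are examples, not shipped defaults.
const chatArgs = {
  message: "Summarize the plot of Moby-Dick in two sentences.",
  system_prompt: "You are Gigi, a concise and friendly assistant.",
  temperature: 0.7, // sampling temperature, 0.0-2.0
  max_tokens: 256,  // maximum tokens to generate, 1-2048
  top_p: 0.9,       // nucleus sampling parameter
  top_k: 40         // top-k sampling parameter
};
```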
Implementation Reference
- src/index.ts:79-109 (handler): The handler function for the `chat` tool. It processes the input arguments, calls the `callLlamaServer` helper to interact with the LibreModel server, formats the response with token info, and handles errors.

  ```typescript
  }, async (args) => {
    try {
      const response = await this.callLlamaServer({
        message: args.message,
        temperature: args.temperature || this.config.defaultTemperature,
        max_tokens: args.max_tokens || this.config.defaultMaxTokens,
        top_p: args.top_p || this.config.defaultTopP,
        top_k: args.top_k || this.config.defaultTopK,
        system_prompt: args.system_prompt || ""
      });

      return {
        content: [
          {
            type: "text",
            text: `**LibreModel (Gigi) responds:**\n\n${response.content}\n\n---\n*Tokens: ${response.tokens_predicted} | Model: ${response.model || "LibreModel"}*`
          }
        ]
      };
    } catch (error) {
      return {
        content: [
          {
            type: "text",
            text: `**Error communicating with LibreModel:**\n${error instanceof Error ? error.message : String(error)}`
          }
        ],
        isError: true
      };
    }
  });
  ```
- src/index.ts:71-77 (schema): Input schema using Zod for validating and describing the `chat` tool's parameters: message, temperature, max_tokens, top_p, top_k, system_prompt.

  ```typescript
  inputSchema: {
    message: z.string().describe("Your message to LibreModel"),
    temperature: z.number().min(0.0).max(2.0).default(this.config.defaultTemperature).describe("Sampling temperature (0.0-2.0)"),
    max_tokens: z.number().min(1).max(2048).default(this.config.defaultMaxTokens).describe("Maximum tokens to generate"),
    top_p: z.number().min(0.0).max(1.0).default(this.config.defaultTopP).describe("Nucleus sampling parameter"),
    top_k: z.number().min(1).default(this.config.defaultTopK).describe("Top-k sampling parameter"),
    system_prompt: z.string().default("").describe("Optional system prompt to prefix the conversation")
  ```
- src/index.ts:68-109 (registration): Registers the `chat` tool with the MCP server via `registerTool`, specifying the title, description, input schema, and inline handler function.

  ```typescript
  this.server.registerTool("chat", {
    title: "Chat with LibreModel",
    description: "Have a conversation with LibreModel (Gigi)",
    inputSchema: {
      message: z.string().describe("Your message to LibreModel"),
      temperature: z.number().min(0.0).max(2.0).default(this.config.defaultTemperature).describe("Sampling temperature (0.0-2.0)"),
      max_tokens: z.number().min(1).max(2048).default(this.config.defaultMaxTokens).describe("Maximum tokens to generate"),
      top_p: z.number().min(0.0).max(1.0).default(this.config.defaultTopP).describe("Nucleus sampling parameter"),
      top_k: z.number().min(1).default(this.config.defaultTopK).describe("Top-k sampling parameter"),
      system_prompt: z.string().default("").describe("Optional system prompt to prefix the conversation")
    }
  }, async (args) => {
    try {
      const response = await this.callLlamaServer({
        message: args.message,
        temperature: args.temperature || this.config.defaultTemperature,
        max_tokens: args.max_tokens || this.config.defaultMaxTokens,
        top_p: args.top_p || this.config.defaultTopP,
        top_k: args.top_k || this.config.defaultTopK,
        system_prompt: args.system_prompt || ""
      });

      return {
        content: [
          {
            type: "text",
            text: `**LibreModel (Gigi) responds:**\n\n${response.content}\n\n---\n*Tokens: ${response.tokens_predicted} | Model: ${response.model || "LibreModel"}*`
          }
        ]
      };
    } catch (error) {
      return {
        content: [
          {
            type: "text",
            text: `**Error communicating with LibreModel:**\n${error instanceof Error ? error.message : String(error)}`
          }
        ],
        isError: true
      };
    }
  });
  ```
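
  For context, an MCP client built with the TypeScript SDK could invoke the registered tool roughly as sketched below. This is an assumption-laden example, not code from this repository: the transport command, client name, and version string are placeholders for however the LibreModel MCP Server is actually launched.

  ```typescript
  import { Client } from "@modelcontextprotocol/sdk/client/index.js";
  import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

  // Hypothetical launch command for the LibreModel MCP Server (placeholder).
  const transport = new StdioClientTransport({
    command: "node",
    args: ["dist/index.js"]
  });

  const client = new Client({ name: "example-client", version: "1.0.0" });
  await client.connect(transport);

  // Call the "chat" tool registered above; omitted parameters use server defaults.
  const result = await client.callTool({
    name: "chat",
    arguments: {
      message: "Hello, Gigi!",
      temperature: 0.8
    }
  });

  console.log(result.content);
  ```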
- src/index.ts:297-338 (helper): Helper method that constructs the prompt, makes a POST request to the llama-server `/completion` endpoint, and returns the completion response. Used by the chat handler.

  ```typescript
  private async callLlamaServer(params: {
    message: string;
    temperature: number;
    max_tokens: number;
    top_p: number;
    top_k: number;
    system_prompt: string;
  }): Promise<LlamaCompletionResponse> {
    const prompt = params.system_prompt
      ? `${params.system_prompt}\n\nHuman: ${params.message}\n\nAssistant:`
      : `Human: ${params.message}\n\nAssistant:`;

    const requestBody: LlamaCompletionRequest = {
      prompt,
      temperature: params.temperature,
      n_predict: params.max_tokens,
      top_p: params.top_p,
      top_k: params.top_k,
      stop: this.config.stopSequences,
      stream: false
    };

    const response = await fetch(`${this.config.url}/completion`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(requestBody)
    });

    if (!response.ok) {
      throw new Error(`HTTP ${response.status}: ${response.statusText}`);
    }

    const data = await response.json() as LlamaCompletionResponse;

    if (!data.content) {
      throw new Error("No content in response from llama-server");
    }

    return data;
  }
  ```
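
  For debugging outside the MCP layer, the same llama-server endpoint can be exercised directly. A minimal sketch, assuming llama-server is listening on http://localhost:8080 and using the same request/response fields that `callLlamaServer` relies on; the URL, stop sequences, and values shown here are assumptions, not configuration from this project.

  ```typescript
  // Standalone sketch of the request callLlamaServer sends to llama-server's
  // /completion endpoint. URL and stop sequences are assumed for illustration.
  const body = {
    prompt: "Human: What is the capital of France?\n\nAssistant:",
    temperature: 0.8,
    n_predict: 128, // llama-server's name for the max-tokens limit
    top_p: 0.9,
    top_k: 40,
    stop: ["Human:"],
    stream: false
  };

  const res = await fetch("http://localhost:8080/completion", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body)
  });

  if (!res.ok) {
    throw new Error(`HTTP ${res.status}: ${res.statusText}`);
  }

  // The response carries the generated text in `content` and token stats
  // such as `tokens_predicted`, which the chat handler surfaces to the user.
  const data = await res.json();
  console.log(data.content, data.tokens_predicted);
  ```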