chat
Chat with LibreModel (Gigi) from Claude Desktop. The tool forwards your message to a local llama-server instance and lets you steer generation with configurable sampling parameters (temperature, top_p, top_k, max_tokens) and an optional system prompt.
Instructions
Have a conversation with LibreModel (Gigi)
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| message | Yes | Your message to LibreModel | |
| temperature | No | Sampling temperature (0.0-2.0) | Server config default |
| max_tokens | No | Maximum tokens to generate (1-2048) | Server config default |
| top_p | No | Nucleus sampling parameter (0.0-1.0) | Server config default |
| top_k | No | Top-k sampling parameter (minimum 1) | Server config default |
| system_prompt | No | Optional system prompt to prefix the conversation | "" (none) |
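
For illustration, an MCP client such as Claude Desktop might call this tool with arguments like the following. This is a minimal sketch: the message and every parameter value are made up, and any optional field left out falls back to the server's configured default.

```typescript
// Hypothetical arguments for a tools/call request to the "chat" tool.
// Only "message" is required; the other fields override the config defaults.
const chatArgs = {
  message: "Give me a two-sentence summary of nucleus sampling.",
  temperature: 0.7,   // 0.0-2.0
  max_tokens: 256,    // 1-2048
  top_p: 0.9,         // nucleus sampling
  top_k: 40,          // top-k sampling
  system_prompt: "You are Gigi, a concise and friendly assistant."
};
```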
Implementation Reference
- src/index.ts:79-109 (handler): Handler function for the 'chat' tool that invokes callLlamaServer with the user's parameters and formats the response as MCP content.

  ```typescript
  }, async (args) => {
    try {
      const response = await this.callLlamaServer({
        message: args.message,
        temperature: args.temperature || this.config.defaultTemperature,
        max_tokens: args.max_tokens || this.config.defaultMaxTokens,
        top_p: args.top_p || this.config.defaultTopP,
        top_k: args.top_k || this.config.defaultTopK,
        system_prompt: args.system_prompt || ""
      });

      return {
        content: [
          {
            type: "text",
            text: `**LibreModel (Gigi) responds:**\n\n${response.content}\n\n---\n*Tokens: ${response.tokens_predicted} | Model: ${response.model || "LibreModel"}*`
          }
        ]
      };
    } catch (error) {
      return {
        content: [
          {
            type: "text",
            text: `**Error communicating with LibreModel:**\n${error instanceof Error ? error.message : String(error)}`
          }
        ],
        isError: true
      };
    }
  });
  ```
- src/index.ts:71-78 (schema): Zod input schema defining parameters for the 'chat' tool.

  ```typescript
  inputSchema: {
    message: z.string().describe("Your message to LibreModel"),
    temperature: z.number().min(0.0).max(2.0).default(this.config.defaultTemperature).describe("Sampling temperature (0.0-2.0)"),
    max_tokens: z.number().min(1).max(2048).default(this.config.defaultMaxTokens).describe("Maximum tokens to generate"),
    top_p: z.number().min(0.0).max(1.0).default(this.config.defaultTopP).describe("Nucleus sampling parameter"),
    top_k: z.number().min(1).default(this.config.defaultTopK).describe("Top-k sampling parameter"),
    system_prompt: z.string().default("").describe("Optional system prompt to prefix the conversation")
  }
  ```
- src/index.ts:68-109 (registration): Registration of the 'chat' tool on the MCP server, including title, description, schema, and handler. The skeleton below omits the schema and handler bodies, which are reproduced in full in the two references above.

  ```typescript
  this.server.registerTool("chat", {
    title: "Chat with LibreModel",
    description: "Have a conversation with LibreModel (Gigi)",
    inputSchema: {
      // Zod parameter definitions: see the schema reference above (src/index.ts:71-78)
    }
  }, async (args) => {
    // Handler body: see the handler reference above (src/index.ts:79-109)
  });
  ```
- src/index.ts:297-338 (helper): Helper method that constructs the prompt and makes the HTTP POST request to the llama-server /completion endpoint, used by the chat handler.

  ```typescript
  private async callLlamaServer(params: {
    message: string;
    temperature: number;
    max_tokens: number;
    top_p: number;
    top_k: number;
    system_prompt: string;
  }): Promise<LlamaCompletionResponse> {
    const prompt = params.system_prompt
      ? `${params.system_prompt}\n\nHuman: ${params.message}\n\nAssistant:`
      : `Human: ${params.message}\n\nAssistant:`;

    const requestBody: LlamaCompletionRequest = {
      prompt,
      temperature: params.temperature,
      n_predict: params.max_tokens,
      top_p: params.top_p,
      top_k: params.top_k,
      stop: this.config.stopSequences,
      stream: false
    };

    const response = await fetch(`${this.config.url}/completion`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(requestBody)
    });

    if (!response.ok) {
      throw new Error(`HTTP ${response.status}: ${response.statusText}`);
    }

    const data = await response.json() as LlamaCompletionResponse;

    if (!data.content) {
      throw new Error("No content in response from llama-server");
    }

    return data;
  }
  ```
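
To make the helper's behavior concrete, here is a sketch of the request body it would send to the llama-server /completion endpoint for a short message with a system prompt. The field values and the stop sequence are illustrative; the real ones come from the tool arguments and this.config.

```typescript
// Illustrative POST body for `${config.url}/completion` (values are examples).
const exampleRequestBody = {
  prompt: "You are Gigi.\n\nHuman: Hello!\n\nAssistant:",
  temperature: 0.7,
  n_predict: 256,       // llama-server's field for the maximum tokens to generate
  top_p: 0.9,
  top_k: 40,
  stop: ["\nHuman:"],   // assumed stop sequence; actually taken from config.stopSequences
  stream: false
};
```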