
chat

Engage in real-time conversations with LibreModel (Gigi), with customizable parameters such as temperature, sampling settings, and token limits. Ideal for interactive chat with a local LLM instance via the LibreModel MCP Server.

Instructions

Have a conversation with LibreModel (Gigi)

Input Schema

Name            Required  Description                                         Default
max_tokens      No        Maximum tokens to generate                          server-configured default
message         Yes       Your message to LibreModel                          (none)
system_prompt   No        Optional system prompt to prefix the conversation   "" (empty)
temperature     No        Sampling temperature (0.0-2.0)                      server-configured default
top_k           No        Top-k sampling parameter                            server-configured default
top_p           No        Nucleus sampling parameter                          server-configured default
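
A minimal sketch of invoking this tool from a TypeScript MCP client, assuming the @modelcontextprotocol/sdk client package; the launch command below is a placeholder for however you start the LibreModel MCP Server locally:

    import { Client } from "@modelcontextprotocol/sdk/client/index.js";
    import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

    // Placeholder launch command; point this at your local server build.
    const transport = new StdioClientTransport({
      command: "node",
      args: ["dist/index.js"]
    });

    const client = new Client({ name: "example-client", version: "1.0.0" });
    await client.connect(transport);

    // Only `message` is required; the other parameters fall back to server-configured defaults.
    const result = await client.callTool({
      name: "chat",
      arguments: {
        message: "Hello, Gigi!",
        temperature: 0.8,
        max_tokens: 256
      }
    });

    console.log(result.content);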

Implementation Reference

  • The handler function for the 'chat' tool. It processes input arguments, calls the callLlamaServer helper to interact with the LibreModel server, formats the response with token info, and handles errors.
    }, async (args) => {
      try {
        // Fall back to the server's configured defaults for any sampling parameter the caller omits.
        const response = await this.callLlamaServer({
          message: args.message,
          temperature: args.temperature || this.config.defaultTemperature,
          max_tokens: args.max_tokens || this.config.defaultMaxTokens,
          top_p: args.top_p || this.config.defaultTopP,
          top_k: args.top_k || this.config.defaultTopK,
          system_prompt: args.system_prompt || ""
        });
        return {
          content: [
            {
              type: "text",
              text: `**LibreModel (Gigi) responds:**\n\n${response.content}\n\n---\n*Tokens: ${response.tokens_predicted} | Model: ${response.model || "LibreModel"}*`
            }
          ]
        };
      } catch (error) {
        // Report llama-server failures as an MCP error result rather than throwing.
        return {
          content: [
            {
              type: "text",
              text: `**Error communicating with LibreModel:**\n${error instanceof Error ? error.message : String(error)}`
            }
          ],
          isError: true
        };
      }
    });
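    For illustration only, a successful call produces a single text content item shaped roughly like this (the token count and model name below are hypothetical values, not real output):

    // Hypothetical success payload assembled by the handler above.
    const exampleResult = {
      content: [
        {
          type: "text",
          text: "**LibreModel (Gigi) responds:**\n\nHello! How can I help?\n\n---\n*Tokens: 42 | Model: LibreModel*"
        }
      ]
    };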
  • Input schema using Zod for validating and describing parameters of the 'chat' tool: message, temperature, max_tokens, top_p, top_k, system_prompt.
    inputSchema: {
      message: z.string().describe("Your message to LibreModel"),
      temperature: z.number().min(0.0).max(2.0).default(this.config.defaultTemperature).describe("Sampling temperature (0.0-2.0)"),
      max_tokens: z.number().min(1).max(2048).default(this.config.defaultMaxTokens).describe("Maximum tokens to generate"),
      top_p: z.number().min(0.0).max(1.0).default(this.config.defaultTopP).describe("Nucleus sampling parameter"),
      top_k: z.number().min(1).default(this.config.defaultTopK).describe("Top-k sampling parameter"),
      system_prompt: z.string().default("").describe("Optional system prompt to prefix the conversation")
    }
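    As a side note, a shape like this can be exercised on its own with zod to see how defaults and bounds behave; the literal default values in this sketch are placeholders, not the server's actual configuration:

    import { z } from "zod";

    // Placeholder literals stand in for this.config.default* values.
    const chatInput = z.object({
      message: z.string(),
      temperature: z.number().min(0.0).max(2.0).default(0.7),
      max_tokens: z.number().min(1).max(2048).default(512),
      top_p: z.number().min(0.0).max(1.0).default(0.9),
      top_k: z.number().min(1).default(40),
      system_prompt: z.string().default("")
    });

    // Omitted optional fields pick up their defaults; out-of-range values throw a ZodError.
    const parsed = chatInput.parse({ message: "Hello, Gigi!" });
    console.log(parsed.temperature); // 0.7 (default applied)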
  • src/index.ts:68-109 (registration)
    Registers the 'chat' tool with the MCP server using registerTool, specifying title, description, inputSchema, and inline handler function.
    this.server.registerTool("chat", {
      title: "Chat with LibreModel",
      description: "Have a conversation with LibreModel (Gigi)",
      inputSchema: {
        message: z.string().describe("Your message to LibreModel"),
        temperature: z.number().min(0.0).max(2.0).default(this.config.defaultTemperature).describe("Sampling temperature (0.0-2.0)"),
        max_tokens: z.number().min(1).max(2048).default(this.config.defaultMaxTokens).describe("Maximum tokens to generate"),
        top_p: z.number().min(0.0).max(1.0).default(this.config.defaultTopP).describe("Nucleus sampling parameter"),
        top_k: z.number().min(1).default(this.config.defaultTopK).describe("Top-k sampling parameter"),
        system_prompt: z.string().default("").describe("Optional system prompt to prefix the conversation")
      }
    }, async (args) => {
      try {
        const response = await this.callLlamaServer({
          message: args.message,
          temperature: args.temperature || this.config.defaultTemperature,
          max_tokens: args.max_tokens || this.config.defaultMaxTokens,
          top_p: args.top_p || this.config.defaultTopP,
          top_k: args.top_k || this.config.defaultTopK,
          system_prompt: args.system_prompt || ""
        });
        return {
          content: [
            {
              type: "text",
              text: `**LibreModel (Gigi) responds:**\n\n${response.content}\n\n---\n*Tokens: ${response.tokens_predicted} | Model: ${response.model || "LibreModel"}*`
            }
          ]
        };
      } catch (error) {
        return {
          content: [
            {
              type: "text",
              text: `**Error communicating with LibreModel:**\n${error instanceof Error ? error.message : String(error)}`
            }
          ],
          isError: true
        };
      }
    });
  • Helper method that constructs the prompt, makes a POST request to the llama-server /completion endpoint, and returns the completion response. Used by the chat handler.
    private async callLlamaServer(params: {
      message: string;
      temperature: number;
      max_tokens: number;
      top_p: number;
      top_k: number;
      system_prompt: string;
    }): Promise<LlamaCompletionResponse> {
      // Wrap the user message (and optional system prompt) in the Human/Assistant prompt format.
      const prompt = params.system_prompt
        ? `${params.system_prompt}\n\nHuman: ${params.message}\n\nAssistant:`
        : `Human: ${params.message}\n\nAssistant:`;

      // Non-streaming completion request; llama-server expects n_predict rather than max_tokens.
      const requestBody: LlamaCompletionRequest = {
        prompt,
        temperature: params.temperature,
        n_predict: params.max_tokens,
        top_p: params.top_p,
        top_k: params.top_k,
        stop: this.config.stopSequences,
        stream: false
      };

      const response = await fetch(`${this.config.url}/completion`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(requestBody)
      });

      if (!response.ok) {
        throw new Error(`HTTP ${response.status}: ${response.statusText}`);
      }

      const data = await response.json() as LlamaCompletionResponse;

      if (!data.content) {
        throw new Error("No content in response from llama-server");
      }

      return data;
    }
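    For example, a call that supplies a system prompt produces a request body along these lines; the stop sequences shown are placeholders, since the real values come from this.config.stopSequences:

    // Illustrative JSON body POSTed to `${config.url}/completion` for
    // message "What is llama.cpp?" with a system prompt set.
    const exampleRequestBody = {
      prompt: "You are a helpful assistant.\n\nHuman: What is llama.cpp?\n\nAssistant:",
      temperature: 0.7,   // caller-supplied or the server default
      n_predict: 512,     // max_tokens mapped to llama-server's n_predict
      top_p: 0.9,
      top_k: 40,
      stop: ["Human:"],   // placeholder; actual value is this.config.stopSequences
      stream: false
    };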

MCP directory API

We provide all the information about MCP servers via our MCP directory API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/openconstruct/llama-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.