
venice_chat

Send messages to Venice AI's language models to generate responses, with configurable parameters such as model selection, temperature, and token limits.

Instructions

Send a message to Venice AI and get a response from an LLM

Input Schema

Name           Required  Description                                             Default
model          No        Model ID (e.g., llama-3.3-70b, deepseek-r1-llama-70b)   llama-3.3-70b
message        Yes       The user message to send
system_prompt  No        Optional system prompt
temperature    No        Sampling temperature (0-2)                              0.7
max_tokens     No        Maximum tokens to generate                              2048
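
For illustration, a client might call venice_chat with arguments like the following. This is a minimal sketch: the values are examples, and the commented client call assumes the MCP TypeScript SDK's callTool method rather than anything specific to this server.

    // Illustrative arguments object matching the input schema above.
    const args = {
      model: "llama-3.3-70b",                         // optional; defaults to llama-3.3-70b
      message: "Summarize the plot of Hamlet in two sentences.",
      system_prompt: "You are a concise assistant.",  // optional
      temperature: 0.7,                               // optional; schema default is 0.7
      max_tokens: 512,                                // optional; schema default is 2048
    };

    // With an MCP client (assumed API):
    // const result = await client.callTool({ name: "venice_chat", arguments: args });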

Implementation Reference

  • The handler function that implements the core logic of the venice_chat tool: constructs messages, calls the Venice AI chat completions API, handles errors, and formats the response. (A plausible shape for ChatCompletionResponse is sketched after this list.)
    async ({ model, message, system_prompt, temperature, max_tokens }) => {
      const messages: Array<{ role: string; content: string }> = [];
      if (system_prompt) messages.push({ role: "system", content: system_prompt });
      messages.push({ role: "user", content: message });
      const response = await veniceAPI("/chat/completions", {
        method: "POST",
        body: JSON.stringify({ model, messages, temperature, max_tokens }),
      });
      const data = await response.json() as ChatCompletionResponse;
      if (!response.ok) {
        return { content: [{ type: "text" as const, text: `Error: ${data.error?.message || response.statusText}` }] };
      }
      return { content: [{ type: "text" as const, text: data.choices?.[0]?.message?.content || "No response" }] };
    }
  • Zod schema defining the input parameters for the venice_chat tool.
    {
      model: z.string().optional().default("llama-3.3-70b").describe("Model ID (e.g., llama-3.3-70b, deepseek-r1-llama-70b)"),
      message: z.string().describe("The user message to send"),
      system_prompt: z.string().optional().describe("Optional system prompt"),
      temperature: z.number().optional().default(0.7).describe("Sampling temperature (0-2)"),
      max_tokens: z.number().optional().default(2048).describe("Maximum tokens to generate"),
    },
  • Registers the venice_chat tool on the MCP server with name, description, input schema, and handler function.
    server.tool( "venice_chat", "Send a message to Venice AI and get a response from an LLM", { model: z.string().optional().default("llama-3.3-70b").describe("Model ID (e.g., llama-3.3-70b, deepseek-r1-llama-70b)"), message: z.string().describe("The user message to send"), system_prompt: z.string().optional().describe("Optional system prompt"), temperature: z.number().optional().default(0.7).describe("Sampling temperature (0-2)"), max_tokens: z.number().optional().default(2048).describe("Maximum tokens to generate"), }, async ({ model, message, system_prompt, temperature, max_tokens }) => { const messages: Array<{role: string; content: string}> = []; if (system_prompt) messages.push({ role: "system", content: system_prompt }); messages.push({ role: "user", content: message }); const response = await veniceAPI("/chat/completions", { method: "POST", body: JSON.stringify({ model, messages, temperature, max_tokens }), }); const data = await response.json() as ChatCompletionResponse; if (!response.ok) return { content: [{ type: "text" as const, text: `Error: ${data.error?.message || response.statusText}` }] }; return { content: [{ type: "text" as const, text: data.choices?.[0]?.message?.content || "No response" }] }; } );
  • src/index.ts:16-16 (registration)
    Global registration call that invokes registerInferenceTools, which registers the venice_chat tool along with the other inference tools. (A hypothetical sketch of registerInferenceTools appears after this list.)
    registerInferenceTools(server);
  • Helper function used by the venice_chat handler to make authenticated HTTP requests to the Venice AI API. (The BASE_URL and API_KEY configuration it assumes is sketched after this list.)
    export async function veniceAPI(endpoint: string, options: RequestInit = {}): Promise<Response> {
      const url = `${BASE_URL}${endpoint}`;
      const headers: Record<string, string> = {
        "Authorization": `Bearer ${API_KEY}`,
        "Content-Type": "application/json",
        ...(options.headers as Record<string, string> || {}),
      };
      return fetch(url, { ...options, headers });
    }
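
The handler casts the parsed response to a ChatCompletionResponse type that is not shown in this listing. A minimal sketch consistent with the fields the handler reads (an assumption, not the project's actual definition):

    // Assumed shape: only the fields the venice_chat handler accesses are modeled.
    interface ChatCompletionResponse {
      choices?: Array<{
        message?: {
          content?: string;
        };
      }>;
      error?: {
        message?: string;
      };
    }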
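
veniceAPI relies on BASE_URL and API_KEY constants defined elsewhere in the project. A plausible sketch, assuming an environment-variable-based setup (the variable name and base URL are assumptions, not confirmed by this listing):

    // Assumed configuration; adjust the env var name and base URL to match the project.
    const BASE_URL = process.env.VENICE_BASE_URL || "https://api.venice.ai/api/v1";
    const API_KEY = process.env.VENICE_API_KEY || "";

    if (!API_KEY) {
      console.error("VENICE_API_KEY is not set; requests to the Venice AI API will fail.");
    }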
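
registerInferenceTools itself is not shown above. A hedged sketch of how such a module might look, assuming it simply groups tool registrations (the file layout and signature are guesses, not taken from the source):

    // Hypothetical module grouping inference-related tool registrations.
    import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";

    export function registerInferenceTools(server: McpServer): void {
      // venice_chat is registered here via the server.tool(...) call shown above.
      registerVeniceChat(server);
      // Any other inference tools would be registered here as well.
    }

    function registerVeniceChat(server: McpServer): void {
      // Body elided: identical to the server.tool("venice_chat", ...) snippet above.
      void server;
    }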

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/georgeglarson/venice-mcp'
