venice_chat
Send messages to Venice AI's open-source LLMs for chat responses, with configurable models, prompts, and generation parameters.
Instructions
Send a message to Venice AI and get a response from an LLM
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model | No | Model ID (e.g., llama-3.3-70b, deepseek-r1-llama-70b) | llama-3.3-70b |
| message | Yes | The user message to send | |
| system_prompt | No | Optional system prompt | |
| temperature | No | Sampling temperature (0-2) | 0.7 |
| max_tokens | No | Maximum tokens to generate | 2048 |
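
For orientation, here is a hypothetical set of arguments a client might pass to `venice_chat`. The values are illustrative only; the comments restate the defaults applied by the input schema.

```typescript
// Illustrative arguments for a venice_chat call (example values, not from the source).
const exampleArgs = {
  model: "llama-3.3-70b",                                  // optional; defaults to "llama-3.3-70b"
  message: "Summarize the Venice AI API in one sentence.", // required
  system_prompt: "You are a concise technical assistant.", // optional
  temperature: 0.7,                                        // optional; range 0-2, defaults to 0.7
  max_tokens: 2048,                                        // optional; defaults to 2048
};
```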
Implementation Reference
- `src/tools/inference/index.ts:29-42` (handler) — The handler function for the `venice_chat` tool. It builds the message array, calls the Venice AI chat completions endpoint through `veniceAPI`, and returns either the response content or an error message (a sketch of the `ChatCompletionResponse` type it casts to follows this list).

  ```typescript
  async ({ model, message, system_prompt, temperature, max_tokens }) => {
    const messages: Array<{ role: string; content: string }> = [];
    if (system_prompt) messages.push({ role: "system", content: system_prompt });
    messages.push({ role: "user", content: message });
    const response = await veniceAPI("/chat/completions", {
      method: "POST",
      body: JSON.stringify({ model, messages, temperature, max_tokens }),
    });
    const data = await response.json() as ChatCompletionResponse;
    if (!response.ok) return { content: [{ type: "text" as const, text: `Error: ${data.error?.message || response.statusText}` }] };
    return { content: [{ type: "text" as const, text: data.choices?.[0]?.message?.content || "No response" }] };
  }
  ```
- `src/tools/inference/index.ts:22-28` (schema) — Input schema using Zod to validate the parameters `model`, `message`, `system_prompt`, `temperature`, and `max_tokens` (a standalone defaulting example follows this list).

  ```typescript
  {
    model: z.string().optional().default("llama-3.3-70b").describe("Model ID (e.g., llama-3.3-70b, deepseek-r1-llama-70b)"),
    message: z.string().describe("The user message to send"),
    system_prompt: z.string().optional().describe("Optional system prompt"),
    temperature: z.number().optional().default(0.7).describe("Sampling temperature (0-2)"),
    max_tokens: z.number().optional().default(2048).describe("Maximum tokens to generate"),
  },
  ```
- `src/tools/inference/index.ts:19-43` (registration) — The `server.tool` registration for `venice_chat`, including the tool name, description, input schema, and inline handler.

  ```typescript
  server.tool(
    "venice_chat",
    "Send a message to Venice AI and get a response from an LLM",
    {
      model: z.string().optional().default("llama-3.3-70b").describe("Model ID (e.g., llama-3.3-70b, deepseek-r1-llama-70b)"),
      message: z.string().describe("The user message to send"),
      system_prompt: z.string().optional().describe("Optional system prompt"),
      temperature: z.number().optional().default(0.7).describe("Sampling temperature (0-2)"),
      max_tokens: z.number().optional().default(2048).describe("Maximum tokens to generate"),
    },
    async ({ model, message, system_prompt, temperature, max_tokens }) => {
      const messages: Array<{ role: string; content: string }> = [];
      if (system_prompt) messages.push({ role: "system", content: system_prompt });
      messages.push({ role: "user", content: message });
      const response = await veniceAPI("/chat/completions", {
        method: "POST",
        body: JSON.stringify({ model, messages, temperature, max_tokens }),
      });
      const data = await response.json() as ChatCompletionResponse;
      if (!response.ok) return { content: [{ type: "text" as const, text: `Error: ${data.error?.message || response.statusText}` }] };
      return { content: [{ type: "text" as const, text: data.choices?.[0]?.message?.content || "No response" }] };
    }
  );
  ```
- `src/index.ts:16` (registration) — Main server initialization calls `registerInferenceTools(server)`, which registers `venice_chat` alongside the other inference tools.

  ```typescript
  registerInferenceTools(server);
  ```
- `src/client/venice-api.ts:9-17` (helper) — The `veniceAPI` helper used by the handler to make authenticated HTTP requests to the Venice AI API (a configuration sketch follows this list).

  ```typescript
  export async function veniceAPI(endpoint: string, options: RequestInit = {}): Promise<Response> {
    const url = `${BASE_URL}${endpoint}`;
    const headers: Record<string, string> = {
      "Authorization": `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
      ...(options.headers as Record<string, string> || {}),
    };
    return fetch(url, { ...options, headers });
  }
  ```
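
The handler casts the parsed JSON to `ChatCompletionResponse`, a type not shown in this section. The sketch below is an assumption reconstructed from the fields the handler actually reads (`choices[0].message.content` and `error.message`), following the common OpenAI-compatible response shape; the real declaration in the repository may include more fields.

```typescript
// Assumed minimal shape of ChatCompletionResponse, based only on the fields the
// handler dereferences; the actual declaration in the repo may be richer.
interface ChatCompletionResponse {
  choices?: Array<{
    message?: { role?: string; content?: string };
  }>;
  error?: { message?: string };
}
```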
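To illustrate how the schema's defaults behave, the sketch below wraps the same validators in `z.object()` and parses a call that supplies only the required `message` field. The wrapper is for standalone illustration only; the MCP SDK applies these validators itself.

```typescript
import { z } from "zod";

// Same validators as the venice_chat input schema, wrapped in z.object() purely
// so the defaulting behavior can be demonstrated on its own.
const veniceChatInput = z.object({
  model: z.string().optional().default("llama-3.3-70b"),
  message: z.string(),
  system_prompt: z.string().optional(),
  temperature: z.number().optional().default(0.7),
  max_tokens: z.number().optional().default(2048),
});

const parsed = veniceChatInput.parse({ message: "Hello, Venice!" });
// parsed.model       === "llama-3.3-70b"
// parsed.temperature === 0.7
// parsed.max_tokens  === 2048
```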
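The `veniceAPI` helper references `BASE_URL` and `API_KEY`, which are defined elsewhere in `src/client/venice-api.ts`. The sketch below shows one plausible way they could be supplied from environment variables; the variable names and base URL are assumptions, not confirmed by the source.

```typescript
// Hypothetical configuration for the veniceAPI helper. The env var names and
// base URL below are assumptions; check src/client/venice-api.ts for the real ones.
const BASE_URL = process.env.VENICE_BASE_URL ?? "https://api.venice.ai/api/v1";
const API_KEY = process.env.VENICE_API_KEY ?? "";

if (!API_KEY) {
  console.warn("Venice API key is not set; authenticated requests will fail.");
}
```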