# venice_chat
Sends a message to one of Venice AI's LLMs and returns the generated response, with configurable parameters such as model selection, temperature, and token limit.
## Instructions
Send a message to Venice AI and get a response from an LLM
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model | No | Model ID (e.g., llama-3.3-70b, deepseek-r1-llama-70b) | llama-3.3-70b |
| message | Yes | The user message to send | |
| system_prompt | No | Optional system prompt | |
| temperature | No | Sampling temperature (0-2) | 0.7 |
| max_tokens | No | Maximum tokens to generate | 2048 |
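
For illustration, a call to venice_chat might pass arguments like the following; only `message` is required, and omitted fields fall back to the defaults above (the values shown are hypothetical):

```typescript
// Hypothetical arguments for a venice_chat invocation; only `message` is required.
// Omitted fields take the schema defaults (llama-3.3-70b, temperature 0.7, max_tokens 2048).
const args = {
  message: "Summarize the plot of Hamlet in two sentences.",
  system_prompt: "You are a concise literary assistant.",
  temperature: 0.4,
  max_tokens: 256,
};
```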
## Implementation Reference
- **src/tools/inference/index.ts:29-42 (handler)**: The handler function implementing the core logic of venice_chat. It builds the messages array, calls the Venice AI chat completions API, handles errors, and formats the response.

  ```typescript
  async ({ model, message, system_prompt, temperature, max_tokens }) => {
    const messages: Array<{role: string; content: string}> = [];
    if (system_prompt) messages.push({ role: "system", content: system_prompt });
    messages.push({ role: "user", content: message });

    const response = await veniceAPI("/chat/completions", {
      method: "POST",
      body: JSON.stringify({ model, messages, temperature, max_tokens }),
    });

    const data = await response.json() as ChatCompletionResponse;
    if (!response.ok) return { content: [{ type: "text" as const, text: `Error: ${data.error?.message || response.statusText}` }] };

    return { content: [{ type: "text" as const, text: data.choices?.[0]?.message?.content || "No response" }] };
  }
  ```
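
  The `ChatCompletionResponse` type is referenced but not defined in this excerpt. Based on the fields the handler reads (`error?.message` and `choices?.[0]?.message?.content`), a minimal sketch of its shape might look like the following; the actual declaration may include additional properties.

  ```typescript
  // Minimal sketch of ChatCompletionResponse, inferred from the fields the handler accesses.
  // The real interface in the repository may declare more properties.
  interface ChatCompletionResponse {
    choices?: Array<{
      message?: { role?: string; content?: string };
    }>;
    error?: { message?: string };
  }
  ```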
- **src/tools/inference/index.ts:22-28 (schema)**: Zod schema defining the input parameters for the venice_chat tool.

  ```typescript
  {
    model: z.string().optional().default("llama-3.3-70b").describe("Model ID (e.g., llama-3.3-70b, deepseek-r1-llama-70b)"),
    message: z.string().describe("The user message to send"),
    system_prompt: z.string().optional().describe("Optional system prompt"),
    temperature: z.number().optional().default(0.7).describe("Sampling temperature (0-2)"),
    max_tokens: z.number().optional().default(2048).describe("Maximum tokens to generate"),
  },
  ```
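
  As an illustration of how the defaults behave, wrapping these fields in `z.object(...)` and parsing a minimal input fills in the omitted values (a sketch, not code from the repository):

  ```typescript
  import { z } from "zod";

  // Sketch: the same field definitions wrapped in z.object to show defaulting behavior.
  const chatInput = z.object({
    model: z.string().optional().default("llama-3.3-70b"),
    message: z.string(),
    system_prompt: z.string().optional(),
    temperature: z.number().optional().default(0.7),
    max_tokens: z.number().optional().default(2048),
  });

  // Only `message` is supplied; the remaining fields take their declared defaults.
  const parsed = chatInput.parse({ message: "Hello, Venice!" });
  // parsed.model === "llama-3.3-70b", parsed.temperature === 0.7, parsed.max_tokens === 2048
  ```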
- **src/tools/inference/index.ts:19-43 (registration)**: Registers the venice_chat tool on the MCP server with its name, description, input schema, and handler function.

  ```typescript
  server.tool(
    "venice_chat",
    "Send a message to Venice AI and get a response from an LLM",
    {
      model: z.string().optional().default("llama-3.3-70b").describe("Model ID (e.g., llama-3.3-70b, deepseek-r1-llama-70b)"),
      message: z.string().describe("The user message to send"),
      system_prompt: z.string().optional().describe("Optional system prompt"),
      temperature: z.number().optional().default(0.7).describe("Sampling temperature (0-2)"),
      max_tokens: z.number().optional().default(2048).describe("Maximum tokens to generate"),
    },
    async ({ model, message, system_prompt, temperature, max_tokens }) => {
      const messages: Array<{role: string; content: string}> = [];
      if (system_prompt) messages.push({ role: "system", content: system_prompt });
      messages.push({ role: "user", content: message });

      const response = await veniceAPI("/chat/completions", {
        method: "POST",
        body: JSON.stringify({ model, messages, temperature, max_tokens }),
      });

      const data = await response.json() as ChatCompletionResponse;
      if (!response.ok) return { content: [{ type: "text" as const, text: `Error: ${data.error?.message || response.statusText}` }] };

      return { content: [{ type: "text" as const, text: data.choices?.[0]?.message?.content || "No response" }] };
    }
  );
  ```
- **src/index.ts:16 (registration)**: Global registration call that invokes `registerInferenceTools`, which includes the venice_chat tool.

  ```typescript
  registerInferenceTools(server);
  ```
- **src/client/venice-api.ts:9-17 (helper)**: Helper function used by the venice_chat handler to make authenticated HTTP requests to the Venice AI API.

  ```typescript
  export async function veniceAPI(endpoint: string, options: RequestInit = {}): Promise<Response> {
    const url = `${BASE_URL}${endpoint}`;
    const headers: Record<string, string> = {
      "Authorization": `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
      ...(options.headers as Record<string, string> || {}),
    };
    return fetch(url, { ...options, headers });
  }
  ```
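
  A minimal usage sketch, mirroring the request the venice_chat handler makes (`BASE_URL` and `API_KEY` are referenced by the helper but defined elsewhere in src/client/venice-api.ts; the body values here are illustrative):

  ```typescript
  // Sketch: calling the helper the same way the venice_chat handler does.
  const response = await veniceAPI("/chat/completions", {
    method: "POST",
    body: JSON.stringify({
      model: "llama-3.3-70b",
      messages: [{ role: "user", content: "Hello" }],
      temperature: 0.7,
      max_tokens: 2048,
    }),
  });

  if (!response.ok) {
    console.error(`Venice API error: ${response.status} ${response.statusText}`);
  }
  ```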