# venice_chat
Sends a message to one of Venice AI's LLMs and returns the generated response, with configurable parameters such as model selection, temperature, and token limit.
## Instructions
Send a message to Venice AI and get a response from an LLM
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model | No | Model ID (e.g., llama-3.3-70b, deepseek-r1-llama-70b) | llama-3.3-70b |
| message | Yes | The user message to send | |
| system_prompt | No | Optional system prompt | |
| temperature | No | Sampling temperature (0-2) | 0.7 |
| max_tokens | No | Maximum tokens to generate | 2048 |
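
For illustration, a call to venice_chat might pass arguments like the following; only `message` is required, and omitted fields fall back to the defaults above (the values shown are hypothetical):

```typescript
// Hypothetical arguments for a venice_chat invocation; only `message` is required.
// Omitted fields take the schema defaults (llama-3.3-70b, temperature 0.7, max_tokens 2048).
const args = {
  message: "Summarize the plot of Hamlet in two sentences.",
  system_prompt: "You are a concise literary assistant.",
  temperature: 0.4,
  max_tokens: 256,
};
```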
## Implementation Reference
- **src/tools/inference/index.ts:29-42 (handler)**: The handler function implementing the core logic of venice_chat. It builds the messages array, calls the Venice AI chat completions API, handles errors, and formats the response.

  ```typescript
  async ({ model, message, system_prompt, temperature, max_tokens }) => {
    const messages: Array<{role: string; content: string}> = [];
    if (system_prompt) messages.push({ role: "system", content: system_prompt });
    messages.push({ role: "user", content: message });

    const response = await veniceAPI("/chat/completions", {
      method: "POST",
      body: JSON.stringify({ model, messages, temperature, max_tokens }),
    });

    const data = await response.json() as ChatCompletionResponse;
    if (!response.ok) return { content: [{ type: "text" as const, text: `Error: ${data.error?.message || response.statusText}` }] };

    return { content: [{ type: "text" as const, text: data.choices?.[0]?.message?.content || "No response" }] };
  }
  ```
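
  The `ChatCompletionResponse` type is referenced but not defined in this excerpt. Based on the fields the handler reads (`error?.message` and `choices?.[0]?.message?.content`), a minimal sketch of its shape might look like the following; the actual declaration may include additional properties.

  ```typescript
  // Minimal sketch of ChatCompletionResponse, inferred from the fields the handler accesses.
  // The real interface in the repository may declare more properties.
  interface ChatCompletionResponse {
    choices?: Array<{
      message?: { role?: string; content?: string };
    }>;
    error?: { message?: string };
  }
  ```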
- **src/tools/inference/index.ts:22-28 (schema)**: Zod schema defining the input parameters for the venice_chat tool.

  ```typescript
  {
    model: z.string().optional().default("llama-3.3-70b").describe("Model ID (e.g., llama-3.3-70b, deepseek-r1-llama-70b)"),
    message: z.string().describe("The user message to send"),
    system_prompt: z.string().optional().describe("Optional system prompt"),
    temperature: z.number().optional().default(0.7).describe("Sampling temperature (0-2)"),
    max_tokens: z.number().optional().default(2048).describe("Maximum tokens to generate"),
  },
  ```
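
  As an illustration of how the defaults behave, wrapping these fields in `z.object(...)` and parsing a minimal input fills in the omitted values (a sketch, not code from the repository):

  ```typescript
  import { z } from "zod";

  // Sketch: the same field definitions wrapped in z.object to show defaulting behavior.
  const chatInput = z.object({
    model: z.string().optional().default("llama-3.3-70b"),
    message: z.string(),
    system_prompt: z.string().optional(),
    temperature: z.number().optional().default(0.7),
    max_tokens: z.number().optional().default(2048),
  });

  // Only `message` is supplied; the remaining fields take their declared defaults.
  const parsed = chatInput.parse({ message: "Hello, Venice!" });
  // parsed.model === "llama-3.3-70b", parsed.temperature === 0.7, parsed.max_tokens === 2048
  ```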
- **src/tools/inference/index.ts:19-43 (registration)**: Registers the venice_chat tool on the MCP server with its name, description, input schema, and handler function.

  ```typescript
  server.tool(
    "venice_chat",
    "Send a message to Venice AI and get a response from an LLM",
    {
      model: z.string().optional().default("llama-3.3-70b").describe("Model ID (e.g., llama-3.3-70b, deepseek-r1-llama-70b)"),
      message: z.string().describe("The user message to send"),
      system_prompt: z.string().optional().describe("Optional system prompt"),
      temperature: z.number().optional().default(0.7).describe("Sampling temperature (0-2)"),
      max_tokens: z.number().optional().default(2048).describe("Maximum tokens to generate"),
    },
    async ({ model, message, system_prompt, temperature, max_tokens }) => {
      const messages: Array<{role: string; content: string}> = [];
      if (system_prompt) messages.push({ role: "system", content: system_prompt });
      messages.push({ role: "user", content: message });

      const response = await veniceAPI("/chat/completions", {
        method: "POST",
        body: JSON.stringify({ model, messages, temperature, max_tokens }),
      });

      const data = await response.json() as ChatCompletionResponse;
      if (!response.ok) return { content: [{ type: "text" as const, text: `Error: ${data.error?.message || response.statusText}` }] };

      return { content: [{ type: "text" as const, text: data.choices?.[0]?.message?.content || "No response" }] };
    }
  );
  ```
- **src/index.ts:16 (registration)**: Global registration call that invokes `registerInferenceTools`, which includes the venice_chat tool.

  ```typescript
  registerInferenceTools(server);
  ```
- **src/client/venice-api.ts:9-17 (helper)**: Helper function used by the venice_chat handler to make authenticated HTTP requests to the Venice AI API.

  ```typescript
  export async function veniceAPI(endpoint: string, options: RequestInit = {}): Promise<Response> {
    const url = `${BASE_URL}${endpoint}`;
    const headers: Record<string, string> = {
      "Authorization": `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
      ...(options.headers as Record<string, string> || {}),
    };
    return fetch(url, { ...options, headers });
  }
  ```
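
  A minimal usage sketch, mirroring the request the venice_chat handler makes (`BASE_URL` and `API_KEY` are referenced by the helper but defined elsewhere in src/client/venice-api.ts; the body values here are illustrative):

  ```typescript
  // Sketch: calling the helper the same way the venice_chat handler does.
  const response = await veniceAPI("/chat/completions", {
    method: "POST",
    body: JSON.stringify({
      model: "llama-3.3-70b",
      messages: [{ role: "user", content: "Hello" }],
      temperature: 0.7,
      max_tokens: 2048,
    }),
  });

  if (!response.ok) {
    console.error(`Venice API error: ${response.status} ${response.statusText}`);
  }
  ```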