
venice_chat

Send messages to Venice AI's open-source LLMs for chat responses, with configurable models, prompts, and generation parameters.

Instructions

Send a message to Venice AI and get a response from an LLM

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| model | No | Model ID (e.g., llama-3.3-70b, deepseek-r1-llama-70b) | llama-3.3-70b |
| message | Yes | The user message to send | |
| system_prompt | No | Optional system prompt | |
| temperature | No | Sampling temperature (0-2) | 0.7 |
| max_tokens | No | Maximum tokens to generate | 2048 |
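
As a usage sketch, an MCP client can invoke the tool with arguments matching this schema. The snippet below assumes an already-connected Client from the MCP TypeScript SDK; the tool name and arguments follow the table above, and only message is required:

    // Usage sketch: calling venice_chat from an MCP client.
    import { Client } from "@modelcontextprotocol/sdk/client/index.js";

    declare const client: Client; // assumed to be already connected to the server

    const result = await client.callTool({
      name: "venice_chat",
      arguments: {
        message: "Summarize the plot of Hamlet in two sentences.",
        model: "llama-3.3-70b", // optional; matches the default
        temperature: 0.7,       // optional; matches the default
      },
    });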

Implementation Reference

  • The server.tool registration for venice_chat, comprising the tool name, description, Zod input schema (model, message, system_prompt, temperature, max_tokens), and the inline handler. The handler builds the messages array from the optional system prompt and the user message, calls the Venice AI chat completions API via the veniceAPI helper, and returns the assistant's reply or an error message.
    server.tool(
      "venice_chat",
      "Send a message to Venice AI and get a response from an LLM",
      {
        model: z.string().optional().default("llama-3.3-70b").describe("Model ID (e.g., llama-3.3-70b, deepseek-r1-llama-70b)"),
        message: z.string().describe("The user message to send"),
        system_prompt: z.string().optional().describe("Optional system prompt"),
        temperature: z.number().optional().default(0.7).describe("Sampling temperature (0-2)"),
        max_tokens: z.number().optional().default(2048).describe("Maximum tokens to generate"),
      },
      async ({ model, message, system_prompt, temperature, max_tokens }) => {
        // Build the chat history: optional system prompt first, then the user message.
        const messages: Array<{ role: string; content: string }> = [];
        if (system_prompt) messages.push({ role: "system", content: system_prompt });
        messages.push({ role: "user", content: message });
    
        // POST to the Venice AI chat completions endpoint via the authenticated helper.
        const response = await veniceAPI("/chat/completions", {
          method: "POST",
          body: JSON.stringify({ model, messages, temperature, max_tokens }),
        });
    
        // Parse the body before checking response.ok so the API's error message is available on failure.
        const data = await response.json() as ChatCompletionResponse;
        if (!response.ok) return { content: [{ type: "text" as const, text: `Error: ${data.error?.message || response.statusText}` }] };
        return { content: [{ type: "text" as const, text: data.choices?.[0]?.message?.content || "No response" }] };
      }
    );
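  • The handler casts the parsed body to ChatCompletionResponse, a type not shown on this page. The sketch below is inferred from the handler's property accesses (choices[0].message.content on success, error.message on failure); the actual definition in the repository may carry more fields.
    // Inferred sketch only, not the repository's definition.
    interface ChatCompletionResponse {
      choices?: Array<{ message?: { content?: string } }>;
      error?: { message?: string };
    }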
  • src/index.ts:16 (registration)
    Main server initialization calls registerInferenceTools(server), which registers the venice_chat tool among other inference tools:
    registerInferenceTools(server);
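  • For context, a minimal sketch of how this registration might sit in src/index.ts, assuming the server is built with McpServer from the MCP TypeScript SDK over a stdio transport (the import path for registerInferenceTools is hypothetical; the real file may differ):
    // Sketch of plausible surrounding initialization; not the actual src/index.ts.
    import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
    import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
    import { registerInferenceTools } from "./tools/inference.js"; // hypothetical path

    const server = new McpServer({ name: "venice-mcp", version: "1.0.0" });
    registerInferenceTools(server);

    const transport = new StdioServerTransport();
    await server.connect(transport);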
  • veniceAPI helper function used in the handler to make authenticated HTTP requests to the Venice AI API:
    export async function veniceAPI(endpoint: string, options: RequestInit = {}): Promise<Response> {
      const url = `${BASE_URL}${endpoint}`;
      // Merge the auth and content-type headers with any caller-supplied headers.
      const headers: Record<string, string> = {
        "Authorization": `Bearer ${API_KEY}`,
        "Content-Type": "application/json",
        ...((options.headers as Record<string, string>) || {}),
      };
      return fetch(url, { ...options, headers });
    }
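  • The BASE_URL and API_KEY constants referenced by veniceAPI are not shown on this page. A plausible definition, assuming the key comes from an environment variable and the server targets Venice's OpenAI-compatible API (both assumptions; check the repository for the actual configuration):
    // Assumed configuration, not taken from the repository.
    const BASE_URL = process.env.VENICE_BASE_URL ?? "https://api.venice.ai/api/v1";
    const API_KEY = process.env.VENICE_API_KEY ?? "";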

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/georgeglarson/venice-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.