groq_chat_completion
Execute fast LLM completions via Groq's API using chat messages or prompts. Supports open models like Llama 3, Mixtral, and Gemma with configurable temperature, max tokens, and stop sequences.
Instructions
Run a fast LLM inference with Groq. Supports Llama 3, Mixtral, Gemma, and other open models at high speed.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| api_key | Yes | Groq API key from console.groq.com/keys | |
| model | No | Model ID (e.g. llama-3.3-70b-versatile, mixtral-8x7b-32768, gemma2-9b-it). Default: llama-3.3-70b-versatile | |
| messages | No | Array of {role, content} messages | |
| prompt | No | Single user message (alternative to messages) | |
| system_prompt | No | System prompt (used with prompt shorthand) | |
| max_tokens | No | Maximum tokens to generate | |
| temperature | No | Sampling temperature (0-2) | |
| top_p | No | Top-p sampling (0-1) | |
| stop | No | Stop sequence(s) |