agnes_chat
Generate text and reason through multi-turn conversations. Supports system prompts, tool calling, streaming, and thinking mode with models offering up to 1M-token context.
Instructions
Capability 1 — Text generation & reasoning. OpenAI-compatible chat completions. Supports multi-turn conversation, system prompts, tool/function calling, streaming, and Thinking mode. Models: agnes-2.0-flash, agnes-1.5-flash. agnes-2.0-flash supports up to 1M-token context (set max_tokens up to 1048576). Vision is also available here by passing image_url content parts; use agnes_vision for a simpler image-understanding interface.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model | No | Chat model name. | agnes-2.0-flash |
| messages | Yes | Conversation messages (system/user/assistant/tool). | |
| temperature | No | Sampling randomness. Lower = more deterministic. | |
| top_p | No | Nucleus sampling probability. | |
| max_tokens | No | Max output tokens. agnes-2.0-flash supports a 1M context window; bound here is 1,048,576. | |
| stream | No | Enable SSE streaming (consumed server-side, full text returned). | |
| tools | No | Tool/function definitions for agent workflows. | |
| tool_choice | No | How the model uses tools. | |
| enable_thinking | No | Enable Thinking mode via chat_template_kwargs.enable_thinking (OpenAI-compatible). | |
| thinking_budget | No | Anthropic-compatible thinking budget_tokens (implies thinking.type='enabled'). Recommended: 2048+. | |
| frequency_penalty | No | Reduce repetition (agnes-1.5-flash). | |
| presence_penalty | No | Encourage new topics (agnes-1.5-flash). | |
| repetition_penalty | No | Repetition control coefficient (agnes-1.5-flash). | |
| stop | No | Custom stop sequences (agnes-1.5-flash). | |
| seed | No | Random seed for reproducibility (agnes-1.5-flash). |