ask
Query multiple AI backends with smart routing and automatic fallback chains. Features dynamic token scaling, Unity detection, and response tracking across local, Gemini, DeepSeek, and Qwen models.
Instructions
🤖 Multi-AI Direct Query - ask any backend with blazing-fast smart fallback chains! Features automatic Unity detection, dynamic token scaling, and response headers with backend tracking.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| enable_chunking | No | Enable automatic request chunking for extremely large generations (fallback if a response is truncated) | false |
| force_backend | No | Force a specific backend, bypassing smart routing. Use backend keys such as "local", "gemini", "nvidia_deepseek", or "nvidia_qwen" | |
| max_tokens | No | Maximum response length; auto-calculated if not specified (Unity = 16K, Complex = 8K, Simple = 2K) | |
| model | Yes | AI backend to query: local (Qwen2.5-Coder-7B-Instruct-FP8-Dynamic, 128K+ tokens), gemini (Gemini Enhanced, 32K tokens), deepseek3.1 (NVIDIA DeepSeek V3.1, 8K tokens), qwen3 (NVIDIA Qwen3 Coder 480B, 32K tokens) | |
| prompt | Yes | Your question or prompt (Unity and other complex generations automatically receive high token limits) | |
| thinking | No | Enable thinking mode for DeepSeek (shows reasoning) | true |
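As a minimal sketch, a call needs only the two required fields; the prompt value below is purely illustrative:

```json
{
  "model": "qwen3",
  "prompt": "Write a C# MonoBehaviour that smoothly follows a target transform."
}
```

All other parameters fall back to the defaults shown above, with max_tokens auto-calculated from the prompt's detected complexity.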
Input Schema (JSON Schema)
{
"properties": {
"enable_chunking": {
"default": false,
"description": "Enable automatic request chunking for extremely large generations (fallback if truncated)",
"type": "boolean"
},
"force_backend": {
"description": "Force specific backend (bypasses smart routing) - use backend keys like \"local\", \"gemini\", \"nvidia_deepseek\", \"nvidia_qwen\"",
"type": "string"
},
"max_tokens": {
"description": "Maximum response length (auto-calculated if not specified: Unity=16K, Complex=8K, Simple=2K)",
"type": "number"
},
"model": {
"description": "AI backend to query: local (Qwen2.5-Coder-7B-Instruct-FP8-Dynamic, 128K+ tokens), gemini (Gemini Enhanced, 32K tokens), deepseek3.1 (NVIDIA DeepSeek V3.1, 8K tokens), qwen3 (NVIDIA Qwen3 Coder 480B, 32K tokens)",
"enum": [
"local",
"gemini",
"deepseek3.1",
"qwen3"
],
"type": "string"
},
"prompt": {
"description": "Your question or prompt (Unity/complex generations automatically get high token limits)",
"type": "string"
},
"thinking": {
"default": true,
"description": "Enable thinking mode for DeepSeek (shows reasoning)",
"type": "boolean"
}
},
"required": [
"model",
"prompt"
],
"type": "object"
}
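For a fuller sketch of the optional overrides (the prompt and values here are illustrative examples, not defaults), a call that pins a backend, caps the response length, and opts into chunking might look like:

```json
{
  "model": "deepseek3.1",
  "prompt": "Review this compute shader for mobile performance issues.",
  "force_backend": "nvidia_deepseek",
  "max_tokens": 8000,
  "thinking": true,
  "enable_chunking": true
}
```

Note that force_backend bypasses smart routing entirely per the schema description above, and thinking only affects DeepSeek backends.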