Platano78

Smart-AI-Bridge

ask

Query multiple AI backends with smart routing and automatic fallback chains. Features dynamic token scaling, Unity detection, and response tracking across local, Gemini, DeepSeek, and Qwen models.

Instructions

🤖 MULTI-AI Direct Query - Ask any backend with BLAZING FAST smart fallback chains! Features automatic Unity detection, dynamic token scaling, and response headers with backend tracking.

Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| enable_chunking | No | Enable automatic request chunking for extremely large generations (fallback if truncated) | |
| force_backend | No | Force use of the specified backend even if unhealthy (bypasses smart fallback) | false |
| max_tokens | No | Maximum response length (auto-calculated if not specified: Unity=16K, Complex=8K, Simple=2K) | |
| model | Yes | AI backend to query: local (Qwen2.5-Coder-7B-Instruct-FP8-Dynamic, 128K+ tokens), gemini (Gemini Enhanced, 32K tokens), deepseek3.1 (NVIDIA DeepSeek V3.1, 8K tokens), qwen3 (NVIDIA Qwen3 Coder 480B, 32K tokens) | |
| prompt | Yes | Your question or prompt (Unity/complex generations automatically get high token limits) | |
| thinking | No | Enable thinking mode for DeepSeek (shows reasoning) | false |
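The auto-calculated `max_tokens` tiers above (Unity=16K, Complex=8K, Simple=2K) suggest a simple prompt-classification heuristic. The sketch below is an illustrative assumption of how such scaling might work; the function name `autoMaxTokens` and the keyword patterns are hypothetical, not the server's actual implementation.

```javascript
// Hypothetical sketch of the documented token-scaling tiers:
// Unity prompts => 16K, complex generations => 8K, simple queries => 2K.
function autoMaxTokens(prompt) {
  const text = prompt.toLowerCase();
  if (/\bunity\b|monobehaviour|gameobject/.test(text)) return 16384; // Unity detection
  if (/\b(implement|generate|refactor|class|module)\b/.test(text)) return 8192; // complex generation
  return 2048; // simple query
}

console.log(autoMaxTokens('Write a Unity MonoBehaviour for player movement')); // 16384
console.log(autoMaxTokens('Implement a linked list class in C++'));            // 8192
console.log(autoMaxTokens('What is 2 + 2?'));                                  // 2048
```

Passing an explicit `max_tokens` would bypass this calculation entirely.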

Implementation Reference

  • Executes the 'ask' tool by mapping the model parameter to a backend, sending the prompt via the MultiAIRouter, recording usage analytics, and handling errors with fallback logging.
    async handleAsk(args) {
      const { model, prompt, thinking = false, force_backend = false } = args;

      const startTime = Date.now();

      try {
        // Map public model names to internal backend keys;
        // 'local' and 'gemini' pass through unchanged.
        const backend = model === 'deepseek3.1' ? 'nvidia_deepseek'
                      : model === 'qwen3' ? 'nvidia_qwen'
                      : model;

        const result = await this.router.makeRequest(prompt, backend, { thinking });

        // Record the successful invocation for usage analytics.
        await this.usageAnalytics.recordInvocation({
          tool_name: 'ask',
          backend_used: backend,
          processing_time_ms: Date.now() - startTime,
          success: true
        });

        return result;
      } catch (error) {
        // Record the failed invocation, logging the originally requested model.
        await this.usageAnalytics.recordInvocation({
          tool_name: 'ask',
          backend_used: model,
          processing_time_ms: Date.now() - startTime,
          success: false,
          error
        });
        throw error;
      }
    }
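The model-to-backend mapping at the top of `handleAsk` can be exercised in isolation. Extracting it into a standalone helper, as sketched below, is illustrative; the name `resolveBackend` is hypothetical, but the mapping itself mirrors the ternary chain above.

```javascript
// Maps the public `model` enum value to the router's internal backend key,
// mirroring the ternary chain in handleAsk.
function resolveBackend(model) {
  const map = {
    'deepseek3.1': 'nvidia_deepseek',
    'qwen3': 'nvidia_qwen'
  };
  return map[model] ?? model; // 'local' and 'gemini' pass through unchanged
}

console.log(resolveBackend('deepseek3.1')); // 'nvidia_deepseek'
console.log(resolveBackend('qwen3'));       // 'nvidia_qwen'
console.log(resolveBackend('local'));       // 'local'
```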
  • Core registration of the 'ask' tool definition in the coreToolDefinitions array, specifying its name, description, handler, and schema. This object is later added to the this.coreTools map.
    {
      name: 'ask',
      description: '🤖 MULTI-AI Direct Query - Ask any backend with BLAZING FAST smart fallback chains! Features automatic Unity detection, dynamic token scaling, and response headers with backend tracking.',
      handler: 'handleAsk',
      schema: {
        type: 'object',
        properties: {
          model: {
            type: 'string',
            enum: ['local', 'gemini', 'deepseek3.1', 'qwen3'],
            description: 'AI backend to query: local (Qwen2.5-Coder-7B-Instruct, 128K+ tokens), gemini (Gemini Enhanced, 32K tokens), deepseek3.1 (NVIDIA DeepSeek V3.1, 8K tokens), qwen3 (NVIDIA Qwen3 Coder 480B, 32K tokens)'
          },
          prompt: {
            type: 'string',
            description: 'Your question or prompt (Unity/complex generations automatically get high token limits)'
          },
          thinking: {
            type: 'boolean',
            default: false,
            description: 'Enable reasoning mode for DeepSeek V3.1 (opt-in)'
          },
          force_backend: {
            type: 'boolean',
            default: false,
            description: 'Force use of specified backend even if unhealthy (bypass smart fallback)'
          }
        },
        required: ['model', 'prompt']
      }
    },
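Because each definition names its handler as a string (`handler: 'handleAsk'`), dispatch presumably looks the method up by name on the server instance at call time. The minimal sketch below assumes that shape; the `ToolRegistry` class and its stub `handleAsk` are illustrative, not the actual server class.

```javascript
// Minimal sketch of name-keyed registration and string-based handler dispatch.
class ToolRegistry {
  constructor(definitions) {
    // Mirrors building this.coreTools from coreToolDefinitions.
    this.coreTools = new Map(definitions.map(def => [def.name, def]));
  }

  async invoke(name, args) {
    const def = this.coreTools.get(name);
    if (!def) throw new Error(`Unknown tool: ${name}`);
    return this[def.handler](args); // handler is a method name, e.g. 'handleAsk'
  }

  async handleAsk(args) {
    return `ask(${args.model}): ${args.prompt}`; // stub for illustration
  }
}

const registry = new ToolRegistry([{ name: 'ask', handler: 'handleAsk' }]);
registry.invoke('ask', { model: 'local', prompt: 'hello' }).then(console.log);
// ask(local): hello
```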
  • Alias definitions that map 'MKG_generate' and 'deepseek_generate' to the core 'ask' tool, enabling backwards compatibility and alternative names for the tool.
      // MKG aliases
      { alias: 'MKG_analyze', coreTool: 'review' },
      { alias: 'MKG_generate', coreTool: 'ask' },
      { alias: 'MKG_review', coreTool: 'review' },
      { alias: 'MKG_edit', coreTool: 'edit_file' },
      { alias: 'MKG_health', coreTool: 'health' },
    
      // DeepSeek aliases
      { alias: 'deepseek_analyze', coreTool: 'review' },
      { alias: 'deepseek_generate', coreTool: 'ask' },
      { alias: 'deepseek_review', coreTool: 'review' },
      { alias: 'deepseek_edit', coreTool: 'edit_file' },
      { alias: 'deepseek_health', coreTool: 'health' }
    ];
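Given the alias table above, resolution is a straightforward name lookup performed before dispatch: an incoming alias maps to its core tool, and core tool names pass through untouched. A sketch under that assumption (the function name `resolveToolName` is illustrative):

```javascript
// Resolves an incoming tool name through the alias table to its core tool.
const aliases = [
  { alias: 'MKG_generate', coreTool: 'ask' },
  { alias: 'deepseek_generate', coreTool: 'ask' }
];
const aliasMap = new Map(aliases.map(a => [a.alias, a.coreTool]));

function resolveToolName(name) {
  return aliasMap.get(name) ?? name; // non-alias names pass through unchanged
}

console.log(resolveToolName('deepseek_generate')); // 'ask'
console.log(resolveToolName('MKG_generate'));      // 'ask'
console.log(resolveToolName('ask'));               // 'ask'
```

This keeps backwards compatibility cheap: old clients calling `deepseek_generate` reach the same `handleAsk` path with no duplicated handler code.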

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Platano78/Smart-AI-Bridge'
