query-models

Compare AI model responses to a single question by querying multiple models simultaneously, enabling side-by-side analysis of different perspectives.

Instructions

Query multiple AI models via Ollama and get their responses to compare perspectives

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| question | Yes | The question to ask all models | |
| models | No | Array of model names to query (defaults to configured models) | |
| system_prompt | No | Optional system prompt to provide context to all models (overridden by model_system_prompts if provided) | |
| model_system_prompts | No | Optional object mapping model names to specific system prompts | |
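
For example, a client might invoke the tool with arguments like the following (the model names and prompts are illustrative and assume those models have been pulled into the local Ollama instance):

    // Illustrative arguments for a 'query-models' call; only 'question' is required.
    const exampleArgs = {
      question: "Should we adopt a microservices architecture?",
      models: ["gemma3:1b", "llama3.2:1b"],                 // optional; defaults to configured models
      system_prompt: "Answer in three sentences or fewer.", // optional shared prompt
      model_system_prompts: {
        "gemma3:1b": "Play devil's advocate.",              // optional per-model override
      },
    };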

Implementation Reference

  • The core handler function that implements the 'query-models' tool logic. It fetches responses from multiple Ollama models in parallel, applies appropriate system prompts, handles individual model errors, and formats a comparative response.
    async ({ question, models, system_prompt, model_system_prompts }) => {
      try {
        // Use provided models or fall back to default models from environment
        const modelsToQuery = models || DEFAULT_MODELS;
        
        debugLog(`Using models: ${modelsToQuery.join(", ")}`);
        
        // Query each model in parallel
        const responses = await Promise.all(
          modelsToQuery.map(async (modelName) => {
            try {
              // Determine which system prompt to use for this model
              let modelSystemPrompt = system_prompt || "You are a helpful AI assistant answering a user's question.";
              
              // If model-specific prompts are provided, use those instead
              if (model_system_prompts && model_system_prompts[modelName]) {
                modelSystemPrompt = model_system_prompts[modelName];
              }
              // If no prompt is specified at all, use our default role-specific prompts if available
              else if (!system_prompt && modelName in DEFAULT_SYSTEM_PROMPTS) {
                modelSystemPrompt = DEFAULT_SYSTEM_PROMPTS[modelName];
              }
    
              debugLog(`Querying ${modelName} with system prompt: ${modelSystemPrompt.substring(0, 50)}...`);
              
              const response = await fetch(`${OLLAMA_API_URL}/api/generate`, {
                method: "POST",
                headers: {
                  "Content-Type": "application/json",
                },
                body: JSON.stringify({
                  model: modelName,
                  prompt: question,
                  system: modelSystemPrompt,
                  stream: false,
                }),
              });
    
              if (!response.ok) {
                throw new Error(`HTTP error! status: ${response.status}`);
              }
    
              const data = await response.json() as OllamaResponse;
              return {
                model: modelName,
                response: data.response,
                systemPrompt: modelSystemPrompt
              };
            } catch (modelError) {
              console.error(`Error querying model ${modelName}:`, modelError);
              return {
                model: modelName,
                response: `Error: Could not get response from ${modelName}. Make sure this model is available in Ollama.`,
                error: true
              };
            }
          })
        );
    
        // Format the response in a way that's easy for Claude to analyze
        const formattedText = `# Responses from Multiple Models\n\n${responses.map(resp => {
          const roleInfo = resp.systemPrompt ? 
            `*Role: ${resp.systemPrompt.substring(0, 100)}${resp.systemPrompt.length > 100 ? '...' : ''}*\n\n` : '';
          
          return `## ${resp.model.toUpperCase()} RESPONSE:\n${roleInfo}${resp.response}\n\n`;
        }).join("")}\n\nConsider the perspectives above when formulating your response. You may agree or disagree with any of these models. Note that these are all compact 1-1.5B parameter models, so take that into account when evaluating their responses.`;
    
        return {
          content: [
            {
              type: "text",
              text: formattedText,
            },
          ],
        };
      } catch (error) {
        console.error("Error in query-models tool:", error);
        return {
          isError: true,
          content: [
            {
              type: "text",
              text: `Error querying models: ${error instanceof Error ? error.message : String(error)}`,
            },
          ],
        };
      }
    }
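  • The handler casts the Ollama reply to an OllamaResponse type that is not shown in this excerpt. A minimal sketch consistent with the fields the handler actually reads (Ollama's non-streaming /api/generate reply also carries timing and context fields, omitted here):

    // Assumed minimal shape of the non-streaming /api/generate reply; the
    // actual OllamaResponse interface is defined elsewhere in src/index.ts.
    interface OllamaResponse {
      model: string;    // name of the model that produced the reply
      response: string; // the generated text (the only field read above)
      done: boolean;    // true once generation has finished
    }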
  • Zod schema defining the input parameters for the 'query-models' tool: question (required), models (optional array), system_prompt (optional), model_system_prompts (optional record).
    {
      question: z.string().describe("The question to ask all models"),
      models: z.array(z.string()).optional().describe("Array of model names to query (defaults to configured models)"),
      system_prompt: z.string().optional().describe("Optional system prompt to provide context to all models (overridden by model_system_prompts if provided)"),
      model_system_prompts: z.record(z.string()).optional().describe("Optional object mapping model names to specific system prompts"),
    },
  • src/index.ts:131-228 (registration)
    The server.tool() call that registers the 'query-models' tool, providing name, description, input schema, and handler function.
    server.tool(
      "query-models",
      "Query multiple AI models via Ollama and get their responses to compare perspectives",
      { /* input schema — identical to the Zod schema shown above */ },
      async ({ question, models, system_prompt, model_system_prompts }) => {
        /* handler body — identical to the handler function shown above */
      }
    );
  • Companion tool 'list-available-models', designed to list the Ollama models that can be used with 'query-models', including which of the configured default models are currently available.
    server.tool(
      "list-available-models",
      "List all available models in Ollama that can be used with query-models",
      {},
      async () => {
        try {
          const response = await fetch(`${OLLAMA_API_URL}/api/tags`);
          
          if (!response.ok) {
            throw new Error(`HTTP error! status: ${response.status}`);
          }
          
          const data = await response.json() as { models: OllamaModel[] };
          
          if (!data.models || !Array.isArray(data.models)) {
            return {
              content: [
                {
                  type: "text",
                  text: "No models found or unexpected response format from Ollama API."
                }
              ]
            };
          }
          
          // Format model information
          const modelInfo = data.models.map(model => {
            const size = (model.size / (1024 * 1024 * 1024)).toFixed(2); // Convert to GB
            const paramSize = model.details?.parameter_size || "Unknown";
            const quantLevel = model.details?.quantization_level || "Unknown";
            
            return `- **${model.name}**: ${paramSize} parameters, ${size} GB, ${quantLevel} quantization`;
          }).join("\n");
          
          // Show which models are currently configured as defaults
          const defaultModelsInfo = DEFAULT_MODELS.map(model => {
            const isAvailable = data.models.some(m => m.name === model);
            return `- **${model}**: ${isAvailable ? "✓ Available" : "⚠️ Not available"}`;
          }).join("\n");
          
          return {
            content: [
              {
                type: "text",
                text: `# Available Ollama Models\n\n${modelInfo}\n\n## Current Default Models\n\n${defaultModelsInfo}\n\nYou can use any of the available models with the query-models tool by specifying them in the 'models' parameter.`
              }
            ]
          };
        } catch (error) {
          console.error("Error listing models:", error);
          return {
            isError: true,
            content: [
              {
                type: "text",
                text: `Error listing models: ${error instanceof Error ? error.message : String(error)}\n\nMake sure Ollama is running and accessible at ${OLLAMA_API_URL}.`
              }
            ]
          };
        }
      }
    );
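  • The listing code reads size and details fields from an OllamaModel type that is likewise not shown. A plausible sketch, matching the fields accessed above and the shape of Ollama's /api/tags response:

    // Assumed shape of entries returned by Ollama's /api/tags endpoint; the
    // real OllamaModel interface is defined elsewhere in src/index.ts.
    interface OllamaModel {
      name: string;                   // e.g. "llama3.2:1b"
      size: number;                   // on-disk size in bytes
      details?: {
        parameter_size?: string;      // e.g. "1.2B"
        quantization_level?: string;  // e.g. "Q8_0"
      };
    }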
  • Default system prompts for the specific models used by the 'query-models' tool; each can be overridden via an environment variable, with a hard-coded prompt as the fallback.
    const DEFAULT_SYSTEM_PROMPTS: SystemPrompts = {
      "gemma3:1b": process.env.GEMMA_SYSTEM_PROMPT || 
        "You are a creative and innovative AI assistant. Think outside the box and offer novel perspectives.",
      "llama3.2:1b": process.env.LLAMA_SYSTEM_PROMPT || 
        "You are a supportive and empathetic AI assistant focused on human well-being. Provide considerate and balanced advice.",
      "deepseek-r1:1.5b": process.env.DEEPSEEK_SYSTEM_PROMPT || 
        "You are a logical and analytical AI assistant. Think step-by-step and explain your reasoning clearly."
    };
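  • The DEFAULT_MODELS list referenced by the handler is not shown in this excerpt. A hypothetical sketch, assuming it is parsed from a comma-separated environment variable with the three models above as the fallback:

    // Hypothetical derivation of DEFAULT_MODELS; the actual code may differ.
    const DEFAULT_MODELS: string[] = (
      process.env.DEFAULT_MODELS ?? "gemma3:1b,llama3.2:1b,deepseek-r1:1.5b"
    )
      .split(",")
      .map((name) => name.trim())
      .filter((name) => name.length > 0);
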
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions querying multiple models and getting responses for comparison, but fails to disclose critical behavioral traits such as whether this is a read-only operation, potential rate limits, authentication needs, error handling, or the format of responses. For a tool with no annotations and complex functionality, this is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured sentence that efficiently conveys the core functionality without waste. It is front-loaded with the main action ('query multiple AI models') and includes essential context ('via Ollama', 'to compare perspectives'), making every word earn its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (querying multiple models with optional prompts), lack of annotations, and no output schema, the description is incomplete. It does not explain return values, error conditions, or behavioral constraints, leaving significant gaps for an AI agent to understand how to invoke and interpret results effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description adds minimal value beyond the schema by implying the tool queries 'multiple' models and compares responses, but does not provide additional semantics, syntax, or format details for parameters. Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('query multiple AI models', 'get their responses') and resources ('via Ollama'), and distinguishes it from the sibling tool 'list-available-models' by focusing on querying rather than listing models. It explicitly mentions the comparative aspect ('to compare perspectives'), which adds valuable context beyond basic querying.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for comparing model responses to a question, but provides no explicit guidance on when to use this tool versus alternatives (e.g., querying a single model) or any prerequisites. It mentions 'defaults to configured models' for the models parameter, which offers some contextual hint, but lacks clear when/when-not directives or named alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

