synthesize

Query 2-5 AI models in parallel and combine their responses into one answer that integrates the key insights from each.

Instructions

Query 2-5 models in parallel, then combine their best ideas into one answer. Returns a synthesized response that's better than any single model.

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| models | Yes | List of model IDs to synthesize from (2-5 models) | |
| prompt | Yes | The prompt to send to all models | |
| synthesizer_model | No | Optional model ID to use as synthesizer. Auto-picks if not specified. | |
| system_prompt | No | Optional system prompt sent with each query | |
| temperature | No | Sampling temperature (0-2) | |
| max_tokens | No | Maximum tokens per response (positive integer) | 1024 |
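As a sketch of a typical invocation (the model IDs here are illustrative, not a recommended set), the tool's input might look like:

```json
{
  "models": ["openai/gpt-4o", "anthropic/claude-3.5-sonnet"],
  "prompt": "What are the trade-offs of event sourcing?",
  "temperature": 0.7,
  "max_tokens": 1024
}
```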

Implementation Reference

  • Main handler function that fans out to 2-5 models in parallel, collects responses, picks a synthesizer model, sends all responses to the synthesizer, and returns the combined result.
    export async function synthesize(
      provider: Provider,
      input: SynthesizeInput
    ): Promise<string> {
      const startTime = Date.now();
    
      // Step 1: Fan out to all models in parallel
      const results = await Promise.allSettled(
        input.models.map((model) =>
          provider.query(model, input.prompt, {
            system_prompt: input.system_prompt,
            temperature: input.temperature,
            max_tokens: input.max_tokens,
          })
        )
      );
    
      const responses: ModelResponse[] = results.map((result, i) => {
        if (result.status === "fulfilled") {
          return {
            model: input.models[i],
            content: result.value.content,
            latency_ms: result.value.latency_ms,
            tokens: result.value.usage?.total_tokens,
          };
        }
        return {
          model: input.models[i],
          content: "",
          latency_ms: 0,
          error:
            result.reason instanceof Error
              ? result.reason.message
              : String(result.reason),
        };
      });
    
      const successful = responses.filter((r) => !r.error);
      const failed = responses.filter((r) => r.error);
    
      if (successful.length < 2) {
        return `## Synthesis Failed\n\nOnly ${successful.length} model(s) responded. Need at least 2 for synthesis.\n\nErrors:\n${failed.map((f) => `- ${f.model}: ${f.error}`).join("\n")}`;
      }
    
      // Step 2: Pick a synthesizer model
      const synthModel = input.synthesizer_model ?? await pickSynthesizer(provider, input.models);
    
      if (!synthModel) {
        return formatWithoutSynthesis(successful, failed, Date.now() - startTime);
      }
    
      // Step 3: Send all responses to the synthesizer
      logger.info(`synthesize: using ${synthModel} as synthesizer`);
      const synthStart = Date.now();
    
      const responseSummary = successful
        .map((r) => `## ${r.model}\n${r.content}`)
        .join("\n\n---\n\n");
    
      const synthPrompt = `You are combining ${successful.length} AI model responses into one final answer.
    
    Question: "${input.prompt}"
    
    Responses:
    
    ${responseSummary}
    
    Write ONE definitive answer. Take the best insights from each, drop the filler. Do not reference the models, do not say "one model suggested." Just give the answer as if you're the expert who considered all perspectives.
    
    Keep it shorter than the longest individual response. No preamble, no "here's the synthesis." Just the answer.`;
    
      try {
        const synthResult = await provider.query(synthModel, synthPrompt, {
          temperature: input.temperature ?? 0.3,
          max_tokens: input.max_tokens,
        });
    
        const synthLatency = Date.now() - synthStart;
        const totalTime = Date.now() - startTime;
    
        return formatSynthesis({
          synthesized: synthResult.content,
          synthModel,
          synthLatency,
          sources: successful,
          failed,
          totalTime,
        });
      } catch (err) {
        logger.warn(`synthesizer failed: ${err instanceof Error ? err.message : String(err)}`);
        return formatWithoutSynthesis(successful, failed, Date.now() - startTime);
      }
    }
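The fan-out/settle pattern at the heart of the handler can be reduced to a few lines. The `Provider` shape below is a stand-in for the real interface, and the mock is purely illustrative:

```typescript
// Minimal stand-in for the real Provider interface (assumed shape).
interface Provider {
  query(model: string, prompt: string): Promise<{ content: string }>;
}

type FanOutResult = { model: string; content?: string; error?: string };

// Fan out one prompt to several models. Failed models become error entries
// instead of rejecting the whole batch, mirroring Promise.allSettled above.
async function fanOut(
  provider: Provider,
  models: string[],
  prompt: string
): Promise<FanOutResult[]> {
  const settled = await Promise.allSettled(
    models.map((m) => provider.query(m, prompt))
  );
  return settled.map((r, i) =>
    r.status === "fulfilled"
      ? { model: models[i], content: r.value.content }
      : {
          model: models[i],
          error: r.reason instanceof Error ? r.reason.message : String(r.reason),
        }
  );
}

// Mock provider for illustration: one model answers, one throws.
const mock: Provider = {
  async query(model, prompt) {
    if (model === "flaky/model") throw new Error("rate limited");
    return { content: `${model}: ${prompt}` };
  },
};
```

With the mock above, `fanOut(mock, ["good/model", "flaky/model"], "hi")` resolves to one content entry and one error entry; the caller can then require at least two successes before synthesizing, as `synthesize()` does.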
  • Zod schema for input validation: models (2-5), prompt, optional synthesizer_model, system_prompt, temperature, and max_tokens.
    export const synthesizeSchema = z.object({
      models: z
        .array(z.string())
        .min(2)
        .max(5)
        .describe("List of model IDs to synthesize from (2-5 models)"),
      prompt: z.string().describe("The prompt to send to all models"),
      synthesizer_model: z
        .string()
        .optional()
        .describe("Optional model ID to use as synthesizer. Auto-picks if not specified."),
      system_prompt: z.string().optional(),
      temperature: z.number().min(0).max(2).optional(),
      max_tokens: z.number().int().positive().optional().default(1024),
    });
  • src/server.ts:167-188 (registration)
    MCP server.tool() registration of the 'synthesize' tool with its description, schema, and handler that delegates to the synthesize() function.
    // --- synthesize ---
    server.tool(
      "synthesize",
      "Query 2-5 models in parallel, then combine their best ideas into one answer. Returns a synthesized response that's better than any single model.",
      synthesizeSchema.shape,
      async (input) => {
        logger.info(
          `synthesize: querying ${input.models.length} models, synthesizer: ${input.synthesizer_model ?? "auto"}`
        );
        try {
          const result = await synthesize(provider, input);
          return { content: [{ type: "text" as const, text: result }] };
        } catch (err) {
          const message = err instanceof Error ? err.message : String(err);
          logger.error(`synthesize failed: ${message}`);
          return {
            content: [{ type: "text" as const, text: `Error: ${message}` }],
            isError: true,
          };
        }
      }
    );
  • Helper function that picks a synthesizer model, preferring one not already in the source model list (fallback to first available model).
    async function pickSynthesizer(provider: Provider, sourceModels: string[]): Promise<string | null> {
      try {
        const available = await provider.listModels();
        if (available.length === 0) return null;
    
        const sourceSet = new Set(sourceModels.map((m) => m.toLowerCase()));
        const outside = available.find(
          (m) => !sourceSet.has(m.id.toLowerCase()) && !sourceSet.has(m.id.split("/").pop()?.toLowerCase() ?? "")
        );
    
        if (outside) return outside.id;
        return available[0].id;
      } catch {
        return null;
      }
    }
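The selection rule is easy to isolate: prefer the first available model that is not already a source (compared case-insensitively, by full ID and by short name), otherwise fall back to the first available model. A pure-function sketch of that rule, with illustrative IDs:

```typescript
// Pure version of the selection rule in pickSynthesizer: prefer a model
// outside the source set, comparing both the full ID and the short name
// after the last "/".
function pickOutside(available: string[], sources: string[]): string | null {
  if (available.length === 0) return null;
  const sourceSet = new Set(sources.map((s) => s.toLowerCase()));
  const outside = available.find(
    (id) =>
      !sourceSet.has(id.toLowerCase()) &&
      !sourceSet.has(id.split("/").pop()?.toLowerCase() ?? "")
  );
  return outside ?? available[0];
}
```

Note the short-name check means listing `"gpt"` as a source also excludes `"vendor/gpt"` as a synthesizer candidate.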
  • Helper functions formatSynthesis (success path) and formatWithoutSynthesis (fallback) for formatting the output into markdown.
    function formatSynthesis(result: SynthesisResult): string {
      const lines: string[] = [
        `## Synthesized Response (${result.sources.length} models, ${result.totalTime}ms total)`,
        "",
        `**Synthesizer:** ${result.synthModel} (${result.synthLatency}ms)`,
        `**Sources:** ${result.sources.map((s) => s.model).join(", ")}`,
        "",
        result.synthesized,
        "",
      ];
    
      // Source summary table
      lines.push("### Source Metrics");
      lines.push("");
      lines.push("| Model | Latency | Tokens |");
      lines.push("|-------|---------|--------|");
      for (const s of result.sources) {
        lines.push(`| ${s.model} | ${s.latency_ms}ms | ${s.tokens ?? "n/a"} |`);
      }
      lines.push("");
    
      if (result.failed.length > 0) {
        lines.push("### Errors");
        for (const f of result.failed) {
          lines.push(`- **${f.model}:** ${f.error}`);
        }
        lines.push("");
      }
    
      return lines.join("\n");
    }
    
    /**
     * Fallback if no synthesizer is available - just return
     * all responses like compare_models would.
     */
    function formatWithoutSynthesis(
      successful: ModelResponse[],
      failed: ModelResponse[],
      totalTime: number
    ): string {
      const lines: string[] = [
        `## Synthesis Failed - Showing Raw Responses (${totalTime}ms total)`,
        "",
        "*No synthesizer model available. Showing individual responses instead.*",
        "",
      ];
    
      for (const r of successful) {
        lines.push(`### ${r.model}`);
        lines.push("");
        lines.push(r.content);
        lines.push("");
      }
    
      if (failed.length > 0) {
        lines.push("### Errors");
        for (const f of failed) {
          lines.push(`- **${f.model}:** ${f.error}`);
        }
        lines.push("");
      }
    
      return lines.join("\n");
    }
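Put together, a successful run renders roughly like this (model IDs, timings, and token counts are illustrative):

```markdown
## Synthesized Response (2 models, 3120ms total)

**Synthesizer:** vendor/model-c (940ms)
**Sources:** vendor/model-a, vendor/model-b

...the combined answer...

### Source Metrics

| Model | Latency | Tokens |
|-------|---------|--------|
| vendor/model-a | 1180ms | 412 |
| vendor/model-b | 2170ms | n/a |
```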
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden of disclosing behavior. It reveals parallel execution and the combining of responses, but says nothing about failure handling or timeouts, and the 'better than any single model' claim is subjective. Additional context on potential failure modes would improve transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences long, front-loads the key action (parallel query and synthesis), and contains no extraneous information. Every word serves a purpose, making it highly efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 6 parameters and no output schema, the description is moderately complete. It covers the core functionality and result, but lacks details on parameter usage, return format, and edge cases. Additional information would be beneficial given the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds no information about parameters; it solely describes the tool's operation. With 50% schema coverage (only 3 of 6 parameters have descriptions), the description fails to compensate for the gap, leaving agents to rely on parameter names alone for meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool queries 2-5 models in parallel and combines their ideas into a single answer. It distinguishes itself from siblings like 'ask_model' (single model) and 'compare_models' (comparison, not synthesis), but lacks explicit differentiation from 'consensus', which may serve a similar purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when a synthesized answer better than any single model is desired, but provides no when-not-to-use guidance or alternatives. Sibling tools like 'compare_models' or 'consensus' are not mentioned, leaving the agent to infer the best tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
