
Collect Input

collect_input

Collect contextual user input through an interactive interface. Request text, images, or pixel art submissions to gather specific information for processing.

Instructions

get image, text, or pixel art input from user. This is used to get contextual input from the user of different kinds.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| kind | No | Input kind: "text", "image", or "pixelart" | text |
| initialImage | No | Initial image to load for editing (file path) | — |
| gridWidth | No | Grid width for pixel art | 16 |
| gridHeight | No | Grid height for pixel art | 16 |
| width | No | Canvas width for image mode | 512 |
| height | No | Canvas height for image mode | 512 |
| message | No | Custom message to show to the user | — |

Implementation Reference

  • The handler function for the 'collect_input' tool. Normalizes input parameters, launches the input prompt UI, processes the result (caching images), and returns MCP-formatted content or throws errors appropriately.
    }, async ({ kind, initialImage, gridWidth, gridHeight, width, height, message }) => {
        const baseSpec = normalizeSpec(kind);
        
        // Apply custom parameters
        const spec = {
            ...baseSpec,
            ...(initialImage && { initialImage }),
            ...(message && { message }),
            ...(baseSpec.kind === 'pixelart' && gridWidth && { gridWidth }),
            ...(baseSpec.kind === 'pixelart' && gridHeight && { gridHeight }),
            ...(baseSpec.kind === 'image' && width && { width }),
            ...(baseSpec.kind === 'image' && height && { height })
        };
    
        try {
            const result = await launchInputPrompt({ spec });
    
            if (result.kind === "text") {
                return { content: [{ type: "text", text: result.value }] };
            }
    
            if (result.kind === "image" || result.kind === "pixelart") {
                // Save image to cache
                const cachedPath = await saveImageToCache(result.dataUrl);
    
                // Return the file path as text instead of base64 image data
                return {
                    content: [{ 
                        type: "text", 
                        text: cachedPath 
                    }],
                    isError: false
                };
            }
    
            throw new Error(`Unsupported input result kind: ${(result as { kind?: string } | undefined)?.kind ?? "unknown"}`);
        } catch (error) {
            if (error instanceof InputCancelledError) {
                throw new Error("User cancelled the input");
            }
            if (error instanceof InputFailedError) {
                throw error; // Re-throw with original message
            }
            throw error; // Re-throw any other errors
        }
    });
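The handler calls a `saveImageToCache` helper that is referenced but not shown. A minimal sketch of what such a helper could look like, assuming it decodes a base64 data URL and writes the bytes to a temporary cache file (the real implementation may differ):

```typescript
import { mkdtemp, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical sketch: decode a base64 image data URL and persist it,
// returning the cached file path that the handler sends back as text.
async function saveImageToCache(dataUrl: string): Promise<string> {
  const match = /^data:image\/(\w+);base64,(.+)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64 image data URL");
  }
  const [, ext, base64] = match;
  const dir = await mkdtemp(join(tmpdir(), "mcp-input-"));
  const filePath = join(dir, `submission.${ext}`);
  await writeFile(filePath, Buffer.from(base64, "base64"));
  return filePath;
}
```

Returning a file path instead of inline base64 keeps the MCP response small; the client reads the image from disk only if it needs the pixels.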
  • Tool metadata including title, description, and inputSchema definition using Zod for validation of parameters like kind, dimensions, and message.
    title: "Collect Input", 
    description: "get image, text, or pixel art input from user. This is used to get contextual input from the user of different kinds. ",
    inputSchema: { 
        kind: z.enum(["text", "image", "pixelart"]).optional(),
        initialImage: z.string().optional().describe("Initial image to load for editing (file path)"),
        gridWidth: z.number().int().min(4).max(128).optional().describe("Grid width for pixel art (default: 16)"),
        gridHeight: z.number().int().min(4).max(128).optional().describe("Grid height for pixel art (default: 16)"),
        width: z.number().int().min(32).max(4096).optional().describe("Canvas width for image mode (default: 512)"),
        height: z.number().int().min(32).max(4096).optional().describe("Canvas height for image mode (default: 512)"),
        message: z.string().optional().describe("Custom message to show to the user")
    },
  • src/server.ts:18-18 (registration)
    Registration of the 'collect_input' tool with the MCP server.
    server.registerTool("collect_input", {
  • Primary helper function that launches an Electron window with the UI for collecting input (text, image, or pixelart) based on the spec, captures the result via stdout, and resolves with SubmissionResult or rejects with errors.
    export async function launchInputPrompt({
      spec
    }: {
      spec: InputSpec;
    }): Promise<SubmissionResult> {
      await ensureUiBuilt();
      
      // Process initialImage if present
      let processedSpec = spec;
      if ((spec.kind === 'image' || spec.kind === 'pixelart') && spec.initialImage) {
        const dataURL = await loadImageAsDataURL(spec.initialImage);
        processedSpec = { ...spec, initialImage: dataURL };
      }
      
      const electronModule: any = await import("electron");
      const electronBinary =
        typeof electronModule === "string"
          ? electronModule
          : typeof electronModule.default === "string"
          ? electronModule.default
          : electronModule.path;
    
      if (!electronBinary) {
        throw new Error("Electron binary not found, make sure electron is installed");
      }
    
      return new Promise<SubmissionResult>((resolvePromise, rejectPromise) => {
        const child = spawn(electronBinary, [electronEntrypoint], {
          stdio: ["ignore", "pipe", "inherit"],
          env: {
            ...process.env,
            MCP_INPUT_SPEC: JSON.stringify(processedSpec)
          }
        });
    
        let stdout = "";
        child.stdout.on("data", (chunk: Buffer) => {
          stdout += chunk.toString();
        });
    
        child.once("error", (error) => {
          rejectPromise(error);
        });
    
        child.once("exit", (code) => {
          if (code !== 0) {
            rejectPromise(new InputFailedError(`Electron process exited with code ${code}`));
            return;
          }
    
          if (!stdout.trim()) {
            rejectPromise(new InputFailedError("No response from Electron process"));
            return;
          }
    
          try {
            const parsed = JSON.parse(stdout);
            
            // Handle the different action types
            if (parsed.action === "submit") {
              resolvePromise(parsed.result);
            } else if (parsed.action === "cancel") {
              rejectPromise(new InputCancelledError());
            } else if (parsed.action === "error") {
              rejectPromise(new InputFailedError(parsed.message));
            } else {
              rejectPromise(new InputFailedError(`Unknown action: ${parsed.action}`));
            }
          } catch (error) {
            rejectPromise(new InputFailedError(`Invalid JSON response: ${stdout}`));
          }
        });
      });
    }
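The parent process parses a single JSON payload from the child's stdout and branches on its `action` field. The shapes implied by the parsing code above can be sketched as a discriminated union (the type names here are assumptions, not from the source):

```typescript
// Hypothetical types for the JSON the Electron UI prints to stdout,
// covering the three actions the parent handles: submit, cancel, error.
type SubmissionResult =
  | { kind: "text"; value: string }
  | { kind: "image" | "pixelart"; dataUrl: string };

type PromptOutput =
  | { action: "submit"; result: SubmissionResult }
  | { action: "cancel" }
  | { action: "error"; message: string };

// The UI side would end by printing one line of JSON, e.g.:
const payload: PromptOutput = {
  action: "submit",
  result: { kind: "text", value: "hello from the prompt" },
};
console.log(JSON.stringify(payload));
```

Any payload whose `action` is not one of the three known values is rejected with an `InputFailedError`, so the protocol fails closed rather than silently.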
  • Helper function to normalize the input kind into a full InputSpec using Zod schemas with defaults.
    export function normalizeSpec(kind: InputKind | undefined): InputSpec {
      const resolved = kind ?? "text";
    
      if (resolved === "image") {
        return ImageInputSpecSchema.parse({ kind: "image" });
      }
    
      if (resolved === "pixelart") {
        return PixelArtInputSpecSchema.parse({ kind: "pixelart" });
      }
    
      return TextInputSpecSchema.parse({ kind: "text" });
    }
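The Zod schemas that supply the defaults are defined elsewhere; assuming they fill in the documented defaults (16 for pixel-art grids, 512 for image canvases), the effective behavior can be sketched without Zod as:

```typescript
// Sketch of the defaulting behavior, assuming the Zod schemas apply the
// defaults documented in the input schema. The real schemas live elsewhere.
type InputKind = "text" | "image" | "pixelart";

function normalizeSpecSketch(kind: InputKind | undefined) {
  const resolved = kind ?? "text"; // omitted kind falls back to text mode
  if (resolved === "image") {
    return { kind: "image" as const, width: 512, height: 512 };
  }
  if (resolved === "pixelart") {
    return { kind: "pixelart" as const, gridWidth: 16, gridHeight: 16 };
  }
  return { kind: "text" as const };
}

console.log(normalizeSpecSketch(undefined).kind);
```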
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must fully disclose behavioral traits. It mentions 'get input from user' which implies interactive user prompting, but doesn't specify if this blocks execution, requires user authentication, has rate limits, or what happens on cancellation. The description is minimal and leaves key behavioral aspects undefined.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, but the second sentence is redundant ('This is used to get contextual input from the user of different kinds') and adds no value. It could be more front-loaded and eliminate waste. However, it's not overly verbose, just inefficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations, no output schema, and 7 parameters with moderate complexity (including enums and defaults), the description is incomplete. It doesn't explain the return values, error conditions, or how parameters interact (e.g., 'gridWidth' only relevant for 'pixelart'). For a user-input tool with multiple modes, more context is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is high at 86%, so the baseline is 3. The description adds no parameter-specific information beyond what's in the schema (e.g., it doesn't explain how 'kind' affects other parameters or the interaction flow). It mentions 'different kinds' which loosely relates to the 'kind' enum but provides no additional semantic context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool 'get image, text, or pixel art input from user' which specifies the verb ('get') and resources ('image, text, or pixel art input'), but it's vague about the mechanism (e.g., UI prompt, file upload). The second sentence 'This is used to get contextual input from the user of different kinds' is redundant and adds no clarity. It doesn't distinguish from siblings, but none exist.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives, prerequisites, or constraints. It merely restates the purpose without indicating appropriate contexts or exclusions. Since there are no sibling tools, this is less critical, but still lacks any usage direction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
