Glama

kobold_txt2img

Generate images from text descriptions using Stable Diffusion. Provide a text prompt to create custom visuals for projects, content, or creative exploration.

Instructions

Generate image from text prompt

Input Schema

Name | Required | Description | Default
apiUrl | No | - | http://localhost:5001
prompt | Yes | - | -
negative_prompt | No | - | -
width | No | - | -
height | No | - | -
steps | No | - | -
cfg_scale | No | - | -
sampler_name | No | - | -
seed | No | - | -

Implementation Reference

  • Zod schema defining the input parameters for the kobold_txt2img tool, including prompt, dimensions, steps, etc., extending the base config schema.
    const Txt2ImgSchema = BaseConfigSchema.extend({
        prompt: z.string(),
        negative_prompt: z.string().optional(),
        width: z.number().optional(),
        height: z.number().optional(),
        steps: z.number().optional(),
        cfg_scale: z.number().optional(),
        sampler_name: z.string().optional(),
        seed: z.number().optional(),
    });
  • src/index.ts:244-248 (registration)
    Registration of the kobold_txt2img tool in the list of available tools returned by ListToolsRequest.
    {
        name: "kobold_txt2img",
        description: "Generate image from text prompt",
        inputSchema: zodToJsonSchema(Txt2ImgSchema),
    },
  • Core handler logic for executing POST endpoint tools, including kobold_txt2img: it parses the arguments, proxies a POST request to the KoboldAI Stable Diffusion API endpoint, and returns the JSON response.
    if (postEndpoints[name]) {
        const { endpoint, schema } = postEndpoints[name];
        const parsed = schema.safeParse(args);
        if (!parsed.success) {
            throw new Error(`Invalid arguments: ${parsed.error}`);
        }

        // Assumed step, absent from this excerpt: separate the server URL
        // from the fields that are forwarded as the request body.
        const { apiUrl, ...requestData } = parsed.data;

        const result = await makeRequest(`${apiUrl}${endpoint}`, 'POST', requestData);
        return {
            content: [{ type: "text", text: JSON.stringify(result, null, 2) }],
            isError: false,
        };
    }
  • Specific endpoint configuration for kobold_txt2img tool within the postEndpoints mapping used by the generic handler.
    kobold_txt2img: { endpoint: '/sdapi/v1/txt2img', schema: Txt2ImgSchema },
  • Utility function for making HTTP requests to the KoboldAI API, used by all tool handlers including kobold_txt2img.
    async function makeRequest(url: string, method = 'GET', body: Record<string, unknown> | null = null) {
        const options: RequestInit = {
            method,
            headers: body ? { 'Content-Type': 'application/json' } : undefined,
        };
        
        if (body && method !== 'GET') {
            options.body = JSON.stringify(body);
        }
    
        const response = await fetch(url, options);
        if (!response.ok) {
            throw new Error(`KoboldAI API error: ${response.statusText}`);
        }
        
        return response.json();
    }
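
To make the flow above concrete, here is a minimal sketch of how a validated argument object might be turned into the POST request that the handler proxies. It assumes, because the tool schema extends the base config schema, that `apiUrl` selects the server and is split off before the remaining fields are sent as the body. `Txt2ImgArgs`, `buildTxt2ImgRequest`, and the sample values are hypothetical names introduced for illustration, and no network call is made.

```typescript
// Hypothetical argument shape mirroring the fields listed in the input schema.
interface Txt2ImgArgs {
    apiUrl?: string;
    prompt: string;
    negative_prompt?: string;
    width?: number;
    height?: number;
    steps?: number;
    cfg_scale?: number;
    sampler_name?: string;
    seed?: number;
}

// Default taken from the input schema table.
const DEFAULT_API_URL = "http://localhost:5001";
// Endpoint taken from the postEndpoints mapping.
const TXT2IMG_ENDPOINT = "/sdapi/v1/txt2img";

interface BuiltRequest {
    url: string;
    method: string;
    headers: Record<string, string>;
    body: string;
}

// Assumption: apiUrl is used only to build the URL and is not forwarded in the body.
function buildTxt2ImgRequest(args: Txt2ImgArgs): BuiltRequest {
    const { apiUrl = DEFAULT_API_URL, ...requestData } = args;
    return {
        url: `${apiUrl}${TXT2IMG_ENDPOINT}`,
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(requestData),
    };
}

// Sample values are illustrative only.
const request = buildTxt2ImgRequest({
    prompt: "a lighthouse at dusk, watercolor",
    width: 512,
    height: 512,
    steps: 20,
});
console.log(request.url);
console.log(request.body);
```

Keeping the request construction separate from the `fetch` call makes it possible to inspect the exact payload before anything is sent to the server.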
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool generates images but doesn't mention performance aspects (e.g., generation time, rate limits), authentication needs, or output behavior (e.g., image format, size limits). This leaves significant gaps for a tool with 9 parameters and no output schema.
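
As an illustration of the undocumented output side, the sketch below decodes image data from a response object. It assumes the endpoint follows the common Stable Diffusion web API convention of returning an `images` array of base64-encoded strings; that response shape is not confirmed anywhere on this page, and `decodeImages` is a hypothetical helper.

```typescript
// Assumption: the response JSON carries an `images` array of base64 strings.
// Anything that does not match that shape yields an empty list.
function decodeImages(result: unknown): Uint8Array[] {
    if (typeof result !== "object" || result === null) {
        return [];
    }
    const images = (result as { images?: unknown }).images;
    if (!Array.isArray(images)) {
        return [];
    }
    return images
        .filter((item): item is string => typeof item === "string")
        .map((b64) => {
            // atob yields a binary string; copy each char code into a byte array.
            const binary = atob(b64);
            const bytes = new Uint8Array(binary.length);
            for (let i = 0; i < binary.length; i++) {
                bytes[i] = binary.charCodeAt(i);
            }
            return bytes;
        });
}

// "aGk=" is the base64 encoding of the two ASCII bytes "hi".
const sample = { images: ["aGk="] };
const decoded = decodeImages(sample);
console.log(decoded.length, decoded[0]?.length);
```

Because the handler returns the upstream JSON as a text block, a caller would first need to `JSON.parse` that text before passing it to a helper like this one.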

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with a single, front-loaded sentence: 'Generate image from text prompt'. There is no wasted verbiage, and it directly communicates the core function without unnecessary elaboration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (9 parameters, no annotations, no output schema), the description is incomplete. It doesn't address key aspects like the tool's dependencies (e.g., API setup), output format, error handling, or how parameters interact. For a generative AI tool with multiple configuration options, this leaves too much unspecified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 0% description coverage for its 9 parameters, and the description adds no semantic information beyond the tool's purpose. It doesn't explain what parameters like 'cfg_scale', 'sampler_name', or 'seed' mean, nor does it provide context for defaults or constraints (e.g., valid ranges for width/height). This fails to compensate for the low schema coverage.
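
One way to close this gap is to attach a short description to each parameter. The mapping below is only a sketch: the wording reflects the usual meaning of these names in Stable Diffusion front ends and is an assumption, not text from the server's source, and it omits value ranges because this page does not state any. `parameterDescriptions` and `describeProperties` are hypothetical names.

```typescript
// Assumed descriptions; none of this wording comes from the server itself.
const parameterDescriptions: Record<string, string> = {
    apiUrl: "Base URL of the KoboldAI API server (default http://localhost:5001).",
    prompt: "Text describing the image to generate.",
    negative_prompt: "Text describing what the image should avoid.",
    width: "Output image width in pixels.",
    height: "Output image height in pixels.",
    steps: "Number of sampling steps.",
    cfg_scale: "Classifier-free guidance scale; higher values follow the prompt more closely.",
    sampler_name: "Name of the sampling algorithm to use.",
    seed: "Random seed; reusing a seed with the same settings aims to reproduce a result.",
};

type JsonSchemaProperty = { type: string; description?: string };

// Returns a copy of the properties with a description added wherever one is known.
function describeProperties(
    properties: Record<string, JsonSchemaProperty>,
    descriptions: Record<string, string>,
): Record<string, JsonSchemaProperty> {
    const out: Record<string, JsonSchemaProperty> = {};
    for (const [name, prop] of Object.entries(properties)) {
        const description = descriptions[name];
        out[name] = description ? { ...prop, description } : { ...prop };
    }
    return out;
}

const described = describeProperties(
    {
        prompt: { type: "string" },
        seed: { type: "number" },
        unknown_field: { type: "string" },
    },
    parameterDescriptions,
);
console.log(JSON.stringify(described, null, 2));
```

With Zod, the same text could instead be attached through `.describe()` on each field, so that the descriptions travel with the schema definition.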

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Generate image from text prompt' clearly states the tool's function (text-to-image generation) but is vague about specifics like the AI model or output format. It distinguishes from siblings like kobold_chat or kobold_tts by focusing on image generation, but lacks detail on how it differs from kobold_img2img (text-to-image vs. image-to-image).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. It doesn't mention sibling tools like kobold_img2img for image-to-image tasks or kobold_generate for text generation, nor does it specify prerequisites (e.g., needing an API running). The description implies usage for image generation but offers no context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/PhialsBasement/KoboldCPP-MCP-Server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.