
completion

Generates text between a prompt and optional suffix using fill-in-the-middle completion. Returns raw output without chat formatting.

Instructions

DeepSeek V4 Pro FIM completion tool for prompt/suffix fill-in-the-middle workflows. Defaults to deepseek-v4-pro. Use this when you need raw completion text instead of chat message formatting. Set include_raw_response=true only when you need the full provider payload for debugging.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| model | No | Model ID used for the completion. | deepseek-v4-pro |
| prompt | Yes | Text preceding the insertion point; must be non-empty. | |
| suffix | No | Text following the insertion point for fill-in-the-middle. | |
| max_tokens | No | Maximum number of tokens to generate (positive integer). | |
| temperature | No | Sampling temperature, 0 to 2. | |
| top_p | No | Nucleus sampling probability mass, 0 to 1. | |
| stream | No | Stream the response over SSE; chunks are aggregated before returning. | false |
| logprobs | No | Number of log probabilities to return, 0 to 20. | |
| echo | No | Echo the prompt back along with the completion. | |
| stop | No | Stop sequence(s): a string or an array of strings. | |
| presence_penalty | No | Presence penalty, -2 to 2. | |
| frequency_penalty | No | Frequency penalty, -2 to 2. | |
| include_raw_response | No | Include the full provider payload in structuredContent. | false |
| extra_body | No | Extra key/value pairs merged into the request body. | |
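To illustrate the fill-in-the-middle workflow, here is a hypothetical input for the tool. The prompt and suffix values are made up for illustration; the field names and constraints mirror the schema above.

```typescript
// Hypothetical example input: ask the model to fill in a function body
// given its signature (prompt) and closing brace (suffix).
const exampleInput = {
  model: "deepseek-v4-pro",
  prompt: "function add(a: number, b: number): number {\n",
  suffix: "\n}",
  max_tokens: 64,
  temperature: 0.0,
  stop: ["\n\n"],
};
```

The model generates only the text between prompt and suffix; no chat roles or message wrappers are involved.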

Implementation Reference

  • Tool registration for 'completion' using server.registerTool with description, inputSchema, and the async handler function (lines 294-334). Also listed in ENDPOINT_MATRIX at line 37.
    server.registerTool(
      "completion",
      {
        description:
          "DeepSeek V4 Pro FIM completion tool for prompt/suffix fill-in-the-middle workflows. Defaults to `deepseek-v4-pro`. Use this when you need raw completion text instead of chat message formatting. Set `include_raw_response=true` only when you need the full provider payload for debugging.",
        inputSchema: completionToolInputSchema,
      },
      async (input) => {
        try {
          const normalizedInput = input as CompletionToolInput;
          const request = buildCompletionRequest(normalizedInput, options.defaultModel);
          const result = await options.client.createCompletion(request);
          const choice = result.response.choices[0];
          const includeRawResponse = normalizedInput.include_raw_response;
    
          const structuredContent: Record<string, unknown> = {
            model: result.response.model,
            text: choice?.text ?? "",
            finish_reason: choice?.finish_reason ?? null,
            usage: result.response.usage ?? null,
            stream_chunk_count: result.streamChunkCount ?? null,
          };
    
          if (includeRawResponse) {
            structuredContent.raw_response = result.response;
          }
    
          return {
            content: [
              {
                type: "text",
                text: choice?.text || "(no completion text returned)",
              },
            ],
            structuredContent,
          };
        } catch (error) {
          return makeToolErrorResult(error);
        }
      },
    );
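Based on the handler above, a successful call yields a structuredContent object of the following shape. The text and token counts below are made up for illustration.

```typescript
// Hypothetical example of the structuredContent a successful call
// produces; raw_response is added only when include_raw_response=true.
const exampleStructuredContent = {
  model: "deepseek-v4-pro",
  text: "  return a + b;",
  finish_reason: "stop",
  usage: { prompt_tokens: 12, completion_tokens: 6, total_tokens: 18 },
  stream_chunk_count: null, // populated only when stream=true
};
```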
  • Input schema for the 'completion' tool defined with Zod. Includes fields: model (default deepseek-v4-pro), prompt (required), suffix, max_tokens, temperature, top_p, stream, logprobs, echo, stop, presence_penalty, frequency_penalty, include_raw_response, extra_body.
    export const completionToolInputSchema = z.object({
      model: z.string().default("deepseek-v4-pro"),
      prompt: z.string().min(1),
      suffix: z.string().optional(),
      max_tokens: z.number().int().positive().optional(),
      temperature: z.number().min(0).max(2).optional(),
      top_p: z.number().min(0).max(1).optional(),
      stream: z.boolean().default(false),
      logprobs: z.number().int().min(0).max(20).optional(),
      echo: z.boolean().optional(),
      stop: stopSchema.optional(),
      presence_penalty: z.number().min(-2).max(2).optional(),
      frequency_penalty: z.number().min(-2).max(2).optional(),
      include_raw_response: z.boolean().default(false),
      extra_body: z.record(z.string(), z.unknown()).optional(),
    });
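The practical effect of the `.default(...)` calls is that omitted fields are filled in at parse time. A minimal sketch of that behavior, written without the zod dependency (the function name is invented for illustration):

```typescript
// Sketch (no zod) of how the schema's defaults behave: fields with
// .default(...) are filled in when omitted; caller values win otherwise.
type RawCompletionInput = { prompt: string } & Record<string, unknown>;

function applyCompletionDefaults(input: RawCompletionInput) {
  if (typeof input.prompt !== "string" || input.prompt.length === 0) {
    throw new Error("prompt is required and must be non-empty");
  }
  return {
    model: "deepseek-v4-pro",
    stream: false,
    include_raw_response: false,
    ...input, // caller-supplied values override the defaults
  };
}

const parsed = applyCompletionDefaults({ prompt: "def add(a, b):" });
```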
  • Helper function buildCompletionRequest that transforms the typed CompletionToolInput into a DeepSeekCompletionRequest, copying optional fields and extra_body.
    function buildCompletionRequest(
      input: CompletionToolInput,
      defaultModel: string,
    ): DeepSeekCompletionRequest {
      const request: DeepSeekCompletionRequest = {
        model: input.model ?? defaultModel,
        prompt: input.prompt,
      };
    
      const optionalFields: (keyof CompletionToolInput)[] = [
        "suffix",
        "max_tokens",
        "temperature",
        "top_p",
        "stream",
        "logprobs",
        "echo",
        "stop",
        "presence_penalty",
        "frequency_penalty",
      ];
      const requestRecord = request as Record<string, unknown>;
    
      for (const field of optionalFields) {
        const value = input[field];
        if (value !== undefined) {
          requestRecord[field] = value;
        }
      }
    
      if (input.extra_body) {
        Object.assign(request, input.extra_body);
      }
    
      return request;
    }
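One subtlety of the builder above is that `extra_body` is merged last with `Object.assign`, so its keys can override top-level fields such as `model`. A self-contained sketch of that merge order (field list trimmed for brevity):

```typescript
// Condensed sketch of the field-copying pattern: optional fields are
// copied only when defined, then extra_body is merged last and can
// override anything, including model and prompt.
function buildRequest(
  input: { model?: string; prompt: string; extra_body?: Record<string, unknown> } &
    Record<string, unknown>,
  defaultModel: string,
): Record<string, unknown> {
  const request: Record<string, unknown> = {
    model: input.model ?? defaultModel,
    prompt: input.prompt,
  };
  for (const field of ["suffix", "max_tokens", "temperature"]) {
    if (input[field] !== undefined) request[field] = input[field];
  }
  if (input.extra_body) Object.assign(request, input.extra_body);
  return request;
}

const req = buildRequest(
  { prompt: "hello", extra_body: { model: "deepseek-v4-pro-beta" } },
  "deepseek-v4-pro",
);
// req.model is "deepseek-v4-pro-beta": the extra_body key won the merge
```

Useful for passing provider extensions through, but worth keeping in mind when debugging unexpected request fields.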
  • The createCompletion method on DeepSeekApiClient that makes the actual API call to /beta/completions, supporting both streaming (aggregated via aggregateCompletionChunks) and non-streaming modes, returning a CompletionExecutionResult.
    async createCompletion(request: DeepSeekCompletionRequest): Promise<CompletionExecutionResult> {
      if (request.stream) {
        const chunks = await this.requestSseJson<unknown>({
          method: "POST",
          path: "/beta/completions",
          body: request as Record<string, unknown>,
          stream: true,
        });
    
        return {
          response: aggregateCompletionChunks(chunks, String(request.model)),
          streamChunkCount: chunks.length,
        };
      }
    
      const response = await this.requestJson<DeepSeekCompletionResponse>({
        method: "POST",
        path: "/beta/completions",
        body: request as Record<string, unknown>,
        stream: false,
      });
    
      return { response };
    }
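The `aggregateCompletionChunks` helper is not shown here, but for SSE completion streams each chunk carries a partial `choices[0].text`, and the aggregated response concatenates those pieces in arrival order. A hedged sketch of that text-concatenation step, under that assumption:

```typescript
// Assumed chunk shape for a streamed completion; the real helper may
// also track finish_reason, logprobs, and usage across chunks.
interface CompletionChunk {
  choices: { index: number; text: string; finish_reason: string | null }[];
}

// Concatenate the partial texts in arrival order.
function aggregateText(chunks: CompletionChunk[]): string {
  return chunks.map((chunk) => chunk.choices[0]?.text ?? "").join("");
}

const streamed: CompletionChunk[] = [
  { choices: [{ index: 0, text: "  return ", finish_reason: null }] },
  { choices: [{ index: 0, text: "a + b;", finish_reason: "stop" }] },
];
const fullText = aggregateText(streamed);
// fullText === "  return a + b;"
```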
  • Type definitions for DeepSeekCompletionRequest (request shape), DeepSeekCompletionChoice, DeepSeekCompletionResponse, and CompletionExecutionResult used by the completion tool.
    export interface DeepSeekCompletionRequest {
      model: DeepSeekModelId;
      prompt: string;
      suffix?: string;
      max_tokens?: number;
      temperature?: number;
      top_p?: number;
      stream?: boolean;
      logprobs?: number;
      echo?: boolean;
      stop?: string | string[];
      presence_penalty?: number;
      frequency_penalty?: number;
      [key: string]: unknown;
    }
    
    export interface DeepSeekUsage {
      prompt_tokens: number;
      completion_tokens: number;
      total_tokens: number;
      prompt_cache_hit_tokens?: number;
      prompt_cache_miss_tokens?: number;
      completion_tokens_details?: {
        reasoning_tokens?: number;
        [key: string]: unknown;
      };
      [key: string]: unknown;
    }
    
    export interface DeepSeekChatCompletionChoice {
      index: number;
      message: {
        role: "assistant";
        content: string | null;
        reasoning_content?: string;
        tool_calls?: DeepSeekToolCall[];
      };
      finish_reason: string | null;
      logprobs?: unknown;
    }
    
    export interface DeepSeekChatCompletionResponse {
      id: string;
      object: string;
      created: number;
      model: string;
      choices: DeepSeekChatCompletionChoice[];
      usage?: DeepSeekUsage;
      system_fingerprint?: string;
      [key: string]: unknown;
    }
    
    export interface DeepSeekCompletionChoice {
      index: number;
      text: string;
      logprobs?: unknown;
      finish_reason: string | null;
    }
    
    export interface DeepSeekCompletionResponse {
      id: string;
      object: string;
      created: number;
      model: string;
      choices: DeepSeekCompletionChoice[];
      usage?: DeepSeekUsage;
      system_fingerprint?: string;
      [key: string]: unknown;
    }
    
    export interface DeepSeekModel {
      id: string;
      object: string;
      owned_by?: string;
      created?: number;
      [key: string]: unknown;
    }
    
    export interface DeepSeekListModelsResponse {
      object: string;
      data: DeepSeekModel[];
    }
    
    export interface DeepSeekBalanceInfo {
      currency: string;
      total_balance: string;
      granted_balance: string;
      topped_up_balance: string;
    }
    
    export interface DeepSeekUserBalanceResponse {
      is_available: boolean;
      balance_infos: DeepSeekBalanceInfo[];
    }
    
    export interface ChatCompletionExecutionResult {
      response: DeepSeekChatCompletionResponse;
      streamChunkCount?: number;
    }
    
    export interface CompletionExecutionResult {
      response: DeepSeekCompletionResponse;
      streamChunkCount?: number;
    }
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description must cover behavior. Only mentions model default and include_raw_response usage, but lacks details on response format, limits, or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences front-loading purpose and usage, zero wasted words. Efficient and well-structured for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite clear purpose, the description is insufficient for a tool with 14 parameters and no output schema. Missing parameter meanings and response semantics.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With no schema description coverage across 14 parameters, the description only marginally addresses 'model' and 'include_raw_response'. It provides no information on the other parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it's a FIM completion tool for prompt/suffix workflows, contrasts with chat formatting, and specifies the default model. Differentiates from sibling 'chat_completion'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use the tool (raw completion vs. chat) and when to set 'include_raw_response'. It does not explicitly say when not to use the tool, but provides clear context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

