chat_completion

Generate chat responses using DeepSeek V4 models, with support for multi-turn conversations, thinking modes, and customizable generation parameters.

Instructions

Primary DeepSeek V4 chat tool for single-turn and multi-turn generation. Defaults to deepseek-v4-flash; use deepseek-v4-pro for higher-capability reasoning. Provide either message (simple single user turn) or messages (full chat history); if both are provided, messages is used. Thinking mode is enabled by DeepSeek by default; pass thinking:{type:"disabled"} for non-thinking mode, and use reasoning_effort:"high"|"max" when thinking is enabled. Use conversation_id to persist context across calls and clear_conversation=true to reset stored state before sending the next turn. Set include_raw_response=true only for debugging because it returns the full provider payload.
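For illustration, a multi-turn call might pass arguments like the following (the conversation ID and parameter values are hypothetical, not taken from the schema defaults):

```typescript
// Hypothetical chat_completion arguments continuing a stored conversation
// with the higher-capability model.
const args = {
  message: "Summarize your previous answer in one sentence.",
  model: "deepseek-v4-pro",
  conversation_id: "session-42", // persists context across calls
  reasoning_effort: "high",      // valid only while thinking is enabled
  max_tokens: 256,
};
```

A follow-up call with the same `conversation_id` would see the stored history; passing `clear_conversation: true` instead would reset it before the turn is sent.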

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| message | No | Single user turn; ignored when `messages` is provided | |
| messages | No | Full chat history (non-empty array) | |
| model | No | Model ID | deepseek-v4-flash |
| conversation_id | No | Key for persisted conversation state | |
| clear_conversation | No | Reset stored history before this turn | false |
| frequency_penalty | No | Number in [-2, 2] | |
| max_tokens | No | Positive integer | |
| presence_penalty | No | Number in [-2, 2] | |
| response_format | No | Object with `type: "text"` or `"json_object"` | |
| stop | No | Stop sequence(s) | |
| stream | No | Enable streaming | false |
| stream_options | No | Streaming options | |
| temperature | No | Number in [0, 2] | |
| top_p | No | Number in [0, 1] | |
| tools | No | Tool definitions | |
| tool_choice | No | Tool selection strategy | |
| logprobs | No | Return log probabilities | |
| top_logprobs | No | Integer in [0, 20]; requires `logprobs=true` | |
| thinking | No | Thinking mode configuration | |
| reasoning_effort | No | `"high"` or `"max"` | |
| include_raw_response | No | Include full provider payload | false |
| extra_body | No | Extra fields merged into the request | |

Implementation Reference

  • The main handler function for the chat_completion tool. Normalizes input messages, manages conversation history (retrieve/clear/save), builds the API request, calls createChatCompletion on the DeepSeek client, and returns a structured response with text, reasoning, tool calls, usage, and optionally raw response.
    async (input) => {
      try {
        const normalizedInput = input as ChatCompletionToolInput;
    
        const conversationId = normalizedInput.conversation_id;
        if (conversationId && normalizedInput.clear_conversation) {
          options.conversations.clear(conversationId);
        }
    
        const newMessages = normalizeInputMessages(normalizedInput);
        const existingHistory = conversationId ? (options.conversations.get(conversationId) ?? []) : [];
        const outboundMessages = conversationId ? [...existingHistory, ...newMessages] : newMessages;
    
        const request = buildChatCompletionRequest(normalizedInput, outboundMessages, options.defaultModel);
        const result = await options.client.createChatCompletion(request);
    
        const choice = result.response.choices[0];
        const assistantMessage = choice?.message;
    
        if (conversationId && assistantMessage) {
          options.conversations.set(conversationId, [
            ...outboundMessages,
            {
              role: "assistant",
              content: assistantMessage.content,
              reasoning_content: assistantMessage.reasoning_content,
              tool_calls: assistantMessage.tool_calls,
            },
          ]);
        }
    
        const responseText = assistantMessage?.content ?? "";
        const reasoning = assistantMessage?.reasoning_content;
        const toolCalls = assistantMessage?.tool_calls ?? [];
        const includeRawResponse = normalizedInput.include_raw_response;
    
        const summary = [
          responseText || "(no assistant content returned)",
          reasoning ? "\nReasoning:\n" + reasoning : undefined,
          toolCalls.length > 0 ? "\nTool calls returned by model: " + JSON.stringify(toolCalls, null, 2) : undefined,
        ]
          .filter(Boolean)
          .join("\n");
    
        const structuredContent: Record<string, unknown> = {
          model: result.response.model,
          conversation_id: conversationId ?? null,
          response_text: responseText,
          reasoning_content: reasoning ?? null,
          tool_calls: toolCalls,
          finish_reason: choice?.finish_reason ?? null,
          usage: result.response.usage ?? null,
          stream_chunk_count: result.streamChunkCount ?? null,
        };
    
        if (includeRawResponse) {
          structuredContent.raw_response = result.response;
        }
    
        return {
          content: [{ type: "text", text: summary }],
          structuredContent,
        };
      } catch (error) {
        return makeToolErrorResult(error);
      }
    },
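The retrieve/append/save flow above can be reduced to a small standalone sketch. The Map-backed store here is an assumption for illustration; the actual type of `options.conversations` is not shown on this page:

```typescript
type StoredMsg = { role: string; content: string };

// Assumed in-memory store mirroring the handler's persistence flow:
// read existing history, append the new turns, write the result back.
const conversations = new Map<string, StoredMsg[]>();

function recordTurn(id: string, userText: string, assistantText: string): void {
  const history = conversations.get(id) ?? [];
  conversations.set(id, [
    ...history,
    { role: "user", content: userText },
    { role: "assistant", content: assistantText },
  ]);
}

recordTurn("session-42", "Hello", "Hi there!");
recordTurn("session-42", "How are you?", "Doing well.");
```

After two turns, the stored history for `"session-42"` holds four messages, which is what makes later calls with the same `conversation_id` context-aware.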
  • Zod input schema for the chat_completion tool. Defines all accepted parameters: message/messages, model, conversation_id, clear_conversation, frequency_penalty, max_tokens, presence_penalty, response_format, stop, stream, stream_options, temperature, top_p, tools, tool_choice, logprobs, top_logprobs, thinking, reasoning_effort, include_raw_response, extra_body. Includes superRefine validation.
    export const chatCompletionToolInputSchema = z
      .object({
        message: z.string().min(1).optional(),
        messages: z.array(chatMessageSchema).min(1).optional(),
        model: z.string().default("deepseek-v4-flash"),
        conversation_id: z.string().min(1).optional(),
        clear_conversation: z.boolean().default(false),
        frequency_penalty: z.number().min(-2).max(2).optional(),
        max_tokens: z.number().int().positive().optional(),
        presence_penalty: z.number().min(-2).max(2).optional(),
        response_format: z
          .object({
            type: z.enum(["text", "json_object"]),
          })
          .passthrough()
          .optional(),
        stop: stopSchema.optional(),
        stream: z.boolean().default(false),
        stream_options: streamOptionsSchema.optional(),
        temperature: z.number().min(0).max(2).optional(),
        top_p: z.number().min(0).max(1).optional(),
        tools: z.array(toolDefinitionSchema).optional(),
        tool_choice: toolChoiceSchema.optional(),
        logprobs: z.boolean().optional(),
        top_logprobs: z.number().int().min(0).max(20).optional(),
        thinking: thinkingSchema.optional(),
        reasoning_effort: z.enum(["high", "max"]).optional(),
        include_raw_response: z.boolean().default(false),
        extra_body: z.record(z.string(), z.unknown()).optional(),
      })
      .superRefine((value, context) => {
        if (!value.message && !value.messages) {
          context.addIssue({
            code: z.ZodIssueCode.custom,
            message: "Either `message` or `messages` must be provided",
          });
        }
    
        if (value.top_logprobs !== undefined && !value.logprobs) {
          context.addIssue({
            code: z.ZodIssueCode.custom,
            message: "`top_logprobs` requires `logprobs=true`",
          });
        }
      });
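The two `superRefine` checks can be exercised in isolation; this standalone sketch reimplements them without zod (the `RawInput` shape and function name are illustrative):

```typescript
interface RawInput {
  message?: string;
  messages?: unknown[];
  logprobs?: boolean;
  top_logprobs?: number;
}

// Reimplementation of the schema's two cross-field checks.
function crossFieldIssues(value: RawInput): string[] {
  const issues: string[] = [];
  if (!value.message && !value.messages) {
    issues.push("Either `message` or `messages` must be provided");
  }
  if (value.top_logprobs !== undefined && !value.logprobs) {
    issues.push("`top_logprobs` requires `logprobs=true`");
  }
  return issues;
}
```

An empty input fails the first check, `top_logprobs` without `logprobs: true` fails the second, and a message with both logprob fields set passes cleanly.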
  • Registration of the 'chat_completion' tool via server.registerTool(). Binds the tool name, description, input schema, and async handler function.
    function registerTools(server: McpServer, options: DeepSeekMcpServerOptions): void {
      server.registerTool(
        "chat_completion",
        {
          description:
            "Primary DeepSeek V4 chat tool for single-turn and multi-turn generation. Defaults to `deepseek-v4-flash`; use `deepseek-v4-pro` for higher-capability reasoning. Provide either `message` (simple single user turn) or `messages` (full chat history); if both are provided, `messages` is used. Thinking mode is enabled by DeepSeek by default; pass `thinking:{type:\"disabled\"}` for non-thinking mode, and use `reasoning_effort:\"high\"|\"max\"` when thinking is enabled. Use `conversation_id` to persist context across calls and `clear_conversation=true` to reset stored state before sending the next turn. Set `include_raw_response=true` only for debugging because it returns the full provider payload.",
          inputSchema: chatCompletionToolInputSchema,
        },
        async (input) => {
          // ...same handler implementation as the first code snippet above...
        },
      );
    }
  • Helper function normalizeInputMessages() that converts a ChatCompletionToolInput into an array of DeepSeekChatMessage. If 'messages' array is provided, it's used directly; otherwise a single 'message' string is wrapped as a user message.
    function normalizeInputMessages(input: ChatCompletionToolInput): DeepSeekChatMessage[] {
      if (input.messages && input.messages.length > 0) {
        return input.messages as DeepSeekChatMessage[];
      }
    
      if (input.message) {
        return [{ role: "user", content: input.message }];
      }
    
      throw new Error("Either `message` or `messages` must be provided");
    }
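The precedence rule (`messages` wins over `message` when both are set) can be shown with a simplified standalone copy of this helper; the `Msg` type here is a stand-in for `DeepSeekChatMessage`:

```typescript
type Msg = { role: string; content: string };

// Simplified copy of normalizeInputMessages: a non-empty `messages`
// array takes precedence over a plain `message` string.
function normalize(input: { message?: string; messages?: Msg[] }): Msg[] {
  if (input.messages && input.messages.length > 0) {
    return input.messages;
  }
  if (input.message) {
    return [{ role: "user", content: input.message }];
  }
  throw new Error("Either `message` or `messages` must be provided");
}

const fromHistory = normalize({
  message: "hi",
  messages: [{ role: "user", content: "hello" }],
});
```

Here `fromHistory` contains only the `messages` entry; the `message` string is ignored, matching the behavior documented in the tool description.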
  • Helper function buildChatCompletionRequest() that constructs a DeepSeekChatCompletionRequest from tool input and conversation messages. Sets model, messages, and optional fields (frequency_penalty, max_tokens, temperature, tools, thinking, etc.), plus extra_body passthrough.
    function buildChatCompletionRequest(
      input: ChatCompletionToolInput,
      messages: DeepSeekChatMessage[],
      defaultModel: string,
    ): DeepSeekChatCompletionRequest {
      const request: DeepSeekChatCompletionRequest = {
        model: input.model ?? defaultModel,
        messages,
      };
    
      const optionalFields: (keyof ChatCompletionToolInput)[] = [
        "frequency_penalty",
        "max_tokens",
        "presence_penalty",
        "response_format",
        "stop",
        "stream",
        "stream_options",
        "temperature",
        "top_p",
        "tools",
        "tool_choice",
        "logprobs",
        "top_logprobs",
        "thinking",
        "reasoning_effort",
      ];
      const requestRecord = request as Record<string, unknown>;
    
      for (const field of optionalFields) {
        const value = input[field];
        if (value !== undefined) {
          requestRecord[field] = value;
        }
      }
    
      if (input.extra_body) {
        Object.assign(request, input.extra_body);
      }
    
      return request;
    }
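Because `Object.assign(request, input.extra_body)` runs after the optional-field loop, `extra_body` keys can overwrite fields already copied from the input. A minimal reproduction of that merge order (the field names are illustrative):

```typescript
// extra_body is merged last, so its keys win over same-named request fields.
const request: Record<string, unknown> = {
  model: "deepseek-v4-flash",
  temperature: 0.2,
};
const extraBody = { temperature: 0.9, custom_provider_flag: true }; // hypothetical
Object.assign(request, extraBody);
```

This makes `extra_body` a true escape hatch: it can both add provider-specific fields and override anything the tool set from its own parameters.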
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, description fully discloses key behaviors: default model, thinking mode enabled by default, conversation ID for context, clear flag, and raw response debugging. Covers essential operational traits beyond the tool name.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single paragraph with every sentence adding value. Front-loaded with purpose, then efficient explanatory details. No superfluous content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers the main workflow (single/multi-turn, model choice, context persistence, thinking, debugging). Missing details on streaming, tool calling, response format, and return value structure. Adequate for most use cases but not exhaustive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 0% schema coverage, the description explains the most critical parameters (message/messages, model, conversation_id, clear_conversation, thinking, reasoning_effort, include_raw_response). Some parameters like frequency_penalty, max_tokens, and streaming are omitted, but core semantics are well addressed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it is the primary DeepSeek V4 chat tool for single-turn and multi-turn generation, distinguishing it from siblings like 'completion'. Mentions model variants and their use cases.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides guidance on model selection (flash vs pro), conversation persistence, thinking mode, and debugging usage. Does not explicitly exclude alternatives or specify when not to use, but offers sufficient context for typical use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
