Glama

DeepSeek Chat Completion

deepseek_chat

Chat with AI models for general conversations or complex reasoning tasks. Supports multi-turn sessions, function calling, JSON output, and multimodal input.

Instructions

Chat with DeepSeek AI models. Supports deepseek-chat for general conversations and deepseek-reasoner for complex reasoning tasks with chain-of-thought explanations. Features: multi-turn sessions (session_id), function calling (tools parameter), thinking mode, JSON output mode, multimodal input (when enabled), automatic cost tracking, and model fallback with circuit breaker resilience.
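
As a quick illustration, a minimal multi-turn call might send arguments like the following (the session ID and message text are invented for the example):

```json
{
  "model": "deepseek-chat",
  "session_id": "support-thread-42",
  "messages": [
    { "role": "user", "content": "Summarize our previous discussion as json." }
  ],
  "json_mode": true
}
```

Because `session_id` is present, any earlier messages stored under that session are prepended before the request is sent, and the word "json" in the prompt satisfies the `json_mode` recommendation.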

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| messages | Yes | Array of conversation messages. Each message has a role (system/user/assistant/tool) and content (a string, or an array of content parts for multimodal input). Tool messages require tool_call_id. | |
| model | No | Model to use. Both run DeepSeek V3.2 (128K context). deepseek-chat: non-thinking mode (max 8K output); deepseek-reasoner: thinking mode (max 64K output). | deepseek-chat |
| temperature | No | Sampling temperature (0-2). Higher = more random. Ignored when thinking mode is enabled. | 1.0 |
| max_tokens | No | Maximum tokens to generate. deepseek-chat: max 8192; deepseek-reasoner: max 65536. | |
| stream | No | Enable streaming mode. Returns the full response after streaming completes. | false |
| tools | No | Array of tool definitions for function calling. Each tool has type "function" and a function object with name, description, and parameters (JSON Schema). | |
| tool_choice | No | Controls which tool the model calls: "auto", "none", "required", or {type:"function",function:{name:"..."}}. | "auto" |
| thinking | No | Enable thinking mode with {type: "enabled"}. When enabled, temperature/top_p/frequency_penalty/presence_penalty are automatically ignored. | |
| json_mode | No | Enable JSON output mode: the model will output valid JSON. Include the word "json" in your prompt for best results. Supported by both models. | |
| session_id | No | Session ID for multi-turn conversations. When provided, previous messages from this session are prepended to the current messages; if the session does not exist, it is created automatically. Omit for stateless single-turn requests. | |
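
The session_id semantics described above can be sketched in plain TypeScript. Note that `SessionStore` and `buildMessages` here are illustrative stand-ins, not the server's actual classes:

```typescript
type Msg = { role: string; content: string };

// Illustrative in-memory store: sessions are created on first access.
class SessionStore {
  private sessions = new Map<string, Msg[]>();

  getMessages(id: string): Msg[] {
    if (!this.sessions.has(id)) this.sessions.set(id, []); // auto-create
    return this.sessions.get(id)!;
  }

  addMessages(id: string, msgs: Msg[]): void {
    this.getMessages(id).push(...msgs);
  }
}

// Previous session messages are prepended to the current request's messages;
// without a session_id the request passes through unchanged (stateless).
function buildMessages(
  store: SessionStore,
  sessionId: string | undefined,
  current: Msg[]
): Msg[] {
  if (!sessionId) return current;
  return [...store.getMessages(sessionId), ...current];
}
```

This mirrors the behavior in the handler below, where the stored history is fetched, concatenated ahead of the incoming messages, and the session is updated after the response arrives.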

Output Schema

| Name | Required |
| --- | --- |
| model | Yes |
| usage | Yes |
| content | Yes |
| session_id | No |
| tool_calls | No |
| finish_reason | Yes |
| reasoning_content | No |
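
For a plain (non-tool-call) completion, the structured content roughly takes this shape. The token counts, cost, and session ID below are invented for illustration; cost_usd is appended by the handler alongside the schema fields:

```json
{
  "content": "Here is the summary...",
  "model": "deepseek-chat",
  "usage": {
    "prompt_tokens": 21,
    "completion_tokens": 84,
    "total_tokens": 105
  },
  "finish_reason": "stop",
  "session_id": "support-thread-42",
  "cost_usd": 0.000061
}
```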

Implementation Reference

  • Main handler function and registration for the deepseek_chat tool. Contains the async handler (lines 144-321) that processes chat requests against DeepSeek models, supporting multi-turn sessions, function calling, thinking mode, streaming, and cost tracking. The registerChatTool function (lines 43-323) registers the tool with the MCP server, including its input and output schemas.
    export function registerChatTool(server: McpServer, client: DeepSeekClient): void {
      // Use config's defaultModel if it's a valid model name
      const cfg = getConfig();
      const modelDefault: 'deepseek-chat' | 'deepseek-reasoner' =
        cfg.defaultModel === 'deepseek-reasoner' ? 'deepseek-reasoner' : 'deepseek-chat';
    
      server.registerTool(
        'deepseek_chat',
        {
          title: 'DeepSeek Chat Completion',
          description:
            'Chat with DeepSeek AI models. Supports deepseek-chat for general conversations and ' +
            'deepseek-reasoner for complex reasoning tasks with chain-of-thought explanations. ' +
            'Features: multi-turn sessions (session_id), function calling (tools parameter), thinking mode, ' +
            'JSON output mode, multimodal input (when enabled), automatic cost tracking, and model fallback with circuit breaker resilience.',
          inputSchema: {
            messages: z
              .array(ExtendedMessageSchema)
              .min(1)
              .describe(
                'Array of conversation messages. Each message has role (system/user/assistant/tool) and content (string or array of content parts for multimodal). Tool messages require tool_call_id.'
              ),
            model: z
              .enum(['deepseek-chat', 'deepseek-reasoner'])
              .default(modelDefault)
              .describe(
                'Model to use. Both run DeepSeek V3.2 (128K context). deepseek-chat: non-thinking mode (max 8K output), deepseek-reasoner: thinking mode (max 64K output)'
              ),
            temperature: z
              .number()
              .min(0)
              .max(2)
              .optional()
              .describe('Sampling temperature (0-2). Higher = more random. Default: 1.0. Ignored when thinking mode is enabled.'),
            max_tokens: z
              .number()
              .min(1)
              .max(65536)
              .optional()
              .describe('Maximum tokens to generate. deepseek-chat: max 8192, deepseek-reasoner: max 65536'),
            stream: z
              .boolean()
              .optional()
              .default(false)
              .describe(
                'Enable streaming mode. Returns full response after streaming completes.'
              ),
            tools: z
              .array(ToolDefinitionSchema)
              .max(128)
              .optional()
              .describe(
                'Array of tool definitions for function calling. Each tool has type "function" and a function object with name, description, and parameters (JSON Schema).'
              ),
            tool_choice: ToolChoiceSchema.optional().describe(
              'Controls which tool the model calls. "auto" (default), "none", "required", or {type:"function",function:{name:"..."}}'
            ),
            thinking: ThinkingSchema.describe(
              'Enable thinking mode. When enabled, temperature/top_p/frequency_penalty/presence_penalty are automatically ignored. Use {type: "enabled"} to activate.'
            ),
            json_mode: z
              .boolean()
              .optional()
              .describe(
                'Enable JSON output mode. The model will output valid JSON. Include the word "json" in your prompt for best results. Supported by both models.'
              ),
            session_id: z
              .string()
              .optional()
              .describe(
                'Session ID for multi-turn conversations. When provided, previous messages from this session are prepended to the current messages. ' +
                'If the session does not exist, it is created automatically. Omit for stateless single-turn requests.'
              ),
          },
          outputSchema: {
            content: z.string(),
            reasoning_content: z.string().optional(),
            model: z.string(),
            usage: z.object({
              prompt_tokens: z.number(),
              completion_tokens: z.number(),
              total_tokens: z.number(),
              prompt_cache_hit_tokens: z.number().optional(),
              prompt_cache_miss_tokens: z.number().optional(),
            }),
            finish_reason: z.string(),
            tool_calls: z
              .array(
                z.object({
                  id: z.string(),
                  type: z.literal('function'),
                  function: z.object({
                    name: z.string(),
                    arguments: z.string(),
                  }),
                })
              )
              .optional(),
            session_id: z.string().optional(),
          },
        },
        async (input: DeepSeekChatInputWithSession) => {
          try {
            // Validate message content length
            validateMessageLength(input);
    
            // Validate input with extended schema (supports tools + session_id)
            const validated = ChatInputWithToolsSchema.parse(input);
    
            // Multimodal guard: reject array content when multimodal is disabled
            if (!getConfig().enableMultimodal) {
              for (const msg of validated.messages) {
                if (Array.isArray(msg.content)) {
                  throw new Error(
                    'Multimodal content (image/array) is not enabled. Set ENABLE_MULTIMODAL=true to use multimodal input.'
                  );
                }
              }
            }
    
            // JSON mode guard: warn if "json" word is not in any message content
            if (validated.json_mode) {
              const hasJsonWord = validated.messages.some((m) =>
                getTextContent(m.content).toLowerCase().includes('json')
              );
              if (!hasJsonWord) {
                console.error(
                  '[DeepSeek MCP] Warning: json_mode enabled but no "json" word found in messages. Results may be unreliable.'
                );
              }
            }
    
            // Model-aware max_tokens warnings
            if (validated.max_tokens) {
              if (validated.model === 'deepseek-chat' && validated.max_tokens > 8192) {
                console.error(
                  `[DeepSeek MCP] Warning: deepseek-chat max output is 8192 tokens, requested ${validated.max_tokens}. API may truncate.`
                );
              }
              if (validated.model === 'deepseek-reasoner' && validated.max_tokens > 65536) {
                console.error(
                  `[DeepSeek MCP] Warning: deepseek-reasoner max output is 65536 tokens, requested ${validated.max_tokens}. API may truncate.`
                );
              }
            }
    
            // Session management: build full message list
            let allMessages: ChatMessage[] = validated.messages;
            const sessionStore = SessionStore.getInstance();
    
            if (validated.session_id) {
              // Create or get session
              const session = sessionStore.create(validated.session_id);
              // Prepend previous session messages to current messages
              const previousMessages = sessionStore.getMessages(validated.session_id);
              allMessages = [...previousMessages, ...validated.messages];
    
              console.error(
                `[DeepSeek MCP] Session: id=${validated.session_id}, previous_messages=${previousMessages.length}, total_messages=${allMessages.length}`
              );
            }
    
            console.error(
              `[DeepSeek MCP] Request: model=${validated.model}, messages=${allMessages.length}, stream=${validated.stream}${validated.tools ? `, tools=${validated.tools.length}` : ''}${validated.thinking ? `, thinking=${validated.thinking.type}` : ''}${validated.json_mode ? ', json_mode=true' : ''}${validated.session_id ? `, session=${validated.session_id}` : ''}`
            );
    
            // Build params for client
            const clientParams = {
              model: validated.model,
              messages: allMessages,
              temperature: validated.temperature,
              max_tokens: validated.max_tokens,
              tools: validated.tools,
              tool_choice: validated.tool_choice,
              thinking: validated.thinking,
              response_format: validated.json_mode
                ? ({ type: 'json_object' } as const)
                : undefined,
            };
    
            // Call appropriate method based on stream parameter
            const response = validated.stream
              ? await client.createStreamingChatCompletion(clientParams)
              : await client.createChatCompletion(clientParams);
    
            console.error(
              `[DeepSeek MCP] Response: tokens=${response.usage.total_tokens}, finish_reason=${response.finish_reason}${response.tool_calls ? `, tool_calls=${response.tool_calls.length}` : ''}`
            );
    
            // Calculate cost (model-aware pricing)
            const costBreakdown = calculateCost(response.usage, response.model);
    
            // Update session with new messages and response
            if (validated.session_id) {
              const session = sessionStore.get(validated.session_id);
              if (session) {
                // Add the new user messages to session
                sessionStore.addMessages(validated.session_id, validated.messages);
                // Add assistant response to session (include tool_calls for multi-turn function calling)
                sessionStore.addMessages(validated.session_id, [
                  {
                    role: 'assistant',
                    content: response.content,
                    ...(response.tool_calls ? { tool_calls: response.tool_calls } : {}),
                  },
                ]);
                session.totalCost += costBreakdown.totalCost;
                session.requestCount++;
              }
            }
    
            // Track usage globally
            UsageTracker.getInstance().trackRequest(response.usage, costBreakdown.totalCost);
    
            // Format response
            let responseText = '';
    
            // Add reasoning content if available (for deepseek-reasoner)
            if (response.reasoning_content) {
              responseText += `<thinking>\n${response.reasoning_content}\n</thinking>\n\n`;
            }
    
            responseText += response.content;
    
            // Format tool calls if present
            if (response.tool_calls?.length) {
              responseText += '\n\n**Function Calls:**\n';
              for (const tc of response.tool_calls) {
                responseText += `\`${tc.function.name}\`\n`;
                responseText += `- Call ID: ${tc.id}\n`;
                responseText += `- Arguments: ${tc.function.arguments}\n\n`;
              }
            }
    
            // Add usage stats with cost information (controlled by config)
            if (getConfig().showCostInfo) {
              responseText += `\n---\n**Request Information:**\n`;
              responseText += `- **Tokens:** ${response.usage.total_tokens} (${response.usage.prompt_tokens} prompt + ${response.usage.completion_tokens} completion)\n`;
              responseText += `- **Model:** ${response.model}\n`;
              responseText += `- **Cost:** ${formatCost(costBreakdown)}`;
              if (response.tool_calls?.length) {
                responseText += `\n- **Tool Calls:** ${response.tool_calls.length}`;
              }
              if (validated.session_id) {
                const session = sessionStore.get(validated.session_id);
                if (session) {
                  responseText += `\n- **Session:** ${validated.session_id} (${session.messages.length} messages, ${session.requestCount} requests, $${session.totalCost.toFixed(4)} total)`;
                }
              }
            }
    
            return {
              content: [
                {
                  type: 'text' as const,
                  text: responseText,
                },
              ],
              structuredContent: {
                ...response,
                cost_usd: parseFloat(costBreakdown.totalCost.toFixed(6)),
                ...(validated.session_id ? { session_id: validated.session_id } : {}),
              } as unknown as Record<string, unknown>,
            };
          } catch (error: unknown) {
            console.error('[DeepSeek MCP] Error:', error);
            const errorMessage = getErrorMessage(error);
    
            return {
              content: [
                {
                  type: 'text' as const,
                  text: `Error: ${errorMessage}`,
                },
              ],
              isError: true,
            };
          }
        }
      );
    }
  • Input and output schema definitions for the deepseek_chat tool. The input schema covers messages, model (deepseek-chat/deepseek-reasoner), temperature, max_tokens, stream, tools, tool_choice, thinking, json_mode, and session_id. The output schema covers content, reasoning_content, model, usage metrics, finish_reason, tool_calls, and session_id.
    inputSchema: {
      messages: z
        .array(ExtendedMessageSchema)
        .min(1)
        .describe(
          'Array of conversation messages. Each message has role (system/user/assistant/tool) and content (string or array of content parts for multimodal). Tool messages require tool_call_id.'
        ),
      model: z
        .enum(['deepseek-chat', 'deepseek-reasoner'])
        .default(modelDefault)
        .describe(
          'Model to use. Both run DeepSeek V3.2 (128K context). deepseek-chat: non-thinking mode (max 8K output), deepseek-reasoner: thinking mode (max 64K output)'
        ),
      temperature: z
        .number()
        .min(0)
        .max(2)
        .optional()
        .describe('Sampling temperature (0-2). Higher = more random. Default: 1.0. Ignored when thinking mode is enabled.'),
      max_tokens: z
        .number()
        .min(1)
        .max(65536)
        .optional()
        .describe('Maximum tokens to generate. deepseek-chat: max 8192, deepseek-reasoner: max 65536'),
      stream: z
        .boolean()
        .optional()
        .default(false)
        .describe(
          'Enable streaming mode. Returns full response after streaming completes.'
        ),
      tools: z
        .array(ToolDefinitionSchema)
        .max(128)
        .optional()
        .describe(
          'Array of tool definitions for function calling. Each tool has type "function" and a function object with name, description, and parameters (JSON Schema).'
        ),
      tool_choice: ToolChoiceSchema.optional().describe(
        'Controls which tool the model calls. "auto" (default), "none", "required", or {type:"function",function:{name:"..."}}'
      ),
      thinking: ThinkingSchema.describe(
        'Enable thinking mode. When enabled, temperature/top_p/frequency_penalty/presence_penalty are automatically ignored. Use {type: "enabled"} to activate.'
      ),
      json_mode: z
        .boolean()
        .optional()
        .describe(
          'Enable JSON output mode. The model will output valid JSON. Include the word "json" in your prompt for best results. Supported by both models.'
        ),
      session_id: z
        .string()
        .optional()
        .describe(
          'Session ID for multi-turn conversations. When provided, previous messages from this session are prepended to the current messages. ' +
          'If the session does not exist, it is created automatically. Omit for stateless single-turn requests.'
        ),
    },
    outputSchema: {
      content: z.string(),
      reasoning_content: z.string().optional(),
      model: z.string(),
      usage: z.object({
        prompt_tokens: z.number(),
        completion_tokens: z.number(),
        total_tokens: z.number(),
        prompt_cache_hit_tokens: z.number().optional(),
        prompt_cache_miss_tokens: z.number().optional(),
      }),
      finish_reason: z.string(),
      tool_calls: z
        .array(
          z.object({
            id: z.string(),
            type: z.literal('function'),
            function: z.object({
              name: z.string(),
              arguments: z.string(),
            }),
          })
        )
        .optional(),
      session_id: z.string().optional(),
    },
  • Reusable Zod schema definitions used by the deepseek_chat tool. Includes ChatInputSchema (basic chat parameters), ChatInputWithToolsSchema (extended with tools and session_id), ToolDefinitionSchema, ToolChoiceSchema, and ThinkingSchema for type-safe validation.
    export const ChatInputSchema = z.object({
      messages: z.array(MessageSchema).min(1),
      model: z
        .enum(['deepseek-chat', 'deepseek-reasoner'])
        .default('deepseek-chat'),
      temperature: z.number().min(0).max(2).optional(),
      max_tokens: z.number().min(1).max(65536).optional(),
      stream: z.boolean().optional().default(false),
      thinking: ThinkingSchema,
      json_mode: z.boolean().optional(),
    });
    
    // ─── Function Calling Schemas ───────────────────────────────────
    
    export const FunctionDefinitionSchema = z.object({
      name: z.string().min(1),
      description: z.string().optional(),
      parameters: z.record(z.string(), z.unknown()).optional(),
      strict: z.boolean().optional(),
    });
    
    export const ToolDefinitionSchema = z.object({
      type: z.literal('function'),
      function: FunctionDefinitionSchema,
    });
    
    export const ToolChoiceSchema = z.union([
      z.enum(['auto', 'none', 'required']),
      z.object({
        type: z.literal('function'),
        function: z.object({
          name: z.string().min(1),
        }),
      }),
    ]);
    
    export const ExtendedMessageSchema = z.object({
      role: z.enum(['system', 'user', 'assistant', 'tool']),
      content: ContentSchema,
      tool_call_id: z.string().optional(),
    });
    
    export const ChatInputWithToolsSchema = z.object({
      messages: z.array(ExtendedMessageSchema).min(1),
      model: z
        .enum(['deepseek-chat', 'deepseek-reasoner'])
        .default('deepseek-chat'),
      temperature: z.number().min(0).max(2).optional(),
      max_tokens: z.number().min(1).max(65536).optional(),
      stream: z.boolean().optional().default(false),
      tools: z.array(ToolDefinitionSchema).max(128).optional(),
      tool_choice: ToolChoiceSchema.optional(),
      thinking: ThinkingSchema,
      json_mode: z.boolean().optional(),
      session_id: z.string().optional(),
    });
  • Tool registration aggregator that calls registerChatTool to register the deepseek_chat tool with the MCP server. This is called from the main server initialization in src/index.ts.
    export function registerAllTools(server: McpServer, client: DeepSeekClient): void {
      registerChatTool(server, client);
      registerSessionsTool(server);
    }
  • The DeepSeekClient.createChatCompletion method, the underlying API client call used by the deepseek_chat handler. It implements the circuit breaker pattern, model fallback, and error handling for non-streaming chat completions against the DeepSeek API.
    async createChatCompletion(
      params: ChatCompletionParams
    ): Promise<ChatCompletionResponseWithFallback> {
      const config = getConfig();
    
      try {
        // Primary attempt through circuit breaker
        const result = await this.circuitBreaker.execute(async () => {
          const requestParams = this.buildRequestParams(params, false);
          const rawResponse = await this.client.chat.completions.create(requestParams);
          return this.parseResponse(rawResponse as unknown as DeepSeekRawResponse);
        });
        return result;
      } catch (error: unknown) {
        // Try fallback if enabled, error is retryable, and fallback candidates exist
        const fallbackCandidates = FALLBACK_ORDER[params.model];
        if (config.fallbackEnabled && isRetryableError(error) && fallbackCandidates?.length) {
          const fallbackModel = fallbackCandidates[0];
          const reason = getErrorMessage(error);
          console.error(
            `[DeepSeek MCP] Primary model ${params.model} failed (${reason}), falling back to ${fallbackModel}`
          );
    
          try {
            const fallbackParams = { ...params, model: fallbackModel as ChatCompletionParams['model'] };
            // Fallback bypasses circuit breaker (it's a different attempt)
            const requestParams = this.buildRequestParams(fallbackParams, false);
            const rawResponse = await this.client.chat.completions.create(requestParams);
            const result = this.parseResponse(rawResponse as unknown as DeepSeekRawResponse);
            return {
              ...result,
              fallback: {
                originalModel: params.model,
                fallbackModel,
                reason,
              },
            };
          } catch (fallbackError: unknown) {
            console.error(
              `[DeepSeek MCP] Fallback model ${fallbackModel} also failed:`,
              fallbackError
            );
            throw new FallbackExhaustedError(
              `All models failed. Primary (${params.model}): ${reason}. Fallback (${fallbackModel}): ${getErrorMessage(fallbackError)}`,
              [params.model, fallbackModel]
            );
          }
        }
    
        // Not retryable or fallback disabled
        console.error('DeepSeek API Error:', error);
        this.wrapError(error, 'DeepSeek API Error');
      }
    }
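
The circuit breaker used by `this.circuitBreaker.execute` above is not shown in the excerpt. A minimal sketch of the pattern, assuming a simple consecutive-failure threshold (the actual implementation may track state differently):

```typescript
type State = 'closed' | 'open';

// Illustrative circuit breaker: after `threshold` consecutive failures the
// circuit opens and rejects calls immediately until `resetMs` has elapsed,
// at which point one trial request is allowed through (half-open).
class CircuitBreaker {
  private failures = 0;
  private state: State = 'closed';
  private openedAt = 0;

  constructor(private threshold = 3, private resetMs = 30_000) {}

  async execute<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'open') {
      if (Date.now() - this.openedAt < this.resetMs) {
        throw new Error('Circuit open: request rejected');
      }
      this.state = 'closed'; // half-open: allow a trial request
    }
    try {
      const result = await fn();
      this.failures = 0; // success resets the failure count
      return result;
    } catch (err) {
      this.failures++;
      if (this.failures >= this.threshold) {
        this.state = 'open';
        this.openedAt = Date.now();
      }
      throw err;
    }
  }
}
```

Fast-failing while the circuit is open is what makes the fallback path above worthwhile: the fallback model bypasses the breaker, so a degraded primary model does not block every request.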
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden, and it does well by disclosing key behavioral traits: multi-turn session support, function calling capability, thinking mode, JSON output mode, multimodal input, cost tracking, and model fallback with circuit breaker resilience. It provides substantial context beyond basic functionality.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with a clear opening statement followed by a bullet-style list of features. Each sentence adds value, though it could be slightly more front-loaded by leading with the most critical information about model selection.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of this tool (10 parameters, nested objects) and the presence of an output schema, the description provides excellent contextual completeness. It covers the tool's purpose, key features, model differences, and behavioral characteristics without needing to explain return values (handled by output schema).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description doesn't add significant parameter semantics beyond what's in the schema, though it mentions some features (such as session_id, the tools parameter, and thinking mode) that correspond to parameters. A baseline score of 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as 'Chat with DeepSeek AI models' and distinguishes between two specific model types for different use cases (general conversations vs. complex reasoning). It goes beyond the name/title by specifying the core functionality and model variants.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context about when to use different model types (deepseek-chat for general conversations, deepseek-reasoner for complex reasoning) and mentions multi-turn sessions. However, it doesn't explicitly state when to use this tool versus the sibling 'deepseek_sessions' tool or provide exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/arikusi/deepseek-mcp-server'