
deepseek_chat

Chat with AI models for general conversations or complex reasoning tasks. Supports multi-turn sessions, function calling, JSON output, and multimodal input.

Instructions

Chat with DeepSeek AI models. Supports deepseek-chat for general conversations and deepseek-reasoner for complex reasoning tasks with chain-of-thought explanations. Features: multi-turn sessions (session_id), function calling (tools parameter), thinking mode, JSON output mode, multimodal input (when enabled), automatic cost tracking, and model fallback with circuit breaker resilience.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| messages | Yes | Array of conversation messages. Each message has role (system/user/assistant/tool) and content (string or array of content parts for multimodal). Tool messages require tool_call_id. | — |
| model | No | Model to use. Both run DeepSeek V3.2 (128K context). deepseek-chat: non-thinking mode (max 8K output); deepseek-reasoner: thinking mode (max 64K output). | deepseek-chat |
| temperature | No | Sampling temperature (0-2). Higher = more random. Ignored when thinking mode is enabled. | 1.0 |
| max_tokens | No | Maximum tokens to generate. deepseek-chat: max 8192; deepseek-reasoner: max 65536. | — |
| stream | No | Enable streaming mode. Returns the full response after streaming completes. | false |
| tools | No | Array of tool definitions for function calling. Each tool has type "function" and a function object with name, description, and parameters (JSON Schema). | — |
| tool_choice | No | Controls which tool the model calls: "auto", "none", "required", or {type:"function",function:{name:"..."}}. | auto |
| thinking | No | Enable thinking mode. When enabled, temperature/top_p/frequency_penalty/presence_penalty are automatically ignored. Use {type: "enabled"} to activate. | — |
| json_mode | No | Enable JSON output mode. The model will output valid JSON. Include the word "json" in your prompt for best results. Supported by both models. | — |
| session_id | No | Session ID for multi-turn conversations. When provided, previous messages from this session are prepended to the current messages. If the session does not exist, it is created automatically. Omit for stateless single-turn requests. | — |
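As an illustration, a minimal second-turn request using sessions and JSON output mode might look like the following. The session ID and message text are invented for the example; note the word "json" in the prompt, as the schema recommends.

```typescript
// Hypothetical arguments for a deepseek_chat call: a follow-up turn in an
// existing session, asking for JSON output.
const chatArgs = {
  model: 'deepseek-chat',
  session_id: 'demo-session-1', // prior turns from this session are prepended automatically
  json_mode: true,
  messages: [
    {
      role: 'user',
      content: 'Summarize your previous answer as json with keys "topic" and "summary".',
    },
  ],
};

console.log(chatArgs.messages.length); // 1
```

Omitting `session_id` would make this a stateless single-turn request instead.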

Implementation Reference

  • Main handler function and registration for deepseek_chat tool. Contains the async handler (lines 144-321) that processes chat requests with DeepSeek models, supporting multi-turn sessions, function calling, thinking mode, streaming, and cost tracking. The registerChatTool function (lines 43-323) registers the tool with the MCP server including input/output schemas.
    export function registerChatTool(server: McpServer, client: DeepSeekClient): void {
      // Use config's defaultModel if it's a valid model name
      const cfg = getConfig();
      const modelDefault: 'deepseek-chat' | 'deepseek-reasoner' =
        cfg.defaultModel === 'deepseek-reasoner' ? 'deepseek-reasoner' : 'deepseek-chat';
    
      server.registerTool(
        'deepseek_chat',
        {
          title: 'DeepSeek Chat Completion',
          description:
            'Chat with DeepSeek AI models. Supports deepseek-chat for general conversations and ' +
            'deepseek-reasoner for complex reasoning tasks with chain-of-thought explanations. ' +
            'Features: multi-turn sessions (session_id), function calling (tools parameter), thinking mode, ' +
            'JSON output mode, multimodal input (when enabled), automatic cost tracking, and model fallback with circuit breaker resilience.',
          inputSchema: {
            messages: z
              .array(ExtendedMessageSchema)
              .min(1)
              .describe(
                'Array of conversation messages. Each message has role (system/user/assistant/tool) and content (string or array of content parts for multimodal). Tool messages require tool_call_id.'
              ),
            model: z
              .enum(['deepseek-chat', 'deepseek-reasoner'])
              .default(modelDefault)
              .describe(
                'Model to use. Both run DeepSeek V3.2 (128K context). deepseek-chat: non-thinking mode (max 8K output), deepseek-reasoner: thinking mode (max 64K output)'
              ),
            temperature: z
              .number()
              .min(0)
              .max(2)
              .optional()
              .describe('Sampling temperature (0-2). Higher = more random. Default: 1.0. Ignored when thinking mode is enabled.'),
            max_tokens: z
              .number()
              .min(1)
              .max(65536)
              .optional()
              .describe('Maximum tokens to generate. deepseek-chat: max 8192, deepseek-reasoner: max 65536'),
            stream: z
              .boolean()
              .optional()
              .default(false)
              .describe(
                'Enable streaming mode. Returns full response after streaming completes.'
              ),
            tools: z
              .array(ToolDefinitionSchema)
              .max(128)
              .optional()
              .describe(
                'Array of tool definitions for function calling. Each tool has type "function" and a function object with name, description, and parameters (JSON Schema).'
              ),
            tool_choice: ToolChoiceSchema.optional().describe(
              'Controls which tool the model calls. "auto" (default), "none", "required", or {type:"function",function:{name:"..."}}'
            ),
            thinking: ThinkingSchema.describe(
              'Enable thinking mode. When enabled, temperature/top_p/frequency_penalty/presence_penalty are automatically ignored. Use {type: "enabled"} to activate.'
            ),
            json_mode: z
              .boolean()
              .optional()
              .describe(
                'Enable JSON output mode. The model will output valid JSON. Include the word "json" in your prompt for best results. Supported by both models.'
              ),
            session_id: z
              .string()
              .optional()
              .describe(
                'Session ID for multi-turn conversations. When provided, previous messages from this session are prepended to the current messages. ' +
                'If the session does not exist, it is created automatically. Omit for stateless single-turn requests.'
              ),
          },
          outputSchema: {
            content: z.string(),
            reasoning_content: z.string().optional(),
            model: z.string(),
            usage: z.object({
              prompt_tokens: z.number(),
              completion_tokens: z.number(),
              total_tokens: z.number(),
              prompt_cache_hit_tokens: z.number().optional(),
              prompt_cache_miss_tokens: z.number().optional(),
            }),
            finish_reason: z.string(),
            tool_calls: z
              .array(
                z.object({
                  id: z.string(),
                  type: z.literal('function'),
                  function: z.object({
                    name: z.string(),
                    arguments: z.string(),
                  }),
                })
              )
              .optional(),
            session_id: z.string().optional(),
          },
        },
        async (input: DeepSeekChatInputWithSession) => {
          try {
            // Validate message content length
            validateMessageLength(input);
    
            // Validate input with extended schema (supports tools + session_id)
            const validated = ChatInputWithToolsSchema.parse(input);
    
            // Multimodal guard: reject array content when multimodal is disabled
            if (!getConfig().enableMultimodal) {
              for (const msg of validated.messages) {
                if (Array.isArray(msg.content)) {
                  throw new Error(
                    'Multimodal content (image/array) is not enabled. Set ENABLE_MULTIMODAL=true to use multimodal input.'
                  );
                }
              }
            }
    
            // JSON mode guard: warn if "json" word is not in any message content
            if (validated.json_mode) {
              const hasJsonWord = validated.messages.some((m) =>
                getTextContent(m.content).toLowerCase().includes('json')
              );
              if (!hasJsonWord) {
                console.error(
                  '[DeepSeek MCP] Warning: json_mode enabled but no "json" word found in messages. Results may be unreliable.'
                );
              }
            }
    
            // Model-aware max_tokens warnings
            if (validated.max_tokens) {
              if (validated.model === 'deepseek-chat' && validated.max_tokens > 8192) {
                console.error(
                  `[DeepSeek MCP] Warning: deepseek-chat max output is 8192 tokens, requested ${validated.max_tokens}. API may truncate.`
                );
              }
              if (validated.model === 'deepseek-reasoner' && validated.max_tokens > 65536) {
                console.error(
                  `[DeepSeek MCP] Warning: deepseek-reasoner max output is 65536 tokens, requested ${validated.max_tokens}. API may truncate.`
                );
              }
            }
    
            // Session management: build full message list
            let allMessages: ChatMessage[] = validated.messages;
            const sessionStore = SessionStore.getInstance();
    
            if (validated.session_id) {
          // Ensure the session exists (created automatically if missing)
          sessionStore.create(validated.session_id);
              // Prepend previous session messages to current messages
              const previousMessages = sessionStore.getMessages(validated.session_id);
              allMessages = [...previousMessages, ...validated.messages];
    
              console.error(
                `[DeepSeek MCP] Session: id=${validated.session_id}, previous_messages=${previousMessages.length}, total_messages=${allMessages.length}`
              );
            }
    
            console.error(
              `[DeepSeek MCP] Request: model=${validated.model}, messages=${allMessages.length}, stream=${validated.stream}${validated.tools ? `, tools=${validated.tools.length}` : ''}${validated.thinking ? `, thinking=${validated.thinking.type}` : ''}${validated.json_mode ? ', json_mode=true' : ''}${validated.session_id ? `, session=${validated.session_id}` : ''}`
            );
    
            // Build params for client
            const clientParams = {
              model: validated.model,
              messages: allMessages,
              temperature: validated.temperature,
              max_tokens: validated.max_tokens,
              tools: validated.tools,
              tool_choice: validated.tool_choice,
              thinking: validated.thinking,
              response_format: validated.json_mode
                ? ({ type: 'json_object' } as const)
                : undefined,
            };
    
            // Call appropriate method based on stream parameter
            const response = validated.stream
              ? await client.createStreamingChatCompletion(clientParams)
              : await client.createChatCompletion(clientParams);
    
            console.error(
              `[DeepSeek MCP] Response: tokens=${response.usage.total_tokens}, finish_reason=${response.finish_reason}${response.tool_calls ? `, tool_calls=${response.tool_calls.length}` : ''}`
            );
    
            // Calculate cost (model-aware pricing)
            const costBreakdown = calculateCost(response.usage, response.model);
    
            // Update session with new messages and response
            if (validated.session_id) {
              const session = sessionStore.get(validated.session_id);
              if (session) {
                // Add the new user messages to session
                sessionStore.addMessages(validated.session_id, validated.messages);
                // Add assistant response to session (include tool_calls for multi-turn function calling)
                sessionStore.addMessages(validated.session_id, [
                  {
                    role: 'assistant',
                    content: response.content,
                    ...(response.tool_calls ? { tool_calls: response.tool_calls } : {}),
                  },
                ]);
                session.totalCost += costBreakdown.totalCost;
                session.requestCount++;
              }
            }
    
            // Track usage globally
            UsageTracker.getInstance().trackRequest(response.usage, costBreakdown.totalCost);
    
            // Format response
            let responseText = '';
    
            // Add reasoning content if available (for deepseek-reasoner)
            if (response.reasoning_content) {
              responseText += `<thinking>\n${response.reasoning_content}\n</thinking>\n\n`;
            }
    
            responseText += response.content;
    
            // Format tool calls if present
            if (response.tool_calls?.length) {
              responseText += '\n\n**Function Calls:**\n';
              for (const tc of response.tool_calls) {
                responseText += `\`${tc.function.name}\`\n`;
                responseText += `- Call ID: ${tc.id}\n`;
                responseText += `- Arguments: ${tc.function.arguments}\n\n`;
              }
            }
    
            // Add usage stats with cost information (controlled by config)
            if (getConfig().showCostInfo) {
              responseText += `\n---\n**Request Information:**\n`;
              responseText += `- **Tokens:** ${response.usage.total_tokens} (${response.usage.prompt_tokens} prompt + ${response.usage.completion_tokens} completion)\n`;
              responseText += `- **Model:** ${response.model}\n`;
              responseText += `- **Cost:** ${formatCost(costBreakdown)}`;
              if (response.tool_calls?.length) {
                responseText += `\n- **Tool Calls:** ${response.tool_calls.length}`;
              }
              if (validated.session_id) {
                const session = sessionStore.get(validated.session_id);
                if (session) {
                  responseText += `\n- **Session:** ${validated.session_id} (${session.messages.length} messages, ${session.requestCount} requests, $${session.totalCost.toFixed(4)} total)`;
                }
              }
            }
    
            return {
              content: [
                {
                  type: 'text' as const,
                  text: responseText,
                },
              ],
              structuredContent: {
                ...response,
                cost_usd: parseFloat(costBreakdown.totalCost.toFixed(6)),
                ...(validated.session_id ? { session_id: validated.session_id } : {}),
              } as unknown as Record<string, unknown>,
            };
          } catch (error: unknown) {
            console.error('[DeepSeek MCP] Error:', error);
            const errorMessage = getErrorMessage(error);
    
            return {
              content: [
                {
                  type: 'text' as const,
                  text: `Error: ${errorMessage}`,
                },
              ],
              isError: true,
            };
          }
        }
      );
    }
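The input schema notes that tool messages require `tool_call_id`. The round trip can be sketched as follows; the tool name, call ID, and result values are invented for illustration, not taken from the project:

```typescript
// Sketch of the function-calling round trip, assuming the model's previous
// response contained this tool call (id and function name are hypothetical):
const toolCall = {
  id: 'call_abc123',
  type: 'function' as const,
  function: { name: 'get_weather', arguments: '{"city":"Berlin"}' },
};

// The assistant turn that requested the call is echoed back as-is.
const assistantTurn = { role: 'assistant' as const, content: '', tool_calls: [toolCall] };

// The caller executes the function itself, then feeds the result back as a
// role:"tool" message whose tool_call_id matches the call's id.
const toolResult = {
  role: 'tool' as const,
  tool_call_id: toolCall.id,
  content: JSON.stringify({ temperature_c: 18, condition: 'cloudy' }),
};

const followUpMessages = [assistantTurn, toolResult];
console.log(followUpMessages.length); // 2
```

Passing the same `session_id` on each turn lets the server prepend the earlier messages automatically, so only the new tool result needs to be sent.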
  • Input and output schema definitions for deepseek_chat tool. Input schema includes messages, model (deepseek-chat/deepseek-reasoner), temperature, max_tokens, stream, tools, tool_choice, thinking, json_mode, and session_id. Output schema includes content, reasoning_content, model, usage metrics, finish_reason, tool_calls, and session_id.
    inputSchema: {
      messages: z
        .array(ExtendedMessageSchema)
        .min(1)
        .describe(
          'Array of conversation messages. Each message has role (system/user/assistant/tool) and content (string or array of content parts for multimodal). Tool messages require tool_call_id.'
        ),
      model: z
        .enum(['deepseek-chat', 'deepseek-reasoner'])
        .default(modelDefault)
        .describe(
          'Model to use. Both run DeepSeek V3.2 (128K context). deepseek-chat: non-thinking mode (max 8K output), deepseek-reasoner: thinking mode (max 64K output)'
        ),
      temperature: z
        .number()
        .min(0)
        .max(2)
        .optional()
        .describe('Sampling temperature (0-2). Higher = more random. Default: 1.0. Ignored when thinking mode is enabled.'),
      max_tokens: z
        .number()
        .min(1)
        .max(65536)
        .optional()
        .describe('Maximum tokens to generate. deepseek-chat: max 8192, deepseek-reasoner: max 65536'),
      stream: z
        .boolean()
        .optional()
        .default(false)
        .describe(
          'Enable streaming mode. Returns full response after streaming completes.'
        ),
      tools: z
        .array(ToolDefinitionSchema)
        .max(128)
        .optional()
        .describe(
          'Array of tool definitions for function calling. Each tool has type "function" and a function object with name, description, and parameters (JSON Schema).'
        ),
      tool_choice: ToolChoiceSchema.optional().describe(
        'Controls which tool the model calls. "auto" (default), "none", "required", or {type:"function",function:{name:"..."}}'
      ),
      thinking: ThinkingSchema.describe(
        'Enable thinking mode. When enabled, temperature/top_p/frequency_penalty/presence_penalty are automatically ignored. Use {type: "enabled"} to activate.'
      ),
      json_mode: z
        .boolean()
        .optional()
        .describe(
          'Enable JSON output mode. The model will output valid JSON. Include the word "json" in your prompt for best results. Supported by both models.'
        ),
      session_id: z
        .string()
        .optional()
        .describe(
          'Session ID for multi-turn conversations. When provided, previous messages from this session are prepended to the current messages. ' +
          'If the session does not exist, it is created automatically. Omit for stateless single-turn requests.'
        ),
    },
    outputSchema: {
      content: z.string(),
      reasoning_content: z.string().optional(),
      model: z.string(),
      usage: z.object({
        prompt_tokens: z.number(),
        completion_tokens: z.number(),
        total_tokens: z.number(),
        prompt_cache_hit_tokens: z.number().optional(),
        prompt_cache_miss_tokens: z.number().optional(),
      }),
      finish_reason: z.string(),
      tool_calls: z
        .array(
          z.object({
            id: z.string(),
            type: z.literal('function'),
            function: z.object({
              name: z.string(),
              arguments: z.string(),
            }),
          })
        )
        .optional(),
      session_id: z.string().optional(),
    },
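For reference, a response conforming to this output schema might look like the following; all values are invented for the example:

```typescript
// Hypothetical structuredContent matching the output schema above
// (optional fields like reasoning_content and tool_calls are omitted).
const exampleOutput = {
  content: 'The capital of France is Paris.',
  model: 'deepseek-chat',
  usage: { prompt_tokens: 12, completion_tokens: 9, total_tokens: 21 },
  finish_reason: 'stop',
  session_id: 'demo-session-1',
};

// total_tokens should equal prompt tokens plus completion tokens.
const { prompt_tokens, completion_tokens, total_tokens } = exampleOutput.usage;
console.assert(total_tokens === prompt_tokens + completion_tokens);
```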
  • Reusable Zod schema definitions used by deepseek_chat tool. Includes ChatInputSchema (basic chat parameters), ChatInputWithToolsSchema (extended with tools and session_id), ToolDefinitionSchema, ToolChoiceSchema, and ThinkingSchema for type-safe validation.
    export const ChatInputSchema = z.object({
      messages: z.array(MessageSchema).min(1),
      model: z
        .enum(['deepseek-chat', 'deepseek-reasoner'])
        .default('deepseek-chat'),
      temperature: z.number().min(0).max(2).optional(),
      max_tokens: z.number().min(1).max(65536).optional(),
      stream: z.boolean().optional().default(false),
      thinking: ThinkingSchema,
      json_mode: z.boolean().optional(),
    });
    
    // ─── Function Calling Schemas ───────────────────────────────────
    
    export const FunctionDefinitionSchema = z.object({
      name: z.string().min(1),
      description: z.string().optional(),
      parameters: z.record(z.string(), z.unknown()).optional(),
      strict: z.boolean().optional(),
    });
    
    export const ToolDefinitionSchema = z.object({
      type: z.literal('function'),
      function: FunctionDefinitionSchema,
    });
    
    export const ToolChoiceSchema = z.union([
      z.enum(['auto', 'none', 'required']),
      z.object({
        type: z.literal('function'),
        function: z.object({
          name: z.string().min(1),
        }),
      }),
    ]);
    
    export const ExtendedMessageSchema = z.object({
      role: z.enum(['system', 'user', 'assistant', 'tool']),
      content: ContentSchema,
      tool_call_id: z.string().optional(),
    });
    
    export const ChatInputWithToolsSchema = z.object({
      messages: z.array(ExtendedMessageSchema).min(1),
      model: z
        .enum(['deepseek-chat', 'deepseek-reasoner'])
        .default('deepseek-chat'),
      temperature: z.number().min(0).max(2).optional(),
      max_tokens: z.number().min(1).max(65536).optional(),
      stream: z.boolean().optional().default(false),
      tools: z.array(ToolDefinitionSchema).max(128).optional(),
      tool_choice: ToolChoiceSchema.optional(),
      thinking: ThinkingSchema,
      json_mode: z.boolean().optional(),
      session_id: z.string().optional(),
    });
  • Tool registration aggregator that calls registerChatTool to register the deepseek_chat tool with the MCP server. This is called from the main server initialization in src/index.ts.
    export function registerAllTools(server: McpServer, client: DeepSeekClient): void {
      registerChatTool(server, client);
      registerSessionsTool(server);
    }
  • DeepSeekClient.createChatCompletion method - underlying API client used by the deepseek_chat handler. Implements circuit breaker pattern, model fallback, and error handling for non-streaming chat completions with DeepSeek API.
    async createChatCompletion(
      params: ChatCompletionParams
    ): Promise<ChatCompletionResponseWithFallback> {
      const config = getConfig();
    
      try {
        // Primary attempt through circuit breaker
        const result = await this.circuitBreaker.execute(async () => {
          const requestParams = this.buildRequestParams(params, false);
          const rawResponse = await this.client.chat.completions.create(requestParams);
          return this.parseResponse(rawResponse as unknown as DeepSeekRawResponse);
        });
        return result;
      } catch (error: unknown) {
        // Try fallback if enabled, error is retryable, and fallback candidates exist
        const fallbackCandidates = FALLBACK_ORDER[params.model];
        if (config.fallbackEnabled && isRetryableError(error) && fallbackCandidates?.length) {
          const fallbackModel = fallbackCandidates[0];
          const reason = getErrorMessage(error);
          console.error(
            `[DeepSeek MCP] Primary model ${params.model} failed (${reason}), falling back to ${fallbackModel}`
          );
    
          try {
            const fallbackParams = { ...params, model: fallbackModel as ChatCompletionParams['model'] };
            // Fallback bypasses circuit breaker (it's a different attempt)
            const requestParams = this.buildRequestParams(fallbackParams, false);
            const rawResponse = await this.client.chat.completions.create(requestParams);
            const result = this.parseResponse(rawResponse as unknown as DeepSeekRawResponse);
            return {
              ...result,
              fallback: {
                originalModel: params.model,
                fallbackModel,
                reason,
              },
            };
          } catch (fallbackError: unknown) {
            console.error(
              `[DeepSeek MCP] Fallback model ${fallbackModel} also failed:`,
              fallbackError
            );
            throw new FallbackExhaustedError(
              `All models failed. Primary (${params.model}): ${reason}. Fallback (${fallbackModel}): ${getErrorMessage(fallbackError)}`,
              [params.model, fallbackModel]
            );
          }
        }
    
        // Not retryable or fallback disabled
        console.error('DeepSeek API Error:', error);
        this.wrapError(error, 'DeepSeek API Error');
      }
    }
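The circuit breaker used by `this.circuitBreaker.execute` is not shown in this excerpt. A minimal sketch of the general pattern is below; the threshold, cooldown, and class name are assumptions for illustration, not the project's actual implementation:

```typescript
// Minimal circuit breaker sketch: after `threshold` consecutive failures the
// breaker "opens" and rejects calls immediately until `cooldownMs` elapses,
// sparing the failing API from further traffic. A success resets the count.
class SimpleCircuitBreaker {
  private failures = 0;
  private openedAt = 0;

  constructor(private threshold = 3, private cooldownMs = 30_000) {}

  async execute<T>(fn: () => Promise<T>): Promise<T> {
    if (
      this.failures >= this.threshold &&
      Date.now() - this.openedAt < this.cooldownMs
    ) {
      throw new Error('Circuit open: rejecting call without hitting the API');
    }
    try {
      const result = await fn();
      this.failures = 0; // success closes the circuit
      return result;
    } catch (err) {
      this.failures++;
      if (this.failures >= this.threshold) this.openedAt = Date.now();
      throw err;
    }
  }
}
```

Note that in the excerpt above, the fallback attempt deliberately bypasses the breaker: the breaker guards the primary model, while the fallback is a one-off second chance against a different model.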