deepseek_chat
Chat with AI models for general conversations or complex reasoning tasks. Supports multi-turn sessions, function calling, JSON output, and multimodal input.
Instructions
Chat with DeepSeek AI models. Supports deepseek-chat for general conversations and deepseek-reasoner for complex reasoning tasks with chain-of-thought explanations. Features: multi-turn sessions (session_id), function calling (tools parameter), thinking mode, JSON output mode, multimodal input (when enabled), automatic cost tracking, and model fallback with circuit breaker resilience.
Input Schema
TableJSON Schema
| Name | Required | Description | Default |
|---|---|---|---|
| messages | Yes | Array of conversation messages. Each message has role (system/user/assistant/tool) and content (string or array of content parts for multimodal). Tool messages require tool_call_id. | |
| model | No | Model to use. Both run DeepSeek V3.2 (128K context). deepseek-chat: non-thinking mode (max 8K output), deepseek-reasoner: thinking mode (max 64K output) | deepseek-chat |
| temperature | No | Sampling temperature (0-2). Higher = more random. Default: 1.0. Ignored when thinking mode is enabled. | |
| max_tokens | No | Maximum tokens to generate. deepseek-chat: max 8192, deepseek-reasoner: max 65536 | |
| stream | No | Enable streaming mode. Returns full response after streaming completes. | |
| tools | No | Array of tool definitions for function calling. Each tool has type "function" and a function object with name, description, and parameters (JSON Schema). | |
| tool_choice | No | Controls which tool the model calls. "auto" (default), "none", "required", or {type:"function",function:{name:"..."}} | |
| thinking | No | Enable thinking mode. When enabled, temperature/top_p/frequency_penalty/presence_penalty are automatically ignored. Use {type: "enabled"} to activate. | |
| json_mode | No | Enable JSON output mode. The model will output valid JSON. Include the word "json" in your prompt for best results. Supported by both models. | |
| session_id | No | Session ID for multi-turn conversations. When provided, previous messages from this session are prepended to the current messages. If the session does not exist, it is created automatically. Omit for stateless single-turn requests. |
Implementation Reference
- src/tools/deepseek-chat.ts:43-323 (handler)Main handler function and registration for deepseek_chat tool. Contains the async handler (lines 144-321) that processes chat requests with DeepSeek models, supporting multi-turn sessions, function calling, thinking mode, streaming, and cost tracking. The registerChatTool function (lines 43-323) registers the tool with the MCP server including input/output schemas.
export function registerChatTool(server: McpServer, client: DeepSeekClient): void { // Use config's defaultModel if it's a valid model name const cfg = getConfig(); const modelDefault: 'deepseek-chat' | 'deepseek-reasoner' = cfg.defaultModel === 'deepseek-reasoner' ? 'deepseek-reasoner' : 'deepseek-chat'; server.registerTool( 'deepseek_chat', { title: 'DeepSeek Chat Completion', description: 'Chat with DeepSeek AI models. Supports deepseek-chat for general conversations and ' + 'deepseek-reasoner for complex reasoning tasks with chain-of-thought explanations. ' + 'Features: multi-turn sessions (session_id), function calling (tools parameter), thinking mode, ' + 'JSON output mode, multimodal input (when enabled), automatic cost tracking, and model fallback with circuit breaker resilience.', inputSchema: { messages: z .array(ExtendedMessageSchema) .min(1) .describe( 'Array of conversation messages. Each message has role (system/user/assistant/tool) and content (string or array of content parts for multimodal). Tool messages require tool_call_id.' ), model: z .enum(['deepseek-chat', 'deepseek-reasoner']) .default(modelDefault) .describe( 'Model to use. Both run DeepSeek V3.2 (128K context). deepseek-chat: non-thinking mode (max 8K output), deepseek-reasoner: thinking mode (max 64K output)' ), temperature: z .number() .min(0) .max(2) .optional() .describe('Sampling temperature (0-2). Higher = more random. Default: 1.0. Ignored when thinking mode is enabled.'), max_tokens: z .number() .min(1) .max(65536) .optional() .describe('Maximum tokens to generate. deepseek-chat: max 8192, deepseek-reasoner: max 65536'), stream: z .boolean() .optional() .default(false) .describe( 'Enable streaming mode. Returns full response after streaming completes.' ), tools: z .array(ToolDefinitionSchema) .max(128) .optional() .describe( 'Array of tool definitions for function calling. Each tool has type "function" and a function object with name, description, and parameters (JSON Schema).' ), tool_choice: ToolChoiceSchema.optional().describe( 'Controls which tool the model calls. "auto" (default), "none", "required", or {type:"function",function:{name:"..."}}' ), thinking: ThinkingSchema.describe( 'Enable thinking mode. When enabled, temperature/top_p/frequency_penalty/presence_penalty are automatically ignored. Use {type: "enabled"} to activate.' ), json_mode: z .boolean() .optional() .describe( 'Enable JSON output mode. The model will output valid JSON. Include the word "json" in your prompt for best results. Supported by both models.' ), session_id: z .string() .optional() .describe( 'Session ID for multi-turn conversations. When provided, previous messages from this session are prepended to the current messages. ' + 'If the session does not exist, it is created automatically. Omit for stateless single-turn requests.' ), }, outputSchema: { content: z.string(), reasoning_content: z.string().optional(), model: z.string(), usage: z.object({ prompt_tokens: z.number(), completion_tokens: z.number(), total_tokens: z.number(), prompt_cache_hit_tokens: z.number().optional(), prompt_cache_miss_tokens: z.number().optional(), }), finish_reason: z.string(), tool_calls: z .array( z.object({ id: z.string(), type: z.literal('function'), function: z.object({ name: z.string(), arguments: z.string(), }), }) ) .optional(), session_id: z.string().optional(), }, }, async (input: DeepSeekChatInputWithSession) => { try { // Validate message content length validateMessageLength(input); // Validate input with extended schema (supports tools + session_id) const validated = ChatInputWithToolsSchema.parse(input); // Multimodal guard: reject array content when multimodal is disabled if (!getConfig().enableMultimodal) { for (const msg of validated.messages) { if (Array.isArray(msg.content)) { throw new Error( 'Multimodal content (image/array) is not enabled. Set ENABLE_MULTIMODAL=true to use multimodal input.' ); } } } // JSON mode guard: warn if "json" word is not in any message content if (validated.json_mode) { const hasJsonWord = validated.messages.some((m) => getTextContent(m.content).toLowerCase().includes('json') ); if (!hasJsonWord) { console.error( '[DeepSeek MCP] Warning: json_mode enabled but no "json" word found in messages. Results may be unreliable.' ); } } // Model-aware max_tokens warnings if (validated.max_tokens) { if (validated.model === 'deepseek-chat' && validated.max_tokens > 8192) { console.error( `[DeepSeek MCP] Warning: deepseek-chat max output is 8192 tokens, requested ${validated.max_tokens}. API may truncate.` ); } if (validated.model === 'deepseek-reasoner' && validated.max_tokens > 65536) { console.error( `[DeepSeek MCP] Warning: deepseek-reasoner max output is 65536 tokens, requested ${validated.max_tokens}. API may truncate.` ); } } // Session management: build full message list let allMessages: ChatMessage[] = validated.messages; const sessionStore = SessionStore.getInstance(); if (validated.session_id) { // Create or get session const session = sessionStore.create(validated.session_id); // Prepend previous session messages to current messages const previousMessages = sessionStore.getMessages(validated.session_id); allMessages = [...previousMessages, ...validated.messages]; console.error( `[DeepSeek MCP] Session: id=${validated.session_id}, previous_messages=${previousMessages.length}, total_messages=${allMessages.length}` ); } console.error( `[DeepSeek MCP] Request: model=${validated.model}, messages=${allMessages.length}, stream=${validated.stream}${validated.tools ? `, tools=${validated.tools.length}` : ''}${validated.thinking ? `, thinking=${validated.thinking.type}` : ''}${validated.json_mode ? ', json_mode=true' : ''}${validated.session_id ? `, session=${validated.session_id}` : ''}` ); // Build params for client const clientParams = { model: validated.model, messages: allMessages, temperature: validated.temperature, max_tokens: validated.max_tokens, tools: validated.tools, tool_choice: validated.tool_choice, thinking: validated.thinking, response_format: validated.json_mode ? ({ type: 'json_object' } as const) : undefined, }; // Call appropriate method based on stream parameter const response = validated.stream ? await client.createStreamingChatCompletion(clientParams) : await client.createChatCompletion(clientParams); console.error( `[DeepSeek MCP] Response: tokens=${response.usage.total_tokens}, finish_reason=${response.finish_reason}${response.tool_calls ? `, tool_calls=${response.tool_calls.length}` : ''}` ); // Calculate cost (model-aware pricing) const costBreakdown = calculateCost(response.usage, response.model); // Update session with new messages and response if (validated.session_id) { const session = sessionStore.get(validated.session_id); if (session) { // Add the new user messages to session sessionStore.addMessages(validated.session_id, validated.messages); // Add assistant response to session (include tool_calls for multi-turn function calling) sessionStore.addMessages(validated.session_id, [ { role: 'assistant', content: response.content, ...(response.tool_calls ? { tool_calls: response.tool_calls } : {}), }, ]); session.totalCost += costBreakdown.totalCost; session.requestCount++; } } // Track usage globally UsageTracker.getInstance().trackRequest(response.usage, costBreakdown.totalCost); // Format response let responseText = ''; // Add reasoning content if available (for deepseek-reasoner) if (response.reasoning_content) { responseText += `<thinking>\n${response.reasoning_content}\n</thinking>\n\n`; } responseText += response.content; // Format tool calls if present if (response.tool_calls?.length) { responseText += '\n\n**Function Calls:**\n'; for (const tc of response.tool_calls) { responseText += `\`${tc.function.name}\`\n`; responseText += `- Call ID: ${tc.id}\n`; responseText += `- Arguments: ${tc.function.arguments}\n\n`; } } // Add usage stats with cost information (controlled by config) if (getConfig().showCostInfo) { responseText += `\n---\n**Request Information:**\n`; responseText += `- **Tokens:** ${response.usage.total_tokens} (${response.usage.prompt_tokens} prompt + ${response.usage.completion_tokens} completion)\n`; responseText += `- **Model:** ${response.model}\n`; responseText += `- **Cost:** ${formatCost(costBreakdown)}`; if (response.tool_calls?.length) { responseText += `\n- **Tool Calls:** ${response.tool_calls.length}`; } if (validated.session_id) { const session = sessionStore.get(validated.session_id); if (session) { responseText += `\n- **Session:** ${validated.session_id} (${session.messages.length} messages, ${session.requestCount} requests, $${session.totalCost.toFixed(4)} total)`; } } } return { content: [ { type: 'text' as const, text: responseText, }, ], structuredContent: { ...response, cost_usd: parseFloat(costBreakdown.totalCost.toFixed(6)), ...(validated.session_id ? { session_id: validated.session_id } : {}), } as unknown as Record<string, unknown>, }; } catch (error: unknown) { console.error('[DeepSeek MCP] Error:', error); const errorMessage = getErrorMessage(error); return { content: [ { type: 'text' as const, text: `Error: ${errorMessage}`, }, ], isError: true, }; } } ); } - src/tools/deepseek-chat.ts:58-142 (schema)Input and output schema definitions for deepseek_chat tool. Input schema includes messages, model (deepseek-chat/deepseek-reasoner), temperature, max_tokens, stream, tools, tool_choice, thinking, json_mode, and session_id. Output schema includes content, reasoning_content, model, usage metrics, finish_reason, tool_calls, and session_id.
inputSchema: { messages: z .array(ExtendedMessageSchema) .min(1) .describe( 'Array of conversation messages. Each message has role (system/user/assistant/tool) and content (string or array of content parts for multimodal). Tool messages require tool_call_id.' ), model: z .enum(['deepseek-chat', 'deepseek-reasoner']) .default(modelDefault) .describe( 'Model to use. Both run DeepSeek V3.2 (128K context). deepseek-chat: non-thinking mode (max 8K output), deepseek-reasoner: thinking mode (max 64K output)' ), temperature: z .number() .min(0) .max(2) .optional() .describe('Sampling temperature (0-2). Higher = more random. Default: 1.0. Ignored when thinking mode is enabled.'), max_tokens: z .number() .min(1) .max(65536) .optional() .describe('Maximum tokens to generate. deepseek-chat: max 8192, deepseek-reasoner: max 65536'), stream: z .boolean() .optional() .default(false) .describe( 'Enable streaming mode. Returns full response after streaming completes.' ), tools: z .array(ToolDefinitionSchema) .max(128) .optional() .describe( 'Array of tool definitions for function calling. Each tool has type "function" and a function object with name, description, and parameters (JSON Schema).' ), tool_choice: ToolChoiceSchema.optional().describe( 'Controls which tool the model calls. "auto" (default), "none", "required", or {type:"function",function:{name:"..."}}' ), thinking: ThinkingSchema.describe( 'Enable thinking mode. When enabled, temperature/top_p/frequency_penalty/presence_penalty are automatically ignored. Use {type: "enabled"} to activate.' ), json_mode: z .boolean() .optional() .describe( 'Enable JSON output mode. The model will output valid JSON. Include the word "json" in your prompt for best results. Supported by both models.' ), session_id: z .string() .optional() .describe( 'Session ID for multi-turn conversations. When provided, previous messages from this session are prepended to the current messages. ' + 'If the session does not exist, it is created automatically. Omit for stateless single-turn requests.' ), }, outputSchema: { content: z.string(), reasoning_content: z.string().optional(), model: z.string(), usage: z.object({ prompt_tokens: z.number(), completion_tokens: z.number(), total_tokens: z.number(), prompt_cache_hit_tokens: z.number().optional(), prompt_cache_miss_tokens: z.number().optional(), }), finish_reason: z.string(), tool_calls: z .array( z.object({ id: z.string(), type: z.literal('function'), function: z.object({ name: z.string(), arguments: z.string(), }), }) ) .optional(), session_id: z.string().optional(), }, - src/schemas.ts:44-99 (schema)Reusable Zod schema definitions used by deepseek_chat tool. Includes ChatInputSchema (basic chat parameters), ChatInputWithToolsSchema (extended with tools and session_id), ToolDefinitionSchema, ToolChoiceSchema, and ThinkingSchema for type-safe validation.
export const ChatInputSchema = z.object({ messages: z.array(MessageSchema).min(1), model: z .enum(['deepseek-chat', 'deepseek-reasoner']) .default('deepseek-chat'), temperature: z.number().min(0).max(2).optional(), max_tokens: z.number().min(1).max(65536).optional(), stream: z.boolean().optional().default(false), thinking: ThinkingSchema, json_mode: z.boolean().optional(), }); // ─── Function Calling Schemas ─────────────────────────────────── export const FunctionDefinitionSchema = z.object({ name: z.string().min(1), description: z.string().optional(), parameters: z.record(z.string(), z.unknown()).optional(), strict: z.boolean().optional(), }); export const ToolDefinitionSchema = z.object({ type: z.literal('function'), function: FunctionDefinitionSchema, }); export const ToolChoiceSchema = z.union([ z.enum(['auto', 'none', 'required']), z.object({ type: z.literal('function'), function: z.object({ name: z.string().min(1), }), }), ]); export const ExtendedMessageSchema = z.object({ role: z.enum(['system', 'user', 'assistant', 'tool']), content: ContentSchema, tool_call_id: z.string().optional(), }); export const ChatInputWithToolsSchema = z.object({ messages: z.array(ExtendedMessageSchema).min(1), model: z .enum(['deepseek-chat', 'deepseek-reasoner']) .default('deepseek-chat'), temperature: z.number().min(0).max(2).optional(), max_tokens: z.number().min(1).max(65536).optional(), stream: z.boolean().optional().default(false), tools: z.array(ToolDefinitionSchema).max(128).optional(), tool_choice: ToolChoiceSchema.optional(), thinking: ThinkingSchema, json_mode: z.boolean().optional(), session_id: z.string().optional(), }); - src/tools/index.ts:11-14 (registration)Tool registration aggregator that calls registerChatTool to register the deepseek_chat tool with the MCP server. This is called from the main server initialization in src/index.ts.
export function registerAllTools(server: McpServer, client: DeepSeekClient): void { registerChatTool(server, client); registerSessionsTool(server); } - src/deepseek-client.ts:211-264 (helper)DeepSeekClient.createChatCompletion method - underlying API client used by the deepseek_chat handler. Implements circuit breaker pattern, model fallback, and error handling for non-streaming chat completions with DeepSeek API.
async createChatCompletion( params: ChatCompletionParams ): Promise<ChatCompletionResponseWithFallback> { const config = getConfig(); try { // Primary attempt through circuit breaker const result = await this.circuitBreaker.execute(async () => { const requestParams = this.buildRequestParams(params, false); const rawResponse = await this.client.chat.completions.create(requestParams); return this.parseResponse(rawResponse as unknown as DeepSeekRawResponse); }); return result; } catch (error: unknown) { // Try fallback if enabled, error is retryable, and fallback candidates exist const fallbackCandidates = FALLBACK_ORDER[params.model]; if (config.fallbackEnabled && isRetryableError(error) && fallbackCandidates?.length) { const fallbackModel = fallbackCandidates[0]; const reason = getErrorMessage(error); console.error( `[DeepSeek MCP] Primary model ${params.model} failed (${reason}), falling back to ${fallbackModel}` ); try { const fallbackParams = { ...params, model: fallbackModel as ChatCompletionParams['model'] }; // Fallback bypasses circuit breaker (it's a different attempt) const requestParams = this.buildRequestParams(fallbackParams, false); const rawResponse = await this.client.chat.completions.create(requestParams); const result = this.parseResponse(rawResponse as unknown as DeepSeekRawResponse); return { ...result, fallback: { originalModel: params.model, fallbackModel, reason, }, }; } catch (fallbackError: unknown) { console.error( `[DeepSeek MCP] Fallback model ${fallbackModel} also failed:`, fallbackError ); throw new FallbackExhaustedError( `All models failed. Primary (${params.model}): ${reason}. Fallback (${fallbackModel}): ${getErrorMessage(fallbackError)}`, [params.model, fallbackModel] ); } } // Not retryable or fallback disabled console.error('DeepSeek API Error:', error); this.wrapError(error, 'DeepSeek API Error'); } }