Grok MCP Server

create_chat_completion

Generate chat completions using the Grok API by specifying models, messages, and parameters like temperature, tools, and response format to create tailored conversational outputs.

Instructions

Create a chat completion with the Grok API

Input Schema

Name	Required	Description
`frequency_penalty`	No	Penalty for new tokens based on frequency in text (-2 to 2)
`logit_bias`	No	Map of token IDs to bias scores (-100 to 100) that influence generation
`max_tokens`	No	Maximum number of tokens to generate
`messages`	Yes	Messages to generate chat completions for
`model`	Yes	ID of the model to use
`n`	No	Number of chat completion choices to generate
`presence_penalty`	No	Penalty for new tokens based on presence in text (-2 to 2)
`response_format`	No	Specify 'json_object' to receive JSON response or 'text' for raw text
`search_parameters`	No	Parameters for live search capabilities
`seed`	No	If specified, results will be more deterministic when the same seed is used
`stop`	No	Sequences where the API will stop generating further tokens
`stream`	No	If set, partial message deltas will be sent
`temperature`	No	Sampling temperature (0-2)
`tool_choice`	No	Controls which (if any) tool is called by the model
`tools`	No	List of tools the model may call
`top_p`	No	Nucleus sampling parameter (0-1)
`user`	No	A unique user identifier
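
As an illustration of the table above, a minimal arguments object might look like the sketch below. Only `model` and `messages` are required. The model ID `grok-example` is a placeholder, and the `{ role, content }` message shape is an assumption, since `MessageSchema` is not shown on this page.

```typescript
// Hypothetical message shape; the real MessageSchema is not shown on this page.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Hand-written subset of the documented parameters, for illustration only.
interface ChatCompletionArgs {
  model: string; // required
  messages: ChatMessage[]; // required
  temperature?: number; // 0-2
  top_p?: number; // 0-1
  max_tokens?: number; // positive integer
  response_format?: { type: "text" | "json_object" };
  stop?: string | string[];
}

const exampleArgs: ChatCompletionArgs = {
  model: "grok-example", // placeholder model ID, not a real model name
  messages: [
    { role: "system", content: "You are a concise assistant." },
    { role: "user", content: "Summarize MCP in one sentence." },
  ],
  temperature: 0.2,
  max_tokens: 128,
  response_format: { type: "text" },
};

// Print the payload as it would be passed to the tool.
console.log(JSON.stringify(exampleArgs, null, 2));
```

Every optional field can be omitted; the sketch sets a few only to show their documented ranges.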

Implementation Reference

src/operations/chat.ts:233-242 (handler)

The core handler function that executes the tool logic by sending a POST request to the Grok API's chat/completions endpoint using the grokRequest helper and parsing the response with Zod.

export async function createChatCompletion(
  options: z.infer<typeof ChatCompletionRequestSchema>
): Promise<z.infer<typeof ChatCompletionSchema>> {
  const response = await grokRequest("chat/completions", {
    method: "POST",
    body: options,
  });

  return ChatCompletionSchema.parse(response);
}
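
The `grokRequest` helper is not shown on this page. The sketch below is one way such a helper could be written, assuming it prefixes the path with the xAI base URL `https://api.x.ai/v1`, sends JSON, and authenticates with a bearer token. The names `buildRequest` and `sketchGrokRequest` are hypothetical and are not part of the server.

```typescript
// Hypothetical stand-in for grokRequest; the real helper is not shown here.
const BASE_URL = "https://api.x.ai/v1"; // assumed base URL

interface RequestOptions {
  method?: string;
  body?: unknown;
}

// Build the URL and fetch options separately so the request shape can be
// inspected without making a network call.
function buildRequest(path: string, options: RequestOptions, apiKey: string) {
  return {
    url: `${BASE_URL}/${path}`,
    init: {
      method: options.method ?? "GET",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      // Serialize the body only when one was supplied.
      body: options.body === undefined ? undefined : JSON.stringify(options.body),
    },
  };
}

// Send the request and return the parsed JSON, throwing on non-2xx statuses.
async function sketchGrokRequest(
  path: string,
  options: RequestOptions,
  apiKey: string
): Promise<unknown> {
  const { url, init } = buildRequest(path, options, apiKey);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`Grok API request failed with status ${res.status}`);
  }
  return res.json();
}
```

Returning `unknown` leaves validation to the caller, which matches the handler's pattern of parsing the response with a schema before returning it.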

src/operations/chat.ts:139-230 (schema)

The Zod input schema (ChatCompletionRequestSchema) defining the parameters for the create_chat_completion tool, used in both the handler type and tool registration.

export const ChatCompletionRequestSchema = z.object({
  model: z.string().describe("ID of the model to use"),
  messages: z
    .array(MessageSchema)
    .describe("Messages to generate chat completions for"),
  tools: z
    .array(ToolSchema)
    .optional()
    .describe("List of tools the model may call"),
  tool_choice: z
    .union([
      z.literal("auto"),
      z.literal("none"),
      z.object({
        type: z.literal("function"),
        function: z.object({
          name: z
            .string()
            .describe("Force the model to call the specified function"),
        }),
      }),
    ])
    .optional()
    .describe("Controls which (if any) tool is called by the model"),
  temperature: z
    .number()
    .min(0)
    .max(2)
    .optional()
    .describe("Sampling temperature (0-2)"),
  top_p: z
    .number()
    .min(0)
    .max(1)
    .optional()
    .describe("Nucleus sampling parameter (0-1)"),
  n: z
    .number()
    .int()
    .positive()
    .optional()
    .describe("Number of chat completion choices to generate"),
  stream: z
    .boolean()
    .optional()
    .describe("If set, partial message deltas will be sent"),
  max_tokens: z
    .number()
    .int()
    .positive()
    .optional()
    .describe("Maximum number of tokens to generate"),
  presence_penalty: z
    .number()
    .min(-2)
    .max(2)
    .optional()
    .describe("Penalty for new tokens based on presence in text (-2 to 2)"),
  frequency_penalty: z
    .number()
    .min(-2)
    .max(2)
    .optional()
    .describe("Penalty for new tokens based on frequency in text (-2 to 2)"),
  logit_bias: z
    .record(z.string(), z.number())
    .optional()
    .describe(
      "Map of token IDs to bias scores (-100 to 100) that influence generation"
    ),
  response_format: z
    .object({ type: z.enum(["text", "json_object"]) })
    .optional()
    .describe(
      "Specify 'json_object' to receive JSON response or 'text' for raw text"
    ),
  seed: z
    .number()
    .int()
    .optional()
    .describe(
      "If specified, results will be more deterministic when the same seed is used"
    ),
  stop: z
    .union([z.string(), z.array(z.string())])
    .optional()
    .describe("Sequences where the API will stop generating further tokens"),
  user: z.string().optional().describe("A unique user identifier"),
  search_parameters: SearchParametersSchema.optional().describe(
    "Parameters for live search capabilities"
  ),
});
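
To make the numeric constraints in the schema concrete, the sketch below mirrors a few of them in plain TypeScript. It is a dependency-free illustration, not the server's validation path (which uses Zod), and the name `checkNumericParams` is hypothetical.

```typescript
// Hand-written mirror of some numeric bounds from ChatCompletionRequestSchema.
type NumericParams = {
  temperature?: number;
  top_p?: number;
  presence_penalty?: number;
  frequency_penalty?: number;
  n?: number;
  max_tokens?: number;
};

// Inclusive [min, max] ranges, matching the .min()/.max() calls in the schema.
const RANGES = {
  temperature: [0, 2],
  top_p: [0, 1],
  presence_penalty: [-2, 2],
  frequency_penalty: [-2, 2],
} as const;

// Fields declared as .int().positive() in the schema.
const POSITIVE_INTS = ["n", "max_tokens"] as const;

// Return a list of human-readable problems; an empty list means all checks passed.
function checkNumericParams(params: NumericParams): string[] {
  const problems: string[] = [];
  for (const [name, [min, max]] of Object.entries(RANGES)) {
    const value = params[name as keyof NumericParams];
    if (value !== undefined && (Number.isNaN(value) || value < min || value > max)) {
      problems.push(`${name} must be between ${min} and ${max}`);
    }
  }
  for (const name of POSITIVE_INTS) {
    const value = params[name];
    if (value !== undefined && (!Number.isInteger(value) || value <= 0)) {
      problems.push(`${name} must be a positive integer`);
    }
  }
  return problems;
}

console.log(checkNumericParams({ temperature: 2.5, n: 0 }));
```

Boundary values such as `temperature: 2` or `top_p: 1` pass, since the schema's bounds are inclusive.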

index.ts:97-122 (registration)

The FastMCP tool registration for 'create_chat_completion', linking the input schema and handler execute function.

server.addTool({
  name: "create_chat_completion",
  description: "Create a chat completion with the Grok API",
  parameters: chat.ChatCompletionRequestSchema,
  execute: async (args) => {
    try {
      console.error(
        `[DEBUG] Creating chat completion with model: ${args.model}`
      );
      const completion = await chat.createChatCompletion(args);
      console.error(`[DEBUG] Chat completion created successfully`);
      return JSON.stringify(completion, null, 2);
    } catch (err) {
      console.error(`[ERROR] Failed to create chat completion:`, err);
      if (err instanceof GrokResourceNotFoundError) {
        throw new Error(
          `Model '${args.model}' not found. Please verify:\n` +
            `1. The model exists\n` +
            `2. You have correct access permissions\n` +
            `3. The model name is spelled correctly`
        );
      }
      handleError(err);
    }
  },
});
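
The error-handling pattern in the registration (translate a known error type into an actionable message, pass everything else along) can be sketched in isolation as follows. `NotFoundErrorSketch` is a hypothetical stand-in for `GrokResourceNotFoundError`, whose definition is not shown here, and `toUserFacingError` is an illustrative name rather than a function from the server.

```typescript
// Hypothetical stand-in for GrokResourceNotFoundError.
class NotFoundErrorSketch extends Error {
  constructor(message: string) {
    super(message);
    this.name = "NotFoundErrorSketch";
    // Keep instanceof working even when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Map a caught value to the error that should be surfaced to the caller.
function toUserFacingError(err: unknown, model: string): Error {
  if (err instanceof NotFoundErrorSketch) {
    return new Error(
      `Model '${model}' not found. Check the model name and your access permissions.`
    );
  }
  // Unknown failures are passed through, wrapped only if they are not Errors.
  return err instanceof Error ? err : new Error(String(err));
}

const mapped = toUserFacingError(new NotFoundErrorSketch("404"), "grok-example");
console.log(mapped.message);
```

Keeping the mapping in a pure function makes it easy to test without a running server or a live API call.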

Tool Definition Quality

Grade: C (2.7/5.0)

Behavior: 1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full responsibility for behavioral disclosure but offers none. It doesn't mention that this is a write operation (creates something), potential costs/rate limits, authentication requirements, response format, or any side effects. 'Create' implies mutation but this isn't explicitly stated or explained.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that states exactly what the tool does without unnecessary words. It's appropriately sized and front-loaded with the essential information. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with 17 parameters, no annotations, and no output schema, the description is severely inadequate. It doesn't explain what a 'chat completion' actually is, what the Grok API provides, what the response looks like, or any behavioral characteristics. The agent would need to rely heavily on the schema alone.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, all 17 parameters are documented in the schema itself. The description adds no parameter-specific information beyond what's in the schema. According to scoring rules, when schema_description_coverage is high (>80%), the baseline is 3 even with no param info in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Create') and resource ('chat completion') with the specific API ('Grok API'), making the purpose immediately understandable. However, it doesn't differentiate this tool from its sibling 'create_completion' - both appear to create completions, so the distinction between 'chat' vs regular completions isn't explained.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'create_completion' or 'list_models'. There's no mention of prerequisites, appropriate contexts, or exclusion criteria. The agent must infer usage from the tool name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/BrewMyTech/grok-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.