
create-endpoint

Deploy scalable GPU or CPU endpoints on RunPod by specifying templates, compute resources, and worker configurations for AI inference workloads.

Input Schema

| Name          | Required | Description                  | Default |
|---------------|----------|------------------------------|---------|
| name          | No       | Name for the endpoint        |         |
| templateId    | Yes      | Template ID to use           |         |
| computeType   | No       | GPU or CPU endpoint          |         |
| gpuTypeIds    | No       | List of acceptable GPU types |         |
| gpuCount      | No       | Number of GPUs per worker    |         |
| workersMin    | No       | Minimum number of workers    |         |
| workersMax    | No       | Maximum number of workers    |         |
| dataCenterIds | No       | List of data centers         |         |
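
For illustration, a hypothetical arguments object matching this schema might look like the following. The template ID, GPU type IDs, and data center IDs are placeholders; valid values depend on your RunPod account.

    // Hypothetical create-endpoint arguments; all IDs are placeholders.
    const exampleArgs = {
      name: 'llama-inference',
      templateId: 'your-template-id',
      computeType: 'GPU',
      gpuTypeIds: ['NVIDIA GeForce RTX 4090'],
      gpuCount: 1,
      workersMin: 0,
      workersMax: 3,
      dataCenterIds: ['EU-RO-1'],
    };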

Implementation Reference

  • Handler function that sends a POST request to the RunPod /endpoints API route with the input parameters and formats the response as MCP text content.
    async (params) => {
      const result = await runpodRequest('/endpoints', 'POST', params);
      return {
        content: [
          {
            type: 'text',
            text: JSON.stringify(result, null, 2),
          },
        ],
      };
    }
  • Zod input schema defining parameters for creating a RunPod endpoint.
    {
      name: z.string().optional().describe('Name for the endpoint'),
      templateId: z.string().describe('Template ID to use'),
      computeType: z
        .enum(['GPU', 'CPU'])
        .optional()
        .describe('GPU or CPU endpoint'),
      gpuTypeIds: z
        .array(z.string())
        .optional()
        .describe('List of acceptable GPU types'),
      gpuCount: z.number().optional().describe('Number of GPUs per worker'),
      workersMin: z.number().optional().describe('Minimum number of workers'),
      workersMax: z.number().optional().describe('Maximum number of workers'),
      dataCenterIds: z
        .array(z.string())
        .optional()
        .describe('List of data centers'),
    },
  • src/index.ts:399-432 (registration)
    Registration of the create-endpoint tool on the MCP server using server.tool(); a client-side invocation sketch follows this list.
    server.tool(
      'create-endpoint',
      {
        name: z.string().optional().describe('Name for the endpoint'),
        templateId: z.string().describe('Template ID to use'),
        computeType: z
          .enum(['GPU', 'CPU'])
          .optional()
          .describe('GPU or CPU endpoint'),
        gpuTypeIds: z
          .array(z.string())
          .optional()
          .describe('List of acceptable GPU types'),
        gpuCount: z.number().optional().describe('Number of GPUs per worker'),
        workersMin: z.number().optional().describe('Minimum number of workers'),
        workersMax: z.number().optional().describe('Maximum number of workers'),
        dataCenterIds: z
          .array(z.string())
          .optional()
          .describe('List of data centers'),
      },
      async (params) => {
        const result = await runpodRequest('/endpoints', 'POST', params);
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify(result, null, 2),
            },
          ],
        };
      }
    );
  • Shared helper function for making authenticated HTTP requests to the RunPod API, used by all tools including create-endpoint; a direct-usage sketch follows this list.
    async function runpodRequest(
      endpoint: string,
      method: string = 'GET',
      body?: Record<string, unknown>
    ) {
      const url = `${API_BASE_URL}${endpoint}`;
      const headers = {
        Authorization: `Bearer ${API_KEY}`,
        'Content-Type': 'application/json',
      };

      const options: NodeFetchRequestInit = {
        method,
        headers,
      };

      if (body && (method === 'POST' || method === 'PATCH')) {
        options.body = JSON.stringify(body);
      }

      try {
        const response = await fetch(url, options);

        if (!response.ok) {
          const errorText = await response.text();
          throw new Error(`RunPod API Error: ${response.status} - ${errorText}`);
        }

        // Some endpoints might not return JSON
        const contentType = response.headers.get('content-type');
        if (contentType && contentType.includes('application/json')) {
          return await response.json();
        }

        return { success: true, status: response.status };
      } catch (error) {
        console.error('Error calling RunPod API:', error);
        throw error;
      }
    }
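
As a rough illustration of how a client might invoke this tool, here is a minimal sketch using the official MCP TypeScript SDK. The build path, the RUNPOD_API_KEY variable name, and the argument values are assumptions and may differ for your checkout of runpod-mcp-ts.

    import { Client } from '@modelcontextprotocol/sdk/client/index.js';
    import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

    // Spawn the server over stdio. The command/path and the
    // RUNPOD_API_KEY variable name are assumptions.
    const transport = new StdioClientTransport({
      command: 'node',
      args: ['build/index.js'],
      env: { RUNPOD_API_KEY: process.env.RUNPOD_API_KEY ?? '' },
    });

    const client = new Client({ name: 'example-client', version: '1.0.0' });
    await client.connect(transport);

    // Call the tool; templateId is a placeholder.
    const result = await client.callTool({
      name: 'create-endpoint',
      arguments: {
        templateId: 'your-template-id',
        computeType: 'GPU',
        gpuCount: 1,
        workersMax: 2,
      },
    });
    console.log(result.content);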
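
The helper can also be called directly inside the server. The minimal sketch below assumes the RunPod REST API supports GET /endpoints for listing (only POST is shown above) and that API_BASE_URL and API_KEY are already configured.

    // Hypothetical direct usage of runpodRequest. GET /endpoints as a
    // listing route is an assumption; templateId is a placeholder.
    const created = await runpodRequest('/endpoints', 'POST', {
      templateId: 'your-template-id',
      workersMax: 1,
    });

    const listed = await runpodRequest('/endpoints'); // method defaults to GET
    console.log(created, listed);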


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/runpod/runpod-mcp-ts'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.