create-endpoint
Create scalable GPU or CPU endpoints on RunPod by specifying template configurations, worker counts, and compute resources for deploying containerized applications.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| computeType | No | GPU or CPU endpoint | |
| dataCenterIds | No | List of data centers | |
| gpuCount | No | Number of GPUs per worker | |
| gpuTypeIds | No | List of acceptable GPU types | |
| name | No | Name for the endpoint | |
| templateId | Yes | Template ID to use | |
| workersMax | No | Maximum number of workers | |
| workersMin | No | Minimum number of workers | |
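
To make the table concrete, here is a minimal sketch of an arguments object plus a hand-written check that roughly mirrors the constraints listed above. The names `CreateEndpointArgs` and `validateCreateEndpointArgs` and every sample value (template ID, GPU type) are hypothetical placeholders; the tool itself validates its input with a Zod schema, so this is only an approximation of that behavior.

```typescript
// Hypothetical type mirroring the input table; only templateId is required.
type CreateEndpointArgs = {
  templateId: string;
  name?: string;
  computeType?: "GPU" | "CPU";
  gpuTypeIds?: string[];
  gpuCount?: number;
  workersMin?: number;
  workersMax?: number;
  dataCenterIds?: string[];
};

// Returns a list of problems; an empty list means the arguments look valid.
function validateCreateEndpointArgs(input: object): string[] {
  const args = input as Record<string, unknown>;
  const errors: string[] = [];
  const isStringArray = (v: unknown): boolean =>
    Array.isArray(v) && v.every((x) => typeof x === "string");

  if (typeof args.templateId !== "string") {
    errors.push("templateId is required and must be a string");
  }
  if (args.name !== undefined && typeof args.name !== "string") {
    errors.push("name must be a string");
  }
  if (
    args.computeType !== undefined &&
    args.computeType !== "GPU" &&
    args.computeType !== "CPU"
  ) {
    errors.push("computeType must be 'GPU' or 'CPU'");
  }
  for (const key of ["gpuTypeIds", "dataCenterIds"]) {
    if (args[key] !== undefined && !isStringArray(args[key])) {
      errors.push(`${key} must be an array of strings`);
    }
  }
  for (const key of ["gpuCount", "workersMin", "workersMax"]) {
    if (args[key] !== undefined && typeof args[key] !== "number") {
      errors.push(`${key} must be a number`);
    }
  }
  return errors;
}

// Placeholder values for illustration only; substitute real IDs from your account.
const exampleArgs: CreateEndpointArgs = {
  templateId: "example-template-id",
  name: "my-endpoint",
  computeType: "GPU",
  gpuTypeIds: ["EXAMPLE_GPU_TYPE"],
  gpuCount: 1,
  workersMin: 0,
  workersMax: 3,
};

console.log(validateCreateEndpointArgs(exampleArgs));
```

Only `templateId` has to be present; every other field can be omitted from the arguments object.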
Implementation Reference
- src/index.ts:420-431 (handler): The handler function for the 'create-endpoint' tool. It sends a POST request to the Runpod '/endpoints' API with the provided parameters and returns the JSON response formatted as text content.

      async (params) => {
        const result = await runpodRequest('/endpoints', 'POST', params);
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify(result, null, 2),
            },
          ],
        };
      }

- src/index.ts:401-419 (schema): Zod schema defining the input parameters for the 'create-endpoint' tool. Only templateId is required; name, computeType, gpuTypeIds, gpuCount, workersMin, workersMax, and dataCenterIds are optional.

      {
        name: z.string().optional().describe('Name for the endpoint'),
        templateId: z.string().describe('Template ID to use'),
        computeType: z
          .enum(['GPU', 'CPU'])
          .optional()
          .describe('GPU or CPU endpoint'),
        gpuTypeIds: z
          .array(z.string())
          .optional()
          .describe('List of acceptable GPU types'),
        gpuCount: z.number().optional().describe('Number of GPUs per worker'),
        workersMin: z.number().optional().describe('Minimum number of workers'),
        workersMax: z.number().optional().describe('Maximum number of workers'),
        dataCenterIds: z
          .array(z.string())
          .optional()
          .describe('List of data centers'),
      },

- src/index.ts:399-432 (registration): The server.tool() call that registers the 'create-endpoint' tool with its schema and handler function.

      server.tool(
        'create-endpoint',
        {
          name: z.string().optional().describe('Name for the endpoint'),
          templateId: z.string().describe('Template ID to use'),
          computeType: z
            .enum(['GPU', 'CPU'])
            .optional()
            .describe('GPU or CPU endpoint'),
          gpuTypeIds: z
            .array(z.string())
            .optional()
            .describe('List of acceptable GPU types'),
          gpuCount: z.number().optional().describe('Number of GPUs per worker'),
          workersMin: z.number().optional().describe('Minimum number of workers'),
          workersMax: z.number().optional().describe('Maximum number of workers'),
          dataCenterIds: z
            .array(z.string())
            .optional()
            .describe('List of data centers'),
        },
        async (params) => {
          const result = await runpodRequest('/endpoints', 'POST', params);
          return {
            content: [
              {
                type: 'text',
                text: JSON.stringify(result, null, 2),
              },
            ],
          };
        }
      );
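
The handler delegates to `runpodRequest`, whose implementation is not shown in this reference. The sketch below is one plausible shape for such a helper, not the project's actual code. The names `prepareRequest`, `sendRequest`, `toTextContent`, and `Transport`, the bearer-token header, and the placeholder base URL are all assumptions made for illustration. The transport is passed in as a parameter so the request-building logic can be exercised without a network call.

```typescript
// Hypothetical shapes; the real runpodRequest helper is not shown in the source.
interface RequestInitLike {
  method: string;
  headers: Record<string, string>;
  body?: string;
}

interface ResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

// Any fetch-compatible function can serve as the transport (assumption).
type Transport = (url: string, init: RequestInitLike) => Promise<ResponseLike>;

interface PreparedRequest {
  url: string;
  init: RequestInitLike;
}

// Builds the URL and options for a JSON API call without sending it.
// The Authorization scheme is an assumption, not confirmed by the source.
function prepareRequest(
  baseUrl: string,
  apiKey: string,
  path: string,
  method: string = "GET",
  body?: unknown
): PreparedRequest {
  const init: RequestInitLike = {
    method,
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
  };
  // Only attach a body for methods that carry one.
  if (body !== undefined && method !== "GET") {
    init.body = JSON.stringify(body);
  }
  return { url: `${baseUrl}${path}`, init };
}

// Sends a prepared request through the given transport and parses the JSON reply.
async function sendRequest(req: PreparedRequest, transport: Transport): Promise<unknown> {
  const response = await transport(req.url, req.init);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// Same wrapping the handler applies: pretty-printed JSON as a single text item.
function toTextContent(result: unknown) {
  return {
    content: [{ type: "text" as const, text: JSON.stringify(result, null, 2) }],
  };
}

// Placeholder base URL and key, for illustration only.
const prepared = prepareRequest(
  "https://api.example.invalid/v1",
  "EXAMPLE_KEY",
  "/endpoints",
  "POST",
  { templateId: "example-template-id", workersMax: 3 }
);
console.log(prepared.url);
console.log(toTextContent({ id: "example-endpoint-id" }).content[0].text);
```

In a runtime that provides a global `fetch`, that function could be passed as the transport. Keeping the transport injectable is a design choice that lets the request construction be unit-tested in isolation.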