generate_test_data
Generate realistic relational test data for database tables from schemas or plain English descriptions, ensuring foreign key integrity and locale-aware values across multiple output formats.
Instructions
Generate realistic test data for database tables.
Send either a structured schema (tables with fields) or a plain English description. Supports relational data with foreign keys, locale-aware names and addresses, 22 locales, 157 field types, and multiple output formats (JSON, CSV, SQL).
The killer feature: define multiple tables with "ref" fields, and all foreign key relationships are correct — orders reference real user IDs, reviews link to real products. One call seeds your entire database.
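As a rough illustration of the "ref" idea, the sketch below builds a two-table `tables` argument and then checks referential integrity on rows shaped like generated output. The `name`, `count`, `fields`, `type`, and `params` keys come from the tool's input schema. The `params` keys `table` and `field` on the `ref` field are hypothetical names, and `findOrphans` is an illustrative helper rather than part of the tool, so check both against the field-type documentation.

```typescript
// Field and table shapes follow the tool's published input schema.
interface FieldDef {
  name: string;
  type: string;
  params?: Record<string, unknown>;
  nullable?: boolean;
}

interface TableDef {
  name: string;
  count: number;
  fields: FieldDef[];
}

// Two related tables. The `params` keys on the ref field are ASSUMED names.
const tables: TableDef[] = [
  {
    name: "users",
    count: 50,
    fields: [
      { name: "id", type: "uuid" },
      { name: "email", type: "email" },
    ],
  },
  {
    name: "orders",
    count: 200,
    fields: [
      { name: "id", type: "uuid" },
      { name: "user_id", type: "ref", params: { table: "users", field: "id" } },
      { name: "total", type: "price" },
    ],
  },
];

type Row = Record<string, unknown>;

// Returns the child rows whose foreign key has no matching parent row.
function findOrphans(
  parents: Row[],
  parentKey: string,
  children: Row[],
  childKey: string
): Row[] {
  const known = new Set(parents.map((row) => row[parentKey]));
  return children.filter((row) => !known.has(row[childKey]));
}

// Hand-written sample rows standing in for generated output.
const sampleUsers: Row[] = [{ id: "u1" }, { id: "u2" }];
const sampleOrders: Row[] = [
  { id: "o1", user_id: "u1" },
  { id: "o2", user_id: "u2" },
  { id: "o3", user_id: "u9" }, // deliberately broken reference
];
const orphans = findOrphans(sampleUsers, "id", sampleOrders, "user_id");
```

Running a check like `findOrphans` over real output should return an empty array if every reference resolves, which is the behavior described above.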
Auto-locale: add a "country" field as an enum of country codes (DE, FR, US, etc.), and names, emails, and phone numbers automatically match each row's nationality.
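One way to picture the auto-locale behavior is a per-row lookup from country code to locale, with the default locale as a fallback. The sketch below is only a mental model: the mapping table, the `resolveLocale` helper, and the `values` key inside `params` are all assumptions, not the tool's documented internals.

```typescript
// HYPOTHETICAL mapping from country code to locale; the tool's real table
// is not shown in this document.
const COUNTRY_TO_LOCALE: Record<string, string> = {
  DE: "de",
  FR: "fr",
  US: "en",
};

// Picks a locale for one row, falling back to the default when the
// country is missing or unknown.
function resolveLocale(
  row: { country?: string },
  defaultLocale: string = "en"
): string {
  const code = row.country?.toUpperCase() ?? "";
  return COUNTRY_TO_LOCALE[code] ?? defaultLocale;
}

// A field definition for the country column; `values` is an ASSUMED params key.
const countryField = {
  name: "country",
  type: "enum",
  params: { values: ["DE", "FR", "US"] },
};
```

Under this model, a row with `country: "DE"` would get German-style names and phone numbers, while a row with an unmapped code would use whatever `locale` the call specified.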
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| tables | No | Structured schema — array of table definitions | |
| prompt | No | Plain English description (e.g. "50 users with German names and 200 orders linked to them") | |
| format | No | Output format | json |
| sql_dialect | No | SQL dialect (only when format=sql) | |
| locale | No | Default locale (en, de, fr, es, ja, etc.). Auto-detected from country field if present. | |
| seed | No | Seed for reproducible output. Same seed + same schema = identical data. | |
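The parameters in the table can be written as a typed argument object. The sketch below uses the format and dialect values that appear in the tool's handler (`json`, `csv`, `sql` and `postgres`, `mysql`, `sqlite`). The numeric type for `seed` and the `checkArgs` helper are assumptions added for illustration; the helper is a client-side sanity check, not something the tool is known to run.

```typescript
type OutputFormat = "json" | "csv" | "sql";
type SqlDialect = "postgres" | "mysql" | "sqlite";

interface GenerateTestDataArgs {
  tables?: unknown[];
  prompt?: string;
  format?: OutputFormat;
  sql_dialect?: SqlDialect;
  locale?: string;
  seed?: number; // ASSUMED to be numeric; the table does not state the type
}

// Returns human-readable warnings for argument combinations that look wrong.
function checkArgs(args: GenerateTestDataArgs): string[] {
  const warnings: string[] = [];
  if (!args.tables && !args.prompt) {
    warnings.push("provide either `tables` or `prompt`");
  }
  if (args.sql_dialect && args.format !== "sql") {
    warnings.push('`sql_dialect` only applies when format is "sql"');
  }
  return warnings;
}

// Example call arguments: a plain English prompt, SQL output, fixed seed.
const args: GenerateTestDataArgs = {
  prompt: "50 users with German names and 200 orders linked to them",
  format: "sql",
  sql_dialect: "postgres",
  seed: 42,
};
const warnings = checkArgs(args);
```

Reusing the same `seed` with the same schema should reproduce the same rows, which is useful for snapshot tests and shared fixtures.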
Implementation Reference
- packages/mcp/src/tools.ts:205-245 (handler) The main handler function for the `generate_test_data` MCP tool, which parses input arguments, performs the data generation, and optionally formats the output.

  ```typescript
  async function handleGenerateTestData(
    args: Record<string, unknown>
  ): Promise<ToolResult> {
    // Parse and validate the schema
    const parsed = parseSchema(args);
    if (!parsed.success) {
      return err(
        `Schema validation failed:\n${parsed.errors
          .map((e) => ` - ${e.field}: ${e.message}`)
          .join("\n")}`
      );
    }

    // Generate data
    const result = await generate(parsed.data);
    if (!result.success) {
      if ("errors" in result) {
        return err(
          `Generation failed:\n${result.errors
            .map((e) => ` - ${e.field}: ${e.message}`)
            .join("\n")}`
        );
      }
      return err(`Generation failed: circular dependency between tables: ${result.cycle.join(" -> ")}`);
    }

    // Optionally format output
    const format = args.format as string | undefined;
    if (format && format !== "json") {
      const sqlDialect = args.sql_dialect as string | undefined;
      const formatted = formatOutput(
        result.result,
        parsed.data.tables,
        format as "csv" | "sql",
        sqlDialect as "postgres" | "mysql" | "sqlite" | undefined
      );
      return ok(formatted.body);
    }

    return ok({ data: result.result.data, meta: result.result.meta });
  }
  ```

- packages/mcp/src/tools.ts:33-65 (schema) The definition and schema for the `generate_test_data` tool, outlining input requirements like tables, field types, and supported formats.

  ```typescript
  export const TOOL_DEFINITIONS: ToolDefinition[] = [
    {
      name: "generate_test_data",
      description:
        "Generate realistic test data for database tables. Supports 135+ field types, 20 locales, relational data with foreign keys, and multiple output formats (JSON, CSV, SQL).",
      inputSchema: {
        type: "object",
        properties: {
          tables: {
            type: "array",
            description:
              "Array of table definitions. Each table has: name (string), count (number), fields (array of {name, type, params?, nullable?}). Field types include: first_name, last_name, email, uuid, integer, boolean, datetime, price, enum, ref, and 125+ more.",
            items: {
              type: "object",
              properties: {
                name: { type: "string", description: "Table name" },
                count: {
                  type: "number",
                  description: "Number of rows to generate",
                },
                fields: {
                  type: "array",
                  description: "Array of field definitions",
                  items: {
                    type: "object",
                    properties: {
                      name: { type: "string", description: "Column name" },
                      type: {
                        type: "string",
                        description:
                          "Field type (e.g. first_name, email, uuid, integer, enum, ref)",
                      },
                      params: {
  ```

- packages/mcp/src/tools.ts:361-365 (registration) Registration logic within a handler switch statement that routes incoming tool requests for `generate_test_data` to `handleGenerateTestData`.

  ```typescript
  switch (name) {
    case "generate_test_data":
      return handleGenerateTestData(args);
    case "detect_schema":
      return handleDetectSchema(args);
  ```
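The handler relies on `ok` and `err` helpers and a `ToolResult` type that are not shown in these excerpts. The sketch below is one plausible shape for them, modeled on MCP text-content results; the interface, both helper bodies, the `formatErrors` function, and the sample error are assumptions made for illustration, and the real definitions in `tools.ts` may differ.

```typescript
// ASSUMED result shape, modeled on MCP text content; the real ToolResult may differ.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

interface FieldError {
  field: string;
  message: string;
}

// Wraps a successful payload; strings pass through, objects are serialized.
function ok(payload: unknown): ToolResult {
  const text =
    typeof payload === "string" ? payload : JSON.stringify(payload, null, 2);
  return { content: [{ type: "text", text }] };
}

// Wraps a failure message and flags the result as an error.
function err(message: string): ToolResult {
  return { content: [{ type: "text", text: message }], isError: true };
}

// Builds the same "prefix, then one line per error" message the handler uses.
function formatErrors(prefix: string, errors: FieldError[]): string {
  return `${prefix}:\n${errors
    .map((e) => ` - ${e.field}: ${e.message}`)
    .join("\n")}`;
}

// Made-up sample error, only to show the resulting message layout.
const failure = err(
  formatErrors("Schema validation failed", [
    { field: "tables[0].count", message: "must be a positive number" },
  ])
);
```

Accepting both strings and objects in `ok` matches how the handler calls it: once with a formatted CSV or SQL body and once with a `{ data, meta }` object.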