generate-dataset

Create structured datasets with realistic mock data for testing databases, APIs, and development scenarios. Supports multiple entity types with referential integrity and relationships.

Instructions

Generate a structured dataset with multiple related entities and referential integrity. Supports person, company, and custom entity types with one-to-many and many-to-many relationships. Perfect for creating test databases, mock APIs, and complex data scenarios.

Input Schema

Name    Required  Description  Default
schema  Yes
seed    No
locale  No                     en
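
For illustration, a request for this tool might look like the sketch below. The top-level keys (`schema`, `seed`, `locale`) and the per-entity keys (`count`, `type`, `fields`, `relationships`) follow the Zod schemas in the Implementation Reference. The enum string values (`'person'`, `'company'`, `'one-to-many'`), the field names, and the direction of the relationship are assumptions, because the `EntityType` and `RelationshipType` enums are not shown on this page.

```typescript
// Shapes mirror the Zod schemas in the Implementation Reference.
// The string literals used for `type` are assumed; the real enums are not shown.
interface RelationshipDefinition {
  references: string; // name of another entity in `entities`
  type: string; // assumed value, e.g. 'one-to-many'
  nullable?: boolean;
}

interface EntityDefinition {
  count: number; // integer between 1 and 10000
  type: string; // assumed value, e.g. 'person' or 'company'
  fields?: string[];
  relationships?: Record<string, RelationshipDefinition>;
}

interface GenerateDatasetArgs {
  schema: { entities: Record<string, EntityDefinition> };
  seed?: number; // optional integer
  locale?: string; // defaults to 'en' when omitted
}

// Hypothetical request: 5 companies and 20 people, each person linked to a company.
const exampleArgs: GenerateDatasetArgs = {
  schema: {
    entities: {
      companies: { count: 5, type: 'company' },
      employees: {
        count: 20,
        type: 'person',
        fields: ['firstName', 'lastName', 'email'], // assumed field names
        relationships: {
          companyId: { references: 'companies', type: 'one-to-many' },
        },
      },
    },
  },
  seed: 42,
  locale: 'en',
};

console.log(JSON.stringify(exampleArgs, null, 2));
```

Passing the same `seed` on repeated calls is presumably meant to make the generated data reproducible, though the page does not state this explicitly.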

Implementation Reference

  • The handler function that implements the core logic of the generate-dataset tool, including validation, dataset generation, and response formatting.
    import { z } from 'zod'; // needed for the z.ZodError check below
    
    export function handleGenerateDataset(params: unknown) {
      try {
        // Validate parameters
        const validatedParams = GenerateDatasetParamsSchema.parse(params);
    
        // Additional schema validation (referential integrity, circular dependencies)
        const schemaValidation = validateDatasetSchema(validatedParams.schema);
        if (!schemaValidation.valid) {
          throw new Error(`Invalid dataset schema: ${schemaValidation.errors.join(', ')}`);
        }
    
        // Create generator
        const generator = new DatasetGenerator({
          seed: validatedParams.seed,
          locale: validatedParams.locale,
        });
    
        // Generate dataset
        const result = generator.generateDataset(validatedParams.schema);
    
        // Log generation (no console.log, following linter rules - will log in server.ts instead)
    
        // Return response
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify(result, null, 2),
            },
          ],
        };
      } catch (error) {
        // Error handling
        if (error instanceof z.ZodError) {
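          // `issues` is the canonical list of validation problems on a ZodError
          const details = error.issues
            .map((issue) => `${issue.path.join('.')}: ${issue.message}`)
            .join(', ');
          const errorMessage = `Validation error: ${details}`;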
          // Log error (will be handled by server)
          throw new Error(errorMessage);
        }
    
        if (error instanceof Error) {
          // Log error (will be handled by server)
          throw error;
        }
    
        // Log unknown error (will be handled by server)
        throw new Error('Unknown error occurred during dataset generation');
      }
    }
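  • Zod validation schemas for the generate-dataset tool parameters: the relationship, entity, and dataset schema definitions, plus the top-level params schema.
    import { z } from 'zod';
    
    /**
     * Zod validation schema for relationship definitions between entities.
     *
     * @constant
     * @type {z.ZodObject}
     */
    const RelationshipDefinitionSchema = z.object({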
      references: z.string().min(1, 'Relationship references must be a non-empty string'),
      type: z.nativeEnum(RelationshipType),
      nullable: z.boolean().optional(),
    });
    
    /**
     * Zod validation schema for entity definitions within datasets.
     *
     * @constant
     * @type {z.ZodObject}
     */
    const EntityDefinitionSchema = z.object({
      count: z
        .number()
        .int('Count must be an integer')
        .min(1, 'Count must be at least 1')
        .max(10000, 'Count must not exceed 10000'),
      type: z.nativeEnum(EntityType),
      fields: z.array(z.string()).optional(),
      relationships: z.record(z.string(), RelationshipDefinitionSchema).optional(),
    });
    
    /**
     * Zod validation schema for complete dataset schemas.
     *
     * @constant
     * @type {z.ZodObject}
     */
    const DatasetSchemaSchema = z.object({
      entities: z
        .record(z.string(), EntityDefinitionSchema)
        .refine((entities) => Object.keys(entities).length > 0, {
          message: 'Schema must contain at least one entity',
        }),
    });
    
    /**
     * Zod validation schema for generate-dataset tool parameters.
     *
     * @constant
     * @type {z.ZodObject}
     */
    export const GenerateDatasetParamsSchema = z.object({
      schema: DatasetSchemaSchema,
      seed: z.number().int().optional(),
      locale: z.nativeEnum(SupportedLocale).optional().default(SupportedLocale.EN),
    });
    
    /**
     * Type definition for generate-dataset parameters, inferred from Zod schema.
     *
     * @typedef {z.infer<typeof GenerateDatasetParamsSchema>} GenerateDatasetParams
     */
    export type GenerateDatasetParams = z.infer<typeof GenerateDatasetParamsSchema>;
  • The Tool object definition for generate-dataset, including name, description, and inputSchema derived from Zod schema.
    export const generateDatasetTool: Tool = {
      name: 'generate-dataset',
      description:
        'Generate a structured dataset with multiple related entities and referential integrity. ' +
        'Supports person, company, and custom entity types with one-to-many and many-to-many relationships. ' +
        'Perfect for creating test databases, mock APIs, and complex data scenarios.',
      inputSchema: zodToJsonSchema(GenerateDatasetParamsSchema) as Tool['inputSchema'],
    };
  • src/index.ts:24-28 (registration)
    The server.registerTool call that registers the generate-dataset tool and its handler with the MCP server.
    // Register User Story 2 tool: generate-dataset
    server.registerTool(generateDatasetTool, async (args) => {
      await Promise.resolve();
      return handleGenerateDataset(args);
    });
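
The handler above calls `validateDatasetSchema` to check referential integrity and circular dependencies, but that function's body is not shown on this page. As a rough sketch of what such a check could involve (not the server's actual implementation), the function below verifies that every `references` value names a defined entity and uses a depth-first search to detect cycles. The name `checkDatasetSchema` and the simplified types are hypothetical, and whether the real validator treats self-references or nullable relationships specially is unknown.

```typescript
// Simplified shapes; only the parts needed for the check.
interface Relationship { references: string }
interface Entity { relationships?: Record<string, Relationship> }
type Entities = Record<string, Entity>;

interface ValidationResult { valid: boolean; errors: string[] }

// Hypothetical stand-in for validateDatasetSchema.
function checkDatasetSchema(entities: Entities): ValidationResult {
  const errors: string[] = [];
  const has = (name: string): boolean =>
    Object.prototype.hasOwnProperty.call(entities, name);

  // 1. Referential integrity: every reference must name a defined entity.
  for (const [entityName, entity] of Object.entries(entities)) {
    for (const [field, rel] of Object.entries(entity.relationships ?? {})) {
      if (!has(rel.references)) {
        errors.push(`${entityName}.${field} references unknown entity "${rel.references}"`);
      }
    }
  }

  // 2. Circular dependencies: depth-first search that tracks the current path.
  const state = new Map<string, 'visiting' | 'done'>();
  const visit = (entityName: string, path: string[]): void => {
    if (state.get(entityName) === 'done') return;
    if (state.get(entityName) === 'visiting') {
      // Report only the part of the path that forms the cycle.
      const cycle = path.slice(path.indexOf(entityName));
      errors.push(`Circular dependency: ${[...cycle, entityName].join(' -> ')}`);
      return;
    }
    state.set(entityName, 'visiting');
    for (const rel of Object.values(entities[entityName]?.relationships ?? {})) {
      if (has(rel.references)) visit(rel.references, [...path, entityName]);
    }
    state.set(entityName, 'done');
  };
  for (const entityName of Object.keys(entities)) visit(entityName, []);

  return { valid: errors.length === 0, errors };
}

// Usage: one valid schema and one with a two-entity cycle.
const ok = checkDatasetSchema({
  companies: {},
  employees: { relationships: { companyId: { references: 'companies' } } },
});
const bad = checkDatasetSchema({
  a: { relationships: { b: { references: 'b' } } },
  b: { relationships: { a: { references: 'a' } } },
});
console.log(ok.valid, bad.errors);
// prints: true [ 'Circular dependency: a -> b -> a' ]
```

Returning a `{ valid, errors }` pair matches how the handler consumes the result: it joins `errors` into a single message when `valid` is false.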

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/funsjanssen/faker-mcp'
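
The same request can be made from TypeScript. This is a minimal sketch assuming a runtime with a global `fetch` (for example Node 18 or later) and a JSON response body. The helper names `serverUrl` and `fetchServerInfo` are hypothetical, and because the response format is not described here, the result is left typed as `unknown`.

```typescript
const API_BASE = 'https://glama.ai/api/mcp/v1/servers';

// Build the endpoint URL for a server identified by owner and repository name.
function serverUrl(owner: string, repo: string): string {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Fetch the directory entry; assumes the API answers with JSON.
async function fetchServerInfo(owner: string, repo: string): Promise<unknown> {
  const response = await fetch(serverUrl(owner, repo));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// Usage: fetchServerInfo('funsjanssen', 'faker-mcp').then(console.log);
```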

If you have feedback or need assistance with the MCP directory API, please join our Discord server.