detect_schema

Convert SQL CREATE TABLE statements or JSON samples into a structured schema for generating test data that matches your existing database.

Instructions

Convert an existing database schema or JSON sample into MockHero's schema format.

Send a SQL CREATE TABLE statement and get back the structured schema ready to use with generate_test_data. Or send a sample JSON record and MockHero will infer the field types.

This is useful when you have migration files or an existing database and want to generate test data that matches your schema without manually writing the definition.
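For orientation, the two mutually exclusive call shapes look like this (a sketch; the argument names match the zod schema shown below, while the SQL and JSON contents are arbitrary examples):

```typescript
// Sketch of the two argument shapes for detect_schema.
// Provide exactly one of `sql` or `sample_json` per call.
const fromSql = {
  sql: `CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email VARCHAR(255) NOT NULL
  );`,
};

const fromJson = {
  sample_json: { id: 1, email: "ada@example.com", active: true },
};
```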

Input Schema

Name         Required  Description
sql          No        SQL CREATE TABLE statement(s) to convert
sample_json  No        Example JSON record to infer schema from

Implementation Reference

  • Main handler function for the 'detect_schema' tool. Routes to either detectFromSql or detectFromJson based on input args.
    async function handleDetectSchema(
      args: Record<string, unknown>
    ): Promise<ToolResult> {
      const { sql, sample_json } = args;
    
      if (!sql && !sample_json) {
        return err(
          "Either 'sql' (SQL CREATE TABLE statements) or 'sample_json' (JSON sample data) is required."
        );
      }
    
      if (sql && typeof sql === "string") {
        const schema = detectFromSql(sql);
        if (schema.tables.length === 0) {
          return err(
            "Could not detect any tables from the provided SQL. Make sure it contains valid CREATE TABLE statements."
          );
        }
        return ok(schema);
      }
    
      if (sample_json) {
        const schema = detectFromJson(sample_json);
        return ok(schema);
      }
    
      return err("Invalid input. Provide 'sql' as a string or 'sample_json' as an object.");
    }
  • Parses SQL CREATE TABLE statements into MockHero TableDefinition format. Handles inline REFERENCES and CHECK IN constraints.
    export function detectFromSql(sql: string): { tables: TableDefinition[] } {
      const tables: TableDefinition[] = [];
    
      // Match CREATE TABLE statements
      const tableRegex = /CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?(?:`|")?(\w+)(?:`|")?\s*\(([\s\S]*?)(?:\);|\)\s*;)/gi;
      let match;
    
      while ((match = tableRegex.exec(sql)) !== null) {
        const tableName = match[1].toLowerCase();
        const columnsBlock = match[2];
        const fields: FieldDefinition[] = [];
    
        // Parse individual columns (skip constraints)
        // Split on commas that are NOT inside parentheses — prevents breaking
        // types like DECIMAL(10,2), NUMERIC(12,4), VARCHAR(255), etc.
        const lines = splitColumnsBlock(columnsBlock);
    
        for (const line of lines) {
          if (/^\s*(PRIMARY\s+KEY|FOREIGN\s+KEY|UNIQUE|CHECK|CONSTRAINT|INDEX)/i.test(line)) {
            continue;
          }
    
          const colMatch = line.match(/^(?:`|")?(\w+)(?:`|")?\s+(\w+(?:\([^)]*\))?)/i);
          if (!colMatch) continue;
    
          const colName = colMatch[1].toLowerCase();
          const sqlType = colMatch[2].toUpperCase();
    
          const mapped = mapSqlType(colName, sqlType);
          const field: FieldDefinition = {
            name: colName,
            type: mapped.type,
            ...(mapped.params ? { params: mapped.params } : {}),
          };
    
          // Check for inline REFERENCES (foreign key)
          const refMatch = line.match(/REFERENCES\s+(?:`|")?(\w+)(?:`|")?\s*\((?:`|")?(\w+)(?:`|")?\)/i);
          if (refMatch) {
            field.type = "ref";
            field.params = { table: refMatch[1].toLowerCase(), field: refMatch[2].toLowerCase() };
          }
    
          // Check for CHECK IN constraint (enum detection)
          const checkMatch = line.match(/CHECK\s*\(\s*\w+\s+IN\s*\((.*?)\)\s*\)/i);
          if (checkMatch) {
            field.type = "enum";
            field.params = {
              values: checkMatch[1].split(",").map((v) => v.trim().replace(/'/g, "")),
            };
          }
    
          fields.push(field);
        }
    
        tables.push({
          name: tableName,
          count: 50,
          fields,
        });
      }
    
      return { tables };
    }
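    splitColumnsBlock is referenced above but not shown on this page. A minimal sketch consistent with its comment (split only on commas at parenthesis depth zero) might look like this; the real helper may differ:
    ```typescript
    // Hypothetical sketch of splitColumnsBlock: split a CREATE TABLE body
    // on top-level commas only, so types like DECIMAL(10,2) and constraints
    // like CHECK (x IN ('a','b')) are not broken apart.
    function splitColumnsBlock(block: string): string[] {
      const parts: string[] = [];
      let depth = 0;
      let current = "";
      for (const ch of block) {
        if (ch === "(") depth++;
        if (ch === ")") depth--;
        if (ch === "," && depth === 0) {
          parts.push(current.trim());
          current = "";
        } else {
          current += ch;
        }
      }
      if (current.trim()) parts.push(current.trim());
      return parts;
    }
    ```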
  • Infers MockHero field definitions from a sample JSON record. Uses key names and value types to detect field types.
    export function detectFromJson(sample: Record<string, unknown>): { tables: [{ name: string; count: number; fields: FieldDefinition[] }] } {
      const fields: FieldDefinition[] = [];
    
      for (const [key, value] of Object.entries(sample)) {
        const inferred = inferTypeFromValue(key, value);
        fields.push({
          name: key,
          type: inferred.type,
          ...(inferred.params ? { params: inferred.params } : {}),
        });
      }
    
      return {
        tables: [{
          name: "records",
          count: 50,
          fields,
        }],
      };
    }
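    inferTypeFromValue is likewise referenced but not shown. A plausible sketch follows, assuming an illustrative type vocabulary ("bool", "int", "float", "email", "date", "string") that may differ from MockHero's actual one:
    ```typescript
    // Hypothetical sketch of inferTypeFromValue: use the JS value type and
    // hints from the key name to pick a field type. Type names here are
    // illustrative guesses, not MockHero's confirmed vocabulary.
    function inferTypeFromValue(
      key: string,
      value: unknown
    ): { type: string; params?: Record<string, unknown> } {
      if (typeof value === "boolean") return { type: "bool" };
      if (typeof value === "number") {
        return Number.isInteger(value) ? { type: "int" } : { type: "float" };
      }
      if (typeof value === "string") {
        if (/email/i.test(key) || /^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(value)) {
          return { type: "email" };
        }
        if (/date|_at$/i.test(key) && !Number.isNaN(Date.parse(value))) {
          return { type: "date" };
        }
        return { type: "string" };
      }
      return { type: "string" };
    }
    ```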
  • API route handler for POST /api/v1/schema/detect. Validates API key, enforces Pro+ tier gating, and delegates to detectFromSql/detectFromJson.
    export async function POST(request: Request) {
      try {
        // 1. Validate API key
        const user = await validateApiKey(request);
        if (!user) {
          return unauthorizedError();
        }
    
        // 2. Feature gate — Pro+ only
        if (user.tier === "free") {
          return forbiddenFeatureError("Schema detection", "Pro");
        }
    
        // 3. Parse request body
        let body: Record<string, unknown>;
        try {
          body = await request.json();
        } catch {
          return validationError("Invalid JSON in request body");
        }
    
        const { sql, sample_json } = body;
    
        if (!sql && !sample_json) {
          return validationError("Provide either 'sql' (CREATE TABLE statement) or 'sample_json' (example JSON record)");
        }
    
        // 4. Detect schema
        if (typeof sql === "string") {
          if (sql.length > 50_000) {
            return validationError("SQL input must be 50,000 characters or fewer");
          }
          const schema = detectFromSql(sql);
          return NextResponse.json({ schema });
        }
    
        if (sample_json && typeof sample_json === "object") {
          // Reject arrays — sample_json must be a plain object (single record)
          if (Array.isArray(sample_json)) {
            return validationError("'sample_json' must be a single JSON object (not an array). Pass one example record.");
          }
          const schema = detectFromJson(sample_json as Record<string, unknown>);
          return NextResponse.json({ schema });
        }
    
        return validationError("'sql' must be a string, 'sample_json' must be an object");
      } catch (err) {
        console.error("Schema detect error:", err);
        return internalError();
      }
    }
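    An example request body for this route (the endpoint path comes from the handler description above; per `NextResponse.json({ schema })`, the response wraps the detected result as `{ "schema": { "tables": [...] } }`):
    ```json
    {
      "sql": "CREATE TABLE users (id INTEGER PRIMARY KEY, email VARCHAR(255) NOT NULL);"
    }
    ```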
  • Input schema definition for the 'detect_schema' tool: accepts optional 'sql' (string) and 'sample_json' (object) parameters.
    {
      name: "detect_schema",
      description:
        "Convert a SQL CREATE TABLE statement or JSON sample into MockHero's schema format. Use this to quickly generate test data for an existing database.",
      inputSchema: {
        type: "object",
        properties: {
          sql: {
            type: "string",
            description:
              "One or more SQL CREATE TABLE statements to convert into MockHero schema format.",
          },
          sample_json: {
            type: "object",
            description:
          "A single sample JSON object to infer a schema from (arrays are rejected; pass one example record). The keys become field names, and values are used to detect types.",
          },
        },
      },
    },
  • Registration of 'detect_schema' as an MCP server tool with zod validation and proxying to the API endpoint.
    server.tool(
      "detect_schema",
      `Convert an existing database schema or JSON sample into MockHero's schema format.
    
    Send a SQL CREATE TABLE statement and get back the structured schema ready to use
    with generate_test_data. Or send a sample JSON record and MockHero will infer
    the field types.
    
    This is useful when you have migration files or an existing database and want to
    generate test data that matches your schema without manually writing the definition.`,
      {
        sql: z
          .string()
          .optional()
          .describe("SQL CREATE TABLE statement(s) to convert"),
        sample_json: z
          .record(z.string(), z.unknown())
          .optional()
          .describe("Example JSON record to infer schema from"),
      },
      async (params) => {
        if (!params.sql && !params.sample_json) {
          return {
            content: [
              {
                type: "text" as const,
                text: "Error: provide either 'sql' (CREATE TABLE statements) or 'sample_json' (example record).",
              },
            ],
          };
        }
    
        const res = await apiCall("POST", "/api/v1/schema/detect", params);
    
        if (!res.ok) {
          return {
            content: [
              {
                type: "text" as const,
                text: `API error (${res.status}): ${JSON.stringify(res.data, null, 2)}`,
              },
            ],
            isError: true,
          };
        }
    
        return {
          content: [
            {
              type: "text" as const,
              text: JSON.stringify(res.data, null, 2),
            },
          ],
        };
      }
    );
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It fails to mention side effects, auth requirements, or whether the operation is read-only or destructive. For a tool that processes input and returns a schema, this is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three paragraphs, but the first sentence captures the core. The rest provides useful context without excessive verbosity. It is well-structured and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema but full parameter descriptions, the description adequately explains the two input modes and the use case. However, it omits the output format, error handling, and limitations, all of which would improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for both parameters. The description adds meaning by explaining that sql converts CREATE TABLE statements and sample_json infers types, going beyond the schema's basic descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it converts an existing database schema or JSON sample into MockHero's schema format. It specifies the verb 'Convert' and the resource, distinguishing it from siblings like generate_test_data or list_field_types.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool: when you have migration files or an existing database and want to generate matching test data. It never explicitly says when not to use the tool, though the intended scope is clear from context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
