
getDatasetInfoAndSchema

Retrieve dataset metadata and structure details from Axiom to understand data organization and field definitions for analysis.

Instructions

Get dataset info and schema

Input Schema

Name | Required | Description | Default
dataset | Yes | The dataset to get info and schema for |
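
To illustrate the single `dataset` parameter, here is a minimal sketch of the JSON-RPC `tools/call` request an MCP client might send for this tool. The helper name `buildGetDatasetInfoRequest`, the dataset name `my-logs`, and the non-empty check are assumptions for illustration; the tool's own schema only requires a string.

```javascript
// Hypothetical helper (not part of the server): builds a JSON-RPC 2.0
// "tools/call" request targeting the getDatasetInfoAndSchema tool.
function buildGetDatasetInfoRequest(dataset, id = 1) {
  // Assumption: reject empty names early. The tool's schema itself only
  // requires a string.
  if (typeof dataset !== "string" || dataset.length === 0) {
    throw new TypeError("dataset must be a non-empty string");
  }

  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "getDatasetInfoAndSchema",
      arguments: { dataset },
    },
  };
}

// "my-logs" is an invented dataset name.
const exampleRequest = buildGetDatasetInfoRequest("my-logs");
console.log(JSON.stringify(exampleRequest, null, 2));
```

In practice an MCP client library would normally construct this envelope for you; the sketch only shows where the `dataset` argument ends up.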

Implementation Reference

  • The main handler function for the 'getDatasetInfoAndSchema' tool. It applies rate limiting, fetches dataset information and schema from Axiom's internal API using fetch, validates the response with datasetInfoSchema, converts the fields schema to TypeScript type definitions using helper functions, and returns the data as a formatted text content block.
    async ({ dataset }) => {
      const remainingTokens = datasetsLimiter.tryRemoveTokens(1);
      if (!remainingTokens) {
        throw new Error("Rate limit exceeded for dataset operations");
      }
    
      try {
        // Axiom client does not provide access to internal routes. We need to hit the API directly.
        const response = await fetch(
          `${config.internalUrl}/datasets/${dataset}/info`,
          {
            headers: {
              Authorization: `Bearer ${config.token}`,
              "X-AXIOM-ORG-ID": config.orgId,
            },
          }
        );
    
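        // Surface HTTP errors instead of parsing an error body as dataset info
        if (!response.ok) {
          throw new Error(`Axiom API responded with ${response.status} ${response.statusText}`);
        }

        const rawData = await response.json();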
    
        // Validate the response data
        const data = datasetInfoSchema.parse(rawData);
    
        // Convert the fields to type definitions string
        const typeDefsString = getStringifiedSchema(
          convertSchemaToJSON(data.fields)
        );
    
        return {
          content: [
            {
              type: "text",
              text: JSON.stringify({ ...data, fields: typeDefsString }), // Override the fields with the type definitions
            },
          ],
        };
      } catch (error) {
        throw new Error(`Failed to get dataset info and schema: ${error.message}`);
      }
    }
  • index.js:192-238 (registration)
    Registration of the 'getDatasetInfoAndSchema' tool with the MCP server, including tool name, description, input schema (dataset name), and the inline handler function.
    server.tool(
      "getDatasetInfoAndSchema",
      "Get dataset info and schema",
      {
        dataset: z.string().describe("The dataset to get info and schema for"),
      },
      async ({ dataset }) => {
        const remainingTokens = datasetsLimiter.tryRemoveTokens(1);
        if (!remainingTokens) {
          throw new Error("Rate limit exceeded for dataset operations");
        }
    
        try {
          // Axiom client does not provide access to internal routes. We need to hit the API directly.
          const response = await fetch(
            `${config.internalUrl}/datasets/${dataset}/info`,
            {
              headers: {
                Authorization: `Bearer ${config.token}`,
                "X-AXIOM-ORG-ID": config.orgId,
              },
            }
          );
    
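          // Surface HTTP errors instead of parsing an error body as dataset info
          if (!response.ok) {
            throw new Error(`Axiom API responded with ${response.status} ${response.statusText}`);
          }

          const rawData = await response.json();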
    
          // Validate the response data
          const data = datasetInfoSchema.parse(rawData);
    
          // Convert the fields to type definitions string
          const typeDefsString = getStringifiedSchema(
            convertSchemaToJSON(data.fields)
          );
    
          return {
            content: [
              {
                type: "text",
                text: JSON.stringify({ ...data, fields: typeDefsString }), // Override the fields with the type definitions
              },
            ],
          };
        } catch (error) {
          throw new Error(`Failed to get dataset info and schema: ${error.message}`);
        }
      }
    );
  • Zod schema used to validate the raw dataset info response from the Axiom API before processing.
    const datasetInfoSchema = z.object({
      compressedBytes: z.number(),
      compressedBytesHuman: z.string(),
      created: z.string(),
      fields: fieldsSchema,
      inputBytes: z.number(),
      inputBytesHuman: z.string(),
      maxTime: z.string(),
      minTime: z.string(),
      name: z.string(),
      numBlocks: z.number(),
      numEvents: z.number(),
      numFields: z.number(),
      quickQueries: z.null(),
      who: z.string(),
    });
  • Helper function that converts the flat array of Axiom dataset fields (with dotted notation for nested fields) into a nested JSON object suitable for TypeScript type generation.
    function convertSchemaToJSON(fields) {
      // Validate fields
      const validatedFields = fieldsSchema.parse(fields);
    
      const defs = {};
    
      validatedFields.forEach((field) => {
        const type = field.type || "any"; // Directly use the type from the field
        const path = field.name.split(".");
    
        let current = defs;
    
        path.forEach((key, index) => {
          // Ensure the current level is initialized as an object or the final type
          if (!current[key]) {
            current[key] = index === path.length - 1 ? type : {};
          } else if (
            index === path.length - 1 &&
            typeof current[key] === "object"
          ) {
            // If the final level was previously initialized as an object, overwrite with the type
            current[key] = type;
          }
    
          current = current[key] || {};
        });
      });
    
      return defs;
    }
  • Helper function that recursively converts the nested JSON schema object into a formatted multi-line string representing TypeScript type/interface definitions.
    function getStringifiedSchema(defs, indent = 2) {
      const entries = Object.entries(defs);
      const spaces = " ".repeat(indent);
    
      return `{
    ${entries
      .map(([key, value]) => {
        if (typeof value === "string") {
          return `${spaces}${key}: ${value};`;
        }
    
        return `${spaces}${key}: ${getStringifiedSchema(value, indent + 2)};`;
      })
      .join("\n")}
    ${" ".repeat(indent - 2)}}`;
    }
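
To make the two helpers above concrete, the following is a self-contained sketch that follows the same approach without the zod validation step. The function names `nestFields` and `stringifyDefs` and the sample field list are invented for illustration; real Axiom field lists may carry more properties than `name` and `type`.

```javascript
// Sketch of the dotted-name nesting step (same logic as convertSchemaToJSON,
// minus the fieldsSchema validation).
function nestFields(fields) {
  const defs = {};

  fields.forEach((field) => {
    const type = field.type || "any";
    const path = field.name.split(".");
    let current = defs;

    path.forEach((key, index) => {
      const isLeaf = index === path.length - 1;
      if (!current[key]) {
        // First time this key is seen: leaf gets the type, parent gets an object
        current[key] = isLeaf ? type : {};
      } else if (isLeaf && typeof current[key] === "object") {
        // A leaf that was previously created as a parent is overwritten
        current[key] = type;
      }
      current = current[key] || {};
    });
  });

  return defs;
}

// Sketch of the recursive pretty-printer (same idea as getStringifiedSchema).
function stringifyDefs(defs, indent = 2) {
  const spaces = " ".repeat(indent);
  const lines = Object.entries(defs).map(([key, value]) =>
    typeof value === "string"
      ? `${spaces}${key}: ${value};`
      : `${spaces}${key}: ${stringifyDefs(value, indent + 2)};`
  );
  return ["{", ...lines, `${" ".repeat(indent - 2)}}`].join("\n");
}

// Invented sample fields.
const sampleFields = [
  { name: "status", type: "integer" },
  { name: "request.method", type: "string" },
  { name: "request.path", type: "string" },
];

const typeDefs = stringifyDefs(nestFields(sampleFields));
console.log(typeDefs);
// {
//   status: integer;
//   request: {
//     method: string;
//     path: string;
//   };
// }
```

The output is a TypeScript-like shape rather than valid TypeScript, since the type names come straight from the field metadata.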
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. The description only states what the tool does ('Get dataset info and schema') without explaining behavioral traits such as whether it's a read-only operation, what permissions are needed, how it handles errors, or what format the output takes. This leaves significant gaps in understanding the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
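
One behavior the description leaves out is the rate limit: the handler in the Implementation Reference calls `datasetsLimiter.tryRemoveTokens(1)` and throws when no token is available. How `datasetsLimiter` is configured is not shown on this page, so the following is only a minimal token-bucket sketch with the same method name; the class, capacity, and refill rate are assumptions.

```javascript
// Hypothetical token bucket exposing a tryRemoveTokens(count) method.
// Capacity and refill rate are illustrative, not the server's real settings.
class TokenBucket {
  constructor({ capacity, refillPerSecond, now = () => Date.now() }) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now; // injectable clock, useful for testing
    this.tokens = capacity;
    this.lastRefill = now();
  }

  // Add tokens for the time elapsed since the last refill, capped at capacity.
  refill() {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = current;
  }

  // Returns true and consumes tokens if enough are available, else false.
  tryRemoveTokens(count) {
    this.refill();
    if (count > this.tokens) {
      return false;
    }
    this.tokens -= count;
    return true;
  }
}

// Usage with a fake clock: 2 calls allowed immediately, then 1 per second.
let fakeTime = 0;
const limiter = new TokenBucket({
  capacity: 2,
  refillPerSecond: 1,
  now: () => fakeTime,
});

const firstCall = limiter.tryRemoveTokens(1);
const secondCall = limiter.tryRemoveTokens(1);
const thirdCall = limiter.tryRemoveTokens(1); // bucket is empty here
fakeTime = 1000; // one second later, one token has been refilled
const fourthCall = limiter.tryRemoveTokens(1);
console.log(firstCall, secondCall, thirdCall, fourthCall);
```

A caller that receives the "Rate limit exceeded for dataset operations" error would need to wait and retry, which is the kind of detail a fuller tool description could state.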

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise ('Get dataset info and schema') and front-loaded with the core purpose. It uses minimal words without unnecessary elaboration, though it could be more structured by including key details. Every sentence (here, just one) earns its place by stating the tool's function.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (a read operation with one parameter) and the absence of annotations and output schema, the description is incomplete. It doesn't explain what 'info' includes (e.g., metadata, statistics) or the schema format (e.g., JSON, table structure), leaving the agent with insufficient context to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
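
Although no output schema is declared, the handler in the Implementation Reference shows the shape: a single `text` content block whose text is a JSON string, with `fields` replaced by a type-definition string. A minimal sketch of how a client might unpack that result follows; the helper name `parseDatasetInfo` and all sample values are invented.

```javascript
// Hypothetical sample result, shaped like the handler's return value.
// All values are invented for illustration.
const sampleResult = {
  content: [
    {
      type: "text",
      text: JSON.stringify({
        name: "my-logs",
        numEvents: 1200,
        numFields: 2,
        fields: "{\n  status: integer;\n  message: string;\n}",
      }),
    },
  ],
};

// Finds the text block and parses its JSON payload.
function parseDatasetInfo(result) {
  const block = result.content.find((item) => item.type === "text");
  if (!block) {
    throw new Error("No text content block in tool result");
  }
  return JSON.parse(block.text);
}

const info = parseDatasetInfo(sampleResult);
console.log(info.name); // dataset metadata stays structured
console.log(info.fields); // fields is a multi-line string, not an object
```

Note that `fields` needs no second `JSON.parse`: it is meant to be read as text, for example when an agent is deciding which columns to reference in a query.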

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the single parameter 'dataset' documented as 'The dataset to get info and schema for'. The description adds no additional meaning beyond this, as it doesn't elaborate on what constitutes a valid dataset name or provide examples. With high schema coverage, the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool's purpose ('Get dataset info and schema') which is clear but vague. It specifies the verb 'Get' and resource 'dataset info and schema', but doesn't distinguish it from sibling tools like 'listDatasets' or 'queryApl'. The purpose is understandable but lacks specificity about what 'info' includes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. There's no mention of when to use 'getDatasetInfoAndSchema' instead of 'listDatasets' or 'queryApl', nor any context about prerequisites or exclusions. The user must infer usage from the tool name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ThetaBird/mcp-server-axiom-js'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.