
queryApl

Query Axiom datasets using Axiom Processing Language (APL) to analyze logs, metrics, and events with filtering, aggregation, and data shaping operations.

Instructions

  1. Query Axiom datasets using Axiom Processing Language (APL). The query must be a valid APL query string.

  2. ALWAYS get the schema of the dataset before running queries rather than guessing. You can do this by getting a single event and projecting all fields (a sample query follows this list).

  3. Keep in mind that there's a maximum row limit of 65000 rows per query.

  4. Prefer aggregations over non-aggregating queries when possible to reduce the amount of data returned.

  5. Be selective in what you project in each query (unless otherwise needed, like for discovering the schema). It's expensive to project all fields.

  6. ALWAYS restrict the time range of the query to the smallest possible range that meets your needs. This will reduce the amount of data scanned and improve query performance.

  7. NEVER guess the schema of the dataset. If you don't know where something is, use search first to find in which fields it appears.
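
For example, with a placeholder dataset named ['logs'], the schema-discovery step above can be as simple as ['logs'] | limit 1, which returns a single event with every field so you can see what is available before writing narrower, time-bounded queries.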

Examples

Basic:

  • Filter: ['logs'] | where ['severity'] == "error" or ['duration'] > 500ms

  • Time range: ['logs'] | where ['_time'] > ago(2h) and ['_time'] < now()

  • Project rename: ['logs'] | project-rename responseTime=['duration'], path=['url']

Aggregations:

  • Count by: ['logs'] | summarize count() by bin(['_time'], 5m), ['status']

  • Multiple aggs: ['logs'] | summarize count(), avg(['duration']), max(['duration']), p95=percentile(['duration'], 95) by ['endpoint']

  • Dimensional: ['logs'] | summarize dimensional_analysis(['isError'], pack_array(['endpoint'], ['status']))

  • Histograms: ['logs'] | summarize histogram(['responseTime'], 100) by ['endpoint']

  • Distinct: ['logs'] | summarize dcount(['userId']) by bin_auto(['_time'])

Search & Parse:

  • Search all: search "error" or "exception"

  • Parse logs: ['logs'] | parse-kv ['message'] as (duration:long, error:string) with (pair_delimiter=",")

  • Regex extract: ['logs'] | extend errorCode = extract("error code ([0-9]+)", 1, ['message'])

  • Contains ops: ['logs'] | where ['message'] contains_cs "ERROR" or ['message'] startswith "FATAL"

Data Shaping:

  • Extend & Calculate: ['logs'] | extend duration_s = ['duration']/1000, success = ['status'] < 400

  • Dynamic: ['logs'] | extend props = parse_json(['properties']) | where ['props.level'] == "error"

  • Pack/Unpack: ['logs'] | extend fields = pack("status", ['status'], "duration", ['duration'])

  • Arrays: ['logs'] | where ['url'] in ("login", "logout", "home") | where array_length(['tags']) > 0

Advanced:

  • Make series: ['metrics'] | make-series avg(['cpu']) default=0 on ['_time'] step 1m by ['host']

  • Join: ['errors'] | join kind=inner (['users'] | project ['userId'], ['email']) on ['userId']

  • Union: union ['logs-app*'] | where ['severity'] == "error"

  • Fork: ['logs'] | fork (where ['status'] >= 500 | as errors) (where ['status'] < 300 | as success)

  • Case: ['logs'] | extend level = case(['status'] >= 500, "error", ['status'] >= 400, "warn", "info")

Time Operations:

  • Bin & Range: ['logs'] | where ['_time'] between(datetime(2024-01-01)..now())

  • Multiple time bins: ['logs'] | summarize count() by bin(['_time'], 1h), bin(['_time'], 1d)

  • Time shifts: ['logs'] | extend prev_hour = ['_time'] - 1h

String Operations:

  • String funcs: ['logs'] | extend domain = tolower(extract("://([^/]+)", 1, ['url']))

  • Concat: ['logs'] | extend full_msg = strcat(['level'], ": ", ['message'])

  • Replace: ['logs'] | extend clean_msg = replace_regex("(password=)[^&]*", "\\1***", ['message'])

Common Patterns:

  • Error analysis: ['logs'] | where ['severity'] == "error" | summarize error_count=count() by ['error_code'], ['service']

  • Status codes: ['logs'] | summarize requests=count() by ['status'], bin_auto(['_time']) | where ['status'] >= 500

  • Latency tracking: ['logs'] | summarize p50=percentile(['duration'], 50), p90=percentile(['duration'], 90) by ['endpoint']

  • User activity: ['logs'] | summarize user_actions=count() by ['userId'], ['action'], bin(['_time'], 1h)

Input Schema

Name     Required   Description             Default
query    Yes        The APL query to run    (none)
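
The query string is the only argument. A request an MCP client might send looks like the following (the dataset name is illustrative):

    {
      "query": "['logs'] | where ['_time'] > ago(1h) | summarize count() by ['status']"
    }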

Implementation Reference

  • The handler function that performs rate limiting and query validation, executes the APL query using the Axiom client, and returns the result as JSON.
    async ({ query }) => {
      const remainingTokens = queryLimiter.tryRemoveTokens(1);
      if (!remainingTokens) {
        throw new Error("Rate limit exceeded for queries");
      }
    
      if (!query) {
        throw new Error("Query must not be empty");
      }
    
      try {
        const result = await client.query(query);
        return {
          content: [
            {
              type: "text",
              text: JSON.stringify(result),
            },
          ],
        };
      } catch (error) {
        throw new Error(`Query failed: ${error.message}`);
      }
    }
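  • (Sketch, not a source excerpt) The handler above references a queryLimiter and an Axiom client that are created elsewhere in index.js. A minimal sketch of that setup, assuming the limiter and @axiomhq/js npm packages; the actual file may construct these differently.
    import { RateLimiter } from "limiter";
    import { Axiom } from "@axiomhq/js";

    // Token-bucket rate limiter; tryRemoveTokens(1) in the handler should return a falsy
    // value once the bucket is empty, which triggers the "Rate limit exceeded" error.
    const queryLimiter = new RateLimiter({ tokensPerInterval: 10, interval: "minute" });

    // Axiom client; client.query(query) in the handler runs the APL string against Axiom.
    const client = new Axiom({ token: process.env.AXIOM_TOKEN });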
  • Zod input schema defining the 'query' parameter for the queryApl tool.
    {
      query: z.string().describe("The APL query to run"),
    },
  • index.js:58-147 (registration)
    Registration of the 'queryApl' tool on the MCP server with detailed description, input schema, and handler function.
    server.tool(
      "queryApl",
      `# Instructions
    1. Query Axiom datasets using Axiom Processing Language (APL). The query must be a valid APL query string.
    2. ALWAYS get the schema of the dataset before running queries rather than guessing.
        You can do this by getting a single event and projecting all fields.
    3. Keep in mind that there's a maximum row limit of 65000 rows per query.
    4. Prefer aggregations over non-aggregating queries when possible to reduce the amount of data returned.
    5. Be selective in what you project in each query (unless otherwise needed, like for discovering the schema).
        It's expensive to project all fields.
    6. ALWAYS restrict the time range of the query to the smallest possible range that
        meets your needs. This will reduce the amount of data scanned and improve query performance.
    7. NEVER guess the schema of the dataset. If you don't know where something is, use search first to find in which fields
        it appears.
    
    # Examples
    Basic:
    - Filter: ['logs'] | where ['severity'] == "error" or ['duration'] > 500ms
    - Time range: ['logs'] | where ['_time'] > ago(2h) and ['_time'] < now()
    - Project rename: ['logs'] | project-rename responseTime=['duration'], path=['url']
    
    Aggregations:
    - Count by: ['logs'] | summarize count() by bin(['_time'], 5m), ['status']
    - Multiple aggs: ['logs'] | summarize count(), avg(['duration']), max(['duration']), p95=percentile(['duration'], 95) by ['endpoint']
    - Dimensional: ['logs'] | summarize dimensional_analysis(['isError'], pack_array(['endpoint'], ['status']))
    - Histograms: ['logs'] | summarize histogram(['responseTime'], 100) by ['endpoint']
    - Distinct: ['logs'] | summarize dcount(['userId']) by bin_auto(['_time'])
    
    Search & Parse:
    - Search all: search "error" or "exception"
    - Parse logs: ['logs'] | parse-kv ['message'] as (duration:long, error:string) with (pair_delimiter=",")
    - Regex extract: ['logs'] | extend errorCode = extract("error code ([0-9]+)", 1, ['message'])
    - Contains ops: ['logs'] | where ['message'] contains_cs "ERROR" or ['message'] startswith "FATAL"
    
    Data Shaping:
    - Extend & Calculate: ['logs'] | extend duration_s = ['duration']/1000, success = ['status'] < 400
    - Dynamic: ['logs'] | extend props = parse_json(['properties']) | where ['props.level'] == "error"
    - Pack/Unpack: ['logs'] | extend fields = pack("status", ['status'], "duration", ['duration'])
    - Arrays: ['logs'] | where ['url'] in ("login", "logout", "home") | where array_length(['tags']) > 0
    
    Advanced:
    - Make series: ['metrics'] | make-series avg(['cpu']) default=0 on ['_time'] step 1m by ['host']
    - Join: ['errors'] | join kind=inner (['users'] | project ['userId'], ['email']) on ['userId']
    - Union: union ['logs-app*'] | where ['severity'] == "error"
    - Fork: ['logs'] | fork (where ['status'] >= 500 | as errors) (where ['status'] < 300 | as success)
    - Case: ['logs'] | extend level = case(['status'] >= 500, "error", ['status'] >= 400, "warn", "info")
    
    Time Operations:
    - Bin & Range: ['logs'] | where ['_time'] between(datetime(2024-01-01)..now())
    - Multiple time bins: ['logs'] | summarize count() by bin(['_time'], 1h), bin(['_time'], 1d)
    - Time shifts: ['logs'] | extend prev_hour = ['_time'] - 1h
    
    String Operations:
    - String funcs: ['logs'] | extend domain = tolower(extract("://([^/]+)", 1, ['url']))
    - Concat: ['logs'] | extend full_msg = strcat(['level'], ": ", ['message'])
    - Replace: ['logs'] | extend clean_msg = replace_regex("(password=)[^&]*", "\\1***", ['message'])
    
    Common Patterns:
    - Error analysis: ['logs'] | where ['severity'] == "error" | summarize error_count=count() by ['error_code'], ['service']
    - Status codes: ['logs'] | summarize requests=count() by ['status'], bin_auto(['_time']) | where ['status'] >= 500
    - Latency tracking: ['logs'] | summarize p50=percentile(['duration'], 50), p90=percentile(['duration'], 90) by ['endpoint']
    - User activity: ['logs'] | summarize user_actions=count() by ['userId'], ['action'], bin(['_time'], 1h)`,
      {
        query: z.string().describe("The APL query to run"),
      },
      async ({ query }) => {
        const remainingTokens = queryLimiter.tryRemoveTokens(1);
        if (!remainingTokens) {
          throw new Error("Rate limit exceeded for queries");
        }
    
        if (!query) {
          throw new Error("Query must not be empty");
        }
    
        try {
          const result = await client.query(query);
          return {
            content: [
              {
                type: "text",
                text: JSON.stringify(result),
              },
            ],
          };
        } catch (error) {
          throw new Error(`Query failed: ${error.message}`);
        }
      }
    );
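  • (Sketch, not a source excerpt) Example of invoking the registered tool from an MCP client. A minimal sketch assuming the @modelcontextprotocol/sdk client over a stdio transport; the client name, command, and dataset are illustrative.
    import { Client } from "@modelcontextprotocol/sdk/client/index.js";
    import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

    const mcpClient = new Client({ name: "example-client", version: "1.0.0" });
    await mcpClient.connect(
      new StdioClientTransport({ command: "node", args: ["index.js"] })
    );

    // Call queryApl with a narrow, time-bounded aggregation, per the tool's own instructions.
    const result = await mcpClient.callTool({
      name: "queryApl",
      arguments: {
        query: "['logs'] | where ['_time'] > ago(15m) | summarize count() by ['status']",
      },
    });

    // The handler returns the Axiom result serialized as JSON inside a single text content block.
    const data = JSON.parse(result.content[0].text);
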
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It thoroughly describes key traits: the query must be a valid APL string, there's a maximum row limit of 65000 rows, performance considerations (e.g., reducing data scanned by restricting time ranges), and cost implications (e.g., 'It's expensive to project all fields'). This goes well beyond the basic input schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is excessively long and poorly structured for a tool definition. It includes extensive examples and patterns that belong in documentation rather than a concise tool description. The front-loaded instructions are useful, but the bulk of the text (examples) adds unnecessary verbosity without earning its place in a tool description meant for AI agents.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of querying datasets with no annotations or output schema, the description is largely complete. It covers purpose, usage, behavioral traits, and parameter semantics thoroughly. However, the lack of output schema means the description doesn't explain return values or result formats, and the over-inclusion of examples reduces focus on core completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage for the single parameter 'query', describing it as 'The APL query to run.' The description adds significant semantic context by explaining what constitutes a valid APL query (e.g., examples of syntax, operators, and patterns) and constraints (e.g., must be valid APL). However, it doesn't explicitly link this to the parameter name or format beyond the examples.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Query Axiom datasets using Axiom Processing Language (APL).' It specifies the verb ('query'), resource ('Axiom datasets'), and technology ('APL'), distinguishing it from sibling tools like getDatasetInfoAndSchema and listDatasets, which likely retrieve metadata rather than execute queries.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when and how to use this tool, including prerequisites (e.g., 'ALWAYS get the schema of the dataset before running queries'), best practices (e.g., 'Prefer aggregations over non aggregating queries'), and alternatives (e.g., 'use search first to find in which fields it appears'). It also specifies exclusions like avoiding schema guessing and limiting time ranges.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
