Glama
ThoTischner

observability-mcp

query_logs

Fetch log entries for a service over a look-back window, with summary counts of errors and warnings and the most frequent error patterns. Use it to inspect service logs or investigate error spikes.

Instructions

Fetch recent log entries for ONE service over a look-back window, with a pre-computed summary (error/warning counts and the most frequent error patterns).

When to use: to inspect what a service actually logged, or to investigate an error spike surfaced by detect_anomalies / get_service_health. For numeric metrics use query_metrics instead.

Prerequisites: get the exact service name from list_services (the service must expose a logs signal).

Behavior: read-only, no side effects. Returns the matching log entries (newest first, capped by limit) plus a summary with total/error/warn counts and top recurring error patterns. No matches yields an empty result with a zeroed summary; an unreachable backend yields a structured explanatory error, never an exception.
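
The result shape described above can be sketched as TypeScript types. The field names are inferred from the LokiConnector.queryLogs() return value shown in the Implementation Reference; other connectors may differ. The helpers isEmptyResult and errorRatio, and the source value "loki", are hypothetical and only illustrate how a caller might read the summary.

```typescript
// Shapes inferred from the Loki connector listing on this page (assumption:
// other log backends return the same fields).
interface LogEntry {
  timestamp: string; // ISO 8601 string
  level: string; // e.g. "error", "warn", "info", "debug", "unknown"
  message: string;
  labels: Record<string, string>;
}

interface LogSummary {
  total: number;
  errorCount: number;
  warnCount: number;
  topPatterns: string[]; // each item looks like "<message prefix> (<count>x)"
}

interface LogResult {
  source: string;
  service: string;
  entries: LogEntry[];
  summary: LogSummary;
}

// Hypothetical helper: true when the query matched nothing.
function isEmptyResult(result: LogResult): boolean {
  return result.entries.length === 0 && result.summary.total === 0;
}

// Hypothetical helper: share of returned entries whose level is "error".
function errorRatio(result: LogResult): number {
  const { total, errorCount } = result.summary;
  return total === 0 ? 0 : errorCount / total;
}

// What "an empty result with a zeroed summary" could look like.
// The source value "loki" is illustrative only.
const noMatches: LogResult = {
  source: "loki",
  service: "payment-service",
  entries: [],
  summary: { total: 0, errorCount: 0, warnCount: 0, topPatterns: [] },
};
```

Because entries are capped by limit, the summary describes only the returned entries, so errorRatio is a ratio over the sample rather than over the whole window.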

Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| service | Yes | Required. Exact, case-sensitive service name exactly as returned by `list_services` (e.g. 'payment-service'). | |
| query | No | Optional. Filter expression matched against the log message; regular expressions are supported. Omit to return all entries in the window. | |
| duration | No | Optional. Look-back window ending at 'now', written as <number><unit> with unit s|m|h|d (e.g. '5m', '1h', '24h'). | '5m' |
| level | No | Optional. Return only entries at this severity. | all levels |
| limit | No | Optional. Maximum number of log entries to return (most recent first). | 100 |
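
As a rough illustration of the schema above, the sketch below builds an argument object and checks it on the caller's side before the tool is invoked. It is an independent sketch, not the project's own validators (validateServiceName and validateDuration are not shown on this page); QueryLogsArgs, durationToSeconds and checkArgs are hypothetical names.

```typescript
type LogLevel = "error" | "warn" | "info" | "debug";

// Mirrors the input schema: only `service` is required.
interface QueryLogsArgs {
  service: string;
  query?: string;
  duration?: string;
  level?: LogLevel;
  limit?: number;
}

const UNIT_SECONDS: Record<string, number> = { s: 1, m: 60, h: 3600, d: 86400 };

// Converts "<number><unit>" (unit s|m|h|d) to seconds, or returns null if malformed.
function durationToSeconds(duration: string): number | null {
  const match = /^(\d+)([smhd])$/.exec(duration);
  if (!match) return null;
  return parseInt(match[1], 10) * UNIT_SECONDS[match[2]];
}

// Returns a list of problems; an empty list means the arguments look well-formed.
function checkArgs(args: QueryLogsArgs): string[] {
  const problems: string[] = [];
  if (args.service.length === 0) problems.push("service must not be empty");
  if (args.duration !== undefined && durationToSeconds(args.duration) === null) {
    problems.push(`duration '${args.duration}' is not <number><unit> with unit s|m|h|d`);
  }
  if (args.limit !== undefined && (!Number.isInteger(args.limit) || args.limit <= 0)) {
    problems.push("limit must be a positive integer");
  }
  return problems;
}

// Example: the 50 most recent error entries from the last hour that mention
// a timeout or a refused connection.
const exampleArgs: QueryLogsArgs = {
  service: "payment-service",
  query: "timeout|connection refused",
  duration: "1h",
  level: "error",
  limit: 50,
};
```

Omitting duration, level and limit falls back to the documented defaults ('5m', all levels, 100).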

Implementation Reference

  • The queryLogsHandler function — the main handler that orchestrates log queries across all registered connectors. It validates inputs (service name, duration), iterates connectors registered for 'logs' signal type, calls each connector's queryLogs method, and aggregates results into a LogResult array.
    export async function queryLogsHandler(
      registry: ConnectorRegistry,
      args: { service: string; query?: string; duration?: string; level?: string; limit?: number },
      _ctx: RequestContext = defaultContext()
    ) {
      const svcErr = validateServiceName(args.service);
      if (svcErr) return errorResponse(svcErr);
      const duration = args.duration || "5m";
      const durationErr = validateDuration(duration);
      if (durationErr) return errorResponse(durationErr);
      const connectors = registry.getBySignal("logs");
    
      if (connectors.length === 0) {
        return {
          content: [
            { type: "text" as const, text: JSON.stringify({ error: "No log backends configured" }) },
          ],
          isError: true,
        };
      }
    
      const results: LogResult[] = [];
      const errors: string[] = [];
      for (const connector of connectors) {
        if (!connector.queryLogs) continue;
        try {
          const result = await connector.queryLogs({
            service: args.service,
            query: args.query,
            duration,
            level: args.level,
            limit: args.limit,
          });
          results.push(result);
        } catch (err) {
          const msg = err instanceof Error ? err.message : String(err);
          console.error(`Log query failed on ${connector.name}:`, msg);
          errors.push(`${connector.name}: ${msg}`);
        }
      }
    
      if (results.length === 0) {
        return {
          content: [
            {
              type: "text" as const,
              text: JSON.stringify({
                error: errors.length > 0 ? `Query failed: ${errors.join("; ")}` : "No logs returned",
                service: args.service,
                duration,
              }),
            },
          ],
          isError: errors.length > 0,
        };
      }
    
      return {
        content: [
          {
            type: "text" as const,
            text: JSON.stringify(results.length === 1 ? results[0] : results, null, 2),
          },
        ],
      };
    }
  • queryLogsDefinition — the tool definition/input schema (name, description, JSON Schema input with service, query, duration, level, limit fields) that gets registered with the MCP server.
    export const queryLogsDefinition = {
      name: "query_logs" as const,
      description:
        "Query logs for a service over a given timeframe. Returns log entries with a summary including error/warning counts and top error patterns. Supports filtering by log level and search query.",
      inputSchema: {
        type: "object" as const,
        properties: {
          service: {
            type: "string",
            description: "Service name (e.g. 'payment-service')",
          },
          query: {
            type: "string",
            description: "Optional search query to filter log messages (regex supported)",
          },
          duration: {
            type: "string",
            description: "Time range to query (e.g. '5m', '1h', '24h'). Default: '5m'",
          },
          level: {
            type: "string",
            description: "Filter by log level: 'error', 'warn', 'info', 'debug'",
          },
          limit: {
            type: "number",
            description: "Maximum number of log entries to return. Default: 100",
          },
        },
        required: ["service"],
      },
    };
  • Registration of 'query_logs' tool with the MCP server (lines 217-259). Uses Zod schemas to validate inputs and wires it to queryLogsHandler via withToolMetrics instrumentation.
    mcpServer.tool(
      "query_logs",
      [
        "Fetch recent log entries for ONE service over a look-back window, with a pre-computed summary (error/warning counts and the most frequent error patterns).",
        "When to use: to inspect what a service actually logged, or to investigate an error spike surfaced by `detect_anomalies` / `get_service_health`. For numeric metrics use `query_metrics` instead.",
        "Prerequisites: get the exact service name from `list_services` (the service must expose a logs signal).",
        "Behavior: read-only, no side effects. Returns the matching log entries (newest first, capped by `limit`) plus a summary with total/error/warn counts and top recurring error patterns. No matches yields an empty result with a zeroed summary; an unreachable backend yields a structured explanatory error, never an exception.",
      ].join(" "),
      {
        service: z
          .string()
          .describe(
            "Required. Exact, case-sensitive service name exactly as returned by `list_services` (e.g. 'payment-service').",
          ),
        query: z
          .string()
          .optional()
          .describe(
            "Optional. Filter expression matched against the log message; regular expressions are supported. Omit to return all entries in the window.",
          ),
        duration: z
          .string()
          .optional()
          .describe(
            "Optional. Look-back window ending at 'now', written as <number><unit> with unit s|m|h|d (e.g. '5m', '1h', '24h'). Default: '5m'.",
          ),
        level: z
          .enum(["error", "warn", "info", "debug"])
          .optional()
          .describe(
            "Optional. Return only entries at this severity. Default: all levels.",
          ),
        limit: z
          .number()
          .int()
          .positive()
          .optional()
          .describe(
            "Optional. Maximum number of log entries to return (most recent first). Default: 100.",
          ),
      },
      async (args) => withToolMetrics("query_logs", () => queryLogsHandler(registry, args, ctx))
    );
  • Import of queryLogsHandler from the tools/query-logs.ts module at line 32.
    import { queryLogsHandler } from "./tools/query-logs.js";
    import { getServiceHealthHandler, setHealthThresholds } from "./tools/get-service-health.js";
  • LokiConnector.queryLogs() — the actual backend implementation for Loki/Grafana. Builds a LogQL query from params, fetches from Loki's query_range API, parses entries, computes summary (error/warn counts, top error patterns), and returns LogResult.
      async queryLogs(params: LogQuery): Promise<LogResult> {
        const { start, end } = this.parseTimeRange(params.duration);
        const limit = Math.min(Math.max(params.limit || 100, 1), 1000);
    
        // Resolve label + actual selector value. For the 'container' label the
        // value stored in Loki may be '/my-app-1' while the caller passes the
        // sanitized 'my-app-1' — return the prefixed form so the LogQL selector
        // matches the real stream.
        const { label: matchedLabel, value: rawValue } = await this.resolveServiceSelector(params.service);
        const service = this.escapeLogQLValue(rawValue);
        let logql = `{${matchedLabel}="${service}"}`;
        if (params.level) {
          const level = this.escapeLogQLValue(params.level);
          logql += ` | json | level="${level}"`;
        } else {
          logql += ` | json`;
        }
        if (params.query) {
          const query = this.escapeLogQLRegex(params.query);
          logql += ` |~ \`${query}\``;
        }
    
        const url =
          `/loki/api/v1/query_range?query=${encodeURIComponent(logql)}` +
          `&start=${start}000000000&end=${end}000000000&limit=${limit}`;
    
        const data = await this.apiGet<LokiQueryResponse>(url);
    
        const entries: LogEntry[] = [];
        for (const stream of data?.data?.result || []) {
          const labels = stream.stream;
          for (const [ts, line] of stream.values) {
            const parsed = this.parseLine(line);
            entries.push({
              timestamp: new Date(parseInt(ts) / 1_000_000).toISOString(),
              level: parsed.level || labels.level || "unknown",
              message: parsed.msg || line,
              labels,
            });
          }
        }
    
        // Sort newest first
        entries.sort((a, b) => b.timestamp.localeCompare(a.timestamp));
    
        // Compute summary
        const errorCount = entries.filter((e) => e.level === "error").length;
        const warnCount = entries.filter((e) => e.level === "warn").length;
        const topPatterns = this.extractTopPatterns(entries.filter((e) => e.level === "error"));
    
        return {
          source: this.name,
          service: params.service,
          entries,
          summary: {
            total: entries.length,
            errorCount,
            warnCount,
            topPatterns,
          },
        };
      }
    
      // --- Private helpers ---
    
      private async getLabelValues(label: string): Promise<string[]> {
        const cached = this.labelValuesCache.get(label);
        if (cached && cached.expiresAt > Date.now()) {
          return cached.values;
        }
        try {
          const data = await this.apiGet<{ data: string[] }>(
            `/loki/api/v1/label/${encodeURIComponent(label)}/values`
          );
          const values = data?.data || [];
          this.labelValuesCache.set(label, {
            values,
            expiresAt: Date.now() + LABEL_CACHE_TTL_MS,
          });
          return values;
        } catch {
          this.labelValuesCache.set(label, { values: [], expiresAt: Date.now() + LABEL_CACHE_TTL_MS });
          return [];
        }
      }
    
      private async resolveServiceSelector(service: string): Promise<{ label: string; value: string }> {
        for (const label of this.serviceLabels) {
          const values = await this.getLabelValues(label);
          if (values.includes(service)) return { label, value: service };
          // Container label values are Docker-prefixed with '/'. The caller can't
          // pass that form (validator rejects '/'), so probe the prefixed variant.
          if (label === "container" && values.includes(`/${service}`)) {
            return { label, value: `/${service}` };
          }
        }
        return { label: this.serviceLabels[0] || "service_name", value: service };
      }
    
      private parseLine(line: string): Record<string, string> {
        try {
          return JSON.parse(line);
        } catch {
          return { msg: line };
        }
      }
    
      private extractTopPatterns(errorEntries: LogEntry[]): string[] {
        const patterns = new Map<string, number>();
        for (const entry of errorEntries) {
          // Use first 100 chars of message as pattern key
          const key = entry.message.slice(0, 100);
          patterns.set(key, (patterns.get(key) || 0) + 1);
        }
        return Array.from(patterns.entries())
          .sort((a, b) => b[1] - a[1])
          .slice(0, 5)
          .map(([pattern, count]) => `${pattern} (${count}x)`);
      }
    
      private parseTimeRange(duration: string) {
        const now = Math.floor(Date.now() / 1000);
        // Accept s|m|h|d so the parser covers every unit the tool schema documents.
        const match = duration.match(/^(\d+)([smhd])$/);
        if (!match) throw new Error(`Invalid duration: ${duration}`);
        const value = parseInt(match[1], 10);
        const secondsPerUnit: Record<string, number> = { s: 1, m: 60, h: 3600, d: 86400 };
        const seconds = value * secondsPerUnit[match[2]];
        return { start: now - seconds, end: now };
      }
    
      private escapeLogQLValue(value: string): string {
        return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
      }
    
      private escapeLogQLRegex(value: string): string {
        // Escape backslash first (so we don't double-escape sequences we add),
        // then the backtick that delimits LogQL regex literals.
        return value.replace(/\\/g, "\\\\").replace(/`/g, "\\`");
      }
    
      private buildAuthHeaders(): Record<string, string> {
        if (!this.auth || this.auth.type === "none") return {};
        if (this.auth.type === "bearer" && this.auth.token) {
          return { Authorization: `Bearer ${this.auth.token}` };
        }
        if (this.auth.type === "basic" && this.auth.username) {
          const encoded = Buffer.from(`${this.auth.username}:${this.auth.password || ""}`).toString("base64");
          return { Authorization: `Basic ${encoded}` };
        }
        return {};
      }
    
      private async apiGet<T>(path: string, timeoutMs = 10000): Promise<T> {
        const controller = new AbortController();
        const timer = setTimeout(() => controller.abort(), timeoutMs);
        try {
          const res = await fetch(`${this.baseUrl}${path}`, {
            ...this.fetchOptions(),
            signal: controller.signal,
          });
          if (!res.ok) throw new Error(`Loki API error: ${res.status} ${res.statusText}`);
          return res.json() as Promise<T>;
        } catch (err) {
          if (err instanceof DOMException && err.name === "AbortError") {
            throw new Error(`Loki query timed out after ${timeoutMs}ms`);
          }
          throw err;
        } finally {
          clearTimeout(timer);
        }
      }
    }
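
The handler listing above serializes its payload in three different ways: a bare result object when one backend answers, an array when several do, and an object with an error field when nothing is returned. The sketch below shows one way a client might normalize these cases. parseQueryLogsResponse and the types around it are hypothetical and not part of observability-mcp, and it assumes the errorResponse helper (not shown on this page) also emits an error field.

```typescript
// Minimal structural types; field names follow the handler and connector
// listings, everything else is an assumption.
interface ToolResponse {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

interface LogResultLike {
  source: string;
  service: string;
  entries: unknown[];
  summary: { total: number; errorCount: number; warnCount: number; topPatterns: string[] };
}

type Parsed =
  | { ok: true; results: LogResultLike[] }
  | { ok: false; error: string };

// Hypothetical client-side helper.
function parseQueryLogsResponse(response: ToolResponse): Parsed {
  const first = response.content[0];
  if (!first) return { ok: false, error: "empty response" };
  const payload: unknown = JSON.parse(first.text);

  // Several log backends: the handler serializes an array of results.
  if (Array.isArray(payload)) return { ok: true, results: payload as LogResultLike[] };

  // Error payloads carry an `error` string. "No logs returned" arrives with
  // isError set to false, so check the field rather than the flag alone.
  if (typeof payload === "object" && payload !== null && "error" in payload) {
    return { ok: false, error: String((payload as { error: unknown }).error) };
  }

  // Exactly one backend: the handler serializes the bare result object.
  return { ok: true, results: [payload as LogResultLike] };
}
```

Returning an array in every success case spares callers from branching on the number of configured backends.
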
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses read-only, no side effects. Describes return behavior (newest first, capped by limit, including summary) and edge cases (empty result with zeroed summary, unreachable backend yields structured error).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured: first sentence gives core purpose, then usage guidelines, prerequisites, behavior, return details, edge cases. No wasted words, every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description thoroughly explains return values, including summary counts and pattern analysis. Covers usage context, behavioral expectations, and error handling comprehensively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% so baseline is 3. Description adds context for the 'service' parameter (exact, case-sensitive, from list_services) and overall purpose, but does not significantly enhance understanding of other parameters beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool fetches log entries for one service with a summary. Distinguishes from siblings like query_metrics and detect_anomalies.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use (inspect logs, investigate error spikes) and when not to use (numeric metrics use query_metrics). Lists prerequisite: get exact service name from list_services.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
