Glama
alilxxey

openobserve-community-mcp

search_logs

Search OpenObserve logs using full SQL queries. Specify time range, limit, offset, and output format. Optionally reduce token usage with columnar output or trim Kubernetes metadata with compact profile.

Instructions

Run a full SQL search against OpenObserve logs.

Example row query: SELECT _timestamp, message FROM "my_stream" ORDER BY _timestamp DESC LIMIT 20.
Example aggregate query: SELECT level, count(*) AS cnt FROM "my_stream" GROUP BY level ORDER BY cnt DESC LIMIT 20.

Prefer double quotes around stream names in SQL when in doubt, and confirm actual field names with get_stream_schema instead of assuming a `log` column.

start_time and end_time accept Unix timestamps in seconds, milliseconds, microseconds, or nanoseconds and are normalized to microseconds.

The limit parameter sets the API page size; if your OpenObserve/DataFusion setup still complains about ORDER BY without a SQL LIMIT, add an explicit LIMIT to the SQL as well.

output_format can be 'records' or 'columns'; 'columns' is especially useful for wide SELECT * queries and can save roughly 35-40% tokens.

record_profile can be 'generic' or 'kubernetes_compact'; the Kubernetes compact profile trims common noisy metadata fields such as pod labels and pod IP metadata.
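The timestamp handling described above (seconds, milliseconds, microseconds, or nanoseconds, all normalized to microseconds) could be sketched roughly as follows. This is only an illustration: the server's own _normalize_time_range is not shown on this page, and the function name to_microseconds and the magnitude thresholds used here are assumptions.

```python
def to_microseconds(ts: int) -> int:
    """Guess the unit of a Unix timestamp from its magnitude and return microseconds.

    The thresholds are illustrative assumptions, not the server's actual rules.
    """
    if ts < 0:
        raise ValueError("timestamp must be non-negative")
    if ts < 10**11:   # plausible seconds (covers dates far beyond the present)
        return ts * 1_000_000
    if ts < 10**14:   # plausible milliseconds
        return ts * 1_000
    if ts < 10**17:   # plausible microseconds, returned unchanged
        return ts
    return ts // 1_000  # treat anything larger as nanoseconds


def normalize_time_range(start: int, end: int) -> tuple[int, int]:
    """Convert both ends of a range independently to microseconds."""
    return to_microseconds(start), to_microseconds(end)
```

With this sketch, the same instant expressed in any of the four units maps to one microsecond value, so callers do not need to convert before calling the tool.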

Input Schema

Name            Required  Description  Default
sql             Yes
start_time      Yes
end_time        Yes
limit           No
offset          No
use_cache       No
timeout         No
output_format   No                     records
record_profile  No                     generic
include_raw     No
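The limit and offset parameters suggest page-by-page retrieval. A minimal sketch of such a loop is shown here, assuming a hypothetical call_tool callable that invokes search_logs with keyword arguments and returns its result dict; the stop condition (a page shorter than limit means no more data) is also an assumption.

```python
from typing import Any, Callable, Iterator


def iter_pages(
    call_tool: Callable[..., dict[str, Any]],
    *,
    sql: str,
    start_time: int,
    end_time: int,
    limit: int = 100,
    max_pages: int = 10,
) -> Iterator[dict[str, Any]]:
    """Yield result pages, advancing offset in steps of limit."""
    for page in range(max_pages):
        result = call_tool(
            sql=sql,
            start_time=start_time,
            end_time=end_time,
            limit=limit,
            offset=page * limit,
        )
        yield result
        # Assumption: a short page means there is nothing further to fetch.
        if result.get("hit_count", 0) < limit:
            break
```

The max_pages cap is there so that a misbehaving backend cannot keep the loop running indefinitely.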

Output Schema


No arguments
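Although no output fields are listed here, the Instructions mention a 'columns' output format that saves tokens on wide results. One way such a layout could be built is sketched below; the helper name records_to_columns and the "columns"/"rows" keys are assumptions for illustration, not the server's actual payload shape.

```python
from typing import Any


def records_to_columns(records: list[dict[str, Any]]) -> dict[str, Any]:
    """Turn a list of row dicts into one header list plus positional rows."""
    columns: list[str] = []
    for record in records:
        for key in record:
            if key not in columns:  # keep first-seen order, no duplicates
                columns.append(key)
    # Missing fields become None so every row lines up with the header.
    rows = [[record.get(name) for name in columns] for record in records]
    return {"columns": columns, "rows": rows}
```

The saving comes from writing each field name once in the header instead of once per record.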

Implementation Reference

  • The search_logs tool handler function. It is registered as a FastMCP tool and accepts the SQL query, time range, limit, offset, output_format, record_profile, and related options. It normalizes the timestamps, calls client.search_sql(), and returns the result via build_search_logs_result().
    @server.tool()
    def search_logs(
        sql: str,
        start_time: int,
        end_time: int,
        limit: int = 100,
        offset: int = 0,
        use_cache: bool = False,
        timeout: int | None = None,
        output_format: str = "records",
        record_profile: str = "generic",
        include_raw: bool = False,
    ) -> dict[str, Any]:
        """Run a full SQL search against OpenObserve logs. Example row query: SELECT _timestamp, message FROM "my_stream" ORDER BY _timestamp DESC LIMIT 20. Example aggregate query: SELECT level, count(*) AS cnt FROM "my_stream" GROUP BY level ORDER BY cnt DESC LIMIT 20. Prefer double quotes around stream names in SQL when in doubt, and confirm actual field names with get_stream_schema instead of assuming a `log` column. start_time and end_time accept Unix timestamps in seconds, milliseconds, microseconds, or nanoseconds and are normalized to microseconds. The limit parameter sets the API page size; if your OpenObserve/DataFusion setup still complains about ORDER BY without a SQL LIMIT, add an explicit LIMIT to the SQL as well. output_format can be 'records' or 'columns'; 'columns' is especially useful for wide SELECT * queries and can save roughly 35-40% tokens. record_profile can be 'generic' or 'kubernetes_compact'; the Kubernetes compact profile trims common noisy metadata fields such as pod labels and pod IP metadata."""
        client = client_provider.get()
        start_time, end_time = _normalize_time_range(start_time, end_time)
        raw = client.search_sql(
            sql=sql,
            start_time=start_time,
            end_time=end_time,
            offset=offset,
            limit=limit,
            use_cache=use_cache,
            timeout=timeout,
        )
        return build_search_logs_result(
            org_id=client.resolve_org_id(),
            raw=raw,
            output_format=output_format,
            record_profile=record_profile,
            include_raw=include_raw,
        )
  • The build_search_logs_result() helper that shapes the API response into a compact result dict with metadata (took, total, scan_records, cached_ratio, hit_count) and records/columns payload, applying record profile and output format normalization.
    def build_search_logs_result(
        *,
        org_id: str,
        raw: Any,
        output_format: str,
        record_profile: str,
        include_raw: bool,
    ) -> dict[str, Any]:
        hits = raw.get("hits", []) if isinstance(raw, dict) else []
        records = [
            _apply_record_profile(summarize_search_record(hit), record_profile=record_profile)
            for hit in hits
            if isinstance(hit, dict)
        ]
        result: dict[str, Any] = {
            "org_id": org_id,
            "took": raw.get("took") if isinstance(raw, dict) else None,
            "total": raw.get("total") if isinstance(raw, dict) else None,
            "scan_records": raw.get("scan_records") if isinstance(raw, dict) else None,
            "cached_ratio": raw.get("cached_ratio") if isinstance(raw, dict) else None,
            "hit_count": len(hits),
            "output_format": _normalize_output_format(output_format),
            "record_profile": _normalize_record_profile(record_profile),
        }
        _attach_record_payload(result, records, output_format=output_format)
        return maybe_include_raw(result, raw, include_raw)
  • The search_logs tool is registered via the @server.tool() decorator applied to the search_logs function at line 78 (in the server creation block) within create_server(). This is the same handler shown in the first item above.
  • The search_sql() method on OpenObserveClient that makes the POST request to /api/{org_id}/_search with the SQL query, time range, pagination, and cache settings.
    def search_sql(
        self,
        *,
        sql: str,
        start_time: int,
        end_time: int,
        offset: int = 0,
        limit: int = 100,
        use_cache: bool = False,
        timeout: int | None = None,
    ) -> Any:
        body: dict[str, Any] = {
            "query": {
                "sql": sql,
                "start_time": start_time,
                "end_time": end_time,
                "from": offset,
                "size": limit,
            },
            "use_cache": use_cache,
        }
        if timeout is not None:
            body["timeout"] = timeout
    
        return self.request_json(
            "POST",
            self._org_path("/api/{org_id}/_search"),
            query={
                "is_ui_histogram": "false",
                "is_multi_stream_search": "false",
                "validate": "false",
            },
            json_body=body,
        )
  • The _apply_record_profile() helper used within build_search_logs_result to trim Kubernetes noisy metadata fields when record_profile='kubernetes_compact'.
    def _apply_record_profile(record: dict[str, Any], *, record_profile: str) -> dict[str, Any]:
        normalized = _normalize_record_profile(record_profile)
        if normalized == "generic":
            return record
    
        compact_record: dict[str, Any] = {}
        for key, value in record.items():
            if key in _KUBERNETES_COMPACT_DROP_FIELDS:
                continue
            if any(key.startswith(prefix) for prefix in _KUBERNETES_COMPACT_DROP_PREFIXES):
                continue
            compact_record[key] = value
        return compact_record
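To illustrate the effect of the compact profile in the last item above, here is a small self-contained sketch. The contents of _KUBERNETES_COMPACT_DROP_FIELDS and _KUBERNETES_COMPACT_DROP_PREFIXES are not shown on this page, so the field name and prefix below are hypothetical stand-ins.

```python
from typing import Any

# Hypothetical stand-ins; the real constants are not shown in the source.
DROP_FIELDS = frozenset({"kubernetes_pod_ip"})
DROP_PREFIXES = ("kubernetes_labels_",)


def compact(record: dict[str, Any]) -> dict[str, Any]:
    """Drop exact-match fields and any field starting with a listed prefix."""
    return {
        key: value
        for key, value in record.items()
        # str.startswith accepts a tuple and checks each prefix in turn.
        if key not in DROP_FIELDS and not key.startswith(DROP_PREFIXES)
    }


sample = {
    "message": "request served",
    "kubernetes_pod_ip": "10.0.0.1",
    "kubernetes_labels_app": "web",
    "kubernetes_namespace_name": "prod",
}
trimmed = compact(sample)
```

Here trimmed keeps only message and kubernetes_namespace_name, while the input record is left unmodified.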
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses behavioral traits: time normalization, limit as page size, output format token savings, record profiles, and advice on SQL syntax. This provides a rich understanding of tool behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is somewhat long but every sentence adds practical value, with examples and tips well integrated. It could be slightly more concise, but the structure is logical and informative.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 10 parameters and no annotations, the description covers the core usage comprehensively. Some less common parameters are omitted, but the most critical ones are addressed. An output schema exists, so return values do not need to be described.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description compensates by explaining key parameters (sql, start_time, end_time, limit, output_format, record_profile) in detail. However, offset, use_cache, timeout, and include_raw are not described, leaving some gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it runs a full SQL search against OpenObserve logs, with specific examples of row and aggregate queries. It is a specific verb+resource combination that distinguishes it from other search tools like search_around or search_values.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus its siblings. It only provides tips on SQL syntax and field names, but does not clarify when to prefer search_logs over search_around or search_values.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/alilxxey/openobserve-community-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.