Skip to main content
Glama
hydrolix

mcp-hydrolix

Official

run_select_query

Execute SELECT queries on a Hydrolix time-series database using Clickhouse SQL. Optimize queries with performance guards like timestamp filters or LIMIT clauses to ensure timely results.

Instructions

Run a SELECT query in a Hydrolix time-series database using the Clickhouse SQL dialect. Queries run using this tool will timeout after 30 seconds.

The primary key on tables queried this way is always a timestamp. Queries should include either a LIMIT clause or a filter based on the primary key as a performance guard to ensure they return in a reasonable amount of time. Queries should select specific fields and avoid the use of SELECT * to avoid performance issues. The performance guard used for the query should be clearly communicated with the user, and the user should be informed that the query may take a long time to run if the performance guard is not used. When choosing a performance guard, the user's preference should be requested and used if available. When using aggregations, the performance guard should take form of a primary key filter, or else the LIMIT should be applied in a subquery before applying the aggregations.

When matching columns based on substrings, prefix or suffix matches should be used instead of full-text search whenever possible. When searching for substrings, the syntax column LIKE '%suffix' or column LIKE 'prefix%' should be used.

Example query. Purpose: get logs from the application.logs table. Primary key: timestamp. Performance guard: 10 minute recency filter.

SELECT message, timestamp FROM application.logs WHERE timestamp > now() - INTERVAL 10 MINUTES

Example query. Purpose: get the median humidity from the weather.measurements table. Primary key: date. Performance guard: 1000 row limit, applied before aggregation.

SELECT median(humidity) FROM (SELECT humidity FROM weather.measurements LIMIT 1000)

Example query. Purpose: get the lowest temperature from the weather.measurements table over the last 10 years. Primary key: date. Performance guard: date range filter.

SELECT min(temperature) FROM weather.measurements WHERE date > now() - INTERVAL 10 YEARS

Example query. Purpose: get the app name with the most log messages from the application.logs table in the window between new year and valentine's day of 2024. Primary key: timestamp. Performance guard: date range filter. SELECT app, count(*) FROM application.logs WHERE timestamp > '2024-01-01' AND timestamp < '2024-02-14' GROUP BY app ORDER BY count(*) DESC LIMIT 1

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
queryYes

Implementation Reference

  • The primary handler for the 'run_select_query' tool. It uses a thread pool to execute the query asynchronously with timeout handling, retrieves credentials, and formats the result as a dictionary with columns and rows. Includes comprehensive documentation with usage guidelines and examples.
    @mcp.tool()
    def run_select_query(query: str):
        """Run a SELECT query in a Hydrolix time-series database using the Clickhouse SQL dialect.
        Queries run using this tool will timeout after 30 seconds.
    
        The primary key on tables queried this way is always a timestamp. Queries should include either
        a LIMIT clause or a filter based on the primary key as a performance guard to ensure they return
        in a reasonable amount of time. Queries should select specific fields and avoid the use of
        SELECT * to avoid performance issues. The performance guard used for the query should be clearly
        communicated with the user, and the user should be informed that the query may take a long time
        to run if the performance guard is not used. When choosing a performance guard, the user's
        preference should be requested and used if available. When using aggregations, the performance
        guard should take form of a primary key filter, or else the LIMIT should be applied in a
        subquery before applying the aggregations.
    
        When matching columns based on substrings, prefix or suffix matches should be used instead of
        full-text search whenever possible. When searching for substrings, the syntax `column LIKE
        '%suffix'` or `column LIKE 'prefix%'` should be used.
    
        Example query. Purpose: get logs from the `application.logs` table. Primary key: `timestamp`.
        Performance guard: 10 minute recency filter.
    
        `SELECT message, timestamp FROM application.logs WHERE timestamp > now() - INTERVAL 10 MINUTES`
    
        Example query. Purpose: get the median humidity from the `weather.measurements` table. Primary
        key: `date`. Performance guard: 1000 row limit, applied before aggregation.
    
         `SELECT median(humidity) FROM (SELECT humidity FROM weather.measurements LIMIT 1000)`
    
        Example query. Purpose: get the lowest temperature from the `weather.measurements` table over
        the last 10 years. Primary key: `date`. Performance guard: date range filter.
    
        `SELECT min(temperature) FROM weather.measurements WHERE date > now() - INTERVAL 10 YEARS`
    
        Example query. Purpose: get the app name with the most log messages from the `application.logs`
        table in the window between new year and valentine's day of 2024. Primary key: `timestamp`.
        Performance guard: date range filter.
         `SELECT app, count(*) FROM application.logs WHERE timestamp > '2024-01-01' AND timestamp < '2024-02-14' GROUP BY app ORDER BY count(*) DESC LIMIT 1`
        """
        logger.info(f"Executing SELECT query: {query}")
        try:
            future = QUERY_EXECUTOR.submit(execute_query, query, get_request_credential())
            try:
                result = future.result(timeout=SELECT_QUERY_TIMEOUT_SECS)
                # Check if we received an error structure from execute_query
                if isinstance(result, dict) and "error" in result:
                    logger.warning(f"Query failed: {result['error']}")
                    # MCP requires structured responses; string error messages can cause
                    # serialization issues leading to BrokenResourceError
                    return {
                        "status": "error",
                        "message": f"Query failed: {result['error']}",
                    }
                return result
            except concurrent.futures.TimeoutError:
                logger.warning(f"Query timed out after {SELECT_QUERY_TIMEOUT_SECS} seconds: {query}")
                future.cancel()
                raise ToolError(f"Query timed out after {SELECT_QUERY_TIMEOUT_SECS} seconds")
        except ToolError:
            raise
        except Exception as e:
            logger.error(f"Unexpected error in run_select_query: {str(e)}")
            raise RuntimeError(f"Unexpected error during query execution: {str(e)}")
  • Helper function that performs the actual query execution on the Hydrolix ClickHouse-compatible client with readonly settings, limits on rows/memory/execution time, and formats the response.
    def execute_query(query: str, request_credential: Optional[HydrolixCredential]):
        client = create_hydrolix_client(request_credential)
        try:
            res = client.query(
                query,
                settings={
                    "readonly": 1,
                    "hdx_query_max_execution_time": SELECT_QUERY_TIMEOUT_SECS,
                    "hdx_query_max_attempts": 1,
                    "hdx_query_max_result_rows": 100_000,
                    "hdx_query_max_memory_usage": 2 * 1024 * 1024 * 1024,  # 2GiB
                    "hdx_query_admin_comment": f"User: {MCP_SERVER_NAME}",
                },
            )
            logger.info(f"Query returned {len(res.result_rows)} rows")
            return {"columns": res.column_names, "rows": res.result_rows}
        except Exception as err:
            logger.error(f"Error executing query: {err}")
            raise ToolError(f"Query execution failed: {str(err)}")
  • Exports the run_select_query function along with other tools for use as MCP tools.
    from .mcp_server import (
        create_hydrolix_client,
        list_databases,
        list_tables,
        run_select_query,
    )
    
    __all__ = [
        "list_databases",
        "list_tables",
        "run_select_query",
        "create_hydrolix_client",
    ]
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It thoroughly describes key behaviors: timeout after 30 seconds, performance requirements (e.g., need for LIMIT or primary key filters), query optimization tips (e.g., avoiding SELECT *), and user communication guidelines. This covers operational constraints and expectations beyond basic functionality.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with essential information (purpose, timeout, performance guards) but includes extensive examples that, while helpful, make it lengthy. Every sentence adds value, but the multiple detailed examples could be condensed or summarized more efficiently to improve conciseness without losing critical guidance.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of SQL query execution and the lack of annotations and output schema, the description is highly complete. It covers purpose, usage guidelines, behavioral traits, parameter semantics, and provides practical examples. This ensures the agent has sufficient context to invoke the tool correctly, despite the absence of structured metadata.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage for the single parameter 'query', but the description compensates fully by explaining the parameter's semantics. It details that queries must be in Clickhouse SQL dialect, include performance guards, avoid SELECT *, and use specific syntax for substring matches, supported by multiple examples that illustrate proper query construction and formatting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool's purpose: 'Run a SELECT query in a Hydrolix time-series database using the Clickhouse SQL dialect.' It specifies the verb ('Run'), resource ('SELECT query'), and database context, distinguishing it from sibling tools like list_databases and list_tables, which are for listing rather than querying.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool, including performance considerations like timeout limits (30 seconds), necessity of LIMIT clauses or primary key filters, and avoidance of SELECT *. It also advises on alternatives like substring matching techniques and includes detailed examples for different scenarios, clearly defining usage contexts and best practices.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Related Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/hydrolix/mcp-hydrolix'

If you have feedback or need assistance with the MCP directory API, please join our Discord server