run_select_query
Execute SELECT queries on a Hydrolix time-series database using Clickhouse SQL. Optimize queries with performance guards like timestamp filters or LIMIT clauses to ensure timely results.
Instructions
Run a SELECT query in a Hydrolix time-series database using the Clickhouse SQL dialect. Queries run using this tool will timeout after 30 seconds.
The primary key on tables queried this way is always a timestamp. Queries should include either a LIMIT clause or a filter based on the primary key as a performance guard to ensure they return in a reasonable amount of time. Queries should select specific fields and avoid the use of SELECT * to avoid performance issues. The performance guard used for the query should be clearly communicated with the user, and the user should be informed that the query may take a long time to run if the performance guard is not used. When choosing a performance guard, the user's preference should be requested and used if available. When using aggregations, the performance guard should take form of a primary key filter, or else the LIMIT should be applied in a subquery before applying the aggregations.
When matching columns based on substrings, prefix or suffix matches should be used instead of
full-text search whenever possible. When searching for substrings, the syntax column LIKE
'%suffix' or column LIKE 'prefix%' should be used.
Example query. Purpose: get logs from the application.logs table. Primary key: timestamp.
Performance guard: 10 minute recency filter.
SELECT message, timestamp FROM application.logs WHERE timestamp > now() - INTERVAL 10 MINUTES
Example query. Purpose: get the median humidity from the weather.measurements table. Primary
key: date. Performance guard: 1000 row limit, applied before aggregation.
SELECT median(humidity) FROM (SELECT humidity FROM weather.measurements LIMIT 1000)
Example query. Purpose: get the lowest temperature from the weather.measurements table over
the last 10 years. Primary key: date. Performance guard: date range filter.
SELECT min(temperature) FROM weather.measurements WHERE date > now() - INTERVAL 10 YEARS
Example query. Purpose: get the app name with the most log messages from the application.logs
table in the window between new year and valentine's day of 2024. Primary key: timestamp.
Performance guard: date range filter.
SELECT app, count(*) FROM application.logs WHERE timestamp > '2024-01-01' AND timestamp < '2024-02-14' GROUP BY app ORDER BY count(*) DESC LIMIT 1
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes |
Implementation Reference
- mcp_hydrolix/mcp_server.py:201-263 (handler)The primary handler for the 'run_select_query' tool. It uses a thread pool to execute the query asynchronously with timeout handling, retrieves credentials, and formats the result as a dictionary with columns and rows. Includes comprehensive documentation with usage guidelines and examples.@mcp.tool() def run_select_query(query: str): """Run a SELECT query in a Hydrolix time-series database using the Clickhouse SQL dialect. Queries run using this tool will timeout after 30 seconds. The primary key on tables queried this way is always a timestamp. Queries should include either a LIMIT clause or a filter based on the primary key as a performance guard to ensure they return in a reasonable amount of time. Queries should select specific fields and avoid the use of SELECT * to avoid performance issues. The performance guard used for the query should be clearly communicated with the user, and the user should be informed that the query may take a long time to run if the performance guard is not used. When choosing a performance guard, the user's preference should be requested and used if available. When using aggregations, the performance guard should take form of a primary key filter, or else the LIMIT should be applied in a subquery before applying the aggregations. When matching columns based on substrings, prefix or suffix matches should be used instead of full-text search whenever possible. When searching for substrings, the syntax `column LIKE '%suffix'` or `column LIKE 'prefix%'` should be used. Example query. Purpose: get logs from the `application.logs` table. Primary key: `timestamp`. Performance guard: 10 minute recency filter. `SELECT message, timestamp FROM application.logs WHERE timestamp > now() - INTERVAL 10 MINUTES` Example query. Purpose: get the median humidity from the `weather.measurements` table. Primary key: `date`. Performance guard: 1000 row limit, applied before aggregation. `SELECT median(humidity) FROM (SELECT humidity FROM weather.measurements LIMIT 1000)` Example query. Purpose: get the lowest temperature from the `weather.measurements` table over the last 10 years. Primary key: `date`. Performance guard: date range filter. `SELECT min(temperature) FROM weather.measurements WHERE date > now() - INTERVAL 10 YEARS` Example query. Purpose: get the app name with the most log messages from the `application.logs` table in the window between new year and valentine's day of 2024. Primary key: `timestamp`. Performance guard: date range filter. `SELECT app, count(*) FROM application.logs WHERE timestamp > '2024-01-01' AND timestamp < '2024-02-14' GROUP BY app ORDER BY count(*) DESC LIMIT 1` """ logger.info(f"Executing SELECT query: {query}") try: future = QUERY_EXECUTOR.submit(execute_query, query, get_request_credential()) try: result = future.result(timeout=SELECT_QUERY_TIMEOUT_SECS) # Check if we received an error structure from execute_query if isinstance(result, dict) and "error" in result: logger.warning(f"Query failed: {result['error']}") # MCP requires structured responses; string error messages can cause # serialization issues leading to BrokenResourceError return { "status": "error", "message": f"Query failed: {result['error']}", } return result except concurrent.futures.TimeoutError: logger.warning(f"Query timed out after {SELECT_QUERY_TIMEOUT_SECS} seconds: {query}") future.cancel() raise ToolError(f"Query timed out after {SELECT_QUERY_TIMEOUT_SECS} seconds") except ToolError: raise except Exception as e: logger.error(f"Unexpected error in run_select_query: {str(e)}") raise RuntimeError(f"Unexpected error during query execution: {str(e)}")
- mcp_hydrolix/mcp_server.py:180-199 (helper)Helper function that performs the actual query execution on the Hydrolix ClickHouse-compatible client with readonly settings, limits on rows/memory/execution time, and formats the response.def execute_query(query: str, request_credential: Optional[HydrolixCredential]): client = create_hydrolix_client(request_credential) try: res = client.query( query, settings={ "readonly": 1, "hdx_query_max_execution_time": SELECT_QUERY_TIMEOUT_SECS, "hdx_query_max_attempts": 1, "hdx_query_max_result_rows": 100_000, "hdx_query_max_memory_usage": 2 * 1024 * 1024 * 1024, # 2GiB "hdx_query_admin_comment": f"User: {MCP_SERVER_NAME}", }, ) logger.info(f"Query returned {len(res.result_rows)} rows") return {"columns": res.column_names, "rows": res.result_rows} except Exception as err: logger.error(f"Error executing query: {err}") raise ToolError(f"Query execution failed: {str(err)}")
- mcp_hydrolix/__init__.py:1-13 (registration)Exports the run_select_query function along with other tools for use as MCP tools.from .mcp_server import ( create_hydrolix_client, list_databases, list_tables, run_select_query, ) __all__ = [ "list_databases", "list_tables", "run_select_query", "create_hydrolix_client", ]