query_data_lake
Execute SQL queries on Panther's security data lake to analyze logs, investigate threats, and perform security monitoring with required time filters for optimal performance.
Instructions
Query Panther's security data lake using SQL for log analysis and threat hunting.
REQUIRED: Include time filter with p_event_time (required for performance and partitioning)
Panther Time Filter Macros (Recommended - optimized for performance):
p_occurs_since(timeOffset [, tableAlias[, column]]) Examples: p_occurs_since('1 d'), p_occurs_since('6 h'), p_occurs_since('2 weeks'), p_occurs_since(3600) Time formats: '30 s', '15 m', '6 h', '2 d', '1 w' OR '2 weeks', '1 day' OR numeric seconds
p_occurs_between(startTime, endTime [, tableAlias[, column]]) Examples: p_occurs_between('2024-01-01', '2024-01-02'), p_occurs_between('2024-03-20T00:00:00Z', '2024-03-20T23:59:59Z')
p_occurs_around(timestamp, timeOffset [, tableAlias[, column]]) Example: p_occurs_around('2024-01-15T10:30:00Z', '1 h') # ±1 hour around timestamp
p_occurs_after(timestamp [, tableAlias[, column]])
p_occurs_before(timestamp [, tableAlias[, column]])
Alternative (manual): WHERE p_event_time >= '2024-01-01' AND p_event_time < '2024-01-02'
Best Practices:
Always use time filters (macros preferred over manual p_event_time conditions)
Start with summary queries, then drill down to specific timeframes
Use p_any_* fields for faster correlation (p_any_ip_addresses, p_any_usernames, p_any_emails)
Query specific fields instead of SELECT * for better performance
Pagination:
First call: No cursor parameter - returns first page with max_rows results
Subsequent calls: Use next_cursor from previous response to get next page
Continue until has_next_page is False
Common Examples:
Recent failed logins: "SELECT * FROM panther_logs.public.aws_cloudtrail WHERE p_occurs_since('1 d') AND errorcode IS NOT NULL"
IP activity summary: "SELECT sourceippaddress, COUNT(*) FROM panther_logs.public.aws_cloudtrail WHERE p_occurs_since('6 h') GROUP BY sourceippaddress"
User correlation: "SELECT * FROM panther_logs.public.aws_cloudtrail WHERE p_occurs_since('2 h') AND ARRAY_CONTAINS('user@domain.com'::VARIANT, p_any_emails)"
Nested field access: "SELECT p_enrichment:ipinfo_privacy:"context.ip_address" FROM table WHERE p_occurs_since('1 h')"
Query Syntax (Snowflake SQL):
Access nested JSON: column:field.subfield
Quote special characters: column:"field name" or p_enrichment:"context.ip_address"
Array searches: ARRAY_CONTAINS('value'::VARIANT, array_column)
Returns: Dict with query results: - results: List of matching rows (paginated based on cursor parameter) - results_truncated: True if results were truncated (only for non-paginated requests) - total_rows_available: Total rows found (for non-paginated requests) - has_next_page: True if more results are available - next_cursor: Cursor for next page (use in subsequent call) - column_info: Column names and data types - stats: Query performance metrics (execution time, bytes scanned) - success/status/message: Query execution status
Permissions:{'all_of': ['Query Data Lake']}
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| sql | Yes | The SQL query to execute. Must include a p_event_time filter condition after WHERE or AND. The query must be compatible with Snowflake SQL. | |
| database_name | No | panther_logs.public | |
| timeout | No | Timeout in seconds before the SQL query is cancelled. If the query fails due to timeout, the caller should consider a longer timeout. | |
| max_rows | No | Maximum number of result rows to return (prevents context overflow) | |
| cursor | No | Optional pagination cursor from previous query to fetch next page of results |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
No arguments | |||