Tools module for the MCP server (Enhanced Version with Robust Genie API Integration).
This module defines all the tools (functions) that the MCP server exposes to clients.
Tools are the core functionality of an MCP server - they are callable functions that
AI assistants and other clients can invoke to perform specific actions.
Each tool should:
- Have a clear, descriptive name
- Include comprehensive docstrings (used by AI to understand when to call the tool)
- Return structured data (typically dict or list)
- Handle errors gracefully
=== COMPREHENSIVE EDGE CASE HANDLING ===
This implementation includes error handling for each Databricks Genie API endpoint it calls:
1. START CONVERSATION ENDPOINT (/api/2.0/genie/spaces/{space_id}/start-conversation)
Edge Cases Handled:
- Empty or invalid query content
- Query length validation (max 10,000 characters)
- Invalid conversation_id format when continuing conversations
- Missing or malformed response data
- Authentication failures (401)
- Permission denied errors (403)
- Resource not found (404)
- Rate limiting (429) with automatic retry
- Service unavailability (503) with exponential backoff
2. GET MESSAGE ENDPOINT (/api/2.0/genie/spaces/{space_id}/conversations/{conversation_id}/messages/{message_id})
Edge Cases Handled:
- Invalid conversation_id or message_id format
- All message statuses: SUBMITTED, EXECUTING, COMPLETED, FAILED, CANCELLED, ERROR, and UNKNOWN states
- Polling timeouts with configurable max_wait_seconds
- Message failures with detailed error extraction from attachments
- Cancelled and errored messages with proper error context
- Network timeouts and connection errors during polling
3. GET QUERY RESULT ENDPOINT (/api/2.0/genie/spaces/{space_id}/conversations/{conversation_id}/messages/{message_id}/attachments/{attachment_id}/query-result)
Edge Cases Handled:
- Invalid attachment_id or non-query attachments
- All SQL statement execution states: PENDING, RUNNING, SUCCEEDED, FAILED, CANCELLED, CLOSED
- Missing or malformed statement_response data
- Chunked results with metadata about total chunks and offsets
- Query execution failures with detailed error codes and messages
- Empty result sets
- Truncated results indication
4. AUTHENTICATION & NETWORK
Edge Cases Handled:
- M2M OAuth authentication failures
- Expired or invalid tokens
- Connection timeouts (configurable, default 30s)
- Network connection errors
- DNS resolution failures
- SSL/TLS errors
5. API-LEVEL ERRORS
The following Databricks API error codes are handled:
- BAD_REQUEST (400): Invalid parameters, not retryable
- UNAUTHENTICATED (401): Missing/invalid credentials, not retryable
- PERMISSION_DENIED (403): Insufficient permissions, not retryable
- RESOURCE_NOT_FOUND (404): Resource doesn't exist, not retryable
- RESOURCE_EXHAUSTED (429): Rate limit exceeded, retryable with backoff
- INTERNAL_ERROR (500): Server error, retryable with backoff
- UNAVAILABLE (503): Service unavailable, retryable with backoff
6. RETRY LOGIC & RESILIENCE
- Exponential backoff for transient errors (max 3 retries)
- Initial delay: 1 second, doubling on each retry
- Automatic retry only for recoverable errors (5xx, 429, timeouts, connection errors)
- Non-retryable errors fail fast (4xx client errors except 429)
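The retry policy above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the module's actual code: `call_with_backoff`, `TransientError`, and the injectable `sleep` parameter are hypothetical names introduced here, and reading "max 3 retries" as three attempts in total is an assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for recoverable failures (5xx, 429, timeouts)."""


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 3,        # assumption: three attempts in total
    initial_delay: float = 1.0,   # initial delay quoted in the list above
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying transient failures with a doubling delay."""
    for attempt in range(max_attempts):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(initial_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    raise RuntimeError("max_attempts must be at least 1")
```

Any exception other than `TransientError` propagates immediately, which mirrors the fail-fast rule for non-retryable client errors. Passing `sleep` as a parameter keeps the sketch testable without real delays.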
7. DATA VALIDATION & SANITIZATION
- Input validation for all parameters
- Type checking for conversation_id, message_id, attachment_id
- Length validation for queries
- Format validation for UUIDs
- Null/empty checks for all user inputs
- Safe extraction of nested data structures with fallback defaults
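As an illustration of the validation rules above, the sketch below checks a query and an ID before any request is sent. The helper names are hypothetical, and the ID pattern (32 lowercase hexadecimal characters) is an assumption inferred from the IDs shown in this module's usage examples, not a documented format.

```python
import re
from typing import Optional

MAX_QUERY_LENGTH = 10_000  # limit quoted in the edge-case list above

# Assumption: IDs are 32 lowercase hex characters (a UUID without dashes).
_ID_PATTERN = re.compile(r"[0-9a-f]{32}")


def validate_query(query: object) -> Optional[str]:
    """Return an error description, or None when the query is acceptable."""
    if not isinstance(query, str) or not query.strip():
        return "query must be a non-empty string"
    if len(query) > MAX_QUERY_LENGTH:
        return f"query exceeds {MAX_QUERY_LENGTH} characters"
    return None


def validate_id(value: object, name: str) -> Optional[str]:
    """Return an error description, or None when the ID looks well formed."""
    if not isinstance(value, str) or not _ID_PATTERN.fullmatch(value):
        return f"{name} must be a 32-character hexadecimal string"
    return None
```

Returning a description instead of raising lets the caller wrap the text in an INVALID_INPUT error response.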
8. ATTACHMENT HANDLING
- Multiple attachment types: text, query, suggested_questions, error
- Graceful handling of missing attachment fields
- Deduplication of suggested questions
- Empty attachment filtering
- Type checking for all attachment components
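A minimal sketch of the deduplication and type checks described above. The attachment layout used here (a `suggested_questions` object holding a `questions` list) is an assumption made for illustration, and `collect_suggested_questions` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def collect_suggested_questions(attachments: List[Dict[str, Any]]) -> List[str]:
    """Gather unique, non-empty suggested questions in first-seen order."""
    seen = set()
    questions: List[str] = []
    for attachment in attachments or []:
        if not isinstance(attachment, dict):
            continue  # skip malformed attachment entries
        block = attachment.get("suggested_questions")
        if not isinstance(block, dict):
            continue  # not a suggested-questions attachment
        for question in block.get("questions") or []:
            if isinstance(question, str) and question.strip() and question not in seen:
                seen.add(question)
                questions.append(question)
    return questions
```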
9. LARGE RESULT SETS
- Chunked result detection and metadata
- Truncation indicators
- Row count and byte count tracking
- Chunk offset information for pagination
10. POLLING STRATEGY
- Configurable polling intervals (default: 2 seconds)
- Maximum attempts limit (default: 30 attempts, or fewer when max_wait_seconds is reached first)
- Terminal state detection to stop polling early
- Detailed poll attempt tracking in responses
- Status validation against known states
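The polling strategy above can be sketched as a small loop. This is an illustration only: `poll_until_terminal` and its `fetch_message` callback are hypothetical, and treating COMPLETED, FAILED, CANCELLED, and ERROR as the terminal states is an assumption based on the message statuses listed earlier.

```python
import time
from typing import Any, Callable, Dict

# Assumption: these statuses end polling; anything else means "keep waiting".
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED", "ERROR"}


def poll_until_terminal(
    fetch_message: Callable[[], Dict[str, Any]],
    poll_interval: float = 2.0,   # default interval quoted above
    max_attempts: int = 30,       # default attempt limit quoted above
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until a terminal status is seen or the attempt budget runs out."""
    status = "UNKNOWN"
    for attempt in range(1, max_attempts + 1):
        message = fetch_message()
        status = message.get("status", "UNKNOWN")
        if status in TERMINAL_STATUSES:
            return {"status": status, "poll_attempts": attempt, "message": message}
        if attempt < max_attempts:
            sleep(poll_interval)  # wait before the next poll
    return {
        "error": "TIMEOUT",
        "message": f"No terminal status after {max_attempts} polls",
        "status": status,
        "poll_attempts": max_attempts,
    }
```

The returned dictionary records how many polls were made, in the spirit of the poll attempt tracking mentioned above.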
=== USAGE EXAMPLES ===
Example 1: Submit a query and poll separately
    # Step 1: Submit query
    result = query_space_01f0d08866f11370b6735facce14e3ff(
        query="What datasets are available?"
    )
    # Returns: {"conversation_id": "...", "message_id": "...", "status": "SUBMITTED"}
    # Step 2: Poll for results
    poll_result = poll_response_01f0d08866f11370b6735facce14e3ff(
        conversation_id=result["conversation_id"],
        message_id=result["message_id"]
    )
    # Returns full results including text responses, queries, and data
Example 2: Continue conversation
    result = query_space_01f0d08866f11370b6735facce14e3ff(
        query="What about stock AAPL?",
        conversation_id="01f0e34ce9641238a5018229451c2ff2"
    )
Example 3: Fetch specific query results
    result = get_query_result_01f0d08866f11370b6735facce14e3ff(
        conversation_id="01f0e34ce9641238a5018229451c2ff2",
        message_id="01f0e34ce97a157983ba500ee38047ea",
        attachment_id="01f0e35763041059b7102eca6703d021"
    )
=== ERROR RESPONSE FORMAT ===
All functions return errors in a consistent format:
    {
        "error": "ERROR_CODE",
        "message": "Human-readable error description",
        "status": "current_status",  # If applicable
        ... additional context fields ...
    }
Common error codes:
- INVALID_INPUT: Parameter validation failed
- QUERY_FAILED: Query submission failed
- POLL_FAILED: Polling operation failed
- FETCH_FAILED: Failed to fetch results
- MESSAGE_FAILED: Genie message processing failed
- MESSAGE_CANCELLED: Message was cancelled
- MESSAGE_ERROR: Error during message processing
- TIMEOUT: Operation timed out
- QUERY_EXECUTION_FAILED: SQL execution failed
- QUERY_CANCELLED: SQL query was cancelled
- RESOURCE_NOT_FOUND: Resource doesn't exist
- PERMISSION_DENIED: Insufficient permissions
- UNAUTHENTICATED: Authentication failed
- RESOURCE_EXHAUSTED: Rate limit exceeded
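To make the error format concrete, here is a small sketch of a builder and a check for it. Both `make_error` and `is_error` are hypothetical helpers introduced for illustration; the source does not show how the module constructs these dictionaries.

```python
from typing import Any, Dict, Optional


def make_error(
    code: str,
    message: str,
    status: Optional[str] = None,
    **context: Any,
) -> Dict[str, Any]:
    """Build an error response in the documented shape."""
    error: Dict[str, Any] = {"error": code, "message": message}
    if status is not None:
        error["status"] = status  # included only when applicable
    for key, value in context.items():
        error.setdefault(key, value)  # context never overwrites core fields
    return error


def is_error(response: Dict[str, Any]) -> bool:
    """A response is an error when it carries an "error" key."""
    return "error" in response
```

A caller could then branch on `is_error(result)` before reading fields such as `conversation_id`.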
=== TESTING RECOMMENDATIONS ===
To thoroughly test this implementation:
1. Test with empty/invalid queries
2. Test with very long queries (>10k chars)
3. Test conversation continuation with invalid IDs
4. Test polling timeout scenarios
5. Test with non-existent space_id
6. Test network failure scenarios
7. Test rate limiting by rapid requests
8. Test with queries that generate errors in Genie
9. Test chunked results
10. Test invalid attachment IDs

=== MODULE CONFIGURATION ===
Imported names: Any, Optional, load_dotenv, WorkspaceClient, utils
Environment variables: DATABRICKS_HOST (combined with the "https://" prefix), DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET
Message status descriptions:
- SUBMITTED: Message has been submitted and is waiting to be processed
- EXECUTING: Message is currently being processed
- COMPLETED: Message processing completed successfully
- FAILED: Message processing failed
- CANCELLED: Message processing was cancelled
- ERROR: An error occurred during message processing
Statement state descriptions:
- PENDING: Statement is queued for execution
- RUNNING: Statement is currently executing
- SUCCEEDED: Statement executed successfully
- FAILED: Statement execution failed
- CANCELLED: Statement execution was cancelled
- CLOSED: Statement execution was closed
API error code descriptions:
- BAD_REQUEST: Invalid request parameters
- RESOURCE_NOT_FOUND: The requested resource does not exist
- PERMISSION_DENIED: Insufficient permissions to access resource
- UNAUTHENTICATED: Authentication credentials are missing or invalid
- RESOURCE_EXHAUSTED: Rate limit exceeded or quota exhausted
- INTERNAL_ERROR: Internal server error occurred
- UNAVAILABLE: Service temporarily unavailable

=== _get_workspace_client() ===
Create and return an authenticated WorkspaceClient using M2M OAuth.
Returns:
WorkspaceClient: Authenticated client for Databricks API calls
Raises:
Exception: If authentication fails
The client is created with host=WORKSPACE_URL, client_id=M2M_CLIENT_ID, client_secret=M2M_CLIENT_SECRET, and auth_type="oauth-m2m". Any failure is re-raised as "Failed to authenticate with Databricks: <error>".

=== _make_api_request(method, url, headers, json_payload, timeout, retry_on_failure) ===
Make an API request with comprehensive error handling and automatic retries.
This function implements:
- Exponential backoff for transient errors
- Detailed error classification
- HTTP status code handling
- API-level error detection
- Retry logic for recoverable failures
Args:
method: HTTP method (GET, POST, etc.)
url: Full URL for the request
headers: Request headers
json_payload: Optional JSON payload for POST requests
timeout: Request timeout in seconds
retry_on_failure: Whether to retry on transient failures
Returns:
dict: Response JSON as dictionary
Raises:
Exception: If request fails after all retries or returns unrecoverable error
Error messages raised:
- Unsupported HTTP method: <method> (only GET and POST are supported)
- BAD_REQUEST: <message>
- UNAUTHENTICATED: Authentication credentials are missing or invalid
- PERMISSION_DENIED: <message> (default: Insufficient permissions)
- RESOURCE_NOT_FOUND: <message> (default: Resource not found)
- RESOURCE_EXHAUSTED: Rate limit exceeded. Please try again later
- INTERNAL_ERROR: Internal server error occurred
- UNAVAILABLE: Service temporarily unavailable. Please try again later
- Invalid JSON response. Status code: <status_code>
API-level errors are detected by the presence of an error_code field in the response body; message (default: Unknown error), error_code (default: UNKNOWN), and details are read from it.