Skip to main content
Glama
Teradata

Teradata MCP Server

Official
by Teradata

Server Configuration

Describes the environment variables required to run the server.

NameRequiredDescriptionDefault
DATABASE_URIYesTeradata database connection URI in the format: teradata://username:password@host:1025/username
MCP_TRANSPORTNoTransport protocol for MCP communication

Capabilities

Features and capabilities supported by this server

CapabilityDetails
tools
{
  "listChanged": true
}
logging
{}
prompts
{
  "listChanged": false
}
resources
{
  "subscribe": false,
  "listChanged": false
}
extensions
{
  "io.modelcontextprotocol/ui": {}
}
experimental
{}

Tools

Functions exposed to the LLM to take actions

NameDescription
base_columnMetadataA

Retrieve detailed technical column metadata for Teradata tables and views, including exact Teradata type codes, character sets (LATIN/UNICODE), decimal precision, scale, nullability, and index classification. Use when the user needs precise Teradata-specific column information, not just basic column names and types. For a simple list of columns and types for a single object, use base_columnDescription instead. Supports bulk retrieval across many objects with payload and time budgets.

Resolution paths: Tables (T, O, Q) — DBC.ColumnsVX + DBC.IndicesVX. No HELP COLUMN. Views (V) — HELP COLUMN with derived-table wrapper, the only reliable mechanism for resolving view column types.

Uses the native TeradataConnection cursor pattern, consistent with all other tools in this module.

Technical capabilities:

  • Exact Teradata type codes and their SQL type string equivalents

  • Character set information (LATIN, UNICODE, etc.)

  • Decimal precision and scale

  • Detection of broken/invalid views

  • Column-level metadata for all objects in a database at once

LARGE-SCALE USAGE GUIDANCE:

When retrieving metadata for many objects (e.g. all views in DBC), both the response payload and the execution time can exceed limits. Use these strategies to control both:

  1. FILTER FIELDS: Pass only the columns you need via the fields parameter. View rows via HELP COLUMN return ~49 fields by default; table rows via DBC.ColumnsVX return fewer. Trimming to 6-8 fields can reduce payload by 80%+. Three computed fields (ColumnTypeString, IndexTypeString, CharSetString) are always included automatically. Example: fields='ColumnName,ColumnType,ColumnLength,CharType, UpperCase,Nullable,Indexed?,Primary?,Unique?'

  2. EXCLUDE OBJECTS: Use exclude_objects to skip objects you do not need. Accepts SQL LIKE patterns (% wildcard) as a CSV. Applied before any metadata queries, so excluded objects consume zero time and zero payload. Example: exclude_objects='ResUsage%,%ResUsage%,Res%View'

  3. INCREASE PARALLELISM: Set max_workers to 12-16 for large databases. Each worker gets its own Teradata session via conn.cursor(). Default is 8.

  4. FILTER BY KIND: Use table_kind to limit to just the object types you need (e.g. 'V' for views only, 'T' for tables only).

  5. PAYLOAD BUDGET: Use max_payload_kb (default 900) to set the maximum response payload size in kilobytes. When the accumulated result data approaches this limit, the tool stops collecting and returns what it has, plus a remaining_objects CSV in metadata listing the unprocessed objects. Pass that CSV straight into object_name on the next call for automatic continuation. This self-adapts to object sizes: small-column views fit more per call, large-column views page earlier.

  6. TIME BUDGET: Use max_execution_seconds (default 180) to set the maximum wall-clock execution time. The tool monitors elapsed time as each object completes, and self-interrupts BEFORE the MCP transport timeout (typically 240s) kills the session without returning any data. When the time budget is reached, the tool returns all data collected so far plus remaining_objects for continuation — exactly the same pattern as payload budget. This is the key difference from an MCP timeout: a timeout returns NOTHING; a time budget returns EVERYTHING collected so far, plus a continuation token.

CONTINUATION PATTERN (automatic pagination): # Call 1 — starts processing, time or payload budget fills up result1 = base_columnMetadata(database_name='DBC', table_kind='V', ...) # metadata contains: remaining_objects='ViewX,ViewY,...'

# Call 2 — pass remaining_objects as object_name
result2 = base_columnMetadata(
    database_name='DBC',
    object_name='ViewX,ViewY,...',  # from result1 metadata
    ...
)
# Repeat until metadata has no remaining_objects key.

Typical call for a large database: base_columnMetadata( database_name='DBC', table_kind='V', exclude_objects='ResUsage%,%ResUsage%', fields='ColumnName,ColumnType,ColumnLength,CharType, UpperCase,Nullable,Indexed?,Primary?,Unique?', max_workers=16, max_payload_kb=900, max_execution_seconds=180 )

Arguments: conn - TeradataConnection (injected by MCP server) database_name - Name of the Teradata database to inspect object_name - Optional: specific object name, or a CSV of names. Also used for continuation: pass the remaining_objects value from a previous truncated call to resume. If omitted, all objects matching table_kind are processed. table_kind - Optional: CSV of TableKind codes to filter by. Examples: 'V' (views only), 'T,O' (tables + NoPI), 'T,V' (tables and views). Defaults to all qualifying object types (T, O, V, Q). Tables (T, O, Q) use DBC.ColumnsVX + DBC.IndicesVX. Views (V) use HELP COLUMN with a derived-table wrapper to force type resolution — this is the only reliable mechanism for view column types. Stored procedures (P, E), functions (A, F, R, B, S), and macros (M) are not supported. DBC.ColumnsVX does return parameter rows for these object types, but their parameter semantics (IN/OUT/INOUT, SPParameterType) are incompatible with the column metadata model this tool produces. Support is a planned future enhancement. max_workers - Optional: number of parallel threads for view resolution via HELP COLUMN. Default: 8. Table metadata is retrieved via DBC.ColumnsVX and DBC.IndicesVX within the same worker pool. fields - Optional: CSV of field names to include in the response. Reduces payload size significantly. Computed fields (ObjectName, ColumnTypeString, IndexTypeString, CharSetString) always included. exclude_objects - Optional: CSV of object name patterns to exclude. Uses SQL LIKE-style % wildcards. Applied before any database calls — excluded objects incur zero query cost. max_payload_kb - Optional: maximum response payload budget in KB. Default: 900. Set to 0 to disable. max_execution_seconds - Optional: maximum wall-clock execution time in seconds. Default: 180. Set to 0 to disable. *args - Positional bind parameters (reserved) **kwargs - Named bind parameters (reserved)

Returns: MCP-compliant response via create_response() containing a list of column metadata records with normalised keys and four computed string fields per column:

    ColumnTypeString      - Human-readable SQL type (e.g. "VARCHAR(200)
                            UNICODE", "DECIMAL(18,2)", "INTEGER")
    IndexTypeString       - Index classification: 'UPI', 'NUPI', 'USI',
                            'NUSI', or None if not indexed.
                            For tables (T, O, Q): sourced from
                            DBC.IndicesVX — composite index grouping
                            (IndexNumber + ColumnPosition) is fully
                            preserved.
                            For views (V): sourced from HELP COLUMN
                            flags — reports column participation only,
                            not composite index grouping. Query
                            DBC.IndicesVX against the base table for
                            full composite index detail.
    CharSetString         - Character set name: 'LATIN', 'UNICODE',
                            'KANJI1', 'GRAPHIC', 'KANJISJIS', or None.
    CaseSpecificityString - Case attribute: 'UPPERCASE', 'CASESPECIFIC',
                            'NOT CASESPECIFIC', or None if no explicit
                            case attribute is defined on the column.

When truncated, metadata will include:
    remaining_objects  - CSV of unprocessed object names
    truncated          - True
    truncation_reason  - 'time_budget_exceeded' or
                         'payload_budget_exceeded'
    elapsed_seconds    - Wall-clock time consumed (always present)
base_readQueryA

Execute a user-provided SQL query against Teradata and return the results. Use this tool ONLY when the user supplies an explicit SQL statement or a request that includes filter conditions (WHERE clause, aggregations, JOINs, etc.). Do NOT use for simply browsing or sampling rows from a table — use base_tablePreview for that. The sql parameter is required and must contain the full SQL text.

Arguments: sql - SQL text, with optional bind-parameter placeholders persist - Set to True to persist the results as a table and reuse it later. Recommended for large result sets. row_limit - Maximum rows to return (default 1000, ceiling 50000). Pass a higher value when you need more rows.

When the response metadata contains 'truncated: true', more rows exist beyond the limit. To get more data:

  • Pass a higher row_limit (up to 50000) to retrieve more rows in the response.

  • Use persist=true to write all rows to a volatile table and query it directly — this bypasses the row limit entirely and is the recommended approach for large result sets.

Returns: ResponseType: formatted response with query results + metadata (includes 'volatile_table' field in metadata if persist=True) (includes 'truncated' and 'row_limit' in metadata when results are capped)

base_saveDDLA

Extract the DDL for a Teradata table, view, or stored procedure and SAVE it as a .sql file on disk. Use this tool ONLY when the user explicitly wants to export, write, download, or persist DDL to a file. Do NOT use simply to display or view DDL in the conversation — use base_tableDDL to display DDL without saving.

Arguments: database_name - Database name (e.g., 'MKTG_USR') table_name - Object name (e.g., 'SP_LOAD_VARIABLES_ARGUMENTARIO_IAG_FICHA_CLIENTE'). Accepts comma-separated values for bulk retrieval. object_type - Type of object: 'PROCEDURE', 'TABLE', 'VIEW' (default: 'PROCEDURE') output_dir - Directory where to save the DDL file (default: './ddls_extracted')

Returns: ResponseType: formatted response with file path, size, and metadata

plot_line_chartA

Generate a line chart that reads directly from a Teradata table — do NOT use base_readQuery to pre-fetch data first. Specify the table in table_name, the x-axis column in labels (typically a date or time field), and one or more y-axis numeric columns in columns. Use for time-series, trend lines, or sequential data. Do NOT use for proportional category breakdowns — use plot_pie_chart or plot_polar_chart. Do NOT use for multi-dimensional spider comparisons — use plot_radar_chart.

PARAMETERS: table_name: Required Argument. Specifies the name of the table to generate the line chart. Types: str

labels:
    Required Argument.
    Specifies the x-axis column (typically date or time).
    Types: str

columns:
    Required Argument.
    Specifies the y-axis numeric column(s) for the line chart.
    Types: List[str]

RETURNS: dict

plot_pie_chartA

Generate a pie chart that reads directly from a Teradata table — do NOT use base_readQuery to pre-fetch or aggregate data first. Specify the table in table_name, the category column in labels, and the numeric value column in column. Use when the user asks for proportions, shares, or how a total breaks down by category. For polar area charts, use plot_polar_chart. For time-series trends, use plot_line_chart.

PARAMETERS: table_name: Required Argument. Specifies the name of the table to generate the pie chart. Types: str

labels:
    Required Argument.
    Specifies the category column for labels.
    Types: str

column:
    Required Argument.
    Specifies the numeric value column for the pie chart.
    Types: str

RETURNS: dict

plot_polar_chartA

Generate a polar area chart that reads directly from a Teradata table — do NOT use base_readQuery first. Specify the table in table_name, the category column in labels, and the numeric value column in column. Use when the user explicitly asks for a polar chart or polar area chart. For standard pie-style breakdowns, use plot_pie_chart instead.

PARAMETERS: table_name: Required Argument. Specifies the name of the table to generate the polar chart. Types: str

labels:
    Required Argument.
    Specifies the category column for labels.
    Types: str

column:
    Required Argument.
    Specifies the numeric value column for the polar chart.
    Types: str

RETURNS: dict

plot_radar_chartA

Generate a radar chart (spider chart or web chart) that reads directly from a Teradata table — do NOT use base_readQuery to pre-fetch data first. Specify the table in table_name, the category column in labels, and one or more value columns in columns. Use when the user asks for a spider chart, radar chart, web chart, or multi-dimensional comparison across categories. For time-series or trend data, use plot_line_chart instead.

PARAMETERS: table_name: Required Argument. Specifies the name of the table to generate the radar chart. Types: str

labels:
    Required Argument.
    Specifies the category column for labels.
    Types: str

columns:
    Required Argument.
    Specifies the value column(s) for the radar chart.
    Types: str | List[str]

RETURNS: dict

sql_Analyze_Cluster_StatsA

ANALYZE SQL QUERY CLUSTER PERFORMANCE STATISTICS

This tool analyzes pre-computed cluster statistics to identify optimization opportunities without re-running the clustering pipeline. Perfect for iterative analysis and decision-making on which query clusters to focus optimization efforts.

ANALYSIS CAPABILITIES:

  • Performance Ranking: Sort clusters by any performance metric to identify top resource consumers

  • Resource Impact Assessment: Compare clusters by CPU usage, I/O volume, and execution complexity

  • Skew Problem Detection: Identify clusters with CPU or I/O distribution issues

  • Workload Characterization: Understand query patterns by user, application, and workload type

  • Optimization Prioritization: Focus on clusters with highest impact potential

AVAILABLE SORTING METRICS:

  • avg_cpu: Average CPU seconds per cluster (primary optimization target)

  • avg_io: Average logical I/O operations (scan intensity indicator)

  • avg_cpuskw: Average CPU skew (distribution problem indicator)

  • avg_ioskw: Average I/O skew (hot spot indicator)

  • avg_pji: Average Physical-to-Logical I/O ratio (compute intensity)

  • avg_uii: Average Unit I/O Intensity (I/O efficiency)

  • avg_numsteps: Average query plan complexity

  • queries: Number of queries in cluster (frequency indicator)

  • cluster_silhouette_score: Clustering quality measure

PERFORMANCE CATEGORIZATION: Automatically categorizes clusters using configurable thresholds (from sql_opt_config.yml):

  • HIGH_CPU_USAGE: Average CPU > config.performance_thresholds.cpu.high

  • HIGH_IO_USAGE: Average I/O > config.performance_thresholds.io.high

  • HIGH_CPU_SKEW: CPU skew > config.performance_thresholds.skew.high

  • HIGH_IO_SKEW: I/O skew > config.performance_thresholds.skew.high

  • NORMAL: Clusters within configured normal performance ranges

TYPICAL ANALYSIS WORKFLOW:

  1. Sort by 'avg_cpu' or 'avg_io' to find highest resource consumers

  2. Sort by 'avg_cpuskw' or 'avg_ioskw' to find distribution problems

  3. Use limit_results to focus on top problematic clusters

OPTIMIZATION DECISION FRAMEWORK:

  • High CPU + High Query Count: Maximum impact optimization candidates

  • High Skew + Moderate CPU: Distribution/statistics problems

  • High I/O + Low PJI: Potential indexing opportunities

  • High NumSteps: Complex query rewriting candidates

OUTPUT FORMAT: Returns detailed cluster statistics with performance rankings, categories, and metadata for LLM analysis and optimization recommendations.

sql_Execute_Full_PipelineA

COMPLETE SQL QUERY CLUSTERING PIPELINE FOR HIGH-USAGE QUERY OPTIMIZATION

This tool executes the entire SQL query clustering workflow to identify and analyze high CPU usage queries for optimization opportunities. It's designed for database performance analysts and DBAs who need to systematically identify query optimization candidates.

FULL PIPELINE WORKFLOW:

  1. Query Log Extraction: Extracts SQL queries from DBC.DBQLSqlTbl with comprehensive performance metrics

  2. Performance Metrics Calculation: Computes CPU skew, I/O skew, PJI (Physical to Logical I/O ratio), UII (Unit I/O Intensity)

  3. Query Tokenization: Tokenizes SQL text using {sql_clustering_config.get('model', {}).get('model_id', 'bge-small-en-v1.5')} tokenizer via ivsm.tokenizer_encode

  4. Embedding Generation: Creates semantic embeddings using ivsm.IVSM_score with ONNX models

  5. Vector Store Creation: Converts embeddings to vector columns via ivsm.vector_to_columns

  6. K-Means Clustering: Groups similar queries using TD_KMeans with optimal K from configuration

  7. Silhouette Analysis: Calculates clustering quality scores using TD_Silhouette

  8. Statistics Generation: Creates comprehensive cluster statistics with performance aggregations

PERFORMANCE METRICS EXPLAINED:

  • AMPCPUTIME: Total CPU seconds across all AMPs (primary optimization target)

  • CPUSKW/IOSKW: CPU/I/O skew ratios (>2.0 indicates distribution problems)

  • PJI: Physical-to-Logical I/O ratio (higher = more CPU-intensive)

  • UII: Unit I/O Intensity (higher = more I/O-intensive relative to CPU)

  • LogicalIO: Total logical I/O operations (indicates scan intensity)

  • NumSteps: Query plan complexity (higher = more complex plans)

CONFIGURATION (from sql_opt_config.yml):

  • Uses top {default_max_queries} queries by CPU time (configurable)

  • Creates {default_optimal_k} clusters by default (configurable via optimal_k parameter)

  • Embedding model: {sql_clustering_config.get('model', {}).get('model_id', 'bge-small-en-v1.5')}

  • Vector dimensions: {sql_clustering_config.get('embedding', {}).get('vector_length', 384)}

  • All database and table names are configurable

OPTIMIZATION WORKFLOW: After running this tool, use:

  1. sql_Analyze_Cluster_Stats to identify problematic clusters

  2. sql_Retrieve_Cluster_Queries to get actual SQL from target clusters

  3. LLM analysis to identify patterns and propose specific optimizations

USE CASES:

  • Identify query families consuming the most system resources

  • Find queries with similar patterns but different performance

  • Discover optimization opportunities through clustering analysis

  • Prioritize DBA effort on highest-impact query improvements

  • Understand workload composition and resource distribution

PREREQUISITES:

  • DBC.DBQLSqlTbl and DBC.DBQLOgTbl must be accessible

  • Embedding models and tokenizers must be installed in feature_ext_db

  • Sufficient space in feature_ext_db for intermediate and final tables

sql_Retrieve_Cluster_QueriesA

RETRIEVE ACTUAL SQL QUERIES FROM SPECIFIC CLUSTERS FOR PATTERN ANALYSIS

This tool extracts the actual SQL query text and performance metrics from selected clusters, enabling detailed pattern analysis and specific optimization recommendations. Essential for moving from cluster-level analysis to actual query optimization.

DETAILED ANALYSIS CAPABILITIES:

  • SQL Pattern Recognition: Analyze actual query structures, joins, predicates, and functions

  • Performance Correlation: Connect query patterns to specific performance characteristics

  • Optimization Identification: Identify common anti-patterns, missing indexes, inefficient joins

  • Code Quality Assessment: Evaluate query construction, complexity, and best practices

  • Workload Understanding: See actual business logic and data access patterns

QUERY SELECTION STRATEGIES:

  • By CPU Impact: Sort by 'ampcputime' to focus on highest CPU consumers

  • By I/O Volume: Sort by 'logicalio' to find scan-intensive queries

  • By Skew Problems: Sort by 'cpuskw' or 'ioskw' for distribution issues

  • By Complexity: Sort by 'numsteps' for complex execution plans

  • By Response Time: Sort by 'response_secs' for user experience impact

AVAILABLE METRICS FOR SORTING:

  • ampcputime: Total CPU seconds (primary optimization target)

  • logicalio: Total logical I/O operations (scan indicator)

  • cpuskw: CPU skew ratio (distribution problems)

  • ioskw: I/O skew ratio (hot spot indicators)

  • pji: Physical-to-Logical I/O ratio (compute intensity)

  • uii: Unit I/O Intensity (I/O efficiency)

  • numsteps: Query execution plan steps (complexity)

  • response_secs: Wall-clock execution time (user impact)

  • delaytime: Time spent in queue (concurrency issues)

AUTOMATIC PERFORMANCE CATEGORIZATION: Each query is categorized using configurable thresholds (from sql_opt_config.yml):

  • CPU Categories: VERY_HIGH_CPU (>config.very_high), HIGH_CPU (>config.high), MEDIUM_CPU (>10s), LOW_CPU

  • CPU Skew: SEVERE_CPU_SKEW (>config.severe), HIGH_CPU_SKEW (>config.high), MODERATE_CPU_SKEW (>config.moderate), NORMAL

  • I/O Skew: SEVERE_IO_SKEW (>config.severe), HIGH_IO_SKEW (>config.high), MODERATE_IO_SKEW (>config.moderate), NORMAL

Use thresholds set in config file for, CPU - high, very_high, Skew moderate, high, severe

TYPICAL OPTIMIZATION WORKFLOW:

  1. Start with clusters identified from sql_Analyze_Cluster_Stats

  2. Retrieve top queries by impact metric (usually 'ampcputime')

  3. Analyze SQL patterns for common issues:

    • Missing WHERE clauses or inefficient predicates

    • Cartesian products or missing JOIN conditions

    • Inefficient GROUP BY or ORDER BY operations

    • Suboptimal table access patterns

    • Missing or outdated statistics

  4. Develop specific optimization recommendations

QUERY LIMIT STRATEGY:

  • Use the query limit set in config file for pattern recognition and analysis, unless user specifies a different limit

OUTPUT INCLUDES:

  • Complete SQL query text for each query

  • All performance metrics, user, application, and workload context, cluster membership and rankings

  • Performance categories for quick filtering

rag_Execute_WorkflowA

Execute complete RAG workflow to answer user questions based on document context. This tool handles the entire RAG pipeline in a single step when a user query is tagged with /rag.

WORKFLOW STEPS (executed automatically):

  1. Configuration setup using configurable values from rag_config.yml

  2. Store user query with '/rag ' prefix stripping

  3. Generate query embeddings using either BYOM (ONNXEmbeddings) or IVSM functions based on config

  4. Perform semantic search against precomputed chunk embeddings

  5. Return context chunks for answer generation

CONFIGURATION VALUES (from rag_config.yml):

  • version: 'ivsm' or 'byom' to select embedding approach

  • All database names, table names, and model settings are configurable

  • Vector store metadata fields are dynamically detected

  • Embedding parameters are configurable

  • Default chunk retrieval count is configurable

  • Default values are provided as fallback

TECHNICAL DETAILS:

  • Strips the '/rag ' prefix if present from user questions

  • Creates query table if it does not exist (columns: id, txt, created_ts)

  • BYOM approach: Uses mldb.ONNXEmbeddings UDF for tokenization and embedding

  • IVSM approach: Uses ivsm.tokenizer_encode and ivsm.IVSM_score functions

  • Both approaches store embeddings in configured output table

  • Uses cosine similarity via TD_VECTORDISTANCE for semantic search

  • Returns the top-k matching chunks from the configured vector store

  • Each result includes chunk text, similarity score, and metadata fields

CRITICAL ANSWERING RULES:

  • Answer ONLY using retrieved chunks - no external knowledge, speculation, or inference

  • Quote source content directly without paraphrasing, summarizing, or rewriting

  • If no relevant context found: "Not enough information found in the provided context"

  • If partial context: "The available context does not fully answer the question"

  • Include document/page references when available (e.g., "On page 2 of 'demo_policy.pdf'...")

  • Execute entire workflow silently without showing function calls to user - only show final answer

LANGUAGE RESTRICTIONS:

  • Do not say "According to the context" or "The context says" - just quote directly

  • Do not say "It can be inferred that..." - no inference allowed

  • Use exact or near-verbatim quotes only

  • No paraphrasing, summarizing, or adding transitions between quotes

  • Copy content precisely - no expansion or interpretation

BEHAVIORAL REQUIREMENTS:

  • Execute all RAG workflow steps silently without showing tool calls to users

  • Only display final answers based strictly on retrieved context chunks

  • Do not use external knowledge or speculation - only quote retrieved content directly

  • RAG mode only triggers with explicit /rag prefix - do not enter RAG mode implicitly

  • Hide all tool execution steps from user

  • Provide clean, direct response with only the answer based on retrieved chunks

  • This is a complete end-to-end RAG workflow - no additional tools needed

EXECUTION: Run completely silently - user only sees their question and the final context-grounded answer.

graph_analyseDatabaseA

Composite graph analysis — runs findRootObjects, connectedComponents, detectCycles, and bfsLevels in a single MCP call with ONE shared edge fetch.

This tool eliminates the scalability bottleneck of serial MCP round- trips by combining four graph analyses that would otherwise require four separate tool calls, each independently fetching the same edge set from Teradata.

Performance vs individual tools:

  • 1 SQL round-trip instead of 4 (shared edge fetch)

  • 1 MCP response instead of 4 (eliminates stdio serialisation overhead)

  • Same algorithmic complexity (O(V+E) BFS, O(α·N) Union-Find, O(V+E) DFS)

  • In-memory edge sharing: all analyses operate on the same Python list

Use this for:

  • Full database migration readiness assessment

  • Pre-migration cycle + root + wave analysis in one call

  • Dashboard data population (all four analyses needed simultaneously)

  • Any workflow that would otherwise call 3+ individual graph tools

Arguments: container_pattern - str: CSV LIKE patterns for container scope. Supports wildcards (%) and CSV format. Examples: '%SALES%', '%SALES%,%FINANCE%', 'PROD_%'

                  CRITICAL: STRING type, not array.
                  CORRECT: container_pattern="%SALES%,%FINANCE%"
                  WRONG:   container_pattern=["%SALES%", "%FINANCE%"]

exclude_objects - str: CSV LIKE patterns to exclude. Default: '' (no exclusions)

top_n_roots - int: Number of top root objects (by downstream dependent count) to include in BFS wave analysis. Default: 4

max_depth_down - int: Maximum downstream BFS hops from roots. Default: 10

max_depth_up - int: Maximum upstream BFS hops from roots. 0 = skip upstream analysis. Default: 0

edge_repository - str: Edge repository view/table conforming to the Graph Edge Contract (Src_Container_Name, Src_Object_Name, Src_Kind, Tgt_Container_Name, Tgt_Object_Name, Tgt_Kind columns). Call graph_edgeContractDDL to generate one. Required parameter — no default.

Returns: ResponseType: single response containing all four analyses:

{ "root_objects": { "objects": [...], "summary": {...} }, "components": { "node_details": [...], "summaries": [...], "stats": [...] }, "cycles": { "details": [...], "summaries": [...], "stats": [...] }, "bfs_waves": { "nodes": [...], "cycle_candidates": [...], "summary": {...} }, "edge_stats": { "total_edges": N, "fetch_time_ms": N } }

Example calls:

Full analysis of Sales and Finance databases

handle_graph_analyseDatabase( conn=connection, container_pattern="%SALES%,%FINANCE%", edge_repository="MY_LINEAGE_DB.EdgeRepository" )

Single database family with top 8 roots

handle_graph_analyseDatabase( conn=connection, container_pattern="%FINANCE%", top_n_roots=8, edge_repository="MY_LINEAGE_DB.EdgeRepository" )

Exclude sandbox schemas

handle_graph_analyseDatabase( conn=connection, container_pattern="PROD_%,STAGE_%", exclude_objects="SANDBOX%,%.temp_%", edge_repository="MY_LINEAGE_DB.EdgeRepository" )

graph_bfsLevelsA

Compute BFS shortest-path hop distances from one or more root nodes.

Pure-Python implementation — no stored procedure required.

WHEN TO USE THIS TOOL vs graph_traceLineage:

Use graph_bfsLevels when asked to:

  • Sequence objects for deployment or migration (ORDER BY downstream_level gives correct topological deployment order for root objects)

  • Group objects into migration waves (nearest_root identifies which of the input root tables each object belongs to)

  • Find which migration root table each object is closest to across a multi-root migration scope

  • Identify cycle members by depth (direction='BOTH' nodes with unequal absolute upstream/downstream levels are cycle candidates)

  • Count objects within N hops of a change (blast-radius sizing)

  • Answer "how far is object X from the migration root tables?"

Do NOT use graph_bfsLevels for general lineage tracing, impact path analysis, or questions about which specific objects depend on which. Use graph_traceLineage for those — it returns the full edge set with relationship detail. graph_bfsLevels returns distances and wave groupings, not dependency paths or edge detail.

KEY DISTINCTION — root_node_list accepts EXACT FQ names only (no wildcards). Use graph_findRootObjects first to identify the seed objects, then pass their exact FQ names here.

Arguments: root_node_list - str: CSV of exact fully-qualified root node names. No wildcards — exact names only.

                  SINGLE ROOT:
                  'DEV01_StGeo_STD_T.mortgage_account'

                  MULTIPLE ROOTS (CSV):
                  'DEV01_StGeo_STD_T.mortgage_account,
                   DEV01_StGeo_STD_T.mortgage_borrower,
                   DEV01_StGeo_STD_T.mortgage_property'

                  CRITICAL: Exact FQ names, no wildcards.
                  Use graph_findRootObjects or
                  graph_traceLineage first to discover names.

max_depth_up - int: Maximum upstream hops to traverse. 0 = skip upstream analysis entirely. Default: 10

                  Upstream means "what this object DEPENDS ON" —
                  its sources, prerequisites, and ancestors.
                  For root objects with in-degree zero, upstream_level
                  will be NULL for all non-root nodes (correct).

max_depth_down - int: Maximum downstream hops to traverse. 0 = skip downstream analysis entirely. Default: 10

                  Downstream means "what DEPENDS ON this object" —
                  its consumers, dependents, and impact radius.
                  For root objects with in-degree zero, downstream_level
                  will show positive values for all consumers (correct).

exclude_objects - str: CSV of FQ object name LIKE patterns to exclude. Matched against both Src and Tgt sides of every edge. Python fnmatch is used for pattern matching (% → *). Example: 'DFJ%,C_D02%,%.temp_%' Default: '' (no exclusions)

include_containers - str: CSV of container name LIKE patterns to include. Only edges where BOTH Src and Tgt containers match at least one pattern are traversed. Python fnmatch used for matching (% → *). Empty = all containers included. Example: 'DEV01_StGeo%,MF_STGEO%,TABLEAU%,POWERBI%' Default: '' (all containers)

edge_repository - str: Edge repository view/table conforming to the Required parameter — no default.

Returns: ResponseType: formatted response with BFS node results + metadata. Schema is identical to handle_graph_bfsLevels (SP-based tool).

Response structure: { "nodes": [ { "node": "DEV01_StGeo_STD_T.mortgage_account", "container_name": "DEV01_StGeo_STD_T", "object_name": "mortgage_account", "object_kind": "Table", "upstream_level": None, // None (NULL) if unreachable or skipped "downstream_level": 0, // 0 for root, positive for consumers "nearest_root": "DEV01_StGeo_STD_T.mortgage_account", "direction": "ROOT", // ROOT / U / D / BOTH "is_root": "Y" }, ... ], "cycle_candidates": [...], // direction='BOTH' nodes with unequal // absolute upstream/downstream levels "summary": { "total_nodes": 46, "root_nodes": 3, "upstream_only": 12, "downstream_only": 28, "both_directions": 3, "cycle_candidates": 1, "max_upstream_depth": 4, "max_downstream_depth": 5, "nodes_per_nearest_root": {"DB.Root1": 20, "DB.Root2": 26}, "object_kind_counts": {"Table": 10, "View": 22, "Macro": 8, ...} } }

direction values: ROOT - One of the input root nodes U - Reachable upstream only (negative upstream_level) D - Reachable downstream only (positive downstream_level) BOTH - Reachable in both directions — possible cycle member. Unequal absolute levels indicate a back-edge (cycle). Equal absolute levels indicate a shared dependency.

Technical Implementation Notes:

  • One SQL round-trip to fetch all edges matching the container/exclusion filters. All BFS computation is then done in Python memory.

  • Standard queue-based BFS (O(V+E)) — optimal for unweighted graphs. This is more correct than the original Bellman-Ford style SQL relaxation loop that the SP inherited from the notebook.

  • Multi-source BFS: all root nodes are seeded simultaneously at level 0. Each non-root node settles at the distance to its nearest root, with ties broken deterministically by lexicographic root name order.

  • Upstream BFS follows Src→Tgt edges to discover Src-side ancestors.

  • Downstream BFS follows Tgt→Src edges to discover Tgt-side consumers.

  • This direction convention matches the corrected SP (Option B fix): upstream_level = NULL for root objects with in-degree zero (correct) downstream_level = positive for all consumers (correct)

  • Filter application order:

    1. SQL WHERE clause: fetch only edges matching include_containers (both Src and Tgt containers must match at least one pattern)

    2. Python post-filter: exclude edges where either endpoint matches an exclude_objects pattern (applied before building adjacency)

    3. BFS depth cap: enforced during queue processing

  • Node metadata (container_name, object_name, object_kind) is derived from the edge set and stored in a node registry during the fetch phase.

graph_connectedComponentsA

Identify all Weakly Connected Components (WCC) in the dependency graph.

Pure-Python implementation — no stored procedure required. Issues a single SQL SELECT to fetch the scoped edge set, then performs Union-Find WCC partitioning entirely in the MCP server process.

A connected component is a maximal set of nodes where every node can reach every other node when edge direction is ignored. This partitions the graph into isolated sub-graphs.

Use this tool for:

  • Understanding graph structure and partitioning

  • Identifying isolated sub-graphs

  • Scoping downstream impact analysis to a single component

  • Pre-filtering before cycle detection (cycles exist only within a component)

  • Identifying "islands" of related objects for migration or refactoring

  • Estimating blast radius

Arguments: container_pattern - str: CSV LIKE patterns for container scope. Supports wildcards (%) and CSV format. Examples: '%WBC%', '%WBC%,%StGeo%', 'DEV01_%,DEV02_%'

                  CRITICAL: STRING type, not array.
                  CORRECT: container_pattern="%WBC%,%StGeo%"
                  WRONG:   container_pattern=["%WBC%", "%StGeo%"]

exclude_objects - str: CSV LIKE patterns to exclude. Matches against container name (or DB.Object if the pattern contains a dot). Default: '' (no exclusions)

edge_repository - str: Edge repository view/table conforming to the Graph Edge Contract (Src_Container_Name, Src_Object_Name, Src_Kind, Tgt_Container_Name, Tgt_Object_Name, Tgt_Kind columns). For AI-Native Data Products use: '{ProductName}_Semantic.lineage_graph' Call graph_edgeContractDDL to generate a new one. Required — no default.

Returns: ResponseType: formatted response with connected component results.

Response structure: { "node_details": [...], // One row per node with Component_Id "component_summaries": [...], // One row per component "summary_stats": [...] // Single aggregate row }

node_details row fields: Node_FQ, DatabaseName, ObjectName, Component_Id, Object_Kind

component_summaries row fields: Component_Id, Node_Count, Node_List

summary_stats row fields: Component_Count, Node_Count, Edge_Count, Largest_Component, Smallest_Component, Singleton_Count, Summary_Message

graph_detectCyclesA

Detect circular dependencies (cycles) in the dependency graph.

Pure-Python implementation — no stored procedure required. Issues a single SQL SELECT to fetch the scoped edge set, then performs WCC partitioning followed by iterative DFS cycle detection entirely in the MCP server process.

Use this tool for:

  • Validating graph integrity (DAG property)

  • Finding objects that form circular references

  • Identifying stub-then-replace code patterns

  • Debugging topological sort hangs

  • Pre-deployment cycle checks

Arguments: container_pattern - str: CSV LIKE patterns for container scope. Supports wildcards (%) and CSV format. Examples: 'DFJ%' — single database family '%WBC%,%StGeo%' — multiple families 'DEV01_%,DEV02_%' — multiple prefixes

exclude_objects - str: CSV LIKE patterns to exclude from the scan. Matches against container name (or DB.Object if the pattern contains a dot). Default: '' (no exclusions)

edge_repository - str: Edge repository view/table conforming to the Graph Edge Contract (Src_Container_Name, Src_Object_Name, Src_Kind, Tgt_Container_Name, Tgt_Object_Name, Tgt_Kind columns). For AI-Native Data Products use: '{ProductName}_Semantic.lineage_graph' Call graph_edgeContractDDL to generate a new one. Required — no default.

Returns: ResponseType: formatted response with cycle detection results.

Response structure: { "cycle_details": [...], // One row per node per cycle "cycle_summaries": [...], // One row per cycle with path string "summary_stats": [...] // Single aggregate row }

cycle_details row fields: Cycle_Id, Cycle_Pos, Node_FQ, Cycle_Length, Component_Id

cycle_summaries row fields: Cycle_Id, Cycle_Length, Component_Id, Cycle_Path

summary_stats row fields: Cycle_Count, Total_Nodes_In_Cycles, Components_With_Cycles, Edge_Count, Components_Scanned, Summary_Message

graph_edgeContractDDLA

Generate DDL for a Graph Edge Contract-conforming table or view.

This tool does NOT require a database connection — it generates DDL text from templates. No SQL is executed. The conn parameter is accepted for ModuleLoader calling convention compatibility but is not used.

Required columns in the generated schema (6): Src_Container_Name, Src_Object_Name, Src_Kind, Tgt_Container_Name, Tgt_Object_Name, Tgt_Kind

Optional enrichment columns (2): Edge_Relationship — nature of the edge (ETL_INPUT, ETL_OUTPUT, DIRECT…) Transformation_Type — process category (ETL, FEATURE_ENG, AGGREGATION…) These are ignored by graph analysis tools but useful for visualisation.

AI-Native Data Product shortcut: If you are working within an AI-Native Data Product, the view {ProductName}Semantic.lineage_graph (Observability Module v1.5) already conforms to this contract. You do not need to generate DDL — pass that view's fully-qualified name directly as edge_repository on any graph* tool. Example: edge_repository='StGeoMortgage_Semantic.lineage_graph'

Arguments: conn: TeradataConnection (unused — accepted for ModuleLoader compatibility). target_database: Database in which to create the edge repository. For AI-Native Data Products this is typically {ProductName}_Semantic. Example: 'StGeoMortgage_Semantic' object_name: Name for the edge table/view. Default: 'EdgeRepository' output_type: 'TABLE' or 'VIEW'. TABLE: generates CREATE TABLE DDL + separate sample DML. Includes all 6 required + 2 optional columns. VIEW: generates a CREATE VIEW template for mapping an existing lineage source to all 8 contract columns. Default: 'TABLE'

Returns: list[dict]: Response payload containing: - ddl: DDL script (CREATE TABLE/VIEW + COMMENTs) - sample_dml: Sample INSERT statements + validation query (TABLE only; absent for VIEW) - output_type: 'TABLE' or 'VIEW' - contract_version: Contract version string

graph_findRootObjectsA

Find root objects (objects with no upstream dependencies) in specified containers.

Root objects are ideal starting points for downstream impact analysis as they represent the foundational data sources that nothing else depends upon.

Use this for:

  • Finding starting points for downstream impact analysis

  • Identifying source tables and base objects in data pipelines

  • Discovering independent objects that can be safely analysed in isolation

  • Understanding data flow origins in a schema or database

  • Planning migration or refactoring by identifying foundation objects

Arguments: container_pattern - str: Database/schema pattern(s) to search. SUPPORTS WILDCARDS (%) and CSV.

                  IMPORTANT: This is a STRING parameter (type: str), not an array.
                  Pass multiple patterns as a single comma-separated string.

                  SINGLE CONTAINER:
                  'DEV01_StGeo_STD_T' - Specific database

                  WILDCARDS (%):
                  '%WBC%' - All databases containing WBC
                  'DEV01_%' - All databases starting with DEV01_
                  '%_STD_T' - All databases ending with _STD_T

                  MULTIPLE CONTAINERS (CSV format):
                  '%WBC%,%StGeo%' - All WBC and StGeo databases
                  'DEV01_StGeo_STD_T,DEV02_WBC_STD_T' - Specific databases
                  'DEV01_%,DEV02_%' - All DEV01 and DEV02 databases

                  WHITESPACE HANDLING:
                  Whitespace is automatically trimmed, so these are equivalent:
                  ✅ '%WBC%,%StGeo%' (no spaces)
                  ✅ '%WBC%, %StGeo%' (spaces after commas - OK)

                  HOW TO PASS IN CODE:
                  Python: container_pattern="%WBC%,%StGeo%"
                  JSON: {"container_pattern": "%WBC%,%StGeo%"}

                  CRITICAL: This is a STRING type parameter.
                  ✅ CORRECT: Pass as string: container_pattern="%WBC%,%StGeo%"
                  ❌ WRONG: Pass as array: container_pattern=["%WBC%", "%StGeo%"]

exclude_objects - str: Comma-separated list of patterns to exclude (SERVER-SIDE filter). Matches against DatabaseName.ObjectName format.

                  Common exclusion patterns:
                  'PRD_%,PROD_%' - Exclude production databases
                  '%.temp_%,%.bak_%' - Exclude temporary and backup objects
                  'DFJ%,C_D02%' - Exclude personal/sandbox schemas

                  Performance: Reduces result set and improves query time
                  Default: '' (empty string = no exclusions)

edge_repository - str: Edge repository table/view conforming to the Required parameter — no default.

object_types - str: Comma-separated list of object types to include (optional filter). Examples: 'T' (tables), 'V' (views), 'P' (procedures), 'M' (macros) Multiple: 'T,V' (tables and views only) Empty = all object types included Default: '' (all types)

return_format - str: Output format: 'detailed' or 'summary' 'detailed' (default): Full object list with metadata 'summary': High-level statistics and counts only Default: 'detailed'

Returns: ResponseType: formatted response with root objects + metadata

Example queries that trigger this tool:

  • "Which objects in WBC and StGeo databases have no dependencies?"

  • "Find root objects in DEV01 databases"

  • "What are the starting points for impact analysis in StGeo?"

  • "Show me base tables with no upstream dependencies"

  • "Which objects should I start analysing for downstream impact?"

Example calls:

Find root objects in WBC and StGeo databases

handle_graph_findRootObjects( conn=connection, container_pattern="%WBC%,%StGeo%" )

Find only root tables (no views/procedures)

handle_graph_findRootObjects( conn=connection, container_pattern="DEV01_%", object_types="T" )

Find root objects excluding production and temporary objects

handle_graph_findRootObjects( conn=connection, container_pattern="%WBC%,%StGeo%", exclude_objects="PRD_%,%.temp_%,%.bak_%" )

Quick summary of root objects

handle_graph_findRootObjects( conn=connection, container_pattern="DEV01_StGeo_STD_T", return_format="summary" )

Technical Implementation:

  • Queries the edge repository to find all objects in specified containers

  • Identifies objects that appear as sources but never as targets

  • These are "root" objects - they have no upstream dependencies

  • Results are filtered by exclude_objects and object_types parameters

  • Returns list of root objects suitable for downstream impact analysis

graph_traceLineageA

Analyse object dependencies in Teradata. Supports wildcards (%) and CSV patterns.

Hybrid implementation — no stored procedure required. Python constructs Teradata recursive CTEs that execute entirely server-side. Only the reachable subgraph crosses the network — not the full edge table.

Examples: 'DB.Table' (single), '%WBC%.%' (wildcard), 'DB.T1,DB.T2' (CSV)

Finds upstream dependencies (what the object depends on) and downstream dependents (what depends on the object). Returns nodes and edges representing the dependency subgraph.

When multiple patterns are provided via CSV, one upstream CTE and one downstream CTE is executed per pattern. Results are merged and deduplicated by Python before assembly.

Use this for:

  • Impact analysis: "What breaks if I change or drop this object?"

  • Lineage tracing: "Where does this data come from?"

  • Dependency discovery: "What does this object use?"

  • Pre-deployment validation: checking impacts before making changes

Arguments: object_name - str: Object name pattern(s). Supports wildcards (%) and CSV format. STRING type — not an array.

                   Single:   'DEV01_StGeo_STD_T.mortgage_account'
                   Wildcard: '%WBC%.%'
                   Multiple: '%WBC%.%,%StGeo%.%'

max_depth_up - int: Maximum levels to traverse upstream (0-10). 0 = no upstream analysis. Default: 3

max_depth_down - int: Maximum levels to traverse downstream (0-10). 0 = no downstream analysis. Default: 3

exclude_objects - str: CSV LIKE patterns to exclude. Matches against DB.Object format. Example: 'PRD_%,%.temp_%' Default: '' (no exclusions)

include_containers - str: CSV of container LIKE patterns to include (whitelist). Empty = all containers. Default: '' (all containers)

edge_repository - str: Edge repository view/table conforming to the Required parameter — no default.

return_format - str: 'detailed' (default), 'summary', or 'edges_only'

Returns: ResponseType: formatted response with dependency analysis results.

detailed response structure: { "nodes": [...], // Unique nodes (deduplicated) "upstream_edges": [...], // One row per upstream edge "downstream_edges":[...], // One row per downstream edge "summary": {...} // Aggregate statistics }

Edge row fields: DependentObjectDBName, DependentObjectName, FQDependentObjectName, ReferencedObjectDBName, ReferencedObjectName, FQReferencedObjectName, Src_Kind, Tgt_Kind, Depth, DependencyPath

sec_userDbPermissionsA

List the database-level access permissions (SELECT, INSERT, UPDATE, DELETE, etc.) granted directly to a specific Teradata user across all databases. Use when the user asks what a named user can DO in each database — their access rights, grants, or privileges on database objects. Do NOT use to see what roles a user has — use sec_userRoles for that. Requires a user name.

Arguments: user_name - User name to analyze. persist - If True, materializes result as a volatile table and returns table name

sec_rolePermissionsA

List the database-level permissions granted to a named Teradata role. Use when the user asks what access rights a ROLE has, what a role is allowed to do, or what permissions have been granted to a role. Do NOT confuse with user-level queries — use sec_userDbPermissions for a user's direct permissions or sec_userRoles for a user's role membership. Requires a role name.

Arguments: role_name - Role name to analyze. persist - If True, materializes result as a volatile table and returns table name

sec_userRolesA

List the roles currently assigned to a specific Teradata user account. Use when the user asks which roles a named user HAS, belongs to, or has been assigned. Do NOT use to see the permissions of those roles — use sec_rolePermissions for that. Do NOT use to see a user's direct database privileges — use sec_userDbPermissions for that. Requires a user name.

Arguments: user_name - User name to analyze. persist - If True, materializes result as a volatile table and returns table name

qlty_missingValuesA

List the column names that contain NULL or missing values in a table. Returns a column-level summary showing WHICH columns have missing data. Use when the user asks which columns have nulls, which fields have missing data, or how many nulls exist per column. Do NOT use to retrieve the actual data rows — use qlty_rowsWithMissingValues to get the specific records where a column is null.

Arguments: database_name - Name of the database (optional) table_name - Table name to analyze persist - If True, materializes result as a volatile table and returns table name

qlty_negativeValuesA

Identify which numeric columns in a table contain negative values. Use when the user asks about negative numbers, values below zero, or columns with anomalous negative entries. Returns the list of affected column names.

Arguments: database_name - Name of the database (optional) table_name - Table name to analyze persist - If True, materializes result as a volatile table and returns table name

qlty_distinctCategoriesA

Get the unique (distinct) values present in a specific column of a table. Use when the user asks what unique values, categories, or entries exist in a named column. Requires both a table name and a column name — if no column name is specified, ask for clarification before calling.

Arguments: database_name - Name of the database (optional) table_name - Table name to analyze column_name - Column name to analyze persist - If True, materializes result as a volatile table and returns table name

qlty_standardDeviationA

Calculate the mean (average) and standard deviation for a single numeric column. Use when the user asks specifically for standard deviation, the spread of values, or just mean and variability. For a fuller statistical profile including min, max, quartiles, and percentiles, use qlty_univariateStatistics instead.

Arguments: database_name - Name of the database (optional) table_name - Table name to analyze column_name - Column name to analyze persist - If True, materializes result as a volatile table and returns table name

qlty_columnSummaryA

Get summary statistics for ALL columns in a table in a single call. Use when the user asks for an overview, profile, or summary of every field in a table. For detailed statistics on a SINGLE specific column (min, max, percentiles), use qlty_univariateStatistics instead.

Arguments: database_name - Name of the database (optional) table_name - Table name to analyze persist - If True, materializes result as a volatile table and returns table name

qlty_univariateStatisticsA

Calculate full univariate statistics for a single numeric column including min, max, mean, standard deviation, quartiles, and percentiles. Use when the user asks for a complete or comprehensive statistical breakdown of one specific column. For just mean and standard deviation, use qlty_standardDeviation. For statistics across ALL columns in a table at once, use qlty_columnSummary.

Arguments: database_name - Name of the database (optional) table_name - Table name to analyze column_name - Column name to analyze persist - If True, materializes result as a volatile table and returns table name

qlty_rowsWithMissingValuesA

Retrieve the actual data rows where a specific column is NULL or missing. Returns the records themselves, not a column summary. Use when the user wants to SEE or FETCH the rows with missing values in a named column. Do NOT write a SQL query with base_readQuery for this — always use this tool when the request is about rows with null values. Do NOT use for a column-level summary of which columns have nulls — use qlty_missingValues for that.

Arguments: database_name - Name of the database (optional) table_name - Table name to analyze column_name - Column name to analyze for missing values persist - If True, materializes result as a volatile table and returns table name

base_databaseListA

List all databases or schemas available in the Teradata system. ONLY call when the user explicitly asks which databases or schemas exist on the system. Do NOT call this tool as a preliminary step toward listing tables — if the user asks about tables without naming a database, ask them which database they mean rather than discovering databases first.

Arguments: scope - Filter scope: 'user' returns only user-created databases (excludes system databases), 'all' returns every database. persist - If True, materializes result as a volatile table and returns table name

base_tableListA

List all tables and views within a specific Teradata database or schema. Pass a specific database name to list tables in that database only. Omit or leave empty to list tables from all databases. If the user does not name a database and you want to list tables from a single database, ask a clarifying question instead of returning results from all databases.

Arguments: database_name - Database name. Leave empty to list tables from all databases. persist - If True, materializes result as a volatile table and returns table name

base_tableDDLA

Return the CREATE TABLE DDL statement for a Teradata table, showing its full schema definition including column types, constraints, primary indexes, and keys. Use when the user wants the CREATE statement, the table definition, or needs to see how the table was built. If the user has not specified both a table name AND a database name, ask for clarification before calling — do not guess or use an empty database name. To save DDL to a file on disk, use base_saveDDL instead. For just column names and types, use base_columnDescription instead.

Arguments: table_name - Table name database_name - Database name persist - If True, materializes result as a volatile table and returns table name

base_columnDescriptionA

List the column names, data types, and basic attributes for a single Teradata table or view. Use for straightforward questions like 'what columns does this table have?' or 'what are the fields and their types?'. For precise Teradata-specific type codes, character sets, decimal precision, index details, or bulk metadata across many objects, use base_columnMetadata instead.

Arguments: database_name - Database name. Defaults to '%' (all databases). table_name - Table or view name. Defaults to '%' (all tables). persist - If True, materializes result as a volatile table and returns table name

base_tablePreviewA

Return a quick sample of the first few rows from a Teradata table or view so the user can see what data looks like, with no SQL required. Use this tool when the user wants to explore or peek at a table's contents without specifying conditions or writing a query. Do NOT use when the user provides a WHERE clause, filter, or explicit SQL statement — use base_readQuery for that.

Arguments: table_name - Table or view name database_name - Database name persist - If True, materializes result as a volatile table and returns table name

base_tableAffinityA

Identify which tables in a database tend to co-occur together in the same SQL queries, revealing natural JOIN relationships and data affinity patterns. Use when the user asks which tables are queried together, what tables are related to a specific table, or what tables are commonly used in the same workflows. For access frequency, query counts, or per-user access statistics, use base_tableUsage instead.

Arguments: database_name - Database name table_name - Table or view name persist - If True, materializes result as a volatile table and returns table name

base_tableUsageA

Report access frequency and per-user query patterns for tables and views in a Teradata database, showing which objects are most actively queried and by whom. Use when the user asks how often tables are accessed, which tables are most popular, or which users are running queries against a database. For discovering which tables appear together in the same queries, use base_tableAffinity instead.

Arguments: database_name - Database name. Leave empty for all databases. persist - If True, materializes result as a volatile table and returns table name

dba_tableSpaceA

Show table-level disk space usage within a specific Teradata database, ranked by size. Use when the user asks which tables are largest or consuming the most storage within a named database. NEVER call this tool with an empty database_name — if the user's message does not explicitly name a database, ask which database they want before calling. For space allocated to a whole database, use dba_databaseSpace. For total system-wide storage, use dba_systemSpace.

Arguments: database_name - Database name. Required — do not pass empty string. table_name - Table name filter. Leave empty for all tables. top_n - Limit results to top N largest tables by space. Set to 0 for no limit (default: 0). exclude_system - Exclude system databases and tables. Set to 'Y' to exclude, 'N' to include all (default: 'N'). persist - If True, materializes result as a volatile table and returns table name

dba_tableSqlListA

Retrieve SQL statements that have been executed against a specific named table. Use when the user asks what queries have run against a particular table. ONLY call when the user has explicitly named a specific table — if no table name is in the message, ask for clarification. Do NOT use for SQL history by user — use dba_userSqlList when the user asks what queries a specific person has been running.

Arguments: table_name - Table name to search for no_days - Number of days to look back persist - If True, materializes result as a volatile table and returns table name

dba_userSqlListA

Retrieve SQL statements executed by a specific named user. Use when the user asks what queries a particular person or account has been running. ONLY call when the user has explicitly named a specific user account — if no user name appears in the message, ask for clarification. NEVER call with an empty user_name. Do NOT use for SQL history by table — use dba_tableSqlList when the user asks about queries against a specific table.

Arguments: user_name - User name to filter by. Required — do not pass empty string. no_days - Number of days to look back persist - If True, materializes result as a volatile table and returns table name

dba_databaseSpaceA

Show disk space allocation for a specific named Teradata database. Use when the user asks how much space a particular database is using or how much has been allocated to it. If no database name is provided, ask for clarification — do not call with an empty database name. For table-level breakdowns within a database, use dba_tableSpace. For system-wide totals across all databases, use dba_systemSpace.

Arguments: database_name - Database name. Required — do not pass empty string. persist - If True, materializes result as a volatile table and returns table name

dba_tableUsageImpactA

Identify which users and tables are driving the most query and resource activity within a specific Teradata database. Use when the user asks who is hitting a named database hardest, which users are most active, or which tables generate the most load. ONLY call when the user has specified a database name — if no database name appears in the message, ask for clarification. For system-wide CPU, IO, and memory metrics by time period or application, use dba_resusageSummary instead.

Arguments: database_name - Database name to analyze. Required — do not pass empty string. user_name - User name to analyze. Leave empty for all users. persist - If True, materializes result as a volatile table and returns table name

dba_resusageSummaryA

Report system-wide resource consumption (CPU, IO, memory) broken down by time period, application, workload type, or complexity class. Use when the user asks for system-level resource breakdowns, workload profiles, or consumption trends over a date range — not tied to a specific database. For per-database or per-user impact within a named database, use dba_tableUsageImpact instead.

Arguments: user_name - User name to filter by. Leave empty for all users. LogDate - Log date to filter by in YYYY-MM-DD format. Leave empty for all dates. dayOfWeek - Day of week to filter by (1=Sunday, 2=Monday, ..., 7=Saturday). Leave empty for all days. hourOfDay - Hour of day to filter by (0-23). Leave empty for all hours. workloadType - Workload type to filter by (e.g., 'Batch', 'Interactive'). Leave empty for all workload types. workloadComplexity - Workload complexity to filter by (e.g., 'Simple', 'Medium', 'Complex'). Leave empty for all complexity levels. AppID - Application ID to filter by. Leave empty for all applications. no_days - Number of days to look back from today (e.g., 7, 30, 90). persist - If True, materializes result as a volatile table and returns table name

dba_databaseVersionA

Return the Teradata database software version and release information. Use when the user asks what version of Teradata is running on the system.

Arguments: persist - If True, materializes result as a volatile table and returns table name

dba_flowControlA

Report Teradata workload management flow control events showing when and how much the system throttled or delayed queries due to resource constraints. Use when the user asks about system throttling, flow control delays, or how often the workload manager imposed restrictions. For how long individual users personally waited in queues, use dba_userDelay instead.

Arguments: start_date - The start date for the query range in YYYY-MM-DD format. end_date - The end date for the query range in YYYY-MM-DD format. persist - If True, materializes result as a volatile table and returns table name

dba_featureUsageA

Report which Teradata product features were used during a specified date range. Use when the user asks about feature adoption, which Teradata capabilities are being used, or how feature utilization has changed over a period.

Arguments: start_date - The start date for the query range in YYYY-MM-DD format. end_date - The end date for the query range in YYYY-MM-DD format. persist - If True, materializes result as a volatile table and returns table name

dba_userDelayA

Report how long Teradata users waited in the query queue before their queries began executing. Use when the user asks about user wait times, queue delays, or how long users had to wait. For system-level throttling and workload management flow control events, use dba_flowControl instead.

Arguments: start_date - The start date for the query range in YYYY-MM-DD format. end_date - The end date for the query range in YYYY-MM-DD format. persist - If True, materializes result as a volatile table and returns table name

dba_sessionInfoA

Report currently active session information for a specific user or all users. Use when the user asks about open connections, active sessions, or currently logged-in users. You may call with the default '*' to show all sessions when no specific user is mentioned — no clarification required for this tool.

Arguments: user_name - User name to analyze. Use '*' to get all users. persist - If True, materializes result as a volatile table and returns table name

dba_systemSpaceA

Show total disk space usage across the entire Teradata system, aggregated over all databases. Use when the user asks about warehouse-wide storage, total system capacity, or overall disk consumption across all databases. For a single named database, use dba_databaseSpace. For table-level details within a database, use dba_tableSpace.

Arguments: persist - If True, materializes result as a volatile table and returns table name

Prompts

Interactive templates invoked by user choice

NameDescription
tdvs_tools_promptMain prompt for Teradata Enterprise Vector Store operations.
tdvs_rag_promptPrompt for Teradata Enterprise Vector Store and RAG (Retrieval Augmented Generation) based operations.
qlty_databaseQualityDatabase data quality assessment. (prompt_type: reporting)
_testMyServerTest all the MCP tools, prompts and resources. (prompt_type: reporting)
base_queryHelp users interact with Teradata databases effectively.
base_tableBusinessDescYou are a Teradata DBA who is an expert in describing the business use of tables in a database.
base_databaseBusinessDescYou are a Teradata DBA who is an expert in describing the business use of databases.
dba_tableArchiveThe following prompt is used to guide the Teradata DBA in finding opportunities for archiving data. (prompt_type: reporting)
dba_databaseLineageYou are a Teradata DBA who is an expert in finding the lineage of tables in a database. (prompt_type: context)
dba_tableDropImpactYou are a Teradata DBA who is an expert in finding the impact of dropping a table. (prompt_type: reporting)
dba_databaseHealthAssessmentYou are a Teradata DBA who is an expert in assessing the health of a database. (prompt_type: reporting)
dba_userActivityAnalysisAnalyze Teradata user activity patterns for the past 7 days (prompt_type: reporting)
dba_systemVoiceAnalyze Teradata user activity patterns for the past 7 days (prompt_type: reporting)
chat_ai_mapreduceMulti-step workflow to answer a high-level question using Teradata SQL and the chat_aggregatedCompleteChat tool. The agent first builds a Teradata query, then runs aggregated chat completion, and finally synthesizes a global answer.

Resources

Contextual data attached and managed by the client

NameDescription
get_graph_edge_contractReturn the Graph Edge Contract schema definition.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Teradata/teradata-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server