graph_bfsLevels
Compute BFS hop distances from root nodes to sequence objects for deployment, group objects into migration waves by nearest root, and identify cycle candidates.
Instructions
Compute BFS shortest-path hop distances from one or more root nodes.
Pure-Python implementation — no stored procedure required.
WHEN TO USE THIS TOOL vs graph_traceLineage:
Use graph_bfsLevels when asked to:
Sequence objects for deployment or migration (ORDER BY downstream_level gives correct topological deployment order for root objects)
Group objects into migration waves (nearest_root identifies which of the input root tables each object belongs to)
Find which migration root table each object is closest to across a multi-root migration scope
Identify cycle members by depth (direction='BOTH' nodes with unequal absolute upstream/downstream levels are cycle candidates)
Count objects within N hops of a change (blast-radius sizing)
Answer "how far is object X from the migration root tables?"
Do NOT use graph_bfsLevels for general lineage tracing, impact path analysis, or questions about which specific objects depend on which. Use graph_traceLineage for those — it returns the full edge set with relationship detail. graph_bfsLevels returns distances and wave groupings, not dependency paths or edge detail.
KEY DISTINCTION — root_node_list accepts EXACT FQ names only (no wildcards). Use graph_findRootObjects first to identify the seed objects, then pass their exact FQ names here.
Arguments: root_node_list - str: CSV of exact fully-qualified root node names. No wildcards — exact names only.
SINGLE ROOT:
'DEV01_StGeo_STD_T.mortgage_account'
MULTIPLE ROOTS (CSV):
'DEV01_StGeo_STD_T.mortgage_account,
DEV01_StGeo_STD_T.mortgage_borrower,
DEV01_StGeo_STD_T.mortgage_property'
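As a minimal sketch of how a caller might assemble this argument, the helper below joins exact names into a CSV string and rejects obvious wildcards. The function name `build_root_node_list` and its validation rules are illustrative assumptions, not part of the tool.

```python
def build_root_node_list(names):
    """Join exact FQ names into a CSV string for root_node_list (hypothetical helper)."""
    cleaned = []
    for name in names:
        name = name.strip()
        if not name:
            continue
        # The tool accepts exact names only, so reject LIKE/glob wildcards.
        # '_' is not rejected because real object names contain underscores.
        if any(ch in name for ch in "%*?"):
            raise ValueError(f"wildcards are not allowed in root names: {name!r}")
        # Assumption: a fully-qualified name looks like container.object.
        if "." not in name:
            raise ValueError(f"not a fully-qualified name: {name!r}")
        cleaned.append(name)
    if not cleaned:
        raise ValueError("at least one root node is required")
    return ",".join(cleaned)


roots = build_root_node_list([
    "DEV01_StGeo_STD_T.mortgage_account",
    "DEV01_StGeo_STD_T.mortgage_borrower",
    "DEV01_StGeo_STD_T.mortgage_property",
])
```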
CRITICAL: Exact FQ names, no wildcards.
Use graph_findRootObjects or graph_traceLineage first to discover names.
max_depth_up - int: Maximum upstream hops to traverse. 0 = skip upstream analysis entirely. Default: 10
Upstream means "what this object DEPENDS ON" —
its sources, prerequisites, and ancestors.
For root objects with in-degree zero, upstream_level
will be NULL for all non-root nodes (correct).
max_depth_down - int: Maximum downstream hops to traverse. 0 = skip downstream analysis entirely. Default: 10
Downstream means "what DEPENDS ON this object" —
its consumers, dependents, and impact radius.
For root objects with in-degree zero, downstream_level
will show positive values for all consumers (correct).
exclude_objects - str: CSV of FQ object name LIKE patterns to exclude. Matched against both Src and Tgt sides of every edge. Python fnmatch is used for pattern matching (% → *). Example: 'DFJ%,C_D02%,%.temp_%' Default: '' (no exclusions)
include_containers - str: CSV of container name LIKE patterns to include. Only edges where BOTH Src and Tgt containers match at least one pattern are traversed. Python fnmatch is used for matching (% → *). Empty = all containers included. Example: 'DEV01_StGeo%,MF_STGEO%,TABLEAU%,POWERBI%' Default: '' (all containers)
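The pattern handling described for exclude_objects and include_containers can be sketched as follows. This is an illustration only: the helper names are hypothetical, only the documented % → * translation is applied, and case-sensitive matching via `fnmatchcase` is an assumption about the tool's behaviour.

```python
from fnmatch import fnmatchcase


def parse_patterns(csv_patterns):
    """Split a CSV of LIKE patterns and translate % to the fnmatch wildcard *."""
    return [p.strip().replace("%", "*") for p in csv_patterns.split(",") if p.strip()]


def matches_any(name, patterns):
    """True if name matches at least one translated pattern (case-sensitive)."""
    return any(fnmatchcase(name, p) for p in patterns)


def keep_edge(src, tgt, src_container, tgt_container, exclude, include):
    """Apply the include and exclude rules to a single edge."""
    # include_containers: BOTH containers must match; an empty list means all.
    if include and not (matches_any(src_container, include)
                        and matches_any(tgt_container, include)):
        return False
    # exclude_objects: drop the edge if EITHER endpoint matches.
    if matches_any(src, exclude) or matches_any(tgt, exclude):
        return False
    return True
```

Note that fnmatch also treats `?` and `[...]` as special characters, so a pattern containing them would behave differently from a SQL LIKE pattern.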
edge_repository - str: Edge repository view/table conforming to the Required parameter — no default.
Returns: ResponseType: formatted response with BFS node results + metadata. Schema is identical to handle_graph_bfsLevels (SP-based tool).
Response structure:
{
  "nodes": [
    {
      "node": "DEV01_StGeo_STD_T.mortgage_account",
      "container_name": "DEV01_StGeo_STD_T",
      "object_name": "mortgage_account",
      "object_kind": "Table",
      "upstream_level": None,      // None (NULL) if unreachable or skipped
      "downstream_level": 0,       // 0 for root, positive for consumers
      "nearest_root": "DEV01_StGeo_STD_T.mortgage_account",
      "direction": "ROOT",         // ROOT / U / D / BOTH
      "is_root": "Y"
    },
    ...
  ],
  "cycle_candidates": [...],       // direction='BOTH' nodes with unequal
                                   // absolute upstream/downstream levels
  "summary": {
    "total_nodes": 46,
    "root_nodes": 3,
    "upstream_only": 12,
    "downstream_only": 28,
    "both_directions": 3,
    "cycle_candidates": 1,
    "max_upstream_depth": 4,
    "max_downstream_depth": 5,
    "nodes_per_nearest_root": {"DB.Root1": 20, "DB.Root2": 26},
    "object_kind_counts": {"Table": 10, "View": 22, "Macro": 8, ...}
  }
}
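To illustrate how a caller might consume this response for wave planning and blast-radius sizing, here is a small sketch. It assumes the response has already been parsed into a Python dict with the field names shown above; `plan_waves` and `count_within` are hypothetical helper names.

```python
from collections import defaultdict


def plan_waves(response):
    """Group nodes by nearest_root and order each wave by downstream_level."""
    waves = defaultdict(list)
    for node in response["nodes"]:
        level = node["downstream_level"]
        if level is None:
            continue  # not reachable downstream, so not sequenced here
        waves[node["nearest_root"]].append((level, node["node"]))
    # Sort by (level, name) so the order within a level is deterministic.
    return {root: [name for _, name in sorted(members)]
            for root, members in waves.items()}


def count_within(response, max_hops):
    """Count non-root objects within max_hops downstream of any root."""
    return sum(
        1 for n in response["nodes"]
        if n["downstream_level"] is not None and 0 < n["downstream_level"] <= max_hops
    )
```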
direction values:
ROOT - One of the input root nodes
U - Reachable upstream only (negative upstream_level)
D - Reachable downstream only (positive downstream_level)
BOTH - Reachable in both directions; possible cycle member. Unequal absolute levels indicate a back-edge (cycle). Equal absolute levels indicate a shared dependency.
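The direction values and the cycle-candidate rule can be expressed compactly. The sketch below assumes levels are None when a node is unreachable in that direction, as in the response structure; the function names are hypothetical.

```python
def classify_direction(upstream_level, downstream_level, is_root=False):
    """Return ROOT, U, D or BOTH from the two levels (None = unreachable)."""
    if is_root:
        return "ROOT"
    up = upstream_level is not None
    down = downstream_level is not None
    if up and down:
        return "BOTH"
    if up:
        return "U"
    if down:
        return "D"
    return None  # not reached in either direction


def is_cycle_candidate(upstream_level, downstream_level):
    """A BOTH node is a cycle candidate when its absolute levels differ."""
    if upstream_level is None or downstream_level is None:
        return False
    return abs(upstream_level) != abs(downstream_level)
```

For example, a node at upstream_level -2 and downstream_level 3 is flagged, while one at -2 and 2 is treated as a shared dependency.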
Technical Implementation Notes:
One SQL round-trip to fetch all edges matching the container/exclusion filters. All BFS computation is then done in Python memory.
Standard queue-based BFS (O(V+E)) — optimal for unweighted graphs. This is more correct than the original Bellman-Ford style SQL relaxation loop that the SP inherited from the notebook.
Multi-source BFS: all root nodes are seeded simultaneously at level 0. Each non-root node settles at the distance to its nearest root, with ties broken deterministically by lexicographic root name order.
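A minimal sketch of multi-source BFS with nearest-root tracking follows. It is not the tool's actual code: the adjacency format (a dict mapping each node to its neighbours in the traversal direction) and the function name are assumptions, and the caller is assumed to skip the traversal entirely when the depth argument is 0.

```python
from collections import deque


def multi_source_bfs(adjacency, roots, max_depth):
    """Return (dist, nearest) dicts for every node within max_depth hops of a root."""
    dist = {}
    nearest = {}
    queue = deque()
    # Seed every root at level 0; sorted order keeps the run deterministic.
    for root in sorted(set(roots)):
        dist[root] = 0
        nearest[root] = root
        queue.append(root)
    while queue:
        node = queue.popleft()
        if dist[node] >= max_depth:
            continue  # depth cap enforced during queue processing
        for nbr in adjacency.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                nearest[nbr] = nearest[node]
                queue.append(nbr)
            elif dist[nbr] == dist[node] + 1 and nearest[node] < nearest[nbr]:
                # Same distance via another root: keep the lexicographically
                # smaller root name so ties do not depend on queue order.
                nearest[nbr] = nearest[node]
    return dist, nearest
```

Because every node at level k is dequeued before any node at level k+1, a node's nearest root is final by the time its own neighbours are expanded.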
Upstream BFS follows Src→Tgt edges to discover Src-side ancestors.
Downstream BFS follows Tgt→Src edges to discover Tgt-side consumers.
This direction convention matches the corrected SP (Option B fix):
upstream_level = NULL for root objects with in-degree zero (correct)
downstream_level = positive for all consumers (correct)
Filter application order:
SQL WHERE clause: fetch only edges matching include_containers (both Src and Tgt containers must match at least one pattern)
Python post-filter: exclude edges where either endpoint matches an exclude_objects pattern (applied before building adjacency)
BFS depth cap: enforced during queue processing
Node metadata (container_name, object_name, object_kind) is derived from the edge set and stored in a node registry during the fetch phase.
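The fetch phase described above might look roughly like this. The edge-row shape (dicts with `src_container`, `src_object`, `src_kind`, `tgt_container`, `tgt_object`, `tgt_kind` keys) is a hypothetical stand-in for the edge repository's columns, and the sketch assumes a Src→Tgt edge means that Tgt depends on Src.

```python
from collections import defaultdict


def build_graph(edge_rows):
    """Build the node registry and both adjacency maps from fetched edge rows."""
    registry = {}
    downstream = defaultdict(set)  # Src -> consumers (Tgt side)
    upstream = defaultdict(set)    # Tgt -> dependencies (Src side)
    for row in edge_rows:
        endpoints = []
        for side in ("src", "tgt"):
            container = row[f"{side}_container"]
            obj = row[f"{side}_object"]
            fq = f"{container}.{obj}"
            # First sighting wins; later edges do not overwrite metadata.
            registry.setdefault(fq, {
                "container_name": container,
                "object_name": obj,
                "object_kind": row[f"{side}_kind"],
            })
            endpoints.append(fq)
        src, tgt = endpoints
        downstream[src].add(tgt)
        upstream[tgt].add(src)
    return registry, upstream, downstream
```

Keeping two adjacency maps lets the same BFS routine serve both directions: pass `upstream` to find dependencies and `downstream` to find consumers.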
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| root_node_list | Yes | ||
| max_depth_up | No | ||
| max_depth_down | No | ||
| exclude_objects | No | ||
| include_containers | No | ||
| edge_repository | No | ||