Column level lineage
K provides context and trust for your data items by helping you understand their lineage and relationships. You can see where a data item sources its data from, where it flows to, and how it joins to other data items.
K automatically detects column level lineage and helps you understand how column level data is used, transformed, and derived throughout your data processes.
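
To make the idea concrete, the sketch below shows what column level lineage for a single CREATE statement could look like when written down as data. The table names, column names, dictionary layout, and helper functions are all assumptions made for illustration; they are not K's internal model or API.

```python
# Illustrative only: the SQL, the names and this dictionary layout are
# assumptions, not K's internal model or API.
sql = """
CREATE TABLE sales_summary AS
SELECT o.customer_id, SUM(o.amount) AS total_amount
FROM orders o
GROUP BY o.customer_id
"""

# Target column -> list of (source column, how the source is used).
column_lineage = {
    "sales_summary.customer_id": [("orders.customer_id", "copied")],
    "sales_summary.total_amount": [("orders.amount", "aggregated with SUM")],
}


def upstream_columns(target):
    """Return the source columns that feed a target column."""
    return [source for source, _ in column_lineage.get(target, [])]


def downstream_columns(source):
    """Return the target columns that a source column flows into."""
    return sorted(
        target
        for target, sources in column_lineage.items()
        if any(src == source for src, _ in sources)
    )
```

Reading the mapping in one direction answers "where does this column source its data from", and reading it in the other direction answers "where does this column flow to".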
Column level lineage has limitations due to the variety and complexity of SQL dialects. See Known limitations below for the current list.
Enabling column level lineage
Column level lineage parsing is currently available as a manually triggered process that supplements the lineage captured during daily processes.
The column level lineage process analyses actively run code captured in K. If your environment is complex with a large amount of code, running the process may take time.
To trigger the column level lineage job, go to Settings → Batch Manager and select QUERY_LINEAGE.
Known limitations
| Limitation | Fixed in Release |
|---|---|
| SQL code that is not wrapped in a CREATE… statement. This is common in DBT code used on Redshift. | Under investigation |
| CREATE statements that use a CTE (WITH) inside the statement. | Under investigation |
| Lineage from the second and subsequent tables in a UNION where SELECT * is used. For example, in (SELECT * FROM A UNION SELECT * FROM B), table A’s column level lineage is processed but table B’s columns are not. | Under investigation |
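
To help recognise these patterns in your own code, the sketch below writes out one small example of each and runs it against an in-memory SQLite database purely to confirm the SQL is well formed. The tables `a` and `b`, their columns, and the target table names are hypothetical, and SQLite is only a stand-in here: the example says nothing about how K parses the statements.

```python
import sqlite3

# Hypothetical tables and data; none of these names come from K.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, name TEXT);
    CREATE TABLE b (id INTEGER, name TEXT);
    INSERT INTO a VALUES (1, 'from_a');
    INSERT INTO b VALUES (2, 'from_b');
""")

# Pattern 1: SQL that is not wrapped in a CREATE statement.
bare_select = "SELECT id, name FROM a"

# Pattern 2: a CTE (WITH) inside a CREATE statement.
create_with_cte = """
    CREATE TABLE a_copy AS
    WITH src AS (SELECT id, name FROM a)
    SELECT id, name FROM src
"""

# Pattern 3: SELECT * on both sides of a UNION. The second table (b)
# is the one whose columns are affected by the limitation.
create_union_star = """
    CREATE TABLE a_union_b AS
    SELECT * FROM a UNION SELECT * FROM b
"""

bare_rows = conn.execute(bare_select).fetchall()
conn.execute(create_with_cte)
conn.execute(create_union_star)

copy_rows = conn.execute("SELECT id, name FROM a_copy").fetchall()
union_rows = conn.execute(
    "SELECT id, name FROM a_union_b ORDER BY id"
).fetchall()
# PRAGMA table_info returns one row per column; index 1 is the column name.
union_columns = [row[1] for row in conn.execute("PRAGMA table_info(a_union_b)")]
```

In the third pattern the result table takes its column names from the first SELECT, so the link between table `b`'s columns and the output is positional rather than named.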