Skip to main content
Skip table of contents

Dataset upstream source extract

There may be a time when you need to see all the upstream objects for datasets in K.

A current workaround is to follow the below instructions to generate an extract

Extract details

Columns

Description

Example

name

Name of the dataset

Customer model

object_type

Type of the dataset

Dataset

object_id

ID of the dataset

753f5c32-5cf3-3af6-a499-364d749344e5

source_name

Name of the dataset source

Power BI

upstream_name

Name of the upstream object

dim_customer

upstream_object_type

Type of the upstream object

Table

upstream_object_signature

Fully qualified location fo the upstream object

source.database.schema.table

upstream_id

ID of the upstream object

8a05f40a-0e6a-3ab1-93eb-9a6db10e0601

upstream_source_name

Name of the upstream source

Snowflake

Instructions to produce the extract

CODE
# CONNECT to the postgres pod and start a psql session
kubectl exec -it postgres-statefulset-0  -- psql -U postgres -d cerebrum

# Run this query
COPY (
SELECT DISTINCT
   ds.name AS name,
   dsr.name AS object_type,
   ds.id AS object_id,
   dss.name AS source_name,
   r.name AS upstream_name,
   rr.name AS upstream_object_type,
   r.signature AS upstream_object_signature,
   r.id AS upstream_id,
   rs.name AS upstream_source_name
FROM node ds
INNER JOIN node_ref dsr ON dsr.id = ds.node_ref_id
INNER JOIN source dss ON dss.id =  ds.source_id
INNER JOIN edge ON edge.source_node_id = ds.id AND edge.source_node_ref_id = ds.node_ref_id
INNER JOIN node r ON r.id = edge.target_node_id AND r.node_ref_id = edge.target_node_ref_id
INNER JOIN source rs ON rs.id = r.source_id
INNER JOIN node_ref rr ON rr.id = r.node_ref_id
WHERE ds.node_ref_id IN (15,27)
AND edge.edge_ref_id IN (6,26,32)
AND edge.source_node_ref_id IN (15,27)
AND edge.target_node_ref_id IN (3,4)
AND r.node_ref_id IN (3,4)
ORDER BY source_name, object_type, name, upstream_source_name, upstream_object_type, upstream_object_signature
) TO '/tmp/extract.csv' DELIMITER ',' CSV HEADER;

# Exist out of postgres pod.
# Use kubectl to copy the csv out of the pod.
kubectl cp postgres-statefulset-0:/tmp/extract.csv extract_table_size.csv
JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.