
Custom interface - Data Quality

Introduction to Data Quality Custom Interfaces

The following interface contract describes how Data Quality scores can be loaded into the K platform.

Each interface loads a different set of details: the Metadata interface describes each DQ Test, the Metrics interface records each DQ Test execution, including the score, and the Linkage interface identifies the objects in K that each DQ Test is linked to.

The interfaces must conform to the following specifications:

Encoding: UTF-8 (no BOM)
File Delimiter: | (pipe)
Headers: Present and ordered as per the contract, all in uppercase. Each header value is enclosed in double quotes.
Record Quoting: All fields are enclosed in double quotes. Any double quote inside a field value must be escaped with an additional double quote, e.g. ...|"SELECT ""COLUMN NAME"" from tableA"|...
Record Delimiter: \n (new line character)
Empty Fields: Non-mandatory fields may be left empty. An empty field must still be double quoted.
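As an illustration of the formatting rules above, here is a minimal sketch using Python's standard csv module. The function name write_interface and the sample values are assumptions made for this example; they are not part of the contract.

```python
import csv
import io


def write_interface(stream, headers, rows):
    """Write pipe-delimited, fully quoted records to a text stream."""
    writer = csv.writer(
        stream,
        delimiter="|",
        quotechar='"',
        quoting=csv.QUOTE_ALL,  # quote every field, including empty ones
        doublequote=True,       # escape an embedded " as ""
        lineterminator="\n",    # new line character as the record delimiter
    )
    # Headers are written in uppercase, in the order given.
    writer.writerow([h.upper() for h in headers])
    for row in rows:
        # Treat None as an empty field; it is still written as "".
        writer.writerow(["" if value is None else value for value in row])


# Hypothetical sample record, written to an in-memory buffer.
buffer = io.StringIO()
write_interface(
    buffer,
    ["OBJECT_ID", "QUERY_CODE", "NAME"],
    [["aaa7f9c", 'SELECT "COLUMN NAME" from tableA', None]],
)
output = buffer.getvalue()
print(output)
```

When writing to disk, opening the file with open(path, "w", encoding="utf-8", newline="") produces UTF-8 without a BOM and stops Python from translating the new line character.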


Metadata Interface

Extract contains metadata about each DQ Test.

File name: META_YYYYMMDDHHMMSS.csv

Daily Load: Extract of new or updated objects since the last extract, OR a full snapshot.
Historical Load: Full snapshot of object metadata.

| Column | Data Type | Description | Mandatory |
| --- | --- | --- | --- |
| OBJECT_TYPE | STRING | Set to "DATA_QUALITY_TEST" | Y |
| OBJECT_ID | STRING | UUID of the DQ Test. Must be unique, e.g. "aaa7f9c" | Y |
| TOOL_OBJECT_TYPE | STRING | Type of the DQ Test, e.g. "Accuracy check" | N |
| NAME | STRING | Name of the DQ Test | N |
| DESCRIPTION | STRING | Description of the DQ Test | N |
| URL | STRING | A URL to the object in the tool (only if the tool has a web UI) | N |
| PATH | STRING | Fully qualified path to the object. Replace any full stop '.' with '_' when constructing the path. | N |
| QUERY_CODE | STRING | Code used in the DQ Test | N |
| CREATED_AT | TIMESTAMP | No longer used; leave blank. | N |
| UPDATED_AT | TIMESTAMP | No longer used; leave blank. | N |
| LAST_RUN_AT | TIMESTAMP | No longer used; leave blank. | N |
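The sketch below shows one way to assemble a Metadata record and its file name from the columns above. The helper names (meta_file_name, build_path, meta_row) and the sample values are hypothetical. Two choices are assumptions not stated in the contract: the file name timestamp is taken as supplied, and path parts are joined with '.' after any full stop inside a part is replaced with '_'.

```python
from datetime import datetime, timezone

# Column order as listed in the Metadata interface.
META_COLUMNS = [
    "OBJECT_TYPE", "OBJECT_ID", "TOOL_OBJECT_TYPE", "NAME", "DESCRIPTION",
    "URL", "PATH", "QUERY_CODE", "CREATED_AT", "UPDATED_AT", "LAST_RUN_AT",
]


def meta_file_name(extracted_at):
    """Build META_YYYYMMDDHHMMSS.csv from the extract time."""
    return extracted_at.strftime("META_%Y%m%d%H%M%S.csv")


def build_path(*parts):
    """Replace '.' inside each part with '_', then join (separator assumed)."""
    return ".".join(part.replace(".", "_") for part in parts)


def meta_row(object_id, **optional):
    """Return one record in contract order; unset fields stay empty."""
    row = dict.fromkeys(META_COLUMNS, "")
    row["OBJECT_TYPE"] = "DATA_QUALITY_TEST"
    row["OBJECT_ID"] = object_id
    for key, value in optional.items():
        column = key.upper()
        if column not in META_COLUMNS:
            raise KeyError(f"unknown column: {column}")
        row[column] = value
    return [row[column] for column in META_COLUMNS]


# Hypothetical DQ Test; the three deprecated timestamp columns stay blank.
record = meta_row(
    "aaa7f9c",
    tool_object_type="Accuracy check",
    name="Row count",
    path=build_path("dq.tool", "suite", "row_count"),
)
file_name = meta_file_name(datetime(2024, 1, 15, 9, 30, 5, tzinfo=timezone.utc))
print(file_name, record)
```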


Metrics Interface

Extract contains metrics from each DQ test run.

File name: METRICS_YYYYMMDDHHMMSS.csv

Daily Load: Extract for new events since last extract.
Historical Load: Full snapshot of events.

| Column | Data Type | Description | Mandatory |
| --- | --- | --- | --- |
| TIMESTAMP | TIMESTAMP | Time of DQ Test execution in UTC. Format: YYYY-MM-DD HH:MI:SS.sss | Y |
| METRIC_TYPE | STRING | Set to "DATA_QUALITY_SCORE" | Y |
| RELATED_OBJECT_TYPE | STRING | Set to "DATA_QUALITY_TEST" | Y |
| RELATED_OBJECT_ID | STRING | Object ID of the DQ Test, e.g. "aaa7f9c" | Y |
| OBJECT_TYPE | STRING | OBJECT_TYPE of the target. Valid values: COLUMN, TABLE, DATASET_FIELD, DATASET_TABLE | Y |
| OBJECT_ID | STRING | Fully qualified path of the target, e.g. "demo.adventureworks.person.person.businessentityid" | Y |
| USER_ID | STRING | User that executed the test | Y |
| VALUE | STRING | DQ Test score. Must be a value between 0 and 100 | Y |
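A Metrics record can be checked before it is written. The following is a minimal sketch, assuming the score range is inclusive of 0 and 100 and that the execution time is supplied as a timezone-aware datetime; the function names and the user "dq_service" are hypothetical.

```python
from datetime import datetime, timedelta, timezone

VALID_TARGET_TYPES = {"COLUMN", "TABLE", "DATASET_FIELD", "DATASET_TABLE"}


def format_timestamp(executed_at):
    """Format an aware datetime as YYYY-MM-DD HH:MI:SS.sss in UTC."""
    utc = executed_at.astimezone(timezone.utc)
    millis = utc.microsecond // 1000  # truncate microseconds to milliseconds
    return utc.strftime("%Y-%m-%d %H:%M:%S.") + f"{millis:03d}"


def metrics_row(executed_at, test_id, target_type, target_path, user_id, score):
    """Return one Metrics record in contract order, or raise ValueError."""
    if target_type not in VALID_TARGET_TYPES:
        raise ValueError(f"invalid OBJECT_TYPE: {target_type}")
    if not 0 <= score <= 100:  # inclusive bounds are an assumption
        raise ValueError(f"VALUE out of range: {score}")
    return [
        format_timestamp(executed_at),
        "DATA_QUALITY_SCORE",
        "DATA_QUALITY_TEST",
        test_id,
        target_type,
        target_path,
        user_id,
        str(score),  # VALUE is a STRING column
    ]


# A run recorded at 09:30:05.123456 in UTC+10 is converted to UTC.
run_time = datetime(2024, 1, 15, 9, 30, 5, 123456,
                    tzinfo=timezone(timedelta(hours=10)))
row = metrics_row(
    run_time,
    "aaa7f9c",
    "COLUMN",
    "demo.adventureworks.person.person.businessentityid",
    "dq_service",
    98.5,
)
print(row)
```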


Linkage Interface

Extract contains relationships between a DQ Test and the object that it is testing.

File name: LINKAGES_YYYYMMDDHHMMSS.csv

Daily Load: Extract for new or updated relationships since last extract OR full snapshot.
Historical Load: Full snapshot of linkages.

| Column | Data Type | Description | Mandatory |
| --- | --- | --- | --- |
| SRC_OBJECT_TYPE | STRING | Set to "DATA_QUALITY_TEST" | Y |
| SRC_OBJECT_ID | STRING | Object ID of the DQ Test, e.g. "aaa7f9c" | Y |
| TRG_OBJECT_TYPE | STRING | OBJECT_TYPE of the target. Valid values: COLUMN, TABLE, DATASET_FIELD, DATASET_TABLE | Y |
| TRG_OBJECT_ID | STRING | Fully qualified path of the target | Y |
| RELATIONSHIP | STRING | Set to "TARGETS" | Y |
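Linkage records follow directly from a mapping of each DQ Test to the objects it tests. The sketch below assumes such a mapping is available as a dictionary; the function name linkage_rows and the sample data are hypothetical.

```python
VALID_TARGET_TYPES = {"COLUMN", "TABLE", "DATASET_FIELD", "DATASET_TABLE"}


def linkage_rows(targets_by_test):
    """Return one Linkage record per (DQ Test, target) pair, in contract order."""
    rows = []
    for test_id, targets in targets_by_test.items():
        for target_type, target_path in targets:
            if target_type not in VALID_TARGET_TYPES:
                raise ValueError(f"invalid TRG_OBJECT_TYPE: {target_type}")
            rows.append([
                "DATA_QUALITY_TEST",  # SRC_OBJECT_TYPE
                test_id,              # SRC_OBJECT_ID
                target_type,          # TRG_OBJECT_TYPE
                target_path,          # TRG_OBJECT_ID
                "TARGETS",            # RELATIONSHIP
            ])
    return rows


# Hypothetical mapping: one DQ Test linked to a column and its table.
links = linkage_rows({
    "aaa7f9c": [
        ("COLUMN", "demo.adventureworks.person.person.businessentityid"),
        ("TABLE", "demo.adventureworks.person.person"),
    ],
})
print(links)
```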