Introduction to Data Quality Custom Interfaces
The following interface contract describes how Data Quality scores can be loaded into the K platform.
Each interface covers different details to be loaded: the Metadata interface provides details about each DQ Test, the Metrics interface provides details about each DQ Test execution including the score to be recorded, and the Linkage interface provides details about what each DQ Test is linked to in K.
The interfaces must conform to the following specifications:

| Property | Value |
|---|---|
| Encoding | UTF-8 (No BOM) |
| File Delimiter | \| (pipe) |
| Headers | Ordered and present as per contract, all in uppercase. Each header value enclosed in double quotes. |
| Record Quoting | All fields enclosed in double quotes. Any double quote inside a field value must be escaped with an additional double quote. |
| Record Delimiter | \n (new line character) |
| Empty Fields | Non-mandatory fields may be left empty. Note that an empty field must still be double quoted. |

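
The file conventions above map fairly directly onto Python's standard `csv` module. The following is a minimal sketch rather than an official loader: the helper name `write_interface_file` and the sample row are hypothetical and only illustrate the delimiter, quoting and record delimiter rules.

```python
import csv
import io


def write_interface_file(stream, headers, rows):
    """Write rows to a text stream using the file conventions described above."""
    writer = csv.writer(
        stream,
        delimiter="|",          # File Delimiter: pipe
        quoting=csv.QUOTE_ALL,  # every field, including empty ones, is double quoted
        doublequote=True,       # an embedded " is written as ""
        lineterminator="\n",    # Record Delimiter: new line character
    )
    # Headers are written in the order given, in uppercase.
    writer.writerow([header.upper() for header in headers])
    for row in rows:
        # Treat None as an empty (but still quoted) field.
        writer.writerow(["" if value is None else value for value in row])


# Hypothetical sample data, for illustration only.
buffer = io.StringIO()
write_interface_file(
    buffer,
    ["OBJECT_ID", "NAME", "DESCRIPTION"],
    [["aaa7f9c", 'Check "id" is unique', None]],
)
output = buffer.getvalue()
print(output)
```

When writing to disk instead of an in-memory buffer, opening the file with `open(path, "w", encoding="utf-8", newline="")` should satisfy the encoding rule, since Python's `utf-8` codec does not emit a BOM (unlike `utf-8-sig`).
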
Metadata Interface
Extract contains metadata about each DQ Test.
File name: META_YYYYMMDDHHMMSS.csv
Daily Load: Extract for new or updated objects since last extract OR full snapshot.
Historical Load: Full snapshot of object metadata.

| Column | Data Type | Description | Mandatory |
|---|---|---|---|
| OBJECT_TYPE | STRING | Set to DATA_QUALITY_TEST | Y |
| OBJECT_ID | STRING | UUID of the DQ Test. Must be unique e.g. "aaa7f9c" | Y |
| TOOL_OBJECT_TYPE | STRING | Type of the DQ Test e.g. "Accuracy check" | N |
| NAME | STRING | Name of the DQ Test | N |
| DESCRIPTION | STRING | Description of the DQ Test | N |
| URL | STRING | A URL to the object in the tool (only if tool has a web UI) | N |
| PATH | STRING | Fully qualified path to the object. Replace any full stop '.' with '_' when constructing the path. | N |
| QUERY_CODE | STRING | Code used in the DQ Test | N |
| CREATED_AT | TIMESTAMP | No longer used; leave blank. | N |
| UPDATED_AT | TIMESTAMP | No longer used; leave blank. | N |
| LAST_RUN_AT | TIMESTAMP | No longer used; leave blank. | N |

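
As an illustration of the Metadata contract, the sketch below builds the file name and one record in column order. All function names are hypothetical. The contract does not say here which time zone the file name timestamp uses or how path segments are joined, so the sketch takes a ready-made `datetime` and only sanitises a single path segment.

```python
from datetime import datetime

METADATA_HEADERS = [
    "OBJECT_TYPE", "OBJECT_ID", "TOOL_OBJECT_TYPE", "NAME", "DESCRIPTION",
    "URL", "PATH", "QUERY_CODE", "CREATED_AT", "UPDATED_AT", "LAST_RUN_AT",
]


def metadata_file_name(extracted_at):
    """Build a META_YYYYMMDDHHMMSS.csv file name from a datetime."""
    return "META_" + extracted_at.strftime("%Y%m%d%H%M%S") + ".csv"


def sanitise_path_segment(segment):
    """Replace any full stop in a single path segment with an underscore."""
    return segment.replace(".", "_")


def metadata_row(object_id, tool_object_type="", name="", description="",
                 url="", path="", query_code=""):
    """Return one Metadata record as a list ordered like METADATA_HEADERS."""
    if not object_id:
        raise ValueError("OBJECT_ID is mandatory")
    return [
        "DATA_QUALITY_TEST",  # OBJECT_TYPE is fixed for this interface
        object_id,
        tool_object_type,
        name,
        description,
        url,
        path,
        query_code,
        "", "", "",           # CREATED_AT, UPDATED_AT, LAST_RUN_AT: no longer used
    ]


# Hypothetical values, for illustration only.
file_name = metadata_file_name(datetime(2024, 1, 31, 9, 5, 0))
row = metadata_row("aaa7f9c", tool_object_type="Accuracy check")
print(file_name)
print(row)
```

Optional columns default to empty strings, which a writer configured to quote every field would still emit as `""`.
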
Metrics Interface
Extract contains metrics from each DQ test run.
File name: METRICS_YYYYMMDDHHMMSS.csv
Daily Load: Extract for new events since last extract.
Historical Load: Full snapshot of events.

| Column | Data Type | Description | Mandatory |
|---|---|---|---|
| TIMESTAMP | TIMESTAMP | Time of DQ Test execution in UTC. Format: YYYY-MM-DD HH:MI:SS.sss | Y |
| METRIC_TYPE | STRING | Set to "DATA_QUALITY_SCORE" | Y |
| RELATED_OBJECT_TYPE | STRING | Set to "DATA_QUALITY_TEST" | Y |
| RELATED_OBJECT_ID | STRING | Object ID of the DQ Test e.g. "aaa7f9c" | Y |
| OBJECT_TYPE | STRING | OBJECT_TYPE of the target. Valid values: COLUMN, TABLE, DATASET_FIELD, DATASET_TABLE | Y |
| OBJECT_ID | STRING | Fully qualified path of the target e.g. "demo.adventureworks.person.person.businessentityid" | Y |
| USER_ID | STRING | User that executed the test | Y |
| VALUE | STRING | DQ Test score. Must be a value between 0 and 100 | Y |

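
The two fields in the Metrics contract that are easiest to get wrong are the timestamp format and the score range. The sketch below shows one way to handle both. The function names and the user ID `jsmith` are hypothetical, and treating the 0 to 100 range as inclusive is an assumption.

```python
from datetime import datetime, timezone

VALID_TARGET_TYPES = {"COLUMN", "TABLE", "DATASET_FIELD", "DATASET_TABLE"}


def format_timestamp(executed_at):
    """Format a timezone-aware datetime as YYYY-MM-DD HH:MI:SS.sss in UTC."""
    if executed_at.tzinfo is None:
        raise ValueError("executed_at must be timezone-aware")
    utc = executed_at.astimezone(timezone.utc)
    # %f yields six digits of microseconds; drop the last three for milliseconds.
    return utc.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]


def metrics_row(executed_at, test_id, target_type, target_id, user_id, score):
    """Return one Metrics record as a list in contract column order."""
    if target_type not in VALID_TARGET_TYPES:
        raise ValueError(f"unsupported OBJECT_TYPE: {target_type}")
    # Assumption: both 0 and 100 are accepted as scores.
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    return [
        format_timestamp(executed_at),  # TIMESTAMP
        "DATA_QUALITY_SCORE",           # METRIC_TYPE
        "DATA_QUALITY_TEST",            # RELATED_OBJECT_TYPE
        test_id,                        # RELATED_OBJECT_ID
        target_type,                    # OBJECT_TYPE
        target_id,                      # OBJECT_ID
        user_id,                        # USER_ID
        str(score),                     # VALUE is a STRING column
    ]


# Hypothetical values, for illustration only.
row = metrics_row(
    datetime(2024, 1, 31, 9, 5, 0, 123456, tzinfo=timezone.utc),
    "aaa7f9c",
    "COLUMN",
    "demo.adventureworks.person.person.businessentityid",
    "jsmith",
    97.5,
)
print(row)
```
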
Linkage Interface
Extract contains relationships between a DQ Test and the object that it is testing.
File name: LINKAGES_YYYYMMDDHHMMSS.csv
Daily Load: Extract for new or updated relationships since last extract OR full snapshot.
Historical Load: Full snapshot of linkages.

| Column | Data Type | Description | Mandatory |
|---|---|---|---|
| SRC_OBJECT_TYPE | STRING | Set to "DATA_QUALITY_TEST" | Y |
| SRC_OBJECT_ID | STRING | Object ID of the DQ Test e.g. "aaa7f9c" | Y |
| TRG_OBJECT_TYPE | STRING | OBJECT_TYPE of the target. Valid values: COLUMN, TABLE, DATASET_FIELD, DATASET_TABLE | Y |
| TRG_OBJECT_ID | STRING | Fully qualified path of the target | Y |
| RELATIONSHIP | STRING | Set to "TARGETS" | Y |
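
Linkage records can be sketched in the same way. The example below turns a list of (test, target type, target) tuples into rows in contract column order. The function name `linkage_rows` is hypothetical, and dropping repeated tuples is a convenience chosen for this sketch, not something the contract requires.

```python
VALID_TARGET_TYPES = {"COLUMN", "TABLE", "DATASET_FIELD", "DATASET_TABLE"}

LINKAGE_HEADERS = [
    "SRC_OBJECT_TYPE", "SRC_OBJECT_ID", "TRG_OBJECT_TYPE",
    "TRG_OBJECT_ID", "RELATIONSHIP",
]


def linkage_rows(links):
    """Build Linkage records from (test_id, target_type, target_id) tuples."""
    rows = []
    # dict.fromkeys drops repeated tuples while keeping first-seen order.
    for test_id, target_type, target_id in dict.fromkeys(links):
        if target_type not in VALID_TARGET_TYPES:
            raise ValueError(f"unsupported TRG_OBJECT_TYPE: {target_type}")
        rows.append([
            "DATA_QUALITY_TEST",  # SRC_OBJECT_TYPE is fixed
            test_id,              # SRC_OBJECT_ID
            target_type,          # TRG_OBJECT_TYPE
            target_id,            # TRG_OBJECT_ID
            "TARGETS",            # RELATIONSHIP is fixed
        ])
    return rows


# Hypothetical values, for illustration only.
rows = linkage_rows([
    ("aaa7f9c", "COLUMN", "demo.adventureworks.person.person.businessentityid"),
    ("aaa7f9c", "TABLE", "demo.adventureworks.person.person"),
    ("aaa7f9c", "COLUMN", "demo.adventureworks.person.person.businessentityid"),
])
for row in rows:
    print(row)
```
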