Custom interface - Data Quality
Introduction to Data Quality Custom Interfaces
The following interface contract describes how Data Quality scores can be loaded into the K platform.
Each interface covers different details to be loaded:
Metadata interface: Details about each DQ Test
Metrics interface: Details about each DQ Test execution including the score to be recorded
Linkage interface: Details about what each DQ Test is linked to in K
The interfaces must conform to the following specifications:
Property | Value |
---|---|
Encoding | UTF-8 (No BOM) |
File Delimiter | \| (pipe) |
Headers | Ordered and present as per the contract, all in uppercase. Each header value enclosed in double quotes. |
Record Quoting | All fields enclosed in double quotes, e.g. “value”. Any double quote inside a field value must be escaped with an additional double quote. |
Record Delimiter | \n (new line character) |
Empty Fields | Non-mandatory fields may be left empty. Note that an empty field must still be double quoted. |
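The formatting rules above map closely onto Python's standard `csv` module. The following is a minimal sketch of one way to produce conforming output, not a reference implementation; the function names `render_records` and `write_extract` and the sample values are hypothetical.

```python
import csv
import io


def render_records(header, rows):
    """Render a header and rows as pipe-delimited, fully quoted text."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter="|",          # file delimiter: pipe
        quotechar='"',
        quoting=csv.QUOTE_ALL,  # quote every field, including empty ones
        doublequote=True,       # a quote inside a value is written twice
        lineterminator="\n",    # record delimiter: new line character
    )
    writer.writerow([name.upper() for name in header])  # headers in uppercase
    writer.writerows(rows)
    return buffer.getvalue()


def write_extract(path, header, rows):
    """Write an extract file as UTF-8 without a BOM."""
    # "utf-8" (unlike "utf-8-sig") emits no BOM; newline="" stops Python
    # from translating "\n" into the platform line ending.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(render_records(header, rows))


# Hypothetical sample: one value containing double quotes, one empty field.
sample = render_records(["name", "description"], [['Check "A"', ""]])
print(sample)
```

Here the embedded quotes in `Check "A"` are doubled on output and the empty description is still written as a quoted empty field.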
Metadata Interface
Extract contains metadata about each DQ Test.
File name: META_YYYYMMDDHHMMSS.csv
Daily Load: Extract of new or updated objects since the last extract, or a full snapshot.
Historical Load: Full snapshot of object metadata.
Column | Data Type | Description | Value Mandatory |
---|---|---|---|
OBJECT_TYPE | STRING | Set to DATA_QUALITY_TEST | Y |
OBJECT_ID | STRING | Set to the UUID of the DQ Test. Must be unique, e.g. “aaa7f9c” | Y |
TOOL_OBJECT_TYPE | STRING | Type of the DQ Test, e.g. “Accuracy check” | N |
NAME | STRING | Name of the DQ Test, e.g. “Column value must be between 0 and 100” | N |
DESCRIPTION | STRING | Description of the DQ Test, e.g. “Checks the value of this field is accurate and valid” | N |
URL | STRING | A URL to the object in the tool. Only applicable if the tool has a web UI to view the object. | N |
PATH | STRING | A fully qualified path to the object if possible, otherwise the parent of this object. Replace any full stop ‘.’ with ‘_’ when constructing the path. | N |
QUERY_CODE | STRING | Code used in the DQ Test | N |
CREATED_AT | TIMESTAMP | No longer used; leave blank. | N |
UPDATED_AT | TIMESTAMP | No longer used; leave blank. | N |
LAST_RUN_AT | TIMESTAMP | No longer used; leave blank. | N |
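As an illustration of the Metadata interface columns, the sketch below builds one record in contract order and derives the file name. The helper names `meta_file_name` and `meta_record` are hypothetical, and which clock or time zone the file name timestamp should use is not stated in the contract, so the caller supplies it here.

```python
from datetime import datetime

# Column order as listed in the Metadata interface table.
META_COLUMNS = [
    "OBJECT_TYPE", "OBJECT_ID", "TOOL_OBJECT_TYPE", "NAME", "DESCRIPTION",
    "URL", "PATH", "QUERY_CODE", "CREATED_AT", "UPDATED_AT", "LAST_RUN_AT",
]


def meta_file_name(extracted_at):
    """Build META_YYYYMMDDHHMMSS.csv from the extract time."""
    return "META_" + extracted_at.strftime("%Y%m%d%H%M%S") + ".csv"


def meta_record(object_id, tool_object_type="", name="", description="",
                url="", path="", query_code=""):
    """Return one metadata record as a list ordered like META_COLUMNS."""
    if not object_id:
        raise ValueError("OBJECT_ID is mandatory")
    # PATH is passed through unchanged: the caller is expected to have
    # replaced full stops with underscores when constructing it.
    return [
        "DATA_QUALITY_TEST", object_id, tool_object_type, name, description,
        url, path, query_code,
        "", "", "",  # CREATED_AT, UPDATED_AT, LAST_RUN_AT: no longer used
    ]


record = meta_record("aaa7f9c", tool_object_type="Accuracy check",
                     name="Column value must be between 0 and 100")
print(meta_file_name(datetime(2024, 1, 2, 3, 4, 5)))
print(dict(zip(META_COLUMNS, record)))
```

The optional fields default to empty strings so that they are still emitted, and can still be double quoted, when the record is written out.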
Metrics Interface
Extract contains metrics from each DQ test run.
File name: METRICS_YYYYMMDDHHMMSS.csv
Daily Load: Extract of new events since the last extract.
Historical Load: Full snapshot of events.
Column | Data Type | Description | Value Mandatory |
---|---|---|---|
TIMESTAMP | TIMESTAMP | Time of the DQ Test execution, in UTC. Format: YYYY-MM-DD HH:MI:SS.sss | Y |
METRIC_TYPE | STRING | Set to “DATA_QUALITY_SCORE” | Y |
RELATED_OBJECT_TYPE | STRING | See OBJECT_TYPE in the Metadata Interface. Set to “DATA_QUALITY_TEST” | Y |
RELATED_OBJECT_ID | STRING | See OBJECT_ID in the Metadata Interface. Set to the Object ID of the DQ Test that the metric relates to, e.g. “aaa7f9c” | Y |
OBJECT_TYPE | STRING | Set to the OBJECT_TYPE of the target of the DQ Test, e.g. if the DQ Test is on a column in a table, set OBJECT_TYPE to “COLUMN”. Valid values include: COLUMN, TABLE, DATASET_FIELD, DATASET_TABLE | Y |
OBJECT_ID | STRING | Fully qualified path of the target of the DQ Test, e.g. “demo.adventureworks.person.person.businessentityid” if businessentityid is the target column. Must align with the path in K. | Y |
USER_ID | STRING | User that executed the test | Y |
VALUE | STRING | DQ Test score. Must be a value between 0 and 100. | Y |
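The Metrics interface has two rules that are easy to get wrong: the UTC timestamp format with milliseconds, and the 0 to 100 score range. The sketch below shows one possible way to handle both. The helper names are hypothetical, the bounds are assumed to be inclusive, and the user ID `dq_service` is an invented sample value.

```python
from datetime import datetime, timezone

# Column order as listed in the Metrics interface table.
METRICS_COLUMNS = [
    "TIMESTAMP", "METRIC_TYPE", "RELATED_OBJECT_TYPE", "RELATED_OBJECT_ID",
    "OBJECT_TYPE", "OBJECT_ID", "USER_ID", "VALUE",
]


def format_timestamp(moment):
    """Format an aware datetime as YYYY-MM-DD HH:MI:SS.sss in UTC."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = moment.astimezone(timezone.utc)
    millis = utc.microsecond // 1000  # truncate microseconds to milliseconds
    return utc.strftime("%Y-%m-%d %H:%M:%S") + f".{millis:03d}"


def metric_record(test_id, target_type, target_path, user_id, score,
                  executed_at):
    """Return one metrics record as a list ordered like METRICS_COLUMNS."""
    # Assumption: the 0 to 100 range is inclusive at both ends.
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    return [
        format_timestamp(executed_at),
        "DATA_QUALITY_SCORE",
        "DATA_QUALITY_TEST",
        test_id,
        target_type,
        target_path,
        user_id,
        str(score),  # VALUE is a STRING column
    ]


ts = datetime(2024, 3, 1, 12, 30, 45, 123456, tzinfo=timezone.utc)
row = metric_record(
    "aaa7f9c", "COLUMN",
    "demo.adventureworks.person.person.businessentityid",
    "dq_service", 98.5, ts,
)
print(dict(zip(METRICS_COLUMNS, row)))
```

Requiring a timezone-aware datetime avoids silently treating a naive local time as UTC.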
Linkage Interface
Extract contains relationships between a DQ Test and the object that it is testing.
File name: LINKAGES_YYYYMMDDHHMMSS.csv
Daily Load: Extract of new or updated relationships since the last extract, or a full snapshot.
Historical Load: Full snapshot of linkages between objects.
Column | Data Type | Description | Value Mandatory |
---|---|---|---|
SRC_OBJECT_TYPE | STRING | Set to “DATA_QUALITY_TEST” | Y |
SRC_OBJECT_ID | STRING | See OBJECT_ID in the Metadata Interface, e.g. “aaa7f9c” | Y |
TRG_OBJECT_TYPE | STRING | Set to the OBJECT_TYPE of the target of the DQ Test, e.g. if the DQ Test is on a column in a table, set TRG_OBJECT_TYPE to “COLUMN”. Valid values include: COLUMN, TABLE, DATASET_FIELD, DATASET_TABLE | Y |
TRG_OBJECT_ID | STRING | Fully qualified path of the target of the DQ Test, e.g. “demo.adventureworks.person.person.businessentityid” if businessentityid is the target column. Must align with the path in K. | Y |
RELATIONSHIP | STRING | Set to “TARGETS” | Y |
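A Linkage record can be assembled from the same test ID and target path used in the other two interfaces. The following is a minimal sketch with hypothetical helper names; because every Linkage column is mandatory, it rejects empty values.

```python
# Column order as listed in the Linkage interface table.
LINKAGE_COLUMNS = [
    "SRC_OBJECT_TYPE", "SRC_OBJECT_ID", "TRG_OBJECT_TYPE", "TRG_OBJECT_ID",
    "RELATIONSHIP",
]


def linkage_record(test_id, target_type, target_path):
    """Return one linkage record as a list ordered like LINKAGE_COLUMNS."""
    fields = {
        "SRC_OBJECT_ID": test_id,
        "TRG_OBJECT_TYPE": target_type,
        "TRG_OBJECT_ID": target_path,
    }
    # Every column in this interface is marked mandatory.
    for column, value in fields.items():
        if not value:
            raise ValueError(column + " is mandatory")
    return ["DATA_QUALITY_TEST", test_id, target_type, target_path, "TARGETS"]


link = linkage_record(
    "aaa7f9c", "COLUMN",
    "demo.adventureworks.person.person.businessentityid",
)
print(dict(zip(LINKAGE_COLUMNS, link)))
```

The target type and path used here should match the OBJECT_TYPE and OBJECT_ID values sent for the same test in the Metrics interface.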