About Collectors
Pre-requisites
Collector Server Minimum Requirements
DB2 Requirements
- The DB2 user that the collector will be using must have select access to the following tables:
  - syscat.tables
  - syscat.views
  - syscat.columns
  - syscat.procedures
  - syscat.functions
  - syscat.roleauth
  - syscat.tableauth
  - sysibm.sqlforeignkeys
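One way to give the collector user this access is with GRANT statements. The following is a minimal sketch that only builds the statements as strings, one per catalog object listed above. The user name `KADA_USER` is a placeholder rather than something defined by the collector, and a DBA should review the statements before running them with their usual DB2 client.

```python
# Catalog objects the collector reads, as listed above.
CATALOG_TABLES = [
    "syscat.tables",
    "syscat.views",
    "syscat.columns",
    "syscat.procedures",
    "syscat.functions",
    "syscat.roleauth",
    "syscat.tableauth",
    "sysibm.sqlforeignkeys",
]


def build_grants(db_user):
    """Return one GRANT SELECT statement per catalog object for db_user."""
    return [
        "GRANT SELECT ON {} TO USER {};".format(table.upper(), db_user)
        for table in CATALOG_TABLES
    ]


# KADA_USER is a hypothetical user name; replace it with the collector's DB2 user.
for statement in build_grants("KADA_USER"):
    print(statement)
```

Keeping the object list in one place makes it easy to compare against the requirements whenever they change.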
- Enabling DB2 Audit
To capture usage information, auditing must be enabled in DB2.
See https://www.ibm.com/docs/en/db2/11.1?topic=facility-audit-policies
KADA audit policy guidelines
- KADA recommends starting with the WITHOUT DATA directive to limit logging. However, if dynamic SQL is used, WITH DATA may need to be enabled.
- KADA only requires the successful EXECUTE events.
```sql
CREATE AUDIT POLICY KADA CATEGORIES EXECUTE WITHOUT DATA STATUS SUCCESS ERROR TYPE NORMAL;
COMMIT;
AUDIT DATABASE USING POLICY KADA;
COMMIT;
```
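If the policy later needs to switch between WITHOUT DATA and WITH DATA, it can help to generate the statements rather than edit them by hand. The sketch below is illustrative only: the function name `build_audit_policy_sql` and its parameters are made up for this example, and it treats COMMIT as a separate statement after each command. It only returns strings and does not connect to DB2.

```python
def build_audit_policy_sql(policy_name="KADA", with_data=False):
    """Return the statements that create and apply an EXECUTE audit policy."""
    # WITHOUT DATA limits logging; WITH DATA may be needed for dynamic SQL.
    data_clause = "WITH DATA" if with_data else "WITHOUT DATA"
    create_policy = (
        "CREATE AUDIT POLICY {} CATEGORIES EXECUTE {} "
        "STATUS SUCCESS ERROR TYPE NORMAL"
    ).format(policy_name, data_clause)
    return [
        create_policy,
        "COMMIT",
        "AUDIT DATABASE USING POLICY {}".format(policy_name),
        "COMMIT",
    ]


# Print the recommended starting policy (successful EXECUTE events, no data).
for statement in build_audit_policy_sql():
    print(statement + ";")
```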
After the logs are captured, they need to be decoded and loaded into DB2 tables. KADA extracts the usage information from these audit tables. Follow the guide at https://www.ibm.com/docs/en/db2/11.1?topic=logs-creating-tables-db2-audit-data
Step 2: Create the Source in K
Create a DB2 source in K
- Go to Settings, select Sources and click Add Source
- Select the DB2 Source Type
- Select the "Load from File system" option
- Give the source a Name, e.g. DB2 Production
- Add the Host name for the DB2 Server
- Click Finish Setup
Step 3: Getting Access to the Source Landing Directory
Step 4: Install the Collector
You can download the latest core library and collector whl file via Platform Settings → Sources → Download Collectors.
Run the following command to install the collector:
```
pip install kada_collectors_extractors_<version>-none-any.whl
```
You will also need to install the common library kada_collectors_lib for this collector to function properly:
```
pip install kada_collectors_lib-<version>-none-any.whl
```
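After installing, a quick check confirms that the interpreter you will run the collector with can find the package. This is a minimal sketch using only the standard library. It checks the top-level package name `kada_collectors`, which is the name the wrapper script in Step 6 imports from; the helper name `is_installed` is just for this example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the top-level module can be located by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


# kada_collectors is the top-level package imported by the wrapper script.
if is_installed("kada_collectors"):
    print("kada_collectors is installed")
else:
    print("kada_collectors was not found; check the pip install output")
```

If the check fails even though pip reported success, the whl was most likely installed into a different Python environment than the one running the check.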
Step 5: Configure the Collector
The DB2 collector currently supports only meta_only=true; do not set it to false.
| FIELD | FIELD TYPE | DESCRIPTION | EXAMPLE |
|---|---|---|---|
| server | string | DB2 server. If using a custom port, append it after a comma | "10.1.18.19" |
| username | string | Username to log into the DB2 account | "myuser" |
| password | string | Password to log into the DB2 account | |
| database_name | string | The DB2 database to connect to | "db2inst" |
| output_path | string | Absolute path to the output location | "/tmp/output" |
| mask | boolean | Whether to enable masking | true |
| compress | boolean | Whether to gzip the output | true |
| meta_only | boolean | Extract metadata only | true |
| host_name | string | The host value under which the source is, or will be, onboarded in K | "db2prod" |
| audit_schema | string | The schema for the audit tables, default is audit | "audit" |
| audit_table | string | The table name for the audit table, default is execute | "execute" |
kada_db2_extractor_config.json
```json
{
    "server": "",
    "username": "",
    "password": "",
    "database_name": "",
    "output_path": "/tmp/output",
    "mask": true,
    "compress": true,
    "meta_only": true,
    "host_name": "",
    "audit_schema": "audit",
    "audit_table": "execute"
}
```
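Before running the collector it can be useful to sanity-check the config file against the field table above. The following is a rough sketch, not part of the KADA library: the function names are invented for this example, and it assumes that every field except audit_schema and audit_table (which have documented defaults) must be present.

```python
import json
import os

# Expected Python type for each field in the configuration table.
EXPECTED_TYPES = {
    "server": str, "username": str, "password": str, "database_name": str,
    "output_path": str, "mask": bool, "compress": bool, "meta_only": bool,
    "host_name": str, "audit_schema": str, "audit_table": str,
}
# Assumption: fields with a documented default may be left out.
OPTIONAL_FIELDS = {"audit_schema", "audit_table"}


def validate_config(config):
    """Return a list of problems found in a DB2 collector config dict."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in config:
            if field not in OPTIONAL_FIELDS:
                problems.append("missing field: {}".format(field))
        elif not isinstance(config[field], expected):
            problems.append("{} should be of type {}".format(field, expected.__name__))
    # The DB2 collector currently only supports meta_only=true.
    if config.get("meta_only") is False:
        problems.append("meta_only must be true for the DB2 collector")
    output_path = config.get("output_path")
    if isinstance(output_path, str) and not os.path.isabs(output_path):
        problems.append("output_path should be an absolute path")
    return problems


def validate_config_file(path):
    """Load a JSON config file and return the problems found in it."""
    with open(path) as handle:
        return validate_config(json.load(handle))
```

Calling `validate_config_file("kada_db2_extractor_config.json")` returns an empty list when nothing looks wrong. An empty list only means the shape is right; it does not test the connection details.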
Step 6: Run the Collector
This is the wrapper script: kada_db2_extractor.py
```python
import os
import argparse
from kada_collectors.extractors.utils import load_config, get_hwm, publish_hwm, get_generic_logger
from kada_collectors.extractors.db2 import Extractor

# Initialise logging for the run
get_generic_logger('root')

# By default, use the config file that sits next to this script
_type = 'db2'
dirname = os.path.dirname(__file__)
filename = os.path.join(dirname, 'kada_{}_extractor_config.json'.format(_type))

parser = argparse.ArgumentParser(description='KADA DB2 Extractor.')
parser.add_argument('--config', '-c', dest='config', default=filename)
parser.add_argument('--name', '-n', dest='name', default=_type)
args = parser.parse_args()

# Fetch the high water mark window for this run
start_hwm, end_hwm = get_hwm(args.name)

# Build the extractor from the config, check connectivity, then extract
ext = Extractor(**load_config(args.config))
ext.test_connection()
ext.run(**{"start_hwm": start_hwm, "end_hwm": end_hwm})

# Record the new high water mark once the run has completed
publish_hwm(args.name, end_hwm)
```
Step 7: Check the Collector Outputs
K Extracts
A set of files (e.g. metadata, databaselog, linkages, events) will be generated in the directory set by output_path.
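A quick way to check the extracts is to list each file with its size and first line. The sketch below is one possible approach using only the standard library. It assumes that gzipped extracts (compress set to true) carry a .gz suffix, and the function name `summarise_outputs` is made up for this example.

```python
import gzip
import os


def summarise_outputs(output_path):
    """Return (file name, size in bytes, first line) for each file in output_path."""
    summary = []
    for name in sorted(os.listdir(output_path)):
        full_path = os.path.join(output_path, name)
        if not os.path.isfile(full_path):
            continue
        # Assumption: gzipped extracts carry a .gz suffix.
        opener = gzip.open if name.endswith(".gz") else open
        with opener(full_path, "rt", encoding="utf-8", errors="replace") as handle:
            first_line = handle.readline().rstrip("\n")
        summary.append((name, os.path.getsize(full_path), first_line))
    return summary


# /tmp/output matches the example output_path in the config above.
if os.path.isdir("/tmp/output"):
    for name, size, first_line in summarise_outputs("/tmp/output"):
        print(name, size, first_line)
```

Empty files or unexpected first lines are a signal to review the collector logs before pushing anything to K.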
High Water Mark File
A high water mark file named db2_hwm.txt is created.
Step 8: Push the Extracts to K
Once the files have been validated, you can push the files to the K landing directory.
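How the push is done depends on where the landing directory lives. The sketch below covers only the simplest case and rests on an assumption: that the K landing directory is reachable as a local or mounted filesystem path. If it is hosted in cloud storage, use that provider's upload tooling instead. The function name `push_extracts` and the example landing path are hypothetical.

```python
import os
import shutil


def push_extracts(output_path, landing_path):
    """Copy every file in output_path to landing_path and return the copied names."""
    # Fail early rather than create a landing directory in the wrong place.
    if not os.path.isdir(landing_path):
        raise FileNotFoundError("landing directory not found: {}".format(landing_path))
    copied = []
    for name in sorted(os.listdir(output_path)):
        source = os.path.join(output_path, name)
        if os.path.isfile(source):
            shutil.copy2(source, os.path.join(landing_path, name))
            copied.append(name)
    return copied


# Hypothetical paths; only runs when both directories exist on this machine.
if os.path.isdir("/tmp/output") and os.path.isdir("/mnt/k_landing/db2prod"):
    print(push_extracts("/tmp/output", "/mnt/k_landing/db2prod"))
```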