About Collectors
Collectors are extractors that you (a customer of K) develop and manage.
KADA provides Python libraries that you can use to deploy a Collector quickly.
Why you should use a Collector
There are several reasons to use a collector rather than the direct connect extractor:
- You are using the KADA SaaS offering and it cannot connect to your sources due to firewall restrictions.
- You want to push metadata to KADA rather than allow it to pull data, for security reasons.
- You want to inspect the metadata before pushing it to K.
Using a collector requires you to manage:
- Deploying and orchestrating the extract code.
- Managing a high water mark so that the extract only pulls the latest metadata.
- Storing and pushing the extracts to your K instance.
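As an illustration of the high water mark responsibility listed above, here is a minimal sketch of tracking the mark in a text file. The file path, the timestamp format, and the function names (`read_hwm`, `write_hwm`, `run_incremental`) are assumptions made for this example; they are not part of the KADA libraries, which provide their own helpers.

```python
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical timestamp format, chosen for this sketch only.
HWM_FORMAT = "%Y-%m-%d %H:%M:%S"


def read_hwm(path, default="1970-01-01 00:00:00"):
    """Return the stored high water mark, or a default on the first run."""
    p = Path(path)
    text = p.read_text().strip() if p.exists() else default
    return datetime.strptime(text, HWM_FORMAT).replace(tzinfo=timezone.utc)


def write_hwm(path, value):
    """Persist the new high water mark (second precision)."""
    Path(path).write_text(value.strftime(HWM_FORMAT))


def run_incremental(path, extract, now=None):
    """Extract from the previous mark up to now; advance the mark only on success."""
    start = read_hwm(path)
    end = now or datetime.now(timezone.utc)
    extract(start, end)  # if this raises, the stored mark is left unchanged
    write_hwm(path, end)
    return start, end
```

Writing the mark only after `extract` returns means a failed run is retried over the same window on the next attempt instead of silently skipping it.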
Pre-requisites
Collector Server Minimum Requirements
Integration with Cognos requires Cognos Analytics APIs.
Cognos Analytics APIs are available from version 11.1.7 onwards.
Previous versions are currently not supported.
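If you keep a record of your Cognos version, a small check like the following can flag unsupported installations before you deploy. It is only a sketch: it assumes the version is available as a plain dotted string such as "11.2.0", and the function names are hypothetical.

```python
# Minimum Cognos Analytics version with the required APIs, per the text above.
MIN_SUPPORTED = (11, 1, 7)


def parse_version(text):
    """Turn '11.2.0' into (11, 2, 0); only dotted numeric versions are handled."""
    return tuple(int(part) for part in text.strip().split("."))


def is_supported(version_text):
    """True if the given version is 11.1.7 or later."""
    return parse_version(version_text) >= MIN_SUPPORTED
```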
Cognos Requirements
- Cognos access
  - A Cognos Analytics user that has the ability to read all objects in Cognos
  - A SQL-authenticated database user for the underlying Audit Database configured for Cognos
- Cognos auditing must be enabled (Log level - Basic)
The Collector currently only supports an Audit Database on SQL Server 2016 or higher. If you use another database type, please contact KADA support.
Step 1: Set up the KADA user configuration in Cognos
This step is performed by a Cognos Admin.
- Log into your Cognos instance.
  - Note down the URL you use (e.g. https://kada-cognos.cloudapp.net/). You will need it in Step 3.
- Create a new KADA user.
  - Follow the steps here: https://www.ibm.com/docs/en/cognos-analytics/11.2.0?topic=namespace-creating-managing-users
  - Add the user to a role that has read access to the objects to be profiled/monitored.
  - To enable K to monitor ALL objects, the user will need read access to ALL Cognos objects.
  - Note down the Namespace ID for the namespace where the user was created.
Step 2: Set up the KADA user in the Cognos Audit Database
- Log into your Cognos Audit Database (e.g. SQL Server).
- Create a new KADA database user.
- Give the KADA database user READ ONLY access to the following tables in the Audit Database:
  - COGIPF_VIEWREPORT
  - COGIPF_USERLOGON
  - COGIPF_RUNREPORT
  - COGIPF_RUNJOB
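One way to grant the read-only access described in Step 2 is a `GRANT SELECT` statement per table. The sketch below only generates the T-SQL text for a DBA to review and run. The user name `kada` and the schema `dbo` are assumptions for illustration, so substitute the values used in your Audit Database.

```python
# The four audit tables listed in Step 2.
AUDIT_TABLES = [
    "COGIPF_VIEWREPORT",
    "COGIPF_USERLOGON",
    "COGIPF_RUNREPORT",
    "COGIPF_RUNJOB",
]


def grant_statements(db_user, schema="dbo", tables=AUDIT_TABLES):
    """Build one SQL Server GRANT SELECT statement per audit table."""
    return [f"GRANT SELECT ON [{schema}].[{table}] TO [{db_user}];" for table in tables]


if __name__ == "__main__":
    # "kada" is a placeholder user name, not a required value.
    for statement in grant_statements("kada"):
        print(statement)
```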
Step 3: Create the Source in K
Create a Cognos source in K:
- Log into your K instance.
- Go to Platform Settings, select Sources and click Add Source.
- Select Cognos.
- Select the "Load from File" option.
- Give the source a Name, e.g. Cognos Production.
- Add the Host name. Use the Cognos URL from Step 1.
- Click Finish Setup.
Step 4: Getting Access to the Source Landing Directory
Step 5: Install the Collector
You can download the latest Core Library and the collector whl file via Platform Settings → Sources → Download Collectors.
Run the following command to install the collector:
pip install kada_collectors_extractors_<version>-none-any.whl
This collector also requires the common library kada_collectors_lib:
pip install kada_collectors_lib-<version>-none-any.whl
pyodbc also requires an ODBC package installed at the OS level and a SQL Server ODBC driver. Refer to https://docs.microsoft.com/en-us/sql/connect/odbc/download-odbc-driver-for-sql-server?view=sql-server-ver15
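After installing the driver, it can help to confirm that pyodbc can see it, because the driver name you configure later must match the installed name exactly. The following is a minimal sketch: `pyodbc.drivers()` is the real pyodbc call that lists installed drivers, while `check_driver` and `installed_odbc_drivers` are hypothetical helper names for this example.

```python
def check_driver(configured, installed):
    """Return (ok, alternatives).

    ok is True when the configured driver name is installed; alternatives
    lists other installed drivers whose name mentions SQL Server.
    """
    alternatives = [name for name in installed
                    if "SQL Server" in name and name != configured]
    return configured in installed, alternatives


def installed_odbc_drivers():
    """List the ODBC drivers visible to pyodbc (needs pyodbc and an OS-level ODBC package)."""
    import pyodbc  # imported lazily so check_driver can be used without pyodbc
    return pyodbc.drivers()
```

For example, `check_driver("ODBC Driver 17 for SQL Server", installed_odbc_drivers())` reports whether that exact name is available on the collector machine.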
Step 6: Configure the Collector
| FIELD | FIELD TYPE | DESCRIPTION | EXAMPLE |
|---|---|---|---|
| server_url | string | Cognos server address domain including the protocol (e.g. | |
| username | string | Username to log into Cognos server created in Step 1 | "cognos" |
| password | string | Password to log into Cognos server | |
| namespace | string | The user namespace which the user will log into. By default the namespace is | "CognosEx" |
| timeout | integer | API timeout for Cognos APIs in seconds. | 20 |
| db_host | string | IP address or address of the Audit database. | "10.1.19.15" |
| db_username | string | Username for the Audit database created in Step 2 | "kada" |
| db_password | string | Password for the database user created in Step 2 | |
| db_port | integer | Default is usually 1433 for SQLServer | 1433 |
| db_name | string | Database name where the audit tables are stored | "Audit" |
| db_schema | string | Schema name where the audit tables are stored | "dbo" |
| db_driver | string | Driver name must match the one installed on the collector machine | "ODBC Driver 17 for SQL Server" |
| db_use_kerberos | boolean | Does the database request impersonation, e.g. Kerberos | false |
| meta_only | boolean | For meta only set this to true otherwise leave it as false | false |
| output_path | string | Absolute path to the output location | "/tmp/output" |
| mask | boolean | To enable masking or not | true |
| mapping | json | Mapping of data source names to onboarded K hosts | {"somehost.adw": "analytics.adw"} |
| compress | boolean | To gzip the output or not | true |
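To sanity-check the `db_*` values independently of the collector, you could assemble a SQL Server ODBC connection string from them and try it with pyodbc. This is a sketch under assumptions: how the collector itself builds its connection is not documented here, `build_connection_string` is a hypothetical helper, and mapping `db_use_kerberos` to `Trusted_Connection=yes` is a guess at the intent of that flag.

```python
def build_connection_string(cfg):
    """Assemble a SQL Server ODBC connection string from the db_* config fields."""
    parts = [
        "DRIVER={" + cfg["db_driver"] + "}",
        "SERVER={},{}".format(cfg["db_host"], cfg["db_port"]),  # host,port
        "DATABASE=" + cfg["db_name"],
    ]
    if cfg.get("db_use_kerberos"):
        # Assumption: integrated (Kerberos) authentication instead of UID/PWD.
        parts.append("Trusted_Connection=yes")
    else:
        parts.append("UID=" + cfg["db_username"])
        parts.append("PWD=" + cfg["db_password"])
    return ";".join(parts)
```

The resulting string can be passed to `pyodbc.connect(...)`. Passwords containing `;` or `}` need extra escaping, which this sketch does not handle.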
kada_cognos_extractor_config.json
{
  "server_url": "http://xxx:9300",
  "username": "",
  "password": "",
  "namespace": "",
  "timeout": 20,
  "db_host": "",
  "db_username": "",
  "db_password": "",
  "db_port": 1433,
  "db_name": "",
  "db_schema": "",
  "db_driver": "ODBC Driver 17 for SQL Server",
  "db_use_kerberos": false,
  "meta_only": false,
  "output_path": "/tmp/output",
  "mask": false,
  "mapping": {},
  "compress": false
}
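Before running the collector, a quick check of the config file against the field table can catch typos early. The sketch below is not part of the KADA libraries. The expected types are taken from the configuration table, and treating every field in that table as required is an assumption.

```python
import json

# Field names and types as listed in the configuration table.
EXPECTED_TYPES = {
    "server_url": str, "username": str, "password": str, "namespace": str,
    "timeout": int, "db_host": str, "db_username": str, "db_password": str,
    "db_port": int, "db_name": str, "db_schema": str, "db_driver": str,
    "db_use_kerberos": bool, "meta_only": bool, "output_path": str,
    "mask": bool, "mapping": dict, "compress": bool,
}


def validate_config(cfg):
    """Return a list of problems; an empty list means no issues were found."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in cfg:
            problems.append(f"missing field: {field}")
            continue
        value = cfg[field]
        # bool is a subclass of int in Python, so reject it for integer fields.
        type_ok = isinstance(value, expected) and not (
            expected is int and isinstance(value, bool))
        if not type_ok:
            problems.append(f"{field} should be of type {expected.__name__}")
    url = cfg.get("server_url")
    if isinstance(url, str) and not url.startswith(("http://", "https://")):
        problems.append("server_url should include the protocol")
    return problems


def load_and_validate(path):
    """Load a JSON config file and return (config, problems)."""
    with open(path, encoding="utf-8") as fh:
        cfg = json.load(fh)
    return cfg, validate_config(cfg)
```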
Step 7: Run the Collector
This is the wrapper script: kada_cognos_extractor.py
import os
import argparse

from kada_collectors.extractors.utils import load_config, get_hwm, publish_hwm, get_generic_logger
from kada_collectors.extractors.cognos import Extractor

# Configure logging.
get_generic_logger('root')

_type = 'cognos'
dirname = os.path.dirname(__file__)
# Default config file: kada_cognos_extractor_config.json next to this script.
filename = os.path.join(dirname, 'kada_{}_extractor_config.json'.format(_type))

parser = argparse.ArgumentParser(description='KADA Cognos Extractor.')
parser.add_argument('--config', '-c', dest='config', default=filename)
parser.add_argument('--name', '-n', dest='name', default=_type)
args = parser.parse_args()

# Fetch the start and end high water marks for this run.
start_hwm, end_hwm = get_hwm(args.name)

ext = Extractor(**load_config(args.config))
ext.test_connection()
ext.run(start_hwm=start_hwm, end_hwm=end_hwm)

# Only reached if the run did not raise, so a failed run does not advance the mark.
publish_hwm(args.name, end_hwm)
Step 8: Check the Collector Outputs
K Extracts
A set of files (e.g. metadata, databaselog, linkages, events) will be generated in the output_path directory.
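To inspect the generated files before pushing them, a short script can list each file with its size and first few lines. This sketch assumes the extracts are text files and that compressed outputs carry a `.gz` suffix. Neither detail is stated above, and `summarise_outputs` is a hypothetical helper name.

```python
import gzip
from pathlib import Path


def summarise_outputs(output_path, preview_lines=3):
    """Yield (file name, size in bytes, first few lines) for each file in output_path."""
    for path in sorted(Path(output_path).iterdir()):
        if not path.is_file():
            continue
        # Assumption: gzipped extracts end in .gz; everything else is plain text.
        opener = gzip.open if path.suffix == ".gz" else open
        with opener(path, "rt", encoding="utf-8", errors="replace") as fh:
            head = [line.rstrip("\n") for _, line in zip(range(preview_lines), fh)]
        yield path.name, path.stat().st_size, head


if __name__ == "__main__":
    # /tmp/output is the example output_path from the config.
    if Path("/tmp/output").is_dir():
        for name, size, head in summarise_outputs("/tmp/output"):
            print(name, size, head)
```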
High Water Mark File
A high water mark file named cognos_hwm.txt is created.
Step 9: Push the Extracts to K
Once the files have been validated, you can push the files to the K landing directory.
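How you push the files depends on how your landing directory is exposed, which this page does not specify. As a minimal sketch, assuming the landing directory is reachable as a mounted filesystem path, the extracts could be copied as follows. If your landing directory is object storage, the copy call would be replaced by the relevant upload client. The function name, the paths, and the choice to skip the high water mark file are all assumptions.

```python
import shutil
from pathlib import Path


def push_extracts(output_path, landing_path, skip=("cognos_hwm.txt",)):
    """Copy extract files to the landing directory and return the names copied."""
    src, dst = Path(output_path), Path(landing_path)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        # Assumption: the high water mark file stays local and is not pushed.
        if path.is_file() and path.name not in skip:
            shutil.copy2(path, dst / path.name)
            copied.append(path.name)
    return copied
```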