Bot Configuration

Alation Cloud Service Applies to Alation Cloud Service instances of Alation

Customer Managed Applies to customer-managed instances of Alation

Sample Bot Configuration

We will start with a sample Bot config. Below is a “Data Source Ownership” Bot Configuration which:

  • Reads all data sources and associated attributes from the rdbms_datasources Alation Analytics table

  • Checks if the Steward OR the Data Owner field for a data source is empty

  • If empty, then start a conversation to notify Alation user #2 with subject Datasource without Steward(s) or Data Owner(s).

    {
        "bot_name": "Bot - Data Source Ownership",
        "description": "Bot to check for without Steward(s) or Data Owner(s)",
        "conn_type": {"sf":
                        {"otype": "DATA",
                        "aa_query": "SELECT * FROM public.rdbms_datasources WHERE deleted = False",
                        "last_etl": "SELECT last_load_time from information_schema.load_history where table_name = 'STG_RDBMS_DATASOURCES' ORDER BY last_load_time DESC LIMIT 1"}
                      },
        "create_policy": True,
        "rules": [
            {
                "fields_to_check": [
                    {"Steward":{"condition":"IS NULL", "field_type":  "OBJECT_SET"}},
                    {"Data Owner":{"condition":"IS NULL", "field_type":  "OBJECT_SET"}}
                ],
                "fields_to_check_operator":"or",
                "notification": {"type": "conversation",
                                 "subject": "Datasource without Steward(s) or Data Owner(s)",
                                 "frequency": "always",
                                 "to": [{"user": 2}]}
            }
        ]
    }
    

The notified user can be any valid Alation user ID, and is typically an administrator.

Let’s examine each of the parameters used in a Bot configuration.

Bot Parameters

Here is a complete list of the Bot configuration parameters:

bot_name

The name of the Bot in the Bot Manager

The name of the Bot Violation Policy within Alation

description

This description will be added to the description in the created Bot Violation Policy within Alation

conn_type

How your Bot will interact with Alation Analytics. There are two possible settings here.

sf represents Snowflake, and this value tells your bot to use the Snowflake connector to communicate with Alation Analytics

pg represents Postgres, and this value tells your bot to use the Postgres connector to communicate with Alation Analytics

Example:

"conn_type": {"sf":

{"otype": "DATA",

"aa_query":"",

"last_etl":""}

}

otype

Alation otype that the Bot will be working with. See supported otypes below

custom_template_name

This is required when running a Bot against a BUSINESS_POLICY. The entry needs to be the exact name of the policy template entered into Alation.

aa_query

Data that the Bot will be analyzing. See Alation Analytics example queries below

last_etl

When a Bot runs, it stores the date_time of that run. Before it runs again, it compares the stored date_time to the last run of Alation ETL to see if there is any new data. If there is no new data, the Bot will not run. See last_etl examples below.

create_policy

Bot Violation policies are not always required. Pass true or false to instruct the Bot to create a policy.

There will be times when you just want your Bot to run and perform an action if a condition is met. For example, attach a specific policy to an object if a condition is met. This example does not need a Bot Violation Policy; just have the Bot perform the action.

rules

A collection of rules your bot will run. Rules are waterfall-based, and the first rule condition satisfied wins. Any following rules will not be examined. Bots have a max limit of 4 rules.

fields_to_check

These are the fields that the condition in the rule should check. They are Alation fields and must be typed exactly as they are labeled within Alation. Spaces are acceptable, and the names have to match.

Multiple fields_to_check entries are acceptable. If more than one entry, then the fields_to_check_operator is required.

Example:

"fields_to_check":[

  {"Certified":{"condition":"IS NULL", "field_type": "PICKER"}},

  {"Phase One":{"condition":"IS NULL", "field_type": "PICKER"}},

  {"Phase Two":{"condition":"IS NULL", "field_type": "PICKER"}},

  {"Phase Three":{"condition":"IS NULL", "field_type": "PICKER"}}

  ],

"fields_to_check_operator": "or",

fields_to_check_operator

If more than one fields_to_check entry is added to a rule, a fields_to_check_operator is required. This can be and or or.

properties_to_check

You can create properties to check from your aa_query. These properties are analyzed just like fields_to_check. Multiple properties_to_check are permitted, but if more than one, the properties_to_check_operator is required.

Example:

"aa_query": "SELECT \*, DATEDIFF(day, verified_date::DATE,

CURRENT_DATE) as date_span FROM public.business_policy

WHERE policy_type_name = 'Compliance' AND deleted = False",

{"date_span":{"condition":"GREATER THAN", "value":"120", "field_type": "TEXT"}}

properties_to_check_operator

If more than one properties_to_check entry is added to a rule, a fields_to_check_operator is required. This can be and or or.

condition

Supported conditions are

IS_NULL

IS NOT NULL

CONTAINS

DOES NOT CONTAIN

GREATER THAN

LESS THAN

value

Required if the following condition is used

CONTAINS

DOES NOT CONTAIN

GREATER THAN

LESS THAN

field_type

Support field types are

TEXT

DATE

MULTI_PICKER

OBJECT_SET

PICKER

RICH_TEXT

actions_to_perform

If a condition is satisfied, one or more actions can be performed. An action is altering the data within a field.

Example:

{"Business Policy":{"action":"ADD","field_type": "OBJECT_SET", "values":["business_policy_24"]}}

action

Supported actions are

ADD

REMOVE

field_type

Supported Alation fields are

TEXT

DATE

MULTI_PICKER

OBJECT_SET

PICKER

RICH_TEXT

value

The value that will be passed to the field. Notice that this is a list since a MULTI_PICKER field is supported.

Notifications

These are the parameters used when a notification is required:

type

conversation

email

slack

subject

{string}

frequency

updates - Any time there is a change in the violation count + or -

always - Every time the Bot runs, a notification is sent regardless of the violation count

Conversation Example

"notification": {"type": "conversation",

  "subject": "Attach PII Policy to required tables",

  "frequency": "always",

  "to": [{"user": 2}]}

A Conversation is then created with the specified Alation user ID - user 2 in this example.

Email Example

"notification": {

  "type": "email",

  "subject": "Table Completeness",

  "frequency": "always",

  "to": [

    "[email protected]",

    "[email protected]"

  ]

}

Slack Example

"notification": {

  "type": "slack",

  "subject": "Table Completeness",

  "frequency": "always",

  "to": [

    "[email protected]",

    "[email protected]"

  ],

  "channel": "valid_slack_channel"

}

Supported otypes

SCHEMA

rdbms_schemas

TABLE

rdbms_tables

ATTRIBUTE

rdbms_columns

DATA

rdbms_datasources

BUSINESS_POLICY

business_policy

aa_query examples

Data Source

SELECT * FROM public.rdbms_datasources WHERE deleted = False AND EXCLUDED = False

Schema

SELECT * FROM public.rdbms_schemas WHERE deleted = False AND EXCLUDED = False

Table

SELECT * FROM public.rdbms_tables WHERE deleted = False AND EXCLUDED = False

Column (Attribute)

SELECT * FROM public.rdbms_columns WHERE deleted = False AND EXCLUDED = False

Policy

SELECT * FROM public.business_policy WHERE deleted = False AND EXCLUDED = False

last_etl examples

You will have noticed a second sql query labelled “last_etl”. Each time the Alation Analytics ETL job runs within your Alation Instance it leaves a “last updated” timestamp in the target database for each object it updates. If you make a catalog change to table metadata for example, the last updated stamp for the rdbms_table table will be updated to that time. When a Bot runs it will check to see if it has a stored timestamp from the last time the Bot ran. If it does, it will compare that timestamp to the one in the data object that the Bot is running against. If the time stamp for the data object is newer then the one stored, it will run the bot because there must be some new data. Otherwise the Bot will not run. If there is no timestamp stored because this is the first time the Bot has been run, it will ignore timestamps and just run. This efficiency feature eliminates unnecessary Bot executions on data that has not changed.

Data Source

SELECT last_load_time from information_schema.load_history where table_name = ‘STG_RDBMS_DATASOURCES’ ORDER BY last_load_time DESC LIMIT 1

Schema

SELECT last_load_time from information_schema.load_history where table_name = ‘STG_RDBMS_SCHEMAS’ ORDER BY last_load_time DESC LIMIT 1

Table

SELECT last_load_time from information_schema.load_history where table_name = ‘STG_RDBMS_TABLES’ ORDER BY last_load_time DESC LIMIT 1

Column (Attribute)

SELECT last_load_time from information_schema.load_history where table_name = ‘STG_RDBMS_COLUMNS’ ORDER BY last_load_time DESC LIMIT 1

Policy

SELECT last_load_time from information_schema.load_history where table_name = ‘STG_BUSINESS_POLICY’ ORDER BY last_load_time DESC LIMIT 1

Further Bot Examples

Elapsed Time

It is possible to create Bots with rules based on elapsed time. This can be useful for managing content that should be checked on a regular basis, or finding content that may have expired for example.

This is achieved by creating a date diff property in the aa_query we run. The date diff will return the number of days between two dates. So in the query we take the current date and the date stored in a custom field and return the difference in days. Then in the Bot rule we test that number and choose to do something if the number exceeds some threshold.

The following Bot checks if a policy has been verified within a specified (120 days) time range.

{
      "bot_name": "Bot Sample - Verified Policy",
      "description": "Bot to check if policy has been verified within a time range",
      "conn_type": {
                      "sf":{
                          "otype": "BUSINESS_POLICY",
                          "custom_template_name": "Compliance",
                          "aa_query": "SELECT *, DATEDIFF(day, verified_date::DATE, CURRENT_DATE) as date_span FROM public.business_policy WHERE policy_type_name = 'Compliance' AND  deleted = False",
                          "last_etl": "SELECT last_load_time from information_schema.load_history WHERE table_name = 'STG_BUSINESS_POLICY' ORDER BY last_load_time DESC LIMIT 1"
                      }
                },
    "create_policy": true,
    "rules": [
      {
          "properties_to_check": [
              {"date_span":{"condition":"GREATER THAN", "value":"120", "field_type":  "TEXT"}}
          ],
          "notification": {"type": "conversation",
                           "subject": "Policy must be verified!",
                           "frequency": "always",
                           "to": [{"user": 2}]}
      }
    ]
}