Configure Query Log Ingestion

Applies to: Alation Cloud Service instances of Alation and customer-managed instances of Alation

On the Query Log Ingestion tab, you can select the query log ingestion (QLI) options for your data source and, if necessary, schedule the QLI job.

For Impala, Alation supports QLI from query log files stored on HDFS or Amazon S3.

Configure QLI in Alation

Before configuring QLI in Alation, ensure that you have performed all the steps in the Prepare for QLI: Private Cloud or Prepare for QLI: Public Cloud section, depending on your CDP distribution.

You can configure QLI on the Query Log Ingestion tab of the data source settings page.

  1. Open the Query Log Ingestion tab of the settings page.

  2. Under Configure Connection Type, select either WebHDFS or Amazon S3, depending on where you made the logs available.

  3. If the logs are on HDFS, provide the following information and click Save:

    | Parameter | Description |
    |---|---|
    | Logs Directory | /user/history/done |
    | WebHDFS Server | IP address or host name of the CDP server. |
    | WebHDFS Port | Specify the HDFS port number for your environment. The default is 9870. If you are using a different port, clear the Use Default checkbox under the WebHDFS Port field. |
    | Use Default | Leave this checkbox selected if you are using the default port. Clear it if you are using a port number other than the default. |
    | Use JDBC Auth Credentials | Select this checkbox to use the same authentication credentials that are used in the Data Source Connection section. You do not have to fill in the configuration that already exists in the Data Source Connection section. Clear the checkbox if you are using a different authentication mechanism for WebHDFS. This field is available from connector version 2.0.0. |
    | WebHDFS Authentication Method | Select the required WebHDFS authentication type from the dropdown: Username, Username/Password, Kerberos/Username/Password, or Kerberos/Username/Keytab. This field is available from connector version 2.0.0. |
    | WebHDFS User | Specify the WebHDFS user if Username/Password is selected in WebHDFS Authentication Method. |
    | WebHDFS password | Specify the WebHDFS password if Username/Password is selected in WebHDFS Authentication Method. This field is available from connector version 2.0.0. |
    | Enable SSL | Select the Enable SSL checkbox to enable SSL. If enabled, upload the SSL certificate using the upload link. The supported certificate type is .jks. This field is available from connector version 2.0.0. |
    | WebHDFS Truststore password | Specify the password for the SSL certificate. Note: The password is deleted when you delete the data source connection. This field is available from connector version 2.0.0. |
    | WebHDFS Kerberos Configuration File | If Kerberos/Username/Password or Kerberos/Username/Keytab is selected in the WebHDFS Authentication Method field, upload the krb5.conf file using the upload link. This field is available from connector version 2.0.0. |
    | WebHDFS Keytab File | If Kerberos/Username/Keytab is selected in the WebHDFS Authentication Method field, upload the keytab file using the upload link. This field is available from connector version 2.0.0. |

  4. If the logs are on Amazon S3, provide the following information and click Save:

    | Parameter | Description |
    |---|---|
    | Impala Log Path | Your Amazon S3 bucket name |
    | Impala Log File Name Prefix | Log file name prefix |
    | Number of Log files | Number of log files to ingest |
    | AWS Access Key ID | Access key ID to access the bucket |
    | AWS Access Key Secret | Access key secret to access the bucket |
    | AWS Region | Your AWS region |
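
For the WebHDFS option in step 3, it can help to sanity-check the values before saving them. The following Python sketch is only an illustration and is not part of Alation or the connector. It builds a request URL for the standard Hadoop WebHDFS `LISTSTATUS` operation from the Logs Directory, WebHDFS Server, WebHDFS Port, and Enable SSL values, and it reports which files or credentials the parameter descriptions tie to each WebHDFS Authentication Method. The function names, the field labels (`user`, `password`, `krb5_conf`, `keytab`), and the host `cdp.example.com` are assumptions made for this example.

```python
from urllib.parse import quote, urlencode

DEFAULT_WEBHDFS_PORT = 9870  # default given in the WebHDFS Port description

# Requirements that the parameter descriptions explicitly tie to each method.
# The field labels are hypothetical names for this sketch, not Alation identifiers.
REQUIRED_BY_AUTH = {
    "Username": set(),  # the descriptions state no extra requirement here
    "Username/Password": {"user", "password"},
    "Kerberos/Username/Password": {"krb5_conf"},
    "Kerberos/Username/Keytab": {"krb5_conf", "keytab"},
}


def missing_fields(auth_method, provided):
    """Return the sorted required fields that are absent or empty in `provided`."""
    required = REQUIRED_BY_AUTH[auth_method]
    return sorted(name for name in required if not provided.get(name))


def liststatus_url(server, logs_dir, port=DEFAULT_WEBHDFS_PORT,
                   use_ssl=False, user=None):
    """Build a WebHDFS LISTSTATUS URL for the logs directory."""
    scheme = "https" if use_ssl else "http"
    params = {"op": "LISTSTATUS"}
    if user:
        # `user.name` is the WebHDFS query parameter for simple authentication.
        params["user.name"] = user
    # Normalize to exactly one leading slash and no trailing slash.
    path = quote("/" + logs_dir.strip("/"))
    return f"{scheme}://{server}:{port}/webhdfs/v1{path}?{urlencode(params)}"
```

On a cluster that allows simple authentication, requesting the URL returned by `liststatus_url("cdp.example.com", "/user/history/done", user="...")` with a tool such as curl should return a JSON listing if the directory is readable. Kerberos-secured clusters need a negotiated (SPNEGO) request instead, which this sketch does not cover.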

Perform QLI

You can either perform QLI manually on demand or enable automated QLI:

  1. To perform manual QLI, under the Automated and Manual Query Log Ingestion section of the Query Log Ingestion tab, ensure that the Enable Automated Query Log Ingestion toggle is disabled.

    Note

    Metadata extraction must be completed before you run QLI.

  2. Click Preview to get a sample of the query history data to be ingested.

  3. Click the Import button to perform QLI on demand.

  4. To schedule QLI, enable the Enable Automated Query Log Ingestion toggle.

  5. Set a schedule under Automated Query Log Ingestion Time by specifying values in the week, day, and time fields. The next QLI job will run on the schedule you have specified.

    Note

    Hourly schedules are not supported for automated QLI.
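
To reason about when a weekly schedule set in step 5 would next fire, the following Python sketch computes the next occurrence of a given weekday and time after a reference moment. It is a generic illustration under stated assumptions: the function name `next_weekly_run` is hypothetical, weekdays follow Python's convention (Monday is 0), and the way Alation's own scheduler resolves run times is not described in this section.

```python
from datetime import datetime, timedelta


def next_weekly_run(now, weekday, hour, minute):
    """Return the first datetime strictly after `now` on `weekday` at hour:minute.

    `weekday` uses Python's numbering: Monday is 0 and Sunday is 6.
    """
    # Start from today's date at the requested time of day.
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # Move forward to the requested weekday (0 to 6 days ahead).
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    # If that moment has already passed today, use the following week.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate
```

For example, `next_weekly_run(datetime.now(), weekday=6, hour=2, minute=0)` gives the upcoming Sunday at 02:00 in the local naive time of the machine running the code. Time zones are deliberately left out of this sketch.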