Configure Query Log Ingestion¶
Applies to: Alation Cloud Service and customer-managed instances of Alation
On the Query Log Ingestion tab, you can select the QLI options for your data source and schedule the QLI job if necessary.
For Impala, Alation supports QLI from query log files stored on HDFS or Amazon S3.
Configure QLI in Alation¶
Before configuring QLI in Alation, ensure that you have performed all the steps in the Prepare for QLI: Private Cloud or Prepare for QLI: Public Cloud section, depending on your CDP distribution.
You can configure QLI on the Query Log Ingestion tab of the data source settings page.
Open the Query Log Ingestion tab of the settings page.
Under Configure Connection Type, select either WebHDFS or Amazon S3, depending on where you made the logs available.
If the logs are on HDFS, provide the following information and click Save:
Parameter
Description
Logs Directory
/user/history/done
WebHDFS Server
IP address or hostname of the CDP server
WebHDFS Port
Specify the HDFS port number for your environment. The default is 9870. If you are using a different port, clear the Use Default checkbox under the WebHDFS Port field.
Use Default
Leave it selected if you are using the default port. Clear this checkbox if you are using a port number other than the default.
Use JDBC Auth Credentials
Select this checkbox to use the same authentication credentials that are used in the Data Source Connection section. You do not have to re-enter the configuration that already exists in the Data Source Connection section. Clear the checkbox if you are using a different authentication mechanism for WebHDFS.
This field is available from connector version 2.0.0.
WebHDFS Authentication Method
Select the required WebHDFS authentication type from the dropdown:
Username
Username/Password
Kerberos/Username/Password
Kerberos/Username/Keytab
This field is available from connector version 2.0.0.
WebHDFS User
Specify the WebHDFS user if Username/Password is selected in the WebHDFS Authentication Method field.
WebHDFS password
Specify the WebHDFS password if Username/Password is selected in the WebHDFS Authentication Method field.
This field is available from connector version 2.0.0.
Enable SSL
Select the Enable SSL checkbox to enable SSL. If enabled, upload the SSL certificate using the upload link. The supported certificate type is .jks.
This field is available from connector version 2.0.0.
WebHDFS Truststore password
Specify the password for the SSL certificate.
Note
The password is deleted when you delete the data source connection.
This field is available from connector version 2.0.0.
WebHDFS Kerberos Configuration File
If Kerberos/Username/Password or Kerberos/Username/Keytab is selected in the WebHDFS Authentication Method field, upload the krb5.conf file using the upload link.
This field is available from connector version 2.0.0.
WebHDFS Keytab File
If Kerberos/Username/Keytab is selected in the WebHDFS Authentication Method field, upload the keytab file using the upload link.
This field is available from connector version 2.0.0.
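Before saving the WebHDFS settings, it can help to confirm that the logs directory is reachable with the values you plan to enter. The sketch below only builds the WebHDFS REST URL for a LISTSTATUS request from the Logs Directory, WebHDFS Server, and WebHDFS Port values. The function name, the sample host cdp.example.com, and the sample user alation are hypothetical, and this is not how Alation itself connects. It assumes username-only authentication; Kerberos is not covered.

```python
from urllib.parse import quote, urlencode

# Default WebHDFS port named in the parameter table above.
DEFAULT_WEBHDFS_PORT = 9870


def build_liststatus_url(server, logs_dir, user=None, port=None, use_ssl=False):
    """Build a WebHDFS LISTSTATUS URL for the logs directory.

    Hypothetical helper for illustration only. When SSL is enabled the
    service often listens on a different port, so pass `port` explicitly.
    """
    scheme = "https" if use_ssl else "http"
    effective_port = DEFAULT_WEBHDFS_PORT if port is None else port
    # Normalise the directory so it always has exactly one leading slash.
    path = "/" + logs_dir.strip("/")
    params = {"op": "LISTSTATUS"}
    if user:
        # user.name is the query parameter WebHDFS uses for simple auth.
        params["user.name"] = user
    return (
        f"{scheme}://{server}:{effective_port}"
        f"/webhdfs/v1{quote(path)}?{urlencode(params)}"
    )


if __name__ == "__main__":
    # Sample values are placeholders, not real hosts or users.
    print(build_liststatus_url("cdp.example.com", "/user/history/done", user="alation"))
```

Opening the resulting URL with a tool such as curl from a machine that can reach the CDP server should return a JSON listing of the directory if the server, port, and path are correct.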
If the logs are on Amazon S3, provide the following information and click Save:
Parameter
Description
Impala Log Path
Your Amazon S3 bucket name
Impala Log File Name Prefix
Log file name prefix
Number of Log files
Number of log files to ingest
AWS Access Key ID
Access key ID to access the bucket
AWS Access Key Secret
Access key secret to access the bucket
AWS Region
Your AWS region
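To get a feel for how the Impala Log File Name Prefix and Number of Log files values narrow down what is in the bucket, the sketch below filters a list of object keys by prefix and keeps a limited number of them. The function name and the sample keys are hypothetical. Picking the newest files first is an assumption made for illustration; this page does not state how Alation orders or selects the files.

```python
from datetime import datetime


def select_log_keys(objects, prefix, limit):
    """Return up to `limit` keys that start with `prefix`, newest first.

    `objects` is an iterable of (key, last_modified) pairs.
    Newest-first ordering is an assumption, not documented behavior.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    matching = [(key, modified) for key, modified in objects if key.startswith(prefix)]
    # Sort by modification time, most recent first.
    matching.sort(key=lambda item: item[1], reverse=True)
    return [key for key, _ in matching[:limit]]


if __name__ == "__main__":
    # Made-up object keys and timestamps for illustration.
    sample = [
        ("impala_profile_log-a", datetime(2024, 1, 1)),
        ("impala_profile_log-b", datetime(2024, 1, 3)),
        ("unrelated.txt", datetime(2024, 1, 4)),
    ]
    print(select_log_keys(sample, "impala_profile_log", 1))
```

In practice the (key, last_modified) pairs could come from an S3 listing made with the same access key, secret, and region that you enter in the table above.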
Perform QLI¶
You can either perform QLI manually on demand or enable automated QLI:
To perform manual QLI, under the Automated and Manual Query Log Ingestion section of the Query Log Ingestion tab, ensure that the Enable Automated Query Log Ingestion toggle is disabled.
Note
Metadata extraction must be completed before you run QLI.
Click Preview to get a sample of the query history data to be ingested.
Click the Import button to perform QLI on demand.
To schedule QLI, enable the Enable Automated Query Log Ingestion toggle.
Set a schedule under Automated Query Log Ingestion Time by specifying values in the week, day, and time fields. The next QLI job will run on the schedule you have specified.
Note
Hourly schedules are not supported for automated QLI.