Configure Sampling and Profiling¶
Alation Cloud Service Applies to Alation Cloud Service instances of Alation
Customer Managed Applies to customer-managed instances of Alation
Refer to Configure Sampling and Profiling for OCF Data Sources.
Sampling and Profiling¶
For Amazon Redshift, Alation supports a number of ways to retrieve data samples and column profiles. For general information about sampling and profiling, see Configure Sampling and Profiling for OCF Data Sources.
Column Profiling¶
The default queries for column profiling can be modified on the Custom Settings tab of the Settings page. Do not use fully qualified table names in custom profiling queries for Amazon Redshift. The following query template can be used to create a custom query:
Numeric columns:
SELECT MIN({column_name}) AS Minimum, MAX({column_name}) AS Maximum, AVG({column_name}) AS Average, MEDIAN({column_name}) AS Median, SUM(CASE WHEN {column_name} IS NULL THEN 1 ELSE 0 END) as "#NULL", (SUM(CASE WHEN {column_name} IS NULL THEN 1 ELSE 0 END) / COUNT(*)) * 100 AS "%NULL" FROM {table_name};
Non-numeric columns:
SELECT SUM(CASE WHEN {column_name} IS NULL THEN 1 ELSE 0 END) as "#NULL", (SUM(CASE WHEN {column_name} IS NULL THEN 1 ELSE 0 END) / COUNT(*)) * 100 AS "%NULL" FROM {table_name};