Data Health¶
Applies to: Alation Cloud Service and customer-managed instances of Alation
Important
You are viewing documentation for Alation’s Classic User Experience.
The Open Data Quality Initiative is an Alation innovation that provides a framework for tracking rule-based measures of data health, data reliability, and overall data quality within the Alation data catalog. Rules can be generated in two ways:
Manually, using the Data Health API
Automatically, by data quality software vendors such as Soda and BigEye
If Data Health information has been enabled in your Alation data catalog, any table with an associated data health rule will show an active Health tab.
Enable Data Health¶
Data Health is not active by default. It can be enabled by a Server Admin. We recommend enabling Data Health on all Alation instances as it provides a useful framework for automating consistent data health observations across your data.
To enable Data Health, toggle Enable Health Data on the Feature Configuration page of Administrator Settings.
Table with Health Information¶
Once Data Health is enabled and at least one rule has been defined using the API, you can view health information for any table to which a rule applies. For example, a rule might specify that a field consists only of numeric data or string data.
When you open a catalog page for a table with rules, the Health tab is active and the Health column shows the status of any rules applied to particular columns. Statuses include Good, Warning, and Alert.
The Health column is also visible on the Columns tab.
Select the Health tab to view the data health information for all active rules.
For example, we defined a rule specifying that the season_number field in the Episodes table of the IMDb schema consists of only numeric data. We also created similar rules checking that the episode title and parent TV show title consisted of string data.
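A rule of this kind can be pictured as a simple check over column values. The following is a minimal sketch, not Alation's or any vendor's implementation: the function names, the sample values, and the choice to report Alert whenever any non-numeric value is found are all assumptions made for illustration.

```python
def is_numeric(value):
    """Return True if the value can be read as a number."""
    if isinstance(value, bool):
        # Booleans are a subclass of int in Python; treat them as non-numeric here.
        return False
    if isinstance(value, (int, float)):
        return True
    try:
        float(str(value).strip())
        return True
    except ValueError:
        return False


def check_numeric_column(values):
    """Return (status, offending_values) for a hypothetical numeric-only rule.

    Assumption: any non-numeric value yields "Alert"; a real rule could use
    thresholds and report "Warning" for a small share of bad values.
    """
    bad = [v for v in values if not is_numeric(v)]
    status = "Good" if not bad else "Alert"
    return status, bad


# Hypothetical sample of season_number values.
sample = ["1", "2", "03", "two", "4"]
status, bad = check_numeric_column(sample)
print(status, bad)
```

The resulting status is what a rule would report to the catalog for the column it applies to.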
When we open the Episodes catalog page, we see the Health tab is active and the Health column shows the status of any rules applied to particular columns:
Our rules show the three types of status available in a health rule: Good, Warning, and Alert. The Health tab shows the most severe status indicated.
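The "most severe status" roll-up can be sketched as a ranking over the three statuses. This is an illustrative sketch only: the `SEVERITY` mapping, the function name, the column names, and the statuses assigned to them are assumptions, with the ordering Good < Warning < Alert taken from the description above.

```python
# Assumed ordering: a higher number means a more severe status.
SEVERITY = {"Good": 0, "Warning": 1, "Alert": 2}


def most_severe(statuses):
    """Return the most severe status in the iterable, or None if it is empty."""
    statuses = list(statuses)
    if not statuses:
        return None
    return max(statuses, key=SEVERITY.__getitem__)


# Hypothetical per-column rule results for a table.
column_statuses = {
    "season_number": "Good",
    "title": "Warning",
    "parent_title": "Alert",
}
print(most_severe(column_statuses.values()))
```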
Go to the Health tab to view the data health information for all active rules.
View Data Health Propagated Via Lineage¶
If any upstream object has data health issues, that information is propagated through Lineage to downstream tables, BI data sources, and BI reports. On these downstream objects, the Health tab is active and shows the status of the most severe data health rule impacting the object.
Click the Health tab to view the data health information, then click the number beside Upstream Issues to open the Upstream Issues tab. This tab lists each upstream source that has issues, along with a summary of the problem, which may include the deletion of an upstream object.
Click the down arrow to the right of the summary to see an expanded view of the information, including the rules that are in place and the objects they apply to.
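The propagation described in this section can be pictured as a walk over the lineage graph that collects every upstream object with a Warning or Alert status. The sketch below is a simplified model, not Alation's implementation: the graph representation, the object names, and the assumption that issues propagate from all transitive upstream objects are hypothetical.

```python
from collections import deque

# Assumed ordering: a higher number means a more severe status.
SEVERITY = {"Good": 0, "Warning": 1, "Alert": 2}


def upstream_issues(obj, upstream_of, own_status):
    """Return (worst_status, issues) for obj based on its upstream objects.

    upstream_of maps an object to the objects directly upstream of it.
    own_status maps an object to its own rule status; missing means "Good".
    """
    seen = set()
    queue = deque(upstream_of.get(obj, []))
    issues = {}
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # Guard against revisiting shared or cyclic lineage.
        seen.add(node)
        status = own_status.get(node, "Good")
        if SEVERITY[status] > SEVERITY["Good"]:
            issues[node] = status
        queue.extend(upstream_of.get(node, []))
    worst = max(issues.values(), key=SEVERITY.__getitem__, default=None)
    return worst, issues


# Hypothetical lineage: episodes_raw -> episodes_clean -> datasource -> report
upstream_of = {
    "report": ["datasource"],
    "datasource": ["episodes_clean"],
    "episodes_clean": ["episodes_raw"],
}
own_status = {"episodes_raw": "Alert", "episodes_clean": "Warning"}

worst, issues = upstream_issues("report", upstream_of, own_status)
print(worst, issues)
```

In this model, the length of `issues` plays the role of the number shown beside Upstream Issues, and `worst` plays the role of the status shown on the Health tab.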
View Data Health in Search Results¶
You can see data health alerts and warnings in search results. Tables appear flagged with an icon based on the most severe status currently applied to the table.
If the most severe status is Alert, an Alert icon is displayed.
If the most severe status is Warning, a Warning icon is displayed.