Upload a Data Dictionary 2023.3.2 and Newer¶
Applies to: Alation Cloud Service instances of Alation and customer-managed instances of Alation
Important
You are viewing documentation for Classic Alation.
Note
This page describes version 2023.3.2 and newer versions. To upload a data dictionary in versions before 2023.3.2, see Upload a Data Dictionary 2023.3.1 and Earlier.
Overview¶
Catalog users with the Composer, Steward, Source Admin, Catalog Admin, or Server Admin role can upload data dictionaries into the Alation catalog to curate catalog fields in bulk. You do not need to create data dictionary source files from scratch, although you can if necessary. A much faster way to get started is to Download a Data Dictionary from Alation and edit it.
After modifying a downloaded dictionary file, you can upload it to apply changes to multiple titles, descriptions, and custom field values on data source, schema, table, and column catalog pages. Data dictionaries can be uploaded in CSV or TSV formats. Ensure TSV files are saved with a .tsv extension, as .txt and .dm1 extensions are not supported.
Uploading a data dictionary can help update the values of existing fields, but cannot add new fields to or remove fields from catalog pages.
Note
Alation also offers APIs for uploading a data dictionary and for uploading custom field values in bulk.
Alternatively, in the user interface, you can bulk-update field values using Stewardship Workbench, which requires the Governance App.
The source file can be streamlined to include the read-only fields and keys of only those catalog objects that need to be updated and the fields to be updated. There is no requirement to import the full dictionary every time. For example, if you want to update only one field on one data object, you can include, minimally, only the necessary read-only fields, the key, and the field to be updated.
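As a rough illustration of streamlining, the sketch below filters a downloaded dictionary down to selected rows and columns before editing. The column names (`key`, `table_type`, `title`, `description`) and the sample values are hypothetical placeholders; which read-only fields must be retained is defined in the Requirements for CSV/TSV Source Files, not here.

```python
import csv
import io


def streamline_dictionary(source_text, keep_keys, keep_columns, key_column="key"):
    """Return CSV text containing only the chosen rows and columns.

    key_column and the column names passed in are assumptions for
    illustration; use the names found in your downloaded dictionary.
    """
    reader = csv.DictReader(io.StringIO(source_text))
    available = reader.fieldnames or []
    missing = [name for name in keep_columns if name not in available]
    if missing:
        raise ValueError(f"columns not found in source: {missing}")

    out = io.StringIO()
    # extrasaction="ignore" silently drops the columns we are not keeping.
    writer = csv.DictWriter(
        out, fieldnames=keep_columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        if row[key_column] in keep_keys:
            writer.writerow(row)
    return out.getvalue()


# Hypothetical sample: keep one object and only the field to be updated.
sample = (
    "key,table_type,title,description\n"
    "sales.orders,TABLE,Orders,Old text\n"
    "sales.customers,TABLE,Customers,\n"
)
print(streamline_dictionary(sample, {"sales.orders"}, ["key", "description"]))
```

The function only narrows the file; the new field values still have to be edited in before the upload.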
Important
To import successfully, make sure your data dictionary CSV or TSV source file conforms to the Requirements for CSV/TSV Source Files.
If your source CSV file includes Unicode characters, make sure it is encoded in UTF-8.
Source files must have unique field names. Duplicate field names are not permitted, and validation includes verifying imported field names against both custom field names and built-in field names. A warning message appears when an uploaded file conflicts with an existing field name. For example, if you have a custom field called description included in the data dictionary file you are attempting to upload, you get an error because it conflicts with the built-in description field in the catalog. You’ll have to remove one of the conflicting fields from the source file.
Both object-level and field-level permissions are taken into consideration when an uploaded data dictionary attempts to change a value.
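Some of the requirements above can be checked locally before uploading. The following is a minimal sketch, not Alation's validation logic: it checks the file extension, UTF-8 decoding, and duplicate names within the header row. Comparing names case-insensitively is an assumption, and the check cannot detect conflicts with field names that exist only in the catalog.

```python
import csv
import io
from collections import Counter
from pathlib import PurePath


def preflight_check(filename, raw_bytes):
    """Return a list of problems found in a dictionary source file."""
    problems = []

    suffix = PurePath(filename).suffix.lower()
    if suffix not in (".csv", ".tsv"):
        problems.append(f"unsupported extension: {suffix or '(none)'}")

    try:
        # "utf-8-sig" accepts UTF-8 with or without a byte order mark.
        text = raw_bytes.decode("utf-8-sig")
    except UnicodeDecodeError:
        problems.append("file is not valid UTF-8")
        return problems

    delimiter = "\t" if suffix == ".tsv" else ","
    header = next(csv.reader(io.StringIO(text), delimiter=delimiter), [])

    # Assumption: names that differ only by case or padding count as duplicates.
    counts = Counter(name.strip().lower() for name in header)
    for name, count in counts.items():
        if count > 1:
            problems.append(f"duplicate field name: {name}")
    return problems


print(preflight_check("dictionary.csv", b"key,description,Description\n"))
```

An empty list means only that these three local checks passed; the upload preview remains the authoritative validation.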
Data Dictionary Upload Performance Configurability¶
Applies from version 2023.3.3
If you are loading a large data dictionary and notice the upload taking an exceptionally long time, you can use a server configuration option to disable permission checks during the upload process and limit the upload dictionary functionality to Catalog Admins and Server Admins only. This option should be used with caution as fewer users will be able to upload data dictionaries after it’s enabled. To use this option, set the feature flag alation.feature_flags.disable_perm_check_on_upload_dd to True using alation_conf. For help with alation_conf, see Using alation_conf.
Note
Alation Cloud Service customers can request server configuration changes through Alation Support.
Import a Data Dictionary from a CSV or TSV File¶
The procedure described here is for updating an existing data source. If you are migrating information to a newly populated data source, see Data Migration.
To import a data dictionary from a CSV or TSV file:
Open the catalog page of a data source, schema, or table.
The Upload Dictionary page opens:
Drag and drop your file in the drag-and-drop area, or press Click to upload your dictionary file. The maximum file size is 25MB. After you drag and drop or upload, Alation parses the source file and displays a dialog saying the file is being analyzed and the preview is being prepared:
Click Check Status to see the current status, or continue working. An email will be sent to you when the analysis is complete. Click the View in Alation link in the email or click Check Status to view the preview table, which includes any errors that may have occurred as well as the pending changes:
Initially, the preview shows all values in gray. You need to select an upload option to see what is going to be updated vs. what is going to be ignored. Using the three options under Update Catalog with Dictionary File, specify how existing field values and blank fields are to be handled:
Keep Existing Values—If this option is selected, only the new values for empty fields are uploaded from the data dictionary. This option gives precedence to values that currently exist in the catalog over the values for the fields contained in the imported data dictionary. In other words, this option loads the new values for previously empty fields, and does not change the values that already exist in the catalog.
Replace Existing Values - Skip Empty—If this option is selected, the values from the dictionary overwrite the values that currently exist in the catalog. This option uploads both the new values for empty fields and updates the existing values with values from the dictionary. If a field is empty in the uploaded data dictionary file but there is an existing value in the catalog, the catalog value will be preserved and the empty value in the data dictionary file will be ignored.
Replace Existing Values - Apply Empty—If this option is selected, the values in the dictionary overwrite the values that currently exist in the catalog. This option uploads both the new values for empty fields and updates the existing values with values from the dictionary. If a field is empty in the uploaded data dictionary file but there is an existing value in the catalog, the catalog value will be cleared.
After you’ve selected an upload option, the review will demonstrate how the data dictionary values will apply:
The “active” values that are going to be updated appear in darker font color.
The “disabled” values that are going to be ignored or overwritten appear in lighter font color. If you switch between the upload options, the font color toggles between “active” and “disabled” values.
Click Download Report to download the Preview Report, or, if errors are present, you can choose to download only the errors. Correct any errors and re-upload the data dictionary before proceeding. You can also choose to proceed to the next step with open errors. The update will be performed on non-erroneous records.
Click Update Catalog to finish the import. A dialog appears informing you that this action cannot be canceled, and asking if you want to proceed. Click Yes to update the catalog.
You see a message saying the catalog update is in process:
Click Check Status to see the progress. If the update is complete, you see a message telling you so. You can also click Download Report to download a full update report that lists all successful and erroneous updates.
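The three upload options described in the steps above can be summarized as a small decision function. This is only an illustrative model of the documented behavior for a single field, not Alation's implementation; the names `UploadOption` and `resolve_value` are hypothetical, and treating an empty string as an "empty" field is an assumption.

```python
from enum import Enum


class UploadOption(Enum):
    KEEP_EXISTING = "Keep Existing Values"
    REPLACE_SKIP_EMPTY = "Replace Existing Values - Skip Empty"
    REPLACE_APPLY_EMPTY = "Replace Existing Values - Apply Empty"


def resolve_value(catalog_value, dictionary_value, option):
    """Return the value a field would hold after the upload.

    Assumption: an empty string stands for an empty field.
    """
    catalog_empty = catalog_value == ""
    dictionary_empty = dictionary_value == ""

    if option is UploadOption.KEEP_EXISTING:
        # Only previously empty catalog fields receive dictionary values.
        return dictionary_value if catalog_empty else catalog_value
    if option is UploadOption.REPLACE_SKIP_EMPTY:
        # Dictionary values win, but empty dictionary cells are ignored.
        return catalog_value if dictionary_empty else dictionary_value
    if option is UploadOption.REPLACE_APPLY_EMPTY:
        # Dictionary values always win, so an empty cell clears the field.
        return dictionary_value
    raise ValueError(f"unknown option: {option!r}")


for option in UploadOption:
    print(option.value, "->", repr(resolve_value("Existing text", "", option)))
```

The loop shows the one case where the two Replace options differ: an empty cell in the file against an existing catalog value.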
To import a data dictionary from a CSV/TSV file:
Open the catalog page of a data source, schema, or table.
The Upload Dictionary page opens:
Drag and drop your file in the drag-and-drop area, or press Click to upload your dictionary file. The maximum file size is 25MB. After you drag and drop or upload, Alation parses the source file and displays a dialog saying the file is being analyzed and the preview is being prepared:
Click Check Status to see the current status, or continue working. An email will be sent to you when the analysis is complete. Click the View in Alation link in the email or click Check Status to view the preview table, which includes any errors that may have occurred as well as the pending changes:
The preview shows what is going to be updated vs. what is going to be ignored:
The “active” values that are going to be updated appear in darker font color.
The “disabled” values that are going to be ignored or overwritten appear in lighter font color. If you switch between the upload options, the font color toggles between “active” and “disabled” values.
Click Download Report to download the full preview report, or, if errors are present, you can choose to download only the errors. Correct any errors and re-upload the data dictionary before proceeding. You can also choose to proceed to the next step with open errors. The update will be performed on non-erroneous records.
On the preview page, specify how existing field values are to be handled:
Keep Existing Values: If this option is selected, only the new values for empty fields are uploaded from the data dictionary. This option gives precedence to values that currently exist in the catalog over the values for these fields contained in the imported data dictionary. In other words, this option loads the new values for previously empty fields, and does not change the values that already exist in the catalog.
Replace Existing Values: If this option is selected, the values in the dictionary overwrite the values that currently exist in the catalog. This option uploads both the new values for empty fields and updates the existing values with values from the dictionary.
Click Update Catalog to finish the import. A dialog appears informing you that this action cannot be canceled, and asking if you want to proceed. Click Yes to update the catalog.
You see a message saying the catalog update is in process:
Click Check Status to see the progress. If the update is complete, you see a message telling you so.
Data Dictionary Upload Notifications¶
As you upload a data dictionary, Alation notifies you of the stages of the process by email, using the email address in your user profile. Notifications can be useful when you are uploading very large dictionaries and the process takes a long time.
There is currently no ability for users to opt out of the notifications.
Email notifications are sent about the following events:
Preview of dictionary upload is ready—Alation informs you that the analysis of the source file is complete and that a preview can be viewed in the user interface. The preview is limited to 1,000 records, but you can download a Preview Report from the notification or from the user interface to review all the planned updates.
Your update report for <catalog object name> is ready—Alation informs you that the update of the catalog is complete and that an Update Report can be downloaded.
Data Dictionary Upload Reports¶
As you work with data dictionaries, Alation will generate reports on the status of the upload process.
Preview Report¶
The preview report is generated by Alation as part of the source file analysis after you upload the data dictionary file. The report is a CSV file that duplicates your source data dictionary but appends three more fields on the right:
errors: Contains error messages if present.
erroneous_fields: Lists fields that issued an error.
warnings: Contains warnings if present.
You can download the preview report from the user interface using the Download Report button on the Upload Data Dictionary page or from the email notification about preview completion. The name of the downloaded file will include the words preview_report, for example: data_209_preview_report_557_2024-06-11T19-12-12-433537.csv.
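For a large preview report, it may be easier to list the problem rows with a script than to scan the file by hand. The sketch below assumes the report header uses the field names exactly as listed above (`errors`, `erroneous_fields`); the sample row and its error message are invented for illustration.

```python
import csv
import io


def rows_with_errors(report_text):
    """Return (row_number, errors, erroneous_fields) for each flagged row.

    row_number counts data rows starting at 1 and excludes the header.
    """
    reader = csv.DictReader(io.StringIO(report_text))
    flagged = []
    for row_number, row in enumerate(reader, start=1):
        errors = (row.get("errors") or "").strip()
        if errors:
            flagged.append((row_number, errors, row.get("erroneous_fields") or ""))
    return flagged


# Invented sample data; real reports contain your dictionary's own columns.
sample_report = (
    "key,description,errors,erroneous_fields,warnings\n"
    "sales.orders,Orders table,,,\n"
    "sales.missing,Some text,Sample error message,key,\n"
)
for entry in rows_with_errors(sample_report):
    print(entry)
```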
Update Report¶
The update report is generated by Alation when the catalog update based on the uploaded data dictionary file is complete. The report is a CSV file that includes a list of all catalog objects from the source file and two additional fields:
errors: Contains error messages if present.
commit status: Contains information on the status of the update for each row:
succeeded: The update of all fields was successful.
partial_success: The update of some fields was successful but failed for other fields. This status will be accompanied by an error message in the errors field.
not_attempted: There were no updates to apply for a row.
failed: The update failed. This status will be accompanied by an error message in the errors field.
You can download the update report from the user interface using the Download Report button on the Upload Data Dictionary page or from the email notification about update completion. The name of the downloaded file will include the words commit_report, for example: data_209_commit_report_557_2024-06-11T21-07-21-875403.csv.
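A quick tally of the status values gives an overview of how an update went. In this sketch the status column name defaults to `commit status` as written above; the exact header in a real report is an assumption, so the name is a parameter. The sample rows are invented.

```python
import csv
import io
from collections import Counter


def summarize_update_report(report_text, status_column="commit status"):
    """Count rows per status value in an update report.

    status_column is an assumption; pass the header used in your report.
    """
    reader = csv.DictReader(io.StringIO(report_text))
    if status_column not in (reader.fieldnames or []):
        raise KeyError(f"column not found: {status_column}")
    return Counter(row[status_column] for row in reader)


# Invented sample rows using the status values listed above.
sample_report = (
    "key,errors,commit status\n"
    "sales.orders,,succeeded\n"
    "sales.customers,,succeeded\n"
    "sales.returns,Sample error message,failed\n"
    "sales.items,,not_attempted\n"
)
for status, count in summarize_update_report(sample_report).items():
    print(status, count)
```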
Data Migration¶
If you are migrating data curation from one data source to another, you perform the same steps as for updating a single data source, but you may need some additional preparation. If you are starting from a downloaded data dictionary, be aware that starting with Alation version 2023.3.2, downloaded data dictionaries include a column al_datadict_item_properties, which contains object IDs that greatly facilitate efficient resolution of catalog updates through the dictionary upload. However, those object IDs may not be accurate in a different data source or instance. Thus, in the migration scenario, we recommend that you remove the al_datadict_item_properties column before uploading the edited data dictionary.
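Removing the column can be done in a spreadsheet editor or with a few lines of code. The following is a minimal sketch using Python's standard csv module; the sample content of the al_datadict_item_properties column is invented, and the other column names are hypothetical.

```python
import csv
import io


def drop_column(source_text, column="al_datadict_item_properties"):
    """Return CSV text with the named column removed, if present."""
    reader = csv.DictReader(io.StringIO(source_text))
    kept = [name for name in (reader.fieldnames or []) if name != column]

    out = io.StringIO()
    # extrasaction="ignore" drops the removed column's values from each row.
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(reader)
    return out.getvalue()


# Invented sample; the real column content is produced by Alation.
sample = (
    "key,title,al_datadict_item_properties\n"
    'sales.orders,Orders,"{""sample"": 12}"\n'
)
print(drop_column(sample))
```

Note that re-serializing through the csv module can change how values are quoted, so compare the result against the Requirements for CSV/TSV Source Files before uploading.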
For users who attempted migration scenarios with a 2023.3.2 data dictionary and encountered errors, Alation offers two potential solutions:
Update the data dictionary manually using the Requirements for CSV/TSV Source Files, in particular the advice to triple-quote (""") any value containing a period (.).
Request a migration script from Alation Support. The script requires two downloaded data dictionaries, one from the original data source and one from the target data source; it matches objects between the two and produces a final, uploadable data dictionary.