Overview
Alation Cloud Service: applies to Alation Cloud Service instances of Alation
Customer Managed: applies to customer-managed instances of Alation
Use the Databricks Unity Catalog OCF connector to catalog Databricks workspaces that have Unity Catalog enabled. It supports both interactive clusters and SQL warehouses for metadata extraction. The connector can catalog metadata objects from multiple workspaces using a single data source connection. Extracted schemas are referenced by multipart names (catalog.schema).
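As an illustration of the multipart naming, the following minimal Python sketch splits a catalog.schema name into its two parts. The names `SchemaName` and `parse_schema_name` are hypothetical helpers, not part of the connector, and the sketch assumes that neither part contains a dot or backtick-quoted identifier.

```python
from typing import NamedTuple


class SchemaName(NamedTuple):
    """A two-part schema name as it appears in the catalog (hypothetical helper)."""
    catalog: str
    schema: str


def parse_schema_name(name: str) -> SchemaName:
    """Split 'catalog.schema' into its parts.

    Assumption: neither part contains a dot; quoted identifiers are not handled.
    """
    parts = name.split(".")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected a name of the form catalog.schema, got {name!r}")
    return SchemaName(catalog=parts[0], schema=parts[1])


# Example with made-up catalog and schema names.
name = parse_schema_name("main.sales")
print(name.catalog, name.schema)
```

A table inside such a schema would then carry a three-part name (catalog.schema.table) in Unity Catalog.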
The connector supports Azure Databricks, Databricks on AWS, and Databricks on GCP.
The connector is available as a ZIP file that can be uploaded and installed in the Alation application. The latest connector package can be downloaded from the Alation Customer Portal. Ask an Alation admin with access to the Customer Portal to download the connector from the Connectors section (Customer Portal > Connectors).
Connector Version Compatibility
Newer versions of the connector offer more features and may require newer Alation releases. See Databricks Unity Catalog OCF Connector Release Notes for version information.
Team
You may require assistance from your Databricks account administrator when configuring this connector in Alation.
Databricks administrator:
- Creates a user for Alation and grants it the required permissions to access metadata
- Generates a personal access token
- Provides the JDBC URI to access metadata
- Assists in enabling the Public Preview features (system lineage and audit tables)
- Assists with configuring OAuth authentication for Compose

Alation Server Admin:
- Installs the connector
- Creates and configures a Databricks Unity Catalog OCF data source in Alation
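To make the JDBC URI in the list above more concrete, here is a minimal Python sketch that assembles one from a workspace hostname and an HTTP path. The `jdbc:databricks://` prefix and the `transportMode`, `ssl`, `httpPath`, and `AuthMech` parameters follow the Databricks JDBC driver's connection URL pattern and are assumptions here; the exact form Alation expects is defined in the connector's configuration instructions and may differ. The function name, hostname, and warehouse ID are hypothetical.

```python
def build_jdbc_uri(host: str, http_path: str, port: int = 443) -> str:
    """Assemble a Databricks-style JDBC URI (illustrative, assumed format).

    The personal access token is deliberately left out of the URI;
    it is entered separately as a credential.
    """
    params = {
        "transportMode": "http",
        "ssl": "1",
        "httpPath": http_path,   # HTTP path of the cluster or SQL warehouse
        "AuthMech": "3",         # user/password-style auth, used for tokens
    }
    tail = ";".join(f"{key}={value}" for key, value in params.items())
    return f"jdbc:databricks://{host}:{port}/default;{tail};"


# Hypothetical workspace hostname and warehouse HTTP path.
uri = build_jdbc_uri(
    host="example.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/0123456789abcdef",
)
print(uri)
```

Keeping the token out of the URI avoids storing a secret in a field that may be displayed or logged.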
Scope
The table below shows which metadata objects are extracted by this connector and which operations are supported.
Feature | Scope | Availability |
---|---|---|
Authentication | | |
Basic | Authentication with a username and password | Yes* |
Token-based | Authentication with a personal access token (PAT) | Yes |
SSO authentication | SSO authentication with an identity provider application | No |
AWS Secrets Manager with Alation Agent | Authentication using credentials an Alation Agent has retrieved from AWS Secrets Manager, for Alation Cloud Service on the cloud-native architecture only | Yes |
Metadata extraction (MDE) | | |
Default MDE | Extraction of metadata based on default extraction queries in the connector code | Yes |
Custom query-based MDE | Extraction of metadata based on extraction queries provided by a user | Yes |
Popularity | Indicator of the popularity (intensity of use) of a data object, such as a table or a column | Yes |
Extracted metadata objects | | |
Data source | Data source object in Alation that is parent to extracted metadata | Yes |
Schemas | List of schemas, with multipart schema names | Yes |
Tables | List of tables | Yes |
Columns | List of columns | Yes |
Column data types | Column data types | Yes |
Views | List of views | Yes** |
Source comments | Source comments | Yes*** |
Primary keys | Primary key information for extracted tables | Yes**** |
Foreign keys | Foreign key information for extracted tables | Yes**** |
Functions | Extraction of function metadata | No |
Sampling and profiling | | |
Table sampling | Extracts data samples from all extracted tables | Yes |
Column sampling | Extracts data samples from all extracted columns | Yes |
Deep column profiling | On-demand profiling of specific columns with the calculation of value distribution stats | Yes |
Dynamic profiling | On-demand table and column profiling by individual users who use their own database accounts to retrieve the profiles | Yes |
Custom query-based table sampling | Ability to use custom queries for sampling specific tables | Yes |
Custom query-based column sampling | Ability to use custom queries for profiling specific columns | Yes |
Query log ingestion (QLI) (beta) | | |
Extraction and ingestion of query history (available from connector version 2.0.3.6564 and Alation version 2023.1.7.1) | Extraction of query history from the system audit table and ingestion of query history metadata into the catalog | Yes |
Query history, filters, expressions, joins, and popularity | Query history, filter, join, and popularity information is calculated from the query history metadata extracted and ingested with QLI | Yes |
Lineage extraction (beta) | | |
Extraction of lineage information (available from connector version 1.0.3.4144 and Alation version 2023.1.2) | Lineage information is calculated during metadata extraction (direct lineage extraction). Additionally, lineage is generated based on DDL queries run in Compose. Users can also create lineage manually or add it using the public API. | Yes |
Data upload | Yes | |
Compose | | |
Customer-managed (on-premise) Alation instances | Connections from Compose and querying | Yes |
Alation Cloud Service instances | Compose on Alation Cloud Service instances: depending on your network configuration, you may need to use Alation Agent to connect to your data source. Compose with Agent is supported from connector version 1.2.1.5335. | Yes |
Personal Access Token (PAT) authentication in Compose | Authentication in Compose with a PAT | Yes |
SSO through OAuth in Compose | Authentication in Compose with OAuth via Azure Active Directory. OAuth authentication is supported from connector version 2.1.0 and only with Azure Databricks Unity Catalog. | Yes |
* Databricks on AWS only; not supported for Azure Databricks or Databricks on GCP.
** Databricks provides view_definition only for views that the service account creates in Unity Catalog.
*** Before version 3.0.0, supported only for column source comments. From version 3.0.0 onwards, the connector supports both table and column source comments.
**** Before version 3.0.0, supported only with query-based extraction. From version 3.0.0 onwards, the connector supports extraction of primary and foreign keys.
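The connector version thresholds stated in the table and its notes can be collected into a small lookup. The Python sketch below checks whether a given connector version meets the minimum for a feature. The feature keys and function names are hypothetical, the comparison is a plain numeric comparison of dotted version parts, and the sketch covers connector versions only, not the Alation release that some features also require.

```python
# Minimum connector versions as stated in the scope table and its notes.
# The dictionary keys are illustrative labels, not connector settings.
MIN_CONNECTOR_VERSION = {
    "query_log_ingestion": "2.0.3.6564",
    "lineage_extraction": "1.0.3.4144",
    "compose_with_agent": "1.2.1.5335",
    "compose_oauth": "2.1.0",
    "table_source_comments": "3.0.0",
    "primary_foreign_keys": "3.0.0",  # see the **** note for earlier versions
}


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '2.0.3.6564' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def supports(feature: str, connector_version: str) -> bool:
    """Return True if connector_version is at or above the feature's minimum."""
    minimum = parse_version(MIN_CONNECTOR_VERSION[feature])
    return parse_version(connector_version) >= minimum


# Example: check each feature against one installed connector version.
installed = "2.0.3.6564"
for feature in MIN_CONNECTOR_VERSION:
    print(f"{feature}: {supports(feature, installed)}")
```

Tuples compare element by element, so a three-part minimum such as 2.1.0 is satisfied by any four-part build of 2.1.0 or later.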