Overview

Applies to: Alation Cloud Service instances and customer-managed instances of Alation

The Databricks Unity Catalog OCF connector should be used to catalog Databricks workspaces that have Unity Catalog enabled. It supports both interactive clusters and SQL warehouses for metadata extraction. The connector can catalog metadata objects from multiple workspaces using a single data source connection. Extracted schemas will be referenced with multipart names (catalog.schema).
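As an illustration of the multipart naming above, the following minimal sketch splits and rebuilds a `catalog.schema` name. The helper names and the sample values `main` and `sales` are hypothetical and are not part of the connector. The sketch assumes that neither part contains a dot or needs quoting.

```python
from typing import NamedTuple


class SchemaName(NamedTuple):
    """The two parts of a multipart schema name."""
    catalog: str
    schema: str


def split_multipart(name: str) -> SchemaName:
    """Split 'catalog.schema' at the first dot.

    Assumption: the catalog name itself contains no dot.
    """
    catalog, sep, schema = name.partition(".")
    if not sep or not catalog or not schema:
        raise ValueError(f"expected catalog.schema, got {name!r}")
    return SchemaName(catalog, schema)


def join_multipart(catalog: str, schema: str) -> str:
    """Build the multipart name used to reference an extracted schema."""
    return f"{catalog}.{schema}"


if __name__ == "__main__":
    parts = split_multipart("main.sales")  # hypothetical schema name
    print(parts.catalog, parts.schema)
    print(join_multipart(parts.catalog, parts.schema))
```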

The connector supports Azure Databricks, Databricks on AWS, and Databricks on GCP.

The connector is available as a ZIP file that can be uploaded and installed in the Alation application. The latest connector package can be downloaded from the Alation Customer Portal. Ask an Alation admin with access to the Customer Portal to download the connector from the Connectors section (Customer Portal > Connectors).

Connector Version Compatibility

Newer versions of the connector offer more features and may require newer Alation releases. See Databricks Unity Catalog OCF Connector Release Notes for version information.

Team

You may require assistance from your Databricks account administrator when configuring this connector in Alation.

  • Databricks administrator:

    • Creates a user for Alation and grants it the required permissions to access metadata

    • Generates a personal access token

    • Provides the JDBC URI to access metadata

    • Assists in enabling the Public Preview features (system lineage and audit tables)

    • Assists with configuring OAuth authentication for Compose

  • Alation Server Admin:

    • Installs the connector

    • Creates and configures a Databricks Unity Catalog OCF data source in Alation
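Before handing the personal access token to the Alation Server Admin, the Databricks administrator may want to confirm that it can see Unity Catalog metadata. The sketch below is one possible check. It uses the Databricks REST API rather than the connector, which connects over JDBC. The workspace host and token are placeholders, and the function names are hypothetical.

```python
import json
import urllib.request


def build_catalog_list_request(workspace_host: str, token: str) -> urllib.request.Request:
    """Build a GET request for the Unity Catalog 'list catalogs' endpoint."""
    url = f"https://{workspace_host}/api/2.1/unity-catalog/catalogs"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def list_catalog_names(workspace_host: str, token: str) -> list:
    """Return the names of the catalogs visible to the token's owner.

    Requires network access to the workspace. An HTTP 401 or 403 error
    suggests the token or its permissions need attention.
    """
    request = build_catalog_list_request(workspace_host, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return [catalog["name"] for catalog in payload.get("catalogs", [])]


if __name__ == "__main__":
    # Placeholder values: replace with your workspace host and token.
    print(list_catalog_names("<workspace-host>", "<personal-access-token>"))
```

An empty list or an authorization error here is a hint that the Alation user still lacks the permissions it needs on the catalogs to be extracted.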

Scope

The table below shows which metadata objects are extracted by this connector and which operations are supported.

| Feature | Scope | Availability |
|---|---|---|
| Authentication | | |
| Basic | Authentication with a username and password | Yes* |
| Token-based | Authentication with a personal access token (PAT) | Yes |
| SSO authentication | SSO authentication with an identity provider application | No |
| AWS Secrets Manager with Alation Agent | Authentication using credentials an Alation Agent has retrieved from AWS Secrets Manager, for Alation Cloud Service on the cloud-native architecture only | Yes |
| Metadata extraction (MDE) | | |
| Default MDE | Extraction of metadata based on default extraction queries in the connector code | Yes |
| Custom query-based MDE | Extraction of metadata based on extraction queries provided by a user | Yes |
| Popularity | Indicator of the popularity (intensity of use) of a data object, such as a table or a column | Yes |
| Extracted metadata objects | | |
| Data source | Data source object in Alation that is parent to extracted metadata | Yes |
| Schemas | List of schemas, with multipart schema names catalog.schema | Yes |
| Tables | List of tables | Yes |
| Columns | List of columns | Yes |
| Column data types | Column data types | Yes |
| Views | List of views | Yes** |
| Source comments | Source comments | Yes*** |
| Primary keys | Primary key information for extracted tables | Yes**** |
| Foreign keys | Foreign key information for extracted tables | Yes**** |
| Functions | Extraction of function metadata | No |
| Sampling and profiling | | |
| Table sampling | Extracts data samples from all extracted tables | Yes |
| Column sampling | Extracts data samples from all extracted columns | Yes |
| Deep column profiling | On-demand profiling of specific columns with the calculation of value distribution stats | Yes |
| Dynamic profiling | On-demand table and column profiling by individual users who use their own database accounts to retrieve the profiles | Yes |
| Custom query-based table sampling | Ability to use custom queries for sampling specific tables | Yes |
| Custom query-based column sampling | Ability to use custom queries for profiling specific columns | Yes |
| Query log ingestion (QLI) (beta) | | |
| Extraction and ingestion of query history (available from connector version 2.0.3.6564 and Alation version 2023.1.7.1) | Extraction of query history from the system audit table and ingestion of query history metadata into the catalog | Yes |
| Query history, filters, expressions, joins, and popularity | Query history, filters, joins, and popularity information is calculated from the query history metadata extracted and ingested with QLI | Yes |
| Lineage extraction (beta) | | |
| Extraction of lineage information (available from connector version 1.0.3.4144 and Alation version 2023.1.2) | Lineage information is calculated during metadata extraction (direct lineage extraction). Additionally, lineage is generated based on DDL queries run in Compose. Users can also create lineage manually or add it using the public API. | Yes |
| Data upload | | Yes |
| Compose | | |
| Customer-managed (on-premise) Alation instances | Connections from Compose and querying | Yes |
| Alation Cloud Service instances | Compose on Alation Cloud Service instances: depending on your network configuration, you may need to use Alation Agent to connect to your data source. Compose with Agent is supported from connector version 1.2.1.5335. | Yes |
| Personal Access Token (PAT) authentication in Compose | Authentication in Compose with a PAT | Yes |
| SSO through OAuth in Compose | Authentication in Compose with OAuth via Azure Active Directory. OAuth authentication is supported from connector version 2.1.0 and only with Azure Databricks Unity Catalog. | Yes |

* Databricks on AWS only; not supported for Azure Databricks or Databricks on GCP.

** Databricks provides view_definition only for views that the service account creates in Unity Catalog.

*** Before version 3.0.0, only column source comments are supported. From version 3.0.0 onwards, the connector supports both table and column source comments.

**** Supported only for query-based extraction. From version 3.0.0 onwards, the connector supports extraction of primary and foreign keys.
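The connector version thresholds quoted in the table and footnotes above can be collected into a small lookup, for example to check which version-gated features an installed connector offers. This is only a sketch: the feature keys and function names are hypothetical, the minimum Alation release is not checked, and the primary and foreign key footnote is left out because its threshold depends on the extraction mode.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '2.0.3.6564' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Minimum connector versions as stated in this section.
MIN_CONNECTOR_VERSION = {
    "lineage_extraction": "1.0.3.4144",
    "compose_with_agent": "1.2.1.5335",
    "query_log_ingestion": "2.0.3.6564",
    "compose_oauth": "2.1.0",
    "table_source_comments": "3.0.0",
}


def supported_features(connector_version: str) -> list:
    """List the version-gated features available in the given connector version."""
    current = parse_version(connector_version)
    return sorted(
        name
        for name, minimum in MIN_CONNECTOR_VERSION.items()
        if current >= parse_version(minimum)
    )


if __name__ == "__main__":
    print(supported_features("2.0.3.6564"))
```

Tuples compare element by element, so a version with a build number, such as 2.1.0.7000, still satisfies a three-part threshold such as 2.1.0.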