Overview
Applies to: Alation Cloud Service and customer-managed instances of Alation
The connector for Azure Data Factory is available for download from the Connector Hub on the Alation Customer Portal. Follow the instructions in this documentation to install or manage the connector.
Use this connector to catalog an Azure Data Factory data source in Alation. The connector extracts Azure Data Factory objects such as data factories, pipelines, and activities (Copy Data, Data Flow, Execute SSIS Package, Execute Pipeline, and general activities) from the resource group specified on the configuration page.
This connector can also extract pipeline parameters, but the current version only supports parameters of the following types: string, integer, float, and boolean.
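As a rough illustration of this restriction, the sketch below splits the parameters of a pipeline definition into supported and unsupported groups. It assumes the type names used in Azure Data Factory pipeline JSON (`String`, `Int`, `Float`, `Bool`); the function name and the sample pipeline are hypothetical and are not part of the connector.

```python
from typing import Dict, Tuple

# Assumption: ADF pipeline JSON spells the four supported types this way.
SUPPORTED_PARAMETER_TYPES = {"String", "Int", "Float", "Bool"}


def split_parameters(pipeline: dict) -> Tuple[Dict[str, dict], Dict[str, dict]]:
    """Split a pipeline's parameters into (supported, skipped) by declared type."""
    params = pipeline.get("properties", {}).get("parameters", {})
    supported, skipped = {}, {}
    for name, spec in params.items():
        if spec.get("type") in SUPPORTED_PARAMETER_TYPES:
            supported[name] = spec
        else:
            skipped[name] = spec
    return supported, skipped


# Hypothetical pipeline definition, trimmed to its parameters block.
example_pipeline = {
    "name": "CopySalesData",
    "properties": {
        "parameters": {
            "sourceFolder": {"type": "String", "defaultValue": "raw/sales"},
            "batchSize": {"type": "Int", "defaultValue": 500},
            "threshold": {"type": "Float", "defaultValue": 0.75},
            "dryRun": {"type": "Bool", "defaultValue": False},
            "regions": {"type": "Array", "defaultValue": ["eu", "us"]},
            "apiKey": {"type": "SecureString"},
        }
    },
}

supported, skipped = split_parameters(example_pipeline)
print(sorted(supported))  # ['batchSize', 'dryRun', 'sourceFolder', 'threshold']
print(sorted(skipped))    # ['apiKey', 'regions']
```

Under these assumptions, array, object, and secure string parameters would fall into the skipped group.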
Transformations used within a data flow are also captured and streamed as columns in Alation. Lineage is supported at the table level.
You can search for Azure Data Factory (ADF) objects, curate the corresponding catalog object pages, and use Lineage diagrams to understand how the data is transformed. This connector supports both internal lineage (lineage between tables) and cross-system lineage (lineage that connects source and target tables to the tables of data sources already scanned in Alation).
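The distinction between the two kinds of lineage can be pictured with a small sketch. The data model below (`TableRef`, `classify_edge`, and the sample names) is purely hypothetical and is not how Alation stores lineage; it only assumes that an edge is internal when both endpoints belong to the ADF data source and cross-system otherwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRef:
    data_source: str  # data source the table is cataloged under
    schema: str
    table: str


def classify_edge(source: TableRef, target: TableRef, adf_data_source: str) -> str:
    """Label a lineage edge as 'internal' or 'cross-system'."""
    both_in_adf = (
        source.data_source == adf_data_source
        and target.data_source == adf_data_source
    )
    return "internal" if both_in_adf else "cross-system"


ADF = "Azure Data Factory"  # hypothetical data source title
copy_source = TableRef(ADF, "SalesPipeline", "CopyOrders_source")
copy_sink = TableRef(ADF, "SalesPipeline", "CopyOrders_sink")
warehouse_table = TableRef("Snowflake", "SALES", "ORDERS")

print(classify_edge(copy_source, copy_sink, ADF))      # internal
print(classify_edge(copy_sink, warehouse_table, ADF))  # cross-system
```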
Team
You may need the assistance of your database administrator to configure this data source.
Azure Data Factory administrator:
- Creates a service account for Alation.
- Registers an application with Microsoft Entra ID.
- Grants read scope to the registered application.
- Creates a client secret and adds files.

Alation server administrator:
- Ensures that Alation Connector Manager is installed and running, or installs it.
- Installs the connector.
- Creates and configures the Azure Data Factory data source in the catalog.
- Performs the initial extraction and prepares the data source for Alation users.
Scope
The table below lists the features supported by the connector.
Feature | Scope | Availability |
---|---|---|
Authentication | | |
Basic (username and password) | Authentication with a service account created on the database using username and password | |
Key pair authentication | Authentication with the private-public key pair | |
OAuth | Authentication with an Azure token. The token relies on the following parameters from the connector’s General Settings screen: Tenant ID, Client ID, and Client Secret. | |
Metadata extraction (MDE) | | |
Default MDE | Extraction of metadata by the Azure Data Factory OCF connector through the REST API, based on default queries in the connector code | |
Query-based MDE | Extraction of metadata based on custom extraction queries provided by a user | |
Extracted metadata objects | | |
Columns | List of columns | |
Column data types | Column data types | |
External tables | Extraction of external table metadata | |
Foreign keys | Foreign key information for extracted tables | |
Functions | Extraction of function metadata | |
Function Definitions | Extraction of function definition metadata | |
Policies | Extraction of row access policies and data masking policies. Available if Policy Center is enabled in the Governance application. (Paid feature) | |
Primary keys | Primary key information for extracted tables | |
Schemas | List of schemas | |
Source comments | Source comments | Not applicable |
Stored procedures | Extraction of stored procedure metadata that appears in search results | |
Tables | List of tables | |
Tags | Extraction of tags | |
Views | List of views | |
Sampling and Profiling | | |
Column sampling | Retrieval of data samples from extracted columns | |
Custom query-based table sampling | Ability to use custom queries for sampling specific tables | |
Custom query-based column sampling | Ability to use custom queries for profiling specific columns | |
Deep column profiling | Profiling of columns with the calculation of value distribution stats | |
Dynamic profiling | Ability for individual users to connect with their own database accounts to retrieve table and column samples and profiles | |
Table sampling | Retrieval of data samples from extracted tables | |
Query Log Ingestion (QLI) - Not supported | | |
Lineage | | |
Table-level Lineage | Auto-calculation of lineage based on query history ingested from MDE queries | |
Cross-system Lineage | Lineage across systems | |
Column-level lineage | Calculation of lineage data at the column level | |
Compose - Not supported | | |
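To make the OAuth and Default MDE rows more concrete, the sketch below shows how a Tenant ID, Client ID, and Client Secret are typically turned into a token request against the Microsoft identity platform, and how a REST URL for listing the pipelines of a data factory is formed. This is an assumption-laden illustration, not the connector's actual code: the function names are hypothetical, the subscription ID is an extra input not mentioned in this document, and nothing here confirms which endpoints the connector calls internally.

```python
from urllib.parse import urlencode

# Public Azure Resource Manager scope and ADF REST API version (assumed here).
ARM_SCOPE = "https://management.azure.com/.default"
API_VERSION = "2018-06-01"


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return (url, form_body) for an OAuth 2.0 client-credentials token request.

    The body is meant to be sent as a POST with
    Content-Type: application/x-www-form-urlencoded.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": ARM_SCOPE,
        }
    )
    return url, body


def list_pipelines_url(subscription_id: str, resource_group: str, factory_name: str) -> str:
    """Build the management REST URL that lists pipelines in one data factory."""
    return (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.DataFactory"
        f"/factories/{factory_name}/pipelines"
        f"?api-version={API_VERSION}"
    )


# Placeholder values only; nothing is sent over the network.
token_url, token_body = build_token_request("my-tenant-id", "my-client-id", "my-secret")
print(token_url)
print(list_pipelines_url("my-subscription-id", "my-resource-group", "my-factory"))
```

Neither function sends a request. A caller would POST the form body to the token URL, then pass the returned access token as a `Bearer` token in the `Authorization` header of the GET request to the pipelines URL.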
Lineage
Data Source or Target | Table-level Lineage | Cross-system Lineage |
---|---|---|
Azure Blob Storage | | |
Amazon RDS | | |
Azure SQL Database | | |
Oracle | | |
Snowflake | | |
Object Mapping
Azure Data Factory Object | Alation Object |
---|---|
Activity Type | Table Source Comments |
Copy Data Activity Mappings | Columns |
Copy Data Activity Source | Table |
Copy Data Activity Sink | Table |
Data Flow Activity | Table |
Data Factory Name or Pipeline Name | Schema |
Data Flow Expression Name | Columns |
Data Flow Parameter (Supported Parameter Types: String, Int, Boolean, Float) | Table Source Comments |
Data Flow Source | Table |
Data Flow Source Column | Columns |
Data Flow Sink | Table |
Data Flow Sink Columns | Columns |
Execute Pipeline | Table |
Execute Pipeline Parameter Name | Column |
Expression | Column Source Comments |
General Activity Name | Table Name |
General Activity Type | Table Source Comments |
Invoked Pipeline | Table Source Comments |
Parameter Value | Column Source Comments |
Pipeline Parameters (Supported Parameter Types: String, Int, Boolean, Float) | Schema Source Comments |
SSIS Activity Name (only embedded packages are supported) | Table |
SSIS Parameter Name | Column |
SSIS Parameter Value | Column Source Comments |
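One way to read the mapping is in reverse: which ADF objects end up as a given Alation object type. The sketch below encodes a subset of the mapping as a dictionary and inverts it. The dictionary name and helper function are hypothetical, and "Columns" is normalized to "Column" for grouping purposes.

```python
from collections import defaultdict

# Subset of the object mapping above; "Columns" is normalized to "Column".
ADF_TO_ALATION = {
    "Data Factory Name or Pipeline Name": "Schema",
    "Copy Data Activity Source": "Table",
    "Copy Data Activity Sink": "Table",
    "Data Flow Activity": "Table",
    "Data Flow Source": "Table",
    "Data Flow Sink": "Table",
    "Execute Pipeline": "Table",
    "Copy Data Activity Mappings": "Column",
    "Data Flow Expression Name": "Column",
    "Data Flow Source Column": "Column",
    "Execute Pipeline Parameter Name": "Column",
    "Activity Type": "Table Source Comments",
    "Parameter Value": "Column Source Comments",
}


def group_by_alation_object(mapping: dict) -> dict:
    """Invert the mapping: Alation object type -> sorted list of ADF objects."""
    grouped = defaultdict(list)
    for adf_object, alation_object in mapping.items():
        grouped[alation_object].append(adf_object)
    return {kind: sorted(names) for kind, names in grouped.items()}


grouped = group_by_alation_object(ADF_TO_ALATION)
print(grouped["Schema"])      # ['Data Factory Name or Pipeline Name']
print(len(grouped["Table"]))  # 6
```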