Azure SQL DB OCF Connector: Install and Configure¶
Alation Cloud Service Applies to Alation Cloud Service instances of Alation
Customer Managed Applies to customer-managed instances of Alation
Prerequisites¶
Network Connectivity¶
Open the outbound TCP port 1433 to Azure SQL DB (default).
Note
Azure SQL DB instances can listen on non-standard ports. Open the appropriate outbound TCP ports from the Alation server.
Service Account¶
Currently, only SQL authentication has been certified for this data source type by Alation. Kerberos authentication, NTLM, and SSL support will be certified with upcoming connector versions.
SQL Authentication¶
For SQL authentication, create a service account for Alation on the master database.
Example:
CREATE LOGIN [login_value] WITH PASSWORD = '<password>';
CREATE USER [service_account] FOR LOGIN [login_value];
Create the corresponding user account in Azure SQL DB.
Example:
CREATE USER [login_name] FOR LOGIN [login_name];
Azure Active Directory¶
Note
Azure Active Directory is now called Microsoft Entra.
Set an admin account in the Azure portal. Log in to the Azure SQL DB using your admin credentials.
For more information on setting up an admin account, refer to Set up an admin account.
Create a service account for Alation on Azure SQL DB.
Example
CREATE USER [<Microsoft_Entra_principal_name>] FROM EXTERNAL PROVIDER;
The Microsoft_Entra_principal_name can be the user principal email ID of an Azure Active Directory (Microsoft Entra) user.
For more information on Azure Active Directory (Microsoft Entra) user creation, refer to Create Users.
Authentication with Active Directory is supported through basic authentication (username and password). Use the following format for the username when configuring the data source settings in Alation:
For Username, use the Microsoft_Entra_principal_name, which is the email ID of an Azure Active Directory (Microsoft Entra) user.
For Password, use the Active Directory password of the Microsoft_Entra_principal_name.
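For illustration, with a hypothetical tenant contoso.onmicrosoft.com and a hypothetical service user, the credentials would be entered as:

```
Username: alation.service@contoso.onmicrosoft.com
Password: <Active Directory password for this user>
```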
Permissions for Metadata Extraction¶
Grant the service account the VIEW DEFINITION permission on all schemas you want to extract into the data catalog. This permission allows Alation to extract schema, table, and view metadata.
Example:
GRANT VIEW DEFINITION ON SCHEMA :: schema_name TO [login_name];
Permissions for Sampling and Profiling¶
If you want to use sampling and profiling, grant the service account the SELECT permission on the schemas and tables you want to retrieve samples from.
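A sketch of such a grant, where schema_name, table_name, and [service_account] are placeholders for your own object and principal names:

```sql
-- Allow sampling and profiling across an entire schema
GRANT SELECT ON SCHEMA :: schema_name TO [service_account];

-- Or grant access to an individual table only
GRANT SELECT ON OBJECT :: schema_name.table_name TO [service_account];
```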
Permissions for Query Log Ingestion¶
For query log ingestion, the service account requires the CONTROL permission to access the sys.fn_get_audit_file function. See the details in Query Log Ingestion below.
Example:
GRANT CONTROL TO [login_name];
JDBC URI¶
Minimal Format¶
sqlserver://<hostname_or_ip>:<port>
Example¶
sqlserver://test-101303.database.windows.net:1433;
JDBC URI for Azure Active Directory¶
To obtain the JDBC URI for Azure Active Directory, perform the following steps:
Go to the Azure Portal and locate the Azure SQL DB resource that you want to connect to.
Go to Settings > Connection strings > JDBC.
Copy the JDBC URI.
Example:
jdbc:sqlserver://tf-testal-102414.database.windows.net:1433;database=SqlServerAzDB_<databaseName>;user=<username>;password={password};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;
The connector requires only the protocol, host, and port. The database parameter is optional. Use this parameter if you want to fetch objects from a particular database.
Incremental MDE from Compose¶
When users create tables and views in Compose, Alation triggers a background extraction job to make the new table and view objects available in the catalog. As a result, users will see the corresponding table or view metadata in the data catalog without re-running MDE on the Settings page of the data source.
Azure SQL DB does not allow the database name in CREATE VIEW queries. For incremental MDE from Compose to work, add the database name as the database parameter to the JDBC URI.
Format¶
sqlserver://<hostname_or_ip_address>:<port_number>;database=<database_name>;
Example¶
sqlserver://tf-test-101303.database.windows.net:1433;database=SqlServerAzDB_1;
Note
If the main connection URI does not include the database name but Compose users add the database name to their Compose connection URI, incremental MDE from Compose will fail with an exception.
Configuration in Alation¶
STEP 1: Install the Connector¶
Alation On-Prem¶
Important
Installation of OCF connectors requires Alation Connector Manager to be installed as a prerequisite.
To install an OCF connector:
If this has not been done on your instance, install the Alation Connector Manager: Install Alation Connector Manager.
Ensure that the OCF connector Zip file that you received from Alation is available on your local machine.
Install the connector on the Connectors Dashboard page using the steps in Manage Connectors.
Alation Cloud Service¶
Note
On Alation Cloud Service instances, Alation Connector Manager is available by default.
Ensure that the OCF connector Zip file that you received from Alation is available on your local machine.
Install the connector on the Connectors Dashboard page using the steps in Manage Connectors.
STEP 2: Create and Configure a New Data Source¶
In Alation, add a new data source:
Log in to Alation as a Server Admin.
Expand the Apps menu on the right of the main toolbar and select Sources.
On the Sources page, click +Add on the top right of the page and in the list that opens, click Data Source. This will open the Add a Data Source wizard.
On the first screen of the wizard, specify a name for your data source, assign additional Data Source Admins, if necessary, and click the Continue Setup button on the bottom. The Add a Data Source screen will open.
On the Add a Data Source screen, the only field you should populate is Database Type. From the Database Type dropdown, select the connector name. After that you will be navigated to the Settings page of your new data source.
Note
Agent-based connectors will have the Agent name appended to the connector name.
The name of this connector is Azure SQL DB OCF Connector.
Access Tab¶
On the Access tab, set the data source visibility using these options:
Public Data Source—The data source will be visible to all users of the catalog.
Private Data Source—The data source will be visible to the users allowed access to the data source by Data Source Admins.
You can add new Data Source Admin users in the Data Source Admins section.
General Settings¶
Note
This section describes configuring settings for credentials and connection information stored in the Alation database. If your organization has configured Azure KeyVault or AWS Secrets Manager to hold such information, the user interface for the General Settings page will change to include vault selection icons to the right of most options.
By default, the database icon is selected. In the vault case, instead of the actual credential information, you enter the ID of the secret. See Configure Secrets for OCF Connector Settings for details.
Perform the configuration on the General Settings tab.
Application Settings¶
Specify Application Settings if applicable. Click Save to save the changes after providing the information.
Parameter |
Description |
---|---|
BI Connection Info |
This parameter is used to generate lineage between the current data source and another source in the catalog, for example a BI source that retrieves data from the underlying database. The parameter accepts host and port information of the corresponding BI data source connection. Use the following format: You can provide multiple values as a comma-separated list:
Find more details in BI Connection Info. |
Disable Automatic Lineage Generation |
Select this checkbox to disable automatic lineage generation from QLI, MDE, and Compose queries. By default, automatic lineage generation is enabled. |
Connector Settings¶
Datasource Connection¶
Populate the data source connection information and save the values by clicking Save in this section.
Parameter |
Description |
---|---|
JDBC URI |
Specify the JDBC URI in the required format. |
Azure Active Directory |
Select to connect using Azure Active Directory. |
Username |
Specify the service account username. If connecting using Azure Active Directory, specify the Microsoft_Entra_principal_name. |
Password |
Specify the service account password. If connecting using Azure Active Directory, specify the Active Directory password. |
Enable SSL |
Not supported yet |
Enable Kerberos Authentication |
Not supported yet |
Keytab |
Not supported yet |
Enable NTLM |
Not supported yet |
Logging Configuration¶
Select the logging level for the connector logs and save the values by clicking Save in this section. The available log levels are based on the Log4j framework.
Parameter |
Description |
---|---|
Log level |
Select the log level to generate logs. The available options are INFO, DEBUG, WARN, TRACE, ERROR, FATAL, ALL. |
Obfuscate Literals¶
Obfuscate Literals—Enable this toggle to hide actual values in the query statements that are ingested during query log ingestion or executed in Compose. This toggle is disabled by default.
Test Connection¶
Under Test Connection, click Test to validate network connectivity.
Metadata Extraction¶
You can configure metadata extraction (MDE) for an OCF data source on the Metadata Extraction tab of the Settings page. Refer to Configure Metadata Extraction for OCF Data Sources for information about the available configuration options.
The default queries that this connector uses to extract metadata can be found in Extraction Queries for Azure SQL DB. You can customize these queries to adjust the extraction to your specific needs.
Compose¶
For details about configuring the Compose tab of the Settings, refer to Configure Compose for OCF Data Sources.
Note
User impersonation in Compose with a kerberized account is not supported yet.
Creating Tables and Views in Compose¶
To ensure successful incremental MDE from Compose, use fully qualified table names when creating tables. For Alation to incrementally extract views created in Compose, both the service account connection JDBC URI and the Compose connection URI should include the database parameter. For more information on the URI format, see JDBC URI above.
Example SQL for CREATE TABLE:
CREATE TABLE <DATABASE>.<SCHEMA>.<TABLE> ({column properties});
Example SQL for CREATE VIEW:
USE <DATABASE>; CREATE VIEW <SCHEMA>.<VIEW> AS {view condition};
Sampling and Profiling¶
Sampling and profiling is supported. For details, see Configure Sampling and Profiling for OCF Data Sources.
Query Log Ingestion¶
The Azure SQL DB data source supports custom query-based query log ingestion (QLI). QLI requires that auditing is enabled in Azure.
Important
To use the QLI functionality, ensure that the service account has the CONTROL permission, as it needs to access the sys.fn_get_audit_file function. For details, refer to the corresponding Azure documentation: sys.fn_get_audit_file (Transact-SQL).
Enable Auditing in Azure¶
To enable auditing in Azure:
Log in to the Azure portal and navigate to your Azure resource.
Click on Auditing under Security.
Switch Enable Azure SQL Auditing to ON.
Select the Storage checkbox.
Select your Subscription.
Select a storage account. The storage account is a container to store the audit files. We recommend creating a new storage account for Alation. Select general purpose v1 as the account kind.
Expand Advanced properties.
Set the appropriate retention period. For example, you can change the retention period to 1. This assumes that Alation ingests audit files daily; ensure that you select the corresponding configuration in the data source settings in Alation.
With the retention period set to 1, Azure will log one day of query data in the storage. When configuring query log ingestion in Alation, set the automatic QLI to run daily to ingest this batch of query history. We recommend a low value for the retention period to avoid filling up the storage account with logs and to reduce the cost of the resource. Keeping this number as low as possible also helps reduce the amount of logs retrieved into Alation per query log ingestion job.
If the retention period is set to 0, Azure will store the logs until they are manually deleted (0 means indefinite storage). This is not a recommended setting.
Click Save. Auditing is now enabled, and audit data will be streamed to a binary flat file format (.xel).
Note
The .xel file is stored in the container that was selected in step 6.
Configure QLI in Alation¶
QLI can be configured on the Query Log Ingestion tab of the data source Settings page. The template for the QLI query is given below. You can customize it by adding, removing, or changing the filter, but the columns and their aliases must remain as is since Alation expects this query structure.
Note
When using the example below, substitute the placeholder <container_URI> with the container URI found in your container properties, for example: https://testaccountvipra.blob.core.windows.net/sqldbauditlogs/my_database/.
Do not substitute the STARTTIME and ENDTIME parameters in the WHERE filter. These parameters are not actual column names and should stay as is. They are expected by the connector and will be substituted with the start and end date of the QLI range selected in the user interface when QLI is run manually or on schedule.
To configure query-based QLI:
Go to the Query Log Ingestion tab of the Settings page of your OCF data source.
Under Connector Settings > Query Extraction, in the Custom QLI Query field, specify the QLI query.
Click Save.
QLI Query Template¶
SELECT server_principal_name as userName,
       event_time as startTime,
       [statement] as queryString,
       session_id as sessionId,
       event_time as sessionStartTime,
       duration_milliseconds as milliSeconds,
       'N' as cancelled,
       database_name as defaultDatabases
FROM sys.fn_get_audit_file('<container_URI>', default, default)
WHERE event_time BETWEEN (STARTTIME) AND (ENDTIME);
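As an example of customizing the filter, you could restrict ingestion to queries that ran against a single database. This is a sketch: my_database is a hypothetical database name, and the column list and aliases stay unchanged, as the connector expects them:

```sql
SELECT server_principal_name as userName,
       event_time as startTime,
       [statement] as queryString,
       session_id as sessionId,
       event_time as sessionStartTime,
       duration_milliseconds as milliSeconds,
       'N' as cancelled,
       database_name as defaultDatabases
FROM sys.fn_get_audit_file('<container_URI>', default, default)
WHERE event_time BETWEEN (STARTTIME) AND (ENDTIME)
  -- Additional filter: only ingest history for one database
  AND database_name = 'my_database';
```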
Perform QLI¶
You can either perform QLI manually on demand or enable automated QLI:
To perform manual QLI, under the Automated and Manual Query Log Ingestion section of the Query Log Ingestion tab, ensure that the Enable Automated Query Log Ingestion toggle is disabled.
Note
Metadata extraction must be completed before running QLI.
Click Preview to get a sample of the query history data to be ingested.
Click the Import button to perform QLI on demand.
To schedule QLI, enable the Enable Automated Query Log Ingestion toggle.
Set a schedule under Automated Query Log Ingestion Time by specifying values in the week, day, and time fields. The next QLI job will run on the schedule you have specified.
Note
The hourly schedule for automated QLI is not supported.
Log Location¶
Refer to Troubleshooting for information about logs.