Databricks Unity Catalog Connector Extraction Queries¶
Alation Cloud Service Applies to Alation Cloud Service instances of Alation
Customer Managed Applies to customer-managed instances of Alation
Important
This section is applicable from connector versions 3.0.0 onwards.
The default extraction queries for Databricks Unity Catalog Connector OCF data sources are listed below. You can customize them to better suit your extraction requirements. Custom queries can be provided on the Metadata Extraction tab of the data source Settings page under the Custom extraction queries section.
You can customize all or some of the queries.
If specified, custom extraction queries will take precedence over the default MDE , selective extraction, and schema selection.
Catalog¶
Make sure that your query has a column labeled as CATALOG
in the SELECT statement.
SELECT
CATALOG_NAME AS CATALOG
FROM
SYSTEM.INFORMATION_SCHEMA.CATALOGS
WHERE
CATALOG_NAME NOT IN('system');
Schema¶
Make sure that your query has columns labeled as CATALOG
, SCHEMA
and REMARKS
in the SELECT statement.
SELECT
CATALOG_NAME AS CATALOG,
SCHEMA_NAME AS SCHEMA,
COMMENT AS REMARKS
FROM
SYSTEM.INFORMATION_SCHEMA.SCHEMATA
WHERE
CATALOG_NAME NOT IN ('system')
AND CONCAT(CATALOG_NAME,
'.',
SCHEMA_NAME) NOT IN ('''')
AND SCHEMA_NAME NOT IN('information_schema');
Table¶
Make sure that your query has columns labeled as CATALOG
, SCHEMA
, TABLE_NAME
, TABLE_TYPE
, and REMARKS
in the SELECT statement.
SELECT
TABLE_CATALOG AS CATALOG,
TABLE_SCHEMA AS SCHEMA,
TABLE_NAME AS TABLE_NAME,
TABLE_TYPE AS TABLE_TYPE,
COMMENT AS REMARKS
FROM
SYSTEM.INFORMATION_SCHEMA.TABLES
WHERE
TABLE_TYPE NOT IN( 'MATERIALIZED_VIEW', 'VIEW')
AND TABLE_CATALOG NOT IN ('system')
AND CONCAT(TABLE_CATALOG,
'.',
TABLE_SCHEMA)
NOT IN ('''')
AND TABLE_SCHEMA NOT IN('information_schema');
View¶
Make sure that your query has columns labeled as CATALOG
, SCHEMA
, VIEW_NAME
, VIEW_CREATE_STATEMENT
, 'VIEW' AS VIEW_TYPE
, and REMARKS
in the SELECT statement.
SELECT
V.TABLE_CATALOG AS CATALOG,
V.TABLE_SCHEMA AS SCHEMA,
V.TABLE_NAME AS VIEW_NAME,
'VIEW' AS VIEW_TYPE,
CASE
WHEN lower(V.VIEW_DEFINITION) LIKE 'create view%' THEN V.VIEW_DEFINITION
WHEN lower(V.VIEW_DEFINITION) LIKE 'create or replace view%' THEN V.VIEW_DEFINITION
WHEN lower(V.VIEW_DEFINITION) LIKE 'create materialized view%' THEN V.VIEW_DEFINITION
WHEN lower(V.VIEW_DEFINITION) LIKE 'create or replace materialized view%' THEN V.VIEW_DEFINITION
WHEN T.TABLE_TYPE = 'MATERIALIZED_VIEW' THEN 'CREATE OR REPLACE MATERIALIZED VIEW ' || V.TABLE_CATALOG || '.' || V.TABLE_SCHEMA || '.' || V.TABLE_NAME || ' AS ' || V.VIEW_DEFINITION
ELSE 'CREATE OR REPLACE VIEW ' || V.TABLE_CATALOG || '.' || V.TABLE_SCHEMA || '.' || V.TABLE_NAME || ' AS ' || V.VIEW_DEFINITION
END AS VIEW_CREATE_STATEMENT,
T.COMMENT AS REMARKS
FROM
SYSTEM.INFORMATION_SCHEMA.VIEWS V
LEFT JOIN SYSTEM.INFORMATION_SCHEMA.TABLES T
ON
V.TABLE_CATALOG = T.TABLE_CATALOG
AND V.TABLE_SCHEMA = T.TABLE_SCHEMA
AND T.TABLE_TYPE IN ('MATERIALIZED_VIEW', 'VIEW')
AND V.TABLE_NAME = T.TABLE_NAME
WHERE
V.TABLE_CATALOG NOT IN ('system')
AND CONCAT(V.TABLE_CATALOG,
'.',
V.TABLE_SCHEMA) NOT IN ('''')
AND V.TABLE_SCHEMA NOT IN('information_schema');
Column¶
Make sure that your query has columns labeled as CATALOG
, SCHEMA
, TABLE_NAME
, TYPE_NAME
, DATA_TYPE
, COLUMN_NAME
, ORDINAL_POSITION
, IS_NULLABLE
, REMARKS
, and COLUMN_DEFAULT
in the SELECT statement.
SELECT
TABLE_CATALOG AS CATALOG,
TABLE_SCHEMA AS SCHEMA,
TABLE_NAME AS TABLE_NAME,
COLUMN_NAME AS COLUMN_NAME,
ORDINAL_POSITION AS ORDINAL_POSITION,
DATA_TYPE AS TYPE_NAME,
FULL_DATA_TYPE AS DATA_TYPE,
FULL_DATA_TYPE AS COLUMN_DEF,
COLUMN_DEFAULT AS COLUMN_DEFAULT,
IS_NULLABLE AS IS_NULLABLE,
COMMENT AS REMARKS
FROM
SYSTEM.INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_CATALOG NOT IN ('system')
AND CONCAT(TABLE_CATALOG,
'.',
TABLE_SCHEMA) NOT IN ('''')
AND TABLE_SCHEMA NOT IN('information_schema');
Primary Key¶
Make sure that your query has columns labeled as CATALOG
, SCHEMA
, TABLE_NAME
, and COLUMN_NAME
in the SELECT statement.
SELECT
KCU.TABLE_CATALOG AS CATALOG,
KCU.TABLE_NAME AS TABLE_NAME,
KCU.COLUMN_NAME AS COLUMN_NAME,
KCU.TABLE_SCHEMA AS SCHEMA
FROM
SYSTEM.INFORMATION_SCHEMA.TABLE_CONSTRAINTS TC
JOIN SYSTEM.INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU
ON
KCU.CONSTRAINT_SCHEMA = TC.CONSTRAINT_SCHEMA
AND KCU.CONSTRAINT_NAME = TC.CONSTRAINT_NAME
AND KCU.TABLE_SCHEMA = TC.TABLE_SCHEMA
AND KCU.TABLE_NAME = TC.TABLE_NAME
WHERE
KCU.TABLE_CATALOG NOT IN ('system')
AND TC.CONSTRAINT_TYPE = 'PRIMARY KEY'
AND
CONCAT(KCU.TABLE_CATALOG,
'.',
KCU.TABLE_SCHEMA)
NOT IN ('''')
AND KCU.TABLE_SCHEMA NOT IN
('information_schema');
Foreign Key¶
Make sure that your query has columns labeled as PK_CATALOG
, PK_SCHEMA
, PK_TABLE
, PK_COLUMN
, FK_CATALOG
, FK_SCHEMA
, FK_TABLE
, and FK_COLUMN
in the SELECT statement.
SELECT
KCUF.TABLE_CATALOG AS FK_CATALOG,
RC.CONSTRAINT_NAME FK_NAME,
KCUF.TABLE_SCHEMA FK_SCHEMA,
KCUF.TABLE_NAME FK_TABLE,
KCUF.COLUMN_NAME FK_COLUMN,
RC.UNIQUE_CONSTRAINT_NAME PK_NAME,
KCUP.TABLE_CATALOG AS PK_CATALOG,
KCUP.TABLE_SCHEMA PK_SCHEMA,
KCUP.TABLE_NAME PK_TABLE,
KCUP.COLUMN_NAME PK_COLUMN
FROM
SYSTEM.INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS RC
JOIN SYSTEM.INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCUF ON
RC.CONSTRAINT_NAME = KCUF.CONSTRAINT_NAME
JOIN SYSTEM.INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCUP ON
RC.UNIQUE_CONSTRAINT_NAME = KCUP.CONSTRAINT_NAME
JOIN SYSTEM.INFORMATION_SCHEMA.TABLE_CONSTRAINTS TC ON
KCUP.CONSTRAINT_SCHEMA = TC.CONSTRAINT_SCHEMA
AND KCUF.CONSTRAINT_NAME = TC.CONSTRAINT_NAME
WHERE
KCUF.TABLE_CATALOG NOT IN ('system')
AND TC.CONSTRAINT_TYPE = 'FOREIGN KEY'
AND CONCAT(KCUF.TABLE_CATALOG,
'.',
KCUF.TABLE_SCHEMA)
NOT IN ('''')
AND KCUF.TABLE_SCHEMA NOT IN
('information_schema');