HDFS OCF Connector: Overview
Applies to: Alation Cloud Service instances and customer-managed instances of Alation
Available from release 2022.3.2
The OCF connector for HDFS is developed by Alation and is available on demand as a ZIP file that you upload and install in the Alation application.
The latest HDFS OCF connector package is available on the Connector Hub. Ask an Alation admin with access to the Customer Portal to download the connector from the Connectors section of the Portal (Customer Portal > Connectors > Alation Connector Hub).
Use the OCF connector for HDFS to catalog HDFS as a file system source in Alation. The connector extracts HDFS objects: root folders and their content, which includes files and folders. Catalog users can then discover, search, browse, and curate these objects from the Alation user interface. The HDFS OCF connector can catalog metadata from an HDFS file system deployed on Cloudera Distributed Hadoop (CDH) or Cloudera Data Platform (CDP).
Team
The following administrators are required to install this connector:
- Alation Server Admin:
  - Validates the availability of Alation Connector Manager or installs it
  - Installs the connector
  - Adds and configures the HDFS file system source in Alation
- Kerberos Server Administrator:
  - Provides the Kerberos configuration file (krb5.conf)
Scope
The table below lists the features supported by the connector.
Feature | Scope | Availability |
---|---|---|
Authentication | | |
Basic (username) | Authentication with a service account. | Yes |
SSL | Connection over the TLS protocol. | Yes |
Kerberos | Authentication with the Kerberos protocol. | Yes |
LDAP | Authentication with the LDAP protocol. | No |
OAuth | Authentication with the OAuth 2.0 protocol. | No |
SSO | Authentication using an SSO flow through an IdP application. | No |
Metadata extraction (MDE) | Metadata extraction uses the WebHDFS REST API. The HDFS OCF connector calls the List a Directory API to get the list of files and folders. | Yes |
Extracted metadata objects | | |
Files | Files on the HDFS server. | Yes |
Folders | Folders on the HDFS server. | Yes |
Attributes or columns | Table attributes or columns within a file or folder on the HDFS server. | Yes |
Object permissions | Information related to object permissions. | Yes |
Object owner | Information related to the object owner. | Yes |
Object group | Object owner group name. | Yes |
Lineage | | |
Automatic lineage generation | Auto-calculation of lineage based on query history ingested from MDE queries. | Yes |
Direct lineage | Extraction of lineage from system tables during MDE. | Yes |
Column-level lineage | Extraction of lineage on the column level. | No |
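
To make the metadata extraction row more concrete, the sketch below shows what a WebHDFS List a Directory request (`op=LISTSTATUS`) and its response look like from a client's point of view. It is an illustration of the public WebHDFS REST API, not the connector's actual implementation. The host name, port `9870`, service account name, and sample JSON are all hypothetical placeholders. The `user.name` query parameter applies only to clusters without Kerberos; a Kerberos-secured cluster would need SPNEGO authentication instead, which is not shown here.

```python
import json
from urllib.parse import quote, urlencode


def liststatus_url(host, port, path, user=None, scheme="http"):
    """Build a WebHDFS List a Directory (op=LISTSTATUS) URL."""
    params = {"op": "LISTSTATUS"}
    if user:
        # user.name is the WebHDFS identity parameter for non-Kerberos clusters.
        params["user.name"] = user
    return f"{scheme}://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(params)}"


def parse_liststatus(body):
    """Pull name, type, owner, group, and permission out of a LISTSTATUS response."""
    statuses = json.loads(body)["FileStatuses"]["FileStatus"]
    return [
        {
            "name": s["pathSuffix"],
            "type": s["type"],  # "FILE" or "DIRECTORY"
            "owner": s["owner"],
            "group": s["group"],
            "permission": s["permission"],  # octal string such as "755"
        }
        for s in statuses
    ]


# Hand-written sample shaped like a WebHDFS FileStatuses response (not real output).
SAMPLE = """
{"FileStatuses": {"FileStatus": [
  {"pathSuffix": "landing", "type": "DIRECTORY", "owner": "hdfs",
   "group": "supergroup", "permission": "755", "length": 0},
  {"pathSuffix": "events.csv", "type": "FILE", "owner": "etl",
   "group": "analytics", "permission": "644", "length": 1024}
]}}
"""

if __name__ == "__main__":
    # Hypothetical NameNode host, port, and service account.
    print(liststatus_url("namenode.example.com", 9870, "/data/raw", user="svc_alation"))
    for entry in parse_liststatus(SAMPLE):
        print(entry)
```

The fields kept by `parse_liststatus` correspond to the Files, Folders, Object permissions, Object owner, and Object group rows in the table above. Passing `scheme="https"` would model a TLS-enabled endpoint.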
Object Mapping
File System Object | Alation Object |
---|---|
File system | File system source |
Folder | Directory or File |
File | File |