Overview¶
Applies to: Alation Cloud Service and customer-managed instances of Alation
Available from release 2022.3.5
The OCF connector for Amazon S3 is developed by Alation and is available for download from Alation’s Customer Portal (Customer Portal homepage > Connectors).
Use this connector to catalog S3 buckets as a file system source in Alation.
The connector offers these capabilities:
Metadata extraction: Extracts and catalogs S3 buckets and their contents (folders and files). Users can discover, search, browse, and curate these S3 objects as Alation objects in the Alation user interface.
Column or schema extraction: Extracts and catalogs column headers found in semi-structured file formats. Currently supported formats are Parquet, CSV, PSV, and TSV. Users can search and curate the column headers cataloged from each file as column objects. Because it reads individual files, column or schema extraction can be time-intensive.
File sampling: Catalog users can initiate S3 file sampling on demand. Sampling retrieves randomly sampled rows of data from the file, giving deeper insight into the file's structure and data.
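As an illustration of what column extraction and file sampling involve for delimited files, here is a minimal Python sketch. It is not the connector's implementation: the extension-to-delimiter mapping, the function names `extract_column_headers` and `sample_rows`, and the use of reservoir sampling are all assumptions made for this example. Parquet is left out because reading it would need a third-party library.

```python
import csv
import io
import random

# Assumed mapping from file extension to delimiter (illustrative only;
# the connector's real detection logic is not described here).
DELIMITERS = {".csv": ",", ".psv": "|", ".tsv": "\t"}


def extract_column_headers(filename, text):
    """Return the first row of a delimited file as a list of column names."""
    for ext, delimiter in DELIMITERS.items():
        if filename.lower().endswith(ext):
            reader = csv.reader(io.StringIO(text), delimiter=delimiter)
            return next(reader, [])
    raise ValueError(f"unsupported file type: {filename}")


def sample_rows(rows, k, seed=None):
    """Pick up to k rows uniformly at random in one pass (reservoir sampling)."""
    rng = random.Random(seed)
    reservoir = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = row
    return reservoir
```

For example, `extract_column_headers("orders.psv", "id|name|total\n1|a|2\n")` yields the three header names. Reservoir sampling was chosen for the sketch because it reads the file once without holding every row in memory.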
To retrieve the S3 bucket metadata, the connector reads from an Amazon S3 inventory. For more information about inventories, refer to Amazon S3 Inventory in AWS documentation.
Note
When you run metadata extraction (MDE), the Amazon S3 OCF connector obtains the list of inventory files from the latest manifest.json file for the respective bucket. Manifest files are available at the following location in the destination bucket:
destination-prefix/source-bucket/config-ID/YYYY-MM-DDTHH-MMZ/manifest.json
For more information about the manifest.json file, refer to Inventory manifest in AWS documentation.
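The lookup described in this note can be sketched in Python as follows. This is a hedged illustration, not the connector's code: the function names `latest_manifest_key` and `inventory_file_keys` are hypothetical, and the sketch assumes you already have a list of object keys from the destination bucket. It relies on the timestamp folder format shown above and on the `files` array (with a `key` per entry) that AWS documents for manifest.json.

```python
from datetime import datetime

MANIFEST_SUFFIX = "/manifest.json"


def latest_manifest_key(keys, prefix):
    """Return the newest manifest.json key under prefix, or None.

    prefix is assumed to be 'destination-prefix/source-bucket/config-ID/'.
    """
    best = None
    for key in keys:
        if not (key.startswith(prefix) and key.endswith(MANIFEST_SUFFIX)):
            continue
        stamp = key[len(prefix):-len(MANIFEST_SUFFIX)]
        try:
            # Folder names follow YYYY-MM-DDTHH-MMZ.
            when = datetime.strptime(stamp, "%Y-%m-%dT%H-%MZ")
        except ValueError:
            continue  # not a timestamp folder; skip it
        if best is None or when > best[0]:
            best = (when, key)
    return best[1] if best else None


def inventory_file_keys(manifest):
    """List the inventory data file keys named in a parsed manifest.json."""
    return [entry["key"] for entry in manifest.get("files", [])]
```

Parsing the folder name as a timestamp, instead of sorting keys as strings, keeps the sketch from picking up non-timestamp folders that may sit under the same prefix.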
Team¶
The following administrators are required to install this connector:
Amazon S3 administrator:
Performs the required configuration in S3, such as configuring the inventory and providing access for Alation.
Creates an IAM user and roles to authorize read-only access to the data and inventory buckets.
Assists in collecting the necessary configuration information from AWS.
Alation Server Admin:
Installs the Alation Connector Manager if required.
Installs the connector.
Adds and configures the S3 file system source in Alation.
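To make the "read-only access to the data and inventory buckets" task more concrete, the following Python sketch builds an IAM policy document of that general shape. It is an assumption-laden example, not the permission set Alation requires: the function name `read_only_policy`, the bucket names, and the specific list of S3 actions are chosen for illustration only, so confirm the required permissions with the connector's configuration guide before using anything like it.

```python
import json


def read_only_policy(data_bucket, inventory_bucket):
    """Build a read-only S3 policy covering both buckets (illustrative)."""
    buckets = [data_bucket, inventory_bucket]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": [f"arn:aws:s3:::{b}" for b in buckets],
            },
            {
                # Object-level reads apply to keys inside each bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{b}/*" for b in buckets],
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket names, for demonstration only.
    print(json.dumps(read_only_policy("example-data", "example-inventory"), indent=2))
```

Splitting bucket-level and object-level actions into separate statements mirrors how S3 scopes those permissions to different resource ARNs.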
Scope¶
The table below shows the catalog features supported by this connector. For version support information, refer to Support Matrices.
Feature | Availability |
---|---|
Core Capabilities | |
Metadata extraction (MDE) via basic authentication (AWS Access Key and Secret Key) | |
Metadata extraction (MDE) via AWS IAM role and IAM user-based STS authentication | |
Column extraction | |
Search | |
Catalog page curation | |
Catalog sets | |
Propagation of trust flags | |
Popularity | |
Sampling | |
File sampling via basic authentication or SSO | |
File sampling via STS authentication | |
Extracted Metadata | |
Folders | |
Files | |
Columns | |