Kafka OCF Connector: Overview

Applies to Alation Cloud Service and customer-managed instances of Alation.

The OCF connector for Apache Kafka is available as a ZIP file from Alation's Connector Hub (requires a login via the Alation Customer Portal). The connector file can be uploaded and installed in the Alation application. The connector is compiled together with the required database driver, so no additional effort is needed to procure or install the driver.

This connector should be used to catalog Apache Kafka or Confluent Kafka as a data source on Alation on-premises and Cloud Service instances. It extracts and catalogs database objects such as schemas, tables, and columns. After the metadata is extracted, it is represented in the data catalog as a hierarchy of catalog pages under the parent data source. Alation users can leverage the full catalog functionality to search for and find the extracted metadata, curate the corresponding catalog pages, create documentation about the data source, and exchange information about it.

Team

The following administrators are required to install this connector:

  • Kafka administrator:

    • Creates a service account with the required privileges to extract metadata.

    • Provides the JDBC URI.

    • Provides the authentication information and assists in configuring the authentication.

  • Alation administrator:

    • Installs Alation Connector Manager, or ensures that it is already installed and running.

    • Installs the connector.

    • Creates and configures a Kafka data source in the catalog.

    • Performs initial extraction and prepares the data source for Alation users.
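Among the connection details the Kafka administrator provides is the bootstrap server value, which is a comma-separated list of `host:port` pairs. The sketch below (plain Python; the function name and validation rules are illustrative, not part of the connector) shows the expected shape of that value:

```python
def parse_bootstrap_servers(value: str) -> list[tuple[str, int]]:
    """Split a comma-separated 'host:port,host:port' string into (host, port) pairs."""
    servers = []
    for entry in value.split(","):
        entry = entry.strip()
        # rpartition tolerates IPv6-style hosts that themselves contain colons.
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        servers.append((host, int(port)))
    return servers

print(parse_bootstrap_servers("broker1.example.com:9092, broker2.example.com:9093"))
```

A value that fails this shape check (for example, a missing port) is typically the first thing to verify when a connection attempt fails.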

Scope

The table below lists the features supported by the connector.

| Feature | Scope | Availability |
|---|---|---|
| **Authentication** | | |
| Basic | Authentication with a service account created on the database using a username, password, and bootstrap server | Yes |
| SSL | Connection over the TLS protocol | No |
| LDAP | Authentication with the LDAP protocol | No |
| OAuth | Authentication with the OAuth 2.0 protocol | No |
| SSO | Authentication using an SSO flow through an IdP application | No |
| **Metadata Extraction (MDE)** | | |
| Default MDE | Extraction of metadata based on the JDBC driver methods in the connector code | Yes |
| Custom query-based MDE | Extraction of metadata based on extraction queries provided by a user | No |
| Popularity | Indicator of the popularity (intensity of use) of a data object, such as a table or a column | No |
| **Extracted metadata objects** | | |
| Data source | Data source object in Alation that is parent to the extracted metadata | Yes |
| Schemas | List of schemas | Yes |
| Tables | List of tables | Yes |
| Columns | List of columns | Yes |
| Column data types | Column data types | Yes |
| Views | List of views | N/A |
| Source comments | Source comments | N/A |
| Primary keys | Primary key information for extracted tables | N/A |
| Foreign keys | Foreign key information for extracted tables | N/A |
| Functions | Extraction of function metadata | N/A |
| Function definitions | Extraction of function definition metadata | N/A |
| **Sampling and Profiling** | | |
| Table sampling | Retrieval of data samples from extracted tables | Yes |
| Column sampling | Retrieval of data samples from extracted columns | Yes |
| Deep column profiling | On-demand profiling of specific columns with the calculation of value distribution stats | Yes |
| Dynamic profiling | On-demand table and column profiling by individual users who use their own database accounts to retrieve the profiles | Yes |
| Custom query-based table sampling | Ability to use custom queries for sampling specific tables | No |
| Custom query-based column profiling | Ability to use custom queries for profiling specific columns | Yes |
| **Query Log Ingestion (QLI)** | Not supported | No |
| **Lineage** | Not supported | No |
| **Compose** | Not supported | No |

Object Mapping

The following table shows how extracted Kafka objects map onto Alation objects:

| Kafka Concept | Mapping in Alation |
|---|---|
| ApacheKafka (single model data source) | Schema |
| Topic or Subject | Table* |
| Message | Records in Table |

* The connector extracts the schema related to the topics from the schema registry and presents it as a Table.
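To illustrate the mapping, the sketch below (plain Python, not connector code; the example schema and field names are made up) flattens an Avro record schema, as a schema registry might return it for a topic's subject, into the table-and-columns view that appears in Alation:

```python
import json

# A hypothetical Avro schema for a topic's subject in the schema registry.
avro_schema = json.loads("""
{
  "type": "record",
  "name": "Order",
  "fields": [
    {"name": "order_id", "type": "long"},
    {"name": "customer", "type": "string"},
    {"name": "amount", "type": ["null", "double"], "default": null}
  ]
}
""")

def avro_type(t) -> str:
    # An Avro union like ["null", "double"] marks a nullable field;
    # report the non-null branch for display.
    if isinstance(t, list):
        branches = [b for b in t if b != "null"]
        return f"{branches[0]} (nullable)" if branches else "null"
    return t if isinstance(t, str) else t.get("type", "complex")

# The topic's schema becomes the "table"; its fields become "columns" with data types.
table = {
    "table": avro_schema["name"],
    "columns": [(f["name"], avro_type(f["type"])) for f in avro_schema["fields"]],
}
print(table)
```

Messages on the topic would then correspond to the records of this table, as in the mapping above.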