OpenLineage Integration with Apache Airflow (Beta)

Applies to: Alation Cloud Service instances of Alation

Overview

The Airflow integration lets Alation consume OpenLineage events from your Airflow environment. Alation makes the resulting cross-source lineage available under the relevant data sources, so users can trace data movement across pipelines and downstream systems.

Key Behaviors

  • Lineage is formed only from successful OpenLineage events. Failed or incomplete events don’t create lineage links.

  • Each event must include both input datasets (sources) and output datasets (targets); Alation relies on this to “stitch” lineage together.

  • Resulting lineage appears on the Lineage tab of relevant objects in the catalog and can participate in Impact Analysis. The Lineage diagram displays additional details:

    • Airflow indicators show jobs originating from Airflow

    • Dataflow details include metadata from the Airflow DAG

    • The job name

    • Namespace

    • Event type

    • Event completion time

    [Figure: example lineage chart with Airflow jobs (OpenLineage_Airflow_ExampleChart.png)]
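The first two behaviors above can be sketched as a small pre-check you might run on an event before expecting lineage from it. This is an illustration only, not Alation's implementation: the function name `is_lineage_eligible` and the sample dataset names are hypothetical, and the sketch assumes that a "successful" event is one whose OpenLineage `eventType` is `COMPLETE`.

```python
from typing import Any, Mapping


def is_lineage_eligible(event: Mapping[str, Any]) -> bool:
    """Return True if the event looks usable for stitching lineage."""
    # Assumption: "successful" means the OpenLineage eventType is COMPLETE.
    if event.get("eventType") != "COMPLETE":
        return False
    # Both input (source) and output (target) datasets must be present.
    return bool(event.get("inputs")) and bool(event.get("outputs"))


# Hypothetical sample events, trimmed to the fields the check reads.
complete_event = {
    "eventType": "COMPLETE",
    "inputs": [{"namespace": "postgres://db.example.com:5432",
                "name": "shop.public.orders"}],
    "outputs": [{"namespace": "snowflake://example-account",
                 "name": "ANALYTICS.PUBLIC.ORDERS"}],
}
failed_event = {**complete_event, "eventType": "FAIL"}
no_output_event = {**complete_event, "outputs": []}

for label, evt in [("complete", complete_event),
                   ("failed", failed_event),
                   ("no outputs", no_output_event)]:
    print(label, is_lineage_eligible(evt))
```

Only the first sample passes the check; the failed event and the event without outputs would not be expected to produce lineage links.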

Supported Airflow Environments

Supported Airflow Versions

Alation supports any Airflow distribution that can install and run the official OpenLineage provider (apache-airflow-providers-openlineage). Consult your distribution’s compatibility matrix to confirm that it supports the provider.
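One quick way to see whether the provider is present in a given Airflow environment is to query the installed package metadata. The sketch below uses only the Python standard library; the helper name `installed_provider_version` is hypothetical, and the check says nothing about whether the installed version is compatible with your Airflow release.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# PyPI name of the official Airflow OpenLineage provider.
PROVIDER_PACKAGE = "apache-airflow-providers-openlineage"


def installed_provider_version(package: str = PROVIDER_PACKAGE) -> Optional[str]:
    """Return the installed version of the package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_provider_version()
if found is None:
    print(f"{PROVIDER_PACKAGE} is not installed in this environment")
else:
    print(f"{PROVIDER_PACKAGE} {found} is installed")
```

Run it with the same Python interpreter that your Airflow scheduler and workers use, since the provider must be installed there.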

Supported Operators

OpenLineage events are generated only for specific Airflow operators:

  • See the authoritative list in the Airflow OpenLineage provider docs: Supported classes.

  • Alation has validated lineage resolution with the following commonly used operators:

    • SnowflakeOperator

    • PostgresOperator

    • Redshift operators:

      • PostgresOperator

      • SQLExecuteQueryOperator

    • MySqlOperator

    • CopyFromExternalStageToSnowflakeOperator (S3 to Snowflake)

How the Integration Works

  1. Your Airflow deployment emits OpenLineage events during task execution via the OpenLineage provider.

  2. Events are sent over HTTPS to your Alation ingestion endpoint and include:

    • job run context

    • namespace

    • inputs

    • outputs

  3. Alation processes the events and builds lineage links between the sources and targets it discovers.
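Steps 1 and 2 can be sketched as follows. The first part builds an HTTP transport setting for the Airflow OpenLineage provider, and the second shows the rough shape of an event carrying the job run context, namespace, inputs, and outputs. Everything specific here is an assumption for illustration: the base URL, endpoint path, API key, authentication scheme, and job and dataset names are placeholders, so take the real values from your Alation instance. The event is trimmed; real events carry more fields, such as `producer`, `schemaURL`, and facets.

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical placeholders: replace with the values for your Alation instance.
ALATION_BASE_URL = "https://alation.example.com"
ALATION_ENDPOINT = "openlineage/events"  # placeholder path, not a documented route
API_KEY = "REPLACE_ME"

# Step 1: HTTP transport settings for the OpenLineage provider, serialized
# as JSON for the AIRFLOW__OPENLINEAGE__TRANSPORT environment variable.
# Assumption: the endpoint accepts API-key authentication.
transport = {
    "type": "http",
    "url": ALATION_BASE_URL,
    "endpoint": ALATION_ENDPOINT,
    "auth": {"type": "api_key", "apiKey": API_KEY},
}
print("AIRFLOW__OPENLINEAGE__TRANSPORT=" + json.dumps(transport))

# Step 2: rough shape of an event emitted when a task finishes successfully.
event = {
    "eventType": "COMPLETE",
    "eventTime": datetime.now(timezone.utc).isoformat(),
    "run": {"runId": str(uuid.uuid4())},              # job run context
    "job": {"namespace": "airflow-prod",              # namespace
            "name": "orders_dag.load_orders"},
    "inputs": [{"namespace": "postgres://db.example.com:5432",
                "name": "shop.public.orders"}],       # source datasets
    "outputs": [{"namespace": "snowflake://example-account",
                 "name": "ANALYTICS.PUBLIC.ORDERS"}],  # target datasets
}
print(json.dumps(event, indent=2))
```

In a real deployment the provider builds and sends these events itself during task execution; you only supply the transport configuration.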