Timeliness Checks

Timeliness checks (also known as freshness checks) determine the relative age of data by measuring how recently the data was last updated or added. These checks are essential for ensuring data currency and detecting stale data that may impact business decisions and operational processes.

Timeliness checks validate that data is sufficiently current by measuring the age of the most recently added or updated records. These checks help organizations maintain data currency requirements, detect data pipeline issues, and ensure that business decisions are based on up-to-date information

  • Validation rules that measure how recently data was last updated or ingested.

  • Quality indicators that detect stale or outdated data that may impact business operations.

  • Monitoring tools that ensure data pipeline SLAs are met and data flows are functioning properly.

  • Age calculators that compare the timestamp of the most recent data to the current time.

Consider the following components for a timeliness check:

  • Timestamp Column: The date/time column used to measure data age (created_at, updated_at, etc.)

  • Reference Point: The “now” time used for comparison (typically scan execution time)

  • Threshold: Maximum acceptable age for the data (hours, days, weeks)

  • Time Units: Granularity of measurement (seconds, minutes, hours, days)

  • Timezone Handling: How to handle timezone differences in global systems

  • Business Context: SLA requirements and business criticality of data currency

Follow these best practices for timeliness checks:

  • Apply to time-sensitive data sources like transaction logs, customer interactions, and real-time feeds.

  • Set appropriate thresholds based on business SLAs and data update frequencies.

  • Consider timezone implications when working with global data sources.

  • Monitor timeliness trends to identify data pipeline issues or source system problems.

  • Combine with completeness checks to ensure both data presence and currency.

Available Metrics

Metric

Description

Supported Data Types

Calculation Method

freshness

Measures the age of the most recently added data in a timestamp column by comparing the maximum timestamp value to the current time

TIMESTAMP, DATE

Current time - MAX(timestamp_column)

Time Unit Specifications

Unit

Syntax

Description

Example

Days

#d

Number of days

3d (3 days)

Hours

#h

Number of hours

6h (6 hours)

Minutes

#m

Number of minutes

30m (30 minutes)

Combined

#d#h

Days and hours

1d6h (1 day and 6 hours)

Combined

#h#m

Hours and minutes

2h30m (2 hours and 30 minutes)

Configuration Examples

Check Description

Configuration Steps

Example Check

Ensure data is updated within 24 hours

  1. Select timestamp column last_updated.

  2. Choose Freshness check.

  3. Select < operator.

  4. Set threshold to 1d.

freshness(last_updated) < 1d

Validate hourly data ingestion

  1. Select timestamp column created_at.

  2. Choose Freshness check .

  3. Select < operator.

  4. Set threshold to 2h.

freshness(created_at) < 2h

Check weekly data refresh

  1. Select timestamp column processed_date.

  2. Choose Freshness check.

  3. Select < operator.

  4. Set threshold to 7d.

freshness(processed_date) < 7d

Common Timeliness Check Types

Real-Time Data Validation

-- Transaction data should be near real-time
freshness(transaction_timestamp) < 5m

-- Sensor data should be current within minutes
freshness(sensor_reading_time) < 10m

-- User activity tracking
freshness(last_activity_time) < 30m

Hourly Data Ingestion

-- Hourly sales data updates
freshness(sales_hour_timestamp) < 2h

-- Log file processing validation
freshness(log_processed_time) < 1h30m

-- API data synchronization
freshness(api_sync_timestamp) < 4h

Daily Data Refresh

-- Daily ETL process validation
freshness(etl_completion_time) < 1d

-- Customer data synchronization
freshness(customer_sync_date) < 1d6h

-- Inventory updates
freshness(inventory_update_date) < 2d

Weekly and Monthly Data

-- Weekly reporting data
freshness(week_end_date) < 7d

-- Monthly aggregation validation
freshness(month_close_date) < 30d

-- Quarterly data validation
freshness(quarter_end_date) < 90d

Conditional Timeliness

-- Business hours data validation
freshness(order_timestamp) < 1h:
filter: EXTRACT(hour FROM order_timestamp) BETWEEN 9 AND 17

-- Weekend processing validation
freshness(batch_completion_time) < 2d:
filter: EXTRACT(dow FROM batch_completion_time) IN (0, 6)

-- High-priority customer data
freshness(vip_customer_update) < 30m:
filter: customer_tier = 'VIP'

Multi-System Timeliness

-- CRM data synchronization
freshness(crm_last_sync) < 4h

-- Data warehouse refresh validation
freshness(dwh_load_timestamp) < 1d

-- External API data currency
freshness(external_data_timestamp) < 6h