Update Alation from Version 2023.3.x to Version 2024.3

Customer Managed Applies to customer-managed instances of Alation

Use the steps on this page to update Alation from version 2023.3.x.

Note

For general information about upgrading to 2024.3, see Update Alation to 2024.3.

In version 2024.3, Alation’s internal database and the Alation Analytics database are on version 16.2. The database upgrade happens automatically during the update but requires additional actions before and after.

Step 1: Scan Postgres

We recommend using the scan_postgres action to validate that the internal Postgres database is in a healthy state before the update. For steps, see How to Scan Postgres for Corrupted Indexes.

Note

If in your instance the Postgres scan runs on a schedule, you can check the scan-postgres.log file in /opt/alation/site/logs inside the Alation shell to check the Postgres state.

Step 2: Verify Backup Availability

Ensure you have a valid, up-to-date Alation backup. We recommend taking the most recent backup possible to closely match your current data.

Step 3: Prepare for the Postgres Upgrade

As part of the update to 2024.3, Alation’s internal PostgreSQL database, or Postgres, will be upgraded to version 16.2. To facilitate a successful upgrade:

Note

On a High Availability (HA) pair that is updated through splitting and rebuilding the HA cluster, these additional steps must be performed on both primary and secondary instances.

Check Disk Space

As an optional precaution, we recommend running a few checks on disk space and physical memory availability to ensure that your instance configuration can support the PostgreSQL upgrade. The likelihood of an issue is very low—primarily if the physical memory allocation on devtmpfs is set too low (for example, in megabytes). Such cases are rare and may result from previous instance configuration choices. While the PostgreSQL upgrade may still complete successfully, insufficient memory allocation could lead to performance issues later, for example when extracting large volumes of metadata from data sources.

We recommend:

  • Verifying the available space in the /data1 and /data2 mounts to ensure it’s at least 20% of the total allocated space.

  • Checking that the devtmpfs file system is set to 50% of the physical memory. The setting of 50% is usually default unless it has been modified.

To perform the disk space and memory checks:

  1. Use SSH to connect to the Alation server.

  2. Run the following command:

    df -h
    
  3. Review the output:

    • Look for the Mounted on values for /data1 and /data2.

    • In the Use% column, ensure the value is 80% or less, indicating that at least 20% of space is available.

    Example output:

    Filesystem      Size  Used Avail Use% Mounted on
    /dev/xvdd1      147G  482M  139G   1% /data1
    /dev/xvdb1       98G  4.6G   89G   5% /data2
    devtmpfs         15G  156K   15G   1% /dev
    

    Note

    To calculate available disk space, subtract the Use% value from 100%: 100% - Use% = available space. For example, if we had to calculate the available space for the example output below, /data1 has 99% available, and /data2 has 95% available. Each has more than 20% available.

    If Use% is greater than 80%, add more disk space before performing any system updates.

    Note

    If you use Alation Analytics, please note that it has similar space requirement for this update. See more in Check Disk Space for Alation Analytics. If you need to add space, consider increasing disk space for both simultaneously to Alation to avoid repetitive adjustments.

  4. Next, check memory allocation for the RAM-based devtmpfs file system and ensure it uses 50% of the total system’s RAM. Start with displaying the current memory allocation to /dev. In the output, check the value for Avail.

    df -h /dev
    

    Example output:

    Filesystem      Size   Used   Avail   Use%   Mounted on
    devtmpfs        15G    156K   15G    1%      /dev
    

    In this example, the available physical memory is 15G.

  5. Check the total available RAM:

    free -h
    

    Example output:

             total    used     free    shared    buff/cache   available
    Mem:     64G      2.5G     62G     500M      1.5G         63G
    
  6. Calculate and adjust memory allocation:

    • Check the value for total.

    • Calculate how much of the total is allocated to /dev. In the example above, 15G available for devtmpfs is approximately 23% of the total of 64G.

    • If /dev is using less than 50% of total RAM, you need to adjust this to 50%. In the example above, you would set /dev to use 32G (50% of 64G):

    sudo mount -o remount,size=32G devtmpfs /dev
    
  7. If necessary, update the /etc/fstab file to ensure this configuration persists across reboots. This isn’t required for all systems, but some, like Ubuntu, appear to require it.

    • For the example we looked at in these steps, you would set it to 32G.

    devtmpfs      /dev    tmpfs   defaults,size=32G 0 0
    

Update Extensions and Drop Custom Aggregate Functions

To ensure compatibility with the newer Postgres version, you’ll need to update database extensions and identify and drop any custom aggregate functions of types anyarray and anyelement in your Postgres databases. Such functions are not typically part of the standard Alation product but may exist due to previous customizations of Postgres. After the upgrade, any custom functions that were dropped must be manually restored to maintain their functionality.

If any of the user-defined functions exist during the update, it is expected to fail with the following exception in the update logs installer.log (/opt/alation/site/logs inside the chroot):

Your installation contains user-defined objects that refer to internal polymorphic functions with arguments of type “anyarray” or “anyelement”. These user-defined objects must be dropped before upgrading and restored afterwards, changing them to refer to the new corresponding functions with arguments of type “anycompatiblearray” and “anycompatible”.

Note

If you have custom aggregate functions in your Postgres instances, you may need to review and update the SQL code of the function definition when recreating them in the newer PostgreSQL version. You will be able to retrieve the current function definitions using this guide. Carefully review the instructions and plan your steps accordingly. You can reach out to Alation Support if you need assistance recreating the functions.

Alation provides a script that helps prepare for the Postgres upgrade. The script covers both types of Postgres deployments: built-in or externalized to Amazon RDS. The script should be used to:

  • Update Postgres extensions

  • Check for the presence of custom aggregate functions in the internal Postgres databases (Rosemeta, Lineage, Template1).

If custom aggregate functions are found, the same script will be used to drop them.

To prepare for the Postgres upgrade:

  1. Download the Zip file with the script to your local machine from Alation Community: 2024.x Update Helper Script (requires a login)

  2. Extract the script file pgupgrade_helper.py.

  3. Copy the file to the Alation server. Initially, you can place it into the /tmp directory on the host.

  4. Use SSH to connect to the Alation server.

  5. Transfer the extracted pgupgrade_helper.py file to a directory accessible from the Alation chroot. For example, if you have it in the /tmp directory on the host, you could copy it to the /tmp directory /opt/alation/alation/data1/tmp/, which is /data1/tmp when accessed from inside the chroot:

    sudo cp /tmp/pgupgrade_helper.py /opt/alation/alation/data1/tmp/
    

    Note

    The same script will also be used when updating Alation Analytics. We recommend leaving the script in this directory until you have upgraded all Alation components.

  6. Enter the Alation shell.

    sudo /etc/init.d/alation shell
    
  7. Change the user to alation.

    sudo su alation
    
  8. Navigate to the directory where you placed the pgupgrade_helper.py file, for example /data1/tmp/ (inside the chroot) if you previously moved it there.

  9. Run the script with the --pre parameter. The script will update Postgres extensions and check for the presence of custom aggregate functions of types anyarray and anyelement. If any functions are dropped, their definitions will be saved on the Alation server. The script will not drop any functions yet, allowing for an analysis of what customizations your instance contains.

    python pgupgrade_helper.py --pre
    
  10. Analyze the output: look for the text Aggregate functions Check [Failed] and a list of functions below it.

    • If you don’t find this line and a list of functions, the script hasn’t identified any functions to be dropped. Exit from the alation user and the Alation shell by using exit twice. Continue with updating Alation.

      Example output when no functions were identified:

      (env) PROD [alation@ip-10-13-31-60 ~]$ python pgupgrade_helper.py --pre
      Pre flag: True
      Post flag: False
      CPU: None
      Drop Function: False
      Connected to database postgres [Success]
      Updating Extensions [Success]
      Connected to database rosemeta [Success]
      Updating Extensions [Success]
      Connected to database template1 [Success]
      Updating Extensions [Success]
      Connected to database lineage [Success]
      Updating Extensions [Success]
      Pre Upgrade tasks completed [Success]
      
    • If you find the line Aggregate functions Check [Failed] and a list of functions below it, those functions need to be dropped. Continue to step 11 of this instruction.

      Example output where a function was identified:

      (env) PROD [alation@ip-10-13-31-60 ~]$ python pgupgrade_helper.py --pre
      Pre flag: True
      Post flag: False
      CPU: None
      Drop Function: False
      Connected to database postgres [Success]
      Updating Extensions [Success]
      Connected to database rosemeta [Success]
      Aggregate functions Check [Failed]
      - array_agg_array
      Updating Extensions [Success]
      Connected to database template1 [Success]
      Updating Extensions [Success]
      Connected to database lineage [Success]
      Updating Extensions [Success]
      Pre Upgrade tasks completed [Success]
      
  11. Run the script again with the --pre and --drop-func parameters. The script will retrieve the definitions of the identified aggregate functions, record them into a file, and drop the functions.

    Note

    The --pre parameter is still required during the second run as it identifies the pre-upgrade stage of the process.

    Example output:

    (env) PROD [alation@ip-10-13-31-60 ~]$ python pgupgrade_helper.py --pre --drop-func
    Pre flag: True
    Post flag: False
    CPU: None
    Drop Function: True
    Connected to database postgres [Success]
    Updating Extensions [Success]
    Connected to database rosemeta [Success]
    Aggregate functions Check [Failed]
    - array_agg_array
    Dropped function array_agg_array [Success]
    Updating Extensions [Success]
    Connected to database template1 [Success]
    Updating Extensions [Success]
    Connected to database lineage [Success]
    Updating Extensions [Success]
    Saved all the dropped functions definition: /data1/tmp/recreate-agg-function.sql
    Pre Upgrade tasks completed [Success]
    
  12. Check the contents of the recreate-agg-function.sql file. (The path to the file will be in the line Saved all the dropped functions definition:. It depends on where you have the script, as the file is generated in your working directory). The file contains the information about the databases and the function definitions of all dropped functions. For example:

    (env) PROD [alation@ip-10-13-31-60 ~]$ cat /data1/tmp/recreate-agg-function.sql
    rosemeta,create aggregate public.array_agg_array(anyarray)
    (sfunc = array_cat, stype = anyarray);
    

    Where:

    • rosemeta is the database from which a function was deleted

    • create aggregate public.array_agg_array(anyarray) (sfunc = array_cat, stype = anyarray); is the corresponding function definition.

  13. Copy the file to your local machine, renaming it to recreate-agg-function_Postgres.sql. It’s crucial to restoring the functions manually after the update.

    Warning

    The file should be saved locally as it will be overwritten when you run the same script during the Alation Analytics update.

  14. Exit from the alation user.

    exit
    
  15. Exit from the Alation shell.

    exit
    
  1. Proceed with the Alation update: Step 4: Update the Alation Application.

Step 4: Update the Alation Application

Step 5: Update Alation Connector Manager

This step applies if you are using Open Connector Framework (OCF) and OCF connectors.

Update Alation Connector Manager using the steps in Update Alation Connector Manager.

Step 6: Update Alation Analytics

This step applies if you are using the Alation Analytics application.

Use the steps in Update Alation Analytics Database to 16.2 or 16.4 and a Compatible Release to update the Alation Analytics.

Step 7: Update Extensions and Restore Custom Aggregate Functions

After updating Alation, it’s important to update the extensions and restore the custom aggregate functions if any were dropped to ensure compatibility with the new version.

Update Extensions

To update the Postgres extensions:

  1. Use SSH to connect to the Alation instance.

  2. We recommend running the next command using a terminal multiplexer, such as Screen. If Screen is available, start a screen session.

    screen -S alation-extensions
    
  3. Enter the Alation shell.

    sudo /etc/init.d/alation shell
    
  4. Change the user to alation.

    sudo su alation
    
  5. Navigate to the directory where you placed the pgupgrade_helper.py file.

  6. Run the script with the --post parameter. The script will update Postgres extensions and run an analysis of the internal Postgres databases on your instance.

    python pgupgrade_helper.py --post
    

    Example output:

    (env) PROD [alation@ip-10-13-31-60 ~]$ python pgupgrade_helper.py --post
    Pre flag: False
    Post flag: True
    CPU: None
    Drop Function: False
    Connected to database template1 [Success]
    Updating Extensions [Success]
    Connected to database postgres [Success]
    Updating Extensions [Success]
    Connected to database rosemeta [Success]
    Updating Extensions [Success]
    Connected to database lineage [Success]
    Updating Extensions [Success]
    vacuumdb: vacuuming database "lineage"
    vacuumdb: vacuuming database "postgres"
    vacuumdb: vacuuming database "rosemeta"
    vacuumdb: vacuuming database "template1"
    Analyze DB [Success]
    Post Upgrade tasks completed [Success]
    
  7. Your next step depends on whether or not any custom aggregate functions were dropped before the update.

Restore Custom Aggregate Functions

To restore the functions:

  1. Review the function definitions from the recreate-agg-function_Postgres.sql file that was generated by the script.

  2. Update the SQL code as necessary to be compatible with the newer PostgreSQL version. You may need to change anyarray to anycompatiblearray and anyelement to anycompatible or update the SQL code in other ways, depending on your specific customization.

  3. On the Alation host, enter the Alation shell if you have exited if previously.

    sudo /etc/init.d/alation shell
    
  4. Enter the Postgres shell.

    alation_psql
    
  5. Run the following command for each function to be recreated, replacing the placeholder func-definition with the SQL code of the specific definition you are restoring.

    CREATE AGGREGATE func-definition;
    

    Example:

    CREATE AGGREGATE public.array_agg_array(anyarray) (sfunc = array_cat, stype = anyarray);
    
  6. Exit alation_psql.

    \q
    

Step 8: Rebuild Search Index

Rebuild your search index if:

  • Your search index hasn’t been updated since version 2022.4. Even if you’ve updated Alation from version 2023.3.x, the index may still be using the old search schemas.

Determine if Search Index Requires Rebuilding

To determine if your search index requires rebuilding:

  1. Use SSH to connect to the Alation server.

  2. Enter the Alation shell using the following command:

    sudo /etc/init.d/alation shell
    
  3. Run the following command to inspect the schema of your search index:

    curl localhost:9200/live/_mappings | grep dbtype.
    
  4. In the output, look for the dbtype field output:

    • If dbtype is text, then a rebuild is required.

      ../../../_images/UpdateTo202335Text.png
    • If dbtype is keyword, then no action is needed.

      ../../../_images/UpdateTo202335Keyword.png
  5. If you’ve determined that an index rebuild is required, proceed to index rebuilding.

Rebuild Search Index

Rebuilding the search index aligns it with the latest search schemas and can be performed without downtime. For detailed steps, see How to Rebuild Search Index Without Downtime.