Customers migrate their mission-critical legacy on-premises databases based on IBM Db2 for z/OS to Amazon Relational Database Service (Amazon RDS) for PostgreSQL or Amazon Aurora PostgreSQL-Compatible Edition for its scalability, performance, agility and availability.
In heterogeneous database migrations, not all features in the source database are supported or available in the target database. One such feature that is available in Db2 for z/OS is ROW CHANGE TIMESTAMP, which returns the last time when a row was inserted or modified.
In this post, we show you how to migrate ROW CHANGE TIMESTAMP from Db2 for z/OS to Amazon RDS for PostgreSQL or Aurora PostgreSQL-Compatible, which helps you record the last time when a row was inserted or modified in a table and retain similar source functionality on the target database. Additionally, we show you how to preserve the ROW CHANGE TIMESTAMP column value while copying the data to a similar structure table within the target database.
Understanding Db2 ROW CHANGE TIMESTAMP
When we define ROW CHANGE TIMESTAMP on a column, Db2 generates a timestamp value for each row as the row is inserted, and for any row in which any column is updated. If multiple records are inserted or updated with a single batch statement, the value for the ROW CHANGE TIMESTAMP column might be different for each row depending on when the record was processed.
As an example use case, we create the following sample employee tables in Db2.
ROW CHANGE TIMESTAMP is defined on the column EMP_INFOCHANGE, as shown in the following SQL statement, which automatically stores the timestamp value of when a particular record was created or modified.
The following query returns all the records that have changed in the last 30 days:
ROW CHANGE TIMESTAMP in PostgreSQL
PostgreSQL doesn’t provide the ROW CHANGE TIMESTAMP feature to determine the timestamp when the row was last updated or inserted. However, you can implement the following solution.
Connect to your PostgreSQL database.
In the following DDL, EMP_INFOCHANGE represents the ROW CHANGE TIMESTAMP column in the target table, which is equivalent to the respective Db2 column in the source:
The default value for the EMP_INFOCHANGE column must be clock_timestamp(), which gives the current date and time and changes for every DML statement run on the table.
Define a trigger function as shown in the following code so that the EMP_INFOCHANGE column is updated with the latest timestamp whenever the trigger operation is an update or insert:
Define a row level before the trigger at the table level to run during an insert or update operation:
This trigger is invoked one time for every row that is inserted (or updated).
Now, when we insert or update any record in the table source_1.emp_info, the EMP_INFOCHANGE column is generated or modified automatically, as shown in the following examples.
The following is code for an example insert:
We get the following output:
The following code shows an example update:
We get the following output:
If a user tries to override the timestamp value for the EMP_INFOCHANGE column, the operation is ignored because the trigger overrides the value with clock_timestamp(). We show you how to handle this case later in this post.
Preserving ROW CHANGE TIMESTAMP values in the target table while copying data from the source
There are two ways to transfer or migrate data from a source database table to a new target database table with ROW CHANGE TIMESTAMP column defined while maintaining the original timestamp value in the target table.
Let’s take our case, where we are migrating data from source Db2 for z/OS table to the target table (source_1.emp_info) on PostgreSQL database. Since we have used TRIGGER to implement ROW CHANGE TIMESTAMP functionality in EMP_INFOCHANGE column, our intent is that these triggers do not get executed until we copy data fully from Db2 to PostgreSQL.
If these triggers are not disabled then it will result in new values in EMP_INFOCHANGE column causing reconciliation breaks between source and target.
To solve this issue, you can implement one of the following 2 solutions on the target server.
Solution 1: Use the ALTER command
In this solution we explicitly disable the triggers on the table to avoid any change in ROW CHANGE TIMESTAMP column and enable it back once data copy activity is complete.
You can enable or disable triggers using the following ALTER commands before copying data:
Run the following command:
Enter the data copy commands from source to target.
Run the following command:
You can use this same solution if you’re copying data from PostgreSQL as source.
Solution 2: Use session_replication_role
This setting controls firing of replication-related triggers and rules for the current session. The effect of this will be that triggers, foreign key checks, and rules will not fire if it is set to replica .
So, you can set session_replication_role to replica for the source_1.tgf_emp_info_rct trigger function while data copy is going on, as shown in the following code.
This setting can also be changed at server level by altering value in parameter-group.
Note: Reset the value of session_replication_role to ‘origin’ (default) once data copy is complete or you can remove this setting from function definition altogether and re-create it.
See the following example:
Solution 2 is better suited during initial data-migration activity where you can disable all triggers and foreign-keys at server level. However, in Solution 1 you can disable trigger for specific tables without affecting other functionality at server level.
Limitations for this solution
The solution provided in this post doesn’t cover the ROW CHANGE TIMESTAMP expression feature in Db2 for z/OS. This allows you to use the expression in a SQL query to return the last time a row was changed or as a filter predicate to restrict rows without the need for defining the ROW CHANGE TIMESTAMP column on the underlying table.
The following table summarizes the feature comparison between Db2 and the solution provided in this post while migrating to PostgreSQL.
If multiple rows are inserted or updated with a single statement, the value for the ROW CHANGE TIMESTAMP column might be different for each row
If the data type is specified, it must be TIMESTAMP WITHOUT TIME ZONE with a precision of 6
Use the ROW CHANGE TIMESTAMP expression to read data directly in the select command
In this post, we discussed the solution to migrate ROW CHANGE TIMESTAMP from Db2 for z/OS to RDS for PostgreSQL or Aurora PostgreSQL target databases. We also discussed the use case in replicating data across a similar table structure in the target database. You can apply this solution while migrating from Db2 LUW as well.
If you have questions or feedback, leave a comment.
About the Authors
Vikas Gupta is a Lead Database Migration Consultant with AWS Professional Services. He loves to automate manual processes and enhance the user experience. He helps customers migrate and modernize workloads in the AWS Cloud, with a special focus on modern application architectures and development best practices.
Read MoreAWS Database Blog