AWS launched Amazon Database Migration Accelerator (Amazon DMA) to accelerate your journey to AWS Databases and Analytics services and achieve cloud adoption benefits such as cost savings and performance improvements. In this post, we share Amazon DMA’s approach to migrate generated columns to PostgreSQL implementations, such as Amazon RDS for PostgreSQL or Amazon Aurora PostgreSQL-Compatible Edition using AWS Database Migration Service (AWS DMS).
What is a generated column?
A generated column is a column in a relational database table where its value is calculated on the fly by using either the values from other columns in the table or from a deterministic expression. Generated columns simplify queries, reduce errors, and offer a centralized, consistent mechanism to compute column values and keep them in sync. There are two types of generated columns:
Stored generated column – The value in a stored generated column is computed when the data is written (inserted or updated) and it occupies storage as if it were a normal column. A stored generated column is like a materialized view, except that it’s always updated automatically.
Virtual generated column – The value in a virtual generated column is computed when it’s read and it doesn’t occupy any storage. A virtual generated column is like a view.
Create generated columns
Most relational databases support generated columns. Microsoft SQL Server and MySQL support both stored and virtual generated columns, Oracle supports only virtual generated columns, and PostgreSQL (version 12 or higher) supports only stored generated columns. The following code snippets show how to create generated columns in these relational databases:
You can achieve the generated column functionality in PostgreSQL versions lower than 12 by using triggers or views with implicit or explicit functions to compute the generated columns. With triggers, you can define the generated column as a real column and use the trigger to update the column. See the following code:
Alternatively, you can create a view on the table, which provides generated columns in place and populates them with implicit or explicit function values:
Update generated columns
The values in the generated column are auto-generated and they can’t be manually inserted or updated. For example, if you try to insert or update a value in the generated column, you get an error:
Instead, you can insert or update the values of all other columns, and the value of the generated column will get updated automatically, as shown in the following code snippet:
In addition, a generated column can be added or dropped from a table. For example, in PostgreSQL (version 12 or higher) you can run the following commands:
Migrate generated columns to PostgreSQL using AWS DMS
You can use AWS DMS to migrate relational databases such as Oracle or Microsoft SQL Server to PostgreSQL implementations, such as Amazon RDS for PostgreSQL or Amazon Aurora PostgreSQL-Compatible Edition. AWS DMS supports both full load (the entire table) and CDC (change data capture, which are incremental changes to the target table) migrations, as shown in the following diagram.
However, if the tables in the target database contain generated columns, AWS DMS will give the following error while migrating a table with the generated column:
You can mitigate this challenge by using one of the following options.
Full load or Change Data Capture (CDC)
For PostgreSQL versions lower than 12, you can use full load or CDC with the following actions:
Either use Views or create Triggers and add the generated column as a real column to the target table.
Use AWS DMS to migrate the data from the source table to the target table
For PostgreSQL versions 12 or higher, you can use full load for the following actions:
Modify the target table – Complete the following steps:
Add all columns from the source table to the target table, except the generated columns.
Migrate the data from the source to the target table using AWS DMS.
Add the generated columns in the target table.
Disable the copy command with AWS DMS – Prior to starting the AWS DMS migrations, disable the copy command option by setting LoadUsingCSV=false. In this case, AWS DMS uses the SQL command to insert the source records into the target table. This process is generally slower than the other options because the rows are inserted with one row at a time.
Use the copy command and skip AWS DMS – Instead of using AWS DMS to migrate the source table containing generated columns, you can export the source table to a CSV file and use PostgreSQL’s copy command directly to load the data to the target table:
Change data capture
For PostgreSQL version 12 or higher, you can configure AWS DMS in CDC mode to ignore the generated columns on the target. Then you can migrate the data from the source to the target table.
In this post, we shared Amazon DMA’s approach to migrate generated columns to PostgreSQL implementations, such as Amazon RDS for PostgreSQL or Amazon Aurora PostgreSQL-Compatible Edition using AWS DMS.
Amazon DMA offers complementary advisory services to create migration strategy and implementation plans, and enables your in-house migration team (or Amazon Professional Services or APN Partner, if involved) to conduct migration implementation. If you are planning to migrate your workloads to AWS Databases and Analytics Services, email [email protected] to engage the Amazon DMA team.
About the Authors
Ezat Karimi is a Sr. Solution Architect at Amazon Database Migration Accelerator.
Sharath Gopalappa is a Sr. Product Manager Technical at Amazon Database Migration Accelerator.
Read MoreAWS Database Blog