Options for legacy application modernization with Amazon Aurora and Amazon DynamoDB

By mullaned2002

May 26, 2021


Legacy application modernization can be complex. To reduce complexity and risk, you can take an iterative approach: first replatform the workload to Amazon Aurora, and then use the cloud-native integrations in Aurora to introduce other AWS services around the edges of the workload, often without changes to the application itself. This approach allows teams to experiment and to modernize legacy workloads in small steps.

Modern cloud applications often use several database types working in unison, creating rich experiences for customers. To that end, the AWS database portfolio consists of multiple purpose-built database services that allow you to use the right tool for the right job based on the nature of the data, access patterns, and scalability requirements. For example, a modern cloud-native ecommerce solution can use a relational database for customer transactions and a nonrelational document database for product catalog and marketing promotions.

If you’re migrating a legacy on-premises application to AWS, it can be challenging to identify the right purpose-built approach. Furthermore, introducing purpose-built databases to an application that runs on an old-guard commercial database might require extensive rearchitecture.

In this post, I propose a modernization approach for legacy applications that make extensive use of semistructured data such as XML in a relational database. Starting in the mid-1990s, developers began experimenting with storing XML in relational databases. Although commercial and open-source databases have since introduced native support for nonrelational data types, an impedance mismatch still exists between the relational SQL query language and access methods that may introduce data integrity and scalability challenges for your application. Retrieval of rows based on the value of an XML attribute can involve a resource-consuming full table scan, which may result in performance bottlenecks. A relational database also can’t enforce referential integrity (the accuracy and consistency of relationships between tables) on nonrelational data types, which may lead to orphaned records and data quality challenges.

For such scenarios, I demonstrate a way to introduce Amazon DynamoDB alongside Amazon Aurora PostgreSQL-Compatible Edition, using the native integration of AWS Lambda with Aurora, without any modifications to your application’s code. DynamoDB is a fully managed key-value and document database with single-digit millisecond query performance, which makes it well suited to storing and querying nonrelational data at any scale. This approach paves the way to gradual rearchitecture, whereby new code paths can start to query DynamoDB following the Command-Query Responsibility Segregation (CQRS) pattern. When your applications are ready to cut over reads and writes to DynamoDB, you can remove XML from Aurora tables entirely.

Solution overview

The solution mirrors XML data stored in an Aurora PostgreSQL table to DynamoDB documents in an event-driven and durable way by using the Aurora integration with Lambda. Because of this integration, Lambda functions can be called directly from within an Aurora database instance by using stored procedures or user-defined functions.

The following diagram details the solution architecture and event flows.

The solution deploys the following resources and configurations:

Amazon Virtual Private Cloud (Amazon VPC) with two public and two private subnets across two AWS Availability Zones
An Aurora PostgreSQL cluster in the private subnets, encrypted by an AWS KMS managed customer master key (CMK), and bootstrapped with an orders table with sample XML
A pgAdmin Amazon Elastic Compute Cloud (Amazon EC2) instance deployed in the public subnet to access the Aurora cluster
A DynamoDB table with on-demand capacity mode
A Lambda function to transform XML payloads to DynamoDB documents and translate INSERT, UPDATE, and DELETE operations from Aurora PostgreSQL to DynamoDB
An Amazon Simple Queue Service (Amazon SQS) queue serving as a dead-letter queue for the Lambda function
A secret in AWS Secrets Manager to securely store Aurora admin account credentials
AWS Identity and Access Management (IAM) roles granting required permissions to the Aurora cluster, Lambda function and pgAdmin EC2 instance

The solution registers the Lambda function with the Aurora cluster to enable event-driven offloading of data from the postgres.orders table to DynamoDB, as numbered in the preceding diagram:

1. When an INSERT, UPDATE, or DELETE statement is run on the Aurora orders table, the PostgreSQL trigger function invokes the Lambda function asynchronously for each row, after it’s committed. Every function invocation receives the operation code (TG_OP) and, as applicable, the new row (NEW) and the old row (OLD) as payload.
2. The Lambda function parses the payload, converts XML to JSON, and performs the DynamoDB PutItem action for an INSERT or UPDATE and the DeleteItem action for a DELETE.
3. If an INSERT, UPDATE, or DELETE event fails all processing attempts or expires without being processed, it’s stored in the SQS dead-letter queue for further processing.
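
The translation that the Lambda function performs can be sketched in Python roughly as follows. This is not the function deployed by the stack. The event shape (the op, new, and old keys), the id key attribute, the use of order_data as the XML column, and the helper names are all assumptions made for illustration. The sketch only builds a description of the DynamoDB request so that it runs without AWS credentials; a real handler would pass the result to PutItem or DeleteItem.

```python
import xml.etree.ElementTree as ET


def xml_to_dict(element):
    """Recursively convert an XML element into nested dicts and strings.

    XML attributes are ignored in this sketch; only child elements and text are kept.
    """
    children = list(element)
    if not children:
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = xml_to_dict(child)
        if child.tag in result:
            # Repeated sibling tags are collected into a list so none is lost.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def build_request(event):
    """Map a trigger payload to an (action, payload) pair.

    Assumed (hypothetical) event shape:
    {"op": "INSERT" | "UPDATE" | "DELETE", "new": {...}, "old": {...}}
    """
    op = event["op"]
    if op in ("INSERT", "UPDATE"):
        row = dict(event["new"])
        root = ET.fromstring(row["order_data"])
        # Keep the parsed XML under its root tag, next to the relational columns.
        row["order_data"] = {root.tag: xml_to_dict(root)}
        return ("PutItem", row)
    if op == "DELETE":
        # Only the key of the old row is needed to delete the item.
        return ("DeleteItem", {"id": event["old"]["id"]})
    raise ValueError(f"unsupported operation: {op}")
```

Because INSERT and UPDATE both map to PutItem, which replaces the whole item, replaying the same event produces the same item.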

The source postgres.orders table stores generated order data combining XML with relational attributes (see the following example of a table row with id = 1).

You can choose which columns or XML attributes get offloaded to DynamoDB by modifying the Lambda function code. In this solution, the whole table row, including XML, gets offloaded to simplify querying and enforce data integrity (see the following example of a corresponding DynamoDB item with id = 1).

Prerequisites

Before deploying this solution, make sure that you have access to an AWS account with permissions to deploy the AWS services used in this post through AWS CloudFormation.

Costs are associated with using these resources. See AWS Pricing for details. To minimize costs, I demonstrate how to clean up the AWS resources at the end of this post.

Deploy the solution

To deploy the solution with CloudFormation, complete the following steps:

Choose Launch Stack.

By default, the solution deploys to the us-east-2 Region, but you can choose a different one. Make sure you deploy to a Region where Aurora PostgreSQL is available.
For AuroraAdminPassword, enter an admin account password for your Aurora cluster, keeping the defaults for other parameters.
Acknowledge that CloudFormation might create AWS Identity and Access Management (IAM) resources.
Choose Create stack.

The deployment takes around 20 minutes. When the deployment has completed, note the provisioned stack’s outputs on the Outputs tab.

The outputs are as follows:

LambdaConsoleLink and DynamoDBTableConsoleLink contain AWS Management Console links to the provisioned Lambda function and DynamoDB table, respectively. You can follow these links to explore the function’s code and review the DynamoDB table items.
EC2InstanceConnectURI contains a deep link to connect to the pgAdmin EC2 instance using SSH via EC2 Instance Connect. The EC2 instance has PostgreSQL tooling installed; you can log in and use psql to run queries from the command line.
AuroraPrivateEndpointAddress and AuroraPrivateEndpointPort contain the writer endpoint address and port for the Aurora cluster. This is a private endpoint only accessible from the pgAdmin EC2 instance.
pgAdminURL is the internet-facing link to access the pgAdmin instance.

Test the solution

To test the solution, complete the following steps:

Open the DynamoDB table by using the DynamoDBTableConsoleLink link from the stack outputs.
Some data is already in the DynamoDB table because we ran INSERT operations on the Aurora database instance as part of bootstrapping.

Open a new browser tab and navigate to the pgAdminURL link to access the pgAdmin instance.
The Aurora database instance should already be registered.
To connect to the Aurora database instance, expand the Servers tree and enter the AuroraAdminPassword you used to create the stack.

Choose the postgres database, and then on the Tools menu, choose Query Tool to start a SQL session.

Run the following INSERT, UPDATE, and DELETE statements one by one, and return to the DynamoDB browser tab to observe how changes in the Aurora postgres.orders table are reflected in the DynamoDB table.

-- UPDATE example
UPDATE orders SET order_status = 'pending' WHERE id < 5;

-- DELETE example
DELETE FROM orders WHERE id > 10;

-- INSERT example
INSERT INTO orders (order_status, order_data)
VALUES ('malformed_order',
'<preorder>
<audiobook>
<kindle_id>
<error>error retrieving kindle id</error>
</kindle_id>
</audiobook>
</preorder>');

The resulting set of items in the DynamoDB table reflects the changes in the postgres.orders table.

You can further explore the two triggers (sync_insert_update_delete_to_dynamodb and sync_truncate_to_dynamodb) and the trigger function sync_to_dynamodb() that makes calls to the Lambda function.

In the pgAdmin browser tab, on the Tools menu, choose Search Objects.
Search for sync.
Choose (double-click) a search result to reveal it in the pgAdmin object hierarchy.

To review the underlying statements, choose an object (right-click) and choose CREATE Script.

Security of the solution

The solution incorporates the following AWS security best practices:

Encryption at rest – The Aurora cluster is encrypted by using an AWS KMS managed customer master key (CMK).
Security – AWS Secrets Manager is used to store and manage Aurora admin account credentials.
Identity and access management – The least privilege principle is followed when creating IAM policies.
Network isolation – For additional network access control, the Aurora cluster is deployed to two private subnets with a security group permitting traffic only from the pgAdmin EC2 instance. To further harden this solution, you can introduce VPC endpoints to ensure private connectivity between the Lambda function, Amazon SQS, and DynamoDB.

Reliability of the solution

Aurora is designed to be reliable, durable, and fault tolerant. The Aurora cluster in this solution is deployed across two Availability Zones, with the primary instance in Availability Zone 1 and a replica in Availability Zone 2. In case of a failure event, the replica is promoted to the primary, the cluster DNS endpoint continues to serve connection requests, and the calls to the Lambda function continue in Availability Zone 2 (refer to the solution architecture earlier in this post).

Aurora asynchronous calls to Lambda are retried on errors. When the function returns an error after running, Lambda by default retries two more times by using exponential backoff; with the maximum retry attempts parameter, you can set the number of retries to 0, 1, or 2. If the event can’t be handed to the function at all (for example, due to a lack of available concurrency), Lambda by default keeps retrying for up to 6 hours; with the maximum event age parameter, you can set this duration between 60 seconds and 6 hours. When the maximum retry attempts or the maximum event age is reached, Lambda stops processing the event and sends it to the SQS dead-letter queue for reprocessing.
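
Both parameters can be set through the Lambda PutFunctionEventInvokeConfig API. The following is a minimal sketch, assuming boto3 is used to call it; the helper names and the function name in the usage note are hypothetical, and the range checks simply mirror the limits described above.

```python
def build_invoke_config(function_name, max_retries=2, max_event_age_seconds=21600):
    """Validate and build the arguments for put_function_event_invoke_config."""
    if not 0 <= max_retries <= 2:
        raise ValueError("maximum retry attempts must be between 0 and 2")
    if not 60 <= max_event_age_seconds <= 21600:
        raise ValueError("maximum event age must be between 60 seconds and 6 hours")
    return {
        "FunctionName": function_name,
        "MaximumRetryAttempts": max_retries,
        "MaximumEventAgeInSeconds": max_event_age_seconds,
    }


def apply_invoke_config(lambda_client, **settings):
    """Send the settings to Lambda.

    lambda_client is expected to be a boto3 Lambda client, for example
    boto3.client("lambda"); it is passed in so the helper can be tested offline.
    """
    return lambda_client.put_function_event_invoke_config(**build_invoke_config(**settings))
```

For example, apply_invoke_config(boto3.client("lambda"), function_name="my-sync-function", max_retries=1, max_event_age_seconds=3600) would limit retries to one and drop events older than an hour, at which point they go to the dead-letter queue.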

It’s important to ensure that the code of the Lambda function is idempotent, because retries can deliver the same event more than once. For example, you can use optimistic locking with a version number in DynamoDB: accept a modification only if the OLD value matches the document stored in DynamoDB, and reject it otherwise.

Reprocessing of the SQS dead-letter queue is beyond the scope of this solution, and its implementation varies between use cases. It’s important to ensure that the reprocessing logic performs timestamp or version checks to prevent a newer item in DynamoDB from being overwritten by an older item from the SQS dead-letter queue.
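
One way to implement such a check is a DynamoDB conditional write. The sketch below assumes that every item carries a numeric version attribute and uses id as the partition key; the version attribute is hypothetical and isn't necessarily part of the sample orders table. The first helper builds arguments for the boto3 Table.put_item call, and the second emulates the same condition locally so that the logic can be unit tested.

```python
def conditional_put_kwargs(item):
    """Build Table.put_item arguments that succeed only for a newer version."""
    return {
        "Item": item,
        # Write if the item doesn't exist yet, or if the stored version is older.
        "ConditionExpression": "attribute_not_exists(#id) OR #v < :incoming",
        # Placeholders avoid clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#id": "id", "#v": "version"},
        "ExpressionAttributeValues": {":incoming": item["version"]},
    }


def would_accept(stored, incoming):
    """Local emulation of the condition above (stored is None if no item exists)."""
    return stored is None or stored["version"] < incoming["version"]
```

With a condition like this, DynamoDB rejects the write with a ConditionalCheckFailedException when the stored item is the same version or newer. A replayed event or a stale message from the dead-letter queue therefore leaves the table unchanged.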

This solution preserves the atomicity of a SQL transaction as a single, all-or-nothing operation. By using INITIALLY DEFERRED PostgreSQL triggers, it defers the Lambda calls until the SQL transaction has been successfully committed.

Performance efficiency of the solution

Aurora integration with Lambda can introduce performance overhead. The amount of overhead depends on the complexity of the PostgreSQL trigger function and the Lambda function itself, and I recommend establishing a performance baseline by benchmarking your workload with Lambda integration disabled.

Upon reenabling the Lambda integration, use Amazon CloudWatch and PostgreSQL Statistics Collector to analyze the following:

Aurora CPU and memory metrics, and resize the Aurora cluster accordingly
Lambda concurrency metrics, requesting a quota increase if you require more than 1,000 concurrent requests
Lambda duration and success rate metrics, allocating more memory if necessary
DynamoDB metrics to ensure no throttling is taking place on the DynamoDB side
PostgreSQL sustained and peak throughput in rows or transactions per second
If your Aurora workload is bursty, consider Lambda provisioned concurrency to avoid throttling

To illustrate the performance impact of enabling Lambda integration, I provisioned two identical environments in us-east-2 with the following parameters:

AuroraDBInstanceClass – db.r5.xlarge
pgAdminEC2InstanceType – m5.xlarge
AuroraEngineVersion – 12.4

Both environments ran a simulation of a write-heavy workload with 100 INSERT, 20 SELECT, 200 UPDATE, and 20 DELETE threads running queries in a tight loop on the Aurora postgres.orders table. One of the environments had Lambda integration disabled.

After 24 hours of stress testing, I collected the metrics using CloudWatch metrics, PostgreSQL Statistics Collector, and Amazon RDS Performance Insights.

From an Aurora throughput perspective, enabling Lambda integration on the postgres.orders table reduces the peak read and write throughput to 69% of the baseline measurement (see rows 1 and 2 in the following table).

| # | Throughput measurement | INSERT/sec | UPDATE/sec | DELETE/sec | SELECT/sec | % of baseline throughput |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | db.r5.xlarge without Lambda integration | 772 | 1,472 | 159 | 10,084 | 100% (baseline) |
| 2 | db.r5.xlarge with Lambda integration | 576 | 887 | 99 | 7,032 | 69% |
| 3 | db.r5.2xlarge with Lambda integration | 729 | 1,443 | 152 | 10,513 | 103% |
| 4 | db.r6g.xlarge with Lambda integration | 641 | 1,148 | 128 | 8,203 | 81% |

To fully compensate for the reduction in throughput, one option is to double the vCPU count and memory size and change to the higher db.r5.2xlarge Aurora instance class at an increase in on-demand cost (row 3 in the preceding table).

Alternatively, you can choose to retain the vCPU count and memory size, and move to the AWS Graviton2 processor-based db.r6g.xlarge Aurora instance class. Because of Graviton’s better price/performance for Aurora, the peak read and write throughput is at 81% of the baseline measurement (row 4 in the preceding table), at a 10% reduction in on-demand cost in us-east-2.
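
The percentages in the last column of the preceding table are consistent with dividing the sum of the four per-second rates in each row by the same sum for the baseline row. That reading is an assumption, because the post doesn't state the formula, but it reproduces the published figures. A short Python check:

```python
# Peak throughput per configuration: (INSERT, UPDATE, DELETE, SELECT) per second,
# copied from the table above.
measurements = {
    "db.r5.xlarge without Lambda integration": (772, 1472, 159, 10084),
    "db.r5.xlarge with Lambda integration": (576, 887, 99, 7032),
    "db.r5.2xlarge with Lambda integration": (729, 1443, 152, 10513),
    "db.r6g.xlarge with Lambda integration": (641, 1148, 128, 8203),
}


def percent_of_baseline(rows, baseline_key):
    """Return each row's total operations per second as a rounded % of the baseline."""
    baseline_total = sum(rows[baseline_key])
    return {name: round(100 * sum(rates) / baseline_total) for name, rates in rows.items()}


result = percent_of_baseline(measurements, "db.r5.xlarge without Lambda integration")
for name, pct in result.items():
    print(f"{name}: {pct}%")
# Prints 100%, 69%, 103%, and 81% for the four rows, in table order.
```

Because SELECT traffic dominates the totals, the combined percentage mostly tracks read throughput. Looking at writes alone, the UPDATE rate with Lambda integration on db.r5.xlarge is about 60% of the baseline (887 versus 1,472).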

As shown in the following graph, the DynamoDB table consumed between 2,630 and 2,855 write capacity units, and Lambda concurrency fluctuated between 259 and 292. No throttling was detected.

You can reproduce these results by running a load generator script located in /tmp/perf.py on the pgAdmin EC2 instance.

# Lambda integration on
/tmp/perf.py 100 20 200 20 true

# Lambda integration off
/tmp/perf.py 100 20 200 20 false

Additional considerations

This solution doesn’t cover the initial population of DynamoDB with XML data from Aurora. To achieve this, you can use AWS Database Migration Service (AWS DMS) or CREATE TABLE AS.

Be aware of certain service limits before using this solution. The Lambda payload limit is 256 KB for asynchronous invocation, and the DynamoDB maximum item size limit is 400 KB. If your Aurora table stores more than 256 KB of XML data per row, an alternative approach is to use Amazon DocumentDB (with MongoDB compatibility), which can store up to 16 MB per document, or offload XML to Amazon Simple Storage Service (Amazon S3).
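
To find rows that would exceed these limits before enabling the integration, you could run a check along the following lines. This is a rough sketch with hypothetical names. It uses the size of the JSON-serialized row as a proxy; the actual Lambda payload also includes the operation code and, for an UPDATE, both the NEW and OLD rows, and DynamoDB computes item size from attribute names and values, so treat the result as an approximation.

```python
import json

LAMBDA_ASYNC_PAYLOAD_LIMIT = 256 * 1024  # bytes, asynchronous invocation
DYNAMODB_ITEM_LIMIT = 400 * 1024         # bytes, maximum item size


def check_row_size(row):
    """Estimate whether a table row fits the Lambda and DynamoDB size limits."""
    size = len(json.dumps(row).encode("utf-8"))
    return {
        "bytes": size,
        "fits_lambda_async": size <= LAMBDA_ASYNC_PAYLOAD_LIMIT,
        "fits_dynamodb_item": size <= DYNAMODB_ITEM_LIMIT,
    }
```

Rows that fail the first check are candidates for the alternatives mentioned above, such as Amazon DocumentDB or offloading the XML to Amazon S3.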

Clean up

To avoid incurring future charges, delete the CloudFormation stack. In the CloudFormation console, change the Region if necessary, choose the stack, and then choose Delete. It can take up to 20 minutes for the cleanup to complete.

Summary

In this post, I proposed a modernization approach for legacy applications that make extensive use of XML in a relational database. Heavy use of nonrelational objects in a relational database can lead to scalability issues, orphaned records, and data quality challenges. By introducing DynamoDB alongside Aurora via native Lambda integration, you can gradually rearchitect legacy applications to query DynamoDB following the Command-Query Responsibility Segregation pattern. When your applications are ready to cut over reads and writes to DynamoDB, you can remove XML from Aurora tables entirely. You can extend this approach to offload JSON, YAML, and other nonrelational object types.

As next steps, I recommend reviewing the Lambda function code and exploring the multitude of ways Lambda can be invoked from Aurora, such as synchronously; before, after, and instead of a row being committed; per SQL statement; or per row.

About the author

Igor is an AWS enterprise solutions architect, and he works closely with Australia’s largest financial services organizations. Prior to AWS, Igor held solution architecture and engineering roles with tier-1 consultancies and software vendors. Igor is passionate about all things data and modern software engineering. Outside of work, he enjoys writing and performing music, a good audiobook, or a jog, often combining the latter two.
