Friday, April 19, 2024
No menu items!
HomeDatabase ManagementAutomate schema version control and migration with Flyway and AWS Lambda on...

Automate schema version control and migration with Flyway and AWS Lambda on Amazon Aurora PostgreSQL

Monthly releases are always hectic, and companies face challenges with SQL deployments and schema versioning. It’s very time-consuming when multiple developers are developing code and need to run multiple SQL files across multiple production databases. To maintain environment integrity and consistency across applications, code and schema changes need to sync across multiple non-production and production environments.

In this post, we explain how you can set up Flyway along with an AWS Lambda function to deploy SQL scripts into various Amazon Aurora PostgreSQL-Compatible Edition environments. The solution in this post helps you centralize and deploy all your SQL scripts, which can be Data Definition Language (DDL) or Data Manipulation Language (DML). You can deploy quickly and schedule the function to run monthly, bi-weekly, or hourly. You can do schema versioning and migrate from your sandbox to development, QA, and production, or you can add it to any other environment in your software development lifecycle.

Flyway overview

Flyway is an open-source database SQL deployment tool. In Flyway, all DDL and DML changes to the database are called migrations. Migrations can be versioned or repeatable. You can use a versioned migration to create, alter, or drop an object. Sometimes they’re used to correct reference data. With the help of Flyway, you can also undo or roll back migrations.

Solution overview

In this solution, we install Flyway on an Amazon Elastic Compute Cloud (Amazon EC2)

instance and point the Flyway configuration to two different Aurora PostgreSQL databases. We use a Lambda function to run the migrate command, which deploys SQL scripts to their respective databases. An Amazon EventBridge rule calls the Lambda function based on a schedule. We can also keep track of changes that were completed successfully and are pending to be applied on respective databases.

The following diagram illustrates the solution architecture.

Prerequisites

To get started, you must have the following prerequisites:

Two Aurora PostgreSQL instances to test and deploy scripts in one environment and, after successful completion, deploy that into another environment.
An EC2 instance for installing and configuring the Flyway software and deploying SQL files to keep track of changes of various SQLs and migrate them to different Aurora PostgreSQL databases. This post doesn’t cover the security, high availability, and resiliency side of Amazon EC2 configurations. Also, an EC2 instance should have connectivity to Aurora database instances.
Flyway, an open-source database migration tool that helps to migrate SQL and manage changes from one environment to another environment. We use the Flyway community edition in this post, and it has certain limitations. For installation instructions, refer to Download and installation.
A Lambda function to automate the database code changes from one database environment to another environment.
An Amazon Simple Storage Service (Amazon S3) bucket to store the SQL files.
An EventBridge rule that schedules the Lambda function to run at certain intervals.

Implement the solution

For this post, we use two different Aurora PostgreSQL environments, dev and QA. To deploy changes into a unique environment, Flyway needs two different configuration files pointing to two different databases.

We install Flyway community version 8.5.4 on Amazon EC2 for demonstration. Note the directory structure for Flyway software:

$flyway_home software binary directory

$flyway_home/conf – to store config files to add database endpoints and credentials. Default config file is flyway.sql in this case it’s being used as a DEV environment

$flyway_home/sql – to store SQL Files

The two configuration files point to unique environments. They use different URLs pointing to two different Aurora PostgreSQL databases, as shown in the following screenshots. This solution can be scaled to any Aurora PostgreSQL database writer instances, including Aurora global databases and Aurora Serverless databases.

All SQL scripts are under the $flyway_home/sql directory, which are deployed in Aurora PostgreSQL databases.

Naming patterns are important for Flyway to recognize which files need to be deployed and which are pending. Flyway follows specific naming patterns to keep track of changes and push those changes in the databases.

To push changes to their respective databases, use the command flyway migrate and flyway info to get details about the deployment.

When migrating SQL files, -configFiles points to a specific environment. In our case, we deploy code to the QA environment, so we use -configFiles=”flyway_QA.conf”.

For demonstration purposes, we created a schema beforehand in the QA environment, so our deployment partially failed. The error shows that a schema already exists in the database.

For validation, we can review those changes by querying from the target database or by running migrate info from the command prompt.

This way, all SQL changes move from one environment to another. It shows whether it completed successfully or had any issues. As we mentioned earlier, it also keeps entries in all target databases for future review.

You can find detailed information about when deployment was completed, how much time it took, and the order in which the files ran.

If you have multiple developers developing code, you can use an S3 bucket to keep track of all SQL changes and sync the bucket with the local EC2 instance $flyway_home/sql directory and push those changes to the respective database.

To automate SQL deployment, we use a Lambda function along with the AWS Systems Manager Agent send command to push those changes into their respective environment. You can use the following code in the Lambda function.

Run a simple command to push your SQL changes to their respective databases by adding -configFiles:

response = ssm.send_command(
InstanceIds=[instanceid],
DocumentName=”AWS-RunShellScript”,
Parameters={
“commands”: [“/home/ec2-user/flyway-8.5.4/flyway migrate >> /tmp/dev_changes.text”]
}, # replace command_to_be_executed with command

Alternatively, you can use the default configuration file:

“commands”: [“/home/ec2-user/flyway-8.5.4/flyway -configFiles=”flyway_QA.conf”migrate >> /tmp/QA_changes.text”]

Next, you schedule EventBridge to run your schema changes on a monthly, daily, or hourly basis, based on your organizational requirements. This helps push SQL changes into their respective databases automatically.

With the help of Amazon CloudWatch Logs, we can see the runtime and any success or failure of the migration.

Limitations

The Flyway community version has limitations. With the Teams edition, you can test dry runs, undo, and batching and streaming commands.

The Lambda function has a time limit of 15 minutes. For bigger deployments with lots of database changes, we recommend taking a manual approach.

Clean up

The services involved in this solution incur costs. When you’re done using this solution, clean up the following resources:

EC2 instancesStop or delete EC2 instances that you provisioned for installing Flyway and hosting services.
Aurora PostgreSQL instancesDelete the DB instances that you created.
S3 bucket – Amazon S3 is optional for testing purposes. If you used an S3 bucket to transfer files, you should delete the bucket if no longer needed.
Lambda functionDelete the Lambda function that you used to run the SSM Agent command.
EventBridge rule – To stop the rule from further processing, you can delete or disable it.

Conclusion

In this post, we provided a solution to address version control of database changes and automate the deployment to various environments with the use of a Lambda function and EventBridge rule. Flyway helps to automate schema versioning and track all your SQL changes. With a few configuration changes, we can deploy the same SQL scripts to another database. With this solution, you can easily store and retrieve schema versioning directly from the database with the exact information of deployment time. This solution is effective for your on-premises or cloud infrastructure needs.

We’d love to hear your feedback. Please leave your comments in the comments section.

About the author

Harshad Gohil is a Cloud/Database Consultant with Professional Services team at Amazon Web Services. He helps customers to build scalable, highly available and secure solutions in AWS cloud. His focus area is homogenous and heterogeneous migrations of on-premise infrastructure to AWS cloud.

John Lonappan is a Senior Database Specialist Consultant / Solutions Architect at Amazon Web Services (AWS) with a focus on relational databases. Prior to AWS, John has worked as Database Architect for large Data center providers across the Globe. Outside of work, he is passionate about, Long drives, EV Conversion, playing chess and traveling.

Read MoreAWS Database Blog

RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Most Popular

Recent Comments