AWS Database Migration Service (AWS DMS) is a managed migration and replication service that helps move your database and analytics workloads to AWS quickly, securely, and with minimal downtime and zero data loss. AWS DMS can migrate your data from AWS DMS supported sources to AWS DMS supported targets, including most widely used commercial and open-source databases. The automation of AWS DMS migration tasks offers the following benefits:
Reduced manual effort
Reduced time due to automation
Easier testing and validation process
At a high level, AWS DMS requires the following:
Endpoints that identify the source and target platforms together with the required information to successfully connect to the databases
An AWS DMS task that maps the objects being migrated together with any special handling required at the individual object level
In this post, we show you a solution to automate the creation of AWS DMS endpoints and AWS DMS tasks.
Our use case consists of migrating from a Microsoft SQL Server database to an Amazon Aurora PostgreSQL-Compatible Edition database. The solution consists of scripts contained in a GitHub repository that runs AWS Command Line Interface (AWS CLI) commands based on parameters and passwords stored in AWS Secrets Manager. The architecture of the solution is diagrammatically represented in the following flowchart, which walks you through the process of automating and validating the creation of AWS DMS endpoints and tasks.
The flowchart includes the following steps:
Start the script blog_createdmsall.sh in the GitHub repository.
Determine if database secrets exist.
If no, proceed to Step 3.
If yes, proceed to Step 4.
Prompt for user ID and password, and create secrets.
Read database information from a CSV file.
Create endpoints by retrieving credentials from Secrets Manager.
Generate the AWS DMS data replication task creation script.
Generate the AWS DMS data validation task creation script with the RUNNOW option:
If NO, modify the table mapping or task setting JSON for specific databases.
If YES, proceed to Step 9.
Test the endpoint connection.
Create the AWS DMS data replication task.
Create the AWS DMS data validation task.
The following prerequisites are required to utilize this solution:
Existing source and target databases.
An existing AWS DMS replication instance.
An existing certificate if TLS connection is required for on-premises databases.
The AWS CLI installed and configured on a server.
An AWS DMS instance with connectivity to both the source and target databases.
The database ID and password used by AWS DMS are pre-created in the source and target databases. This solution requires two user account and password pairs: the first pair is used for the source database, and the second pair is used for the target database. For the purpose of automation, all the source databases share the same set of credentials, as do the target databases.
The AWS DMS endpoints settings are identical for the same DBMS engine. However, each engine can have its own settings.
All AWS DMS tasks share the same table mapping and task settings to simplify the logic. However, the solution incorporates a RUNNOW option to save the scripts for manual runs. With this option, you can modify the script to use a different setting JSON file.
The parameters required by the script are set in the DMSPROFILE.ini input file.
All the source and target database information is in DMSDB.ini text file, which contains five fields: EndpointType, Engine-name, DBname, Port IP, and Hostname.
The contents of the DMSPROFILE.ini input file look like the following code:
The following is sample content of DMSDB.ini:
Implement the solution
This section contains the steps required to utilize this solution:
Identify a server with connectivity to both the source and target databases and the AWS CLI installed.
Download the scripts contained in the GitHub repository in a new folder on the server.
The process starts by running the main script blog_createdmsall.sh, which loads all the parameters from the configuration file.
If the DMSSECRET.INI file doesn’t exist, the script blog_createdmsall.sh calls blog_create_secret.sh in Step 3. If the DMSSECRET.INI file exists, the script skips Step 4 and proceeds to Step 5.
The secret creation script, blog_create_secret.sh, prompts the user to enter a user ID and password combination for both the source and target databases. This information is used to create secrets in Secrets Manager. The secrets are utilized when the endpoints are created. The secret isn’t referred to at run time. Two pairs of secrets (one user account and pair per secret) are created: one for the source and the other for the target.
The following screenshot is the sample output of the blog_createdmsall.sh script.
You can verify via the Secrets Manager console that the secret has been created.
Retrieve the user ID and password from Secrets Manager and start a loop to read the database information contained in the DMSDB.ini file.
The database information consists of pairs of lines such that the first one is for the source and the second one is for the target. The following screenshot shows the script retrieving the database credentials from Secrets Manager.
Create source and target endpoints and generate the test connection script.
You can verify via the AWS DMS console that the endpoints have been successfully created.
Generate the AWS DMS replication task creation script.
Generate the AWS DMS data verification task creation script.
In this example, the RUNNOW option is set to NO, therefore the script names are displayed and the main script ends. This provides you an opportunity to customize the AWS DMS task definition properties like the table mapping or task setting JSON file. If RUNNOW is set to YES, Steps 8–10 run in the main script, which provides full automation.
The following screenshot displays the instructions to run the scripts manually.
Test the AWS DMS endpoint connection and display the connection results.
It runs in parallel for all endpoints and takes about 60 seconds to complete.
You can verify via the AWS DMS endpoints console that the connections are successful.
Run the create-DMS-replications-Task.sh script to create the AWS DMS replication task.
Run the create-DMS-validation-full-load-Task.sh script to create both the full load task and the change data capture (CDC) task.
Run the create-DMS-validation-cdc-Task.sh script to create the AWS DMS validation task.
You can verify that the tasks have been successfully created via the AWS DMS console.
You have the option to make data validation part of a replication task by modifying the task setting JSON in Step 11. If you create a validation-only task, you don’t have the option to specify full load plus CDC, unlike a replication task. Our experience shows that a CDC-only validation task performs a full table validation. This strategy provides you with the flexibility to decide how you want to run the tasks.
If you have a large number of tasks, you can load balance them across multiple replication instances to have greater throughput.
You can also use blog_create_operation.sh to generate operation scripts:
This section contains the steps required to clean up the artifacts created by this solution after the project is complete and no resources need to be retained. If there is a possible requirement to rerun the replication or validation, do not perform these steps.
There are two methods to clean up resources: via scripts or via the AWS Management Console.
Clean up via scripts
You can delete all the resources we created by running the cleanup scripts:
Sign in to the server you used to run the scripts.
Navigate to the folder that contains the code.
Run the following scripts in order:
These scripts are created during the implementation of the solution.
Clean up via the console
You can delete all the resources we created via the console with the following steps:
On the AWS DMS console, stop all the AWS DMS tasks.
Delete the AWS DMS tasks and endpoints.
On the Secrets Manager console, delete the secrets you created.
Automating the creation and validation of endpoints reduces the level of effort needed to establish the AWS DMS structures required to migrate databases for large-scale organization. Additionally, automation eliminates any errors introduced by manual migration.
If you have any questions or suggestions about this post, leave a comment. We hope the information we have shared helps!
About the Authors
Feng Cai is a Database Consultant at AWS. He has a wide background in providing database services to clients. Feng is currently a member of the AWS Global Competency Centre, providing homogeneous and heterogenous database migration support to public sector clients.
Wajid Ali Mir is a Database Consultant at AWS. He has extensive database experience with customers ranging from banking, telecom, and cloud consultancy. Wajid is currently a member of the AWS Global Competency Centre, providing homogeneous and heterogenous database migration support to commercial clients.
Dean Capps is a Database Consultant at AWS. He has experience providing database support on legacy, open systems, and cloud platforms. Dean is currently a member of the AWS Global Competency Centre, providing homogeneous and heterogenous database migration support to public sector clients.
Baji Shaik is a Sr. Lead Consultant with AWS ProServe, Global Competency Centre. His background spans a wide depth and breadth of expertise and experience in SQL and NoSQL database technologies. He is a Database Migration Expert and has developed many successful database solutions addressing challenging business requirements for moving databases from on premises to Amazon RDS and Amazon Aurora PostgreSQL/MySQL. He is an eminent author, having written several books on PostgreSQL. A few of his recent works include “PostgreSQL Configuration,” “Beginning PostgreSQL on the Cloud,” and “PostgreSQL Development Essentials.” Furthermore, he has delivered several conference and workshop sessions.
Read MoreAWS Database Blog