Migrate from self-managed Db2 to Amazon RDS for Db2 using AWS DMS

By mullaned2002

January 18, 2024


We’re excited to announce that AWS Database Migration Service (AWS DMS) now supports Amazon Relational Database Service (Amazon RDS) for Db2 as a target endpoint. This development simplifies the process of migrating self-managed Db2 workloads to Amazon RDS for Db2, a managed service designed to ease the setup, operation, and scaling of Db2 databases in the cloud. AWS DMS supports both full load and change data capture (CDC) migration modes for Amazon RDS for Db2. Full load migration copies all of the data from the source database to Amazon RDS for Db2. CDC copies only the data that has changed since the last migration.

In this post, we outline the steps to migrate using AWS DMS, incorporating best practices for performance tuning and comprehensive logging to ensure a smooth migration.

Solution overview

The solution for migrating databases to Amazon RDS for Db2 using AWS DMS involves several key components and steps. AWS DMS is a powerful service designed to efficiently and securely migrate databases from the source to a target environment. The following are the main elements of this solution:

AWS DMS – AWS DMS simplifies database migration. It supports various database platforms and allows for continuous data replication with high availability and minimal downtime.
Source endpoint – The source endpoint represents the database you intend to migrate from. In this scenario, it’s a self-managed Db2 database.
Target endpoint – The target endpoint represents the destination for the migration, which, in this case, is Amazon RDS for Db2. Configuring both endpoints in AWS DMS is crucial for defining the migration path.
Replication instances – AWS DMS uses replication instances to enable the migration of data. These instances are responsible for connecting the source and target databases, reading the source data, and applying it to the target database. The size and type of the replication instance should align with the scale of the migration task for optimal performance.
Authentication – AWS DMS offers two methods for managing database credentials:

AWS Secrets Manager – The first method uses AWS Secrets Manager, which enhances security by providing a secure vault where database credentials are encrypted and stored. With this integration, AWS DMS automates the retrieval of these credentials during the migration process, thereby bypassing the risks associated with hardcoding sensitive information.
Manual authentication – The alternative method is manual authentication, where you directly enter the authentication details such as user names and passwords into the AWS DMS endpoint configurations. Although this approach is straightforward, it places the onus on you to manage and secure these credentials, making it imperative to follow best practices for data protection within the AWS DMS setup.

The following diagram illustrates the process flow, highlighting how each component in the migration—from the source and target endpoints to the replication instance and authentication method—plays a pivotal role in ensuring a smooth and effective migration using AWS DMS.

The solution setup consists of the following high-level steps:

Create the source endpoint
Create the target endpoint
Create the replication instance
Create the AWS DMS migration task

Prerequisites

For this walkthrough, you should have the following prerequisites:

An active AWS account
A self-managed Db2 source for migration
Familiarity with AWS DMS and Amazon RDS for Db2

Create the source endpoint

Complete the following steps to set up your source endpoint:

On the AWS DMS console, choose Endpoints in the navigation pane.
Choose Create endpoint.

For Endpoint type, select Source endpoint.

Enter a name for your endpoint and an optional Amazon Resource Name (ARN).
For Source engine, choose IBM Db2 LUW.

For Access to endpoint database, select your preferred method of authentication to the Db2 LUW source—Secrets Manager or manual authentication. For this post, we select Provide access information manually.
Enter the connection details, including the server name, port, user name, and password.

Choose Create endpoint to finalize the setup of the source endpoint.

To create an AWS DMS endpoint for a Db2 database using the AWS Command Line Interface (AWS CLI), use the aws dms create-endpoint command. The following is a sample command to create a source endpoint for a Db2 database:

aws dms create-endpoint \
    --endpoint-identifier "db2-source-endpoint" \
    --endpoint-type "source" \
    --engine-name "db2" \
    --username "your_db_username" \
    --password "your_db_password" \
    --server-name "your_db2_server_url" \
    --port 50000 \
    --database-name "your_db_name" \
    --tags Key="Name",Value="YourEndpointName"

This command contains the following parameters:

--endpoint-identifier – A unique identifier for the endpoint
--endpoint-type – The type of the endpoint, which is source in this case
--engine-name – The type of database engine. For IBM Db2 LUW, it’s specified as db2
--username and --password – Your database credentials
--server-name – The hostname or IP address of your Db2 database server
--port – The port number on which your Db2 database is listening (the default is 50000 for Db2)
--database-name – The name of the database in your Db2 server
--tags – Optional key-value pairs for resource tagging

Provide your actual database user name, password, server URL, and database name in the preceding code.
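If you prefer the Secrets Manager option described in the solution overview, the endpoint can be defined without a password in the command or script. The following Python sketch builds the secret value and the request parameters for the AWS DMS CreateEndpoint API. It assumes the secret stores the keys username, password, host, and port, and that the Db2 settings are passed under the IBMDb2Settings key as in the AWS SDK for Python; the secret name and role ARN used in the usage note are hypothetical placeholders.

```python
import json


def db2_secret_string(username, password, host, port=50000):
    """Return the JSON document to store as the secret value.

    Assumption: AWS DMS expects the keys username, password, host, and port.
    """
    return json.dumps(
        {"username": username, "password": password, "host": host, "port": port}
    )


def source_endpoint_params(identifier, database, secret_id, access_role_arn):
    """Build CreateEndpoint keyword arguments that contain no inline credentials."""
    return {
        "EndpointIdentifier": identifier,
        "EndpointType": "source",
        "EngineName": "db2",
        "DatabaseName": database,
        # Assumption: Db2-specific settings are passed under IBMDb2Settings.
        "IBMDb2Settings": {
            "SecretsManagerSecretId": secret_id,
            "SecretsManagerAccessRoleArn": access_role_arn,
        },
    }
```

With boto3 installed and credentials configured, the result could be passed as boto3.client("dms").create_endpoint(**source_endpoint_params("db2-source-endpoint", "your_db_name", "db2-source-secret", "arn:aws:iam::123456789012:role/dms-secret-access")). The IAM role must allow AWS DMS to read the secret.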

Create the target endpoint

Complete the following steps to set up your target endpoint:

On the AWS DMS console, choose Endpoints in the navigation pane.
Choose Create endpoint.

For Endpoint type, select Target endpoint.

Enter a name for your endpoint and an optional ARN.
For Target engine, choose IBM Db2 LUW.

For Access to endpoint database, select your preferred method of authentication to the Db2 LUW target—Secrets Manager or manual authentication. For this post, we select Provide access information manually.
Enter the connection details, including the server name, port, user name, and password.

Choose Create endpoint to finalize the setup of the target endpoint.

To create a Db2 target endpoint in AWS DMS using the AWS CLI, use the aws dms create-endpoint command. The following is a sample command to create a target endpoint for a Db2 database:

aws dms create-endpoint \
    --endpoint-identifier "db2-target-endpoint" \
    --endpoint-type "target" \
    --engine-name "db2" \
    --username "target_db_username" \
    --password "target_db_password" \
    --server-name "target_db2_server_url" \
    --port 50000 \
    --database-name "target_db_name" \
    --ssl-mode "none" \
    --tags Key="Name",Value="YourDb2TargetEndpoint"

This command contains the following parameters:

--endpoint-identifier – A unique identifier for the endpoint.
--endpoint-type – The type of the endpoint, which is target in this case.
--engine-name – The type of database engine. For IBM Db2 LUW, it’s specified as db2.
--username and --password – Your database credentials.
--server-name – The hostname or IP address of your target Db2 database server.
--port – The port number on which your target Db2 database is listening (the default is 50000 for Db2).
--database-name – The name of the database in your Db2 server.
--ssl-mode – The SSL mode to use for the connection. Adjust this as needed for your security requirements.
--tags – Optional key-value pairs for resource tagging.

Provide your actual target database user name, password, server URL, and database name in the preceding code.

Create the replication instance

To create your replication instance, complete the following steps:

On the AWS DMS console, choose Replication instances in the navigation pane.
Choose Create replication instance.

Enter a name for your replication instance and an optional ARN and description.

Choose the appropriate instance size, engine version, and Availability Zone for your instance.
The instance size should align with your data volume and workload intensity. For migrating substantial workloads or high transaction volumes, consider C6i (compute optimized) or R6i (memory optimized) instances; the additional memory of R6i instances helps handle a large number of transactions and avoid memory pressure during ongoing replication. For such scenarios, we recommend selecting an instance size of 8xlarge or higher. It’s important to regularly monitor the CPU and memory usage of your instance, adjusting its size up or down as needed. Opt for engine version 3.5.2 or newer for Db2 compatibility. For production environments, choose a Multi-AZ deployment; a Single-AZ setup may suffice for testing purposes.

For Network type, select either IPv4 or Dual-stack mode.
Choose the appropriate VPC and subnet for your setup.

Choose Create replication instance and allow a few minutes for the instance to become active.

To create an AWS DMS replication instance using the AWS CLI, use the create-replication-instance command. The specific command will depend on your desired configuration for the replication instance; the following is an example:

aws dms create-replication-instance \
    --replication-instance-identifier my-replication-instance \
    --replication-instance-class dms.c6i.xlarge \
    --allocated-storage 50 \
    --vpc-security-group-ids sg-xxxxxxxxxxxxxxxxx

This command contains the following parameters:

--replication-instance-identifier – The name for your replication instance (my-replication-instance in this example).
--replication-instance-class – The instance class (dms.c6i.xlarge in this example). This should be chosen based on your workload requirements.
--allocated-storage – The allocated storage in gigabytes (50 in this example).
--vpc-security-group-ids – The ID of your VPC security group.
--publicly-accessible – Optional, and not included in the preceding example. Makes the instance accessible over the internet. This flag should be used based on your network setup and security requirements.
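Because the instance takes a few minutes to become active, a script that goes on to create the migration task needs to wait for it and then look up its ARN. The following Python sketch shows one way to do that. It assumes a boto3 AWS DMS client is passed in as dms and relies on the replication_instance_available waiter and the replication-instance-id filter; the function name is illustrative.

```python
def wait_for_instance_arn(dms, instance_id):
    """Wait until the replication instance is available, then return its ARN.

    dms is expected to be a boto3 DMS client, for example boto3.client("dms").
    """
    filters = [{"Name": "replication-instance-id", "Values": [instance_id]}]

    # Polls DescribeReplicationInstances until the status is "available".
    dms.get_waiter("replication_instance_available").wait(Filters=filters)

    described = dms.describe_replication_instances(Filters=filters)
    return described["ReplicationInstances"][0]["ReplicationInstanceArn"]
```

The returned ARN is the value that the migration task expects for its replication instance.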

Create the AWS DMS migration task

Complete the following steps to configure your AWS DMS migration task:

On the AWS DMS console, choose Database migration tasks in the navigation pane.
Choose Create task.

Enter a name for your task and optional description.
Choose the replication instance, source endpoint, and target endpoint that you created.

For Migration type, choose your preferred migration type, based on the different needs and stages of your migration journey:

Migrate existing data (full load) – Transfers all the data from the source database to the target database. It’s ideal for initial migrations where the target is empty and needs to be fully populated with the source data.
Replicate data changes only (CDC) – After a full load, CDC captures and applies only the changes made in the source database to the target. This option is crucial for keeping the databases in sync if the source database remains in use during migration.
Migrate existing data and replicate ongoing changes (full load and CDC) – This combines both methods, starting with a full load migration followed by CDC. This ensures a complete and up-to-date migration for scenarios where the source database continues to be active throughout the migration process.

Configure your task settings, including editing mode, preparation mode, actions after the full load is complete, and LOB column settings.

For Data Validation, select Validation with data migration.

For Task logs, select Turn on CloudWatch logs.
By default, AWS DMS task logging is minimal. For detailed insights, especially during troubleshooting, increase the logging level. Be mindful that detailed logging can rapidly consume storage because it records every event. AWS DMS stores logs on the replication instance, but they are accessible via Amazon CloudWatch. Logs older than 7 days are automatically deleted.
For Log context, select Turn on log context.
Keep the remaining settings at default.

Create and start the migration task.

The duration will vary based on the size of your database.

To create a migration task using the AWS CLI, use the create-replication-task command in AWS DMS. The specific command will depend on the details of your source, target, and type of migration task you want to perform. The following is an example:

aws dms create-replication-task \
    --replication-task-identifier my-migration-task \
    --source-endpoint-arn arn:aws:dms:region:account-id:endpoint:source-endpoint-id \
    --target-endpoint-arn arn:aws:dms:region:account-id:endpoint:target-endpoint-id \
    --replication-instance-arn arn:aws:dms:region:account-id:rep:replication-instance-id \
    --migration-type "full-load" \
    --table-mappings file://table-mappings.json \
    --replication-task-settings file://task-settings.json

This command contains the following parameters:

--replication-task-identifier – The identifier for your migration task (my-migration-task in this example)
--source-endpoint-arn – The ARN of the source endpoint (where your source database is located)
--target-endpoint-arn – The ARN of the target endpoint (where you want to migrate your data)
--replication-instance-arn – The ARN of your AWS DMS replication instance
--migration-type – This setting can be full-load, cdc, or full-load-and-cdc. For this post, we use full-load, which corresponds to the Migrate existing data option on the console
--table-mappings – The path to a JSON file that specifies how the source database tables should be mapped to the target database
--replication-task-settings – The path to a JSON file containing settings for the task, such as task logging and target table preparation mode
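The command references two JSON files that this post doesn't show. The following Python sketch generates minimal versions of both: a selection rule that includes every table in one schema, and task settings that mirror the console choices made earlier (CloudWatch logging, log context, and data validation). The schema name DB2INST1 in the usage note is a hypothetical placeholder, and the sketch assumes that settings left out of the file keep their default values.

```python
import json


def build_table_mappings(schema_name):
    """Selection rule that includes every table in one source schema."""
    return {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-schema",
                "object-locator": {"schema-name": schema_name, "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }


def build_task_settings():
    """Task settings for logging, log context, validation, and table prep mode."""
    return {
        "FullLoadSettings": {"TargetTablePrepMode": "DROP_AND_CREATE"},
        "Logging": {"EnableLogging": True, "EnableLogContext": True},
        "ValidationSettings": {"EnableValidation": True},
    }


def write_json(path, document):
    """Write a settings document to disk so the AWS CLI can read it via file://."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(document, handle, indent=2)
```

Calling write_json("table-mappings.json", build_table_mappings("DB2INST1")) and write_json("task-settings.json", build_task_settings()) produces the two files in the current directory. DROP_AND_CREATE is one of the available table preparation modes; choose the mode that suits your target.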

During migration: Monitoring and performance enhancers

Monitoring with CloudWatch is an essential component of the AWS DMS migration process. It provides visibility into the performance and health of the migration tasks. Key metrics to monitor include FullLoadThroughputBandwidthTarget and CDCIncomingChanges, which provide insights into the data transfer rate and the efficiency of CDC, as well as the latencies involved.

The following are some critical CloudWatch metrics:

FullLoadThroughputBandwidthTarget – Measures the rate at which data is transferred from the replication instance to the target endpoint, in kilobytes per second
FullLoadThroughputRowsTarget – Indicates the number of rows transferred per second from the replication instance to the target endpoint
CDCIncomingChanges – Shows the number of change events pending application at the target, which can highlight lag in seeing the latest data
CDCLatencySource and CDCLatencyTarget – Represent the time lags in capturing events from the source and applying them to the target, respectively
CDCChangesMemorySource and CDCChangesMemoryTarget – Reflect the number of changes buffered in memory at the replication instance for the source and target, respectively
CDCChangesDiskSource and CDCChangesDiskTarget – Indicate the number of changes written to disk when the memory buffer is full, for both the source and the target, respectively
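These metrics can also be read programmatically, for example to alert when latency grows. The following Python sketch fetches the most recent one-minute average for a task metric. It assumes a boto3 CloudWatch client is passed in as cloudwatch, that the metrics are published in the AWS/DMS namespace, and that task-level metrics carry the ReplicationInstanceIdentifier and ReplicationTaskIdentifier dimensions; the function name and the 15-minute window are illustrative.

```python
from datetime import datetime, timedelta, timezone


def latest_task_metric(cloudwatch, metric_name, instance_id, task_resource_id, minutes=15):
    """Return the newest 1-minute average of a DMS task metric, or None if no data."""
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/DMS",
        MetricName=metric_name,
        Dimensions=[
            {"Name": "ReplicationInstanceIdentifier", "Value": instance_id},
            # Assumption: this is the task's resource ID, not its display name.
            {"Name": "ReplicationTaskIdentifier", "Value": task_resource_id},
        ],
        StartTime=end - timedelta(minutes=minutes),
        EndTime=end,
        Period=60,
        Statistics=["Average"],
    )
    # CloudWatch does not guarantee datapoint order, so sort by timestamp first.
    datapoints = sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
    return datapoints[-1]["Average"] if datapoints else None
```

For example, latest_task_metric(boto3.client("cloudwatch"), "CDCLatencyTarget", "my-replication-instance", task_resource_id) returns the latest target latency, which you can compare against a threshold of your choosing.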

For troubleshooting, CloudWatch logs offer detailed information on table errors, exceptions, task failures, and performance bottlenecks. Adjusting the logging levels can provide more granular data for in-depth analysis.

To enhance performance, consider the following strategies:

Maximize file size – By increasing the maxFileSize, you can control the maximum size of files used for storing data during migration. Larger files can improve handling efficiency and are particularly advantageous for large migrations.
Optimize write buffer size for CSV (full load only) – Adjusting the writeBufferSize parameter can optimize the volume of data written to CSV files during full load operations, improving the speed and efficiency of data transfer.
Enable batch apply – Activating batchApply groups multiple transactions into batches when applying changes to the target database. This method is especially useful for handling significant transaction volumes because it can reduce the number of commit operations required on the target database.
Task and table parallelism – Employ the MaxFullLoadSubTasks parameter to run multiple table migrations in parallel during full load. This parallelism is key to accelerating the migration of databases with many tables.
Table-level parallelism – For larger tables, divide and migrate segments simultaneously. This drastically cuts down the time needed for their migration.
Optimize commit rate and employ CDC threads – Boost the commitRate to quicken the full load data migration. For comprehensive CDC operations, splitting your migration workload into multiple tasks across different tables or schemas can streamline the process, reducing delays and increasing the speed of data replication.
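The strategies above map to three places: endpoint settings, task settings, and table mappings. The following Python sketch shows where each one might go. The numeric values are illustrative rather than recommendations, the upper limits in the checks reflect our understanding of the service limits, and range-based parallel load is assumed to be available for your source; verify each against the AWS DMS documentation for your engine version.

```python
def target_endpoint_tuning(max_file_size_kb=65536, write_buffer_kb=2048):
    """Db2 endpoint settings for the .csv files used during full load (values in KB)."""
    return {"MaxFileSize": max_file_size_kb, "WriteBufferSize": write_buffer_kb}


def tuned_task_settings(max_subtasks=16, commit_rate=30000, batch_apply=True):
    """Task settings for table parallelism, commit rate, and batch apply."""
    # Assumed limits: up to 49 parallel subtasks and a commit rate up to 50000.
    if not 1 <= max_subtasks <= 49:
        raise ValueError("max_subtasks must be between 1 and 49")
    if not 1 <= commit_rate <= 50000:
        raise ValueError("commit_rate must be between 1 and 50000")
    return {
        "FullLoadSettings": {
            "MaxFullLoadSubTasks": max_subtasks,
            "CommitRate": commit_rate,
        },
        "TargetMetadata": {"BatchApplyEnabled": batch_apply},
    }


def parallel_load_rule(schema_name, table_name, column, boundaries, rule_id="2"):
    """Table-settings rule that splits one large table into range segments."""
    return {
        "rule-type": "table-settings",
        "rule-id": rule_id,
        "rule-name": f"parallel-{table_name}",
        "object-locator": {"schema-name": schema_name, "table-name": table_name},
        "parallel-load": {
            "type": "ranges",
            "columns": [column],
            # Each boundary is the upper bound of one segment, given as a string.
            "boundaries": [[str(value)] for value in boundaries],
        },
    }
```

The rule returned by parallel_load_rule belongs in the rules array of the table-mappings file, alongside the selection rules. Two boundaries such as 1000000 and 2000000 produce three segments that load in parallel.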

Conclusion

In this post, we discussed the process of migrating self-managed Db2 databases to Amazon RDS for Db2 using AWS DMS. By adhering to the outlined steps and integrating performance optimization and comprehensive logging strategies, you will be well-equipped to implement a smooth and successful migration.

For further information and nuanced configurations, refer to Creating source and target endpoints. If you’re delving into the finer points of AWS DMS or if any aspect of your database migration journey seems complex, we encourage you to engage with the AWS community. Your questions and experiences, shared in the comments section, are a vital contribution to the collective knowledge and success of AWS users worldwide.

About the authors

Darshan Thimmaiah, a Senior Product Manager with Amazon Web Services’ Database Migration Service, specializes in guiding customers through modernizing and migrating their legacy databases to AWS’s managed services. In his role, he is responsible for shaping the product’s strategy, ideation, roadmap, and pricing.

Brian Molina, a Senior Engineer on Amazon Web Services’ Database Migration Service – Data Plane team, specializes in Db2, Redshift, and Data Validation. His expertise contributes to the advanced technical aspects of database migration and management within AWS’s extensive service offerings.
