Use a SQL Server secondary replica in an availability group as a source to migrate data into Amazon Redshift with AWS DMS

By mullaned2002

September 6, 2023

391

AWS Database Migration Service (AWS DMS) is a managed migration and data replication service that helps you migrate data from your databases, on premises or Amazon Elastic Compute Cloud (Amazon EC2), to a supported target platform, for example Amazon Redshift. Amazon Redshift is a cloud data warehousing service that can be used to analyze structured and semi-structured data across multiple data platforms. With AWS DMS, you can perform a full data load with continuous replication to seed and update your target endpoint.

In this post, we show you how to use a secondary replica instance in an Always On availability group as the source of migrating your data from SQL Server to Amazon Redshift. Using a secondary replica reduces the utilization overhead on your busy primary replica.

Solution overview

For this post, we use a 2-node SQL Server 2019 Always On availability group set up on Amazon EC2 as the source and Amazon Redshift as the target. We use AWS DMS to migrate the existing data and incremental changes to keep the target in sync with the source database.

As part of this solution, we use Amazon Route 53 to resolve the availability group listener endpoint and primary and secondary replicas. For more details, refer to Working with self-managed SQL Server Always On availability groups.

AWS DMS version 3.4.7 introduces support to use the secondary replica as a source and VPC endpoint or gateway endpoint to connect to some AWS services. For example with Amazon Redshift, you need to configure VPC endpoint if your AWS DMS replication instance has no publicly accessible IP address. For more details, refer to New features in AWS DMS 3.4.7.

The following diagram shows the architecture of the overall solution and how data is migrated from source to target through the following steps:

AWS DMS connects to the SQL Server secondary replica to discover Always On availability group topology.
AWS DMS connects to the SQL Server primary replica for necessary setup, such as verifying if the tables are part of a publication used by the replication and creating them if needed.
Data is extracted from the SQL Server secondary replica and saved to CSV files on the AWS DMS instance. The CSV files could be stored in the memory or on disk, depending on the Amazon Redshift endpoint configuration.
When the current batch is complete, the CSV files are transferred to an Amazon Simple Storage Service (Amazon S3) bucket.
AWS DMS connects to Amazon Redshift to run the COPY command, with the AWS Identity and Access Management (IAM) role dms-access-for-endpoint.
Data is imported into Amazon Redshift from the S3 endpoint into the Amazon Redshift endpoint within your VPC.

To implement the solution, complete the following high-level steps:

Configure a SQL Server Always On availability group secondary replica as the source.
Configure Amazon Redshift as the target.
Create an AWS DMS instance.
Create the AWS DMS source endpoint.
Create the AWS DMS target endpoint.
Create an AWS DMS full load and change data capture (CDC) task

Prerequisites

Before we begin, we assume that you have the following prerequisites:

An AWS account.
A SQL Server Always On availability group deployed (on premises or Amazon EC2).
Connectivity between the AWS DMS replication instance to SQL Server (source).
The SQL Server authenticated login user dbasa created with sysadmin privilege. Refer to Using a Microsoft SQL Server database as a source for AWS DMS for more information
The ability to resolve the availability group listener and primary and secondary replicas DNS names.
Native SQL Server replication publishing and distribution enabled on your primary replica.
The distribution option enabled on all replicas in your availability group. For details, refer to Setting up ongoing replication on a cloud SQL Server DB instance.

The Amazon VPC, security group and subnet group naming reflect the ones we have in our AWS account for this blog post, use the ones from your own AWS account.

This solution involves the creation and utilization of new AWS resources. Therefore, it will incur costs on your account. Refer to AWS Pricing for more information.

We strongly encourage that you set up this in a non-production environment and run end-to-end validations before implementing this solution in your production environment.

Configure a SQL Server availability group secondary replica as the source

The SQL Server Always On availability group feature provides high availability and read scaling capabilities. Using the secondary replica reduces the overhead on the primary instances so the primary can be used to serve more critical production read/write workloads.

To configure the secondary replica as a source for your AWS DMS replication, follow these steps:

Log in to your SQL Server instance.
Launch SQL Server Management Studio (SSMS) and connect to SQL Server using the availability group listener endpoint. This ensures that you are connected to the primary replica.
Expand the Always On High Availability folder, choose (right-click) the availability group, and choose Properties.

Configure the Readable Secondary property of both replicas to Read-intent only.

Allowing the secondary replicas to serve read-only workloads may requires additional licenses because it’s now not considered a true passive instance. Consult your licensing team for potential impact. For more information, download the licensing document.

Open a new query window and run the following query to create read-only routing on your Always On availability group. Replace dmsag, SQLAG1, SQLAG2, and contoso.dms.com with your own availability group name, instance names, and domain name, respectively. For more details, refer to Configure read-only routing for an Always On availability group.

ALTER AVAILABILITY GROUP [dmsag]
MODIFY REPLICA ON
N’SQLAG1′ WITH
(SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));
ALTER AVAILABILITY GROUP [dmsag]
MODIFY REPLICA ON
N’SQLAG1′ WITH
(SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N’TCP://SQLAG1.contoso.dms.com:1433′));

ALTER AVAILABILITY GROUP [dmsag]
MODIFY REPLICA ON
N’SQLAG2′ WITH
(SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));
ALTER AVAILABILITY GROUP [dmsag]
MODIFY REPLICA ON
N’SQLAG2′ WITH
(SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N’TCP://SQLAG2.contoso.dms.com:1433′));

ALTER AVAILABILITY GROUP [dmsag]
MODIFY REPLICA ON
N’SQLAG1′ WITH
(PRIMARY_ROLE (READ_ONLY_ROUTING_LIST=(‘SQLAG2′,’SQLAG1’)));

ALTER AVAILABILITY GROUP [dmsag]
MODIFY REPLICA ON
N’SQLAG2′ WITH
(PRIMARY_ROLE (READ_ONLY_ROUTING_LIST=(‘SQLAG1′,’SQLAG2’)));
GO

After the primary and secondary replicas are configured to support read-only routing in both roles, the readable secondary replica can receive read-intent connection requests from AWS DMS through the availability group listener endpoint.

For ongoing replication (CDC), AWS DMS reads from the active portion of the transaction log file for changes. If the active transaction log has been truncated due to a backup process, AWS DMS reads for the log backup files. The log backup files must be in native format and in an accessible location. If it’s not able to read from either the active or backup logs, the AWS DMS task will fail.

Configure Amazon Redshift as the target

Amazon Redshift is a fully managed, petabyte-scale cloud data warehouse that enables you to analyze large datasets using standard SQL to analyze structured and semi-structured data across data warehouses, operational databases, and data lakes. It uses AWS-designed hardware and machine learning to deliver the best price performance at any scale.

In this section, we create an Amazon Redshift cluster, Amazon S3 gateway endpoint, and Amazon Redshift endpoint.

Create an Amazon Redshift cluster

To create the Amazon Redshift cluster used as the target for this post, complete the following steps:

On the Amazon Redshift console, choose Clusters in the navigation pane.
Choose Create cluster.
In the Cluster configuration section, specify the following values:

For Cluster identifier, enter dms-blog-redshift.
For Node type, choose dc2.large.
For Nodes, enter 1.

In the Database configuration section, specify the following values:

For Admin user name, enter awsuser.
For Admin user password, enter a value for the password.

In the Associated IAM roles section, specify an IAM role as the default role for the cluster. In this post, you create a new IAM role:

Under Cluster permissions, for Manage IAM roles, choose Create IAM role.
Choose No additional Amazon S3 bucket to allow the created IAM role to access only S3 buckets with names that contain the keyword redshift. For the S3 bucket used by AWS DMS as a staging area before copying data to Amazon Redshift, the IAM role dms-access-for-endpoint will be created automatically when you configure the AWS DMS endpoint for Amazon Redshift.
Choose Create IAM role as default. Amazon Redshift automatically creates and sets the IAM role as the default for your cluster.

In the Additional configurations section, turn off Use defaults for additional configurations.
Under Network and security, modify the following settings:

For Virtual private cloud (VPC), choose dms-blog-vpc.
For VPC security groups, choose dms-blog-priv-sg.
For Cluster subnet group, choose dms-blog-subnetgrp. If you haven’t created the subnet group yet, refer to Managing cluster subnet groups using the console to create one.
For Enhanced VPC routing, select Turn on.
For Publicly accessible, deselect Turn on Publicly accessible.

Under Database configurations, specify the following values:

For Database name, enter dmsdb.
For Database port, use the default port number 5439.

Choose Create cluster to create the Amazon Redshift cluster.
When the cluster is ready, choose the cluster name (dms-blog-redshift) to view the cluster information.
Note down the endpoint, which you use during AWS DMS target endpoint creation later.

The following is the endpoint name for Amazon Redshift in this post: dms-blog-redshift.abcdefghijk.ap-southeast-1.redshift.amazonaws.com:5439/dmsdb.

Create an Amazon S3 gateway endpoint

Using VPC endpoints ensures that all your traffic remains within the VPC to maintain end-to-end network security for your replication task. It’s also a mandate requirement if your AWS DMS instance is not publicly accessible and your AWS DMS version is 3.4.7 and higher.

Complete the following steps to create an Amazon S3 gateway endpoint using Amazon Virtual Private Cloud (Amazon VPC):

On the Amazon VPC console, choose the same Region as your AWS DMS replication instance.
In the navigation pane, choose Endpoints.
Choose Create endpoint.
Specify a name tag (for this post, dms-blog-s3-endpoint).
Under Services, for Service Name, enter com.amazonaws.ap-southeast-1.s3 and for Type, choose Gateway. For this post, we use ap-southeast-1, but you can choose the appropriate service endpoint for your Region.
Under VPC, choose dms-blog-vpc and select all route tables to create a route record for Amazon S3 access.
Accept the default Full access under Policy.
Choose Create endpoint to create the Amazon S3 gateway endpoint.

Create an Amazon Redshift endpoint

To create an Amazon Redshift endpoint to utilize Amazon Redshift enhanced VPC routing, complete the following steps:

On the Amazon VPC console, choose Endpoints in the navigation pane.
Choose Create endpoint.
Specify a name tag, such as dms-blog-redshift-endpoint.
Under Services, specify Service Name as com.amazonaws.ap-southeast-1.redshift with Type as Interface. (Choose the appropriate service endpoint for your Region.)
For VPC, choose dms-blog-vpc.
For Subnets, select all subnets your Amazon Redshift cluster dms-blog-redshift resides in.
For Security groups, choose dms-blog-priv-sg, which opens all traffic within this security group.
Accept the default Full access under Policy.
Choose Create endpoint to create the Amazon Redshift VPC endpoint.

Create an AWS DMS replication instance

An AWS DMS replication instance is an EC2 instance that performs your data migration. You get high availability through a Multi-AZ deployment when you choose the Multi-AZ option during configuration. The replication instance runs the AWS DMS replication task that connects to the source and target endpoints.

To create a replication instance, follow these steps:

On the AWS DMS console, choose Replication instances in the navigation pane.
Choose Create replication instance.
Enter a name and optional description. In this post, we name the replication instance dms-blog-ri.
Choose an instance class. For this post, we use a t3.small instance for demonstration purposes. For more information, refer to Choosing the right AWS DMS replication instance for your migration

For Engine version, choose version 3.4.7 or newer.
For Multi AZ, choose Single-AZ. It is recommended to choose Multi-AZ for a production environment.
Leave the default setting for Allocated storage (50 GB).

For VPC, choose your VPC. In our case, we created dms-blog-vpc for this post.
For Replication subnet group, choose the replication subnet group in your selected VPC where you want the replication instance to be created. For more information about replication subnet groups, see Creating a replication subnet group.
Deselect Public accessible. This instance will only have a private IP address.
Review the settings for Advanced security and network configuration, Maintenance, and Tags.

Choose Create replication instance.

Create the AWS DMS source endpoint

Next, you create the source endpoint and test connectivity using the replication instance created in the previous step.

On the AWS DMS console, choose Endpoints in the navigation pane.
Choose Create endpoint.
For Endpoint type, choose Source endpoint.
For Endpoint identifier, enter a name for your endpoint. In our case, we call it sqlserver-source.

For Source engine, choose Microsoft SQL Server
For Access to endpoint database options, select Provide access information manually.
For Server name, enter the SQL Server Always On availability group listener name. Ours is sqlag.contoso.dms.com.
Enter the connection information of the source SQL database server. We use the following values, but you can replace this information with your own:

For Port, enter 1433.
For User name, enter dbasa.
For Password, enter the password of the SQL login that was created in the source database.
For Secure Socket Layer (SSL) mode, choose none.
For Database name, enter dmsdb.

Select Use endpoint connection attributes and specify the following for Extra connection attributes:

applicationIntent=ReadOnly;multiSubnetFailover=yes;alwaysOnSharedSynchedBackupIsEnabled=false;activateSafeguard=false;setUpMsCdcForTables=false

For more information about these settings, see Endpoint settings when using SQL Server as a source for AWS DMS.

Choose Create endpoint.
After you create the endpoint, select it and on the Actions menu, choose Test Connection to make sure it’s successful.

The AlwaysOnSharedSynchedBackupIsEnabled attribute changes the behavior of AWS DMS when using a SQL Server database that is part of an Always On availability group. When this attribute is set to True, AWS DMS does not perform a probing call to all the replica nodes to track if the transaction log backup is taken on replicas other than the source instance. If you set this value to True, ensure that you configure the log backup on the same replica instance as your source. We recommend setting this to False, which allows AWS DMS to probe and track where the log backup is taken.

Create the AWS DMS target endpoint

Next, you create the target endpoint for Amazon Redshift and test connectivity using the following steps:

On the AWS DMS console, choose Endpoints in the navigation pane.
Choose Create endpoint.
Choose Target endpoint.
For Endpoint identifier, enter dms-blog-redshift.
Choose Amazon Redshift for Target engine.
Specify the following information for access to the endpoint database:

For Server name, enter dms-blog-redshift.abcdefjhijk.ap-southeast-1.redshift.amazonaws.com.
For Port, enter 5439.
For User name, enter awsuser.
For Password, enter your complex password specified during Amazon Redshift cluster creation.
For Secure Socket Layer (SSL) mode, choose none.
For Database name, enter dmsdb.

Choose Create endpoint.
After you create the endpoint, select it and on the Actions menu, choose Test Connection to make sure it’s successful.

AWS DMS uses a default S3 bucket as intermediate storage before copying the migrated data to Amazon Redshift. If you need to specify a custom S3 bucket name and folder name for this intermediate storage, you can modify the following endpoint settings:

BucketName – A string you specify as the name of the S3 bucket storage. The default bucket name uses the format dms-xxxxxx, in which xxxxxx is the part of the lowercase value of the endpoint ARN.
BucketFolder – An optional string you can specify as the name of the storage folder in the specified S3 bucket. The default folder name starts with redshift-.

To encrypt data stored in Amazon S3 before it’s pushed to an S3 bucket, you can use Amazon S3 service-side encryption (EncryptionMode:SSE_S3), which is the default. Alternatively, you can use a custom AWS Key Management Service (AWS KMS) key (EncryptionMode:SSE_KMS). After it’s configured, you can switch EncryptionMode from SSE_KMS to SSE_S3, but you can’t change from SSE_S3 to SSE_KMS.

You only need to specify the ARN of your custom AWS KMS key for the ServerSideEncryptionKmsKeyId field if you choose SSE_KMS for EncryptionMode.

For more information on custom S3 buckets and encryption settings, see Using a data encryption key, and an Amazon S3 bucket as intermediate storage.

Create an AWS DMS full load and CDC task

An AWS DMS task is where the actual data migration job runs on the replication instance, moving data from the source endpoint to the target endpoint. Three types of migration methods are supported:

Migrating data to the target database – This process is the Migrate existing data option on the AWS DMS console and is called Full Load in the API.
Capturing changes during migration – This process is the Migrate existing data and replicate ongoing changes option on the AWS DMS console and is called full-load-and-cdc in the API.
Replicating only data changes on the source database – This process is the Replicate data changes only option on the AWS DMS console and is called cdc in the API.

For this post, we create a single AWS DMS task with the migration method Migrate existing data and replicate ongoing changes (also known as a full load and CDC task) to migrate data from dmsdb in SQL Server to dmsdb in Amazon Redshift. For more information on AWS DMS task creation, see Creating a task.

To improve AWS DMS task performance, we recommend the following key points. For more performance considerations, refer to Multithreaded task settings for Amazon Redshift.

A primary key on both the source and target table is required for AWS DMS to handle changes to Amazon Redshift target tables during CDC with BatchApplyEnabled set as true. Without a primary key, changes are applied statement by statement. That can adversely affect task performance during CDC by causing target latency and impacting the cluster commit queue.
AWS DMS queries that run during ongoing replication to an Amazon Redshift cluster can share the same WLM (workload management) queue with other application queries that are running. Therefore, consider properly configuring WLM properties to allow enough resources for your AWS DMS workload.

Complete the following steps to create your full load and CDC task:

On the AWS DMS console, in the navigation pane, choose Database migration tasks.
Choose Create task.
On the Create database migration task page, in the Task configuration section, specify the task options:

For Task identifier, enter sqlserver-dms-redshift.
For Replication instance, enter dms-blog-ri.
For Source database endpoint, enter sqlserver-source.
For Target database endpoint, enter dms-blog-redshift.
For Migration type, choose Migrate existing data and replicate ongoing changes.

In the Task Settings section, specify the following values:

For Editing mode, choose Wizard.
For Target table preparation mode, choose Disable custom CDC stop mode.
For Stop task after full load completes, choose Drop tables on target.
For Include LOB columns in replication, choose Don’t stop.
For Maximum LOB size (KB), enter 32. For more information, refer to Setting LOB support for source databases in an AWS DMS task
For Task logs, choose Turn on CloudWatch logs. Using the default setting for these AWS DMS component activities is sufficient for most monitoring and troubleshooting scenarios.

In the Table mapping section, specify the table selection rules and transformation rules:

Under Selection rules, use % as a wildcard value to match to all schemas and tables in the source SQL Server dmsdb database.

Under Transformation rules, covert names for schema, table, and columns to lowercase using the action convert-lowercase, because Amazon Redshift names and identifiers are case-insensitive and folded to lowercase by default.

Choose Create task to create the task.

The task will start automatically after creation by default.

In this post, we configured AWS DMS to create schemas in an Amazon Redshift target database by choosing Drop tables on target for the task setting Target table preparation mode. AWS DMS creates only the objects required to efficiently migrate the data: tables, primary keys, and in some cases, unique indexes. AWS DMS doesn’t create secondary indexes, non-primary key constraints, or column data defaults. AWS DMS doesn’t take additional consideration to optimize sorting and distribution keys, and you can take advantage of Amazon Redshift self-tuning capabilities to automatically optimize the design of the tables. You can also customize the sort and distribution key by using ALTER TABLE for these tables created by AWS DMS.

Alternatively, you can use the AWS Schema Conversion Tool (AWS SCT) to convert your existing database schema from SQL Server to Amazon Redshift. As part of the schema conversion, you can choose the strategies and rules you want the tool to use. After converting your schema and reviewing the suggested keys, you can adjust your rules or change your strategy to get the results you want. Refer to Choosing optimization strategies and rules for use with AWS SCT for more information. When you use the AWS SCT, you need to choose Do nothing for the AWS DMS task setting Target table preparation mode. In this mode, AWS DMS assumes that the target tables have been pre-created on the target.

Troubleshooting

In this section, we discuss some common errors, their causes, and their solutions.

Error message “Role ‘dms-access-for-endpoint’ is not configured properly” during creation of Amazon Redshift endpoint

AWS DMS uses an S3 bucket to transfer data to the Amazon Redshift database. For AWS DMS to create the bucket, it uses the IAM role dms-access-for-endpoint. Missing or improper configuration for this role are common causes for this error.

If you use the AWS Command Line Interface (AWS CLI) or the AWS DMS API to create a database migration with Amazon Redshift as the target database, you must create this IAM role first. The dms-access-for-endpoint role must be able to assumed by both redshift.amazonaws.com and dms.amazonaws.com services.

If you encounter the error “Role ‘dms-access-for-endpoint’ is not configured properly” when you choose Test endpoint connection before the endpoint is actually created, you can ignore it and choose Test Connection after the endpoint created.

For more information about creating this role, see Creating the IAM roles to use with the AWS CLI and AWS DMS API. For more information about how to troubleshoot errors related to this role, see How can I troubleshoot connectivity failures and errors for an AWS DMS task that uses Amazon Redshift as the target endpoint.

Fail to connect to source replica

You may encounter the following error message from the AWS DMS task log:

[SOURCE_CAPTURE ]E: RetCode: SQL_ERROR SqlState: HYT00 NativeError: 0 Message: [unixODBC][Microsoft][ODBC Driver 17 for SQL Server]Login timeout expired [1022502] [SOURCE_CAPTURE ]E: Failed to connect to replica ‘SQLAG1’ [1022506]

To use a SQL Server secondary replica as a source, AWS DMS needs to connect to the primary and secondary replicas in the availability group cluster. It also needs to resolve the availability group listener, primary replica, and secondary replicas.

As a solution, you can use Route 53 private hosted zones, self-managed domain name servers on Amazon EC2, or on-premises domain name servers. For self-managed DNS, you can specify custom DNS servers using the AWS CLI with parameter –dns-name-servers. In this post, we have configured a Route 53 outbound forwarder associated with VPC dms-blog-vpc and we forward DNS requests to DNS servers for domain contoso.dms.com. With this configuration, the AWS DMS replication instance is able to resolve the following host names:

sqlag.contoso.dms.com
sqlag1.contoso.dms.com
sqlag2.contoso.dms.com

For more information about Route 53 outbound endpoints, see Forwarding outbound DNS queries to your network.

Not able to list the S3 bucket due to lack of VPC endpoint

You may encounter the following error message:

[FILE_FACTORY ]E: Failed to list bucket ‘dms-xxxxxxx’: error code <>: curlCode: 28, Timeout was reached [1001713]

The timeout error indicates a connectivity-related issue.

To address this, check the following areas if your AWS DMS version is 3.4.7 and above and the replication instance is not publicly accessible:

The Amazon S3 gateway endpoint is missing.
The AWS DMS replication instance is not able to resolve the S3 endpoint for your Region. In this post, the endpoint is com.amazonaws.ap-southeast-1.s3.
The outbound rules for the AWS DMS instance attached security group are allowed for any host 0.0.0.0/0.

AWS DMS CDC task behaviors during the source availability group failover

AWS DMS connects to the source database via an availability group listener. The CDC task behavior varies based on the availability group state when the task connects and the actual CDC step being run during the event of availability group failover.

If the failover happens while the CDC task is reading the source SQL Server database transaction log, you see the following errors in the task log:

2023-06-09T06:35:17 2023-06-09T06:35:17 [SOURCE_CAPTURE ]E: mssql_execute_log_lsn_peeper(…) failed upon SQLExecute with LSN parameters ‘0x000001a2:0000ac78:0003′,’0x000001a2:0000ac78:0003’ [1020102] (sqlserver_log_queries.c:779)
2023-06-09T06:35:17 2023-06-09T06:35:17 [SOURCE_CAPTURE ]E: Encountered an unexpected error. Aborting… [1020102] (sqlserver_endpoint_capture.c:742)

The following is the last failure message on the AWS DMS console for the same CDC error. To recover from this error, ensure the availability group state is healthy and resume the task.

Last Error Stream Component recoverable error. Task error notification received from subtask 0, thread 0 [reptask/replicationtask.c:2883] [1020102] Error executing source loop; Stream component failed at subtask 0, component st_0_NGJ3XHIMAUSDHNRTBU66OCJ4YQL44P5Z2NJW73Y; Stream component ‘st_0_NGJ3XHIMAUSDHNRTBU66OCJ4YQL44P5Z2NJW73Y’ terminated [reptask/replicationtask.c:2891] [1020102] Stop Reason RECOVERABLE_ERROR Error Level RECOVERABLE

If AWS DMS can connect to the listener after failover and the availability group state is healthy, the CDC task continues running without error.

If AWS DMS is able to connect to the listener but the availability group state is not healthy, you see the following last failure message on the AWS DMS console for the CDC task. To recover from this error, ensure the availability group state is healthy and resume the task.

Last Error Source endpoint is in transition state and is not ready yet for interacting with REPLICATE. Task error notification received from subtask 0, thread 0 [reptask/replicationtask.c:2883] [1020470] Error executing source loop; Stream component failed at subtask 0, component st_0_NGJ3XHIMAUSDHNRTBU66OCJ4YQL44P5Z2NJW73Y; Stream component ‘st_0_NGJ3XHIMAUSDHNRTBU66OCJ4YQL44P5Z2NJW73Y’ terminated [reptask/replicationtask.c:2891] [1020470] Stop Reason RECOVERABLE_ERROR Error Level RECOVERABLE

If AWS DMS fails to connect to the listener with the error [SOURCE_CAPTURE ]E: Failed to connect to replica, fix the connectivity issue and resume the task.

Clean up

To avoid ongoing costs, delete the resources that you created as part of this post

Terminate Amazon EC2 instance
Delete the Amazon Redshift cluster
Delete VPC endpoints
Delete DMS replication task
Delete DMS endpoints
Delete DMS replication instance

Conclusion

In this post, we showed you how to use the secondary replica instance in an Always On availability group as the source of migrating your data from SQL Server to Amazon Redshift. We also showed you how to use VPC endpoints to ensure all traffic involved in moving your data remains inside your VPC network. The troubleshooting section serves as a quick reference for common errors during the deployment.

If you have any comments or questions, leave them in the comments section.

About the Authors

Donghua Luo is a Senior RDS Specialist Solutions Architect. He works with AWS customers to migrate and achieve higher flexibility, scale, and resiliency with database services in AWS cloud.

Barry Ooi is a Senior Database Specialist Solution Architect at AWS. His expertise is in designing, building and implementing data platform using cloud native services for customers as part of their journey on AWS. His areas of interest includes data analytics and visualization. In his spare time, he loves music and outdoor activities.

Use a SQL Server secondary replica in an availability group as a source to migrate data into Amazon Redshift with AWS DMS

Solution overview

Prerequisites

Configure a SQL Server availability group secondary replica as the source

Configure Amazon Redshift as the target

Create an Amazon Redshift cluster

Create an Amazon S3 gateway endpoint

Create an Amazon Redshift endpoint

Create an AWS DMS replication instance

Create the AWS DMS source endpoint

Create the AWS DMS target endpoint

Create an AWS DMS full load and CDC task

Troubleshooting

Error message “Role ‘dms-access-for-endpoint’ is not configured properly” during creation of Amazon Redshift endpoint

Fail to connect to source replica

Not able to list the S3 bucket due to lack of VPC endpoint

AWS DMS CDC task behaviors during the source availability group failover

Clean up

Conclusion

About the Authors

Scale your relational database for SaaS, Part 1: Common scaling patterns

Scale your relational database for SaaS, Part 2: Sharding and routing

Perform a side-by-side upgrade in AWS DMS by moving tasks to minimize business impact

LEAVE A REPLY Cancel reply

Most Popular

ChromeOS makes security simple for Block

Managing Cloud Storage soft delete at scale

Enhancing iEEG seizure identification and similarity search with Google Cloud

Auto-upgrades for Config Sync in GKE Enterprise now in preview

Recent Comments

EDITOR PICKS

Exploring the Click Element Variable in Google Tag Manager

How to track events with Google Tag Manager and Google Analytics

Data Layer Variable in GTM: What, Why, and Where?

POPULAR POSTS

ChromeOS: A New Year’s Resolution for Your Business

Introducing gcloud storage: up to 94% faster data transfers for Cloud Storage

Make informed decisions with Google Trends data

POPULAR CATEGORY