Improve native backup and restore performance in Amazon RDS for SQL Server

By mullaned2002

August 9, 2021

1049

Amazon Relational Database Service (Amazon RDS) for SQL Server makes it easy to set up, operate, and scale SQL Server deployments in the cloud. As a fully managed database service, Amazon RDS for SQL Server takes automated backups of your DB instance during the backup window. If required, you can restore your DB instance to a specific point in time as per your backup retention.

In addition to automated snapshots, Amazon RDS for SQL Server also supports native backup and restore in SQL Server. You may have to use a native backup and restore method to migrate individual databases from one RDS instance to another, or refresh your dev or QA environment by taking a backup of the production database and restoring it to the environment.

Native backup and restore is a time-tested method for SQL Server migrations. You can utilize the differential and log backups on the source side to reduce the database migration timelines. However, you may wonder if you can rely on this strategy for large databases, such as 4 TB in size? What are the different methods in which you can improve the performance of the backup and restore process and thereby reduce the downtime required for migration activities? Is one method better than others?

In this post, we demonstrate how to improve native backup and restore performance by tuning the configurations and testing different strategies along with estimates in Amazon RDS for SQL Server.

Native backup and restore in Amazon RDS for SQL Server

For the context of this post, it’s important to understand how native backup and restore works in Amazon RDS for SQL Server. For an overview, see Support for native backup and restore in SQL Server.

To learn about the technical implementation of performing backup and restore in Amazon RDS for SQL Server, see Importing and exporting SQL Server databases. This documentation provides different configurations that you can use with Amazon RDS for SQL Server backup and restore, such as using compression or multi-file backups.

Set up lab servers

The goal of our testing was to find out which configuration, including instance types and storage types, gives the best performance for backups and restores. We started with two lab servers:

Server A – RDS SQL 2017 R5.8XLarge
Server B – RDS SQL 2017 R5.24XLarge

Then we created a dummy database, filled with random large tables. The database was grown up to 1 TB and 4 TB in size for the tests.

No other active workloads were running on these lab machines. The only load running on these servers were backup and restore queries. The backup and restore performance in your Amazon RDS environment depends on the user or application queries and any other active workloads.

The following table summarizes the storage configuration that was tested with each lab server.

Server
Instance Storage Size
GP2IO1
PIOPS Configuration
RDS SQL 2017 R5.8XLarge
1.5 TB
GP2

RDS SQL 2017 R5.8XLarge
1.5 TB
IO1
9000
RDS SQL 2017 R5.8XLarge
4.5 TB
GP2

RDS SQL 2017 R5.8XLarge
4.5 TB
IO1
30,000
RDS SQL 2017 R5.24XLarge
1.5 TB
GP2

RDS SQL 2017 R5.24XLarge
1.5 TB
IO1
9000
RDS SQL 2017 R5.24XLarge
4.5 TB
GP2

RDS SQL 2017 R5.24XLarge
4.5 TB
IO1
35,000

In your environment, you should also consider testing with the newer R5b instance types. For more information, see New – Amazon EC2 R5b Instances Provide 3x Higher EBS Performance.

Set up test cases

We tested the following configurations for performing backup and restore for 1 TB and 4 TB databases:

Backup and restore with GP2 storage
Backup and restore with compression enabled and GP2 storage
Backup and restore with compression enabled, multiple files (four), and GP2 storage
The backup file was divided into four multiple files:

exec msdb.dbo.rds_backup_database
@source_db_name=’mydatabase’,
@s3_arn_to_backup_to=’arn:aws:s3:::mybucket/backup*.bak’,
@type=’FULL’,
@number_of_files=4;

Backup and restore with compression enabled, multiple files (eight), and GP2 storage
The backup file was divided into eight multiple files:

exec msdb.dbo.rds_backup_database
@source_db_name=’mydatabase’,
@s3_arn_to_backup_to=’arn:aws:s3:::mybucket/backup*.bak’,
@type=’FULL’,
@number_of_files=8;

Backup and restore with multiple files (four) and GP2 storage
Backup and restore with multiple files (eight) and GP2 storage
Backup and restore with io1 storage
Backup and restore with compression enabled and io1 storage
Backup and restore with compression enabled, multiple files (four), and io1 storage
Backup and restore with compression enabled, multiple files (eight), and io1 storage
Backup and restore with multiple files (four) and io1 storage
Backup and restore with multiple files (eight) and io1 storage

Test results

The following table summarizes the results we found for the 1 TB database.

(All the numbers represent minutes)
Full Backup Times (1 TB)
Full Restore Times (1 TB)
r5x8large
r5x24large
Performance
r5x8large
r5x24large
Performance
GP2
179
175
x
247
273
x
GP2+Compression
90
89
~2x faster
138
127
~2x faster
GP2+Compression+Multiple Files (4)
66
66
~2.7x faster
79
79
~3x faster
GP2+Compression+Multiple Files (8)
66
66
~2.7x faster
82
79
~3x faster
GP2+Multiple Files (4)
65
65
~2.7x faster
73
74
~3x faster
GP2+ Multiple (8)
65
71
~2.7x faster
78
75
~3x faster
IOPS (9,000)
172
177
~same performance
175
383

IOPS+Compression
85
87
~2.1x faster
82
124
~3x faster
IOPS+Compression+Multiple Files (4)
33
34
~5.4x faster
37
40
~6.5x faster
IOPS+Compression+Multiple Files (8)
33
33
~5.4x faster
36
37
~6.5x faster
IOPS+Multiple Files (4)
55
57
~3.2x faster
50
76
~4x faster
IOPS+ Multiple (8)
49
53
~3.5x faster
48
68
~4x faster

The following table summarizes the results we captured for the 4 TB database.

(All the numbers represent minutes)
Full Backup Times (4 TB)
Full Restore Times (4 TB)
r5x8large
r5x24large
Performance
r5x8large
r5x24large
Performance
GP2
542
537
x
740
763
x
GP2+Compression
223
224
~2.4x faster
281
280
~2.6x faster
GP2+Compression+Multiple Files (4)
202
201
~2.5x faster
281
281
~2.6x faster
GP2+Compression+Multiple Files (8)
204
202
~2.5x faster
281
–
~2.6x faster
GP2+Multiple Files (4)
202
201
~2.5x faster
281
–
~2.6x faster
GP2+ Multiple (8)
208
208
~2.5x faster

–

IOPS (30,000)
540
–
~same performance
826
757

IOPS (30,000)+Compression
164
–
~3x faster
194
210
~3.6x faster
IOPS (30,000)+Compression+Multiple Files (4)
102
102
~5.3x faster
140
141
~5.2x faster
IOPS (30,000)+Compression+Multiple Files (8)
101
101
~5.3x faster
138
140
~5x faster
IOPS (30,000)+Multiple Files (4)
173
–
~3.1x faster
168
175
~4x faster
IOPS (30,000)+ Multiple (8)
134
–
~4x faster
141
141
~5.2x faster
IOPS (35,000)+Compression+Multiple Files (4)
94
93
~5.7x faster
129
129
~5.7x faster
IOPS (35,000)+Compression+Multiple Files (8)
93
93
~5.7x faster
129
130
~5.7x faster

Observations

Based on my results, we can make the following observations about backup and restore performance:

IOPS with compression and multiple files gives the best performance. This configuration could reduce the time taken for a backup and restore by more than five times. However, there are trade-offs for this performance:
IOPS costs more than GP2.
Compression requires CPU; it might affect other concurrent transactions running on SQL Server and cause overall performance degrade.

You can use multi-file backup to improve backup performance by approximately 2.5 times and restore performance by 2.6 times. By using this functionality, you don’t have change any configuration on your existing Amazon RDS for SQL Server, such as IOPS or enabling compression.
The reason we didn’t see a difference between R5.XLarge and R5.24XLarge is because we didn’t have any active workloads running on these machines. Backup and restore were the only processes running on SQL Server. Therefore, there was no contention of resources on SQL Server. In a real-world scenario, when the backup and restore process competes with other database transactional processes, a bigger machine (with more resources) gives a better performance.
IOPS configuration affects backup and restore performance. When we increased the IOPS from 30,000 to 35,000, performance improved. You should adjust the IOPS setting during the migration window to improve performance.

Summary

If you’re looking for ways to improve backup and restore time for your large databases, the results and the observations in this post should help you form a good strategy. The post lists some of the tools and techniques to reduce time taken for backup and restore of large databases. More importantly, it can help you get an estimate of the performance gain with each configuration, which you can use to plan migration for your large databases.

Try one of the configurations discussed in the post and leave a comment to let us know what results you saw in your environment!

About the Author

Anuja Malik has been working with database for 15 years. Ex-Microsoft, she was the Database Architect for Thomson Reuters where she led the datacenter exit program and helped Thomson Reuters move to AWS. Anuja is a builder; Powershell and Python are her favorite tools. She is passionate about managed database services and helping customers adopt best practices for operational excellence and cost savings.

Improve native backup and restore performance in Amazon RDS for SQL Server

Native backup and restore in Amazon RDS for SQL Server

Set up lab servers

Set up test cases

Test results

Observations

Summary

About the Author

GQL: The ISO standard for graphs has arrived

Set up notifications for Amazon RDS pending maintenance actions

Enhance PostgreSQL database security using hooks with Trusted Language Extensions

LEAVE A REPLY Cancel reply

Most Popular

The overwhelmed person’s guide to Google Cloud: week of April 18

GQL: The ISO standard for graphs has arrived

Set up notifications for Amazon RDS pending maintenance actions

Enhance PostgreSQL database security using hooks with Trusted Language Extensions

Recent Comments

EDITOR PICKS

Exploring the Click Element Variable in Google Tag Manager

How to track events with Google Tag Manager and Google Analytics

Data Layer Variable in GTM: What, Why, and Where?

POPULAR POSTS

Streamline interstate Department of Motor Vehicles collaboration with Private Blockchain

The journey to the cloud mitigates enterprise risk

Are cloud-native ops tools right for multicloud?

POPULAR CATEGORY