Friday, June 21, 2024
No menu items!
HomeDatabase ManagementOptimize TempDB performance in Amazon RDS Custom for SQL Server using local...

Optimize TempDB performance in Amazon RDS Custom for SQL Server using local instance storage

Amazon Relational Database Service (Amazon RDS) Custom for SQL Server now supports the X2iedn instance class, which has a high memory-to-vCPU ratio and non-volatile memory express (NVMe) SSD-backed instance storage optimized for low latency, improved random I/O performance, and high sequential read throughput. With this new offering, you can now create and extend TempDB files on the locally attached NVMe disk to achieve low storage latencies.

In this post, we show you how to use the local NVMe disks for optimizing TempDB performance. We provide step-by-step instructions to configure and initialize local storage (using GUI and PowerShell) and TempDB, provide a performance comparison of TempDB on local storage to Amazon Elastic Block Store (Amazon EBS) storage, and discuss best practices for TempDB optimization when using a local instance store.

SQL Server TempDB overview

The TempDB system database in SQL Server is a global resource accessible to all users connected to the instance. It’s also used to store temporary user objects and internal objects created by the database engine. Temporary user objects include global or local temporary tables, temporary stored procedures, table variables, tables returned in table-valued functions, and cursors. Internal objects created by the database engine include work tables for storing intermediate results for operations like spools, cursors, sorts, and temporary large object (LOB) storage. It also includes work files for hash join or hash aggregate operations, and intermediate sort results for operations such as creating or rebuilding indexes or certain GROUP BY, ORDER BY, or UNION queries.

Given its role in many operations, TempDB can experience high I/O, especially if many queries require temporary storage or if there’s heavy use of features like snapshot isolation, result set sorting, or aggregation user queries. TempDB can experience contention when many sessions try to allocate space in TempDB simultaneously, which can also result in overall performance issues.

The following are advantages of local TempDB:

TempDB performance on local storage has faster read/write speeds due to lower latency. It also provides improved random I/O performance and high sequential read throughput, reducing overall latency.
Depending on the workload, TempDB performance can be improved by up to 20%.
It can reduce your overall EBS snapshot costs because secondary TempDB files will be placed in the local instance store.

As a prerequisite, you must have an RDS Custom for SQL Server instance. For more details on setting up an instance, refer to Launch an Amazon RDS Custom for SQL Server Instance using AWS CloudFormation.

Configure and initialize local NVMe SSD storage using the GUI

Complete the following steps to initialize and start using an SSD disk on your Amazon Elastic Compute Cloud (Amazon EC2) instance via RDP:

Use Remote Desktop Protocol (RDP) to connect to your RDS Custom EC2 instance.
You’ll need the instance’s public IP address or DNS name and credentials to log in.
Open the Disk Management window by entering “Disk Management” into the Start menu search and selecting the corresponding result

In the Disk Management window, you should see a list of disks. The new SSD disk will likely appear as an unknown disk with unallocated space (Disk1 in the following screenshot).

3. Choose (right-click) Disk1 (which is labeled as “Not initialized”) and then choose Initialize Disk to begin setting it up for partitioning.

Follow the prompts to select the appropriate disk initialization type (GPT in our case) and choose OK to proceed with the initialization.

After you initialize the disk, choose (right-click) the unallocated space and choose New Simple Volume.

Choose Next to follow the wizard to create a new partition on the SSD.
You can choose the partition size and drive letter, and format the partition with a file system (for example, NTFS).

Configure your local storage based on the DB instance class

Assign a drive letter (T in our case; choose a drive letter that isn’t already in use).

Format with appropriate file system (NTFS in our case) and label the volume (we have chosen the label Temp).
Choose Next to continue creating the partition.

Choose (right-click) the newly created partition and choose Format.
Choose the desired file system and allocation unit size.
Choose Next to finish formatting the partition.

The SSD disk is now initialized and ready for use. You can access it using the assigned drive letter (for this use case, T).

Configure and initialize local NVMe SSD storage using PowerShell

For our example use case, we have one local SSD storage in an X2iedn instance. You can configure and initialize local NVMe SSD storage by running the following PowerShell commands as administrator:

#Identify all available Disks
#Get all disks
$disks = Get-Disk

#Find the local SSD disk that needs to be configured
$ssd = $disks | Where-Object { $_.OperationalStatus -eq ‘offline’ -or $_.PartitionStyle -eq “RAW” }

#Initialize the Disk
$ssd | Initialize-Disk -PartitionStyle GPT

#Create a New Partition
$partition = $ssd | New-Partition -UseMaximumSize

#Assign a unused Drive Letter in this example we used T
$partition | Set-Partition -NewDriveLetter T

#Format the Partition by provinding volume name, in this example we used ‘Temp’
$partition | Format-Volume -FileSystem NTFS -NewFileSystemLabel ‘Temp’

Configure TempDB on local storage

Complete the following steps to configure TempDB on local storage:

Identify a new local drive that you intend to use for creating secondary TempDB files. Make sure that this drive has sufficient disk space and I/O capacity to support TempDB.
Open SQL Server Management Studio (SSMS), connect to your SQL Server instance, and connect to SSMS using the master user.
Run the following SQL command to inspect the current configuration of TempDB and its file locations. USE tempdb;
EXEC sp_helpfile;

Make note of the existing file locations.

Before you configure the secondary TempDB files, create directories on the new drive where the secondary additional TempDB files will be stored. In our example, we created directories and the path T:rdsdbdataDATA on the T drive to store additional TempDB files. Make sure that you have the necessary permissions to create directories. You can alternatively create the directory structure and copy permissions by running the following PowerShell commands as administrator:

# Source directory in our case it is “D:rdsdbdataDATA”
$source_directory = “D:rdsdbdataDATA”
# Display the source directory
Write-Host “Source directory is: $source_directory”

# Destination directory, in our case it is “T:rdsdbdataDATA”
$destination_directory = “T:rdsdbdataDATA”
# Display the destination directory
Write-Host “Destination directory will be: $destination_directory”

# Get the directory structure from the source directory
Write-Host “Getting directory structure from source…”
$sub_directory = Get-ChildItem -Path $source_directory -Recurse | Where-Object { $_.PSIsContainer }

# Create the destination directory
Write-Host “Creating destination directory…”
New-Item -ItemType Directory -Force -Path $destination_directory

# Get the permissions from the source directory
Write-Host “Getting permissions from source directory…”
$source_directory_permission = Get-Acl -Path $source_directory

# Set permissions for the destination directory to match the source directory
Write-Host “Setting permissions for destination directory…”
Set-Acl -Path $destination_directory -AclObject $source_directory_permission

# Copy the directory structure and permissions from the source directory to the destination directory
Write-Host “Copying directory structure and permissions…”
foreach ($item in $sub_directory) {
$relative_path = $item.FullName.Substring($source_directory.Length)
$destination_path = Join-Path -Path $destination_directory -ChildPath $relative_path
Write-Host “Creating directory: $destination_path”
New-Item -ItemType Directory -Force -Path $destination_path
$itemAcl = Get-Acl -Path $item.FullName
Set-Acl -Path $destination_path -AclObject $itemAcl
}

Write-Host “Done!”

Use the following SQL command as a template to create secondary TempDB files on the newly created drive. Adjust the file paths and sizes according to your requirements. It’s recommended to set all the TempDB database files to the same initial size.

USE master;
GO
ALTER DATABASE tempdb
ADD FILE (
NAME = tempdev1, — Example file name
FILENAME = ‘T:rdsdbdataDATAtempdev1.ndf’, — we have to Specify the new file path
SIZE = 300GB, — Set the file size
MAXSIZE = UNLIMITED,
FILEGROWTH = 64MB — Set the autogrowth size
);

Repeat this command as necessary to create the required number of additional TempDB files and add them to the new location.
Run the sp_helpfile command again to confirm that the TempDB files have been successfully created on the new drive: USE tempdb;
EXEC sp_helpfile;

Performance benchmarks for local and non-local TempDB

We conducted and analyzed the performance of TempDB on Single-AZ and Multi-AZ deployment on X2ieden instances. We used the SQLQueryStress tool to simulate TempDB workloads on different configurations.

We used an db.x2iedn.16xlarge instance for benchmarking with the following configuration.

SQL Server configuration:

Version: Enterprise Edition 2019 with CU20
TempDB data files: One primary and six secondary additional TempDB data files, each 352,250 MB
SQL Server max memory: 20 GB (By limiting to 20 GB, we simulate more disk I/O. Use this only for test purposes.)

Test details using SQLQueryStress tool donfiguration:

Iterations: 500
Threads: 1
Delay: 0

We created two temporary tables, #tempOrders and #tempOrderDetails, which reside in TempDB. These tables were populated with 1 million and 5 million records, respectively, to mimic large-scale application data.

For our benchmarking query, we joined these tables, performed aggregations, and filtered results. This complex query tested TempDB’s efficiency in handling large datasets and intensive operations, providing insights into its performance.

After we ran the query in each iteration, we cleared the SQL Server cache to ensure a fair and fresh comparison in subsequent tests, and then dropped the temporary tables to clean up and keep it ready for the next iteration.

See the following code:

–Workload queries for tempdb performance evaluation

— Create Temporary Tables for benchmark test :
CREATE TABLE #tempOrders (
OrderID INT PRIMARY KEY,
CustomerID INT,
OrderDate DATETIME,
Amount DECIMAL(10,2)
);

CREATE TABLE #tempOrderDetails (
DetailID INT PRIMARY KEY,
OrderID INT,
ProductID INT,
Quantity INT
);

— Populate Temporary Tables with sample test Data:
— Populate #tempOrders
INSERT INTO #tempOrders
SELECT TOP 1000000 ROW_NUMBER() OVER (ORDER BY a.name) AS OrderID,
ABS(CHECKSUM(NEWID())) % 5000 AS CustomerID,
DATEADD(DAY, ABS(CHECKSUM(NEWID())) % 365, ‘2023-01-01’) AS OrderDate,
ABS(CHECKSUM(NEWID())) % 1000 AS Amount
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;

— Populate #tempOrderDetails
INSERT INTO #tempOrderDetails
SELECT TOP 5000000 ROW_NUMBER() OVER (ORDER BY a.name) AS DetailID,
ABS(CHECKSUM(NEWID())) % 1000000 AS OrderID,
ABS(CHECKSUM(NEWID())) % 10000 AS ProductID,
ABS(CHECKSUM(NEWID())) % 100 AS Quantity
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;

— Sample Complex Query for Benchmarking:
WITH CTE AS (
SELECT o.OrderID, o.CustomerID, o.OrderDate, o.Amount,
d.ProductID, d.Quantity,
(o.Amount * d.Quantity) AS TotalAmount
FROM #tempOrders o
JOIN #tempOrderDetails d ON o.OrderID = d.OrderID
)
SELECT CustomerID, SUM(TotalAmount) AS GrandTotal
FROM CTE
GROUP BY CustomerID
HAVING SUM(TotalAmount) > 5000
ORDER BY GrandTotal DESC;

DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;

DROP TABLE #tempOrders;
DROP TABLE #tempOrderDetails;

The following screenshots are from the benchmark test.

The following table indicates that TempDB performance on both Multi-AZ and Single-AZ instances configured with local storage outperformed their EBS storage counterparts in terms of elapsed time and average actual time per iteration by 20% for the test parameters we applied. This suggests that for high TempDB workloads, local storage configurations are more efficient. The improvement in performance can be attributed to the reduced latency and higher IOPS (I/O operations per second) offered by local storage compared to EBS storage.

We used the following TempDB configurations:

Multi-AZ with local storage (configured TempDB on D and E drive)
Multi-Z with EBS storage (configured TempDB only on D drive)
Single-AZ with EBS storage (configured TempDB only on D drive)
Single-AZ with local storage (configured TempDB on D and E drive)

Configuration
Perfmon Metric Total (Disk Transfer/Second)
Elapsed Time (MM:SS)
Average Actual Time per Iteration (Seconds)
Elapsed Time Improvement (%)
Average Actual Time per Iteration Improvement (%)
Disk Transfer/Second Improvement (%)

Multi-AZ with EBS storage (only D drive)
1947.28
57:09
6.8627s

Multi-AZ with local storage (D and E drive)
2743.91
46:04
5.5263s
20%
20%
41 %

Single-AZ with EBS storage (only D drive)
2200.94
57:54
6.9712s

Single-AZ with local storage (D and E drive)
3127.91
46:22
5.5663s
20%
20%
42%

Best practices for provisioning X2iedn on RDS Custom for SQL Server

Consider the following best practices:

Setting up X2iedn – Upon launching RDS Custom with the X2iedn instance type, you only have a raw, unallocated device for local storage. Before using it, you must correctly mount and format this device. Then you can configure TempDB on it to ensure optimal performance. For Multi-AZ instances, it’s recommended to perform the configuration on a standby instance. This way, in the event of a failover, the system continues to operate without issues, because the configuration is already in place on the standby instance.
TempDB file placement – Although the benefits of local storage are significant, it’s crucial to remember that only secondary TempDB files can be placed on local attached storage. The primary files for TempDB (tempdev and templog) and other user and system databases must be always placed on the D drive. Not adhering to this placement will keep the instance in an out of perimeter status, potentially disrupting operations.
TempDB size – To avoid any single file from becoming a performance bottleneck, we recommend maintaining a consistent size for all TempDB files. This approach ensures balanced I/O operations and maximizes performance.
Instance operations – With this setup, when using instance operations such as scale compute, instance replacement, snapshot restore, or point-in-time recovery (PITR), the local storage will revert to its raw and unallocated state. In these situations, you must remount, reformat, and reconfigure both the drive and TempDB to restore functionality as appropriate.
Adjust TempDB configuration – After any instance operations, we recommend altering the TempDB configuration and adjusting the sys.metadata accordingly. This action makes sure the system metadata reflects your TempDB setup, preventing potential conflicts or issues.
Monitor – Regularly monitor the TempDB space usage on your local NVMe drive and scale up or down based on TempDB usage.
Always test before applying – Before you apply any significant changes to a production environment with this setup, we highly recommend testing them in a staging or development environment. This will help prevent any unforeseen issues or performance impacts.

Conclusion

In this post, we took a deep dive into the capabilities of the X2iedn instance class, a powerful offering by RDS Custom that boasts a high memory-to-vCPU ratio and NVMe SSD-backed instance storage. We walked you through the process of creating a new partition and configuring TempDB on local storage, showcasing the significant enhancements in performance this setup can offer. Our performance benchmarking demonstrated the improvements in TempDB performance when taking advantage of the low-latency, high I/O capabilities of local storage. Additionally, we highlighted crucial best practices to ensure that you get the most out of your TempDB configuration when using the X2iedn instance class. Leave a comment if you have any questions.

To learn more about where or when to use RDS Custom, check out Working with RDS Custom for SQL Server.

About the author

Kalyan Banala is an accomplished Database Engineer, and works with the Amazon RDS Custom for SQL Server team at AWS. He enjoys working on technical challenges and is passionate about learning from and sharing knowledge with his teammates and AWS customers.

Read MoreAWS Database Blog

RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Most Popular

Recent Comments