Programmatic approach to optimize the cost of Amazon RDS snapshots

By mullaned2002

April 17, 2024

142

One of the key benefits of Amazon Relational Database Service (Amazon RDS) is that it creates an automated storage volume snapshot of the database instance, backing up the database host at the instance. Amazon RDS saves the automated backups of databases according to the specified backup retention period. The flexibility of creating manual snapshots helps you achieve your business goals for high availability and disaster recovery. Although over-provisioned snapshots don’t cause application disruption or performance impact, it can increase the overall RDS snapshot costs. Reviewing RDS snapshots is important to optimize the overall cost of Amazon RDS.

In this post, we discuss programmatic strategies to manage RDS snapshots and optimize overall associated costs. We discuss identifying and dropping old or orphan snapshots by using AWS Command Line Interface (AWS CLI) commands, and creating recurring reports. We also show how to use AWS Cost and Usage Reports to find trends in snapshots cost, usage, and cost-optimization. If you’d like to keep the data in an Apache Parquet format, you can export it to Amazon Simple Storage Service (Amazon S3) and reduce costs. For more details, refer to Reduce data archiving costs for compliance by automating Amazon RDS snapshot exports to Amazon S3.

What are RDS snapshots?

Amazon RDS creates point-in-time copies of RDS storage volumes, backing up the entire RDS databases. Amazon RDS takes a full snapshot at the time of instance provision, or when automated snapshots are enabled on the RDS instance. Any subsequent snapshots are incremental, backing up any data blocks modified from last snapshot. Amazon RDS snapshots are stored in S3 buckets for a user-specified retention period. These snapshots can be shared to different AWS Regions for high availability and disaster recovery purposes, or can be shared with other AWS accounts for cross-account collaboration.

There are two types of RDS snapshots: automated and manual. Automated snapshots get triggered during a daily backup window, backing up databases and transactions logs. Automated snapshots are also taken during single-AZ to Multi-AZ conversion, database engine upgrades, and at the time of replica creation. These snapshots are purged based on a specified retention period, with a maximum of 35 days. At the time of RDS instance deletion, you have the option to keep automated snapshots, but these snapshots are automatically deleted at the end of retention period.

In some cases, business needs to retain snapshots for longer than 35 days to meet compliance requirements. Manual snapshots can be created by users any time using AWS Management Console, AWS CLI, or RDS API, and are kept until explicitly deleted. The key use cases of manual snapshots are long-term retention, cross-Region backup, and safeguarding database deletion.

How are RDS snapshots billed?

RDS snapshots are charged based on the amount of data stored in the snapshots and the duration for which the snapshots are retained. Data transfer charges are applied when snapshots are copied to a different Region. There are no backups charges for snapshots whose size is up to the total provisioned storage for all DB instances in a Region and account. For example, if you have two Amazon RDS for MySQL instances of 50 GB, three Amazon RDS for SQL Server instances of 100 GB, and two Amazon RDS for PostgreSQL instances of 150 GB in the us-east-1 Region, you will have a total of 700 GB backup storage at no additional charge. If you have a total of 800 GB of backup storage in that Region, you will be charged for the 100 GB of excess backup storage. For more details, refer to Amazon RDS pricing.

Solution overview

This Amazon RDS cost-optimization approach consists of the following key steps:

Identify the RDS instances with the maximum snapshots.
Identify the oldest RDS instances.
Identify any orphan RDS snapshots.

The following diagram illustrates this process and subsequent steps.

Throughout this post, we use the AWS CLI aws rds describe-db-snapshots command. Let’s understand the major components of this command with the following example:

aws rds describe-db-snapshots
—db-snapshot-identifier mydbsnapshot
{
“DBSnapshots”: [
{
…
“DBSnapshotIdentifier”: “mydbsnapshot”,
“DBInstanceIdentifier”: “mysqldb”,
“SnapshotCreateTime”: “2023-02-08T22:28:08.598Z”,
“SnapshotType”: “manual”,
…
}
]
}

The SnapshotType value provides the following types of snapshot:

automated – Snapshots that have been automatically taken by Amazon RDS for your AWS account
manual – Snapshots that have been taken manually by your AWS account
shared – Snapshots that have been shared with your AWS account
public – Snapshots that have been marked as public
awsbackup – Snapshots that are managed by AWS Backup

By reducing the overall size and number of snapshots, you can reduce the cost incurred by RDS snapshot storage. You can delete a manual, shared, or public DB snapshot using the AWS Management Console, the AWS CLI, or the Amazon RDS API. To delete a shared or public snapshot, you must sign in to the AWS account that owns the snapshot. If you have automated DB snapshots that you want to delete without deleting the DB instance, change the backup retention period for the DB instance to 0. This action will delete all automated snapshots of the DB instance.

Identify RDS instances with maximum snapshots

The following AWS CLI command lists RDS instances with the maximum automated snapshots:

aws rds describe-db-snapshots —snapshot-type automated —query ‘DBSnapshots[*].[DBInstanceIdentifier’] —output text |sort |uniq -c| sort -nr
9 pg15
9 pg13
8 pg14
8 pg11

After you identify the RDS instances with maximum snapshots, review the snapshot retention policies for these instances. RDS tags are used to classify DB instances for given environments, such as production, qa, stage, or test. If the business retention policy has been changed, or the RDS instance is wrongly classified for its environment, reducing the retention period can greatly reduce the overall snapshot cost. This can be done by using Amazon RDS console or AWS CLI modify-db-instance command. The following is an example of modifying the RDS DB snapshot retention period using the AWS CLI:

aws rds modify-db-instance –db-instance-identifier database-2 –backup-retention-period 3

Identify oldest RDS snapshots

In this step, we list all RDS snapshots in order by snapshot creation time. The following AWS CLI command lists the oldest RDS snapshots and their related instance names:

aws rds describe-db-snapshots —snapshot-type manual —query ‘DBSnapshots[*].[DBInstanceIdentifier,DBSnapshotIdentifier,SnapshotCreateTime]’ —output table |sort -k 3 -n
+————+—————————–+—————————-+
| database-1| database-1-final-snapshot | 2021-06-02T19:58:36.719Z |
| pg1010 | pg1010-final-snapshot | 2022-01-25T18:44:35.417Z |
| pg101 | pg101-final-snapshot | 2022-01-27T20:25:38.212Z |
| pg11 | manual-pg11-1 | 2023-05-04T20:50:26.052Z |
| pg11 | pg11-manual-623 | 2023-06-23T18:14:16.688Z |
| pg12 | pg12-final-snapshot | 2022-05-10T18:52:19.666Z |
| pg15 | manual-623-pg15 | 2023-06-23T18:15:09.957Z |
| testbackup| testbackup | 2021-04-09T15:01:29.801Z |
| test-snap | manualtest0401 | 2022-04-01T19:35:28.343Z |

Alternatively, you can use the following script to list all RDS snapshots older than 90 days:

After you identify the oldest RDS snapshots, review the business retention policies for related instances. If you find snapshots that exceed the retention period, deleting these snapshots can reduce the overall cost. You can delete snapshots by using Amazon RDS console or AWS CLI delete-db-snapshot command. The following is an example of deleting an RDS DB snapshot using the AWS CLI:

aws rds delete-db-snapshot –db-snapshot-identifier mydbsnapshot

Identify orphan RDS snapshots

When an RDS instance is deleted, the manual snapshots remain in the account and are counted towards the total backup size. Snapshots that are no longer referenced to deleted RDS instances are called orphaned snapshots. You can find and delete these orphan RDS snapshots to reduce your AWS RDS snapshot cost. Use the following script to list all orphan snapshots and related RDS instances:

#!/bin/bash
#Live databases
echo “Getting list of live RDS instances: LiveDBs.txt …”
LiveDBs=$(aws rds describe-db-instances –query ‘DBInstances[*].[DBInstanceIdentifier’] –output text | sort | uniq)
echo -e “$LiveDBs” > LiveDBs.txt
#Get list of all RDS instances related to snapshots
echo “Getting list of all RDS instances related to snapshots: AllDBs.txt …”
AllDBs=$(aws rds describe-db-snapshots –snapshot-type manual –query ‘DBSnapshots[*].[DBInstanceIdentifier’] –output text |sort |uniq)
echo -e “$AllDBs” > AllDBs.txt
#Get Deleted DBs
echo “Getting list of deleted RDS instances…”
DeletedDBs=`comm -13 <(sort LiveDBs.txt) <(sort AllDBs.txt)`
#Define output file
OutputFile=OrphanSnaps_Report_$(date +%Y%m%d%H%M).csv
echo “Creating file $OutputFile to store the information.”
echo “Looping through Deleted DBs to get the Orphan Snapshots…”
#Loop thought Deleted DBs to find the Orphan Snapshots
echo $DeletedDBs |tr ‘ ‘ ‘n’ | while read db ; do aws rds describe-db-snapshots –snapshot-type manual –db-instance-identifier $db –query ‘DBSnapshots[*].[DBSnapshotIdentifier,AllocatedStorage,DBSnapshotArn,DBInstanceIdentifier’] –output text |sort >> $OutputFile;
done
#Print the file
awk ‘{ print $4}’ $OutputFile > DeletedDB_$(date +%Y%m%d%H%M).csv
awk ‘{ print $1}’ $OutputFile > OrphanSnapshopts_$(date +%Y%m%d%H%M).csv
echo “List of deleted database instances created : DeletedDB_$(date +%Y%m%d%H%M).csv”
echo “List of orphan RDS snapshots created : OrphanSnapshopts_$(date +%Y%m%d%H%M).csv”
#echo OrphanSnapshopts_$(date +%Y%m%d%H%M).csv |tr ‘ ‘ ‘n’ | while read snap ; do echo “aws rds delete-db-snapshot –db-snapshot-identifier $snap” >> DeleteSnapshotCommand_$(date +%Y%m%d%H%M).csv;
while read snap;
do
echo “aws rds delete-db-snapshot –db-snapshot-identifier $snap” >> DeleteSnapshotCommand_$(date +%Y%m%d%H%M).csv;
done < OrphanSnapshopts_$(date +%Y%m%d%H%M).csv
echo “Delete Snapshot commands created : DeleteSnapshotCommand_$(date +%Y%m%d%H%M).csv”
echo “Detailed report created: $OutputFile”
echo “”
echo “==========RDS Orphan Snapshots Report==========”
cat $OutputFile
echo “”

Running the preceding script creates five files:

LiveDBs.txt is the list of currently live RDS instances:

cat LiveDBs.txt
aurorapgclus
pg11
pg13
pg14
pg15

AllDBs.txt is the list of all RDS instances including currently live RDS instances and deleted instances but having one or more snapshots:

cat AllDBs.txt
database-1
pg101
pg1010
pg11
pg12
pg14
pg144
pg15
testbackup
test-snap

OrphanSnapshopts_202401252044.csv is the list of all orphan snapshots:

cat OrphanSnapshopts_202306282255.csv
database-1-final-snapshot
pg101-final-snapshot
pg1010-final-snapshot
pg12-final-snapshot
pg144-snapshot
testbackup
manualtest0401

DeletedDB_202401252044.csv contains all deleted databases:

cat DeletedDB_202401252044.csv
database-1
pg101
pg1010
pg12
pg144
testbackup
test-snap

DeleteSnapshotCommand_202401252044.csv contains commands to delete identified orphan snapshots:

cat DeleteSnapshotCommand_202306282255.csv
aws rds delete-db-snapshot –db-snapshot-identifier database-1-final-snapshot
aws rds delete-db-snapshot –db-snapshot-identifier pg101-final-snapshot
aws rds delete-db-snapshot –db-snapshot-identifier pg1010-final-snapshot
aws rds delete-db-snapshot –db-snapshot-identifier pg12-final-snapshot
aws rds delete-db-snapshot –db-snapshot-identifier pg144-snapshot
aws rds delete-db-snapshot –db-snapshot-identifier testbackup
aws rds delete-db-snapshot –db-snapshot-identifier manualtest0401

Finally, OrphanSnaps_Report_202401252044.csv contains a detailed report of orphan snapshots with snapshot identifier, allocated storage size in gibibytes (GiB), ARNs, and RDS identifier:

cat OrphanSnaps_Report_202401252044.csv
database-1-final-snapshot 20 arn:aws:rds:us-west-2:571234748784:snapshot:database-1-final-snapshot database-1
pg101-final-snapshot 900 arn:aws:rds:us-west-2:571234748784:snapshot:pg101-final-snapshot pg101
pg1010-final-snapshot 700 arn:aws:rds:us-west-2:571234748784:snapshot:pg1010-final-snapshot pg1010
pg12-final-snapshot 500 arn:aws:rds:us-west-2:571234748784:snapshot:pg12-final-snapshot pg12
pg144-snapshot 200 arn:aws:rds:us-west-2:571234748784:snapshot:pg144-snapshot pg144
testbackup 20 arn:aws:rds:us-west-2:571234748784:snapshot:testbackup testbackup
manualtest0401 500 arn:aws:rds:us-west-2:571234748784:snapshot:manualtest0401 test-snap

After you identify the orphan snapshots, you can review the business needs of these snapshots. If the business retention policy permits, deleting these snapshots can reduce the overall snapshot cost.

AWS Cost and Usage Report

AWS provides certain tools for detailed billing, usage reports, and proactively finding costs. The AWS Cost and Usage Report (AWS CUR) contains the cost and usage of AWS services. This visibility helps you identify opportunities on cost optimization. The report also includes detailed information of RDS snapshot costs. By analyzing AWS CUR data, you can identify trends in snapshot usage costs over time. The report also helps you identify snapshots out of the retention period. This allows you to delete or manage them more efficiently, reducing unnecessary costs. The AWS CUR helps you make informed decisions about optimizing snapshot-related expenses.

The following steps show how to configure a report for snapshot cost optimization using Amazon Athena:

On the AWS Billing console, choose Cost & usage reports in the navigation pane.
Choose Create report.
For Report name, enter a name.
Specify the report path and format.

The next step is to set up Athena.

On the Athena console, create a database to hold the AWS CUR data.
On the Editor tab, enter the Hive data definition language (DDL) command CREATE DATABASE dbname;.
Choose Run or press Ctrl+Enter.
To make this database the current database, select it from the Database menu in the query editor.
Create an external table in Athena that corresponds to the schema of your AWS CUR data.
Map the columns of the table to the appropriate AWS CUR report fields, including the ones related to RDS snapshots.
Run the following query in Athena:

SELECT product_product_name as Resource_Type,
line_item_resource_id as Database_Name,
line_item_usage_type as Storage_Type,
sum(line_item_blended_cost) AS cost
FROM mybackupcostreport
WHERE line_item_product_code = ‘AmazonRDS’
and line_item_usage_type like ‘%BackupUsage%’
and product_database_engine = ‘Oracle’
GROUP BY line_item_usage_type,
resource_tags_user_workload_type,
product_storage,
product_storage_class,
product_storage_media,
product_storage_type,
month,
product_product_name,
line_item_resource_id
HAVING sum(line_item_blended_cost) > 0;

We get the following output for the preceding query showing snapshots by cost.

You can further refine your queries to extract more specific insights from the AWS CUR data related to RDS snapshots. For example, you can analyze snapshot costs, usage patterns, snapshot sizes, or any other relevant metrics.

AWS Cost Explorer

AWS Cost Explorer allows you to track backup storage costs per instance (RDS:ChargedBackupUsage) if cost allocation tags are added to the RDS instances. These tags enable you to filter costs by tags. We encourage you to do this and enable AWS CUR for the most detailed breakdown of your usage and per-instance backup costs. The following screenshot shows Amazon RDS backup charges for instances tagged with ChargedBackupUsage.

Summary

In this post, we shared recommendations to help you optimize the cost of your RDS snapshots. It’s important to keep in mind that each RDS instance has its own business policies and each snapshot has its own retention period. It’s therefore essential to periodically review and validate the current business polices and backup retention policies. For non-production DB instances, you can put automation in place to delete snapshots. However, for production instances, it’s recommended to work with the architects to understand the business policies on backup retention periods. By using Athena, AWS CUR, and Cost Explorer, you can gain deeper insights into your snapshots’ usage, costs, and other related information for tagged RDS instances.

If you have any questions or comments, post your thoughts in the comments section.

About the authors

Vivek Singh is a Principal Database Specialist Technical Account Manager with AWS focusing on Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL engines. He works with enterprise customers providing technical assistance on PostgreSQL operational performance and sharing database best practices. He has over 17 years of experience in open source database solutions, and enjoys working with customers to help design, deploy, and optimize relational database workloads on AWS.

Pavan Pusuluri is a Senior Sr. Data Architect with the Professional Services team at Amazon Web Services. His passion is building scalable, highly available, and secure solutions in the AWS Cloud. His focus area is homogenous and heterogeneous migrations of on-premises databases to Amazon RDS and Amazon Aurora PostgreSQL. Outside of work, he cherishes spending time with his family, exploring food, and playing cricket.

Programmatic approach to optimize the cost of Amazon RDS snapshots

What are RDS snapshots?

How are RDS snapshots billed?

Solution overview

Identify RDS instances with maximum snapshots

Identify oldest RDS snapshots

Identify orphan RDS snapshots

AWS Cost and Usage Report

AWS Cost Explorer

Summary

About the authors

Schneider Electric automates Salesforce account hierarchy management with generative artificial intelligence (AI) using Amazon Aurora and Amazon Bedrock

Make relevant movie recommendations using Amazon Neptune, Amazon Neptune Machine Learning, and Amazon OpenSearch Service

Implement UUIDv7 in Amazon RDS for PostgreSQL using Trusted Language Extensions

LEAVE A REPLY Cancel reply

Most Popular

Schneider Electric automates Salesforce account hierarchy management with generative artificial intelligence (AI) using Amazon Aurora and Amazon Bedrock

Leverage enterprise data with Denodo and Vertex AI for generative AI applications

TypeScript takes aim at truthy and nullish bugs

Make relevant movie recommendations using Amazon Neptune, Amazon Neptune Machine Learning, and Amazon OpenSearch Service

Recent Comments

EDITOR PICKS

Exploring the Click Element Variable in Google Tag Manager

How to track events with Google Tag Manager and Google Analytics

Data Layer Variable in GTM: What, Why, and Where?

POPULAR POSTS

Empowering all to be safer with AI this Cybersecurity Awareness Month

Improved Alerting with Atlas Streaming Eval

Approach cloud architecture from the outside in

POPULAR CATEGORY