Optimize costs with scheduled scaling of Amazon DocumentDB for read workloads

By mullaned2002

June 12, 2024

In this post, we show you two ways to schedule the scaling of your Amazon DocumentDB (with MongoDB compatibility) instance-based clusters to address anticipated read traffic patterns.

Amazon DocumentDB is a fully managed native JSON document database that makes it straightforward and cost-effective to operate critical document workloads at virtually any scale without managing infrastructure. Amazon DocumentDB simplifies your architecture by providing built-in security best practices, continuous backups, and native integrations with other AWS services. You can enhance your applications with generative artificial intelligence (AI) and machine learning (ML) capabilities using vector search for Amazon DocumentDB and integration with Amazon SageMaker Canvas. As a document database, Amazon DocumentDB makes it straightforward to store, query, and index JSON data.

To learn more about how Amazon DocumentDB automatically distributes reads across available replicas in the cluster, see Scaling Amazon DocumentDB (with MongoDB compatibility), Part 1: Scaling reads.

By aligning your Amazon DocumentDB cluster scaling operations with the anticipated read traffic patterns, you can achieve optimal performance during peak loads and save costs by reducing the need to overprovision your cluster. Amazon DocumentDB provides per-second billing for compute instances, with a 10-minute minimum billing period, so you pay only for what you use. For more information, see Amazon DocumentDB pricing.
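To make the billing rule concrete, the following is a minimal sketch of per-second billing with a 10-minute minimum. The hourly rate and the weekly schedule are hypothetical values chosen for illustration, not actual prices; check the pricing page for real rates.

```python
MINIMUM_BILLED_SECONDS = 10 * 60  # 10-minute minimum billing period


def billed_seconds(runtime_seconds: int) -> int:
    """Return the seconds billed for one instance run under per-second billing."""
    if runtime_seconds <= 0:
        return 0
    return max(runtime_seconds, MINIMUM_BILLED_SECONDS)


def instance_cost(runtime_seconds: int, hourly_rate: float) -> float:
    """Cost of one instance run; hourly_rate is a hypothetical price per instance-hour."""
    return billed_seconds(runtime_seconds) * hourly_rate / 3600


# Assumption for illustration only: two extra replicas, 10 hours on each of 5 weekdays,
# compared with keeping the same two replicas running all week.
HYPOTHETICAL_HOURLY_RATE = 0.30
scheduled = 2 * 5 * instance_cost(10 * 3600, HYPOTHETICAL_HOURLY_RATE)
always_on = 2 * instance_cost(7 * 24 * 3600, HYPOTHETICAL_HOURLY_RATE)
print(f"scheduled: {scheduled:.2f}, always on: {always_on:.2f}")
```

The comparison covers only the compute charge for the added replicas; storage, I/O, and backup charges are outside this sketch.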

Solution overview

You can use the following methods to scale your Amazon DocumentDB cluster based on a schedule:

Scheduled scaling with multiple replicas – With this approach, you can smoothly handle high read traffic by scaling out your Amazon DocumentDB cluster during peak times with a predefined number of instances and scaling in to remove the additional capacity during off-peak hours to save costs. As an example use case, consider that you manage an e-learning platform or a similar platform where your read traffic peaks during school or business hours and drops significantly during nonbusiness hours and weekends. In such scenarios, you need to scale out your cluster just before the start of business hours and scale in after business hours to save costs.
Targeted replica – This approach adds a dedicated replica instance with a fixed name on a schedule for running analytical queries and removes it later. This approach confines the analytics or asymmetric workloads to the specific replica instance. For example, your organization may need to run complex analytical reports on Amazon DocumentDB data every Friday that can scan a large amount of data with the least performance overhead on the cluster serving your live customer traffic.

To implement these approaches, you can use AWS Lambda functions from the lambda-samples GitHub repo. To run the Lambda functions on a schedule, we use Amazon EventBridge rules. The following diagram illustrates the solution architecture.

In the following sections, we walk through the steps to deploy the Lambda functions and create the EventBridge rules.

Prerequisites

To implement the solution, you need to have the following resources set up:

An Amazon DocumentDB cluster. You can use an existing Amazon DocumentDB cluster or create a new cluster.
An Amazon Elastic Compute Cloud (Amazon EC2) instance or a local desktop with Python installed.
Optionally, you can clone the aws-samples GitHub repository.
An AWS Identity and Access Management (IAM) role with permissions to query, add, and remove Amazon DocumentDB instances in your cluster. You can use an existing role with these privileges or create a new role with the following permissions. As a best practice, when creating IAM roles, we recommend that you follow the principle of least privilege.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "rds:DeleteDBInstance",
        "rds:CreateDBInstance",
        "rds:ListTagsForResource",
        "rds:AddTagsToResource",
        "rds:DescribeDBClusters"
      ],
      "Resource": "<Amazon DocumentDB resource ARN>"
    }
  ]
}

Deploy Lambda functions

To schedule scaling with multiple replicas, we use the docdb-scaleOnSchedule.py script and for targeted replicas, we use the docdb-add-delete-targetedReplica.py script. You may choose to deploy either of these functions or both based on your requirements.

The steps for deploying both Lambda functions are the same, except for the environment variables that are specific to each function.

Complete the following steps to deploy the Lambda function for scheduled scaling with replicas:

On the Lambda console, choose Functions in the navigation pane.
Choose Create function.
Select the option Author from scratch.
For Function name, enter a valid function name, such as scheduled-scaling-docdb.
For Permissions, expand Change default execution role, choose Use existing role, and choose the IAM role that matches the requirements stated in the prerequisites.
Choose Create function.
Copy the script docdb-scaleOnSchedule.py in raw format, paste it into the Code source editor, and choose Deploy.
On the Configuration tab, under General configuration, choose Edit.
Change Timeout to 30 seconds and choose Save.
Under Environment variables, choose Edit.
Choose Add environment variable and add the variables in the following table (these are for the scheduled scaling Lambda: docdb-scaleOnSchedule.py).

Key                    Value
CLUSTER_IDENTIFIER     Your Amazon DocumentDB cluster identifier.
INSTANCE_CLASS         Your database instance class. For more information about supported classes, see Managing Instance Classes.
INSTANCES_TO_ADD       The number of instances to add to your cluster at a time.
INSTANCES_TO_DELETE    The number of instances to remove from your cluster at a time.

For more information, see the Readme.

Choose Save.
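To show how these environment variables could drive the scaling logic, here is a minimal sketch of a handler. It is not the docdb-scaleOnSchedule.py script from the repository. The instance naming scheme (the "-sched-" marker) and the function names are assumptions made for illustration, and the repository script may track the instances it adds differently, for example with tags. The client calls are the boto3 DocumentDB operations create_db_instance, delete_db_instance, and describe_db_clusters.

```python
import os
import uuid

SCHEDULED_MARKER = "-sched-"  # hypothetical marker identifying instances added on a schedule


def scale_cluster(event, client, env=os.environ):
    """Add or remove replicas depending on event["Action"] ("Add" or "Delete")."""
    cluster_id = env["CLUSTER_IDENTIFIER"]
    prefix = f"{cluster_id}{SCHEDULED_MARKER}"
    action = event.get("Action")

    if action == "Add":
        created = []
        for _ in range(int(env["INSTANCES_TO_ADD"])):
            # Hypothetical naming scheme: cluster identifier, marker, short random suffix.
            name = f"{prefix}{uuid.uuid4().hex[:8]}"
            client.create_db_instance(
                DBInstanceIdentifier=name,
                DBInstanceClass=env["INSTANCE_CLASS"],
                Engine="docdb",
                DBClusterIdentifier=cluster_id,
            )
            created.append(name)
        return {"added": created}

    if action == "Delete":
        cluster = client.describe_db_clusters(DBClusterIdentifier=cluster_id)["DBClusters"][0]
        # Only replicas created by this sketch are candidates; the writer is never removed.
        candidates = [
            member["DBInstanceIdentifier"]
            for member in cluster["DBClusterMembers"]
            if not member["IsClusterWriter"]
            and member["DBInstanceIdentifier"].startswith(prefix)
        ]
        to_delete = candidates[: int(env["INSTANCES_TO_DELETE"])]
        for name in to_delete:
            client.delete_db_instance(DBInstanceIdentifier=name)
        return {"deleted": to_delete}

    raise ValueError(f"Unsupported Action: {action!r}")


def lambda_handler(event, context):
    import boto3  # imported lazily so scale_cluster can be exercised with a stub client

    return scale_cluster(event, boto3.client("docdb"))
```

Passing the client in as a parameter keeps the decision logic separate from AWS, so you can try it against a stub before deploying.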

To create the Lambda function for the targeted read replica, complete the preceding steps with the following changes:

Provide a different name for the function
Use the script docdb-add-delete-targetedReplica.py as the source code
Add the environment variables in the following table

Key               Value
CLUSTER_ID        Your Amazon DocumentDB cluster identifier.
INSTANCE_CLASS    Your database instance class. For more information about supported classes, see Managing Instance Classes.
INSTANCE_NAME     The name of the instance to be added and removed.

For more information, see the Readme.
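The targeted replica logic can be sketched in a similar way. This is an illustrative outline, not the docdb-add-delete-targetedReplica.py script itself. The function name and return strings are assumptions, and the checks that make repeated invocations harmless (skipping an instance that already exists, refusing to delete the writer) are design choices of this sketch.

```python
import os


def manage_targeted_replica(event, client, env=os.environ):
    """Add or delete the fixed-name replica depending on event["Action"]."""
    cluster_id = env["CLUSTER_ID"]
    instance_name = env["INSTANCE_NAME"]

    cluster = client.describe_db_clusters(DBClusterIdentifier=cluster_id)["DBClusters"][0]
    # Map each member's identifier to whether it is currently the writer.
    members = {
        member["DBInstanceIdentifier"]: member["IsClusterWriter"]
        for member in cluster["DBClusterMembers"]
    }

    action = event.get("Action")
    if action == "Add":
        if instance_name in members:
            return "already present"
        client.create_db_instance(
            DBInstanceIdentifier=instance_name,
            DBInstanceClass=env["INSTANCE_CLASS"],
            Engine="docdb",
            DBClusterIdentifier=cluster_id,
        )
        return "creating"

    if action == "Delete":
        if instance_name not in members:
            return "not found"
        if members[instance_name]:
            # The instance became the writer (for example after a failover); leave it alone.
            return "skipped: instance is the writer"
        client.delete_db_instance(DBInstanceIdentifier=instance_name)
        return "deleting"

    raise ValueError(f"Unsupported Action: {action!r}")


def lambda_handler(event, context):
    import boto3  # imported lazily so the logic above can be tested with a stub client

    return manage_targeted_replica(event, boto3.client("docdb"))
```

Because the instance name is fixed, analytical jobs can connect to that instance's own endpoint instead of the cluster or reader endpoint, which keeps the heavy queries off the replicas serving live traffic.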

Next, you create a schedule to invoke these Lambda functions using the EventBridge rules.

Create EventBridge rules

You need to create two EventBridge rules for each Lambda function: one rule for adding the instances and the other for deleting the instances.

Complete the following steps to create an EventBridge rule that adds instances every weekday, Monday through Friday:

On the EventBridge console, choose Rules in the navigation pane.
Choose Create rule.
On the Define rule detail page, enter a name (for example, scale-docdb-mon-fri) and optional description.
Select Rule type: Schedule and choose Continue to create rule.
For the scheduled pattern, leave the default (fine-grained schedule).
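For Cron expression, enter the expression that applies to your workload. For example, the expression to run the function every weekday at 8:00 AM is cron(0 8 ? * 2-6 *), where 2-6 in the day-of-week field means Monday through Friday. EventBridge rules evaluate cron expressions in UTC.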
Choose Next.

If your expression is correct, you will see the next dates to invoke the function. For more information, see Cron expressions.

On the Select targets page, choose AWS Service (default) and choose Lambda Function from the dropdown.
For the function, choose the Lambda function you created earlier.
Expand Additional settings and in the Configure target input section, choose Constant (JSON text) and enter {"Action": "Add"} as input to add instances.
Choose Next.
If you want to add tags to the EventBridge rule, add them on the Configure tags page.
Choose Next.
On the Review and create page, review the values and choose Create rule.
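If the cron fields in the preceding steps are unfamiliar, the following sketch splits an EventBridge cron expression into its six fields (minutes, hours, day of month, month, day of week, year) and expands a numeric day-of-week range into day names, where 1 is Sunday. The helper names are hypothetical, and the parser handles only the simple forms used in this post, not the full cron syntax.

```python
FIELDS = ("minutes", "hours", "day_of_month", "month", "day_of_week", "year")
DAY_NAMES = {1: "SUN", 2: "MON", 3: "TUE", 4: "WED", 5: "THU", 6: "FRI", 7: "SAT"}


def parse_schedule(expression: str) -> dict:
    """Split an expression such as "cron(0 8 ? * 2-6 *)" into named fields."""
    expression = expression.strip()
    if not (expression.startswith("cron(") and expression.endswith(")")):
        raise ValueError("expected an expression of the form cron(...)")
    parts = expression[len("cron("):-1].split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    fields = dict(zip(FIELDS, parts))
    # Day-of-month and day-of-week cannot both carry a value; one of them must be "?".
    if "?" not in (fields["day_of_month"], fields["day_of_week"]):
        raise ValueError("use '?' in either day-of-month or day-of-week")
    return fields


def weekday_names(day_of_week: str) -> list:
    """Expand a single number or a numeric range such as "2-6" into day names."""
    start, _, end = day_of_week.partition("-")
    end = end or start
    return [DAY_NAMES[day] for day in range(int(start), int(end) + 1)]


fields = parse_schedule("cron(0 8 ? * 2-6 *)")
print(fields["hours"], weekday_names(fields["day_of_week"]))
```

The console preview of upcoming invocation dates remains the authoritative check; this sketch only helps you read an expression before you enter it.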

To create an EventBridge rule to delete the instances, repeat these steps but provide a different name for the rule and enter {"Action": "Delete"} for Constant (JSON text).
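If you prefer to script the rule pair rather than use the console, the following sketch shows one possible shape using the EventBridge put_rule and put_targets operations. The rule name prefix, the target ID, and the function names are assumptions for illustration, and the client is passed in so the logic can be tried with a stub.

```python
import json


def create_schedule_rule(events_client, rule_name, schedule_expression, function_arn, action):
    """Create one scheduled rule that invokes the function with a constant JSON input."""
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule_expression,
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[
            {
                "Id": "scaling-lambda",  # hypothetical target ID
                "Arn": function_arn,
                # Equivalent to the Constant (JSON text) input in the console.
                "Input": json.dumps({"Action": action}),
            }
        ],
    )
    return rule["RuleArn"]


def create_scaling_rules(events_client, function_arn, add_cron, delete_cron,
                         prefix="scale-docdb"):
    """Create the Add and Delete rules for one Lambda function; returns the rule ARNs."""
    return {
        "Add": create_schedule_rule(
            events_client, f"{prefix}-add", add_cron, function_arn, "Add"
        ),
        "Delete": create_schedule_rule(
            events_client, f"{prefix}-delete", delete_cron, function_arn, "Delete"
        ),
    }
```

With boto3, you would pass boto3.client("events") as events_client. When you create rules outside the console, you also need to grant EventBridge permission to invoke the function, for example with the Lambda add_permission operation. The console typically configures this for you.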

Clean up

If you created a new Amazon DocumentDB cluster and are no longer using it, you can stop the cluster or delete the cluster. You can also delete the Lambda functions and EventBridge rules associated with the cluster. If you created a new IAM role, you can delete the role if you’re not using that role elsewhere.

Summary

In this post, we showed you a cost-effective way to schedule scaling activities on an Amazon DocumentDB cluster using Lambda and EventBridge to handle anticipated read traffic patterns.

For more information about recent launches and blog posts, see Amazon DocumentDB (with MongoDB compatibility) resources.

About the Author

Kaarthiik Thota is a Senior DocumentDB Specialist Solutions Architect at AWS. He is passionate about database technologies and enjoys helping customers solve problems and modernize applications using NoSQL databases. Before joining AWS, he worked extensively with relational databases, NoSQL databases, and business intelligence (BI) technologies for over 15 years.
