Understand and build a hybrid database with Amazon RDS and AWS Outposts

By mullaned2002

August 4, 2022

Many customers are faced with the challenge of building and operating a hybrid infrastructure to support workloads that must run both in the cloud and on premises. In many cases, these hybrid workloads rely on a relational database to support the workload, which can be particularly challenging to build and support across a hybrid infrastructure. In this post, we cover the architecture and setup of a database spanning from an AWS Region to AWS Outposts using Amazon Relational Database Service (Amazon RDS) and Amazon Elastic Compute Cloud (Amazon EC2).

At AWS, we think of hybrid infrastructure as including the Region along with on-premises data centers and edge nodes like Outposts. AWS Outposts brings AWS infrastructure and services to virtually any data center, co-location space, or on-premises facility, in the form of one or more physical racks connected to the AWS global network. A subset of native AWS services run on premises on the Outpost, and you can connect to the full range of AWS services available in your Region to support your workload.

Relational databases are a common component of a hybrid workload. Amazon RDS is a managed service that makes it easy to set up, operate, and scale a relational database in the cloud or on Outposts. When running Amazon RDS in the Region, you can choose between seven different database engines to support your workloads. When running Amazon RDS on Outposts, you can choose from MySQL, PostgreSQL, and SQL Server.

Amazon RDS can operate both in the Region and on Outposts to support hybrid workloads. In this post, we use the native binary logging and replication of MySQL to replicate from an Amazon RDS for MySQL instance hosted in a Region to an EC2 instance on the Outpost that acts as a read replica. Although you could run a standalone VM in your data center to host the replica, using Amazon EC2 on Outposts provides the same reliable, secure, and high-performance compute experience across the hybrid workload. Additionally, because Amazon EC2 on Outposts uses the same APIs as services in the Region, you can use the same tools for deployment, security, and automation across the entire workload, leading to greater operational consistency than a self-managed VM option.

In this post, we show you how to solve a common hybrid infrastructure use case seen by customers supporting latency-sensitive applications. In this use case, a customer is hosting a workload in the AWS Region but needs to host an instance of the application in their data center to provide low-latency access for local data processing. The primary database supporting the application in the Region is hosted on Amazon RDS for MySQL, but the workload requires a local copy of the database due to latency requirements. Amazon RDS currently doesn’t support managed read replicas on Outposts. However, we can solve this challenge by hosting a self-managed read replica on Outposts using Amazon EC2 and the native binary transaction logging of MySQL.

Using an EC2-hosted instance of a database on an Outpost can be the foundation for additional use cases customers may face, such as a one-time migration from the Region to an Outpost or a disaster recovery location for the Region in a customer data center.

Although we review the key components of a hybrid architecture, this post assumes you are familiar with Outposts, Amazon RDS, and basic setup and configuration of MySQL replication.

Solution overview

The following diagram shows the architecture of our hybrid infrastructure. With this architecture, an RDS for MySQL instance is deployed in a Region. Although Amazon RDS for MySQL does support self-managed replicas using native binary logging of transactions from the primary, we also configure this instance with a managed read replica in the Region. We use a fully managed read replica alongside the RDS primary instance because, with a few clicks, this configuration automatically sets up the primary to serve as a replication source for our read-only replica on Outposts later on. Be aware there is an increased cost when operating both RDS instances. You can instead configure the RDS primary instance for replication yourself, but I have chosen to use the managed read replica in this post for simplicity.

To provide a host and storage for our read replica on the Outpost, we use an EC2 instance. Outposts supports a variety of EC2 instance types, which you can choose when configuring and ordering the Outpost. Because an Outpost has finite resources, it’s important to plan for capacity and performance when selecting both the EC2 instance type and the amount of Amazon Elastic Block Store (Amazon EBS) storage.

Even though this is a hybrid infrastructure, you can perform all the setup and configuration through the AWS Management Console, AWS Command Line Interface (AWS CLI), API, or infrastructure as code tools like AWS CloudFormation or Terraform. This means you can develop one time and deploy in the Region or on premises without having to rewrite your application or manage different sets of tools or processes.

In the configuration process that follows, four main areas comprise the solution:

Create networking components – In this section, I go over and review the required networking components and any Outposts-specific items to call out during setup
Set up the RDS for MySQL database and read replica – Here, I discuss the setup of our RDS database and managed read replica in the Region
Set up the EC2 read replica – I cover the Amazon EC2 specifics and identify any supporting components needed, such as security groups
Set up replication and validate the solution – In the last section, I outline the steps to back up and restore the MySQL database to the EC2 instance to support replication from the Region

Prerequisites

This post assumes that you are familiar with the following:

Setting up MySQL 8 Community Server on Linux
Configuring MySQL replication
AWS Outposts
Amazon RDS

Create networking components

The networking components are the foundation our hybrid infrastructure solution is built on, and there are a few key items to be aware of when extending them to the Outpost.

During the initial Outposts setup, a connection called a service link is established to the Region. A service link is an encrypted set of VPN connections that are used whenever the Outpost communicates with your chosen home Region. You can establish the service link using either AWS Direct Connect or over the public internet. In this example, we use Direct Connect as the service link connection.

An Outpost is homed to an Availability Zone in the Region over the service link. You can think of the Outpost as an extension of that Availability Zone. It’s important to choose the right Availability Zone when assigning resources, because some reside on the Outpost and some reside in the Region. In the preceding example, our workloads use Availability Zones A and B (AZa and AZb) in the Region, and the Outpost is homed to Availability Zone C (AZc). This paradigm of the Outpost becoming an extension of an Availability Zone persists regardless of the number of Availability Zones in a Region. Although some AWS Regions have more than three Availability Zones, we use three in this example for simplicity.

Other resources later in the post must be deployed to a specific Availability Zone to ensure they are deployed to the correct destination, the Region or the Outpost.

To create our network, we first use the Amazon Virtual Private Cloud (Amazon VPC) console to create a VPC. Use the Region that serves as the home base for your Outpost.

On the Amazon VPC console, choose Your VPCs in the navigation pane.
Choose Create VPC.
For Name tag, enter a name for the VPC.
For IPv4 CIDR, choose a CIDR block and network mask.
When finished, choose Create VPC.

Next, let’s create two subnets in the Region.

On the Amazon VPC console, choose Subnets in the navigation pane.
Choose Create subnet.
For VPC ID, choose the VPC we just created.
For Subnet name, enter a name for the subnet.
For Availability Zone, choose an Availability Zone in the Region.
For IPv4 CIDR block, enter a /24 range within the VPC CIDR created earlier.
When finished, choose Create subnet.
Repeat these steps to create a second subnet.

The procedure to create the subnet on the Outpost is almost the same as for the Region, except that you create it from the AWS Outposts console instead of the Amazon VPC console.

On the AWS Outposts console, select the Outpost.
On the Actions menu, choose Create subnet.
Follow the steps from the previous subnet creation procedure.
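
Before creating the three subnets, it can help to lay out their /24 ranges up front so that none overlap. The following is a minimal Python sketch using only the standard library; the VPC CIDR and the placement names are hypothetical and stand in for whatever you chose in the console.

```python
import ipaddress

# Hypothetical VPC CIDR; substitute the block you chose when creating the VPC.
VPC_CIDR = "10.0.0.0/16"


def plan_subnets(vpc_cidr, placements):
    """Assign consecutive /24 blocks from the VPC CIDR to each placement name."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=24)  # generator of /24 networks, in order
    return {name: str(next(blocks)) for name in placements}


# Two subnets in the Region (AZa, AZb) and one on the Outpost (homed to AZc).
plan = plan_subnets(VPC_CIDR, ["region-aza", "region-azb", "outpost-azc"])
for name, cidr in plan.items():
    print(f"{name}: {cidr}")
```

Each range printed here is what you would enter for IPv4 CIDR block in the corresponding subnet creation step.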

Depending on your workload needs, you may need to create supporting components such as an internet gateway, routes, or a NAT gateway to allow your hosts access for things like OS package updates.

Now that our network is created, we create the security groups that allow things like SSH access or allow our read replica on the Outpost to communicate over TCP port 3306 to the RDS instances in the Region.

The access required depends on the specific workload, but in all cases, we need a security group to allow inbound TCP port 3306 between the RDS instance and the EC2 instance. For more information, see Controlling access with security groups.
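
As an illustration of the port 3306 rule, the following Python sketch builds the request you might pass to the EC2 API to allow the replica's security group to reach the RDS instances. The security group IDs are hypothetical placeholders, and the sketch only constructs and prints the request. Sending it is left as a commented hint that assumes boto3 is installed and credentials are configured.

```python
MYSQL_PORT = 3306


def mysql_ingress_rule(source_group_id):
    """Build one ingress permission allowing TCP 3306 from another security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": MYSQL_PORT,
        "ToPort": MYSQL_PORT,
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


# Hypothetical security group IDs for the RDS instances and the EC2 replica.
rds_sg = "sg-0123456789abcdef0"
ec2_sg = "sg-0fedcba9876543210"

# Ingress is added to the RDS group, with the EC2 replica's group as the source.
request = {"GroupId": rds_sg, "IpPermissions": [mysql_ingress_rule(ec2_sg)]}
print(request)

# Assumption: with boto3 available, the request could be sent with
#   boto3.client("ec2").authorize_security_group_ingress(**request)
```

Referencing the source by security group rather than by IP range keeps the rule valid if the replica's private address changes.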

With the networking and security group resources in place, the hybrid infrastructure should look like the following diagram.

Set up the RDS for MySQL database and read replica

For the database, we use Amazon RDS for MySQL. Deploying the database in the Region is the same even though we’re in a hybrid infrastructure, but I call out some key steps in this section.

On the Amazon RDS console, choose Create database.
Because Amazon RDS can run in the Region or on the Outpost, it’s important to remember to select AWS Cloud for Database location options.

Select MySQL as the database engine type.

For this post, we choose MySQL 8.0.23 for Version.

Choose the VPC we created earlier that is extended to the Outpost, vpc-rds-hybrid-01.

Choose the subnet group as well as the VPC security group to allow the EC2 instance to talk to the RDS instance over port 3306.
For Availability Zone, select an Availability Zone in the Region.
You can configure the remaining settings to meet the use case of your workload, but we recommend following AWS security best practices such as enabling encryption in transit and at rest as well as picking a strong password for the database.
Review your settings and choose Create Database.
When the RDS instance in the Region is ready, we can create a managed read replica.

We can use the read replica in the Region to scale read traffic requests to the primary database. It also has the added benefit of automatically configuring the RDS database as the primary database for replication.

With the RDS for MySQL primary instance and the managed read replica in place, the infrastructure should look like the following diagram.

Set up the EC2 read replica

To support workloads that need to operate on premises with a read-only relational database, we use an EC2 instance backed by an EBS volume on the Outpost running the Amazon Linux 2 operating system. Launching an EC2 instance with Amazon EBS storage on an Outpost is like launching in the Region, with a few exceptions.

Due to the elasticity of the AWS Cloud, size and capacity are rarely a concern in the Region, but an Outpost has a finite amount of resources like compute and storage. Consider the overall capacity of the Outpost and the amount of resources you anticipate the read replica using, such as CPU, memory, and storage.
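
A simple way to reason about this is to compare the replica's expected footprint against what remains on the Outpost. The following Python sketch does only that arithmetic. All numbers are hypothetical, and the Capacity type is introduced here purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Capacity:
    vcpus: int
    memory_gib: int
    storage_gib: int


def fits(available, requested):
    """Return True when every requested resource is within what remains."""
    return (
        requested.vcpus <= available.vcpus
        and requested.memory_gib <= available.memory_gib
        and requested.storage_gib <= available.storage_gib
    )


# Hypothetical numbers: remaining Outpost capacity and the replica's footprint.
remaining = Capacity(vcpus=16, memory_gib=64, storage_gib=500)
replica = Capacity(vcpus=4, memory_gib=16, storage_gib=200)

print("replica fits:", fits(remaining, replica))
```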

Note the following steps when setting up the read replica:

To ensure the EC2 instance is launched on the Outpost and not the Region, make sure to use the VPC created earlier and, most importantly, the subnet associated with the Availability Zone of the Outpost. In this example, the subnet associated with the Outpost is in AZc.

Add an EBS volume to host the database.

SSH access to the EC2 instance is required for the configuration steps to set up replication.

If one doesn’t exist already, create a security group for the EC2 instance that allows SSH access to the host following security best practices.
After you review all the configuration settings, choose Launch to launch the EC2 instance.
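
The launch settings above can also be expressed as an API request. The following Python sketch builds the keyword arguments for an EC2 launch and highlights that the subnet is what places the instance on the Outpost. The AMI ID, instance type, subnet ID, security group ID, device name, and volume size are all hypothetical, and the instance type must be one that is actually provisioned on your Outpost.

```python
def replica_launch_request(image_id, instance_type, outpost_subnet_id,
                           security_group_id, data_volume_gib):
    """Build keyword arguments for launching the replica host in the Outpost subnet."""
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        # The subnet decides placement: the Outpost subnet puts the instance on the Outpost.
        "SubnetId": outpost_subnet_id,
        "SecurityGroupIds": [security_group_id],
        # Extra EBS volume to host the database files.
        "BlockDeviceMappings": [
            {
                "DeviceName": "/dev/sdf",
                "Ebs": {"VolumeSize": data_volume_gib, "DeleteOnTermination": True},
            }
        ],
    }


# Hypothetical identifiers; replace them with values from your own account.
launch = replica_launch_request(
    image_id="ami-0123456789abcdef0",
    instance_type="m5.xlarge",
    outpost_subnet_id="subnet-0123456789abcdef0",
    security_group_id="sg-0fedcba9876543210",
    data_volume_gib=200,
)
print(launch)

# Assumption: with boto3 available, this could be sent with
#   boto3.client("ec2").run_instances(**launch)
```

Setting DeleteOnTermination on the data volume is a choice made for this walkthrough so that cleanup removes the volume with the instance. For a long-lived replica you may prefer to keep the volume.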

When the EC2 instance has finished initializing, the architecture for the hybrid infrastructure should look like the following diagram.

Set up replication and validate the solution

After configuring the previous items, you can now configure replication between the RDS primary instance in the Region and the EC2 instance on the Outpost.

This post assumes that you have previous experience setting up MySQL 8 Community Server on a Linux host as well as configuring replication. For this final step, I provide an overview of the process to deploy MySQL and enable replication. For a more detailed setup guide, refer to How can I use binary logs from an Amazon RDS for MySQL active DB instance to replicate to an on-premises standby instance?

Begin by installing MySQL 8 Community Edition.

Establish an SSH session to the EC2 read replica instance on the Outpost.
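Install MySQL 8 Community Edition and start the service. The package name below assumes the MySQL Yum repository is already configured on the instance:

sudo yum install -y mysql-community-server
sudo systemctl start mysqld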

When MySQL is running on the EC2 instance, log in to MySQL and create a database called mysqlreplicationtest:

mysql> create database mysqlreplicationtest;

From the EC2 instance, log in to the RDS managed read replica instance and stop replication from the primary using the mysql command line call:

mysql> call mysql.rds_stop_replication;

Verify replication has stopped and record the Relay_Master_Log_File and the Exec_Master_Log_Pos to be used later to configure replication:

mysql> show slave status\G

Disconnect from the RDS managed read replica and use mysqldump to create a backup of the database to be restored on the EC2 instance:

mysqldump -h <rds-hostname> -u <rdsuser> -p mysqlreplicationtest > backup_file_name.sql

When the backup is complete, restore the backup to the newly created database called mysqlreplicationtest on the EC2 read replica instance:

mysql -h localhost -u root -p mysqlreplicationtest < backup_file_name.sql

Stop MySQL on the EC2 instance and edit the my.cnf file to set a unique server ID, such as server_id=2, and the name of the database to replicate, in this case replicate-do-db=mysqlreplicationtest.
Save the file and restart MySQL on the EC2 read replica instance:

systemctl restart mysqld
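
For reference, the two my.cnf settings named above belong in the [mysqld] section. The following Python sketch renders that fragment with the standard library so you can see the expected shape. The function name is hypothetical, and the output is meant to be merged into your existing my.cnf rather than to replace it.

```python
import configparser
import io


def replica_cnf(server_id, database):
    """Render a [mysqld] section with the replica's server ID and replicated database."""
    parser = configparser.ConfigParser()
    parser["mysqld"] = {
        "server_id": str(server_id),
        "replicate-do-db": database,
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


fragment = replica_cnf(2, "mysqlreplicationtest")
print(fragment)
```

The server ID only needs to differ from the IDs of the RDS primary and the managed read replica.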

From the EC2 read replica, log in to the RDS primary instance and create a replication user and grant the necessary privileges to the user:

mysql> create user repl_user@'%' identified by 'xxx';
mysql> grant replication slave, replication client, replication_slave_admin on *.* to repl_user@'%';
mysql> show grants for repl_user@'%';

On the EC2 read replica instance, log in to the local MySQL server and point it at the active RDS DB instance, using the log file and position recorded earlier as the replication parameters:

mysql> change master to master_host='<rds-hostname>', master_user='repl_user', master_password='xxx', master_log_file='mysql-bin-changelog.000xxx', master_log_pos=xxxx;

Start replication from the EC2 read replica and verify replication is running:

mysql> start slave;
mysql> show slave status\G
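
The log file and position recorded from the managed read replica feed directly into the change master statement, and copying them by hand is a common source of mistakes. The following Python sketch parses a saved copy of the vertical status output and builds the statement. The sample output, file name, and position are hypothetical, and the password is deliberately left as a placeholder to fill in at the MySQL prompt.

```python
# Hypothetical excerpt of status output saved from the managed read replica.
SAMPLE = """
*************************** 1. row ***************************
        Relay_Master_Log_File: mysql-bin-changelog.000123
          Exec_Master_Log_Pos: 4567
"""


def parse_status(text):
    """Turn vertical SHOW SLAVE STATUS output into a dict of field name to value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # row separator and blank lines contain no colon and are skipped
            fields[key.strip()] = value.strip()
    return fields


def change_master_sql(host, user, log_file, log_pos):
    """Build the CHANGE MASTER TO statement; the password stays a placeholder."""
    return (
        f"CHANGE MASTER TO MASTER_HOST='{host}', MASTER_USER='{user}', "
        f"MASTER_PASSWORD='<password>', MASTER_LOG_FILE='{log_file}', "
        f"MASTER_LOG_POS={int(log_pos)};"
    )


status = parse_status(SAMPLE)
statement = change_master_sql(
    "<rds-hostname>",
    "repl_user",
    status["Relay_Master_Log_File"],
    status["Exec_Master_Log_Pos"],
)
print(statement)
```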

From here, you should be able to create new tables and data in the mysqlreplicationtest database on the RDS primary instance and see them replicated over to the EC2 instance on the Outpost. Our hybrid infrastructure is now ready to support a read-only workload on premises.
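
When validating, the status fields worth checking are Slave_IO_Running, Slave_SQL_Running, and Seconds_Behind_Master. The following Python sketch shows one way to evaluate them once the status row is available as a dictionary. The lag threshold and the sample rows are hypothetical.

```python
def replication_healthy(status, max_lag_seconds=60):
    """Return True when both replication threads run and lag is within the threshold."""
    io_ok = status.get("Slave_IO_Running") == "Yes"
    sql_ok = status.get("Slave_SQL_Running") == "Yes"
    lag = status.get("Seconds_Behind_Master")
    # A missing or NULL lag value means the replica cannot report its position.
    lag_ok = lag not in (None, "NULL") and int(lag) <= max_lag_seconds
    return io_ok and sql_ok and lag_ok


# Hypothetical status rows for a working replica and a stalled one.
working = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes", "Seconds_Behind_Master": "0"}
stalled = {"Slave_IO_Running": "Connecting", "Slave_SQL_Running": "Yes", "Seconds_Behind_Master": "NULL"}

print("working:", replication_healthy(working))
print("stalled:", replication_healthy(stalled))
```

A Slave_IO_Running value other than Yes often points back to the security group or routing between the Outpost subnet and the RDS instance.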

Clean up

This post is intended as a guide to help you build a hybrid infrastructure to support a database that spans the Region and an Outpost. If you followed along with this post, make sure you clean up your resources to prevent unexpected costs.

Stop and terminate the EC2 instance on the Outpost, ensuring the deletion of the EBS volume.
Delete the Amazon RDS-managed read replica in the Region.
Delete the RDS for MySQL primary instance in the Region.
Delete the VPC and related network components you may have created.

Conclusion

In this post, we discussed what a hybrid infrastructure looks like using Outposts and a Region. I explained that some workloads need to operate across this hybrid infrastructure, and that having a database to support them in both places can be a key requirement driven by a variety of factors. Amazon RDS supports a variety of ways to deploy replicas, either managed in the Region or using native replication functions to deploy outside the Region on Amazon EC2.

With the examples outlined in this post, you should now have a good understanding of how to deploy an Amazon EC2-based read replica on an Outpost to support an on-premises workload.

To learn more, see the Outposts product page and Working with Amazon RDS on AWS Outposts in the Amazon RDS User Guide. How will you use the information here to build your hybrid infrastructure? Please send us feedback either in the AWS forum for Amazon Outposts or through your AWS support contacts.

About the Author

Doug Hairfield is a Senior Solutions Architect in the WWPS Federal Partner Solutions Architecture team at Amazon Web Services. He is passionate about helping customers build and architect solutions on AWS, especially around hybrid environments and edge computing. Outside of work, he enjoys spending time with family, playing guitar, and open water distance swimming.
