MySQL is one of the most popular open-source relational databases in the world. Many customers choose to run their MySQL workload in Amazon Relational Database Service (Amazon RDS) for MySQL because it takes care of heavy lifting tasks such as hardware provisioning, database setup, patching, configuration, monitoring, backups, and scaling.
Amazon Aurora is a great option for any enterprise application that can use a relational database. Aurora is a MySQL and PostgreSQL-compatible relational database built for the cloud. If you want a modern relational database with full MySQL compatibility that provides high availability and performance at global scale, you can migrate your MySQL database to Amazon Aurora MySQL-Compatible Edition. With Aurora, you can have up to 15 low-latency read replicas for each cluster. It provides enterprise-grade advanced features like fast database clones, auto scaling replicas, and backtracking, to name a few. In addition to those features, Aurora database cluster storage spans across three Availability Zones in a Region, providing out-of-the-box durability and fault tolerance to your data across physical data centers. With Amazon Aurora Global Database, you can extend this across Regions by adding secondary database clusters in different Regions. It replicates your data with negligible impact on database performance, enables fast local reads with low latency in each Region, and provides disaster recovery from Region-wide outages.
Migrations are among the most time-consuming and critical tasks handled by DBAs. Large-scale database migrations require adequate planning and implementation to make sure that the workload being migrated performs the same way or better. There are two common ways to migrate a database: physical and logical. Physical migration allows you to copy the physical database files to migrate the database. Logical migration allows you to create a logical backup of your database by generating SQL statements in text or SQL files and executing them on your target database to generate the same schema and data structure. Amazon RDS for MySQL and Aurora MySQL support both physical and logical migration using Percona XtraBackup and native MySQL-compatible tools.
In this post, we discuss different options to migrate data from a self-managed MySQL database to AWS-managed Amazon RDS for MySQL or Aurora MySQL.
Migration options from a self-managed MySQL database
Given the popularity of MySQL databases, you can choose from a wide range of options to migrate data from one MySQL-compatible database to another. Choosing the right migration strategy based on ease of use, data size, and downtime requirements is critical for a successful migration. Let’s look at the migration options available in more detail.
Percona XtraBackup allows you to perform consistent backups on running databases without affecting your application. It works by copying the physical InnoDB data files from the source and restoring them on the target. Because there is no logical data movement involved (selecting data and inserting it again), physical migration using Percona XtraBackup is more performant than logical migration. You can use this method for data of any size.
Amazon RDS for MySQL supports migration from Percona XtraBackup files stored in an Amazon Simple Storage Service (Amazon S3) bucket from source MySQL versions 5.6, 5.7, and 8.0. Aurora MySQL supports migration from Percona XtraBackup files that are stored in an S3 bucket from the source MySQL version 5.6 and 5.7. If you’re already using Percona XtraBackup for database backups in the database you’re planning to migrate, you can copy the full and incremental backup files to Amazon S3 and restore them to Amazon RDS for MySQL or Aurora MySQL.
For a MySQL 8.0 migration, you must use Percona XtraBackup 8.0.12 and higher. For MySQL 5.7 and 5.6 migrations, you can also use Percona XtraBackup 2.4 or 8.0.12 and higher versions.
Migrating using Percona XtraBackup involves the following steps:
Use the Percona XtraBackup tool to create a backup of the source MySQL database.
Upload backup files to an S3 bucket.
Restore backup files through the AWS Management Console, AWS Command Line Interface (AWS CLI), or API.
(Optional) For minimal downtime, set up replication between the source and Aurora MySQL cluster using binlog replication.
For detailed step-by-step instructions on using the Percona Xtrabackup method and setting up binary log replication, refer to Migrating data from MySQL by using an Amazon S3 bucket. For more information about limitations, see Limitations and recommendations for importing backup files from Amazon S3 to Amazon RDS.
In this section, we discuss the different tools available for a logical migration.
mysqldump is a popular logical backup tool for MySQL-compatible database engines. With this tool, you can export schema definitions and data from a MySQL server. To import a logical backup to the target MySQL server, you can use the MySQL command line client or the SOURCE SQL command from inside the database. In this section, we discuss two methods to perform a dump and import to the target RDS for MySQL or Aurora MySQL database.
The first method involves exporting data from the existing MySQL-compatible database and importing it directly into RDS for MySQL or Aurora MySQL using mysqldump and mysql commands. This is a one-step migration because it loads the data to the target database without creating any intermediate dump files. This method is simple and straightforward as long as you have a high-speed, low-latency, and stable network connection between the source and the target server. Use the following example command on a host with the MySQL client installed and has network communication to RDS MySQL instance:
The second method involves creating a data dump of an existing MySQL-compatible database using the mysqldump tool and restoring the SQL dump file to an RDS for MySQL or Aurora MySQL DB cluster using the SOURCE SQL command. This approach is ideal for smaller datasets because as the data size increases, the amount of time it takes for the backup and restore increases. Use the following command to create a SQL dump of the database:
The –single-transaction option sets the transaction isolation mode to repeatable read and sends a START TRANSACTION SQL statement to the server before dumping data. This creates a consistent backup by dumping all tables in a single transaction. To avoid the impact on your production database, you can create a read replica and take a backup from the read replica.
After you create the MySQL dump file of your database, import it using the source SQL command after connecting to the target database server:
If you require a near-zero downtime migration, you can use mysqldump to create backup files that not only include the data, but also the binary log position at the time when the consistent backup is taken. To do so, use the parameter –master-data=2. The –master-data parameter with value 2 captures the binary log and its position in the dump file as a part of the CHANGE MASTER TO statement to set up the binary log replication for near-zero downtime migration. You can use the binary log and its position obtained from the dump file to set up replication between a self-managed MySQL source and Amazon RDS for MySQL or Aurora MySQL. For step-by-step instructions on setting up this replication, Replicate between your external database and new Amazon RDS DB instance.
If you’re using a read replica to take a backup, you can stop the replication between the source and read replica. Capture the binary log and its position using SHOW REPLICA STATUS, and then take the backup using mysqldump. After successfully importing the data to Amazon RDS for MySQL or Aurora MySQL, you can set up the binary log replication to the source MySQL database using ‘Relay_Master_Log_File‘ and ‘Exec_Master_Log_Pos‘.
Multi-threaded migration using mydumper and myloader
mydumper and myloader are open-source MySQL export/import tools designed to address performance issues associated with the legacy mysqldump utility. They offer advanced features such as dumping and loading data using parallel threads, creating dump files in a file-per-table fashion, creating chunked dumps in a multiple-files-per-table fashion, dumping data and metadata into separate files for easier parsing and management, configurable transaction size during import, and more.
This approach is suitable for both small and large-scale migrations (data size ranging from hundreds of GBs to TBs) because these tools have flexible configuration options. You can use these tools with both Amazon RDS for MySQL and Aurora MySQL.
Use the following command to create a data dump file of DbName1 and DbName2 databases using mydumper with basic configurations:
For a complete list of parameters and usage, see Mydumper Usage.
Use the following command to restore the backup using myloader with basic configurations:
For a complete list of parameters and usage, see Myloader Usage.
If you have a requirement of near-zero downtime migration, mydumper captures the binary log and its positions before taking the backup and stores them in the metadata file in the target directory where the dump files are stored. Read the metadata file using the cat Linux command, and the results should appear similar to the following output:
You can use the binary log and its position to set up replication between the self-managed MySQL source and Amazon RDS for MySQL or Aurora MySQL.
Flat file migration using files in CSV format
Migrating with flat files is a preferred choice for many database administrators over the SQL format file because flat file dumps are easier to process using command line tools like split or combine. The lack of SQL encapsulation results in smaller dump files and reduces processing overhead during import, and it creates flat files in a file-per-table fashion, which makes it easy to import them in parallel.
The one disadvantage of using the flat file method is that the server uses a single transaction to import data from each dump file. For example, if you’re importing a large dump file and a crash occurs for any reason, it rolls back the transaction, which can be very time-consuming. To have more control over the size of import transactions, you need to manually split very large dump files into smaller files, and then import one file at a time. Files are produced in comma-separated value (CSV) or tab-separated value (TSV) format. This migration method uses a combination of the mysqldump tool and SQL commands.
First, use mysqldump to create a schema-only dump in SQL format, as shown in the following code. This dump describes the structure of schema objects like tables, views, and functions but doesn’t contain data.
Next, create flat file data dumps using one of the following options:
The SELECT … INTO OUTFILE statement, which dumps table contents (but not table structure) into a file located in the server’s local file system.
The mysqldump command with the –tab parameter, which also dumps table contents to a file and creates the relevant metadata files with CREATE TABLE statements. The command uses SELECT … INTO OUTFILE internally, so it also creates dump files on the server’s local file system.
When you create the files using any of these methods, you can import it to the target RDS for MySQL or Aurora MySQL database using one of the following methods:
Transfer all dump files to an Amazon Elastic Compute Cloud (Amazon EC2) client instance located in the same Region and Availability Zone as the target DB cluster. After transferring the dump files, you can import them using the mysql command line client and LOAD DATA LOCAL INFILE commands for SQL format schema dumps and flat file data dumps, respectively.
If the target is Aurora MySQL, you can transfer the flat file data dumps to an S3 bucket and then import files into an Aurora MySQL cluster using LOAD DATA FROM S3 SQL commands. For more information, see Loading data into an Amazon Aurora MySQL DB cluster from text files in an Amazon S3 bucket.
AWS Database Migration Service
AWS Database Migration Service (AWS DMS) is a managed database migration service that helps you quickly and securely migrate a database to AWS. It offers a wide range of options to migrate the data from source to target, such as remapping schema or table names, advanced data filtering, and migrating and replicating multiple database servers into a single Aurora MySQL cluster. With AWS DMS, you can perform simple migrations with near-zero downtime using change data capture (CDC) replication. AWS DMS can be an option if your source is a MySQL-compatible database such as MariaDB and can’t be migrated using any of the methods described previously, such as Percona XtraBackup or native export/import tools. For more information, see Using a MySQL-compatible database as a source for AWS DMS.
Migrate from Amazon RDS for MySQL to Aurora MySQL
To ease the migration burden, Amazon RDS provides two additional options to migrate from Amazon RDS for MySQL to Aurora MySQL. With a few clicks, you can migrate data to Aurora MySQL using the physical migration methods described in this section.
Aurora read replicas
With Aurora read replicas, you can migrate from Amazon RDS for MySQL to Aurora MySQL with near-zero downtime. This method is ideal if you want to migrate to Aurora MySQL in the same account and Region as the RDS instance. You can do this via the console or the AWS CLI. When you create an Aurora read replica of a MySQL DB instance, Amazon RDS creates a DB snapshot of your source MySQL DB instance and migrates the data from the DB snapshot to an Aurora read replica. After that, Amazon RDS for MySQL starts the binary log replication between Amazon RDS for MySQL and the Aurora MySQL DB cluster. When the Aurora MySQL read replica catches up with the source RDS for MySQL instance, stop the application and wait for any inflight transactions to catch up by monitoring the AuroraBinlogReplicaLag Amazon CloudWatch metric until it reaches zero. Point the application to the Aurora DB cluster by replacing the RDS for MySQL instance endpoint with the Aurora cluster endpoint in your application code.
Note that you can only create an Aurora MySQL 3 read replica DB cluster of an Amazon RDS for MySQL 8.0 DB instance (MySQL version 8.0.23 and below).
The following diagram shows the migration process from Amazon RDS for MySQL to Aurora MySQL using the Aurora read replica method:
Amazon RDS snapshot
This option is ideal if you want to migrate to another AWS account or Region. You can migrate either a manual or automated DB snapshot of the RDS for MySQL instance. You can perform this operation through the console or using the AWS CLI or AWS SDK. You can only restore an RDS for MySQL 5.7 snapshot to Aurora MySQL version 2 compatible with MySQL 5.7 or an RDS for MySQL 8.0 (MySQL version 8.0.23 and below) snapshot to Aurora MySQL version 3 compatible with MySQL 8.0. If you need to migrate from Amazon RDS for MySQL 5.7 version to Aurora MySQL 3, you can perform the migration in the following ways:
Migrate Amazon RDS for MySQL 5.7 to Aurora MySQL 2 version using the Aurora read replica method. After you create the read replica DB cluster, and it has caught up with all the changes (make sure the replication is running and the replica lag is zero), take a snapshot of Aurora MySQL 2 and upgrade the snapshot to Aurora MySQL 3.
Upgrade the RDS for MySQL 5.7 snapshot to an RDS for MySQL 8.0 (MySQL version 8.0.23 and below) instance, take a snapshot of the MySQL 8.0 DB instance, and then restore the snapshot to Aurora MySQL version 3.
Manually set up binary log replication for near-zero migration downtime. For details, see Migrating data from a MySQL DB instance to an Amazon Aurora MySQL DB cluster by using a DB snapshot and Replication between Aurora and MySQL or between Aurora and another Aurora DB cluster (binary log replication).
Summary of available migration methods
The following table summarizes the common migration methods that are available to migrate small and large-scale, self-managed MySQL databases efficiently to Amazon RDS for MySQL or Aurora MySQL. For more information, see the Amazon Aurora Migration Handbook.
Suggested migration tool options
Amazon RDS for MySQL or Aurora MySQL
Percona XtraBackup (preferred)
mysqldump or mydumper/myloader
Amazon RDS for MySQL
Aurora read replica (preferred)
Amazon RDS snapshot restore
MySQL compatible (for example, MariaDB)
Amazon RDS for MySQL or Aurora MySQL
Logical migration using native tools (preferred)
Identifying and choosing the right tool for migration can greatly reduce the cost and complexity of database migrations. In this post, we discussed different methods to migrate data from a self-managed MySQL-compatible database to Amazon RDS for MySQL or Aurora MySQL. In the next post, we dive deep into the best practices of migration tools that we covered in this post. Leave your comments and feedback in the comments section.
About the Authors
Vijay Karumajji is a Database Solutions Architect with Amazon Web Services. He works with our customers to provide guidance and technical assistance on database projects, helping them improve the value of their solutions when using AWS.
Read MoreAWS Database Blog