
Migrate Oracle database to Amazon RDS for Oracle over a database link for space savings and reclamation

Amazon Relational Database Service (Amazon RDS) for Oracle is a fully managed commercial database that makes it easy to set up, operate, and scale an Oracle database in the cloud. One of its features is storage auto scaling, which allows the database storage to grow automatically, as needed, up to the maximum capacity (64 TiB) currently supported by Amazon RDS for Oracle. Amazon RDS uses a high-water mark model to determine storage allocation, and you pay only for the allocated storage. The high-water mark is the maximum amount of storage allocated to the instance volume at any time during its existence. Because the storage allocated to an instance can't be decreased, it's best to avoid growing the database unnecessarily.

However, over time, you may engage in data purging or cleanup activities, resulting in less than full utilization of your allocated storage capacity. Following a significant data purge or reduction, you need a strategy to reclaim the storage allocated to the database and eventually reduce costs.
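If you have recently purged data, you can estimate the potential savings by comparing the space used inside the database against the storage currently allocated to the DB instance (visible on the Amazon RDS console). The following queries are a minimal sketch to run on the source database:

-- Space consumed by segments inside the database, in GiB
SELECT round(sum(bytes)/1024/1024/1024, 2) AS used_gib FROM dba_segments;

-- Space allocated to data files inside the database, in GiB
SELECT round(sum(bytes)/1024/1024/1024, 2) AS datafile_gib FROM dba_data_files;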

In this post, we provide a mechanism to reduce the storage footprint during migration of an RDS for Oracle instance using Oracle Data Pump and AWS Database Migration Service (AWS DMS).

Solution overview

The solution uses Oracle Data Pump over a database link to copy the data from a source Oracle database to a target RDS for Oracle instance. For migrations that require minimal downtime, or where using database links isn't feasible, an alternative is to use AWS DMS. With this solution, you don't have to allocate additional storage for Data Pump files when importing into Amazon RDS for Oracle. After the data is copied, applications can use the connection string for the new RDS for Oracle instance. If both the source and target are RDS for Oracle instances, you may also rename the new instance to the original instance name to avoid any changes to the connection string used by the application. Note that the solution requires that the source and target Oracle databases can connect to each other.

You can also use this solution in the following use cases:

Migrating data from an Oracle database located on premises or an Amazon Elastic Compute Cloud (Amazon EC2) instance to or from Amazon RDS for Oracle and Amazon RDS Custom for Oracle without allocating additional storage to hold logical export files.
Performing an out-of-place upgrade of your RDS for Oracle instance with low downtime.

In this post, we discuss two options to perform the migration, each with advantages and trade-offs:

Option 1 – Perform a one-time data load with the applications down throughout the migration
Option 2 – Perform an initial data load and migrate all incremental changes using AWS DMS, with the application running

The following summarizes the benefits and disadvantages of each option.

Option 1: Increased downtime

Pros:
Simple, with fewer steps involved in the migration
No additional services required to migrate data
Lower cost

Cons:
Greater application downtime, which depends on database size, the capacity of the source and target instances, and process optimizations
Schema object validation, statistics capture, and similar tasks must be completed in the target instance before cutover, which extends the application downtime

Option 2: Minimal downtime

Pros:
Minimal application downtime
Schema object validation, statistics capture, and similar tasks can be performed in the target instance while the application is still running against the source database and incremental changes are replicated by AWS DMS
Flexibility to choose a cutover window

Cons:
More steps involved in the migration
Additional cost for AWS DMS to migrate incremental data

Option 1: Perform a one-time data load with the applications down throughout the migration

The first option uses Oracle Data Pump export/import over a database link to populate the target database when the application can afford a longer downtime. The downtime lasts from the time the application is brought down to perform the database export until the application reconnects to the target database after the import.

In the following diagram, we show a migration from Amazon RDS for Oracle (1) to Amazon RDS for Oracle (2) as an example. You can apply a similar method if the source Oracle database (1) is an RDS for Oracle instance, an Oracle database hosted on Amazon EC2, or hosted on premises.

Option 2: Perform an initial data load and migrate all incremental changes using AWS DMS, with the application running

The second option uses Oracle Data Pump over a database link and change data capture (CDC) replication with AWS DMS. This scenario is suitable when the application requires minimal downtime.

In the following diagram, the target RDS for Oracle DB instance (2) runs an Oracle Data Pump job to extract data from the source RDS for Oracle DB instance (1). After the Oracle Data Pump import using the source system change number (SCN) for consistent copy is complete, we use an AWS DMS task (3) to start the CDC replication, using the same SCN that was used for the Oracle Data Pump import as the start point of replication. Refer to Performing replication starting from a CDC start point to learn more.

Data migration steps for both options

To perform the data migration, complete the following steps:

Create a new RDS for Oracle instance with automated backups disabled and in Single-AZ mode.
If migrating from Amazon RDS for Oracle or Amazon EC2, create the DB instance in the same Availability Zone as the source instance. Also, pre-allocate storage close to what the database size will be once populated and set storage auto scaling on with a reasonable maximum capacity.
Set up the instance with the following DB parameter group settings to prevent any database jobs (dba_jobs or dbms_scheduler_jobs) from running before cutover:

job_queue_processes = 0
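After the target instance is available, you can confirm that the setting has taken effect by querying the parameter on the target instance:

-- Verify that database jobs are prevented from running on the target instance
select name, value from v$parameter where name = 'job_queue_processes';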

Change the redo log file size from its default of 128 MiB to match that of the source instance, or to a size appropriate for your use case. In the following example, a new redo log group is added with a size of 1 GiB after logging in to the database using a client tool such as SQL*Plus or SQL Developer:

-- Add a new redo log group with size 1 GiB
EXEC rdsadmin.rdsadmin_util.add_logfile(p_size => '1G');

-- Run the following commands to check the status, and switch redo log groups and checkpoint as necessary
-- until the redo log group to be dropped reaches "INACTIVE" status
select group#, bytes, status from v$log;
EXEC rdsadmin.rdsadmin_util.switch_logfile;
EXEC rdsadmin.rdsadmin_util.checkpoint;

-- Drop the redo log groups that are 128 MiB in size once their status is "INACTIVE"
EXEC rdsadmin.rdsadmin_util.drop_logfile(grp => 1);

-- Iterate these commands until all redo log groups are sized adequately and the old log groups are removed.
-- Refer to My Oracle Support "<Note 1035935.6> - Example of How to Resize the Online Redo Logfiles" for additional details.

Log in to the target instance and create a database link to the source instance. Choose a user in the source instance that has the DATAPUMP_EXP_FULL_DATABASE role.

CREATE DATABASE LINK <db link name> CONNECT TO <username> IDENTIFIED BY "<password>" USING '(DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = <source rds endpoint>)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = <db name>)))';
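Before proceeding, it's worth confirming that the database link resolves and can reach the source database. A quick check from the target instance:

-- A successful result confirms network connectivity and valid credentials for the link
SELECT sysdate FROM dual@<db link name>;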

If using Option 1 for migration, make sure that applications connected to the database are shut down and there is no activity in the database while you migrate data from the source instance to the target instance. Run the following command to help identify users or application sessions that are still connected to the database:

select username, count(*)
from v$session
where type != 'BACKGROUND' and username is not null
group by username;
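If any application sessions remain connected, you can disconnect them using the Amazon RDS-provided procedure rdsadmin.rdsadmin_util.disconnect. The sid and serial values below are placeholders; take them from v$session:

-- Disconnect a remaining session; replace the placeholders with values from v$session
begin
    rdsadmin.rdsadmin_util.disconnect(sid => <sid>, serial => <serial#>);
end;
/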

Run the following script from the target instance to load the data over the database link:

-- Run the following code if you want to import all the schemas from the source RDS for Oracle instance. Note that <db link name> is the database link created in step 4. Also, the value 20 in the following code is the parallel degree for the Data Pump job. Follow Oracle Data Pump best practices to decide the parallel degree for your RDS for Oracle instance type. Note that parallelism is available only in Oracle Enterprise Edition; for Oracle Standard Edition, use a value of 1.

set serveroutput on;
DECLARE
  v_degree  number := 20;
  v_db_link varchar2(30) := '<db link name>';
  v_dt      varchar2(15);
  v_hdnl    number;
  v_scn     number;
  v_err_msg varchar2(200);
BEGIN
  execute immediate 'select CURRENT_SCN from v$database@'||v_db_link into v_scn;
  select to_char(sysdate,'yyyymmdd_hh24miss') into v_dt from dual;
  dbms_output.put_line('Flashback SCN =>'||v_scn);
  dbms_output.put_line('Datapump Job Name =>'||'impdp_full_'||lower(v_db_link)||'_load_'||v_dt);

  BEGIN
    v_hdnl := DBMS_DATAPUMP.OPEN(operation => 'IMPORT', job_mode => 'FULL', remote_link => v_db_link, job_name => 'impdp_full_'||lower(v_db_link)||'_load_'||v_dt);

    DBMS_DATAPUMP.ADD_FILE(handle => v_hdnl, filename => 'impdp_full_'||lower(v_db_link)||'_load_'||v_dt||'.log', directory => 'DATA_PUMP_DIR', filetype => dbms_datapump.ku$_file_type_log_file);
    DBMS_DATAPUMP.METADATA_TRANSFORM(handle => v_hdnl, name => 'DISABLE_ARCHIVE_LOGGING', value => 1);
    DBMS_DATAPUMP.SET_PARAMETER(handle => v_hdnl, name => 'FLASHBACK_SCN', value => v_scn);
    DBMS_DATAPUMP.SET_PARAMETER(handle => v_hdnl, name => 'METRICS', value => 1);
    DBMS_DATAPUMP.SET_PARAMETER(handle => v_hdnl, name => 'LOGTIME', value => 'ALL');
    DBMS_DATAPUMP.SET_PARAMETER(handle => v_hdnl, name => 'TABLE_EXISTS_ACTION', value => 'REPLACE');
    DBMS_DATAPUMP.SET_PARALLEL(handle => v_hdnl, degree => v_degree);
    DBMS_DATAPUMP.START_JOB(v_hdnl);
    DBMS_DATAPUMP.DETACH(v_hdnl);
  EXCEPTION
    WHEN OTHERS THEN
      v_err_msg := SUBSTR(SQLERRM, 1, 200);
      dbms_output.put_line('Failed loading the data !!! Error => '||v_err_msg);
  END;
END;
/

You can use the preceding script to migrate the entire database to the target DB instance. If you need to migrate only a specific list of schemas, refer to the following code:

-- Run the following code if you want to import a specific list of schemas (SCHEMA_NAME_1, SCHEMA_NAME_2, SCHEMA_NAME_3, and SCHEMA_NAME_4 in our example) from the source RDS for Oracle instance. Note that <db link name> is the database link created in step 4. Also, the value 20 in the following code is the parallel degree for the Data Pump job. Follow Oracle Data Pump best practices to decide the parallel degree for your RDS for Oracle instance type. Note that parallelism is available only in Oracle Enterprise Edition; for Oracle Standard Edition, use a value of 1.

set serveroutput on;
DECLARE
  v_degree  number := 20;
  v_db_link varchar2(30) := '<db link name>';
  v_dt      varchar2(15);
  v_hdnl    number;
  v_scn     number;
  v_err_msg varchar2(200);
BEGIN
  execute immediate 'select CURRENT_SCN from v$database@'||v_db_link into v_scn;
  select to_char(sysdate,'yyyymmdd_hh24miss') into v_dt from dual;
  dbms_output.put_line('Flashback SCN =>'||v_scn);
  dbms_output.put_line('Datapump Job Name =>'||'impdp_full_'||lower(v_db_link)||'_load_'||v_dt);

  BEGIN
    v_hdnl := DBMS_DATAPUMP.OPEN(operation => 'IMPORT', job_mode => 'SCHEMA', remote_link => v_db_link, job_name => 'impdp_full_'||lower(v_db_link)||'_load_'||v_dt);

    -- Replace the schema names below with the schemas you want to migrate
    DBMS_DATAPUMP.METADATA_FILTER(handle => v_hdnl, name => 'SCHEMA_EXPR', value => 'IN (''SCHEMA_NAME_1'',''SCHEMA_NAME_2'',''SCHEMA_NAME_3'',''SCHEMA_NAME_4'')');

    DBMS_DATAPUMP.ADD_FILE(handle => v_hdnl, filename => 'impdp_full_'||lower(v_db_link)||'_load_'||v_dt||'.log', directory => 'DATA_PUMP_DIR', filetype => dbms_datapump.ku$_file_type_log_file);
    DBMS_DATAPUMP.METADATA_TRANSFORM(handle => v_hdnl, name => 'DISABLE_ARCHIVE_LOGGING', value => 1);
    DBMS_DATAPUMP.SET_PARAMETER(handle => v_hdnl, name => 'FLASHBACK_SCN', value => v_scn);
    DBMS_DATAPUMP.SET_PARAMETER(handle => v_hdnl, name => 'METRICS', value => 1);
    DBMS_DATAPUMP.SET_PARAMETER(handle => v_hdnl, name => 'LOGTIME', value => 'ALL');
    DBMS_DATAPUMP.SET_PARAMETER(handle => v_hdnl, name => 'TABLE_EXISTS_ACTION', value => 'REPLACE');
    DBMS_DATAPUMP.SET_PARALLEL(handle => v_hdnl, degree => v_degree);
    DBMS_DATAPUMP.START_JOB(v_hdnl);
    DBMS_DATAPUMP.DETACH(v_hdnl);
  EXCEPTION
    WHEN OTHERS THEN
      v_err_msg := SUBSTR(SQLERRM, 1, 200);
      dbms_output.put_line('Failed loading the data !!! Error => '||v_err_msg);
  END;
END;
/

Note the Flashback SCN and Datapump Job Name values that are displayed as outputs in the previous step.
Monitor the status of the import job and review the import log by running the following commands in the target instance:

select * from DBA_DATAPUMP_JOBS;

SELECT *
FROM table(rdsadmin.rds_file_util.read_text_file('DATA_PUMP_DIR', '<Datapump Job Name>.log'));
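Optionally, you can also track the progress of long-running table loads through v$session_longops. The job-name filter below assumes the default job naming from the earlier scripts:

-- Approximate progress of long-running Data Pump operations on the target instance
SELECT sid, serial#, opname, sofar, totalwork,
       round(sofar/totalwork*100, 2) AS pct_done
FROM v$session_longops
WHERE upper(opname) LIKE 'IMPDP_FULL%'
AND totalwork > 0
AND sofar <> totalwork;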

After the import is complete, compare the database schema object count and status between the source and target instance. Recompile objects if any are invalid by running the following commands in the target instance:

-- The following command compares the source and target database schema object count and status. Analyze and recompile invalid objects until the object count and status match between the source and target database schemas.

SELECT NVL(src.object_type, tgt.object_type) object_type,
       src.status source_status, tgt.status target_status,
       src.cnt source_count, tgt.cnt target_count
FROM (SELECT object_type, status, count(*) cnt
      FROM dba_objects@<db link name>
      WHERE owner = upper('<schema name>') AND object_name NOT LIKE 'DR%'
      AND object_type NOT LIKE '%PARTITION%'
      GROUP BY object_type, status) src
FULL OUTER JOIN
     (SELECT object_type, status, count(*) cnt
      FROM dba_objects
      WHERE owner = upper('<schema name>') AND object_name NOT LIKE 'DR%'
      AND object_type NOT LIKE '%PARTITION%'
      GROUP BY object_type, status) tgt
ON src.object_type = tgt.object_type AND src.status = tgt.status
ORDER BY 1, 3;

-- Command to recompile invalid objects
EXEC SYS.UTL_RECOMP.RECOMP_SERIAL('<schema name>');
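If the counts don't match because of invalid objects, you can list them to investigate further:

-- List invalid objects remaining in the migrated schema on the target instance
SELECT owner, object_type, object_name
FROM dba_objects
WHERE status = 'INVALID'
AND owner = upper('<schema name>');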

If necessary, grant permissions on system objects using the Amazon RDS procedure rdsadmin.rdsadmin_util.grant_sys_object, as shown in the following example.
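For example, the following call grants SELECT on V_$SESSION to a migrated user. The object, grantee, and privilege shown are illustrative; adjust them to match the grants that exist in your source database:

begin
    rdsadmin.rdsadmin_util.grant_sys_object(
        p_obj_name  => 'V_$SESSION',
        p_grantee   => '<application user>',
        p_privilege => 'SELECT');
end;
/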
Gather dictionary-level and schema-level statistics for optimal performance.
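For example, you can gather dictionary and schema statistics with DBMS_STATS; the schema name and parallel degree below are placeholders to adjust for your instance size:

-- Gather dictionary statistics
EXEC DBMS_STATS.GATHER_DICTIONARY_STATS;

-- Gather statistics for each migrated schema
EXEC DBMS_STATS.GATHER_SCHEMA_STATS(ownname => '<schema name>', cascade => TRUE, degree => 4);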
You can now enable automated backups and the Multi-AZ option for the target instance.

Cutover

If you chose Option 1 for data migration, you can now cut over the application to the target RDS for Oracle instance.

For Option 2, complete the following additional steps in the target instance, after you have completed the import and compared the database objects:

Generate the commands to disable triggers and foreign key constraints in the target instance:

select 'alter trigger '||owner||'.'||trigger_name||' disable;'
from dba_triggers
where owner in ('<list of migrated schema names>')
and status = 'ENABLED';

select 'alter table '||owner||'.'||table_name||' disable constraint '||constraint_name||';'
from dba_constraints
where constraint_type = 'R'
and owner in ('<list of migrated schema names>')
and status = 'ENABLED';

Run the output commands from Step 1 to disable triggers and foreign key constraints in the target instance.
Generate commands to enable triggers and foreign key constraints in the target instance. Generate and save this output together with the output of Step 1, before you disable anything, because these queries filter on ENABLED status:

select 'alter trigger '||owner||'.'||trigger_name||' enable;'
from dba_triggers
where owner in ('<list of migrated schema names>')
and status = 'ENABLED';

select 'alter table '||owner||'.'||table_name||' enable constraint '||constraint_name||';'
from dba_constraints
where constraint_type = 'R'
and owner in ('<list of migrated schema names>')
and status = 'ENABLED';

To migrate incremental changes from the source instance to the target instance, follow the sections Provisioning the AWS DMS instances, Creating the AWS DMS migration task, and Data validation in the post Migrating Oracle databases with near-zero downtime using AWS DMS to complete the migration and cutover. Use the Flashback SCN noted earlier when configuring the AWS DMS task (in the System change number field). Be sure to retain archive logs on the source instance long enough to cover the period from the captured Flashback SCN until AWS DMS begins CDC replication.
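On an RDS for Oracle source, archived log retention is controlled through the rdsadmin package. As a sketch, the following commands (run on the source instance) show the current setting and increase retention to 72 hours; choose a value that covers your migration window:

-- Display the current configuration, including archived log retention (in hours)
set serveroutput on;
EXEC rdsadmin.rdsadmin_util.show_configuration;

-- Increase archived log retention to 72 hours
EXEC rdsadmin.rdsadmin_util.set_configuration('archivelog retention hours', 72);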
Wait until the desired cutover time while monitoring replication lag using Amazon CloudWatch.
Stop applications when the CDCLatencySource and CDCLatencyTarget metrics are close to 0, and allow the final changes to flow through.
Run the output commands of Step 3 to enable triggers and foreign key constraints in the target instance.
Run the following SQL in the source Oracle database to generate commands to reset the sequence values before cutover:

-- Run the following statement in the source RDS for Oracle DB instance to generate the DDL to reset sequences. Run the generated DDL statements in the target RDS for Oracle instance.

SELECT 'ALTER SEQUENCE '||sequence_owner||'.'||sequence_name||' RESTART START WITH '||(last_number+1)||';'
FROM dba_sequences
WHERE sequence_owner IN ('<list of migrated schema names>');

Monitor the target instance by configuring appropriate CloudWatch alerts (see Monitoring Amazon RDS metrics with Amazon CloudWatch) and Amazon RDS events.
Modify the target instance parameter job_queue_processes with the appropriate value from the source instance.
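You can read the value currently in effect on the source instance and apply it to the target instance's parameter group:

-- Run on the source instance to find the job_queue_processes value to apply to the target
select value from v$parameter where name = 'job_queue_processes';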

This completes the data migration from the source to the target RDS for Oracle DB instance. You can now cut over applications to the target instance.

Clean up

After you cut over to the target instance, you can stop the source instance to avoid instance charges. After the migration is confirmed to be successful, you can delete the source instance. For instructions, refer to Deleting a DB instance.

Conclusion

In this post, we demonstrated how you can reduce the storage allocated to an RDS for Oracle DB instance by copying the data to a new RDS for Oracle DB instance using Oracle Data Pump over a database link, and optionally with minimal downtime by using AWS DMS for CDC replication. You can use this methodology for source Oracle databases in Amazon RDS for Oracle, self-managed Oracle databases on Amazon EC2, and Oracle databases hosted outside of the AWS Cloud.

We encourage you to try this solution. For more information about using AWS DMS, see Getting started with AWS Database Migration Service and Best Practices for AWS Database Migration Service.

Reach out with questions or requests in the comments. Happy migrating!

About the authors

Santhosh Kumar Adapa is a Sr. Database Consultant with the AWS Professional Services team at AWS. He works as a database migration specialist, helping AWS customers design and implement scalable, secure, performant, and robust database solutions in the cloud.

Bhanu Ganesh Gudivada is a Database Consultant with the AWS Professional Services team at AWS, based out of Hyderabad, India, and specializes in database migrations. He helps customers build highly available, cost-effective database solutions and migrate their commercial engines to the AWS Cloud. He is curious to learn and implement new technologies around databases and to orchestrate migrations through automation.

Sushant Deshmukh is a Database Consultant with AWS Professional Services. He works with AWS customers and partners to build highly available, scalable, and secure database architectures on AWS, and helps customers migrate and modernize their databases to the AWS Cloud. Outside of work, he enjoys traveling, exploring new places, playing volleyball, and spending time with his family and friends.

Jeevan Shetty is a Sr. Database Consultant with the AWS Professional Services team. He supports and enables customers in migrating their databases from on-premises data centers to the AWS Cloud, and in migrating from commercial database engines to open-source databases on AWS.
