Monday, December 2, 2024
No menu items!
HomeDatabase ManagementOptimize your database storage for Oracle workloads on AWS, Part 1: Using...

Optimize your database storage for Oracle workloads on AWS, Part 1: Using ADO and ILM data compression policies

In this two-part series, we demonstrate how to optimize storage for Oracle database workloads on AWS by using Oracle’s built-in features, such as Heat Map, Automatic Data Optimization (ADO), and hybrid partitioning. These features help classify data by its lifecycle stage and automate data management tasks to significantly reduce storage costs, while enhancing database performance, especially for growing datasets.

In this series, we outline three different approaches to effectively manage the data lifecycle. You can apply these strategies individually or in combination, depending on the specific needs of your data environment:

  • Automatic data compression based on data access pattern – Use Oracle’s Heat Map, ADO, and Oracle Information Lifecycle Management (ILM) policies to apply compression on tables or partitions that are less frequently accessed, helping reduce storage costs while improving performance
  • Hybrid partitioning based on data access pattern – Use Heat Map data to identify aged or rarely accessed data and move it to external partitions, freeing up high-performance storage for more active workloads while still retaining access to older data
  • Storage tiering based on data access pattern – Use Heat Map, ADO, and data movement ILM policies to move data to colder, less expensive storage tiers as data ages and becomes less frequently used

The first two use cases are applicable across all deployment options for Oracle Database Enterprise Edition on AWS, including Amazon Relational Database Service (Amazon RDS) for Oracle, Amazon RDS Custom for Oracle, and a self-managed database running on Amazon Elastic Compute Cloud (Amazon EC2). The third use case is limited to self-managed databases on Amazon EC2, because tiered storage is currently not supported in RDS instances (as of this writing).

In this post, we explore how to use ADO and Oracle ILM policies to automatically compress data based on usage patterns. In Part 2, we take a deeper look at using Heat Map statistics to manage both active and inactive data through hybrid partitioning and ADO, enabling the automatic movement of data to different storage tiers as it becomes less frequently accessed.

Solution overview

Oracle ILM helps you manage data based on its usage patterns over time. It allows you to define data classes and uses Heat Map and ADO to automate data compression and movement across different storage tiers based on user-defined policies and data access patterns.

Heat Map tracks data usage at a granular level, providing insights into data activity and helping ADO make informed decisions about when to compress or move data to appropriate storage tiers, providing efficient data management throughout its lifecycle.

ADO is a feature within Oracle Database that automates the management of data storage. It constantly assesses the value and usage patterns of your data and takes proactive steps to make sure everything is stored in the most efficient manner. ADO operates based on the ILM policies you define. These policies set the rules for when and how data should be compressed, moved, or otherwise optimized.

Here’s how it breaks down:

  • Heat Map tracking – ADO uses Oracle’s Heat Map feature to monitor data usage at a granular level. It tracks how often each row and segment is accessed and modified, giving a detailed picture of data activity over time.
  • Policy-based actions – Based on the insights from Heat Map, ADO implements the policies you’ve set up. For example, you can create a policy to compress data that hasn’t been accessed in the last 30 days, or to move seldom-used data to a cheaper storage tier.
  • Automated execution – After these policies are in place, ADO takes care of the rest. It automatically compresses, moves, and optimizes data without manual intervention, making sure your database remains high performing and cost-efficient.

The following diagram illustrates this architecture.

high-level architecture diagram

The high-level steps to implement this solution are as follows:

  1. Enable Heat Map at the database level. This feature tracks data access and modifications at both the segment and row levels, providing a detailed, real-time map of how your data is being used.
  2. Create ADO policies tailored to your business needs. You can define these policies at various levels, such as tablespace, group, segment, or row. Policies can also be based on different types of operations, such as creation, access, and modification. You can specify conditions for when these policies take effect, such as based on a time period (for example, after 3 days, 1 week, or 1 year) or a tablespace usage threshold, in which you invoke policies when usage exceeds a certain threshold. The actions specified in these policies can include compression, moving to a different storage tier, or both.
  3. Segment-level ADO policies are evaluated daily during the auto-task maintenance window by default, but DBAs can create custom schedules. Row-level policies are evaluated every 15 minutes by the Manageability Monitor (MMON)
  4. Use the USER_ILMEVALUATIONDETAILS and USER_ILMRESULTS views to check the execution details of ADO policies, confirming that everything is working as intended.
  5. Enable hybrid partitioning and convert partitions based on access patterns.
  6. Verify that segments are compressed or moved according to the policies. This confirmation step makes sure your data is optimally managed and stored.

Licensing

You must be licensed for the Oracle Advanced Compression or Oracle Database In-Memory option to use Oracle ADO, as explained in the Oracle documentation. You should also be licensed to use Oracle partitioning when using hybrid partitioning. Refer to the Oracle licensing documentation for the latest information.

Prerequisites

In this section, we discuss the prerequisites to complete before you get started. You will need Oracle EE running on AWS and a client to connect to the database. For this post, we use the SQL*Plus client and Amazon RDS for Oracle deployment.

Enable Heat Map tracking

Before you can start using ADO, you need to enable activity tracking. You can enable this by setting the initialization parameter HEAT_MAP. In Amazon RDS for Oracle, you can set this parameter by creating a custom parameter group. In RDS Custom for Oracle and a self-managed database running on Amazon EC2, you can use the following command by connecting to the pluggable database (PDB) as a user with ALTER SYSTEM privileges:

SQL> alter system set Heat_Map=on;
SQL> show parameter Heat_Map

 NAME        TYPE	     VALUE
 ----------  ----------- ------
 Heat_Map    string	     ON

The preceding parameter enables the database to track read and write activities at the database level. Enabling Heat Map enables tracking both DML and access at the segment level and stores these details in a relevant table in the SYSAUX tablespace. The activities are not tracked for objects in the SYSTEM and SYSAUX tablespaces.

Tracking details of segments are available in real time in the V$HEAP_MAP_SEGMENT view. As soon as the session accesses some objects in the database, real-time segment access information will be available immediately in this view. Data from V$HEAP_MAP_SEGMENT is persisted into the SYS.HEAT_MAP_STAT$ table in the SYSAUX tablespace by the DBMS_SCHEDULER job at a regular period of time. This data is available in dictionary views like DBA_HEAT_MAP_SEGMENT and DBA_HEAT_MAP_SEG_HISTOGRAM.

After enabling Heat Map tracking, we set the tracking start time back 5 days to make sure statistics logged after this time are valid and considered by ADO:

SQL> EXEC dbms_ilm_admin.set_Heat_Map_start(start_date => SYSDATE - 5);

You can also customize parameters that control ILM. You can check the values of the parameters from the dba_ilmparameters view:

SQL> select * from dba_ilmparameters;
NAME			        VALUE
--------------------     ----------
ENABLED 		           	    1
RETENTION TIME		           30
JOB LIMIT		                2
EXECUTION MODE		            2
EXECUTION INTERVAL	           15
TBS PERCENT USED	           85
TBS PERCENT FREE	           25
POLICY TIME		                0

For ILM to be active, the ENABLED parameter must be set to 1. The parameter POLICY TIME determines if ADO policies are specified in seconds or days. Values are 1 for seconds or 0 for days (default). We set POLICY TIME to seconds. This allows us to test policies without having to wait for long periods of time.

SQL> EXEC dbms_ilm_admin.customize_ilm(parameter=>dbms_ilm_admin.policy_time,VALUE=>dbms_ilm_admin.ilm_policy_in_seconds);

This will change the POLICY TIME value in dba_ilmparameters to 1.

Create a sample table with partitions and insert dummy data into the table

Use the following code to create a partitioned table called DEMO.ORDERS and load it with dummy data. We use this table throughout this post to demonstrate various use cases.

-- Create user demo
SQL> create user demo identified by <password> default tablespace users quota unlimited on users;
SQL> grant connect, resource to demo;
SQL> grant execute on dbms_lock to demo;
SQL> grant execute on RDSADMIN.RDS_FILE_UTIL to demo;

-- Create table DEMO.ORDERS
SQL> conn demo/<password>@orapdb
SQL> BEGIN
      execute immediate 'CREATE TABLE DEMO.ORDERS
( order_id NUMBER (12),
  order_date DATE,
  total NUMBER (12),
  details VARCHAR(4000)
) 
  PARTITION BY RANGE (order_date)
(PARTITION orders_hist VALUES LESS THAN (TO_DATE(('''||to_char(SYSDATE - INTERVAL '180' DAY,'DD-MON-YYYY')||'''),''DD-MON-YYYY'')) 
,PARTITION orders_cold VALUES LESS THAN (TO_DATE(('''||to_char(SYSDATE - INTERVAL '90' DAY,'DD-MON-YYYY')||'''),''DD-MON-YYYY''))
,PARTITION orders_warm VALUES LESS THAN (TO_DATE(('''||to_char(SYSDATE - INTERVAL '30' DAY,'DD-MON-YYYY')||'''),''DD-MON-YYYY''))
,PARTITION orders_hot VALUES LESS THAN (MAXVALUE))';
END;
/

-- Insert dummy data into table
SQL> DECLARE
        i NUMBER;
 	  nbrows NUMBER;
     BEGIN
  	   i:=1;
         SELECT NVL(MAX(order_id),0)+1 INTO i FROM DEMO.ORDERS;
         nbrows:=i+10000000-1;
      LOOP
      EXIT WHEN i>nbrows;
         INSERT INTO DEMO.ORDERS 
VALUES(i, (sysdate - dbms_random.value(1,400)),dbms_random.VALUE(1,1000), lpad('x',100,'x'));
    	   i:=i+1;
      END LOOP;
      COMMIT;
      dbms_stats.gather_table_stats('demo', 'orders');
      END;
      /

-- Validate number of rows in the TABLE DEMO.ORDERS 
SQL> select count(*) from DEMO.ORDERS;

     COUNT(*)
     ----------
     10000000

Check if table is getting tracked in the Heat Map:

SQL> col OBJECT_NAME format a15
SQL> col SUBOBJECT_NAME format a15
SQL> select * from user_Heat_Map_segment;

OBJECT_NAME          SUBOBJECT_NAME       SEGMENT_W SEGMENT_R FULL_SCAN LOOKUP_SC
-------------------- -------------------- -----------------------------------------
ORDERS               ORDERS_COLD          22-JUL-24 	23-JUL-24 	23-JUL-24
ORDERS               ORDERS_WARM          22-JUL-24 	23-JUL-24 	23-JUL-24
ORDERS               ORDERS_HIST          22-JUL-24 	23-JUL-24 	23-JUL-24
ORDERS               ORDERS_HOT           22-JUL-24 	23-JUL-24 	23-JUL-24

Check the current size and compression status of the table and its partitions:

SQL> select sum(bytes)/1048576 "Size MB" from user_segments where segment_name = 'ORDERS';

   Size MB
----------
      1363

SQL> col table_name format a20
SQL> SELECT table_name, compression, compress_for FROM  user_tables where table_name='ORDERS';

TABLE_NAME           COMPRESS COMPRESS_FOR
-------------------- -------- ------------------------------
ORDERS

SQL> col PARTITION_NAME format a20
SQL> SELECT partition_name,compress_for,compression,blocks,avg_row_len, num_rows
     FROM user_tab_partitions WHERE table_name='ORDERS' ORDER BY partition_position;

PARTITION_NAME       COMPRESS_FOR       COMPRESS     BLOCKS      NUM_ROWS
-------------------- ------------------ --------    ---------- ----------- 
ORDERS_HIST                             DISABLED      95152        5506274
ORDERS_COLD                             DISABLED      39175        2255618
ORDERS_WARM                             DISABLED      25915        1503719
ORDERS_HOT                              DISABLED      13174        734389

The size of the uncompressed DEMO.ORDERS table is around 1.36 GB.

Automatic data compression based on data access pattern

In this example, we show how you can automate the data compression level based on the data access pattern and the age of the data. The following diagram illustrates this architecture.

compression-architecture

In Oracle, the following compression levels are available:

  • ROW STORE COMPRESS BASIC
  • ROW STORE COMPRESS ADVANCED
  • COLUMN STORE COMPRESS FOR QUERY LOW OR HIGH
  • COLUMN STORE COMPRESS FOR ARCHIVE LOW OR HIGH

In this post, we only test ROW STORE COMPRESS BASIC and ROW STORE COMPRESS ADVANCED.

You can define the ILM policy at the tablespace, group, segment, and row level. For this example, we create and enable an ADO row store compression policy on the DEMO.ORDERS table at the segment and row levels.

Segment-level compression policies

In segment-level compression policies, the entire segment properties get changed. These are most useful for partitioned tables where partitions have become unused or are infrequently used. The segment-level policy is run only one time and is disabled afterwards.

By issuing the following statements, we add ILM policies to automatically compress the partition at the segment level depending on the defined condition and age of the data:

SQL> ALTER TABLE demo.orders
     MODIFY PARTITION orders_hist
     ILM ADD POLICY
     ROW STORE COMPRESS ADVANCED SEGMENT
     AFTER 60 DAYS OF NO MODIFICATION;

SQL> ALTER TABLE demo.orders
     MODIFY PARTITION orders_cold
     ILM ADD POLICY
     ROW STORE COMPRESS ADVANCED SEGMENT
     AFTER 60 DAYS OF NO MODIFICATION;  

SQL> ALTER TABLE demo.orders
     MODIFY PARTITION orders_warm
     ILM ADD POLICY
     ROW STORE COMPRESS BASIC SEGMENT
     AFTER 30 DAYS OF NO MODIFICATION; 

In this example, we have defined ILM policies to perform advanced compression on two of the older partitions and basic compression on one of the partitions. Let’s verify the policies as follows:

SQL> col policy_name format a20
SQL> SELECT a.policy_name,a.subobject_name,b.action_type,b.scope,
    b.compression_level,b.condition_type,b.condition_days, enabled
    FROM user_ilmobjects a, user_ilmdatamovementpolicies b
    WHERE a.policy_name=b.policy_name
    AND a.object_owner='DEMO'
    AND a.object_name='ORDERS';

POLICY SUBOBJECT_NAME ACTION_TYPE  SCOPE   COMPRESSION CONDITION_TYPE        CONDITION  ENA	              
------ -------------- ------------  ------  --------   ---------------------  -------- ---                     
P1     ORDERS_HIST    COMPRESSION  SEGMENT  ADVANCED   LAST MODIFICATION TIME      60  YES                       
P2     ORDERS_COLD    COMPRESSION  SEGMENT  ADVANCED   LAST MODIFICATION TIME      60  YES                        
P3     ORDERS_WARM    COMPRESSION  SEGMENT		       LAST MODIFICATION TIME      30  YES

To test the policy, we flush the Heat Map statistics from memory to disk and add a pause of more than 60 seconds so we know that the segment wasn’t modified in the last 60 seconds. We set the dbms_ilm_admin.policy_time parameter to seconds earlier to allow us to simulate 60 days in just 60 seconds of real clock for quick testing.

EXEC dbms_ilm.flush_all_segments;
EXEC dbms_lock.sleep(60); 

Instead of waiting for the MMON background process to invoke the policy job, we run the policy manually:

SQL>SET serverout ON 
SQL>DECLARE
  taskid NUMBER;
            BEGIN
  dbms_ilm.execute_ilm(owner=>'DEMO', object_name=>'ORDERS', task_id=> taskid);
  dbms_output.put_line('Task ID: ' || taskid);
END;
/
Task ID: 2
PL/SQL procedure successfully completed.

Check the status of the task:

SQL> select POLICY_NAME, SUBOBJECT_NAME, SELECTED_FOR_EXECUTION from USER_ILMEVALUATIONDETAILS where TASK_ID=2;

   POLICY_NAM  SUBOBJECT_NAME   SELECTED_FOR_EXECUTION
   ----------  ----------       ------------------------
         P1    ORDERS_HIST      SELECTED FOR EXECUTION
         P2    ORDERS_COLD      SELECTED FOR EXECUTION
         P3    ORDERS_WARM      SELECTED FOR EXECUTION

SQL> col job_name format a15
SQL> select task_id, job_name, job_state from user_ilmresults where TASK_ID=2;

      TASK_ID    JOB_NAME        JOB_STATE
      ---------- --------------- ------------------
         2 	    ILMJOB364       COMPLETED SUCCESSFULLY
         2 	    ILMJOB366       COMPLETED SUCCESSFULLY
         2 	    ILMJOB368       COMPLETED SUCCESSFULLY

After the JOB_STATE column value changes to COMPLETED SUCCESSFULLY, check if the table partitions were compressed based on the defined policy:

SQL> col partition_name format a20
SQL> SELECT partition_name,compress_for,compression,blocks,num_rows FROM user_tab_partitions WHERE table_name='ORDERS' ORDER BY partition_position;

PARTITION_NAME       COMPRESS_FOR                   COMPRESS     BLOCKS   NUM_ROWS
-------------------- ------------------------------ -------- ---------- ----------
ORDERS_HIST          ADVANCED                       ENABLED       18583    5506274
ORDERS_COLD          ADVANCED                       ENABLED        7674    2255618
ORDERS_WARM          BASIC                          ENABLED        4603    1503719
ORDERS_HOT                                          DISABLED      13174     734389

Because compression is at the segment level, all rows in the segment should be compressed. Let’s verify for one of the compressed partitions:

SQL> SELECT count(*),
    CASE DBMS_COMPRESSION.get_compression_type ('DEMO', 'ORDERS', rowid, 'ORDERS_WARM')
    WHEN 1 THEN 'No Compression'
    WHEN 2 THEN 'Advanced compression level'
    WHEN 64 THEN 'Compressed row'
    WHEN 128 THEN 'High compression level for LOB operations'
    WHEN 256 THEN 'Medium compression level for LOB operations'
    WHEN 512 THEN 'Low compression level for LOB operations'
    WHEN 4096 THEN 'Basic compression level'
    WHEN 5000 THEN 'Maximum number of LOBs used to compute the LOB compression ratio'
    END AS compression_type
    FROM   demo.orders partition (ORDERS_WARM)
    group by DBMS_COMPRESSION.get_compression_type ('DEMO', 'ORDERS', rowid, 'ORDERS_WARM');

  COUNT(*) COMPRESSION_TYPE
---------- ----------------------------------------------------------------
   1503719	Basic compression level

The ADO segment-level compression policy gets disabled after the object is already compressed. Its status will be changed to NO, as shown in the following example:

SQL> SELECT * FROM user_ilmpolicies;

POLICY_N POLICY_TYPE   TABLESPACE		      ENA DEL
-------- ------------- ------------------------------ --- ---
P1	   DATA MOVEMENT				      NO  NO
P2	   DATA MOVEMENT				      NO  NO
P3	   DATA MOVEMENT				      NO  NO

If you insert new rows into the partitions where segment-level compression policies are already run, new inserted rows won’t get compressed. If segment-level compression is being used in an active segment where data continues to be added, it’s recommended to re-enable the ADO segment level compression policy.

Let’s check the size of table after running the ADO segment-level policies:

SQL> select sum(bytes)/1048576 from user_segments where segment_name = 'ORDERS';

SUM(BYTES)/1048576
------------------
		360

Row-level compression policies

In a row-level policy, blocks where all the rows in the blocks meet the policy are compressed. The row-level policy continues to run after its first run. Row-level compression policies can be created based only on modification time. They can’t be created based on creation time or access time. Also, the only compression type available for row-level compression policies is ROW STORE COMPRESS ADVANCED.

Let’s test a row-level compression ADO policy on the ORDERS_HOT partition:

SQL> ALTER TABLE demo.orders
     MODIFY PARTITION orders_hot ILM ADD POLICY
     ROW STORE COMPRESS ADVANCED ROW
     AFTER 30 DAYS OF NO MODIFICATION;  

Verify that the policy is enabled:

SQL> SELECT a.policy_name pname,a.subobject_name subobject,b.action_type,b.scope,
b.compression_level "Level",b.condition_type,b.condition_days, enabled
FROM user_ilmobjects a, user_ilmdatamovementpolicies b
WHERE a.policy_name=b.policy_name
AND a.object_owner='DEMO'
AND a.object_name='ORDERS' AND subobject_name ='ORDERS_HOT' ;

PNAME SUBOBJECT  ACTION_TYPE SCOPE  LEVEL    CONDITION_TYPE         CONDITION_DAYS ENA	               
----- ---------- ----------  ----- -------   ---------------------  --------       --
P4   ORDERS_HOT  COMPRESSION ROW	  ADVANCED  LAST MODIFICATION TIME 30             YES 

Run the policy manually:

SQL> SET serverout ON SIZE 999999
SQL> DECLARE
          taskid NUMBER;
         BEGIN
          dbms_ilm.execute_ilm(owner=>'DEMO',    object_name=>'ORDERS', task_id=> taskid);
          dbms_output.put_line('Task ID: ' || taskid);
    END;
    /
Task ID: 36

PL/SQL procedure successfully completed.

SQL> select TASK_ID,POLICY_NAME, SELECTED_FOR_EXECUTION
    from USER_ILMEVALUATIONDETAILS
    where TASK_ID=36 and POLICY_NAME='P4';

   TASK_ID POLICY_NAME          SELECTED_FOR_EXECUTION
  ------- -------------------- ---------------------------
        36 P4                   SELECTED FOR EXECUTION

After the policy job is complete, check both the segment-level compression as well as row-level compression for the ORDERS_HOT partition:

SQL> SELECT partition_name,compress_for,compression,blocks, num_rows
    FROM user_tab_partitions 
    WHERE table_name='ORDERS' AND PARTITION_NAME= 'ORDERS_HOT';

PARTITION_NAME       COMPRESS_FOR                   COMPRESS     BLOCKS   NUM_ROWS
-------------------- ------------------------------ -------- ---------- ----------
ORDERS_HOT                                          DISABLED      13174     734389

SQL> SELECT count(*),
    CASE DBMS_COMPRESSION.get_compression_type ('DEMO', 'ORDERS', rowid, 'ORDERS_HOT')
    WHEN 1 THEN 'No Compression'
    WHEN 2 THEN 'Advanced compression level'
    WHEN 64 THEN 'Compressed row'
    WHEN 128 THEN 'High compression level for LOB operations'
    WHEN 256 THEN 'Medium compression level for LOB operations'
    WHEN 512 THEN 'Low compression level for LOB operations'
    WHEN 4096 THEN 'Basic compression level'
    WHEN 5000 THEN 'Maximum number of LOBs used to compute the LOB compression ratio'
    END AS compression_type
    FROM   DEMO.ORDERS partition (ORDERS_HOT)
    group by DBMS_COMPRESSION.get_compression_type ('DEMO', 'ORDERS', rowid, 'ORDERS_HOT');

  COUNT(*)   COMPRESSION_TYPE
    --------   ----------------------------------------------------------------
     12526    No Compression
    721863    Advanced compression level

The segment compression attributes haven’t been modified, but some rows (721,863) have been compressed and others (12,526) have not. Only blocks where all the rows meet the policy criteria are compressed.

Check the status of the ADO policy after the first ADO evaluation:

SQL> col POLICY format a6
SQL> col PARTITION format a15
SQL> SELECT policy_name policy, SUBOBJECT_NAME partition, enabled FROM user_ilmobjects where SUBOBJECT_NAME='ORDERS_HOT';

POLICY                       PARTITION               ENA
----- -                      --------------          ---
P4                           ORDERS_HOT              YES

Because the ADO compression policy is set at the row level, the ADO compression policy is still useful to verify whether further updated or new inserted rows in blocks need to be compressed. This will reduce the size of the table further.

SQL> select sum(bytes)/1048576 from user_segments where segment_name = 'ORDERS';

SUM(BYTES)/1048576
------------------
	       360

After implementing the defined ILM data compression policies, the size of the table was reduced from 1.36 GB to 360 MB, resulting in nearly a 70% reduction in storage space.

Conclusion

In this post, we demonstrated how ILM compression policies and ADO enhance storage management based on data age. You can significantly reduce your storage usage while maintaining cost-efficiency and effective data access.

In Part 2, we show you how to use Heat Map statistics to monitor data usage and integrate this information with hybrid partitioning and ILM data movement policies to move data to more cost-effective storage solutions.

Leave your feedback in the comments section.


About the Authors

Archana Sharma is a Sr. Database Specialist Solutions Architect, working with Worldwide Public Sector customers. She has years of experience in relational databases, and is passionate about helping customers in their journey to the AWS Cloud with a focus on database migration and modernization.

Ramesh Pathuri is a Senior Database Consultant with AWS Worldwide Public Sector ProServe. His extensive experience in databases, along with a passion for guiding customers through their innovative transition to the AWS Cloud, enables them to unlock new possibilities through database migration and modernization, and optimize their data and analytics solutions for increased innovation, efficiency, and impact.

Read MoreAWS Database Blog

RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Most Popular

Recent Comments