Data is any company's most valuable asset today, and securing it is critical. Organizations in regulated industries such as healthcare and finance must safeguard their databases to meet compliance and regulatory requirements. Protecting sensitive information requires a database user activity monitoring solution, especially for privileged users.

With managed database services such as Amazon Relational Database Service (Amazon RDS), you can use Database Activity Streams (DAS) to capture database user activity and publish it to Amazon Kinesis Data Streams in near-real time. However, finding and correlating all the available Kinesis data streams from each RDS DB instance spread across multiple AWS accounts becomes challenging as the number of instances increases. A key operational requirement for a security operations team is a single pane of glass that provides deep insights. As of this writing, Database Activity Streams are supported on Amazon RDS for Oracle, Amazon RDS for SQL Server, Amazon Aurora MySQL-Compatible Edition, and Amazon Aurora PostgreSQL-Compatible Edition.
In this post, we demonstrate how to create a centralized monitoring solution using Database Activity Streams and Amazon OpenSearch Service to meet audit requirements. The solution enables the security team to gather audit data from several Kinesis data streams; enrich, process, and store it with a retention period that meets compliance requirements; and produce relevant alerts and dashboards. We also discuss the layers and AWS services used at different stages of the process workflow.
Solution overview
This centralized monitoring solution uses several AWS services to capture, process, store, and analyze the data streams from multiple data sources. The following diagram illustrates the solution architecture.
In the following sections, we provide a high-level overview of different layers to better understand the solution stack that delivers centralized monitoring capabilities using OpenSearch Service.
Prerequisites
In this post, we focus only on the centralized monitoring solution that captures the audit data. Therefore, we make a few assumptions about the environment setup before we dive deep into the architecture:
You already know your compliance requirements and what to audit.
You have already defined your organization structure and roles or teams.
You have configured Database Activity Streams on supported DB instance classes.
Data source
In this architecture, the data source layer consists of source databases running on RDS DB instances that support Database Activity Streams. To enable the feature on an RDS DB instance, refer to Overview of Database Activity Streams. You can also review engine-specific documentation, such as Configuring unified auditing for Oracle Database, Configuring auditing policy for Microsoft SQL Server, and Monitoring Amazon Aurora with Database Activity Streams. Enabling Database Activity Streams automates the collection of database user activity, security-related events such as failed logins, and more. The security team can control access to the audit data by limiting who can call the ModifyActivityStream API action, which modifies the database activity stream. When enabling Database Activity Streams, be selective about which tables and database objects you audit to meet your compliance requirements. Auditing everything is not an ideal strategy; work with your security team to understand the database audit requirements. Creating a performance baseline of your database workload is also helpful, especially in a large RDS environment: run performance benchmarks with and without Database Activity Streams enabled to understand the impact on CPU utilization and network throughput.
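For example, you can enable an activity stream programmatically with the AWS SDK for Python (Boto3). The following is a minimal sketch, not a complete setup; the resource ARN, KMS key alias, and Region are placeholders that you would replace with your own values.

```python
import boto3

# Placeholders -- substitute your own resource ARN (cluster ARN for Aurora,
# DB instance ARN for RDS for Oracle or SQL Server) and KMS key.
RESOURCE_ARN = "arn:aws:rds:us-east-1:111122223333:cluster:my-aurora-cluster"
KMS_KEY_ID = "alias/my-das-key"

rds = boto3.client("rds", region_name="us-east-1")

# Start the database activity stream. Audit records are encrypted with the
# customer managed KMS key and published to a Kinesis data stream that
# Amazon RDS creates and manages on your behalf.
response = rds.start_activity_stream(
    ResourceArn=RESOURCE_ARN,
    Mode="async",          # asynchronous mode favors database performance
    KmsKeyId=KMS_KEY_ID,
    ApplyImmediately=True,
)

print(response["KinesisStreamName"], response["Status"])
```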
Ingestion
When you enable Database Activity Streams on an RDS DB instance, a Kinesis data stream is created. Audit records are encrypted with an AWS Key Management Service (AWS KMS) customer managed key and pushed to the data stream. As of this writing, Kinesis Data Streams offers two capacity modes: provisioned and on-demand. By default, enabling Database Activity Streams provisions a Kinesis data stream with 24-hour retention in on-demand capacity mode. In on-demand mode, the service scales capacity based on the volume of data ingested into the stream. In this architecture, the default configuration of the Kinesis data stream is kept intact because data is forwarded to OpenSearch Service almost immediately. For additional insights on monitoring and managing the Kinesis data stream, refer to Amazon Kinesis Data Streams and AWS Well-Architected design considerations.
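To locate the stream that Database Activity Streams created, you can inspect the database resource and the stream itself. The following is a hedged sketch using Boto3; the cluster identifier is a placeholder, and for RDS for Oracle or SQL Server you would call describe_db_instances instead.

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")
kinesis = boto3.client("kinesis", region_name="us-east-1")

# Placeholder identifier -- replace with your own Aurora cluster
# (or use describe_db_instances for RDS for Oracle/SQL Server).
cluster = rds.describe_db_clusters(DBClusterIdentifier="my-aurora-cluster")["DBClusters"][0]
stream_name = cluster["ActivityStreamKinesisStreamName"]
print(cluster["ActivityStreamStatus"], stream_name)

# Confirm the capacity mode and retention of the stream that DAS provisioned.
summary = kinesis.describe_stream_summary(StreamName=stream_name)["StreamDescriptionSummary"]
print(summary["StreamModeDetails"]["StreamMode"], summary["RetentionPeriodHours"])
```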
Processing
The audit records from the different RDS DB instances reside in their respective Kinesis data streams. Each RDS DB instance has its own data stream, and the records in it are encrypted, so they must be decrypted before they are sent to the monitoring solution of your choice. The next step is to process the data from each Kinesis data stream. One possible approach is to use AWS Lambda, an event-driven compute service that lets you run code for virtually any type of application or backend service without provisioning or managing servers. In this architecture, a Lambda function acts as the consumer application that processes the audit records from the Kinesis data stream. Using Lambda as the stream consumer abstracts away the complexity of polling, checkpointing, and error handling. Lambda pulls audit records from the Kinesis data stream in batches and invokes the function when any of the following conditions is true:
The batch window reaches its maximum value.
The batch size is met.
The payload size reaches 6 MB.

If the function returns an error, Lambda continues to retry the batch until processing succeeds or the records expire. You configure these batching and retry settings on the event source mapping between the data stream and the function, as shown in the sketch that follows.
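The following is a minimal, hypothetical example using Boto3; the function name, stream ARN, and batch settings are placeholders and not part of the original solution code.

```python
import boto3

lambda_client = boto3.client("lambda", region_name="us-east-1")

# Placeholder names -- replace with your own function and DAS stream ARN.
response = lambda_client.create_event_source_mapping(
    FunctionName="rds-das-decrypt-opensearch",
    EventSourceArn="arn:aws:kinesis:us-east-1:111122223333:stream/aws-rds-das-example",
    StartingPosition="LATEST",           # process only new audit records
    BatchSize=100,                       # invoke when this many records are buffered
    MaximumBatchingWindowInSeconds=30,   # or when the batch window elapses
    # Optional: without the next two settings, Lambda retries a failed batch
    # until the records expire, which is the default behavior described above.
    BisectBatchOnFunctionError=True,
    MaximumRetryAttempts=3,
)

print(response["UUID"], response["State"])
```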
You can find the Python-based Lambda code on the AWS Samples GitHub repo.
The Lambda function, rds-das-decrypt-opensearch.py, performs the following operations (a simplified sketch follows the list):
Decodes and decrypts the database activities using the KMS key you provided when starting the database activity stream.
Filters out heartbeat events and any events generated by the rdsadmin and rdssec users.
Flattens the array of database activity events into individual rows.
Creates an OpenSearch index with mapping.
Ingests records into OpenSearch Service using signed HTTP requests.
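To illustrate these operations, the following is a simplified, hypothetical handler skeleton; it is not the code from the sample repository. The decryption step is only stubbed out (the full KMS and AWS Encryption SDK logic lives in the repository), and the endpoint, Region, and index name are assumptions.

```python
import base64

import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, helpers
from requests_aws4auth import AWS4Auth

# Placeholders -- replace with your own domain endpoint, Region, and index name.
OPENSEARCH_HOST = "search-das-monitoring-xxxxxxxx.us-east-1.es.amazonaws.com"
REGION = "us-east-1"
INDEX_NAME = "das-audit-records"

credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key,
                   REGION, "es", session_token=credentials.token)

client = OpenSearch(
    hosts=[{"host": OPENSEARCH_HOST, "port": 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)


def decrypt_payload(encrypted_record: bytes) -> dict:
    """Decrypt and decompress one DAS record.

    The real implementation (KMS decrypt of the data key plus the AWS
    Encryption SDK) is in the sample repository and is omitted here.
    """
    raise NotImplementedError


def handler(event, context):
    actions = []
    for record in event["Records"]:
        payload = decrypt_payload(base64.b64decode(record["kinesis"]["data"]))
        # Flatten the event list into one document per activity, skipping
        # heartbeat events and internal service users.
        for activity in payload.get("databaseActivityEventList", []):
            if activity.get("type") == "heartbeat":
                continue
            if activity.get("dbUserName") in ("rdsadmin", "rdssec"):
                continue
            actions.append({"_index": INDEX_NAME, "_source": activity})

    if actions:
        # Bulk-ingest the flattened audit records over signed HTTP requests.
        helpers.bulk(client, actions)
    return {"indexed": len(actions)}
```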
The following diagram provides a detailed view of the layered architecture components.
Indexing and searching
When the processing layer (made up of multiple Lambda functions) sends the decrypted audit records to OpenSearch Service, the data is stored in an index composed of shards. There are two types of shards: primary shards and replica shards. All ingested audit records are distributed across the primary shards. Replica shards are redundant copies of the data that protect the cluster against the loss of an instance. Whenever you send a search request, it's routed to either a primary or a replica shard of the index. When you create an index, it defaults to five primary shards and one replica. Deciding on the primary shard count before adding any data is imperative, because changing the count later isn't straightforward and isn't recommended. In this architecture, a Lambda function is responsible for creating an index for each Kinesis data stream, and it also sets the primary shard and replica counts.
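As a reference, the index can be created with explicit shard and replica settings and a mapping for the audit fields. The following is a hedged sketch that reuses the signed OpenSearch client from the processing sketch; the index name, shard counts, and field mapping are assumptions to adapt to your own audit records and domain size.

```python
from opensearchpy import OpenSearch  # signed client constructed as in the processing sketch

# Assumed settings and mapping -- size the shard counts for your own workload
# and align the field names with your audit records.
INDEX_BODY = {
    "settings": {
        "index": {
            "number_of_shards": 3,    # decide this up front; changing it later requires reindexing
            "number_of_replicas": 1,  # one replica protects against the loss of an instance
        }
    },
    "mappings": {
        "properties": {
            "logTime":      {"type": "date"},
            "dbUserName":   {"type": "keyword"},
            "databaseName": {"type": "keyword"},
            "command":      {"type": "keyword"},
            "commandText":  {"type": "text"},
        }
    },
}


def ensure_index(client: OpenSearch, index_name: str) -> None:
    # Create the index only if it doesn't already exist.
    if not client.indices.exists(index=index_name):
        client.indices.create(index=index_name, body=INDEX_BODY)
```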
There are mechanisms to control the costs associated with data storage. You can use Index State Management (ISM) to control the lifecycle of the indexes in your cluster and automatically delete old indexes after a defined period. For example, you can set up a policy that takes an automated snapshot of an index after 24 hours, migrates the index from hot storage to UltraWarm storage after 7 days, deletes the index after 30 days, and sends you a notification when the index is deleted.
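A simplified variant of such a policy (UltraWarm migration after 7 days and deletion after 30 days; the snapshot and notification actions are omitted) could be registered with the ISM API as in the following sketch. The policy name, index pattern, domain endpoint, and Region are assumptions.

```python
import json

import boto3
import requests
from requests_aws4auth import AWS4Auth

# Placeholders -- replace with your own domain endpoint and Region.
ENDPOINT = "https://search-das-monitoring-xxxxxxxx.us-east-1.es.amazonaws.com"
REGION = "us-east-1"

policy = {
    "policy": {
        "description": "Retention policy for DAS audit indexes",
        "default_state": "hot",
        "states": [
            {
                "name": "hot",
                "actions": [],
                "transitions": [{"state_name": "warm", "conditions": {"min_index_age": "7d"}}],
            },
            {
                "name": "warm",
                "actions": [{"warm_migration": {}}],  # move the index to UltraWarm storage
                "transitions": [{"state_name": "delete", "conditions": {"min_index_age": "30d"}}],
            },
            {
                "name": "delete",
                "actions": [{"delete": {}}],
                "transitions": [],
            },
        ],
        # Automatically attach the policy to new indexes matching this pattern.
        "ism_template": {"index_patterns": ["das-*"], "priority": 100},
    }
}

credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key,
                   REGION, "es", session_token=credentials.token)

response = requests.put(
    f"{ENDPOINT}/_plugins/_ism/policies/das-retention",
    auth=awsauth,
    headers={"Content-Type": "application/json"},
    data=json.dumps(policy),
)
print(response.status_code, response.text)
```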
Visualization
The analysis of audit data is a critical step in real-time threat detection and security incident remediation. So far, we’ve successfully ingested audit data from multiple RDS DB instances into a centralized system. Now, let’s understand how to visualize the data with dashboards and define alerts in OpenSearch Dashboards.
OpenSearch Service provides an installation of OpenSearch Dashboards with every OpenSearch Service domain. When you log in to OpenSearch Dashboards, you can create an index pattern to select the indexes you want to analyze and then explore the data. You can build visualizations of the audit data from the OpenSearch indexes and add them to your dashboard. It's also important to control who can access your dashboards by using fine-grained access control in OpenSearch Service. You can integrate an existing identity provider to enable SAML authentication to OpenSearch Service and selectively grant users access to dashboards. For more information, refer to Controlling access to OpenSearch Dashboards. Finally, you can configure alerts in OpenSearch Service to get notified when data from one or more indexes meets certain conditions. For more details, refer to Configuring alerts in Amazon OpenSearch Service.
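As an illustration of the alerting piece, the following sketch creates a query-level monitor through the Alerting plugin API. The index pattern, field names, and filter values (for example, treating class AUTH with command AUTH FAILED as a failed login) are assumptions that depend on your engine's audit records, and no notification action is attached; the request is signed the same way as the ISM example.

```python
import json

import boto3
import requests
from requests_aws4auth import AWS4Auth

# Placeholders -- replace with your own domain endpoint and Region.
ENDPOINT = "https://search-das-monitoring-xxxxxxxx.us-east-1.es.amazonaws.com"
REGION = "us-east-1"

monitor = {
    "type": "monitor",
    "name": "das-failed-login-monitor",
    "monitor_type": "query_level_monitor",
    "enabled": True,
    "schedule": {"period": {"interval": 5, "unit": "MINUTES"}},
    "inputs": [{
        "search": {
            "indices": ["das-*"],
            "query": {
                "size": 0,
                "query": {
                    "bool": {
                        "filter": [
                            # Assumed field values -- check how your engine
                            # represents failed logins in the audit records.
                            {"term": {"class": "AUTH"}},
                            {"term": {"command": "AUTH FAILED"}},
                            {"range": {"logTime": {"gte": "now-5m"}}},
                        ]
                    }
                },
            },
        }
    }],
    "triggers": [{
        "name": "failed-logins-detected",
        "severity": "1",
        "condition": {
            "script": {
                "source": "ctx.results[0].hits.total.value > 0",
                "lang": "painless",
            }
        },
        # Attach an action with a notification destination (for example,
        # Amazon SNS) to be alerted when the trigger fires.
        "actions": [],
    }],
}

credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key,
                   REGION, "es", session_token=credentials.token)

response = requests.post(
    f"{ENDPOINT}/_plugins/_alerting/monitors",
    auth=awsauth,
    headers={"Content-Type": "application/json"},
    data=json.dumps(monitor),
)
print(response.status_code, response.text)
```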
Benefits
This solution works with various database engines, including RDS for Oracle, RDS for SQL Server, Aurora PostgreSQL, and Aurora MySQL DB instances. It facilitates monitoring and analysis of critical database events through the integration of Database Activity Streams and OpenSearch Service. It also addresses compliance requirements by generating monthly summary reports of database user activities, and the ability to set up alerts for suspicious or unauthorized activity strengthens database security and enables prompt detection of and response to potential threats.
Clean up
It’s a best practice to delete resources that you’re no longer using so you don’t incur unintended charges. Complete the following steps to clean up the resources you created for this post; an example using the AWS SDK for Python (Boto3) follows the list:
Delete the RDS DB instance.
Delete the OpenSearch Service domain.
Delete the Lambda function.
Delete the Kinesis data stream.
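The following is a hedged sketch of the same cleanup using Boto3; the resource names are placeholders, and depending on your setup (for example, an Aurora cluster instead of an RDS DB instance, or an activity stream that is still running) the exact calls may differ.

```python
import boto3

REGION = "us-east-1"

rds = boto3.client("rds", region_name=REGION)
opensearch = boto3.client("opensearch", region_name=REGION)
lambda_client = boto3.client("lambda", region_name=REGION)
kinesis = boto3.client("kinesis", region_name=REGION)

# If the activity stream is still running, stop it first, for example:
# rds.stop_activity_stream(ResourceArn="<resource-arn>", ApplyImmediately=True)

# Delete the RDS DB instance (use delete_db_cluster for an Aurora cluster).
rds.delete_db_instance(DBInstanceIdentifier="my-rds-instance", SkipFinalSnapshot=True)

# Delete the OpenSearch Service domain.
opensearch.delete_domain(DomainName="das-monitoring")

# Delete the Lambda function.
lambda_client.delete_function(FunctionName="rds-das-decrypt-opensearch")

# Delete the Kinesis data stream if it still exists; the DAS-managed stream
# may already be removed once the activity stream is stopped.
kinesis.delete_stream(StreamName="aws-rds-das-example")
```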
Conclusion
In this post, you learned how to use native AWS services to build an end-to-end database activity monitoring architecture. This solution helps your security team reduce the time it takes to respond to security incidents and lets them reliably manage and store audit data with a retention period that meets compliance requirements.
About the authors
Sudhir Amin is a Sr. Solutions Architect at Amazon Web Services. In his role based out of New York, he provides architectural guidance and technical assistance to enterprise customers across different industry verticals, accelerating their cloud adoption. He is a big fan of snooker, combat sports such as boxing and UFC, and loves traveling to countries with rich wildlife reserves, where he gets to see the world's most majestic animals up close.
Vishal Srivastava is a Senior Technical Account Manager at AWS. He primarily collaborates with the financial services sector, assisting them through the process of adopting and leveraging cloud technologies. Apart from his professional commitments, he is an avid sports fan and enjoys traveling with his family.
Haider Naqvi is a Solutions Architect at AWS. He has extensive software development and enterprise architecture experience. He focuses on enabling customers to achieve business outcomes with AWS. He is based out of New York.