Cloud SQL for PostgreSQL data cache under the hood

By mullaned2002

May 8, 2024


Data is the lifeblood of the organization. Being able to make quick, accurate and actionable decisions based on authoritative data enables enterprises to offer differentiated services and improve customer satisfaction. The rise of generative AI has only amplified this need.

It’s therefore important that your database provides near-real-time performance when interacting with operational data. For PostgreSQL databases, we offer Cloud SQL for PostgreSQL Enterprise Plus edition, which provides better performance out of the box, stronger data protection (35 days of point-in-time recovery, or PITR) and higher availability (a 99.99% SLA and near-zero downtime maintenance).

Cloud SQL for PostgreSQL Enterprise Plus edition also includes an innovative data cache feature, which significantly improves read performance. The data cache is a read cache that uses a server-side SSD to cache data. Because it is co-located with compute in the server, data accesses have low latencies and high throughput. Workloads that are typically limited by read throughput and latency will therefore see significant benefits when using the data cache.

In this blog post, we will explore how the data cache works, its internal mechanisms, and the types of workloads that will benefit the most from it.

How does the data cache work?

Cloud SQL uses high-performance, low-latency local solid-state drives as a caching layer in front of the data storage disks. Think of the data cache as an extension to the shared buffers in PostgreSQL. Because the cache is local to the server, reads served from it avoid network hops to the storage disks and are not limited by the performance of the underlying storage. The data cache is also larger than the physical memory in the instance, so more of the working set fits inside the server. This helps queries achieve a much better cache-hit ratio.

The table below gives a sense of the size of the different storage layers for a 32 vCPU instance.

The data cache size is fixed for an instance and is a function of the number of vCPUs configured. The table below summarizes the data cache size for different vCPU configurations:

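| Number of vCPUs | Memory (GB) | Data cache size (GB) |
|---|---|---|
| 2 | 16 | 375 |
| 4 | 32 | 375 |
| 8 | 64 | 375 |
| 16 | 128 | 750 |
| 32 | 256 | 1500 |
| 48 | 384 | 3000 |
| 64 | 512 | 6000 |
| 80 | 640 | 6000 |
| 96 | 768 | 6000 |
| 128 | 864 | 9000 |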
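
To make the table easier to use in capacity planning, here is a minimal Python sketch that encodes it as a lookup. The values are copied from the table above; the names `DATA_CACHE_BY_VCPU`, `data_cache_size_gb` and `cache_to_memory_ratio` are our own illustrative choices, not part of any Cloud SQL API.

```python
# vCPUs -> (memory in GB, data cache size in GB), copied from the table above.
DATA_CACHE_BY_VCPU = {
    2: (16, 375),
    4: (32, 375),
    8: (64, 375),
    16: (128, 750),
    32: (256, 1500),
    48: (384, 3000),
    64: (512, 6000),
    80: (640, 6000),
    96: (768, 6000),
    128: (864, 9000),
}


def data_cache_size_gb(vcpus: int) -> int:
    """Return the fixed data cache size (GB) for a vCPU count listed in the table."""
    if vcpus not in DATA_CACHE_BY_VCPU:
        raise ValueError(f"no table entry for {vcpus} vCPUs")
    return DATA_CACHE_BY_VCPU[vcpus][1]


def cache_to_memory_ratio(vcpus: int) -> float:
    """Return how many times larger the data cache is than instance memory."""
    memory_gb, cache_gb = DATA_CACHE_BY_VCPU[vcpus]
    return cache_gb / memory_gb
```

For a 32 vCPU instance, `data_cache_size_gb(32)` returns 1500, which is a little under six times the 256 GB of instance memory.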

What workloads benefit from the data cache?

Workloads that benefit from the data cache feature are read workloads where the total dataset does not fit entirely into memory. This includes but is not limited to:

Workloads that are sensitive to read latencies (for example, key-value lookups)

Workloads that are sensitive to read throughput (for example, table scans)

Gen AI workloads that use vectors for similarity searches
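
One rough way to judge whether a workload falls into this category is to compare its working set with the instance's memory and data cache sizes. The sketch below is a simplification under stated assumptions: the function name and return labels are hypothetical, it treats all instance memory as available for caching (in practice only part of it is used for shared buffers), and it ignores write traffic entirely.

```python
def working_set_tier(working_set_gb: float, memory_gb: float, data_cache_gb: float) -> str:
    """Classify where a read working set of the given size would mostly be served from.

    Assumption: a simple size comparison, not a model of Cloud SQL internals.
    """
    if working_set_gb <= memory_gb:
        # Fits in memory already, so the data cache adds little.
        return "memory"
    if working_set_gb <= data_cache_gb:
        # Too big for memory but fits on the local SSD: the case described above.
        return "data_cache"
    # Larger than the data cache: some reads still go to the storage disks.
    return "persistent_storage"
```

With the 32 vCPU figures from the table (256 GB memory, 1500 GB data cache), a 1000 GB working set lands in the `data_cache` tier, which is where we would expect the largest gains.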

Enabling the data cache

In the Cloud SQL console, go to SQL → Create Instance → Choose PostgreSQL, then select Enable data cache. The data cache can also be enabled with gcloud or Terraform.
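
As an illustration of the gcloud route, the sketch below assembles a `gcloud sql instances create` command in Python. The instance name, region, machine tier and database version are placeholder assumptions rather than values from this post, and the helper name `build_create_command` is our own. Check the flags against the current gcloud reference before running anything.

```python
import shlex


def build_create_command(
    instance_name: str,
    region: str,
    tier: str = "db-perf-optimized-N-2",      # assumed Enterprise Plus machine tier
    database_version: str = "POSTGRES_15",    # assumed PostgreSQL version
) -> list[str]:
    """Build the argument list for creating an instance with the data cache enabled."""
    return [
        "gcloud", "sql", "instances", "create", instance_name,
        f"--database-version={database_version}",
        f"--region={region}",
        f"--tier={tier}",
        "--edition=enterprise-plus",   # the data cache is an Enterprise Plus feature
        "--enable-data-cache",
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(shlex.join(build_create_command("my-instance", "us-central1")))
```

Printing the command rather than running it lets you review the flags first; passing the same list to `subprocess.run` would execute it.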

Monitoring

Cloud SQL for PostgreSQL also includes four new metrics for data cache observability:

Data cache quota: Maximum data cache size

Data cache used: Amount of the data cache currently in use

PostgreSQL data cache hit count: Total number of data cache hits

PostgreSQL data cache miss count: Total number of data cache misses
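
The hit and miss counts can be combined into a hit ratio, which is often easier to track over time. The following is a minimal sketch; the function name is our own, and it assumes you have already read the two counts for the same time window from your monitoring tool. If the counters are cumulative, pass the difference between two samples.

```python
from typing import Optional


def data_cache_hit_ratio(hit_count: int, miss_count: int) -> Optional[float]:
    """Return hits / (hits + misses), or None when there were no lookups."""
    if hit_count < 0 or miss_count < 0:
        raise ValueError("counts must be non-negative")
    total = hit_count + miss_count
    if total == 0:
        # No data cache lookups in this window, so the ratio is undefined.
        return None
    return hit_count / total
```

For example, 900 hits and 100 misses in a window give a ratio of 0.9. A ratio that stays low while the data cache is full suggests the working set is larger than the cache.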

Turbocharge your read-heavy workloads

In this blog post, we discussed how the data cache improves read performance and how to monitor data cache operations. To learn more about the Cloud SQL for PostgreSQL data cache, or to get started, check out the documentation.
