Spanner: A differentiated database for non-relational workloads

By mullaned2002

January 19, 2024

Spanner is a fully managed database service for both relational and non-relational workloads that offers strong consistency at global scale, high performance at virtually unlimited scale, and high availability backed by an SLA of up to five 9s.

While Spanner is renowned for its relational capabilities, it is also a versatile key-value database that can be used to store and retrieve non-relational data via read and write APIs. In fact, a significant portion of internal Spanner usage at Google is non-relational.

Spanner was developed by Google to address the challenge of achieving synchronous replication and consistency while enabling virtually unlimited scaling. While relational workloads benefit from strong reads that guarantee the latest version of the data, non-relational workloads often utilize stale reads that can be served by local read replicas.
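
As a rough illustration of the difference between the two read modes, here is a minimal sketch written against the google-cloud-spanner Python client. The UserProfiles table, its columns, and the read_profile helper are hypothetical and not part of any Google sample; `database` is assumed to be a Database object from that client, and `keyset` a KeySet such as KeySet(keys=[["user-123"]]).

```python
import datetime


def read_profile(database, keyset, staleness_seconds=None):
    """Read rows from a hypothetical UserProfiles table.

    With no staleness the read is strong; otherwise it is a stale read that
    is consistent as of a timestamp `staleness_seconds` in the past.
    """
    if staleness_seconds is None:
        # Strong read: sees every write committed before the read started.
        snapshot_args = {}
    else:
        # Stale read: a slightly older but still consistent view of the data,
        # which a nearby replica can often serve on its own.
        snapshot_args = {
            "exact_staleness": datetime.timedelta(seconds=staleness_seconds)
        }

    with database.snapshot(**snapshot_args) as snapshot:
        rows = snapshot.read(
            table="UserProfiles",  # hypothetical table name
            columns=("UserId", "DisplayName"),  # hypothetical columns
            keyset=keyset,
        )
        return list(rows)
```

Calling read_profile(database, keyset) performs a strong read, while read_profile(database, keyset, staleness_seconds=15) accepts data that may be up to 15 seconds old. How much staleness is acceptable depends on the application.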

Recently, we announced a 50% increase in throughput and 2.5x more storage per node (up to 10TB) for Spanner, in addition to reduced latencies. These enhancements make Spanner an even more compelling option for NoSQL workloads.

How Google uses Spanner internally

Spanner is used extensively at Google by numerous projects, such as Google Photos, Google Ads, and Gmail. In total, Spanner serves over 3 billion read/write requests per second at peak. While some projects employ complex SQL and transactions, the vast majority of workloads are primarily key lookups. For such workloads, Spanner is chosen due to its high performance, customizability, and scalability.

Spanner as a NoSQL database

Spanner can be used as a feature-rich RDBMS with relational semantics. However, these relational features are built on top of an equally powerful non-relational platform. For example, Spanner’s splits architecture demonstrates how it can provide write-scaling. Furthermore, Spanner’s support for JSON allows for more versatile use cases typically provided by document databases.

A question that frequently arises is why today’s data requires strong consistency. In reality, when workloads migrated from legacy databases to NoSQL databases, the requirement for consistency never went away. Instead, applications are now expected to handle inconsistencies in data, a task that was previously performed by databases. Customers had no choice but to accept additional application complexity for the sake of the scalability their data size demanded. With Spanner, customers no longer have to make a trade-off between scalability and consistency: they can have both.

Cost is a primary concern for all customers. Spanner offers a cost-effective starting point of $65 per month (varying by region), which can be further reduced to $45 per month with Committed Use Discounts (CUDs). While this is higher than the free entry point offered by some non-relational databases, the reality is that most enterprise workloads demand significantly more resources. In such scenarios, the price-performance ratio of the database becomes more important than the minimum entry cost. Spanner also provides a free tier for those who simply want to try it out, along with a local emulator that allows customers to develop directly without incurring additional costs.

Customers

Many Spanner customers are running non-relational workloads on Spanner today. Here’s a sampling:

Uber

Uber’s previous infrastructure, based on Cassandra and Ringpop, presented various challenges, including low developer productivity and leaky abstractions between the database and application layers. Notably, Cassandra’s lack of consistency forced developers to “think about compensating actions especially when the writes fail due to system failures. In some cases, the failure of [a system] might also result in an inconsistent state of entities that often required manual intervention,” increasing costs and operational overhead. Read more about their engineering journey on the Uber Engineering blog.

ShareChat

ShareChat, a leading social media platform in India, migrated their non-relational workload from a NoSQL database to Spanner. They were particularly impressed with the similarity between Spanner’s schema system and their previous NoSQL database, which enabled them to perform a no-downtime migration. Additionally, they were able to achieve significant cost savings by moving to Spanner.

“Unlike our legacy NoSQL database, we could scale without having to rethink existing tables or schema definitions and keep our data systems in sync across multiple locations. It’s also cost-effective for us — moving over 120 tables with 17 indexes into Cloud Spanner reduced our costs by 30%.” – Bhanu Singh, Co-founder and CTO at ShareChat.

Read more about ShareChat’s migration story.

Niantic

Niantic, the creators of the popular mobile game Pokémon GO, migrated their non-relational workload from Cloud Datastore (a non-relational datastore) to Cloud Spanner. “As the game matured, we decided we needed more control over the size and scale of the database,” said James Prompanya, Senior Staff Software Engineer at Niantic. “We also like the consistent indexing that Cloud Spanner provides.” Spanner’s consistent secondary indexes provided Niantic with a significant performance boost. Secondary indexes are used to improve the performance of queries that involve filtering or sorting on data. In some non-relational databases, secondary indexes are eventually consistent, which means that they may not be immediately up-to-date. This can lead to latency issues for applications.

Spanner vs. non-relational database concepts

Let’s take a closer look at how non-relational concepts translate to Spanner concepts:

| Non-relational | Spanner |
| --- | --- |
| Table | Table |
| Items | Rows |
| Attributes | Can be modeled in a number of ways: defining columns in the schema (the most idiomatic approach in Spanner), a JSON column (the closest equivalent), or an interleaved table with key-value pairs. |
| Primary Key | Primary Key: determines both the partitioning of the data across nodes and the sort order of the data within each node. |
| Secondary Indexes | Non-relational systems usually provide two types of indexes. Local secondary indexes can be modeled as interleaved secondary indexes. A global secondary index is equivalent to a secondary index in Spanner. Spanner lets users store non-key columns in the index, speeding up common read requests. It is important to note that all indexes are always transactionally consistent in Spanner. |
| Sparse Index | NULL-filtered indexes allow users to omit NULL rows from the index. When used in conjunction with generated columns, this offers a robust method for excluding a specific set of rows from the index. |
| Streams | Change Streams |
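
To make the mapping above more concrete, the following sketch collects GoogleSQL DDL statements that model each concept. Every table, column, index, and stream name is hypothetical and chosen only for illustration, and the apply_schema helper assumes that `database` is a Database object from the google-cloud-spanner Python client.

```python
# Hypothetical schema illustrating the concept mapping; all names are made up.
DDL_STATEMENTS = [
    # Attributes as schema columns (the idiomatic approach), an array column,
    # and a JSON column for free-form, document-like attributes.
    """CREATE TABLE Items (
        ItemId     STRING(36) NOT NULL,
        Category   STRING(64),
        Price      INT64,
        Tags       ARRAY<STRING(64)>,
        Attributes JSON
    ) PRIMARY KEY (ItemId)""",
    # Attributes as key-value pairs in a table interleaved under the parent row.
    """CREATE TABLE ItemAttributes (
        ItemId    STRING(36) NOT NULL,
        AttrKey   STRING(128) NOT NULL,
        AttrValue STRING(1024)
    ) PRIMARY KEY (ItemId, AttrKey),
      INTERLEAVE IN PARENT Items ON DELETE CASCADE""",
    # Local secondary index analogue: an index interleaved in the parent table.
    """CREATE INDEX ItemAttributesByValue
       ON ItemAttributes(ItemId, AttrValue), INTERLEAVE IN Items""",
    # Global secondary index analogue that also stores a non-key column.
    "CREATE INDEX ItemsByCategory ON Items(Category) STORING (Price)",
    # Sparse index analogue: rows whose Price is NULL are left out of the index.
    "CREATE NULL_FILTERED INDEX ItemsByPrice ON Items(Price)",
    # Streams analogue: a change stream that watches the Items table.
    "CREATE CHANGE STREAM ItemsStream FOR Items",
]


def apply_schema(database, timeout_seconds=300):
    """Submit the DDL above and wait for the schema change to finish."""
    operation = database.update_ddl(DDL_STATEMENTS)
    operation.result(timeout=timeout_seconds)
```

Which way of modeling attributes fits best depends on how well the attribute set is known in advance. Schema columns suit a fixed set, while the JSON column or the interleaved key-value table may suit attributes that vary from row to row.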

API

| Non-relational | Spanner |
| --- | --- |
| Control plane APIs like CreateTable | Schema Changes |
| Custom Query Languages | SQL (ANSI SQL or PG) with hints like FORCE_INDEX to specify a particular index usage |
| PutItem, BatchWriteItem | Transaction |
| UpdateItem | Read-write transaction. In Spanner all writes provide ACID guarantees. |
| GetItem, BatchGetItem, Query, Scan | Read API. Stale reads can be used to improve performance where the most recent data is not required. Stale reads are still consistent as of some prior timestamp. |
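
The write and read rows of this mapping could look roughly like the sketch below when using the google-cloud-spanner Python client. The Items table, its columns, and the three helper functions are hypothetical; `database` is assumed to be a Database object from that client, and `keyset` a KeySet such as KeySet(keys=[["item-1"]]).

```python
def put_item(database, item_id, category, price):
    """PutItem analogue: upsert one row; the batch commits atomically on exit."""
    with database.batch() as batch:
        batch.insert_or_update(
            table="Items",  # hypothetical table
            columns=("ItemId", "Category", "Price"),
            values=[(item_id, category, price)],
        )


def add_to_price(database, keyset, delta):
    """UpdateItem analogue: read-modify-write inside a read-write transaction."""

    def _work(transaction):
        rows = list(
            transaction.read(
                table="Items", columns=("ItemId", "Price"), keyset=keyset
            )
        )
        # Assumes Price is not NULL for the selected rows.
        updated = [(item_id, price + delta) for item_id, price in rows]
        transaction.update(
            table="Items", columns=("ItemId", "Price"), values=updated
        )
        return updated

    # The client may invoke _work more than once if the transaction aborts,
    # so the function should not have side effects outside the transaction.
    return database.run_in_transaction(_work)


def get_item(database, keyset):
    """GetItem analogue: a strong read through a read-only snapshot."""
    with database.snapshot() as snapshot:
        rows = snapshot.read(
            table="Items",
            columns=("ItemId", "Category", "Price"),
            keyset=keyset,
        )
        return list(rows)
```

Because the same keyset can name several keys, get_item also covers the BatchGetItem case in this sketch.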

Data types

Spanner provides scalar types that are largely the same as other storage systems across the industry. In addition, Spanner natively supports dynamic arrays of any supported type. Document-like scenarios can use Spanner’s JSON type.

Try Spanner for non-relational workloads today

Spanner is a versatile database that can be used for both relational and non-relational workloads. Spanner’s core concepts are similar to those of non-relational databases, but Spanner offers a number of advantages, such as stronger consistency guarantees, global secondary indexes, support for transactions, and SQL as a query language. These advantages make Spanner a good choice for a wide range of applications.

Interested in trying out Spanner for your non-relational workload? Get started for free!
