Cloud Spanner powers Kochava’s mobile analytics platform

By mullaned2002

March 10, 2023


Kochava is a real-time data company offering leading omni-channel measurement and attribution solutions for data-driven marketers and publishers.

Via the Marketers Operating System™ (m/OS), Kochava delivers actionable data, providing marketers with a platform that seamlessly integrates and manages customer identity, measurement and data controls. With a unified view of all data and critical omni-channel solutions in a cohesive, operational system, the platform goes beyond data aggregation and reporting. By design, m/OS facilitates success by making data accessible and actionable to maximize ROI.

Kochava delivers what marketers need, when they need it, to establish customer identity and segment and activate audiences in a privacy-first world.

Kochava’s journey on Cloud Spanner

Kochava has thousands of customers, some of whom process millions of clicks and impressions per minute. Depending on the type of ad signal, the type of customer, and the exact product in use, this data was processed and stored in on-prem MySQL or Aerospike clusters in our vertically scaled legacy systems. The features available to a customer differed depending on which system was used, creating an inconsistent user experience. The legacy databases also became bottlenecks during peak periods, and because they scaled vertically, we could not simply scale out of those bottlenecks. We needed a single consolidated solution that could support all our current and planned features across various categories and could scale to meet our real-time processing requirements. We chose Cloud Spanner.

Architecture

Multi-Tenant Account Creation Flow

At Kochava, we require each customer’s data to be stored in its own database. To achieve this, we automatically create Spanner resources while provisioning a Kochava customer account. When an account is created, we look for an existing Spanner instance with space for another database. If none is available, we create a new Spanner instance. When we create a new Spanner instance, we also create autoscaling resources and account-specific service accounts to access the databases.
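The placement step described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not Kochava's provisioning code: the `Instance` class, the `provision_account` function, the naming scheme, and the `MAX_DATABASES_PER_INSTANCE` cap are all hypothetical, and a real implementation would call the Spanner admin API or Terraform instead of mutating a list.

```python
from dataclasses import dataclass, field

# Assumed cap on databases per instance; check the current Spanner quota.
MAX_DATABASES_PER_INSTANCE = 100


@dataclass
class Instance:
    """Hypothetical stand-in for a Spanner instance and the databases it hosts."""
    name: str
    databases: list[str] = field(default_factory=list)

    def has_space(self) -> bool:
        return len(self.databases) < MAX_DATABASES_PER_INSTANCE


def provision_account(instances: list[Instance], account_id: str) -> Instance:
    """Place the account's database on an instance with space, creating one if needed."""
    target = next((i for i in instances if i.has_space()), None)
    if target is None:
        # A real system would also create the autoscaling resources and
        # account-specific service accounts at this point.
        target = Instance(name=f"instance-{len(instances) + 1}")
        instances.append(target)
    # One database per customer account.
    target.databases.append(f"account-{account_id}")
    return target
```

Calling `provision_account` repeatedly fills the first instance until it reaches the cap, then starts a new one.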

After an account has been created along with its corresponding Spanner resources, it’s time to ingest and process data. Ad signal data is processed in real time for each account and is stored in Cloud Spanner for real-time attribution and BigQuery for reporting. Other systems that need to look up real-time ad signal data for event attribution do real-time reads on Cloud Spanner via the Ads Storage API.

Benefits of choosing Cloud Spanner

Database consolidation

Ad signal data was previously stored in MySQL, Aerospike, and Spanner. Each storage engine had its own schema, management, and applications. Each of these areas has been simplified by consolidating all real-time access storage on Spanner. This is especially useful for on-call work and training because fewer systems are involved. The consolidation greatly reduces the time needed to maintain and develop product features.

Battle-tested performance

As part of our research, we used an upcoming product as a testbed for database candidates, because its throughput and access patterns were similar to those of our future ad signal migration project. We tried various databases, and Spanner was the easiest to implement, had the best performance, and offered the best managed service. After launching and running this project in production for two years, we are comfortable with the technology and confident it can provide what we need for our big migration project.

Part of the performance testing involved scaling up and down to keep costs low while keeping latencies low enough and throughput high enough for all our data processing pipelines across multiple systems, in real time and without downtime. Because Spanner does all of this while remaining fully ACID-compliant, we can trust it to store our data.

SQL

We use MySQL for a large variety of our databases. Our developers know how to write SQL, and one of the databases we are directly migrating from is MySQL-based. Having a familiar query language boosts our developer productivity. With Aerospike, for example, we needed internal Aerospike experts to guide the rest of the team. Cloud Spanner is ANSI SQL compliant, so our developers are able to self-serve.

Fully-managed

As our number of customers grew (with each customer having unpredictable traffic), we ran into scalability issues with our legacy databases. Sometimes we needed to spin up a new cluster urgently or beef up an existing cluster without any downtime. This involved Operations Engineers making trips to the data center and DBAs regularly swapping database clusters or migrating data. With Spanner, if we need a new instance, we can just create one using Terraform. Spanner instances scale horizontally by changing the number of nodes they use, and this scaling can be done dynamically based on load by using a Spanner autoscaler. The autoscaler lets us optimize cost by closely tracking workload needs, so we no longer need to provision for peak load all the time. This has allowed us to reduce consumption by 30% during periods of low usage. We can focus our energy on using the database and building applications rather than scrambling to make things work when our legacy databases reach scalability limits.
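To illustrate what load-based scaling means in practice, here is a minimal sketch of a proportional scaling rule. It is not the autoscaler's actual algorithm: the `recommend_nodes` function, the 65% target utilization, and the node bounds are assumptions chosen for the example.

```python
import math


def recommend_nodes(
    current_nodes: int,
    cpu_utilization: float,
    target_utilization: float = 0.65,  # assumed target, not from the article
    min_nodes: int = 1,                # assumed lower bound
    max_nodes: int = 10,               # assumed upper bound
) -> int:
    """Suggest a node count that brings CPU utilization back toward the target."""
    if current_nodes < 1:
        raise ValueError("current_nodes must be at least 1")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Total load is proportional to nodes * utilization; divide by the
    # target to get the node count that would run at that utilization.
    needed = math.ceil(current_nodes * cpu_utilization / target_utilization)
    # Clamp so the instance never shrinks or grows past the configured bounds.
    return max(min_nodes, min(max_nodes, needed))
```

With these assumed defaults, four nodes at 90% CPU yield a recommendation of six nodes, while four nodes at 30% CPU yield two, which is how tracking the workload avoids provisioning for peak load all the time.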

Enterprise features

Spanner supports many enterprise-grade features, such as time to live (TTL). With TTL, data older than a certain threshold is automatically deleted from the tables. We no longer need to run background programs that check each row to determine what can be deleted in order to keep our tables small and our storage costs low.
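In Spanner's GoogleSQL dialect, TTL is configured as a row deletion policy on a timestamp column. The sketch below builds such a DDL statement as a string. The `ttl_ddl` helper and the `AdSignals` and `CreatedAt` names are hypothetical and do not describe Kochava's schema.

```python
def ttl_ddl(table: str, timestamp_column: str, days: int) -> str:
    """Build a DDL statement that adds a row deletion policy (TTL) to a table."""
    if days < 0:
        raise ValueError("days must be non-negative")
    # Rows whose timestamp column is older than `days` days become
    # eligible for automatic background deletion.
    return (
        f"ALTER TABLE {table} "
        f"ADD ROW DELETION POLICY (OLDER_THAN({timestamp_column}, INTERVAL {days} DAY))"
    )


# Hypothetical table and column names, for illustration only.
statement = ttl_ddl("AdSignals", "CreatedAt", 30)
```

The resulting statement would be submitted as a schema update. The helper does no identifier escaping, so it should only be used with trusted table and column names.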

Join the future

With Spanner’s unique features, Kochava is able to power the future of mobile app analytics by providing customers with predictable performance, infrastructure that can scale to customers’ demands, and a consistent user experience. Try Kochava for yourself here!

Get started today with Cloud Spanner’s free trial!
