Kochava is a real-time data company offering leading omni-channel measurement and attribution solutions for data-driven marketers and publishers.
Via the Marketers Operating System™ (m/OS), Kochava delivers actionable data, providing marketers with a platform that seamlessly integrates and manages customer identity, measurement and data controls. With a unified view of all data and critical omni-channel solutions in a cohesive, operational system, the platform goes beyond data aggregation and reporting. By design, m/OS facilitates success by making data accessible and actionable to maximize ROI.
Kochava delivers what marketers need, when they need it, to establish customer identity and segment and activate audiences in a privacy-first world.
Kochava’s journey on Cloud Spanner
Kochava has thousands of customers, some of whom process millions of clicks and impressions per minute. Depending on the type of ad signal, the type of customer, and the exact product in use, this data was processed and stored in on-prem MySQL or Aerospike clusters in our vertically scaled legacy systems. The features available to a customer differed depending on which system was used, creating an inconsistent user experience. The legacy databases also became performance bottlenecks during peak periods, and we could not easily scale our way out of them. We needed a single consolidated solution that could support all our current and planned features across various categories and scale to meet our real-time processing requirements. We chose Cloud Spanner.
Multi-Tenant Account Creation Flow
At Kochava we require each customer’s data to be stored in its own database. To achieve this, we automatically create Spanner resources while provisioning a Kochava customer account. When an account is created, we look for an existing Spanner instance with space for another database. If none is available, we create a new Spanner instance. When we create a new Spanner instance, we also create autoscaling resources and account-specific service accounts to access the databases.
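The placement step in this flow can be sketched in a few lines of Python. This is a minimal illustration and not Kochava's implementation: the `Instance` class is a hypothetical in-memory stand-in for a Spanner instance, `provision_account` and the naming scheme are invented for the example, and the per-instance database cap is an assumption that should be checked against current Spanner quotas.

```python
from dataclasses import dataclass, field

# Assumption: per-instance database cap; verify against current Spanner quotas.
MAX_DATABASES_PER_INSTANCE = 100


@dataclass
class Instance:
    """Hypothetical in-memory stand-in for a Spanner instance."""
    name: str
    databases: list[str] = field(default_factory=list)

    def has_room(self) -> bool:
        return len(self.databases) < MAX_DATABASES_PER_INSTANCE


def provision_account(instances: list[Instance], account_id: str) -> Instance:
    """Place a new per-account database, creating an instance only if needed."""
    # Reuse the first existing instance that still has space.
    target = next((i for i in instances if i.has_room()), None)
    if target is None:
        # A real system would create the instance here, together with its
        # autoscaling resources and account-specific service accounts.
        target = Instance(name=f"instance-{len(instances) + 1}")
        instances.append(target)
    target.databases.append(f"account-{account_id}")
    return target
```

Picking the first instance with room keeps the sketch simple; a production system might instead balance databases across instances by load.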
After an account has been created along with its corresponding Spanner resources, it’s time to ingest and process data. Ad signal data is processed in real time for each account and is stored in Cloud Spanner for real-time attribution and BigQuery for reporting. Other systems that need to look up real-time ad signal data for event attribution do real-time reads on Cloud Spanner via the Ads Storage API.
Benefits of choosing Cloud Spanner
Ad signal data was previously spread across MySQL, Aerospike, and Spanner. Each storage engine had its own schema, management, and applications. Each of these areas has been simplified by consolidating all real-time access storage on Spanner. This is especially useful during on-call and training because there are fewer systems involved. The consolidation greatly reduces the time needed to maintain and develop product features across multiple systems.
As part of our research, we used an upcoming product, whose throughput and access patterns resembled those of our planned ad signal system migration, as a testbed for database candidates. We tried various databases, and Spanner was the easiest to implement, had the best performance, and offered the best managed service. After launching and running this project in production for two years, we are comfortable with the technology and confident it can provide what we need for our big migration project.
Part of the performance testing involved scaling up and down to keep costs low while keeping latencies low enough and throughput high enough for our data processing pipelines across multiple systems, in real time and without any downtime. Spanner did all of this while remaining fully ACID-compliant, which means we can trust it to store our data.
We use MySQL for a large variety of our databases. Our developers know how to write SQL, and one of the databases we are directly migrating from is MySQL-based. A familiar query language is a boost to developer productivity. With Aerospike, for example, we needed internal Aerospike experts to guide the rest of the team. Cloud Spanner is ANSI-SQL compliant, so our developers are able to self-serve.
As our number of customers grew (with each customer having unpredictable traffic), we ran into scalability issues with our legacy databases. Sometimes we needed to spin up a new cluster urgently or beef up an existing cluster without any downtime. This involved Operations Engineers making trips to the data center and DBAs regularly swapping database clusters or migrating data. With Spanner, if we need a new instance, we can create one using Terraform. Spanner instances scale horizontally by configuring the number of nodes they use. This scaling can be done dynamically based on load by using a Spanner autoscaler. The autoscaler allows us to optimize cost by closely tracking workload needs, so we no longer need to provision for peak load all the time. This has allowed us to reduce consumption by 30% during periods of low usage. We can focus our energy on using the database and building applications rather than scrambling to make things work when our legacy databases reach scalability limits.
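To illustrate how load-based scaling can track workload, here is a minimal sketch of a linear scaling rule: it picks the node count that would bring CPU utilization back to a target level. The function name, the 65% target, and the node bounds are assumptions made for this example, not the configuration Kochava uses or the exact logic of any particular autoscaler.

```python
import math


def suggest_node_count(current_nodes: int, cpu_utilization: float,
                       target_utilization: float = 0.65,  # assumed target
                       min_nodes: int = 1, max_nodes: int = 10) -> int:
    """Linear scaling rule: size the instance so CPU lands near the target."""
    if current_nodes < 1 or not 0 < target_utilization <= 1:
        raise ValueError("invalid node count or target utilization")
    # Total work is roughly nodes * utilization; divide by the target
    # utilization to get the node count that would carry the same work.
    wanted = math.ceil(current_nodes * cpu_utilization / target_utilization)
    # Clamp to the configured bounds so the instance never scales to zero
    # or past the budgeted maximum.
    return max(min_nodes, min(max_nodes, wanted))
```

Under these assumptions, an instance with 4 nodes at 90% CPU would be scaled out to 6 nodes, and one with 10 nodes at 20% CPU would be scaled in to 4.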
Spanner supports many enterprise-grade features like time to live (TTL). With TTL, data older than a certain threshold is automatically deleted from the tables. We no longer need to run background programs that check each row to determine what can be deleted in order to keep our tables small and our storage costs low.
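In Spanner, TTL is expressed as a row deletion policy on a table. The sketch below builds such a DDL statement in Python. The helper `ttl_ddl` and the table and column names in the usage note are hypothetical; only the `ROW DELETION POLICY (OLDER_THAN(...))` clause is Spanner's own syntax.

```python
def ttl_ddl(table: str, timestamp_column: str, days: int) -> str:
    """Build an ALTER TABLE statement that adds a row deletion policy."""
    # Reject anything that is not a plain identifier so the statement
    # cannot be altered by unexpected input.
    for name in (table, timestamp_column):
        if not name.isidentifier():
            raise ValueError(f"not a valid identifier: {name!r}")
    if days < 0:
        raise ValueError("days must be non-negative")
    return (
        f"ALTER TABLE {table} ADD ROW DELETION POLICY "
        f"(OLDER_THAN({timestamp_column}, INTERVAL {days} DAY))"
    )
```

For example, `ttl_ddl("AdSignals", "ReceivedAt", 30)` produces a statement asking Spanner to delete rows whose `ReceivedAt` timestamp is more than 30 days old. The statement would then be applied as a schema update through whichever client or tool manages the database.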
Join the future
With Spanner’s unique features, Kochava is able to power the future of mobile app analytics by providing customers with predictable performance, infrastructure that can scale to customers’ demands, and a consistent user experience. Try Kochava for yourself here!
Get started today with Cloud Spanner’s free trial!