CacheLib, Facebook’s open source caching engine for web-scale services

By mullaned2002

September 2, 2021

Caching plays an important role in helping people access their information efficiently. For example, when an email app loads, it temporarily caches some messages, so the user can refresh the page without the app retrieving the same messages. However, large-scale caching has long been a complex engineering challenge. Companies must balance the fast experience people have come to expect from caching with keeping systems highly performant and cost-effective. Traditionally, each cache implementation is created and maintained independently by different engineering teams. This approach isn’t efficient, since it ignores different caching systems’ shared challenges, from deployment to maintenance.

As traditional dynamic random-access memory (DRAM) caches become more expensive and require more power to scale, companies like Facebook are exploring hardware choices such as non-volatile memory (NVM) drives to augment their caching systems. This DRAM and NVM hybrid model is a step forward, but innovative caching designs are needed to harness the full potential of the hybrid cache. This includes new caching heuristics research that must push the boundaries of traditional systems by identifying the relevant content to cache for the right duration. We have consolidated these innovations and taken them a step further through collaborations and open source work.
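To make the hybrid idea concrete, here is a minimal sketch of a two-tier cache: a small LRU tier standing in for DRAM, with evicted items falling through to a larger tier standing in for NVM, and promotion back to the fast tier on a hit. The class name HybridCacheSketch and all of its methods are hypothetical and written only for illustration. This is not CacheLib's API, and both tiers are plain in-memory containers rather than real devices.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Toy two-tier cache (hypothetical, not CacheLib's API).
// "DRAM" tier: bounded LRU list. "NVM" tier: unbounded map that only
// simulates a slower, larger device.
class HybridCacheSketch {
 public:
  explicit HybridCacheSketch(std::size_t dram_capacity)
      : dram_capacity_(dram_capacity) {
    assert(dram_capacity_ > 0);
  }

  // Insert or update a key. New and updated items always land in DRAM.
  void put(const std::string& key, const std::string& value) {
    nvm_.erase(key);  // drop any stale copy in the lower tier
    auto it = dram_index_.find(key);
    if (it != dram_index_.end()) {
      it->second->second = value;
      dram_lru_.splice(dram_lru_.begin(), dram_lru_, it->second);
      return;
    }
    dram_lru_.emplace_front(key, value);
    dram_index_[key] = dram_lru_.begin();
    if (dram_lru_.size() > dram_capacity_) {
      evict_to_nvm();
    }
  }

  // Look up a key: DRAM first, then NVM. An NVM hit is promoted to DRAM.
  // Returns false on a miss and leaves value_out untouched.
  bool get(const std::string& key, std::string& value_out) {
    auto it = dram_index_.find(key);
    if (it != dram_index_.end()) {
      dram_lru_.splice(dram_lru_.begin(), dram_lru_, it->second);
      value_out = it->second->second;
      return true;
    }
    auto nvm_it = nvm_.find(key);
    if (nvm_it == nvm_.end()) {
      return false;
    }
    std::string value = std::move(nvm_it->second);
    nvm_.erase(nvm_it);
    put(key, value);  // promotion may push another item down to NVM
    value_out = value;
    return true;
  }

  bool in_dram(const std::string& key) const {
    return dram_index_.count(key) > 0;
  }
  bool in_nvm(const std::string& key) const { return nvm_.count(key) > 0; }

 private:
  using Entry = std::pair<std::string, std::string>;

  // Move the least recently used DRAM item into the NVM tier.
  void evict_to_nvm() {
    Entry& victim = dram_lru_.back();
    dram_index_.erase(victim.first);
    nvm_[victim.first] = std::move(victim.second);
    dram_lru_.pop_back();
  }

  std::size_t dram_capacity_;
  std::list<Entry> dram_lru_;  // front = most recently used
  std::unordered_map<std::string, std::list<Entry>::iterator> dram_index_;
  std::unordered_map<std::string, std::string> nvm_;
};
```

With a DRAM capacity of 2, inserting keys a, b, and c pushes a down to the simulated NVM tier, and a later lookup of a promotes it back while b takes its place in the lower tier. A production design would also have to decide which evicted items are worth writing to flash at all, which is the kind of heuristic the paragraph above refers to.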

Today, we’re announcing the release of CacheLib, a pluggable in-process caching engine to build and scale high-performance services collaboratively. CacheLib’s C++ library enables developers to build and customize scalable and concurrent caches through its simple API. We are also open-sourcing CacheBench, a benchmarking tool for evaluating caching performance on diverse production workloads.

CacheLib is used as an in-process cache in more than 70 large-scale systems at Facebook, including the social graph, content delivery network, storage, and look-aside key-value caches. This existing scale and the potential for open source adoption make CacheLib an aggregation point for optimizations and CacheBench an effective benchmarking tool for evaluating new ideas across diverse caching applications.

Enabling innovation through partnerships

As an open source platform, CacheLib and CacheBench have the potential to become an industry standard for caching innovations and benchmarking. To date, our collaborations with research universities, hardware manufacturers, and software companies have yielded substantial results that show the value of this toolkit.

Over the past two years, we have partnered with many well-known organizations to push the boundaries of caching innovation. Today, we are working with Twitter on integrating CacheLib into Pelikan.io to enable SSDs for caching objects within the Twitter infrastructure. Pinterest is evaluating the adoption of CacheLib within its machine learning infrastructure systems to improve prediction performance and system stability.

In academia, researchers at Carnegie Mellon University, Princeton University, and Yale University are using CacheLib and CacheBench to prototype research ideas. By evaluating their prototypes against industry caching workloads, these researchers can iterate on their projects much more quickly and accurately than before.

We have also collaborated with hardware industry partners like Intel, KIOXIA, Samsung, and Western Digital to standardize and enhance SSD technologies that enable improved caching solutions. This work is now part of the Open Compute Project (OCP) NVMe Cloud SSD Specification, which we discussed in this webinar. This specification, along with CacheLib, will help adapt future NVM technologies for caching workloads across the industry.

CacheLib and CacheBench have enormous potential to shape the future of caching, thanks to their developer-friendly API, access to many benchmark workloads across the industry, and the collaborative nature of open source. We are thankful for our partners’ support and contributions in using the platform to drive innovation in such an important and complex area. We are open-sourcing this work to make building the future of caching a more collaborative and open effort across the entire industry. Read more about how to contribute to the project here.

The post CacheLib, Facebook’s open source caching engine for web-scale services appeared first on Facebook Engineering.
