DistCache: Provable Load Balancing for Large-Scale Storage Systems with Distributed Caching
Zaoxing Liu, Zhihao Bai, Zhenming Liu, Xiaozhou Li, Changhoon Kim, Vladimir Braverman, Xin Jin, Ion Stoica

TL;DR
DistCache is a distributed caching mechanism that provides provable load balancing and scalable throughput for large-scale storage systems. It partitions hot objects between cache layers with independent hash functions and routes queries adaptively with the power-of-two-choices.
Contribution
It introduces a co-designed cache allocation and routing mechanism that provably balances load and scales linearly with cache nodes in distributed storage environments.
Findings
Cache throughput increases linearly with the number of cache nodes.
DistCache effectively balances load across cache nodes.
The system is applicable to various storage architectures.
Abstract
Load balancing is critical for distributed storage to meet strict service-level objectives (SLOs). It has been shown that a fast cache can guarantee load balancing for a clustered storage system. However, when the system scales out to multiple clusters, the fast cache itself would become the bottleneck. Traditional mechanisms like cache partition and cache replication either result in load imbalance between cache nodes or have high overhead for cache coherence. We present DistCache, a new distributed caching mechanism that provides provable load balancing for large-scale storage systems. DistCache co-designs cache allocation with cache topology and query routing. The key idea is to partition the hot objects with independent hash functions between cache nodes in different layers, and to adaptively route queries with the power-of-two-choices. We prove that DistCache enables the cache…
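The key idea in the abstract (independent hash functions per cache layer, plus power-of-two-choices routing) can be illustrated with a minimal sketch. This is not the authors' implementation: the two-layer layout, the salted SHA-256 hashing, the per-node query counter used as the load signal, and the names `layer_hash` and `TwoLayerCache` are all assumptions made for illustration.

```python
import hashlib


def layer_hash(key: str, salt: str, num_nodes: int) -> int:
    """Map a key to a node index; a different salt gives an independent mapping.

    Salted SHA-256 is an assumption standing in for the per-layer hash functions.
    """
    digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


class TwoLayerCache:
    """Hypothetical two-layer cache in which each layer partitions objects
    with its own hash function, so a hot object has one home per layer."""

    def __init__(self, nodes_per_layer: int) -> None:
        self.nodes_per_layer = nodes_per_layer
        # load[layer][node] counts queries served so far (assumed load signal).
        self.load = [[0] * nodes_per_layer for _ in range(2)]

    def candidates(self, key: str) -> list[tuple[int, int]]:
        """Return the (layer, node) pair that caches `key` in each layer."""
        return [
            (layer, layer_hash(key, f"layer{layer}", self.nodes_per_layer))
            for layer in range(2)
        ]

    def route(self, key: str) -> tuple[int, int]:
        """Power-of-two-choices: send the query to the less loaded candidate."""
        layer, node = min(
            self.candidates(key), key=lambda c: self.load[c[0]][c[1]]
        )
        self.load[layer][node] += 1
        return layer, node
```

In this sketch, routing a single hot key ten times alternates between its two candidate nodes, so each serves five queries instead of one node serving all ten. A real system would use a measured load signal rather than a local counter.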
Taxonomy
Topics: Caching and Content Delivery · Advanced Data Storage Technologies · Cloud Computing and Resource Management
