LERC: Coordinated Cache Management for Data-Parallel Systems
Yinghao Yu, Wei Wang, Jun Zhang, Khaled B. Letaief

TL;DR
This paper introduces LERC, a cache management policy for data-parallel systems that caches all of the data blocks a task depends on together. It achieves larger job speedups than policies that optimize for cache hit ratio alone.
Contribution
The paper proposes the effective cache hit ratio metric and the LERC policy, which caches a task's dependent data blocks as a whole to reduce task completion times in data-parallel systems.
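To make the idea of caching dependent blocks as a whole more concrete, here is a toy eviction heuristic in Python. It is only a sketch motivated by the all-or-nothing property described in the abstract, not the paper's LERC algorithm, whose details this summary does not give. The names `pick_victim` and `useful_refs`, and the representation of tasks as a dictionary of block lists, are assumptions made for illustration.

```python
from typing import Dict, List, Set


def pick_victim(cached: Set[str], tasks: Dict[str, List[str]]) -> str:
    """Pick a block to evict (hypothetical heuristic, not the paper's algorithm).

    A cached block is only useful to a task when every block that task
    depends on is cached too, so we evict the block that serves the
    fewest fully cached tasks.
    """

    def useful_refs(block: str) -> int:
        # Count tasks that read this block and have all their inputs cached.
        return sum(
            1
            for blocks in tasks.values()
            if block in blocks and all(b in cached for b in blocks)
        )

    # sorted() makes tie-breaking deterministic (alphabetical order).
    return min(sorted(cached), key=useful_refs)


# Task t1 reads blocks a and b; task t2 reads blocks c and d.
tasks = {"t1": ["a", "b"], "t2": ["c", "d"]}

# Block d is missing, so the cached copy of c cannot speed up t2.
victim = pick_victim({"a", "b", "c"}, tasks)
print(victim)
```

In this example a recency-based policy could just as easily evict `a` or `b`, which would leave both tasks without a complete set of cached inputs.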
Findings
LERC improves job speedup by up to 37% over LRU.
Effective cache hit ratio correlates better with task performance than the conventional cache hit ratio does.
LERC is implemented in Spark and evaluated on Amazon EC2.
Abstract
Memory caches are being aggressively used in today's data-parallel frameworks such as Spark, Tez and Storm. By caching input and intermediate data in memory, compute tasks can witness speedup by orders of magnitude. To maximize the chance of in-memory data access, existing cache algorithms, be it recency- or frequency-based, settle on cache hit ratio as the optimization objective. However, unlike the conventional belief, we show in this paper that simply pursuing a higher cache hit ratio of individual data blocks does not necessarily translate into faster task completion in data-parallel environments. A data-parallel task typically depends on multiple input data blocks. Unless all of these blocks are cached in memory, no speedup will result. To capture this all-or-nothing property, we propose a more relevant metric, called effective cache hit ratio. Specifically, a cache hit of a data…
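The difference between the two metrics can be illustrated with a small Python sketch. Because the abstract is truncated before the formal definition, the code assumes that a cached block counts as an effective hit only when every block its task depends on is also cached, which follows from the all-or-nothing property stated above. The function name `hit_ratios` and the task layout are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def hit_ratios(tasks: Dict[str, List[str]], cached: Set[str]) -> Tuple[float, float]:
    """Return (cache hit ratio, effective cache hit ratio) over all block accesses.

    Assumption: a hit is "effective" only if all input blocks of the
    accessing task are cached.
    """
    total = hits = effective = 0
    for blocks in tasks.values():
        all_cached = all(b in cached for b in blocks)
        for b in blocks:
            total += 1
            if b in cached:
                hits += 1
                if all_cached:
                    effective += 1
    if total == 0:
        return 0.0, 0.0
    return hits / total, effective / total


# Two tasks, each depending on two input blocks.
tasks = {"t1": ["a", "b"], "t2": ["c", "d"]}

# Same number of cached blocks, placed differently.
spread = hit_ratios(tasks, {"a", "c"})   # one block per task
grouped = hit_ratios(tasks, {"a", "b"})  # both blocks of t1

print(spread)   # (0.5, 0.0)
print(grouped)  # (0.5, 0.5)
```

Both placements give the same cache hit ratio, but only the grouped placement lets a task read all of its inputs from memory, which is the case the effective metric is meant to reward.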
Taxonomy
Topics: Cloud Computing and Resource Management · Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques
