Multi-step LRU: SIMD-based Cache Replacement for Lower Overhead and Higher Precision
Hiroshi Inoue

TL;DR
This paper introduces multi-step LRU, a SIMD-based cache replacement algorithm that raises both cache hit ratio and throughput without storing per-item LRU metadata, outperforming the original LRU and GCLOCK.
Contribution
The paper proposes multi-step LRU, a cache replacement algorithm that uses SIMD instructions to improve cache hit ratio and throughput without extra per-item memory overhead.
Findings
Multi-step LRU outperforms original LRU and GCLOCK in speed and cache hit ratio.
It implicitly considers access frequency and recency, improving cache effectiveness.
Results are comparable to ARC, with less metadata overhead.
Abstract
A key-value cache is a key component of many services, providing low-latency and high-throughput access to huge amounts of data. To improve the end-to-end performance of such services, a key-value cache must achieve a high cache hit ratio with high throughput. In this paper, we propose a new cache replacement algorithm, multi-step LRU, which achieves high throughput by efficiently exploiting SIMD instructions without using additional per-item memory (LRU metadata) to record information such as the last access timestamp. For a small set of items that can fit within a vector register, SIMD-based LRU management without LRU metadata is already known (in-vector LRU). It remembers the access history by reordering the items in one vector using a vector shuffle instruction. In-vector LRU alone cannot be used for a caching system since it can manage only a few items. Set-associative cache is a…
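The in-vector LRU idea described in the abstract can be illustrated with a minimal scalar sketch. This is not the paper's implementation: it emulates the vector shuffle with a Python list, and the names `InVectorLRU`, `VECTOR_WIDTH`, and `access` are hypothetical, chosen only for illustration. The point it shows is that the position of a key inside the vector encodes its recency, so no timestamp or other per-item metadata is needed.

```python
from typing import List, Optional

# Assumption for illustration: four keys fit in one vector register.
VECTOR_WIDTH = 4


class InVectorLRU:
    """Scalar emulation of in-vector LRU (hypothetical sketch).

    Slot 0 holds the most recently used key and the last slot holds the
    least recently used key. Recency is encoded purely by position.
    """

    def __init__(self, width: int = VECTOR_WIDTH) -> None:
        self.slots: List[Optional[int]] = [None] * width

    def access(self, key: int) -> bool:
        """Look up `key`; return True on a hit, False on a miss.

        Either way the key ends up in slot 0. On a miss the key in the
        last slot (the LRU position) is overwritten, i.e. evicted.
        """
        hit = key in self.slots
        pos = self.slots.index(key) if hit else len(self.slots) - 1
        # Emulates the shuffle: slots [0, pos) move one step toward the
        # LRU end, and the accessed key is placed at the MRU position.
        self.slots[1:pos + 1] = self.slots[0:pos]
        self.slots[0] = key
        return hit


cache = InVectorLRU()
for k in (1, 2, 3, 4):
    cache.access(k)          # four misses fill the vector: [4, 3, 2, 1]
cache.access(2)              # hit: 2 moves to the front -> [2, 4, 3, 1]
cache.access(5)              # miss: 1 (LRU) is evicted -> [5, 2, 4, 3]
```

In a real SIMD version, the lookup would be a vector compare and the reordering a single shuffle (permute) instruction, which is why the abstract notes that this scheme only scales to the handful of items that fit in one register.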
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Network Packet Processing and Optimization
