VIP Hashing -- Adapting to Skew in Popularity of Data on the Fly (extended version)
Aarati Kakaraparthy, Jignesh M. Patel, Brian P. Kroth, Kwanghyun Park

TL;DR
VIP hashing is an online, adaptive hash table method that exploits skew in data popularity to improve the performance of in-memory hash tables, including under changing workloads.
Contribution
The paper introduces a lightweight, non-blocking hashing technique that learns the skew in data popularity online and adapts the hash table layout in response.
Findings
Achieves up to 77% throughput increase under medium skew.
Reduces TPC-H Query 9 execution time by 20%.
Robust to workload changes and data insertions/deletions.
Abstract
All data is not equally popular. Often, some portion of data is more frequently accessed than the rest, which causes a skew in popularity of the data items. Adapting to this skew can improve performance, and this topic has been studied extensively in the past for disk-based settings. In this work, we consider an in-memory data structure, namely hash table, and show how one can leverage the skew in popularity for higher performance. Hashing is a low-latency operation, sensitive to the effects of caching, branch prediction, and code complexity among other factors. These factors make learning in-the-loop especially challenging as the overhead of performing any additional operations can be significant. In this paper, we propose VIP hashing, a fully online hash table method, that uses lightweight mechanisms for learning the skew in popularity and adapting the hash table layout. These…
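To make the idea of adapting a hash table layout to popularity skew concrete, here is a minimal sketch. It is not the authors' algorithm: it uses a classic count-based self-organizing heuristic on a chained hash table, in which each lookup bumps a per-entry counter and moves the entry one step toward the front of its bucket once it has been accessed more often than its predecessor. The names `SkewAwareTable`, `put`, `get`, and `probes` are hypothetical and chosen for illustration only.

```python
class SkewAwareTable:
    """Chained hash table whose buckets reorder themselves by access count.

    Illustrative only: a count-based promotion heuristic, not VIP hashing.
    """

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for entry in bucket:
            if entry[0] == key:
                entry[1] = value  # overwrite existing key
                return
        bucket.append([key, value, 0])  # [key, value, access count]

    def get(self, key):
        bucket = self._bucket(key)
        for i, entry in enumerate(bucket):
            if entry[0] == key:
                entry[2] += 1
                # Promote by one position if now more popular than the
                # entry directly ahead of it in the chain.
                if i > 0 and bucket[i - 1][2] < entry[2]:
                    bucket[i - 1], bucket[i] = bucket[i], bucket[i - 1]
                return entry[1]
        raise KeyError(key)

    def probes(self, key):
        """Number of entries inspected to find `key` (1 = front of bucket)."""
        for i, entry in enumerate(self._bucket(key)):
            if entry[0] == key:
                return i + 1
        raise KeyError(key)


# A single bucket forces every key to collide, which makes the effect visible.
table = SkewAwareTable(num_buckets=1)
for k in ["a", "b", "c", "d"]:
    table.put(k, k.upper())

before = table.probes("d")
for _ in range(3):
    table.get("d")
after = table.probes("d")
```

In this sketch the frequently requested key migrates to the front of its bucket, so later lookups inspect fewer entries. The bookkeeping on every access (a counter increment and a possible swap) is exactly the kind of in-the-loop overhead the abstract identifies as a challenge for a low-latency operation like hashing.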
Taxonomy
Topics: Caching and Content Delivery · Advanced Data Storage Technologies · Advanced Image and Video Retrieval Techniques
