VIP Hashing -- Adapting to Skew in Popularity of Data on the Fly   (extended version)

Aarati Kakaraparthy; Jignesh M. Patel; Brian P. Kroth; Kwanghyun Park

arXiv:2206.12380 · cs.DB · June 27, 2022 · 1 citation

TL;DR

VIP hashing is an online, adaptive hash table method that exploits skew in data popularity to raise the throughput of in-memory hash tables, and it remains robust as workloads change.

Contribution

The paper introduces a lightweight, non-blocking, adaptive hashing technique that learns the skew in data popularity and responds to it in real time.

Findings

01. Achieves up to 77% throughput increase under medium skew.

02. Reduces TPC-H Query 9 execution time by 20%.

03. Robust to workload changes and data insertions/deletions.

Abstract

All data is not equally popular. Often, some portion of data is more frequently accessed than the rest, which causes a skew in popularity of the data items. Adapting to this skew can improve performance, and this topic has been studied extensively in the past for disk-based settings. In this work, we consider an in-memory data structure, namely hash table, and show how one can leverage the skew in popularity for higher performance. Hashing is a low-latency operation, sensitive to the effects of caching, branch prediction, and code complexity among other factors. These factors make learning in-the-loop especially challenging as the overhead of performing any additional operations can be significant. In this paper, we propose VIP hashing, a fully online hash table method, that uses lightweight mechanisms for learning the skew in popularity and adapting the hash table layout. These…
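
To make the idea of adapting a hash table layout to popularity skew concrete, here is a minimal Python sketch. It is an illustration only and not the paper's algorithm: the class name `SkewAwareTable`, the per-entry hit counter, and the rule of promoting an entry by one slot are all assumptions chosen for brevity. It counts on every lookup, which is exactly the kind of overhead the abstract warns about, so treat it as a toy model of the goal rather than of the lightweight mechanisms VIP hashing uses.

```python
class SkewAwareTable:
    """Toy chained hash table that drifts frequently read keys toward the
    front of their bucket. Hypothetical illustration, not VIP hashing itself."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]
        self.probes = 0  # total entries inspected across all lookups

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for entry in bucket:
            if entry[0] == key:
                entry[1] = value  # overwrite an existing key in place
                return
        bucket.append([key, value, 0])  # [key, value, hit count]

    def get(self, key, default=None):
        bucket = self._bucket(key)
        for i, entry in enumerate(bucket):
            self.probes += 1
            if entry[0] == key:
                entry[2] += 1
                # Promote by one slot once this entry has more hits than
                # the entry directly in front of it.
                if i > 0 and bucket[i - 1][2] < entry[2]:
                    bucket[i - 1], bucket[i] = bucket[i], bucket[i - 1]
                return entry[1]
        return default


# Usage: a single bucket forces every key into one chain, so the effect of
# repeated reads on the chain order is easy to inspect.
table = SkewAwareTable(num_buckets=1)
for k in range(4):
    table.put(k, k * 10)
for _ in range(4):
    table.get(3)  # key 3 is the "popular" item
front_key = table.buckets[0][0][0]  # key now stored at the head of the chain
```

With a skewed read pattern, popular keys end up near the head of their chain, so later lookups for them inspect fewer entries. A practical design would need to bound the bookkeeping cost, for example by sampling accesses instead of counting every one.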

Taxonomy

Topics: Caching and Content Delivery · Advanced Data Storage Technologies · Advanced Image and Video Retrieval Techniques