Optimal Hashing in External Memory
Alex Conway, Martin Farach-Colton, Philip Shilane

TL;DR
This paper introduces three simpler external-memory hash tables, the BOA, BOT, and COBOT, as alternatives to the complicated IP hash table; the COBOT is additionally cache-oblivious.
Contribution
The paper presents a new, simpler external memory hash table (BOA) and extends it to develop the BOT and COBOT, achieving optimality and cache-obliviousness.
Findings
BOA is simple to implement and is optimal for a restricted range of the tuning parameter.
BOT matches the performance of complex IP hash tables.
COBOT is the first cache-oblivious hash table with optimal performance.
Abstract
Hash tables are a ubiquitous class of dictionary data structures. However, standard hash table implementations do not translate well into the external memory model, because they do not incorporate locality for insertions. Iacono and Pătraşcu established an update/query tradeoff curve for external hash tables: a hash table that performs insertions in $O(\lambda/B)$ amortized IOs requires $\Omega(\log_\lambda N)$ expected IOs for queries, where $N$ is the number of items that can be stored in the data structure, $B$ is the size of a memory transfer, $M$ is the size of memory, and $\lambda$ is a tuning parameter. They provide a hashing data structure that meets this curve for $\lambda$ that is $\Omega(\log\log M + \log_M N)$. Their data structure, which we call an IP hash table, is complicated and, to the best of our knowledge, has not been implemented. In this paper, we…
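To make the tradeoff curve concrete, here is a minimal numeric sketch. It assumes the Iacono and Pătraşcu bounds take the form of roughly λ/B IOs per insertion and roughly log_λ N IOs per query, with all constant factors dropped. The function name `tradeoff` and the sample values of N and B are illustrative only and do not come from the paper.

```python
import math


def tradeoff(n_items, block_size, lam):
    """Return (insert_ios, query_ios) for tuning parameter lam.

    Assumed shapes, constants ignored:
      insert cost ~ lam / B        (amortized IOs per insertion)
      query cost  ~ log_lam(N)     (expected IOs per query)
    """
    if lam <= 1:
        raise ValueError("lam must exceed 1 so that log base lam is defined")
    insert_ios = lam / block_size
    query_ios = math.log(n_items) / math.log(lam)
    return insert_ios, query_ios


if __name__ == "__main__":
    N = 2**30  # hypothetical number of items
    B = 2**10  # hypothetical block (memory transfer) size, in items
    for lam in (2, 16, 1024):
        ins, qry = tradeoff(N, B, lam)
        print(f"lambda={lam:5d}  insert~{ins:.4f} IOs  query~{qry:.2f} IOs")
```

Raising λ makes insertions more expensive and queries cheaper, which is the tension the tuning parameter controls. The sketch ignores the lower limit on λ for which the curve is known to be achievable.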
