Fast and Lean Immutable Multi-Maps on the JVM based on Heterogeneous Hash-Array Mapped Tries
Michael J. Steindorfer, Jurgen J. Vinju

TL;DR
This paper introduces HHAMT, a memory-efficient, type-safe, heterogeneous hash-array mapped trie for JVM-based immutable multi-maps, significantly reducing memory overhead and improving performance for large-scale graph and relation processing.
Contribution
It presents a novel framework for heterogeneous hash-array mapped tries that optimizes memory usage and supports type safety, enabling more efficient immutable multi-maps on the JVM.
Findings
Memory overhead reduced from around 65B to 30B per stored key-value pair
Achieved 2x to 4x improvement in storage efficiency
Validated the improvements on a real-world static analysis case
Abstract
An immutable multi-map is a many-to-many thread-friendly map data structure with expected fast insert and lookup operations. This data structure is used for applications processing graphs or many-to-many relations as applied in static analysis of object-oriented systems. When processing such big data sets the memory overhead of the data structure encoding itself is a memory usage bottleneck. Motivated by reuse and type-safety, libraries for Java, Scala and Clojure typically implement immutable multi-maps by nesting sets as the values with the keys of a trie map. Like this, based on our measurements the expected byte overhead for a sparse multi-map per stored entry adds up to around 65B, which renders it unfeasible to compute with effectively on the JVM. In this paper we propose a general framework for Hash-Array Mapped Tries on the JVM which can store type-heterogeneous keys and…
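The baseline encoding the abstract describes, a map whose values are nested sets, can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name NestedSetMultiMap and its methods are hypothetical, and it uses copy-on-write java.util collections rather than a trie, so each update copies the whole map instead of sharing structure. It is not the paper's HHAMT design. What it does show is the cost the abstract points at: every key owns a separate Set object, even when it maps to a single value.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical sketch (not the paper's implementation): an immutable
 * multi-map encoded as a map from keys to nested sets of values.
 * Null keys and values are not supported by the unmodifiable collections used here.
 */
final class NestedSetMultiMap<K, V> {
    // Every key owns its own Set, even if it holds only one value.
    private final Map<K, Set<V>> entries;

    private NestedSetMultiMap(Map<K, Set<V>> entries) {
        this.entries = entries;
    }

    static <K, V> NestedSetMultiMap<K, V> empty() {
        return new NestedSetMultiMap<K, V>(Map.of());
    }

    /** Returns a new multi-map containing (key, value); this instance is unchanged. */
    NestedSetMultiMap<K, V> put(K key, V value) {
        Set<V> values = new HashSet<>(entries.getOrDefault(key, Set.of()));
        values.add(value);
        // Copy-on-write: a full copy per update, where a trie would share unchanged parts.
        Map<K, Set<V>> copy = new HashMap<>(entries);
        copy.put(key, Set.copyOf(values));
        return new NestedSetMultiMap<K, V>(Map.copyOf(copy));
    }

    /** Returns the values stored under key, or an empty set if there are none. */
    Set<V> get(K key) {
        return entries.getOrDefault(key, Set.of());
    }

    /** Returns the number of stored (key, value) tuples. */
    int size() {
        int n = 0;
        for (Set<V> s : entries.values()) {
            n += s.size();
        }
        return n;
    }
}
```

In a sparse multi-map most keys map to exactly one value, so the per-key Set wrapper in this encoding is mostly overhead. That is the case in which the abstract reports roughly 65B per stored entry.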
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Distributed systems and fault tolerance · Algorithms and Data Compression
