TL;DR
RadiK is a scalable GPU-based radix top-k selection algorithm that supports significantly larger k values than previous methods and outperforms them in speed and robustness across input sizes and distributions.
Contribution
The paper presents a novel GPU-parallel radix top-k selection method that surpasses existing merge-based approaches in scalability, efficiency, and robustness for large k and diverse input conditions.
Findings
Up to 2.5x speedup over prior methods for non-batch queries
Up to 4.8x speedup for batch queries
Up to 2.7x speedup on adversarial inputs
Abstract
Top-k selection, which identifies the largest or smallest k elements from a data set, is a fundamental operation in data-intensive domains such as databases and deep learning, so its scalability and efficiency are critical for these high-performance systems. However, previous studies on its efficient GPU implementation are mostly merge-based and rely heavily on the fast but size-limited on-chip memory, thereby limiting the scalability with a restricted upper bound on k. This work introduces a scalable and optimized GPU-parallel radix top-k selection that supports significantly larger k values than existing methods without compromising efficiency, regardless of input length and batch size. Our method incorporates a novel optimization framework tailored for high memory bandwidth and resource utilization, achieving up to 2.5x speedup over the prior art for non-batch queries and up to 4.8x…
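To make the radix idea in the abstract concrete, the following is a minimal CPU sketch of generic radix top-k selection. It is not the paper's GPU implementation or its optimization framework. It assumes non-negative integers that fit in 32 bits and an 8-bit digit width. The function name `radix_topk` and its parameters are hypothetical, chosen for illustration only. Each pass histograms one digit, finds the bucket holding the k-th largest element, keeps everything in higher buckets, and narrows the candidates to that one bucket. Because only a small histogram is needed per pass, k is not bounded by the size of a fast scratch buffer.

```python
def radix_topk(values, k, bits=32, radix_bits=8):
    """Return the k largest values (in no particular order).

    Assumption: every value is an integer in [0, 2**bits).
    This is a generic most-significant-digit-first radix select,
    not the RadiK kernel itself.
    """
    if not 0 < k <= len(values):
        raise ValueError("k must be between 1 and len(values)")

    mask = (1 << radix_bits) - 1
    candidates = list(values)   # elements that may still be the k-th largest
    result = []                 # elements already known to be in the top k
    remaining = k               # how many results are still needed

    # Walk the digits from most significant to least significant.
    for shift in range(bits - radix_bits, -1, -radix_bits):
        # Histogram of the current digit over the surviving candidates.
        hist = [0] * (1 << radix_bits)
        for v in candidates:
            hist[(v >> shift) & mask] += 1

        # Scan buckets from the largest digit downward until the bucket
        # containing the k-th largest element is found.
        digit = mask
        while hist[digit] < remaining:
            remaining -= hist[digit]
            digit -= 1

        # Larger digits are certainly in the top k; equal digits stay candidates.
        result.extend(v for v in candidates if ((v >> shift) & mask) > digit)
        candidates = [v for v in candidates if ((v >> shift) & mask) == digit]

    # All surviving candidates are now equal; take as many as still needed.
    result.extend(candidates[:remaining])
    return result
```

For example, `sorted(radix_topk([5, 1, 9, 3, 9, 7], 3))` yields `[7, 9, 9]`. On a GPU the histogram and filtering steps would run in parallel across threads, which this sequential sketch does not attempt to model.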
