RTop-K: Ultra-Fast Row-Wise Top-K Selection for Neural Network Acceleration on GPUs
Xi Xie, Yuebo Luo, Hongwu Peng, and Caiwen Ding

TL;DR
RTop-K is a GPU-optimized, binary search-based row-wise top-k selection algorithm that significantly accelerates neural network training workflows while maintaining accuracy.
Contribution
The paper introduces RTop-K, a parallel row-wise top-k selection method for GPUs that outperforms existing row-wise top-k GPU implementations in speed.
Findings
Achieves an average speed-up of up to 11.49× with early stopping (7.29× without) over state-of-the-art row-wise top-k GPU implementations.
Maintains neural network accuracy with early stopping.
Accelerates MaxK-GNN training by up to 33.29%.
Abstract
Top-k selection algorithms are fundamental in a wide range of applications, including high-performance computing, information retrieval, big data processing, and neural network model training. In this paper, we present RTop-K, a highly efficient parallel row-wise top-k selection algorithm specifically designed for GPUs. RTop-K leverages a binary search-based approach to optimize row-wise top-k selection, providing a scalable and accelerated solution. We conduct a detailed analysis of early stopping in our algorithm, showing that it effectively maintains the testing accuracy of neural network models while substantially improving performance. Our GPU implementation of RTop-K demonstrates superior performance over state-of-the-art row-wise top-k GPU implementations, achieving an average speed-up of up to 11.49× with early stopping and 7.29× without early stopping. Moreover,…
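To make the idea of binary search-based top-k selection with early stopping concrete, here is a minimal CPU sketch in plain Python. It is an illustration under stated assumptions, not the authors' GPU kernel: the function name `rtopk_row`, the `max_iter` parameter, and the two-pass fill step are all hypothetical choices made for this example. The search narrows a threshold interval [lo, hi] by counting how many elements are at least the midpoint. Stopping after a few iterations returns k elements that are all at least `lo`, which may differ from the exact top-k.

```python
def rtopk_row(row, k, max_iter=None):
    """Select k large elements from one row via threshold binary search.

    Hypothetical sketch: max_iter=None searches until the interval cannot
    shrink further (exact top-k); a small max_iter stops early and returns
    an approximate selection. Returns (values, indices).
    """
    if not 0 < k <= len(row):
        raise ValueError("k must be in 1..len(row)")

    lo, hi = min(row), max(row)  # invariant: count(x >= lo) >= k
    it = 0
    while max_iter is None or it < max_iter:
        mid = lo + (hi - lo) / 2
        if not lo < mid < hi:    # interval exhausted at float precision
            break
        cnt = sum(1 for x in row if x >= mid)
        if cnt == k:             # threshold separates exactly k elements
            lo = hi = mid
            break
        if cnt > k:
            lo = mid             # too many survivors: raise the threshold
        else:
            hi = mid             # too few survivors: lower the threshold
        it += 1

    # Pass 1: elements at or above the upper threshold (capped at k).
    idx = [i for i, x in enumerate(row) if x >= hi][:k]
    # Pass 2: fill the remaining slots from the undecided band [lo, hi).
    for i, x in enumerate(row):
        if len(idx) == k:
            break
        if lo <= x < hi:
            idx.append(i)
    return [row[i] for i in idx], idx


def rtopk(matrix, k, max_iter=None):
    """Apply the row selection to every row (sequentially in this sketch)."""
    return [rtopk_row(row, k, max_iter) for row in matrix]


row = [1, 2, 3, 4, 5, 6, 7, 8]
print(rtopk_row(row, 2))              # exact: ([7, 8], [6, 7])
print(rtopk_row(row, 2, max_iter=1))  # early stop: ([8, 5], [7, 4])
```

In the second call the search stops after one iteration with the band [4.5, 8), so the fill step takes the first element it finds in that band (5) rather than the true second-largest (7). Each row is independent of the others, which is what makes a row-wise formulation a natural fit for parallel execution.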
