Sorting Data on Ultra-Large Scale with RADULS. New Incarnation of Radix Sort
Marek Kokot, Sebastian Deorowicz, Agnieszka Debudaj-Grabysz

TL;DR
RADULS is a highly optimized, parallel radix sort algorithm capable of efficiently sorting ultra-large data sets on multicore architectures, outperforming existing methods.
Contribution
The paper presents RADULS, a novel parallel radix sort implementation optimized for modern multicore systems, demonstrating superior performance on large-scale data.
Findings
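RADULS sorts 4G (roughly 4 billion) 16-byte records in under 15 seconds with 16 threads on an Intel Xeon-based workstation.
In all of the authors' experiments, RADULS outperforms the competing sorting algorithms.
Cache-friendly parallelization and a scheduler that chooses among several procedures at runtime are key to RADULS's efficiency.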
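To put the headline figure in perspective, the arithmetic below works out the data volume and the minimum throughput it implies. It is a back-of-the-envelope sketch, not a measurement from the paper: it assumes "4G" means 4 × 2^30 records and uses the 15-second upper bound, so the real rates are somewhat higher.

```python
# Assumption: "4G" is read as 4 * 2**30 records (the summary rounds this to 4 billion).
RECORDS = 4 * 2**30
RECORD_BYTES = 16
SECONDS_UPPER_BOUND = 15   # the abstract says "less than 15 seconds"
THREADS = 16

total_gib = RECORDS * RECORD_BYTES / 2**30               # size of the input in GiB
min_gib_per_sec = total_gib / SECONDS_UPPER_BOUND        # lower bound on throughput
min_records_per_sec = RECORDS / SECONDS_UPPER_BOUND      # lower bound on records/s
min_records_per_sec_per_thread = min_records_per_sec / THREADS

print(f"input size: {total_gib:.0f} GiB")
print(f"throughput: more than {min_gib_per_sec:.2f} GiB/s")
print(f"per thread: more than {min_records_per_sec_per_thread / 1e6:.1f} million records/s")
```

Under these assumptions the input is 64 GiB, so the reported time corresponds to more than about 4.2 GiB of records sorted per second.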
Abstract
The paper introduces RADULS, a new parallel sorter based on the radix sort algorithm, intended to organize ultra-large data sets efficiently. For example, 4G 16-byte records can be sorted with 16 threads in less than 15 seconds on an Intel Xeon-based workstation. The implementation of RADULS is not only highly optimized to achieve this performance, but also parallelized in a cache-friendly manner to make the most of modern multicore architectures. In addition, our parallel scheduler launches one of a few different procedures at runtime, according to the current parameters of the execution, for proper workload management. All experiments show RADULS to be superior to competing algorithms.
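For readers unfamiliar with the underlying technique, the sketch below shows a plain radix sort over fixed-length byte records: count the records per byte value, turn the counts into bucket offsets, scatter, then recurse into each bucket. It is a single-threaded illustration only, not the RADULS implementation. The most-significant-byte-first order, the `msd_radix_sort` name, and the `SMALL_BUCKET` threshold are assumptions made for the example. The switch to a comparison sort for small buckets merely hints at the idea of choosing a procedure based on current parameters.

```python
from typing import List

SMALL_BUCKET = 32  # hypothetical threshold, not taken from the paper


def msd_radix_sort(records: List[bytes], byte_idx: int = 0) -> List[bytes]:
    """Sort equal-length byte records lexicographically, most significant byte first."""
    if len(records) <= 1 or byte_idx >= len(records[0]):
        return list(records)
    if len(records) <= SMALL_BUCKET:
        # Small bucket: a comparison sort is simpler than another counting pass.
        return sorted(records)

    # Pass 1: histogram of the current byte.
    counts = [0] * 256
    for rec in records:
        counts[rec[byte_idx]] += 1

    # Prefix sums give the start offset of each bucket in the output.
    starts = [0] * 256
    total = 0
    for digit in range(256):
        starts[digit] = total
        total += counts[digit]

    # Pass 2: scatter every record into its bucket.
    out = [b""] * len(records)
    pos = starts[:]
    for rec in records:
        digit = rec[byte_idx]
        out[pos[digit]] = rec
        pos[digit] += 1

    # Recurse into each non-empty bucket on the next byte.
    result: List[bytes] = []
    for digit in range(256):
        if counts[digit]:
            bucket = out[starts[digit]:starts[digit] + counts[digit]]
            result.extend(msd_radix_sort(bucket, byte_idx + 1))
    return result


# Usage: 16-byte records built from integers, so the expected order is easy to check.
sample = [n.to_bytes(16, "big") for n in (7, 3, 2**100, 0, 3)]
sorted_sample = msd_radix_sort(sample)
```

Because the buckets produced by one pass are independent of each other, they are a natural unit of work to hand to separate threads, which is one reason radix sort suits multicore machines.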
