# Even faster sorting of (not only) integers

**Authors:** Marek Kokot, Sebastian Deorowicz, Maciej Dlugosz

arXiv: 1703.00687 · 2017-03-03

## TL;DR

This paper presents RADULS2, a highly optimized parallel radix sorter for large data on multicore CPUs, featuring novel handling of tiny arrays and improved memory usage.

## Contribution

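The paper introduces RADULS2, a parallel radix sorter for modern multicore CPUs. Its two main novelties are a heavily optimized routine for tiny arrays (up to about a hundred records) and improved processing of larger subarrays through better use of non-temporal memory stores.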

## Key findings

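- The authors present RADULS2 as the fastest parallel sorter based on the radix algorithm.
- Tiny subarrays (up to about a hundred records) can occur billions of times as subproblems, so they are handled by a dedicated, heavily optimized routine.
- Larger subarrays are processed with better use of non-temporal memory stores.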

## Abstract

In this paper we introduce RADULS2, the fastest parallel sorter based on the radix algorithm. It is optimized to process huge amounts of data, making use of modern multicore CPUs. The main novelties include: an extremely optimized algorithm for handling tiny arrays (up to about a hundred records), which can appear even billions of times as subproblems, and improved processing of larger subarrays with better use of non-temporal memory stores.
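
To make the idea of tiny subproblems concrete, here is a minimal single-threaded sketch of a most-significant-digit radix sort that hands small ranges to a simpler routine. It is not the authors' implementation: the names `msd_radix_sort`, `insertion_sort`, `TINY_THRESHOLD`, `KEY_BITS`, and `DIGIT_BITS` are hypothetical, keys are assumed to be non-negative 64-bit integers, insertion sort merely stands in for the paper's specialized tiny-array code, and neither the parallelism nor the non-temporal stores are modeled.

```python
from typing import List, Optional

# All constants below are illustrative assumptions, not values from the paper.
TINY_THRESHOLD = 64   # ranges at or below this size go to the tiny-array routine
KEY_BITS = 64         # keys are assumed to be unsigned 64-bit integers
DIGIT_BITS = 8        # one radix digit = one byte, so 256 buckets per pass


def insertion_sort(a: List[int], lo: int, hi: int) -> None:
    """Sort a[lo:hi] in place; a stand-in for a specialized tiny-array sorter."""
    for i in range(lo + 1, hi):
        x = a[i]
        j = i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x


def msd_radix_sort(a: List[int], lo: int = 0, hi: Optional[int] = None,
                   shift: int = KEY_BITS - DIGIT_BITS) -> None:
    """Sort a[lo:hi] in place, starting from the most significant digit."""
    if hi is None:
        hi = len(a)
    n = hi - lo
    if n <= 1:
        return
    if n <= TINY_THRESHOLD:
        # Tiny subproblem: bucketing overhead would dominate, so switch routines.
        insertion_sort(a, lo, hi)
        return
    if shift < 0:
        # Every digit has been consumed, so all keys in this range are equal.
        return

    mask = (1 << DIGIT_BITS) - 1

    # Pass 1: histogram of the current digit.
    counts = [0] * (mask + 1)
    for i in range(lo, hi):
        counts[(a[i] >> shift) & mask] += 1

    # Prefix sums give each bucket's start offset within the range.
    starts = [0] * (mask + 2)
    for d in range(mask + 1):
        starts[d + 1] = starts[d] + counts[d]

    # Pass 2: scatter records into a temporary buffer, bucket by bucket.
    buf = [0] * n
    pos = starts[:-1]  # slicing copies, so `starts` stays intact
    for i in range(lo, hi):
        d = (a[i] >> shift) & mask
        buf[pos[d]] = a[i]
        pos[d] += 1
    a[lo:hi] = buf

    # Recurse into each bucket on the next less significant digit.
    for d in range(mask + 1):
        if counts[d] > 1:
            msd_radix_sort(a, lo + starts[d], lo + starts[d + 1],
                           shift - DIGIT_BITS)
```

Calling `msd_radix_sort(values)` sorts a list of such keys in place. With uniformly distributed keys, each pass splits a range into up to 256 buckets, so a large input quickly breaks down into a very large number of ranges below the threshold, which is why the cost of the tiny-array routine matters so much.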

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.00687/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1703.00687/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/1703.00687/full.md

---
Source: https://tomesphere.com/paper/1703.00687