Applying Sorting Networks to Synthesize Optimized Sorting Libraries
Michael Codish, Luís Cruz-Filipe, Markus Nebel, Peter Schneider-Kamp

TL;DR
This paper demonstrates how sorting network theory can be used to synthesize optimized sorting libraries, achieving speed-ups by leveraging instruction-level parallelism and non-branching instructions in modern CPUs.
Contribution
It applies sorting network synthesis to the optimization of general-purpose sorting libraries, targeting modern CPU architectures in particular.
Findings
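Using code synthesized from sorting networks as the base case of Quicksort yields significant speed-ups
The speed-ups come from instruction-level parallelism and non-branching conditional assignment
Counting only comparisons and swaps, theory predicts no real advantage over the traditional approach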
Abstract
This paper shows an application of the theory of sorting networks to facilitate the synthesis of optimized general purpose sorting libraries. Standard sorting libraries are often based on combinations of the classic Quicksort algorithm with insertion sort applied as the base case for small fixed numbers of inputs. Unrolling the code for the base case by ignoring loop conditions eliminates branching and results in code which is equivalent to a sorting network. This enables the application of further program transformations based on sorting network optimizations, and eventually the synthesis of code from sorting networks. We show that if considering the number of comparisons and swaps then theory predicts no real advantage of this approach. However, significant speed-ups are obtained when taking advantage of instruction level parallelism and non-branching conditional assignment…
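The idea in the abstract of replacing a branching base case with a fixed sequence of comparators can be illustrated with a minimal sketch. This is not the paper's synthesized code: the names cmp_swap and sort4 are hypothetical, the element type is assumed to be int, and the network shown is the standard 5-comparator network for 4 inputs. The compare-exchange is written with conditional expressions so that a compiler may emit conditional moves instead of branches; whether it does so depends on the compiler, flags, and target.

```c
/* Compare-exchange on positions i and j: afterwards a[i] <= a[j].
   Uses conditional expressions instead of an if-statement, which gives the
   compiler the option of generating non-branching conditional moves
   (an assumption about the compiler, not a guarantee). */
static inline void cmp_swap(int *a, int i, int j)
{
    int x = a[i];
    int y = a[j];
    a[i] = (x < y) ? x : y;  /* minimum goes to the lower index */
    a[j] = (x < y) ? y : x;  /* maximum goes to the higher index */
}

/* Sorting network for 4 inputs: 5 comparators arranged in 3 layers.
   Comparators within a layer touch disjoint positions, so they are
   independent of each other and can overlap in the CPU pipeline. */
void sort4(int *a)
{
    cmp_swap(a, 0, 1); cmp_swap(a, 2, 3);  /* layer 1 */
    cmp_swap(a, 0, 2); cmp_swap(a, 1, 3);  /* layer 2 */
    cmp_swap(a, 1, 2);                     /* layer 3 */
}
```

The sequence of comparisons is the same for every input, which is what distinguishes this from an insertion sort loop whose control flow depends on the data. In a Quicksort-based library, a routine like this would take the place of the insertion sort call once a partition shrinks to the fixed size.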
