FLiMS: a Fast Lightweight 2-way Merger for Sorting
Philippos Papaphilippou, Wayne Luk, Chris Brooks

TL;DR
FLiMS is a parallel algorithm for merging two sorted lists. On FPGAs it uses fewer comparators and less logic than prior mergers while achieving higher performance at the same parallelism, and it also performs well as software on modern CPUs.
Contribution
Introduces FLiMS, a simple parallel 2-way merger for FPGA and CPU implementations that needs fewer comparators and removes redundant logic found in earlier designs, along with variations for skewed datasets, stable sorting, and fewer dequeue signals.
Findings
Uses fewer FPGA hardware resources than state-of-the-art mergers
Achieves higher performance for the same amount of parallelism or higher
Performs well as software on modern CPUs
Abstract
In this paper, we present FLiMS, a highly-efficient and simple parallel algorithm for merging two sorted lists residing in banked and/or wide memory. On FPGAs, its implementation uses fewer hardware resources than the state-of-the-art alternatives, due to the reduced number of comparators and elimination of redundant logic found on prior attempts. In combination with the distributed nature of the selector stage, a higher performance is achieved for the same amount of parallelism or higher. This is useful in many applications such as in parallel merge trees to achieve high-throughput sorting, where the resource utilisation of the merger is critical for building large trees and internalising the workload for fast computation. Also presented are efficient variations of FLiMS for optimizing throughput for skewed datasets, achieving stable sorting or using fewer dequeue signals.…
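The abstract does not describe the selector stage in detail, so the following is only a rough software model of what a 2-way merger that emits several elements per step has to do. It is not the FLiMS design. It assumes a pairing rule borrowed from the first stage of a bitonic merge (compare element i of one window with element w-1-i of the other), and the names `merge_wide` and `w` are hypothetical.

```python
import math


def merge_wide(a, b, w=4):
    """Merge two ascending lists, emitting w elements per step.

    Illustrative model only: the pairing rule is an assumption taken from
    bitonic merging, not from the FLiMS paper. Inputs are assumed to be
    finite numbers sorted in ascending order.
    """
    total = len(a) + len(b)
    # Pad with +inf sentinels so a full window of w elements always exists.
    a_pad = list(a) + [math.inf] * w
    b_pad = list(b) + [math.inf] * w
    ia = ib = 0
    out = []
    while len(out) < total:
        win_a = a_pad[ia:ia + w]
        win_b = b_pad[ib:ib + w]
        # Pair win_a[i] with win_b[w-1-i]. The smaller of each pair is one of
        # the w smallest elements of the two windows combined. Because win_a
        # ascends and the reversed win_b descends, the winners from A form a
        # prefix of win_a and the winners from B form a prefix of win_b.
        take_a = sum(1 for i in range(w) if win_a[i] <= win_b[w - 1 - i])
        chunk = win_a[:take_a] + win_b[:w - take_a]
        # In hardware this would be a small sorting network; sorted() stands in.
        out.extend(sorted(chunk))
        # Advance each input by the number of elements it contributed.
        ia += take_a
        ib += w - take_a
    # Drop any sentinels emitted in the final, partially filled step.
    return out[:total]


if __name__ == "__main__":
    print(merge_wide([1, 4, 7, 10], [2, 3, 5, 8, 9], w=4))
```

Each loop iteration corresponds to one output of width w, and the two advance counts play the role of dequeue amounts for the two inputs. How FLiMS distributes these decisions across memory banks is covered in the paper, not here.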
