Combination generators with optimal cache utilization and communication free parallel execution
Xi He, Max A. Little

TL;DR
This paper presents a novel combination generator that achieves optimal cache utilization, constant amortized time, and parallel execution, suitable for exhaustive and combinatorial optimization tasks, with formal correctness guarantees.
Contribution
It introduces a recursive combination generator that satisfies the paper's definition of optimal efficiency, and extends the approach to nested structures such as nested permutations and combinations.
Findings
Achieves constant amortized time for combination generation.
Ensures optimal cache utilization and parallelizability.
Extends to nested permutations and combinations with Gray code orderings.
Abstract
We introduce an efficient and elegant combination generator for producing all combinations of size less than or equal to K, designed for exhaustive generation and combinatorial optimization tasks. This generator can be implemented to achieve what we define as optimal efficiency: constant amortized time, optimal cache utilization, embarrassingly parallel execution, and a recursive structure compatible with pruning-based search. These properties are difficult to satisfy simultaneously in existing generators. For example, classical Gray code or lexicographic generators are typically list-based and sequentially defined, making them difficult to vectorize, inefficient in cache usage, and inherently hard to parallelize. Generators based on unranking methods, while easy to parallelize, are non-recursive. These limitations reduce their applicability in our target applications, where both…
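To make the idea of a recursive generator for all combinations of size at most K concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' algorithm: the abstract as shown here does not give the construction, and the function name `combinations_up_to_k` and the list-of-levels layout are hypothetical. The sketch consumes the input one element at a time and, for each new element, extends every stored combination of size j-1 into a combination of size j.

```python
from itertools import combinations
from math import comb


def combinations_up_to_k(items, k):
    """Return levels, where levels[j] lists every j-combination of items (j = 0..k).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    # levels[0] holds the single empty combination; larger sizes start empty.
    levels = [[()]] + [[] for _ in range(k)]
    for x in items:
        # Walk sizes downward so that levels[j - 1] has not yet absorbed x
        # when it is used to build the new size-j combinations.
        for j in range(k, 0, -1):
            levels[j].extend(c + (x,) for c in levels[j - 1])
    return levels


if __name__ == "__main__":
    levels = combinations_up_to_k([1, 2, 3, 4], 2)
    for size, combos in enumerate(levels):
        print(size, combos)
    # Cross-check against the standard library.
    assert sorted(levels[2]) == sorted(combinations([1, 2, 3, 4], 2))
    assert [len(level) for level in levels] == [comb(4, j) for j in range(3)]
```

Because each step only appends to the lists built in the previous step, a pruning-based search could in principle filter infeasible partial combinations inside the inner loop before they are extended further. Whether this matches the paper's construction or its cache and parallelism guarantees is not something this sketch claims.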
