QuickLexSort: An efficient algorithm for lexicographically sorting nested restrictions of a database
David Haws

TL;DR
QuickLexSort is a lexicographical sorting algorithm that sorts from the most to the least important feature. Its running time is comparable to a stable-sort-based approach for a single sort, and it improves on that approach by a log factor when sorting nested sub-matrices of a database.
Contribution
The paper introduces QuickLexSort, a new lexicographical sorting method that refines sorting order from most to least important features, outperforming traditional stable sort methods in nested data scenarios.
Findings
Comparable runtime to stable sort for single lexicographical sorts
Performance improvement by a log factor for nested sub-matrix sorting
Linear runtime in the number of nested sub-matrices after pre-processing
Abstract
Lexicographical sorting is a fundamental problem with applications to contingency tables, databases, Bayesian networks, and more. A standard method to lexicographically sort general data is to iteratively use a stable sort, that is, a sort which preserves existing orders. Here we present a new method of lexicographical sorting called QuickLexSort. Whereas a stable-sort-based lexicographical sorting algorithm operates from the least important to the most important features, QuickLexSort sorts from the most important to the least important features, refining the sort as it goes. QuickLexSort first requires a one-time modest pre-processing step where each feature of the data set is sorted independently. When lexicographically sorting a database, QuickLexSort (including pre-processing) has comparable running time to using a stable-sort-based approach. For a database with rows and …
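The two sorting directions described in the abstract can be illustrated with a minimal sketch. This is not the paper's QuickLexSort: it omits the per-feature pre-processing step and makes no claim about the paper's running-time bounds. The function names `lex_sort_stable` and `lex_sort_refine` are hypothetical, and rows are assumed to be equal-length sequences of comparable values, with feature 0 the most important.

```python
from typing import List, Sequence

Row = Sequence[int]


def lex_sort_stable(rows: List[Row]) -> List[int]:
    """Baseline: stable sort from the least to the most important feature."""
    order = list(range(len(rows)))
    n_features = len(rows[0]) if rows else 0
    for j in reversed(range(n_features)):
        # list.sort is stable, so earlier (less important) orderings survive ties.
        order.sort(key=lambda i: rows[i][j])
    return order


def lex_sort_refine(rows: List[Row]) -> List[int]:
    """Illustrative most-to-least refinement (an assumption, not the paper's method)."""
    order = list(range(len(rows)))
    n_features = len(rows[0]) if rows else 0
    groups = [(0, len(order))]  # half-open ranges of rows still tied so far
    for j in range(n_features):
        next_groups = []
        for start, end in groups:
            # Sort only the rows that are tied on all more important features.
            order[start:end] = sorted(order[start:end], key=lambda i: rows[i][j])
            # Split the block into runs of equal value on feature j.
            run_start = start
            for k in range(start + 1, end + 1):
                if k == end or rows[order[k]][j] != rows[order[run_start]][j]:
                    if k - run_start > 1:
                        next_groups.append((run_start, k))
                    run_start = k
        groups = next_groups
        if not groups:
            break  # every row is already uniquely placed
    return order


rows = [(2, 1, 0), (1, 3, 2), (2, 0, 5), (1, 3, 1), (1, 0, 9)]
stable_order = lex_sort_stable(rows)
refine_order = lex_sort_refine(rows)
```

Both functions return a permutation of row indices rather than a sorted copy of the data. The refinement version touches a later feature only for rows that remain tied on all earlier ones, which is the general idea behind sorting from the most to the least important feature.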
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Algorithms and Data Compression · Data Management and Algorithms
