Faster Wavelet Tree Queries
Matteo Ceregini, Florian Kurpicz, Rossano Venturini

TL;DR
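This paper speeds up wavelet tree queries by a factor of two for access and select, and up to three for rank, compared to the wavelet tree implementation in the Succinct Data Structure Library (SDSL). It does so by replacing the binary tree with a 4-ary tree and by using a predictive model to reduce cache misses in rank queries, which benefits text indexing applications.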
Contribution
The paper proposes a 4-ary wavelet tree and a predictive prefetching approach that make access and select queries about two times faster, and rank queries up to three times faster, than the SDSL wavelet tree implementation.
Findings
Rank queries are up to three times faster, and access and select queries about two times faster, than the wavelet tree implementation in SDSL.
The 4-ary tree visits half as many levels per query as a binary tree over the same alphabet, which reduces cache misses.
Predictive models approximate rank queries so that the data they need can be prefetched.
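To make the effect of the higher arity concrete, the sketch below counts the levels a query visits in a balanced binary versus a balanced 4-ary wavelet tree and lists the child chosen at each level for one symbol. The helpers `levels` and `path` are hypothetical names introduced for this illustration, and the balanced shape is an assumption; the paper's actual memory layout is not described in this summary.

```python
def levels(sigma, arity):
    """Levels of a balanced `arity`-ary wavelet tree over `sigma` symbols."""
    depth, capacity = 0, 1
    while capacity < sigma:
        capacity *= arity
        depth += 1
    return depth


def path(symbol, sigma, arity):
    """Child index taken at each level, most significant digit first."""
    digits = []
    for _ in range(levels(sigma, arity)):
        digits.append(symbol % arity)
        symbol //= arity
    return digits[::-1]


# Byte alphabet (sigma = 256), symbol 'a' = 97.
print(levels(256, 2), path(97, 256, 2))  # 8 [0, 1, 1, 0, 0, 0, 0, 1]
print(levels(256, 4), path(97, 256, 4))  # 4 [1, 2, 0, 1]
```

If every visited level costs roughly one memory access into a different part of the structure, halving the number of levels is what makes the 4-ary layout attractive for cache behavior.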
Abstract
Given a text, rank and select queries return the number of occurrences of a character up to a position (rank) or the position of a character with a given rank (select). These queries have applications in, e.g., compression, computational geometry, and most notably pattern matching in the form of the backward search -- the backbone of many compressed full-text indices. Currently, in practice, for text over non-binary alphabets, the wavelet tree is probably the most used data structure for rank and select queries. In this paper, we present techniques to speed up queries by a factor of two (access and select) up to three (rank), compared to the wavelet tree implementation contained in the widely used Succinct Data Structure Library (SDSL). To this end, we change the underlying tree structure from a binary tree to a 4-ary tree and reduce cache misses by approximating rank queries using a…
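The rank and select queries defined in the abstract can be illustrated with a minimal binary wavelet tree. This is a teaching sketch under stated assumptions, not the paper's or SDSL's implementation: the class `WaveletTree` is hypothetical, positions are 0-based, `rank(c, i)` counts occurrences among the first `i` symbols, `select(c, k)` takes a 1-based `k`, and plain prefix-count lists stand in for the succinct bit vectors a practical implementation would use.

```python
class WaveletTree:
    """Pointer-based binary wavelet tree over a non-empty list of integers."""

    def __init__(self, seq, lo=None, hi=None):
        if lo is None:
            lo, hi = min(seq), max(seq)
        self.lo, self.hi, self.n = lo, hi, len(seq)
        if lo == hi:
            return  # leaf: every symbol stored here equals lo
        mid = (lo + hi) // 2
        # ones[i] = how many of the first i symbols go to the right child.
        self.ones = [0]
        for s in seq:
            self.ones.append(self.ones[-1] + (s > mid))
        self.left = WaveletTree([s for s in seq if s <= mid], lo, mid)
        self.right = WaveletTree([s for s in seq if s > mid], mid + 1, hi)

    def access(self, i):
        """Symbol at position i (0-based)."""
        node = self
        while node.lo != node.hi:
            r = node.ones[i]
            if node.ones[i + 1] > r:  # bit at position i is 1
                node, i = node.right, r
            else:
                node, i = node.left, i - r
        return node.lo

    def rank(self, c, i):
        """Number of occurrences of c among the first i symbols."""
        if not self.lo <= c <= self.hi:
            return 0
        node = self
        while node.lo != node.hi:
            r = node.ones[i]
            if c > (node.lo + node.hi) // 2:
                node, i = node.right, r
            else:
                node, i = node.left, i - r
        return i

    def select(self, c, k):
        """0-based position of the k-th occurrence of c (k >= 1), or -1."""
        if not self.lo <= c <= self.hi:
            return -1
        if self.lo == self.hi:
            return k - 1 if 1 <= k <= self.n else -1
        go_right = c > (self.lo + self.hi) // 2
        pos = (self.right if go_right else self.left).select(c, k)
        if pos < 0:
            return -1
        # Smallest prefix length m whose count of matching bits reaches pos + 1.
        low, high = 1, self.n
        while low < high:
            m = (low + high) // 2
            count = self.ones[m] if go_right else m - self.ones[m]
            if count >= pos + 1:
                high = m
            else:
                low = m + 1
        return low - 1


text = "abracadabra"
wt = WaveletTree([ord(ch) for ch in text])
print(wt.rank(ord("a"), 11))   # 5 occurrences of 'a' in the whole text
print(wt.select(ord("a"), 3))  # 5, the position of the third 'a'
print(chr(wt.access(4)))       # c
```

Each query walks one root-to-leaf path, so its cost grows with the number of levels. That per-level cost is what the 4-ary layout and the prefetching described in the abstract aim to reduce.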
Taxonomy
Topics: Algorithms and Data Compression · Network Packet Processing and Optimization · Advanced Data Compression Techniques
