TL;DR
This paper shows that sorting the rows of a table, and permuting its columns before sorting, shrinks word-aligned compressed bitmap indexes and speeds up queries. It also covers bitmap construction and aggregation algorithms and the effect of word length.
Contribution
It investigates row-reordering heuristics and column permutations that improve the compression and speed of bitmap indexes, and it presents efficient algorithms for constructing and aggregating bitmaps.
Findings
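A simple lexicographical sort of the rows can reduce index size by a factor of 9 and make indexes several times faster
Simply permuting the columns of the table can increase sorting efficiency by 40%
On 64-bit CPUs, 64-bit indexes are slightly faster than 32-bit indexes despite being nearly twice as large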
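The column-permutation finding can be illustrated with a toy experiment. The sketch below is not the paper's method or data: it assumes a hypothetical two-column table, builds one uncompressed bitmap per distinct value, and counts runs of identical bits as a rough stand-in for compressed size. The names `count_runs`, `total_runs`, and `runs_for_column_order` are invented for this illustration, and which column order wins depends on the data.

```python
def count_runs(bits):
    """Count maximal runs of identical bits, a rough proxy for RLE-compressed size."""
    runs = 0
    previous = None
    for bit in bits:
        if bit != previous:
            runs += 1
            previous = bit
    return runs


def total_runs(rows):
    """Sum the run counts of every bitmap (one bitmap per distinct value per column)."""
    if not rows:
        return 0
    total = 0
    for col in range(len(rows[0])):
        for value in sorted({row[col] for row in rows}):
            total += count_runs([int(row[col] == value) for row in rows])
    return total


def runs_for_column_order(rows, order):
    """Permute the columns, sort the rows lexicographically, then count runs."""
    permuted = [tuple(row[c] for c in order) for row in rows]
    return total_runs(sorted(permuted))


# Hypothetical table: column 0 has 4 distinct values, column 1 has 2.
table = [(i, j) for i in range(4) for j in range(2)]

print(runs_for_column_order(table, (0, 1)))  # 26 runs with column 0 as the primary sort key
print(runs_for_column_order(table, (1, 0)))  # 22 runs with column 1 as the primary sort key
```

On this table, sorting on the two-valued column first gives fewer runs. This only shows that column order matters for a sorted index; it does not reproduce the 40% figure.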
Abstract
Bitmap indexes must be compressed to reduce input/output costs and minimize CPU usage. To accelerate logical operations (AND, OR, XOR) over bitmaps, we use techniques based on run-length encoding (RLE), such as Word-Aligned Hybrid (WAH) compression. These techniques are sensitive to the order of the rows: a simple lexicographical sort can divide the index size by 9 and make indexes several times faster. We investigate row-reordering heuristics. Simply permuting the columns of the table can increase the sorting efficiency by 40%. Secondary contributions include efficient algorithms to construct and aggregate bitmaps. The effect of word length is also reviewed by constructing 16-bit, 32-bit and 64-bit indexes. Using 64-bit CPUs, we find that 64-bit indexes are slightly faster than 32-bit indexes despite being nearly twice as large.
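To see why run-length-based compression is sensitive to row order, consider the minimal sketch below. It is a simplification under stated assumptions: it uses plain run-length encoding on bit lists rather than WAH's word-aligned scheme, the six-row table is made up, and `build_bitmaps`, `rle`, and `compressed_size` are hypothetical helper names, not the paper's algorithms.

```python
from itertools import groupby


def build_bitmaps(rows):
    """Return {(column, value): bits}, with bit r set when row r holds that value."""
    bitmaps = {}
    n = len(rows)
    for r, row in enumerate(rows):
        for col, value in enumerate(row):
            bitmaps.setdefault((col, value), [0] * n)[r] = 1
    return bitmaps


def rle(bits):
    """Plain run-length encoding as (bit, run_length) pairs (not word-aligned like WAH)."""
    return [(bit, len(list(group))) for bit, group in groupby(bits)]


def compressed_size(rows):
    """Total number of (bit, run_length) pairs over all bitmaps of the index."""
    return sum(len(rle(bits)) for bits in build_bitmaps(rows).values())


# Made-up table with two columns of two distinct values each.
rows = [("b", "x"), ("a", "y"), ("b", "y"), ("a", "x"), ("b", "x"), ("a", "y")]

print(rle([1, 0, 0, 1, 1, 0]))        # [(1, 1), (0, 2), (1, 2), (0, 1)]
print(compressed_size(rows))          # 20 pairs in the original row order
print(compressed_size(sorted(rows)))  # 12 pairs after a lexicographical sort
```

Sorting clusters equal values of the first column, so its bitmaps collapse into a few long runs. Later columns benefit less, which is one reason the choice of column order matters.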
