Wheeler maps
Andrej Baláz, Travis Gagie, Adrián Goga, Simon Heumos, Gonzalo Navarro, Alessia Petescia, Jouni Sirén

TL;DR
This paper introduces Wheeler maps, a data structure generalizing Wheeler graphs, enabling efficient pattern matching and tag retrieval in texts, with applications in pangenomic read alignment.
Contribution
We propose Wheeler maps, a novel data structure that stores a text together with tags on its characters and, after preprocessing a pattern, retrieves the distinct tags of the occurrences of any substring of that pattern. The work is motivated by pangenomic read alignment.
Findings
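Constructed Wheeler maps in O(g + r + t) space, where g is the number of rules in a straight-line program for the text, r is the number of runs in its BWT, and t is the number of runs in the list of tags sorted by BWT position.
Achieved pattern preprocessing in O(m log n) time for a pattern of length m and a text of length n.
Retrieved the k distinct tags for a query in optimal O(k) time.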
Abstract
Motivated by challenges in pangenomic read alignment, we propose a generalization of Wheeler graphs that we call Wheeler maps. A Wheeler map stores a text T[1..n] and an assignment of tags to the characters of T such that we can preprocess a pattern P[1..m] and then, given i and j, quickly return all the distinct tags labeling the first characters of the occurrences of P[i..j] in T. For the applications that most interest us, characters with long common contexts are likely to have the same tag, so we consider the number t of runs in the list of tags sorted by their characters' positions in the Burrows-Wheeler Transform (BWT) of T. We show how, given a straight-line program with g rules for T, we can build an O(g + r + t)-space Wheeler map, where r is the number of runs in the BWT of T, with which we can preprocess a pattern P[1..m] in O(m log n) time and…
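To make the query and the run counts concrete, here is a minimal brute-force sketch in Python. It is not the paper's data structure: it uses quadratic-time suffix sorting and a linear scan, and only illustrates what a query returns and how r and t could be counted. The function names, the example text, the tags, and the 0-based inclusive indexing are all assumptions made for illustration. The ordering used for t is one literal reading of "tags sorted by their characters' positions in the BWT".

```python
def suffix_order(text):
    """Start positions of all suffixes of text in lexicographic order (naive)."""
    return sorted(range(len(text)), key=lambda p: text[p:])


def bwt(text):
    """BWT of text; text is assumed to end with a unique smallest sentinel '$'."""
    return "".join(text[p - 1] for p in suffix_order(text))


def tags_in_bwt_order(text, tags):
    """Tags listed in the order their characters appear in the BWT.

    Assumption: BWT position k holds character text[sa[k] - 1], so it
    carries that character's tag.
    """
    return [tags[p - 1] for p in suffix_order(text)]


def count_runs(seq):
    """Number of maximal runs of equal adjacent items."""
    seq = list(seq)
    return sum(1 for k in range(len(seq)) if k == 0 or seq[k] != seq[k - 1])


def distinct_tags(text, tags, pattern, i, j):
    """Distinct tags on the first characters of occurrences of pattern[i..j].

    Indices are 0-based and inclusive here (the abstract is 1-based).
    Brute force: O(n * (j - i + 1)) time instead of the paper's O(k).
    """
    sub = pattern[i:j + 1]
    return {
        tags[start]
        for start in range(len(text) - len(sub) + 1)
        if text.startswith(sub, start)
    }


# Hypothetical example: one tag per character, including the sentinel.
TEXT = "abracadabra$"
TAGS = list("xxxyyyyxxxzz")

r = count_runs(bwt(TEXT))                      # runs in the BWT
t = count_runs(tags_in_bwt_order(TEXT, TAGS))  # runs in the tag list
answer = distinct_tags(TEXT, TAGS, "xabray", 1, 4)  # queries "abra"
```

In this sketch each query rescans the whole text. The point of a Wheeler map, as the abstract describes it, is to answer the same query in time proportional to the number k of distinct tags after a one-off preprocessing of the pattern.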
Taxonomy
Topics: Algorithms and Data Compression
