Teaching the Burrows-Wheeler Transform via the Positional Burrows-Wheeler Transform
Travis Gagie, Giovanni Manzini, Marinella Sciortino

TL;DR
This paper advocates teaching the Positional Burrows-Wheeler Transform (PBWT) before the classic BWT to make the latter easier to understand, showing how the PBWT supports efficient positional search on a set of strings and how it leads to the BWT and the FM-index.
Contribution
It proposes the PBWT as a pedagogical stepping stone to the BWT and FM-index, highlighting its relation to right-to-left radix sort and to a string's cyclic shifts for efficient pattern matching.
Findings
PBWT can be used for fast positional search on string sets
Prefix search is a special case of positional search
The PBWT of a string's cyclic shifts yields the BWT of that string
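
The last finding builds on the textbook definition of the BWT: sort all cyclic shifts of a terminated string and read off the last column. The sketch below shows only that definition, not the paper's PBWT-based route to it. The function name `naive_bwt` and the `$` terminator are illustrative assumptions, and the quadratic-space sorting is for exposition only.

```python
def naive_bwt(text: str, terminator: str = "$") -> str:
    """Textbook BWT: last column of the sorted cyclic shifts of text + terminator.

    Assumes the terminator does not occur in text and sorts before every
    other character (true for '$' versus ASCII letters).
    """
    if terminator in text:
        raise ValueError("terminator must not occur in the text")
    s = text + terminator
    # Build every cyclic shift explicitly, then sort them lexicographically.
    shifts = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The BWT is the last character of each shift, in sorted order.
    return "".join(shift[-1] for shift in shifts)


print(naive_bwt("banana"))  # prints annb$aa
```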
Abstract
The Burrows-Wheeler Transform (BWT) is often taught in undergraduate courses on algorithmic bioinformatics, because it underlies the FM-index and thus important tools such as Bowtie and BWA. Its admirers consider the BWT a thing of beauty but, despite thousands of pages being written about it over nearly thirty years, to undergraduates seeing it for the first time it still often seems like magic. Some who persevere are later shown the Positional BWT (PBWT), which was published twenty years after the BWT. In this paper we argue that the PBWT should be taught before the BWT. We first use the PBWT's close relation to a right-to-left radix sort to explain how to use it as a fast and space-efficient index for positional search on a set of strings (that is, given a pattern and a position, quickly list the strings containing that pattern starting in that position). We then…
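
To make the notion of positional search concrete, here is a minimal sketch under stated assumptions: positions are 0-based, all strings have equal length, and the names `positional_search` and `radix_order_from` are hypothetical. The first function is a brute-force statement of the problem. The second runs a right-to-left radix sort down to a given column, after which the strings that match a pattern at that column sit next to each other in the resulting order. This is not the paper's index, which is space-efficient; it only illustrates the property the abstract alludes to.

```python
from typing import List


def positional_search(strings: List[str], pattern: str, position: int) -> List[int]:
    """Brute force: indices of strings containing pattern starting at position."""
    return [i for i, s in enumerate(strings) if s.startswith(pattern, position)]


def radix_order_from(strings: List[str], position: int) -> List[int]:
    """Right-to-left radix sort over columns len-1 down to position.

    Returns string indices ordered by the suffix starting at `position`.
    Assumes all strings have the same length.
    """
    length = len(strings[0])
    if any(len(s) != length for s in strings):
        raise ValueError("all strings must have the same length")
    order = list(range(len(strings)))
    for col in range(length - 1, position - 1, -1):
        # sorted() is stable, so the order from earlier (rightmost) passes
        # is preserved among strings that tie on this column.
        order = sorted(order, key=lambda i: strings[i][col])
    return order


strings = ["GATTACA", "GATCACA", "CATTAGA", "TATTACA"]
matches = positional_search(strings, "TTA", 2)
order = radix_order_from(strings, 2)
print(matches)  # prints [0, 2, 3]
print(order)    # prints [1, 0, 3, 2]
```

In the printed order, the three matching strings occupy one contiguous run (ranks 1 to 3), which is why a sorted order of this kind can answer a positional query by locating a single range.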
Taxonomy
Topics: Algorithms and Data Compression · Natural Language Processing Techniques · Genomics and Phylogenetic Studies
