Probabilistic Approaches to Alignment with Tandem Repeats
Michal Nánási, Tomáš Vinař, and Broňa Brejová

TL;DR
This paper introduces a pair hidden Markov model that effectively incorporates short tandem repeats into sequence alignment, improving accuracy over traditional models through novel decoding algorithms.
Contribution
It presents a new tractable pair HMM framework for aligning sequences with tandem repeats and develops specialized decoding algorithms for enhanced accuracy.
Findings
Our model outperforms the classical three-state pair HMM in simulations.
Decoding algorithms based on gain functions improve alignment accuracy.
Block-based decoding algorithms are effective for tandem repeat regions.
Abstract
We propose a simple tractable pair hidden Markov model for pairwise sequence alignment that accounts for the presence of short tandem repeats. Using the framework of gain functions, we design several optimization criteria for decoding this model and describe the resulting decoding algorithms, ranging from the traditional Viterbi and posterior decoding to block-based decoding algorithms specialized for our model. We compare the accuracy of individual decoding algorithms on simulated data and find our approach superior to the classical three-state pair HMM in simulations.
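For orientation, the classical three-state pair HMM used as the baseline can be decoded with the Viterbi algorithm roughly as sketched below. This is only an illustrative sketch of that baseline, not the repeat-aware model or the block-based decoders proposed in the paper. The function name `viterbi_pair_hmm`, the parameters `delta`, `epsilon`, and `match_p`, their default values, the DNA alphabet, and the convention of starting in the match state are all assumptions made for this example.

```python
import math

NEG_INF = float("-inf")

def viterbi_pair_hmm(x, y, delta=0.1, epsilon=0.3, match_p=0.9):
    """Viterbi alignment under a three-state pair HMM (states M, X, Y).

    M emits an aligned pair, X emits a symbol of x against a gap,
    Y emits a symbol of y against a gap. All parameter values are
    illustrative assumptions, not taken from the paper.
    """
    # Emission log-probabilities over an assumed 4-letter DNA alphabet.
    log_match = math.log(match_p / 4)            # identical pair
    log_mismatch = math.log((1 - match_p) / 12)  # differing pair
    log_gap_emit = math.log(1 / 4)               # single symbol vs gap
    # Transition log-probabilities (no direct X<->Y transitions).
    trans = {
        ("M", "M"): math.log(1 - 2 * delta),
        ("M", "X"): math.log(delta), ("M", "Y"): math.log(delta),
        ("X", "X"): math.log(epsilon), ("X", "M"): math.log(1 - epsilon),
        ("Y", "Y"): math.log(epsilon), ("Y", "M"): math.log(1 - epsilon),
    }
    n, m = len(x), len(y)
    states = "MXY"
    V = {s: [[NEG_INF] * (m + 1) for _ in range(n + 1)] for s in states}
    back = {s: [[None] * (m + 1) for _ in range(n + 1)] for s in states}
    V["M"][0][0] = 0.0  # assumed convention: start in M before any emission

    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            for s in states:
                # Each state consumes a different part of the input.
                if s == "M":
                    if i == 0 or j == 0:
                        continue
                    pi, pj = i - 1, j - 1
                    emit = log_match if x[i - 1] == y[j - 1] else log_mismatch
                elif s == "X":
                    if i == 0:
                        continue
                    pi, pj, emit = i - 1, j, log_gap_emit
                else:
                    if j == 0:
                        continue
                    pi, pj, emit = i, j - 1, log_gap_emit
                # Pick the best predecessor state.
                best, best_prev = NEG_INF, None
                for p in states:
                    if (p, s) in trans and V[p][pi][pj] > NEG_INF:
                        score = V[p][pi][pj] + trans[(p, s)]
                        if score > best:
                            best, best_prev = score, p
                if best_prev is not None:
                    V[s][i][j] = best + emit
                    back[s][i][j] = best_prev

    # Trace back from the best final state to recover the alignment.
    state = max(states, key=lambda s: V[s][n][m])
    score = V[state][n][m]
    ax, ay = [], []
    i, j = n, m
    while i > 0 or j > 0:
        prev = back[state][i][j]
        if state == "M":
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif state == "X":
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
        state = prev
    return score, "".join(reversed(ax)), "".join(reversed(ay))
```

Under these assumed parameters, calling `viterbi_pair_hmm("ACGT", "ACT")` should place a single gap opposite the G. A model of this kind has no dedicated state for tandem repeats, which is the limitation the paper's model is designed to address.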
Taxonomy
Topics: Genomics and Phylogenetic Studies · Algorithms and Data Compression · RNA and protein synthesis mechanisms
