Approximate statistical alignment by iterative sampling of substitution matrices
Joseph L. Herman, Adrienn Szabó, István Miklós, Jotun Hein

TL;DR
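This paper introduces an MCMC-based method for jointly sampling substitution matrices and multiple sequence alignments from an approximate posterior distribution. The method automates the selection of substitution-matrix parameters and, in the cases considered, yields alignments that are more accurate than those generated with the standard BLOSUM62 matrix.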
Contribution
It presents a novel iterative procedure for jointly sampling substitution matrices and alignments, which improves alignment accuracy and automates parameter selection.
Findings
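In the cases considered, the sampled alignments with the highest likelihood are consistently more accurate than alignments generated using the standard BLOSUM62 matrix.
The method efficiently generates alternative alignments according to their expected accuracy.
Appropriate substitution-matrix parameters are selected in an automated fashion as part of the sampling.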
Abstract
We outline a procedure for jointly sampling substitution matrices and multiple sequence alignments, according to an approximate posterior distribution, using an MCMC-based algorithm. This procedure provides an efficient and simple method by which to generate alternative alignments according to their expected accuracy, and allows appropriate parameters for substitution matrices to be selected in an automated fashion. In the cases considered here, the sampled alignments with the highest likelihood have an accuracy consistently higher than alignments generated using the standard BLOSUM62 matrix.
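The idea of alternating between sampling an alignment and sampling substitution parameters can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes a pairwise (not multiple) alignment, a single hypothetical match-probability parameter `p` in place of a full substitution matrix, a fixed hypothetical gap weight `gap`, and an unnormalized joint weight as the target. The function names `sample_alignment` and `gibbs_sample` are invented for illustration.

```python
import math
import random

def sample_alignment(x, y, p, gap, rng):
    """Draw a pairwise alignment with probability proportional to its weight.

    Toy weights (assumptions, not from the paper): a matching column
    contributes p, a mismatching column (1 - p) / 3, a gap column `gap`.
    """
    n, m = len(x), len(y)

    def sub(a, b):
        return p if a == b else (1.0 - p) / 3.0

    # F[i][j] = summed weight of all alignments of x[:i] with y[:j]
    F = [[0.0] * (m + 1) for _ in range(n + 1)]
    F[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            total = 0.0
            if i > 0 and j > 0:
                total += F[i - 1][j - 1] * sub(x[i - 1], y[j - 1])
            if i > 0:
                total += F[i - 1][j] * gap
            if j > 0:
                total += F[i][j - 1] * gap
            F[i][j] = total

    # Stochastic traceback: choose each step in proportion to its weight.
    cols, i, j = [], n, m
    while i > 0 or j > 0:
        moves = []
        if i > 0 and j > 0:
            moves.append((F[i - 1][j - 1] * sub(x[i - 1], y[j - 1]),
                          (x[i - 1], y[j - 1]), i - 1, j - 1))
        if i > 0:
            moves.append((F[i - 1][j] * gap, (x[i - 1], "-"), i - 1, j))
        if j > 0:
            moves.append((F[i][j - 1] * gap, ("-", y[j - 1]), i, j - 1))
        r = rng.random() * F[i][j]
        for weight, col, pi, pj in moves:
            r -= weight
            if r <= 0:
                break
        cols.append(col)
        i, j = pi, pj
    cols.reverse()
    return cols

def gibbs_sample(x, y, n_iter=200, gap=0.05, seed=0):
    """Alternate between sampling an alignment and sampling p (toy Gibbs)."""
    rng = random.Random(seed)
    p = 0.5
    samples = []
    for _ in range(n_iter):
        aln = sample_alignment(x, y, p, gap, rng)
        paired = [(a, b) for a, b in aln if a != "-" and b != "-"]
        matches = sum(a == b for a, b in paired)
        mismatches = len(paired) - matches
        gaps = len(aln) - len(paired)
        # With a flat prior, p given the alignment is Beta(matches+1, mismatches+1).
        p = rng.betavariate(matches + 1, mismatches + 1)
        p = min(max(p, 1e-9), 1.0 - 1e-9)  # keep the logarithms finite
        log_w = (matches * math.log(p)
                 + mismatches * math.log((1.0 - p) / 3.0)
                 + gaps * math.log(gap))
        samples.append((log_w, p, aln))
    return samples
```

Under these assumptions, `max(gibbs_sample("ACGTAC", "ACGAC"))` picks the sampled alignment with the highest toy weight, loosely mirroring the abstract's comparison of highest-likelihood samples against a fixed matrix.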
Taxonomy
Topics: Genomics and Phylogenetic Studies · Algorithms and Data Compression · Bayesian Methods and Mixture Models
