Small Coupling Expansion for Multiple Sequence Alignment
Louise Budzynski, Andrea Pagnani

TL;DR
This paper introduces a novel sequence alignment algorithm that incorporates long-range correlations using a small-coupling expansion, improving upon traditional profile models that assume site independence.
Contribution
The authors develop a message passing-based alignment method employing a perturbative small-coupling expansion to account for long-range correlations in biological sequences.
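To make the contrast with profile models concrete, here is a minimal sketch of scoring a sequence with and without pairwise terms. It assumes a Potts-like energy with per-site fields `h` and pairwise couplings `J`, which this summary does not specify. The function names and toy parameters are hypothetical. The sketch does not implement the authors' message passing scheme or their small-coupling expansion; it only shows why an independent-site score cannot distinguish sequences that differ in correlated positions.

```python
import numpy as np

def profile_energy(seq, h):
    """Independent-site (profile) energy: a sum of per-site terms only."""
    return -sum(h[i, a] for i, a in enumerate(seq))

def pairwise_energy(seq, h, J):
    """Profile energy plus a coupling term for every pair of sites i < j."""
    energy = profile_energy(seq, h)
    n = len(seq)
    for i in range(n):
        for j in range(i + 1, n):
            energy -= J[i, j, seq[i], seq[j]]
    return energy

# Toy parameters (illustrative assumptions, not taken from the paper):
# 3 sites, 2 symbols, flat fields, one long-range coupling between sites 0 and 2.
n_sites, n_symbols = 3, 2
h = np.zeros((n_sites, n_symbols))
J = np.zeros((n_sites, n_sites, n_symbols, n_symbols))
J[0, 2, 1, 1] = 1.0  # symbol 1 at site 0 favours symbol 1 at site 2

seq_a = [1, 0, 1]  # satisfies the coupling
seq_b = [1, 0, 0]  # violates it

print(profile_energy(seq_a, h), profile_energy(seq_b, h))
print(pairwise_energy(seq_a, h, J), pairwise_energy(seq_b, h, J))
```

With flat fields the profile score is identical for both sequences, while the pairwise score is lower for `seq_a`, the sequence whose distant sites agree with the coupling.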
Findings
Outperforms standard alignment strategies on biological sequences
Effectively captures long-range correlations in sequence data
Provides a new framework for sequence alignment beyond profile models
Abstract
The alignment of biological sequences such as DNA, RNA, and proteins is one of the basic tools for detecting evolutionary patterns, as well as for the functional and structural characterization of homologous sequences in different organisms. Typically, state-of-the-art bioinformatics tools are based on profile models that assume the statistical independence of the different sites of the sequences. In recent years, it has become increasingly clear that homologous sequences show complex patterns of long-range correlations over the primary sequence, as a consequence of the natural evolution process that selects genetic variants under the constraint of preserving the functional and structural determinants of the sequence. Here, we present a new alignment algorithm based on message passing techniques that overcomes the limitations of profile models. Our method is based on a new perturbative…
Taxonomy
Topics: Genomics and Phylogenetic Studies · Machine Learning in Bioinformatics · RNA and protein synthesis mechanisms
Methods: Test
