Sequence-structure relations of biopolymers
Christopher Barrett, Fenix W. Huang, Christian M. Reidys

TL;DR
This paper explores the correlation between sequence and structure in biopolymers like RNA, introducing a novel approach to identify embedded patterns beyond traditional sequence alignment by analyzing mutual information and energy spectra.
Contribution
It presents a new framework connecting sequence-structure mutual information with partition functions and introduces methods to detect embedded patterns in RNA sequences.
Findings
Identifies multiple sequences with similar mutual information but poor alignment.
Connects partition function computation to mutual information in RNA structures.
Provides a criterion to distinguish native structures based on sequence patterns.
Abstract
Motivation: DNA data is transcribed into single-stranded RNA, which folds into specific molecular structures. In this paper we ask to what extent sequence and structure information correlate. We view this correlation as structural semantics of sequence data that allows for a different interpretation than conventional sequence alignment. Structural semantics could enable us to identify more general embedded "patterns" in DNA and RNA sequences.

Results: We compute the partition function of sequences with respect to a fixed structure and connect this computation to the mutual information of a sequence-structure pair for RNA secondary structures. We present a Boltzmann sampler and obtain the a priori probability of specific sequence patterns. We present a detailed analysis for three PDB structures: 2JXV (hairpin), 2N3R (3-branch multi-loop), and 1EHZ (tRNA). We localize…
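To make the idea of a partition function of sequences with respect to a fixed structure more concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes a deliberately simplified energy model in which each base pair contributes an independent energy and unpaired positions contribute nothing, whereas a realistic treatment would use a loop-based energy model. The energy values in `PAIR_ENERGY`, the constant `RT`, and all function names are hypothetical and chosen only for illustration. Under this assumption the sum over sequences factorizes, and a Boltzmann sampler can draw each base pair independently.

```python
import itertools
import math
import random

# Hypothetical toy energies (kcal/mol) per base pair; NOT the paper's parameters.
PAIR_ENERGY = {
    ("G", "C"): -3.0, ("C", "G"): -3.0,
    ("A", "U"): -2.0, ("U", "A"): -2.0,
    ("G", "U"): -1.0, ("U", "G"): -1.0,
}
RT = 0.616  # kcal/mol, roughly 37 degrees Celsius (assumed)
BASES = "ACGU"


def parse_pairs(structure):
    """Return the (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.append((stack.pop(), j))
    return pairs


def energy(seq, pairs):
    """Toy energy: sum of pair energies; incompatible sequences get +inf."""
    total = 0.0
    for i, j in pairs:
        e = PAIR_ENERGY.get((seq[i], seq[j]))
        if e is None:
            return math.inf
        total += e
    return total


def partition_function(structure):
    """Sum of exp(-E/RT) over all sequences, in closed form for the toy model."""
    pairs = parse_pairs(structure)
    unpaired = len(structure) - 2 * len(pairs)
    q_pair = sum(math.exp(-e / RT) for e in PAIR_ENERGY.values())
    return (len(BASES) ** unpaired) * (q_pair ** len(pairs))


def brute_force_partition_function(structure):
    """Same sum by explicit enumeration; only feasible for short structures."""
    pairs = parse_pairs(structure)
    total = 0.0
    for seq in itertools.product(BASES, repeat=len(structure)):
        e = energy(seq, pairs)
        if e != math.inf:
            total += math.exp(-e / RT)
    return total


def boltzmann_sample(structure, rng=random):
    """Draw one sequence with probability proportional to exp(-E/RT)."""
    pairs = parse_pairs(structure)
    # Unpaired positions are uniform because they carry no energy here.
    seq = [rng.choice(BASES) for _ in structure]
    pair_types = list(PAIR_ENERGY)
    weights = [math.exp(-PAIR_ENERGY[p] / RT) for p in pair_types]
    for i, j in pairs:
        seq[i], seq[j] = rng.choices(pair_types, weights=weights)[0]
    return "".join(seq)
```

For a short hairpin such as `"((...))"`, `partition_function` and `brute_force_partition_function` should agree up to floating-point error, and every sequence returned by `boltzmann_sample` is compatible with the structure. The sketch covers only the partition function and the sampler; it does not attempt to reproduce the paper's definition of sequence-structure mutual information.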
