Making ends meet or just meeting at the ends? Assessing end-to-end distance in folded RNA sequences and other branched structures
Torin Greenwood, Christine Heitsch

TL;DR
This paper analyzes end-to-end distance in folded RNA and other branched structures using combinatorial models, and finds that the end-to-end distance parameters of known RNA structures are more concentrated than those of the minimum free-energy structures of randomized shuffles.
Contribution
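It completely characterizes the parameters tracking end-to-end distance in combinatorial branching models, including means and variances, and compares these theoretical predictions with known RNA structures and with the minimum free-energy structures of randomized shuffles.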
Findings
Ends of branched structures are almost certainly close.
Known RNA structures have parameter values similar to the theoretical ones but are more concentrated.
The minimum free-energy structures of randomized shuffles resemble the theoretical distributions.
Abstract
Researchers have repeatedly found that the ends of an RNA sequence are significantly closer than expected for a random linear chain. However, we prove that the ends of a branched structure are almost certainly close. Our results are obtained via combinatorial branching models of increasing complexity using tools from multivariate analytic combinatorics. We completely characterize parameters tracking end-to-end distance, including means and variances. Then, we compare to existing datasets of known RNA structures, as well as the minimum free-energy structures of randomized shuffles. We find that the shuffled structures resemble our theoretical distributions while the known RNA structures have similar parameter values but are more concentrated.
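As a rough illustration of what "end-to-end distance" can mean for a folded sequence, the sketch below computes one common graph-based notion: the shortest path from the first to the last nucleotide when both backbone neighbors and base pairs count as single steps. This definition, the dot-bracket input format, and the function names `pair_table` and `end_to_end_distance` are assumptions made for illustration; the abstract does not state which parameters the paper tracks, so this should not be read as the authors' method.

```python
from collections import deque


def pair_table(structure: str) -> dict[int, int]:
    """Map each paired position to its partner in a dot-bracket string."""
    stack: list[int] = []
    pairs: dict[int, int] = {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return pairs


def end_to_end_distance(structure: str) -> int:
    """Shortest path between the two ends (illustrative definition).

    Nodes are nucleotides; edges join backbone neighbors and base-paired
    positions, each with length 1. Breadth-first search gives the distance.
    """
    n = len(structure)
    if n == 0:
        raise ValueError("empty structure")
    pairs = pair_table(structure)
    dist = {0: 0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        if i == n - 1:
            return dist[i]
        neighbors = [i - 1, i + 1]
        if i in pairs:
            neighbors.append(pairs[i])
        for j in neighbors:
            if 0 <= j < n and j not in dist:
                dist[j] = dist[i] + 1
                queue.append(j)
    return dist[n - 1]  # not reached: the backbone keeps the graph connected


if __name__ == "__main__":
    # An unfolded chain of 8 nucleotides versus two branched foldings of it.
    for s in ["........", ".((..)).", "(..)(..)"]:
        print(s, end_to_end_distance(s))
```

Under this definition an unpaired chain of length n has distance n - 1, while any folding whose exterior loop contains only a few helices and unpaired bases has a short distance regardless of n, which is the intuition behind the claim that the ends of a branched structure are close.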
