Towards Parsimonious Generative Modeling of RNA Families
Francesco Calvanese, Camille N. Lambert, Philippe Nghe, Francesco Zamponi, Martin Weigt

TL;DR
This paper presents eaDCA, a simple, efficient, and interpretable generative model for RNA sequences that accurately mimics natural sequences and estimates the vast number of potential functional RNAs within a family.
Contribution
The paper introduces eaDCA, a sparse coevolutionary model tailored for RNA, which achieves high performance with fewer parameters and provides insights into the size of functional RNA sequence space.
Findings
eaDCA generates artificial RNA sequences that closely resemble natural ones in both statistical analyses and SHAPE-MaP experiments.
eaDCA estimates approximately 10^39 functional sequences for a specific RNA family.
The model operates with significantly fewer parameters than complex existing methods.
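To make the parameter saving concrete, here is a minimal sketch that counts the parameters of a generic Potts-style coevolutionary model with one field per site and state, and one coupling matrix per coupled pair of sites. It is not taken from the paper: the function name `potts_parameter_count`, the sequence length of 100, the 5-state alphabet (A, C, G, U, gap), and the 200 active edges are all illustrative assumptions, and no gauge fixing is applied.

```python
def potts_parameter_count(length, n_states, n_edges=None):
    """Count parameters of a Potts-style model.

    Each site carries n_states fields, and each coupled pair of sites
    carries an n_states x n_states coupling matrix. If n_edges is None,
    every pair of sites is coupled (the dense case).
    """
    max_edges = length * (length - 1) // 2
    if n_edges is None:
        n_edges = max_edges
    if not 0 <= n_edges <= max_edges:
        raise ValueError("n_edges must lie between 0 and L(L-1)/2")
    return length * n_states + n_edges * n_states * n_states


# Hypothetical numbers, chosen only for illustration (not from the paper).
dense = potts_parameter_count(length=100, n_states=5)
sparse = potts_parameter_count(length=100, n_states=5, n_edges=200)
print(f"dense: {dense}, sparse: {sparse}, ratio: {dense / sparse:.1f}")
```

Under these assumptions the coupling term dominates the dense count, so restricting couplings to a small set of pairs is where the saving comes from.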
Abstract
Generative probabilistic models emerge as a new paradigm in data-driven, evolution-informed design of biomolecular sequences. This paper introduces a novel approach, called Edge Activation Direct Coupling Analysis (eaDCA), tailored to the characteristics of RNA sequences, with a strong emphasis on simplicity, efficiency, and interpretability. eaDCA explicitly constructs sparse coevolutionary models for RNA families, achieving performance levels comparable to more complex methods while utilizing a significantly lower number of parameters. Our approach demonstrates efficiency in generating artificial RNA sequences that closely resemble their natural counterparts in both statistical analyses and SHAPE-MaP experiments, and in predicting the effect of mutations. Notably, eaDCA provides a unique feature: estimating the number of potential functional sequences within a given RNA family. For…
Taxonomy
Topics: RNA and protein synthesis mechanisms · RNA modifications and cancer · RNA Research and Splicing
