Deep learning models for predicting RNA degradation via dual crowdsourcing
Hannah K. Wayment-Steele, Wipapat Kladwang, Andrew M. Watkins, Do Soon Kim, Bojan Tunguz, Walter Reade, Maggie Demkin, Jonathan Romano, Roger Wellington-Oguri, John J. Nicol, Jiayang Gao, Kazuki Onodera, Kazuki Fujikawa, Hanfei Mao, Gilles Vandewiele, Michele Tinti

TL;DR
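This study used a crowdsourced machine learning competition to develop models that predict RNA degradation at single-nucleotide resolution. 41% of the winning model's nucleotide-level predictions fell within experimental error, and the models generalized to longer mRNA molecules, which can aid in designing more stable mRNA therapeutics.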
Contribution
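The paper introduces a dual crowdsourcing approach to RNA stability prediction: the RNA constructs were solicited through crowdsourcing on the design platform Eterna, and the prediction models were developed in a Kaggle competition ("Stanford OpenVaccine"). The entire experiment was completed in less than 6 months, and the resulting models generalized effectively.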
Findings
41% of the winning model's nucleotide-level predictions fell within experimental error
Models generalized well to longer mRNA molecules
Natural language processing architectures improved prediction accuracy
Abstract
Messenger RNA-based medicines hold immense potential, as evidenced by their rapid deployment as COVID-19 vaccines. However, worldwide distribution of mRNA molecules has been limited by their thermostability, which is fundamentally limited by the intrinsic instability of RNA molecules to a chemical degradation reaction called in-line hydrolysis. Predicting the degradation of an RNA molecule is a key task in designing more stable RNA-based therapeutics. Here, we describe a crowdsourced machine learning competition ("Stanford OpenVaccine") on Kaggle, involving single-nucleotide resolution measurements on 6043 102-130-nucleotide diverse RNA constructs that were themselves solicited through crowdsourcing on the RNA design platform Eterna. The entire experiment was completed in less than 6 months, and 41% of nucleotide-level predictions from the winning model were within experimental error of…
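The "within experimental error" figure can be read as a per-nucleotide hit rate. The sketch below is only an illustration of that idea, not the authors' evaluation code: it assumes a prediction counts as a hit when its absolute difference from the measured value is no larger than the reported measurement error at that position. The function name `fraction_within_error` and all numeric values are hypothetical.

```python
def fraction_within_error(predicted, measured, error):
    """Return the fraction of positions where |predicted - measured| <= error.

    Assumption: "within experimental error" means the absolute residual does
    not exceed the per-position measurement uncertainty. The source does not
    spell out its exact criterion.
    """
    if not (len(predicted) == len(measured) == len(error)):
        raise ValueError("all inputs must have the same length")
    if not predicted:
        raise ValueError("inputs must be non-empty")
    hits = sum(
        1 for p, m, e in zip(predicted, measured, error) if abs(p - m) <= e
    )
    return hits / len(predicted)


# Hypothetical per-nucleotide values for a 5-nucleotide stretch (illustrative only).
predicted = [0.50, 0.75, 0.25, 1.00, 0.50]
measured = [0.50, 0.25, 0.25, 0.50, 1.50]
error = [0.25, 0.25, 0.25, 0.25, 0.25]

fraction = fraction_within_error(predicted, measured, error)
print(f"{fraction:.0%} of predictions fall within the assumed error band")
```

In this toy input, two of the five positions fall inside the error band. A real evaluation would pool every measured nucleotide across all constructs before taking the ratio.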
Taxonomy
Topics: RNA and protein synthesis mechanisms · RNA Interference and Gene Delivery · Viral Infections and Immunology Research
