Understanding the errors of SHAPE-directed RNA structure modeling
Wipapat Kladwang, Christopher C. VanLang, Pablo Cordero, and Rhiju Das

TL;DR
This study benchmarks SHAPE-directed RNA secondary structure modeling against six RNAs with known crystallographic structures and finds substantial helix-level errors: an overall false negative rate of 17% and a false discovery rate of 21%.
Contribution
It provides a comprehensive analysis of SHAPE data accuracy and introduces helix-by-helix confidence estimates to improve interpretation of RNA structure models.
Findings
False negative rate of 17% in modeling helices
False discovery rate of 21% in modeling helices, with at least one helix prediction error in five of the six RNAs
Filtering data can modestly improve modeling accuracy
Abstract
Single-nucleotide-resolution chemical mapping for structured RNA is being rapidly advanced by new chemistries, faster readouts, and coupling to computational algorithms. Recent tests have shown that selective 2'-hydroxyl acylation by primer extension (SHAPE) can give near-zero error rates (0-2%) in modeling the helices of RNA secondary structure. Here, we benchmark the method using six molecules for which crystallographic data are available: tRNA(phe) and 5S rRNA from Escherichia coli, the P4-P6 domain of the Tetrahymena group I ribozyme, and ligand-bound domains from riboswitches for adenine, cyclic di-GMP, and glycine. SHAPE-directed modeling of these highly structured RNAs gave an overall false negative rate (FNR) of 17% and a false discovery rate (FDR) of 21%, with at least one helix prediction error in five of the six cases. Extensive variations of data processing, normalization,…
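The two error rates quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions, FNR = missed reference helices / all reference helices and FDR = incorrect predicted helices / all predicted helices, and it assumes a helix counts as recovered only when its label matches exactly. The paper's actual helix-matching criterion is not given in the text above. The function name `helix_error_rates` and the helix labels are hypothetical and are not data from the study.

```python
def helix_error_rates(reference, predicted):
    """Return (FNR, FDR) for two collections of helix labels.

    Assumption: a predicted helix is correct only if its label also
    appears in the reference (crystallographic) set.
    """
    reference = set(reference)
    predicted = set(predicted)

    true_pos = len(reference & predicted)   # helices recovered by the model
    false_neg = len(reference - predicted)  # reference helices that were missed
    false_pos = len(predicted - reference)  # predicted helices not in the reference

    # Guard against empty inputs to avoid division by zero.
    fnr = false_neg / (true_pos + false_neg) if reference else 0.0
    fdr = false_pos / (true_pos + false_pos) if predicted else 0.0
    return fnr, fdr


# Hypothetical example: four reference helices, one missed, one spurious.
reference_helices = {"P1", "P2", "P3", "P4"}
predicted_helices = {"P1", "P2", "P3", "X1"}

fnr, fdr = helix_error_rates(reference_helices, predicted_helices)
print(f"FNR = {fnr:.0%}, FDR = {fdr:.0%}")
```

The two rates use different denominators: FNR is normalized by the number of reference helices and FDR by the number of predicted helices, so they need not be equal for the same model.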
