Designing RNA Secondary Structures is Hard
Édouard Bonnet, Paweł Rzążewski, Florian Sikora

TL;DR
This paper proves that designing RNA secondary structures is NP-complete when the base at some positions of the sequence is fixed in advance, highlighting the computational difficulty of an important problem in bioinformatics.
Contribution
It establishes the NP-completeness of RNA secondary structure design under natural constraints in the simplest energy model, giving a partial answer to a long-standing open question: the complexity of the unconstrained design problem is still unknown.
Findings
The RNA design problem is NP-complete once sequence labeling constraints are added (a given position must carry a given base)
The result holds in the simplest energy model, the Watson-Crick model
The result suggests that the same hardness holds for more realistic energy models
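To make the constrained design problem concrete, here is a minimal brute-force sketch in Python. It rests on assumptions that this summary does not spell out: a structure is written in dot-bracket notation, the energy of a folding is its number of Watson-Crick pairs (A-U, C-G) with no minimum loop length, and a sequence "designs" a structure when that structure is its unique maximum-pair folding. The names `optimal_foldings` and `design` are hypothetical, and the exhaustive search over 4^n sequences is only an illustration, not the authors' construction.

```python
from functools import lru_cache
from itertools import product

# Assumed energy model: only Watson-Crick pairs count, each pair scores 1.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def optimal_foldings(seq):
    """Return (max number of pairs, number of pseudoknot-free foldings reaching it)."""

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Works on the half-open window seq[i:j].
        if j - i < 2:
            return (0, 1)  # only the all-unpaired folding
        last = j - 1
        # Case 1: the last base of the window is unpaired.
        best, count = solve(i, last)
        # Case 2: the last base pairs with position k.
        for k in range(i, last):
            if (seq[k], seq[last]) in WATSON_CRICK:
                left_score, left_count = solve(i, k)
                inner_score, inner_count = solve(k + 1, last)
                score = left_score + inner_score + 1
                ways = left_count * inner_count
                if score > best:
                    best, count = score, ways
                elif score == best:
                    count += ways
        return (best, count)

    return solve(0, len(seq))


def design(structure, constraints=None):
    """Search for a sequence whose unique optimal folding is `structure`.

    `constraints` maps a 0-based position to the base it must carry.
    Returns a sequence as a string, or None if no sequence qualifies.
    """
    constraints = constraints or {}
    # Recover the paired positions from the dot-bracket string
    # (assumed to be well-parenthesized).
    stack, pairs = [], []
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.append((stack.pop(), pos))

    for cand in product("ACGU", repeat=len(structure)):
        if any(cand[i] != base for i, base in constraints.items()):
            continue  # violates a labeling constraint
        if any((cand[i], cand[j]) not in WATSON_CRICK for i, j in pairs):
            continue  # target structure is not even feasible on cand
        score, count = optimal_foldings(cand)
        if score == len(pairs) and count == 1:
            return "".join(cand)
    return None


print(design("()", {0: "G"}))          # GC
print(design("()", {0: "A", 1: "A"}))  # None: A cannot pair with A
```

The second call shows how labeling constraints can make a structure undesignable. The exponential search is only workable for very short structures, which is consistent with (but of course does not prove) the hardness result summarized above.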
Abstract
An RNA sequence is a word over an alphabet of four elements called bases. RNA sequences fold into secondary structures where some bases match one another while others remain unpaired. Pseudoknot-free secondary structures can be represented as well-parenthesized expressions with additional dots, where pairs of matching parentheses symbolize paired bases and dots symbolize unpaired bases. The two fundamental problems in RNA algorithmics are to predict how sequences fold within some model of energy and to design sequences of bases which will fold into targeted secondary structures. Predicting how a given RNA sequence folds into a pseudoknot-free secondary structure has been known to be solvable in cubic time since the eighties and in truly subcubic time by a recent result of Bringmann et al. (FOCS 2016). In stark contrast, it is unknown whether or not designing a given RNA secondary…
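The dot-bracket representation and the cubic-time folding prediction mentioned in the abstract can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the energy of a folding is taken to be its number of Watson-Crick pairs (A-U, C-G), any two positions may pair regardless of distance, and the dynamic program is the classic Nussinov-style recurrence rather than anything specific to this paper. The function names `parse_dot_bracket` and `max_pairs` are hypothetical.

```python
# Assumed energy model: only Watson-Crick pairs are allowed, each scores 1.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def parse_dot_bracket(structure: str) -> list[tuple[int, int]]:
    """Return the sorted list of paired positions (0-based) of a dot-bracket string."""
    stack = []
    pairs = []
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def max_pairs(seq: str) -> int:
    """Maximum number of Watson-Crick pairs in a pseudoknot-free folding of seq.

    Cubic time: O(n^2) windows, each scanning O(n) partners for its last base.
    """
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = optimum for the window seq[i..j], both ends inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # base j stays unpaired
            for k in range(i, j):   # base j pairs with base k
                if (seq[k], seq[j]) in WATSON_CRICK:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1] if k + 1 <= j - 1 else 0
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]


print(parse_dot_bracket("((.))"))  # [(0, 4), (1, 3)]
print(max_pairs("GGACC"))          # 2
```

In the example, the sequence GGACC admits the folding `((.))`, whose two pairs G-C match the optimum reported by `max_pairs`.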
