Designing RNA Secondary Structures is Hard

Édouard Bonnet; Paweł Rzążewski; Florian Sikora

arXiv:1710.11513·cs.DS·March 28, 2018

TL;DR

This paper proves that designing RNA secondary structures with specific constraints is NP-complete, highlighting the computational difficulty of an important problem in bioinformatics.

Contribution

It establishes the NP-completeness of RNA secondary structure design under natural constraints in the simplest energy model, addressing a long-standing open question.

Findings

01

The RNA design problem is NP-complete when positions of the sequence can be constrained to specific bases

02

The result applies to the simplest Watson-Crick energy model

03

The hardness result suggests that more realistic energy models are at least as hard

Abstract

An RNA sequence is a word over an alphabet of four elements $\{A, C, G, U\}$ called bases. RNA sequences fold into secondary structures where some bases match one another while others remain unpaired. Pseudoknot-free secondary structures can be represented as well-parenthesized expressions with additional dots, where pairs of matching parentheses symbolize paired bases and dots symbolize unpaired bases. The two fundamental problems in RNA algorithmics are to predict how sequences fold within some model of energy and to design sequences of bases which will fold into targeted secondary structures. Predicting how a given RNA sequence folds into a pseudoknot-free secondary structure has been known to be solvable in cubic time since the eighties, and in truly subcubic time by a recent result of Bringmann et al. (FOCS 2016). In stark contrast, it is unknown whether or not designing a given RNA secondary…
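To make the dot-bracket representation and the cubic-time folding bound concrete, here is a minimal Python sketch. It rests on assumptions that are not spelled out in the text above: the energy model is taken to simply count Watson-Crick pairs (A-U and C-G), no minimum loop length is enforced, and the function names `parse_dot_bracket`, `is_compatible`, and `max_pairs` are hypothetical. It is an illustration of the definitions, not the authors' construction.

```python
# Assumption: only Watson-Crick pairs A-U and C-G may form.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def parse_dot_bracket(structure):
    """Return the paired positions (i, j), i < j, of a dot-bracket string."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def is_compatible(sequence, structure):
    """Check that every pair of the structure joins two complementary bases."""
    if len(sequence) != len(structure):
        return False
    return all(
        (sequence[i], sequence[j]) in WATSON_CRICK
        for i, j in parse_dot_bracket(structure)
    )


def max_pairs(sequence):
    """Maximum number of pairs in any pseudoknot-free folding (cubic-time DP).

    best[i][j] holds the optimum for the half-open slice sequence[i:j].
    """
    n = len(sequence)
    best = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            value = best[i + 1][j]  # base i stays unpaired
            for k in range(i + 1, j):  # base i pairs with base k
                if (sequence[i], sequence[k]) in WATSON_CRICK:
                    value = max(value, 1 + best[i + 1][k] + best[k + 1][j])
            best[i][j] = value
    return best[0][n]


if __name__ == "__main__":
    print(parse_dot_bracket("((.))"))
    print(is_compatible("GCAGC", "((.))"))
    print(max_pairs("GCAGC"))
```

Under these assumptions, prediction amounts to filling the `best` table, whereas design asks the reverse question: given a target such as `((.))`, find a compatible sequence for which that target is the optimal folding. The sketch only checks compatibility and does not attempt to solve the design problem.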
