Hardness of RNA Folding Problem with Four Symbols
Yi-Jun Chang

TL;DR
This paper establishes a stronger computational hardness lower bound for the RNA folding problem with the biologically relevant four-symbol alphabet, and relates it to the Dyck edit distance problem, simplifying existing complexity proofs.
Contribution
It improves the lower bound for RNA folding with four symbols and connects it to Dyck edit distance, reducing the alphabet size needed for the hardness proof.
Findings
Conditional lower bound established for RNA folding with 4 symbols
Reduction from RNA folding to Dyck edit distance with alphabet size 10
Simplified proof of Dyck edit distance hardness
Abstract
An RNA sequence is a string composed of four types of nucleotides, A, C, G, and U. The goal of the RNA folding problem is to find a maximum cardinality set of crossing-free pairs of the form {A, U} or {C, G} in a given RNA sequence. The problem is central in bioinformatics and has received much attention over the years. Abboud, Backurs, and Williams (FOCS 2015) demonstrated a conditional lower bound for a generalized version of the RNA folding problem based on a conjectured hardness of the k-clique problem. Their lower bound requires the RNA sequence to have at least 36 types of symbols, making the result not applicable to the RNA folding problem in real life (i.e., alphabet size 4). In this paper, we present an improved lower bound that works for the alphabet size 4 case. We also investigate the Dyck edit distance problem, which is a string problem closely related to RNA…
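To make the problem statement concrete, here is a minimal sketch of the classical cubic-time dynamic program for the RNA folding problem as the abstract defines it: maximize the number of crossing-free {A, U} and {C, G} pairs. It is not taken from the paper, which is about lower bounds rather than algorithms. The function name `max_folding` is hypothetical, and the sketch assumes the purely combinatorial model with no minimum distance between paired positions.

```python
def max_folding(seq: str) -> int:
    """Return the maximum number of crossing-free {A,U} / {C,G} pairs in seq.

    Illustrative O(n^3) interval DP; assumes no minimum loop length.
    """
    allowed = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
    n = len(seq)
    if n == 0:
        return 0

    # best[i][j] = optimum for the substring seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]

    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            # Case 1: position j is left unpaired.
            result = best[i][j - 1]
            # Case 2: position j pairs with some k in [i, j-1]. The pair
            # (k, j) splits the interval into an independent left part
            # seq[i..k-1] and an inside part seq[k+1..j-1], which is what
            # keeps the chosen pairs crossing-free.
            for k in range(i, j):
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    inside = best[k + 1][j - 1] if k + 1 <= j - 1 else 0
                    result = max(result, left + 1 + inside)
            best[i][j] = result

    return best[0][n - 1]
```

For example, `max_folding("ACGU")` nests the C-G pair inside the A-U pair, whereas in `"ACUG"` the two candidate pairs would cross, so only one of them can be kept.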
Taxonomy
Topics: RNA and protein synthesis mechanisms · Algorithms and Data Compression · Genomics and Phylogenetic Studies
