Robust identification of noncoding RNA from transcriptomes requires phylogenetically-informed sampling
Stinus Lindgreen, Sinan Ugur Umu, Alicia Sook-Wei Lai, Hisham Eldai, Wenting Liu, Stephanie McGimpsey, Nicole Wheeler, Patrick J. Biggs, Nick R. Thomson, Lars Barquist, Anthony M. Poole, Paul P. Gardner

TL;DR
This study demonstrates that identifying noncoding RNAs from transcriptome data critically depends on phylogenetically-informed sampling, revealing limitations of current methods and emphasizing the need for phylogeny-aware experimental design.
Contribution
It shows that phylogeny-aware sampling is essential for robust noncoding RNA identification from transcriptomes, highlighting the narrow phylogenetic window for comparative methods.
Findings
Nearly a thousand candidate noncoding RNAs identified
Phylogenetic sampling strongly influences detection success
Only one phylogenetic cluster allowed effective separation of noncoding RNAs from noise
Abstract
Noncoding RNAs are integral to a wide range of biological processes, including translation, gene regulation, host-pathogen interactions and environmental sensing. While genomics is now a mature field, our capacity to identify noncoding RNA elements in bacterial and archaeal genomes is hampered by the difficulty of de novo identification. The emergence of new technologies for characterizing transcriptome outputs, notably RNA-seq, is improving noncoding RNA identification and expression quantification. However, a major challenge is to robustly distinguish functional outputs from transcriptional noise. To establish whether annotation of existing transcriptome data has effectively captured all functional outputs, we analysed over 400 publicly available RNA-seq datasets spanning 37 different Archaea and Bacteria. Using comparative tools, we identify close to a thousand highly-expressed…
