GENNUS: generative approaches for nucleotide sequences enhance mirtron classification
Alisson Gaspar Chiquitto, Liliane Santana Oliveira, Pedro Henrique Bugatti, Priscila Tiemi Maeda Saito, Mark Basham, Roberto Tadeu Raittz, Alexandre Rossi Paschoal

TL;DR
This paper introduces GENNUS, a new method using generative models to improve the classification of mirtrons and microRNAs by addressing data imbalance.
Contribution
GENNUS introduces novel data augmentation strategies using GANs and SMOTE to enhance mirtron and miRNA classification.
Findings
GAN-based methods generate high-quality synthetic data that improve classification accuracy.
Models trained with GAN-generated data outperform those using only real data or traditional SMOTE.
The approach enhances generalization across different machine learning frameworks.
Abstract
Classifying non-coding RNA (ncRNA) sequences, particularly mirtrons, is essential for elucidating gene regulation mechanisms. However, the prevalent class imbalance in ncRNA datasets presents significant challenges, often resulting in overfitting and diminished generalization in machine learning models. In this study, GENNUS (GENerative approaches for NUcleotide Sequences) is proposed, introducing novel data augmentation strategies using generative adversarial networks (GANs) and synthetic minority over-sampling technique (SMOTE) to enhance mirtron and canonical microRNA (miRNA) classification performance. Our GAN-based methods effectively generate high-quality synthetic data that capture the intricate patterns and diversity of real mirtron sequences, eliminating the need for extensive feature engineering. Through four experiments, it is demonstrated that models trained on a combination…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsCancer-related molecular mechanisms research · RNA and protein synthesis mechanisms · RNA Research and Splicing
