TL;DR
CapsProm introduces a capsule network-based model for promoter prediction in DNA sequences, demonstrating versatility across multiple organisms and outperforming baseline methods in most tested datasets.
Contribution
The paper presents a novel capsule network architecture for promoter prediction that generalizes across diverse organisms, whereas previous models required a new architecture and new hyperparameters for each organism.
Findings
Outperformed baseline in 5 of 7 datasets (F1-score)
Effective transfer learning across different organisms
Versatile architecture applicable to both eukaryotic and prokaryotic DNA
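For readers unfamiliar with the metric behind the first finding, the F1-score is the harmonic mean of precision and recall. The following is a minimal sketch in plain Python; the function name `f1_score` and the toy labels are illustrative assumptions and are not taken from the paper or its datasets.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the binary F1-score for the given positive class label."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical): 1 = promoter, 0 = non-promoter.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)
```

Because F1 ignores true negatives, it is often preferred over accuracy when promoter and non-promoter classes are imbalanced.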
Abstract
Locating the promoter region in DNA sequences is of paramount importance in the field of bioinformatics. This problem has been widely studied in the literature but is not yet fully resolved. Some researchers have presented remarkable results using convolutional networks, which allow the automatic extraction of features from a DNA chain. However, a universal architecture that generalizes to several organisms has not yet been achieved, requiring researchers to seek new architectures and hyperparameters for each new organism evaluated. In this work, we propose a versatile architecture, based on a capsule network, that can accurately identify promoter sequences in raw DNA data from seven different organisms, both eukaryotic and prokaryotic. Our model, CapsProm, could assist in the transfer of learning between organisms and expand its applicability. Furthermore, CapsProm showed…
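The abstract does not say how raw DNA is encoded or how the capsule layers are built. As a rough illustration only, the sketch below assumes one-hot encoding in A, C, G, T order and uses the squash nonlinearity from the original capsule network formulation (Sabour et al., 2017); CapsProm's actual preprocessing and layers may differ. The names `one_hot_encode` and `squash` are hypothetical.

```python
import math

NUCLEOTIDES = "ACGT"  # assumed channel order; not specified in the abstract


def one_hot_encode(sequence):
    """Map a DNA string to a list of 4-element one-hot rows.

    Unknown symbols (for example 'N') become all-zero rows.
    """
    rows = []
    for base in sequence.upper():
        row = [0.0] * len(NUCLEOTIDES)
        idx = NUCLEOTIDES.find(base)
        if idx >= 0:
            row[idx] = 1.0
        rows.append(row)
    return rows


def squash(vector, eps=1e-9):
    """Capsule squash: keep the direction, map the length into [0, 1).

    v = (|s|^2 / (1 + |s|^2)) * (s / |s|)
    """
    sq_norm = sum(x * x for x in vector)
    norm = math.sqrt(sq_norm)
    # eps guards against division by zero for an all-zero capsule.
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in vector]


encoded = one_hot_encode("ACGTN")
capsule_out = squash([3.0, 4.0])
capsule_len = math.sqrt(sum(x * x for x in capsule_out))
```

The length of a squashed capsule vector can be read as the probability that the entity it represents (here, hypothetically, a promoter) is present, which is what makes the nonlinearity suitable for classification.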
Taxonomy
Methods: Convolution
