AttnGen: Attention-Guided Saliency Learning for Interpretable Genomic Sequence Classification
Rayhaneh Shabani Nia, Ali Karkehabadi

TL;DR
AttnGen is an attention-guided training framework for genomic sequence classification. It builds interpretability into training by focusing the model on informative sequence regions, which improves both accuracy and stability.
Contribution
This work presents AttnGen, an attention-based method that integrates interpretability into training and outperforms a conventional CNN baseline on a genomic sequence classification benchmark.
Findings
With moderate masking, AttnGen achieves 96.73% validation accuracy on the demo_human_or_worm benchmark, surpassing the CNN baseline's 95.83%.
The model's importance scores are functionally relevant, as removing high-saliency nucleotides drastically reduces accuracy.
Masking 10-20% of positions balances interpretability and predictive performance.
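The deletion test behind the second finding can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `deletion_accuracy`, the `predict` callable, and the choice to zero out one-hot rows (rather than substitute another nucleotide) are all assumptions made for the example.

```python
import numpy as np

def deletion_accuracy(predict, x, y, saliency, fraction):
    """Accuracy after zeroing the most salient positions of each sequence.

    predict  : hypothetical callable mapping (n, L, 4) one-hot input to (n,) labels
    x        : one-hot sequences, shape (n, L, 4)
    y        : true labels, shape (n,)
    saliency : per-position importance scores, shape (n, L)
    fraction : share of positions to remove, e.g. 0.1 for 10%
    """
    n, length, _ = x.shape
    k = int(round(fraction * length))
    x_del = x.copy()
    if k > 0:
        # Indices of the k highest-saliency positions in each sequence.
        top = np.argsort(-saliency, axis=1, kind="stable")[:, :k]
        rows = np.arange(n)[:, None]
        # Assumption: "removing" a nucleotide means zeroing its one-hot row.
        x_del[rows, top] = 0.0
    return float(np.mean(predict(x_del) == y))
```

If the importance scores are meaningful, accuracy from this function should fall much faster as `fraction` grows than it does when the same number of positions is removed at random.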
Abstract
Deep neural networks have achieved strong performance in genomic sequence classification; however, relating their predictions to biologically meaningful sequence patterns remains challenging. In this work, we present AttnGen, an attention-guided training framework that embeds interpretability directly into the optimization process. AttnGen computes nucleotide-level importance scores using an attention mechanism and progressively suppresses low-contribution positions during training. This encourages the model to focus its predictions on a compact set of informative regions while reducing reliance on noisy sequence elements. We evaluate AttnGen on the standardized demo_human_or_worm benchmark, a binary classification task over 200-nucleotide sequences. With moderate masking, AttnGen achieves a validation accuracy of 96.73%, outperforming a conventional CNN baseline with 95.83% accuracy,…
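One way to picture the progressive suppression described in the abstract is the sketch below. It is an illustration under stated assumptions, not the paper's implementation: the linear ramp in `mask_fraction`, the 20% ceiling, the function names, and the use of zeroed one-hot rows as the masking operation are all hypothetical, and the importance scores are taken as given rather than produced by an attention layer.

```python
import numpy as np

def mask_fraction(epoch, total_epochs, max_fraction=0.2):
    """Assumed schedule: ramp the masked share linearly from 0 to max_fraction."""
    if total_epochs <= 1:
        return max_fraction
    step = min(epoch, total_epochs - 1)
    return max_fraction * step / (total_epochs - 1)

def suppress_low_importance(one_hot, scores, fraction):
    """Zero out the lowest-scoring positions of a single sequence.

    one_hot  : encoded sequence, shape (L, 4)
    scores   : per-nucleotide importance scores, shape (L,)
    fraction : share of positions to suppress
    """
    length = one_hot.shape[0]
    k = int(round(fraction * length))
    masked = one_hot.copy()  # leave the caller's array untouched
    if k > 0:
        lowest = np.argsort(scores, kind="stable")[:k]
        masked[lowest] = 0.0
    return masked
```

For a 200-nucleotide sequence, a fraction of 0.1 suppresses 20 positions and 0.2 suppresses 40, which matches the 10-20% range the findings describe as a good trade-off.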
