You Only Train Once: Differentiable Subset Selection for Omics Data
Daphné Chopard, Jorge da Silva Gonçalves, Irene Cannistraci, Thomas M. Sutter, Julia E. Vogt

TL;DR
YOTO is an end-to-end differentiable framework that jointly selects gene subsets and predicts outcomes from single-cell transcriptomic data, improving interpretability and performance without additional classifiers.
Contribution
The paper introduces YOTO, a novel end-to-end differentiable method for joint gene subset selection and prediction, enabling more effective biomarker discovery from single-cell RNA-seq data.
Findings
YOTO outperforms state-of-the-art baselines on two single-cell RNA-seq datasets.
The method produces compact gene subsets that generalize across tasks.
End-to-end training improves predictive accuracy and interpretability.
Abstract
Selecting compact and informative gene subsets from single-cell transcriptomic data is essential for biomarker discovery, improving interpretability, and cost-effective profiling. However, most existing feature selection approaches either operate as multi-stage pipelines or rely on post hoc feature attribution, making selection and prediction weakly coupled. In this work, we present YOTO (you only train once), an end-to-end framework that jointly identifies discrete gene subsets and performs prediction within a single differentiable architecture. In our model, the prediction task directly guides which genes are selected, while the learned subsets, in turn, shape the predictive representation. This closed feedback loop enables the model to iteratively refine both what it selects and how it predicts during training. Unlike existing approaches, YOTO enforces sparsity so that only the…
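To make the idea of coupling discrete selection and prediction in one trainable model more concrete, here is a minimal sketch. It is not the authors' implementation: the abstract excerpt does not specify YOTO's selection mechanism, so this example swaps in a generic technique, a hard top-k mask over learnable per-gene scores with Gumbel noise for exploration and a straight-through softmax gradient, feeding a logistic-regression predictor. All names (`hard_topk_mask`, `train_selector`), the synthetic data, and the hyperparameters are assumptions made for illustration.

```python
import numpy as np

def hard_topk_mask(scores, k):
    """Return a 0/1 vector with ones at the k largest scores."""
    mask = np.zeros_like(scores, dtype=float)
    mask[np.argsort(scores)[-k:]] = 1.0
    return mask

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def train_selector(X, y, k, steps=300, lr_w=0.5, lr_sel=5.0, seed=0):
    """Jointly fit per-gene selection scores and a logistic predictor.

    Illustrative assumption, not the method from the paper:
    forward pass uses a hard top-k mask, backward pass treats the
    mask as k * softmax(logits) (straight-through estimator).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    logits = np.zeros(d)   # learnable selection score per gene
    w = np.zeros(d)        # predictor weights
    b = 0.0                # predictor bias
    for _ in range(steps):
        # Gumbel noise makes the sampled subset vary between steps.
        noisy = logits + rng.gumbel(size=d)
        mask = hard_topk_mask(noisy, k)            # discrete forward pass
        z = np.clip((X * mask) @ w + b, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-z))
        err = (p - y) / n                          # dLoss/dz for mean cross-entropy
        xt_err = X.T @ err
        grad_w = mask * xt_err                     # only selected genes get updates
        grad_b = err.sum()
        grad_mask = w * xt_err                     # dLoss/dmask
        # Straight-through: push grad_mask through the softmax Jacobian.
        s = softmax(logits)
        grad_logits = k * s * (grad_mask - s @ grad_mask)
        w -= lr_w * grad_w
        b -= lr_w * grad_b
        logits -= lr_sel * grad_logits
    # Final subset: top-k of the learned scores, without noise.
    return hard_topk_mask(logits, k), w, b

# Synthetic stand-in for an expression matrix: 200 cells, 30 genes,
# with the label depending only on the first three genes.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 30))
y = (X[:, :3].sum(axis=1) > 0).astype(float)

selected, w, b = train_selector(X, y, k=5)
print("selected gene indices:", np.flatnonzero(selected))
```

The point of the sketch is the coupling: the prediction loss is the only signal that updates the selection scores, and the predictor only ever sees the genes in the current subset. A real model would replace the logistic regression with a neural network and rely on automatic differentiation rather than hand-written gradients.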
Taxonomy
Topics: Single-cell and spatial transcriptomics · Bioinformatics and Genomic Networks · Domain Adaptation and Few-Shot Learning
