scRAE: Deterministic Regularized Autoencoders with Flexible Priors for Clustering Single-cell Gene Expression Data
Arnab Kumar Mondal, Himanshu Asnani, Parag Singla, Prathosh AP

TL;DR
This paper introduces scRAE, a deterministic autoencoder with a learnable prior, designed to improve clustering of high-dimensional, sparse single-cell gene expression data by better balancing bias and variance.
Contribution
The paper proposes a novel deterministic autoencoder framework with a flexible prior generator for enhanced clustering of scRNA-seq data, addressing bias-variance trade-offs in regularized autoencoders.
Findings
scRAE outperforms existing methods on real-world datasets
The learnable prior improves clustering accuracy
The approach effectively balances bias and variance
Abstract
Clustering single-cell RNA sequencing (scRNA-seq) data poses statistical and computational challenges due to its high dimensionality and data sparsity, the latter also known as 'dropout' events. Recently, Regularized Auto-Encoder (RAE) based deep neural network models have achieved remarkable success in learning robust low-dimensional representations. The basic idea in RAEs is to learn a non-linear mapping from the high-dimensional data space to a low-dimensional latent space and vice versa, while simultaneously imposing a distributional prior on the latent space, which has a regularizing effect. This paper argues that RAEs, in their naive formulation, suffer from the well-known bias-variance trade-off. While a simple AE without latent regularization overfits the data, a very strong prior leads to under-representation and thus poor clustering. To address the above issues, we…
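The trade-off described in the abstract can be made concrete with a toy regularized autoencoder objective. The sketch below is only an illustration under stated assumptions, not the paper's method: the one-layer tanh encoder, the linear decoder, the fixed standard-normal prior, and the use of a kernel MMD as the latent regularizer are all stand-ins chosen for brevity, and the names `rbf_mmd2`, `rae_loss`, and `lam` are hypothetical.

```python
import numpy as np

def rbf_mmd2(a, b, bandwidth=1.0):
    """Biased estimate of the squared MMD between samples a and b (RBF kernel)."""
    def kernel(x, y):
        sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    return kernel(a, a).mean() + kernel(b, b).mean() - 2.0 * kernel(a, b).mean()

def rae_loss(x, enc_w, dec_w, prior_samples, lam):
    """Reconstruction error plus lam times a latent-to-prior divergence."""
    z = np.tanh(x @ enc_w)             # deterministic encoder: data -> latent
    x_hat = z @ dec_w                  # decoder: latent -> data
    recon = np.mean((x - x_hat) ** 2)  # how well the data are reproduced
    reg = rbf_mmd2(z, prior_samples)   # how far the codes are from the prior
    return recon + lam * reg, recon, reg

rng = np.random.default_rng(0)
x = rng.poisson(0.3, size=(64, 200)).astype(float)  # sparse toy count matrix
enc_w = rng.normal(scale=0.1, size=(200, 8))        # untrained toy weights
dec_w = rng.normal(scale=0.1, size=(8, 200))
prior_samples = rng.normal(size=(64, 8))            # assumed fixed N(0, I) prior

# lam = 0 is a plain autoencoder objective; a large lam lets the prior dominate.
total_plain, recon, reg = rae_loss(x, enc_w, dec_w, prior_samples, lam=0.0)
total_reg, _, _ = rae_loss(x, enc_w, dec_w, prior_samples, lam=10.0)
```

With `lam=0.0` the objective reduces to the reconstruction term alone, which corresponds to the unregularized AE that the abstract says overfits. As `lam` grows, matching the prior increasingly outweighs reconstruction, which is the strong-prior regime the abstract associates with under-representation.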
Taxonomy
Topics: Single-cell and spatial transcriptomics · Cancer-related molecular mechanisms research · Domain Adaptation and Few-Shot Learning
Methods: Regularized Autoencoders · Autoencoders
