ScRAE: Deterministic Regularized Autoencoders with Flexible Priors for Clustering Single-cell Gene Expression Data

Arnab Kumar Mondal; Himanshu Asnani; Parag Singla; Prathosh AP

arXiv:2107.07709·cs.LG·July 19, 2021

TL;DR

This paper introduces scRAE, a deterministic autoencoder with a learnable prior, designed to improve clustering of high-dimensional, sparse single-cell gene expression data by better balancing bias and variance.

Contribution

The paper proposes a novel deterministic autoencoder framework with a flexible prior generator for enhanced clustering of scRNA-seq data, addressing bias-variance trade-offs in regularized autoencoders.

Findings

01. scRAE outperforms existing methods on real-world datasets.

02. The learnable prior improves clustering accuracy.

03. The approach effectively balances bias and variance.

Abstract

Clustering single-cell RNA sequence (scRNA-seq) data poses statistical and computational challenges due to their high-dimensionality and data-sparsity, also known as 'dropout' events. Recently, Regularized Auto-Encoder (RAE) based deep neural network models have achieved remarkable success in learning robust low-dimensional representations. The basic idea in RAEs is to learn a non-linear mapping from the high-dimensional data space to a low-dimensional latent space and vice-versa, simultaneously imposing a distributional prior on the latent space, which brings in a regularization effect. This paper argues that RAEs suffer from the infamous problem of bias-variance trade-off in their naive formulation. While a simple AE without a latent regularization results in data over-fitting, a very strong prior leads to under-representation and thus bad clustering. To address the above issues, we…
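The generic RAE objective described in the abstract (reconstruction plus a distributional penalty on the latent codes) can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration: the single-layer `tanh` encoder, the linear decoder, the fixed standard-normal prior, the maximum mean discrepancy (MMD) penalty, and the names `rae_loss`, `mmd2`, and `lam` are hypothetical and are not taken from the paper or its repository. In particular, scRAE is described as using a learnable prior, which this sketch does not implement.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def mmd2(z, z_prior, bandwidth=1.0):
    # Biased estimate of the squared MMD between two sample sets.
    return (rbf_kernel(z, z, bandwidth).mean()
            + rbf_kernel(z_prior, z_prior, bandwidth).mean()
            - 2.0 * rbf_kernel(z, z_prior, bandwidth).mean())

def rae_loss(x, w_enc, w_dec, z_prior, lam=1.0):
    # Deterministic non-linear encoder: data space -> latent space.
    z = np.tanh(x @ w_enc)
    # Decoder: latent space -> data space.
    x_hat = z @ w_dec
    recon = np.mean((x - x_hat) ** 2)   # reconstruction term
    reg = mmd2(z, z_prior)              # latent regularization term
    return recon + lam * reg, recon, reg

rng = np.random.default_rng(0)
# Toy sparse count matrix standing in for expression data (64 cells, 200 genes).
x = rng.poisson(0.3, size=(64, 200)).astype(float)
w_enc = rng.normal(scale=0.1, size=(200, 8))
w_dec = rng.normal(scale=0.1, size=(8, 200))
# Samples from an assumed fixed prior (standard normal), one per cell.
z_prior = rng.normal(size=(64, 8))

total, recon, reg = rae_loss(x, w_enc, w_dec, z_prior, lam=1.0)
```

In this sketch, setting `lam=0.0` reduces the loss to a plain autoencoder with no latent regularization, while a large `lam` forces the codes toward the fixed prior; these are the two extremes of the trade-off the abstract refers to.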

Code & Models

Repositories

arnabkmondal/scRAE (tf, official)

Taxonomy

Topics: Single-cell and spatial transcriptomics · Cancer-related molecular mechanisms research · Domain Adaptation and Few-Shot Learning

Methods: Regularized Autoencoders · Autoencoders