GroupEnc: encoder with group loss for global structure preservation

David Novak; Sofie Van Gassen; Yvan Saeys

arXiv:2309.02917 · cs.LG · September 7, 2023

TL;DR

GroupEnc is a deep learning encoder, based on a variational autoencoder (VAE), whose 'group loss' function yields low-dimensional embeddings with less global structure distortion than a standard VAE. It is validated on public single-cell transcriptomic datasets.

Contribution

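It introduces a group loss function, derived from the stochastic quartet loss of the SQuadMDS algorithm, for a VAE-based encoder. The resulting embeddings distort global structure less than those of a VAE, while the model stays parametric and the architecture flexible.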

Findings

01 Less global structure distortion than a VAE on publicly available single-cell transcriptomic datasets

02 RNX curves used for quantitative evaluation of structure preservation

03 Parametric model with a flexible architecture
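To make the RNX evaluation mentioned above concrete, here is a minimal sketch of how such a curve can be computed with NumPy. It assumes the standard definition: Q_NX(K) is the average fraction of each point's K nearest neighbours shared between the high-dimensional data and the embedding, and R_NX(K) = ((N - 1) Q_NX(K) - K) / (N - 1 - K) rescales it so that 0 corresponds to a random embedding and 1 to perfect neighbourhood preservation. The function names are hypothetical and this is not the authors' evaluation code.

```python
import numpy as np


def pairwise_sq_dists(X):
    """Squared Euclidean distances between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.maximum(D, 0.0)  # clip small negatives caused by rounding


def rnx_curve(X_high, X_low, max_k=None):
    """R_NX(K) for K = 1..max_k (illustrative, brute-force implementation)."""
    n = X_high.shape[0]
    if max_k is None:
        max_k = n - 2  # R_NX is undefined for K = N - 1

    def neighbour_lists(X):
        D = pairwise_sq_dists(X)
        np.fill_diagonal(D, np.inf)  # push each point to the end of its own list
        return np.argsort(D, axis=1, kind="stable")[:, :max_k]

    nn_high = neighbour_lists(X_high)
    nn_low = neighbour_lists(X_low)

    rnx = np.empty(max_k)
    for k in range(1, max_k + 1):
        # Count neighbours shared by the two K-neighbourhoods of every point.
        overlap = sum(
            np.intersect1d(nn_high[i, :k], nn_low[i, :k]).size for i in range(n)
        )
        q_nx = overlap / (k * n)
        rnx[k - 1] = ((n - 1) * q_nx - k) / (n - 1 - k)
    return rnx
```

Small K values of the curve reflect local structure and large K values reflect global structure, which is why the curve suits a comparison focused on global structure preservation. This brute-force version is quadratic in memory and only practical for a few thousand points; larger datasets would need subsampling or an approximate approach.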

Abstract

Recent advances in dimensionality reduction have achieved more accurate lower-dimensional embeddings of high-dimensional data. In addition to visualisation purposes, these embeddings can be used for downstream processing, including batch effect normalisation, clustering, community detection or trajectory inference. We use the notion of structure preservation at both local and global levels to create a deep learning model, based on a variational autoencoder (VAE) and the stochastic quartet loss from the SQuadMDS algorithm. Our encoder model, called GroupEnc, uses a 'group loss' function to create embeddings with less global structure distortion than VAEs do, while keeping the model parametric and the architecture flexible. We validate our approach using publicly available biological single-cell transcriptomic datasets, employing RNX curves for evaluation.
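The abstract does not spell out the group loss, so the following is only a sketch of the general idea behind a quartet-style loss, under stated assumptions: points are randomly partitioned into small groups, the pairwise distances within each group are normalised by their sum, and the loss penalises the squared difference between these relative distances in the input space and in the embedding. The group size, sampling scheme, normalisation and the name `group_loss` are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def group_loss(X_high, Z_low, group_size=4, rng=None, eps=1e-12):
    """Illustrative group loss on relative within-group distances.

    group_size=4 corresponds to quartets; other sizes are a hypothetical
    generalisation used here for illustration only.
    """
    rng = np.random.default_rng(rng)
    n = X_high.shape[0]
    n_groups = n // group_size

    # Randomly partition the points into disjoint groups (leftovers dropped).
    groups = rng.permutation(n)[: n_groups * group_size]
    groups = groups.reshape(n_groups, group_size)

    # Index pairs (i < j) inside a group: 6 pairs for a quartet.
    iu, ju = np.triu_indices(group_size, k=1)

    def relative_dists(A):
        G = A[groups]                        # (n_groups, group_size, dim)
        diff = G[:, iu, :] - G[:, ju, :]     # (n_groups, n_pairs, dim)
        d = np.linalg.norm(diff, axis=-1)    # (n_groups, n_pairs)
        return d / (d.sum(axis=1, keepdims=True) + eps)

    # Both spaces use the same groups, so relative distances are comparable.
    sq_err = (relative_dists(X_high) - relative_dists(Z_low)) ** 2
    return float(np.mean(np.sum(sq_err, axis=1)))
```

Because the distances are normalised within each group, the value is unchanged by a uniform rescaling of the embedding, and because groups are drawn at random across the whole dataset, most of the compared distances are between distant points. In an actual encoder this quantity would be written in an automatic differentiation framework and minimised over the encoder weights; NumPy is used here only to keep the sketch self-contained.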

Taxonomy

Topics: Single-cell and spatial transcriptomics · Gene Regulatory Network Analysis · Bioinformatics and Genomic Networks