GSAE: an autoencoder with embedded gene-set nodes for genomics functional characterization

Hung-I Harry Chen; Yu-Chiao Chiu; Tinghe Zhang; Songyao Zhang; Yufei Huang; Yidong Chen

arXiv:1805.07874·stat.ML·January 1, 2019


TL;DR

This paper introduces GSAE, a deep learning autoencoder that embeds gene sets into a latent space, capturing biological relevance and improving tumor subtype classification and prognosis prediction.

Contribution

The study proposes a novel gene superset autoencoder (GSAE) that combines gene sets into an unbiased latent representation, enhancing biological interpretability and predictive power in genomics.

Findings

01

Gene supersets discriminate tumor subtypes effectively.

02

Supersets show strong prognostic capabilities.

03

High reproducibility in survival analysis.

Abstract

Bioinformatics tools have been developed to interpret gene expression data at the gene-set level, and these gene-set-based analyses improve biologists' ability to discover the functional relevance of their experimental designs. While gene sets are typically elucidated individually, associations between gene sets are rarely taken into consideration. Deep learning, an emerging machine learning technique in computational biology, can be used to generate an unbiased combination of gene sets and to determine the biological relevance and analysis consistency of these combined gene sets by leveraging large genomic data sets. In this study, we proposed a gene superset autoencoder (GSAE), a multi-layer autoencoder model incorporating a priori defined gene sets that retain the crucial biological features in the latent layer. We introduced the concept of the gene superset, an unbiased combination of…
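
To make the architecture in the abstract concrete, here is a minimal sketch of a forward pass through an autoencoder with a gene-set layer. It is not the authors' implementation. It assumes that "incorporation of a priori defined gene sets" means each gene-set node receives input only from its member genes (a masked weight matrix), and that a small dense layer on top plays the role of the gene supersets. The toy gene sets, layer sizes, tanh activation, and every function name are hypothetical, and the weights are random and untrained.

```python
import math
import random


def build_mask(genes, gene_sets):
    """Binary mask: mask[s][g] is 1.0 when gene g belongs to gene set s."""
    return [[1.0 if g in members else 0.0 for g in genes]
            for members in gene_sets.values()]


def dense(x, weights, mask=None):
    """Compute tanh(W x); weights outside the mask (if given) are zeroed."""
    out = []
    for i, row in enumerate(weights):
        total = 0.0
        for j, w in enumerate(row):
            keep = 1.0 if mask is None else mask[i][j]
            total += keep * w * x[j]
        out.append(math.tanh(total))
    return out


def random_weights(n_out, n_in, rng):
    """Random, untrained weights for an n_in -> n_out layer."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
            for _ in range(n_out)]


# Hypothetical toy inputs: four genes and three made-up gene sets.
genes = ["TP53", "BRCA1", "MYC", "EGFR"]
gene_sets = {
    "SET_A": {"TP53", "BRCA1"},
    "SET_B": {"MYC", "EGFR"},
    "SET_C": {"TP53", "MYC"},
}
N_SUPERSETS = 2  # assumed size of the latent (superset) layer

rng = random.Random(0)
mask = build_mask(genes, gene_sets)
w_gene_set = random_weights(len(gene_sets), len(genes), rng)
w_superset = random_weights(N_SUPERSETS, len(gene_sets), rng)
w_decoder = random_weights(len(genes), N_SUPERSETS, rng)


def forward(expression):
    """Encode expression -> gene-set scores -> supersets, then decode."""
    gene_set_scores = dense(expression, w_gene_set, mask)  # masked layer
    supersets = dense(gene_set_scores, w_superset)         # latent layer
    reconstruction = dense(supersets, w_decoder)           # decoder
    return gene_set_scores, supersets, reconstruction


expression = [0.8, -0.2, 0.5, 0.1]  # one sample, same order as `genes`
gene_set_scores, supersets, reconstruction = forward(expression)
```

Because of the mask, the score of `SET_A` depends only on TP53 and BRCA1, so each gene-set node stays interpretable, while each superset node is free to combine all gene-set nodes. A real model would train these weights to minimize reconstruction error.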


Taxonomy

Methods