Cancer Subtyping via Embedded Unsupervised Learning on Transcriptomics Data

Ziwei Yang; Lingwei Zhu; Zheng Chen; Ming Huang; Naoaki Ono; MD Altaf-Ul-Amin; Shigehiko Kanaya

arXiv:2204.02278·cs.LG·April 6, 2022

Open Access

TL;DR

This paper introduces an unsupervised deep learning approach for cancer subtyping that models the data distribution directly, which reduces overfitting, captures molecular features more effectively, and improves subtyping accuracy.

Contribution

It proposes a novel vector quantization-based method that bypasses Gaussian assumptions, enhancing unsupervised cancer subtyping from transcriptomics data.

Findings

01. Better capture of latent space features

02. Reduced overfitting in small sample sizes

03. Improved subtyping accuracy

Abstract

Cancer is one of the deadliest diseases worldwide. Accurate diagnosis and classification of cancer subtypes are indispensable for effective clinical treatment. With the emergence of various deep learning methods, promising results on automatic cancer subtyping systems have been published recently. However, such automatic systems often overfit because the data are high-dimensional and scarce. In this paper, we investigate automatic subtyping from an unsupervised learning perspective by directly constructing the underlying data distribution itself, so that sufficient data can be generated to alleviate overfitting. Specifically, we use vector quantization to bypass the strong Gaussianity assumption that is common in the unsupervised subtyping literature but fails on small samples. Our proposed method better captures the latent space features…
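The abstract names vector quantization without describing how it works. As a rough illustration only, the sketch below shows the generic nearest-codebook lookup at the core of vector quantization: each latent vector is replaced by the closest entry of a discrete codebook, so no Gaussian prior over the latent space is needed. It is not the authors' implementation. The function names, the toy codebook, and the toy latent vectors are all hypothetical, and the encoder and training procedure of the paper are not covered.

```python
# Minimal sketch of a vector quantization lookup (pure Python, no dependencies).
# Assumption: latent vectors and codebook entries are plain lists of floats.
# All names and values here are illustrative, not taken from the paper.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(latents, codebook):
    """Map each latent vector to the index and value of its nearest codebook entry."""
    indices = [
        min(range(len(codebook)), key=lambda k: squared_distance(z, codebook[k]))
        for z in latents
    ]
    quantized = [codebook[k] for k in indices]
    return indices, quantized


# Toy example: three 2-dimensional codes and three latent vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
latents = [[0.9, 1.2], [3.5, -0.2], [0.1, -0.1]]

indices, quantized = quantize(latents, codebook)
print(indices)    # [1, 2, 0]
print(quantized)  # [[1.0, 1.0], [4.0, 0.0], [0.0, 0.0]]
```

In a subtyping setting, the code indices could serve as discrete cluster assignments for samples, though how the paper uses them is not stated in this summary.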

Taxonomy

Topics: Gene expression and cancer classification · Machine Learning and Data Classification · AI in cancer detection