Automated Cancer Subtyping via Vector Quantization Mutual Information Maximization

Zheng Chen; Lingwei Zhu; Ziwei Yang; Takashi Matsubara

arXiv:2206.10801·cs.LG·November 15, 2022

TL;DR

This paper introduces an unsupervised clustering method that uses vector quantization and mutual information maximization to identify cancer subtypes from high-dimensional genetic expression data. The resulting subtypes refine existing labels and correlate with survival rates.

Contribution

It presents a novel, label-agnostic clustering approach that adaptively determines the number of cancer subtypes using mutual information maximization on genetic profiles.

Findings

01. Refines existing cancer subtype labels

02. High correlation with cancer survival rates

03. Automatically determines the number of subtypes

Abstract

Cancer subtyping is crucial for understanding the nature of tumors and providing suitable therapy. However, existing labelling methods are medically controversial, and have driven the process of subtyping away from teaching signals. Moreover, cancer genetic expression profiles are high-dimensional, scarce, and have complicated dependence, thereby posing a serious challenge to existing subtyping models for outputting sensible clustering. In this study, we propose a novel clustering method for exploiting genetic expression profiles and distinguishing subtypes in an unsupervised manner. The proposed method adaptively learns categorical correspondence from latent representations of expression profiles to the subtypes output by the model. By maximizing the problem-agnostic mutual information between input expression profiles and output subtypes, our method can automatically decide a…
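To make the mutual information objective mentioned in the abstract concrete, the following is a minimal sketch of the standard estimate I(X; Y) = H(Y) - H(Y | X) computed from soft cluster assignments. It is not taken from the authors' repository: the function names `entropy` and `mutual_information`, the toy assignment matrices, and the use of NumPy are all assumptions made for illustration, and the vector quantization component of the method is not shown.

```python
import numpy as np


def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy in nats along `axis`; eps guards against log(0)."""
    return -np.sum(p * np.log(p + eps), axis=axis)


def mutual_information(probs):
    """Estimate I(X; Y) = H(Y) - H(Y | X) from soft assignments.

    probs: array of shape (n_samples, n_clusters); row i is p(y | x_i).
    Hypothetical helper for illustration, not the paper's implementation.
    """
    marginal = probs.mean(axis=0)          # empirical p(y) over the batch
    h_marginal = entropy(marginal)         # H(Y): high when clusters are balanced
    h_conditional = entropy(probs).mean()  # H(Y | X): low when assignments are confident
    return h_marginal - h_conditional


# Toy assignments (illustrative data only, not results from the paper).
confident = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
uniform = np.full((4, 2), 0.5)

mi_confident = mutual_information(confident)
mi_uniform = mutual_information(uniform)
print(f"confident, balanced assignments: {mi_confident:.4f} nats")
print(f"uninformative assignments:       {mi_uniform:.4f} nats")
```

Under this estimate, confident and balanced assignments score close to the maximum of log(n_clusters), while uninformative assignments score near zero, which is why maximizing the quantity tends to favor well-separated clusters.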

Code & Models

Repositories

zhengchen3/ECML_VQRIM (PyTorch, official)

Taxonomy

Topics: Gene expression and cancer classification · Machine Learning in Bioinformatics · Machine Learning and Data Classification