Network-based Distance Metric with Application to Discover Disease Subtypes in Cancer

Jipeng Qiang; Wei Ding; John Quackenbush; Ping Chen

arXiv:1703.01900 · q-bio.QM · March 7, 2017 · 1 citation

TL;DR

This paper introduces a novel network-based distance metric for clustering sparse, high-dimensional gene mutational data to improve cancer subtype discovery, outperforming existing methods and identifying previously undetectable subtypes.

Contribution

A new network-based distance metric tailored for sparse mutational data enhances cancer subtype detection beyond current clustering algorithms.

Findings

1. Outperforms top competitors in synthetic data tests.
2. Detects novel cancer subtypes in real data.
3. Effective with extremely sparse mutational profiles.

Abstract

While we once thought of cancer as a single monolithic disease affecting a specific organ site, we now understand that there are many subtypes of cancer defined by unique patterns of gene mutations. These gene mutational data, which can be obtained more reliably than gene expression data, help to determine how the subtypes develop, evolve, and respond to therapies. Unlike the dense, continuous-valued gene expression data that most existing cancer subtype discovery algorithms use, somatic mutational data are extremely sparse and heterogeneous: fewer than 0.5% of the 20,000 human protein-coding genes are mutated (encoded as discrete 1/0 values), and cancer patients rarely share identical mutated genes. Our focus is to search for cancer subtypes in extremely sparse, high-dimensional gene mutational data with discrete 1 and 0 values using unsupervised learning. We…
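The sparsity described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's metric: it assumes mutations fall uniformly at random across genes, uses a hypothetical cohort of 50 patients with exactly 100 mutated genes each (the 0.5% upper bound), and uses Jaccard distance only as a stand-in for a generic overlap-based distance. Under those assumptions, almost every pair of patients shares few or no mutated genes, so pairwise distances crowd near 1 and carry little clustering signal.

```python
import random
from itertools import combinations

N_GENES = 20_000    # human protein-coding genes, as stated in the abstract
N_MUTATED = 100     # 0.5% of 20,000: the abstract's upper bound (assumed fixed here)
N_PATIENTS = 50     # hypothetical cohort size, chosen for illustration


def simulate_patient(rng):
    """Return the set of mutated gene indices for one synthetic patient.

    Assumption: mutations are uniform over genes, which real data are not.
    """
    return frozenset(rng.sample(range(N_GENES), N_MUTATED))


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|, a generic set-overlap distance (not the paper's)."""
    union = len(a | b)
    return 1.0 - len(a & b) / union if union else 0.0


rng = random.Random(0)
patients = [simulate_patient(rng) for _ in range(N_PATIENTS)]
distances = [jaccard_distance(a, b) for a, b in combinations(patients, 2)]

density = N_MUTATED / N_GENES
mean_distance = sum(distances) / len(distances)
disjoint_pairs = sum(1 for d in distances if d == 1.0)

print(f"fraction of genes mutated per patient: {density:.4f}")
print(f"mean pairwise Jaccard distance: {mean_distance:.4f}")
print(f"pairs sharing no mutated gene: {disjoint_pairs} of {len(distances)}")
```

With two random sets of 100 genes drawn from 20,000, the expected overlap is 100 × 100 / 20,000 = 0.5 genes, which is why the distances sit so close to 1 regardless of the random seed.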

Taxonomy

Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Genomics and Phylogenetic Studies