Mutation Clusters from Cancer Exome

Zura Kakushadze; Willie Yu

arXiv:1707.08504·q-bio.GN·August 16, 2017

TL;DR

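This paper applies the statistically deterministic clustering algorithm *K-means to 10,656 published exome samples across 32 cancer types and finds mutation clusters that are stable both in-sample and out-of-sample. Cancer signatures extracted from the same kind of data via NMF, a non-deterministic and computationally costly method, show instabilities. Stable structure from exome data could matter for the speed and cost of early-stage cancer diagnostics.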

Contribution

The study applies a previously developed, statistically deterministic clustering algorithm (*K-means) to cancer exome mutation data and reports that its results are more stable than signatures extracted via NMF, which is non-deterministic and computationally costly.

Findings

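01 A majority of the 32 cancer types exhibit mutation clustering structure, and the clusters are stable both in-sample and out-of-sample.

02 Signatures extracted from exome samples via NMF show in- and out-of-sample instabilities; the deterministic clustering does not.

03 Stable mutation structures from exome data could support faster, lower-cost early-stage cancer diagnostics.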

Abstract

We apply our statistically deterministic machine learning/clustering algorithm *K-means (recently developed in https://ssrn.com/abstract=2908286) to 10,656 published exome samples for 32 cancer types. A majority of cancer types exhibit mutation clustering structure. Our results are in-sample stable. They are also out-of-sample stable when applied to 1,389 published genome samples across 14 cancer types. In contrast, we find in- and out-of-sample instabilities in cancer signatures extracted from exome samples via nonnegative matrix factorization (NMF), a computationally costly and non-deterministic method. Extracting stable mutation structures from exome data could have important implications for speed and cost, which are critical for early-stage cancer diagnostics such as novel blood-test methods currently in development.
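To make the contrast between non-deterministic and deterministic clustering concrete, here is a minimal sketch in plain Python. It is not the authors' *K-means algorithm, whose details are given in the linked SSRN paper and not in this abstract. It only assumes ordinary k-means with random initialization, made-up toy points, and a hypothetical "most frequent partition over a fixed set of seeds" rule, to show why a single randomly initialized run may depend on the seed while an aggregate over a fixed set of runs is reproducible.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, seed, n_iter=50):
    """Ordinary k-means; the initial centers are k data points chosen at random."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(pt, centers[j]))
                  for pt in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def canonical(labels):
    """Relabel clusters by order of first appearance, so that two runs that
    differ only in cluster numbering produce the same tuple."""
    mapping = {}
    return tuple(mapping.setdefault(lab, len(mapping)) for lab in labels)

# Made-up toy data (an assumption for illustration, not mutation counts).
points = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.3), (5.1, 5.0),
    (9.0, 0.2), (9.2, 0.0), (8.9, 0.1), (9.1, 0.3),
]

# Run k-means under 100 fixed seeds and count how often each partition occurs.
partitions = Counter(canonical(kmeans(points, 3, seed)) for seed in range(100))
consensus, votes = partitions.most_common(1)[0]

print("distinct partitions found:", len(partitions))
print("votes for the most frequent partition:", votes)
```

If more than one distinct partition is reported, individual runs disagreed with each other, which is the kind of seed dependence the abstract attributes to non-deterministic methods. The consensus partition is reproducible only because the set of seeds is fixed; it still depends on that choice, so this is a simplification rather than a statistical guarantee.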

Taxonomy

Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Cancer Genomics and Diagnostics