Mutation Clusters from Cancer Exome
Zura Kakushadze, Willie Yu

TL;DR
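Applying the statistically deterministic *K-means clustering algorithm to 10,656 published exome samples across 32 cancer types yields mutation clusters that are stable both in-sample and out-of-sample. In contrast, cancer signatures extracted via NMF, a computationally costly and non-deterministic method, show instabilities. Stable structures extracted from exome data could matter for the speed and cost of early-stage cancer diagnostics.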
Contribution
The study applies a recently developed, statistically deterministic clustering algorithm (*K-means) to cancer exome mutation data and reports advantages over NMF in stability and computational cost.
Findings
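A majority of the 32 cancer types exhibit mutation clustering structure; the results are stable in-sample and out-of-sample (tested on 1,389 genome samples across 14 cancer types).
Cancer signatures extracted from exome samples via NMF show in- and out-of-sample instabilities, whereas the deterministic clustering results do not.
Stable mutation structures extracted from exome data could improve the speed and cost of early-stage cancer diagnostics, such as blood-test methods currently in development.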
Abstract
We apply our statistically deterministic machine learning/clustering algorithm *K-means (recently developed in https://ssrn.com/abstract=2908286) to 10,656 published exome samples for 32 cancer types. A majority of cancer types exhibit mutation clustering structure. Our results are in-sample stable. They are also out-of-sample stable when applied to 1,389 published genome samples across 14 cancer types. In contrast, we find in- and out-of-sample instabilities in cancer signatures extracted from exome samples via nonnegative matrix factorization (NMF), a computationally costly and non-deterministic method. Extracting stable mutation structures from exome data could have important implications for speed and cost, which are critical for early-stage cancer diagnostics such as novel blood-test methods currently in development.
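One way to make the idea of in-sample stability concrete is to rerun a randomly initialized clustering several times and measure how often the runs agree. The sketch below does this with plain k-means on synthetic 2-D points. It is not the authors' *K-means algorithm. The helper names (`kmeans`, `rand_index`), the toy data, and the choice of the Rand index as an agreement measure are all assumptions made for illustration.

```python
import random
from itertools import combinations


def kmeans(points, k, seed, iters=50):
    """Plain Lloyd's k-means with random initial centers (NOT *K-means)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random start: the source of run-to-run variation
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Recompute each center as the mean of its members; keep the old center if empty.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same vs. different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


# Hypothetical toy data: three Gaussian blobs in 2-D, 20 points each.
data_rng = random.Random(0)
true_centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
points = [
    (cx + data_rng.gauss(0, 1), cy + data_rng.gauss(0, 1))
    for cx, cy in true_centers
    for _ in range(20)
]

# Rerun with different seeds and compare every pair of runs.
runs = [kmeans(points, 3, seed=s) for s in range(5)]
scores = [rand_index(x, y) for x, y in combinations(runs, 2)]
print("pairwise agreement: min=%.3f max=%.3f" % (min(scores), max(scores)))
```

A pairwise agreement of 1.0 across all seeds would indicate that the random start does not change the clustering on this data. Values below 1.0 indicate the kind of seed dependence that a deterministic procedure is meant to avoid. The Rand index is used here because it is invariant to how cluster labels are numbered.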
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Cancer Genomics and Diagnostics
