A Graph Theoretic Approach to Utilizing Protein Structure to Identify Non-Random Somatic Mutations
Gregory Ryslik, Yuwei Cheng, Kei-Hoi Cheung, Yorgo Modis, and Hongyu Zhao

TL;DR
This paper introduces GraphPAC, a graph-theoretic method that uses protein tertiary structure to improve the detection of mutational clusters and thereby help identify cancer driver mutations.
Contribution
The paper presents GraphPAC, an approach that incorporates 3D protein structure into mutational clustering and is reported to outperform existing methods at identifying novel cancer driver mutation clusters.
Findings
GraphPAC detects known mutational clusters in oncogenes like EGFR and KRAS.
It identifies new clusters in proteins such as DPP4 and NRP1.
GraphPAC outperforms current clustering methods by utilizing structural information.
Abstract
Background: It is well known that the development of cancer is caused by the accumulation of somatic mutations within the genome. For oncogenes specifically, current research suggests that there is a small set of "driver" mutations that are primarily responsible for tumorigenesis. Further, due to some recent pharmacological successes in treating these driver mutations and their resulting tumors, a variety of approaches, such as machine learning and mutational clustering, have been developed to identify potential driver mutations. We propose a novel methodology that increases our power to identify mutational clusters by taking into account protein tertiary structure via a graph theoretical approach.
Results: We have designed and implemented GraphPAC (Graph Protein Amino Acid Clustering) to identify mutational clustering while considering protein spatial structure. Using…
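To make the idea of "taking into account protein tertiary structure via a graph theoretical approach" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the GraphPAC implementation: the text above does not say how the graph is built or traversed, so the sketch uses a greedy nearest-neighbor path as a stand-in heuristic. The coordinates, the function names `nearest_neighbor_order` and `remap_positions`, and the mutated residues are all hypothetical. The point it shows is that residues far apart in sequence can become neighbors once the protein is reordered by spatial proximity.

```python
import math


def nearest_neighbor_order(coords, start=0):
    """Greedy path over all residues: always step to the closest unvisited one.

    Stand-in heuristic chosen for illustration; ties are broken by residue index.
    """
    unvisited = set(range(len(coords)))
    unvisited.remove(start)
    path = [start]
    while unvisited:
        last = coords[path[-1]]
        nxt = min(unvisited, key=lambda i: (math.dist(last, coords[i]), i))
        unvisited.remove(nxt)
        path.append(nxt)
    return path


def remap_positions(path, mutated):
    """Translate residue indices into their rank along the spatial path."""
    rank = {residue: k for k, residue in enumerate(path)}
    return sorted(rank[r] for r in mutated)


# Made-up coordinates for a six-residue hairpin: residues 0 and 5 are
# far apart in sequence but close together in space.
coords = [
    (0.0, 0.0, 0.0),
    (4.0, 0.0, 0.0),
    (8.0, 0.0, 0.0),
    (8.0, 5.0, 0.0),
    (4.0, 5.0, 0.0),
    (0.0, 1.0, 0.0),
]
mutated = [0, 5]  # hypothetical mutated residues

path = nearest_neighbor_order(coords)
remapped = remap_positions(path, mutated)

print("spatial path:       ", path)
print("sequence positions: ", sorted(mutated))
print("remapped positions: ", remapped)
```

In this toy case the two mutated residues sit at opposite ends of the sequence but end up adjacent after reordering, so a one-dimensional clustering test applied to the remapped positions would treat them as a tight cluster.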
Taxonomy
Topics: Bioinformatics and Genomic Networks · Genomics and Rare Diseases · Gene expression and cancer classification
