Identifying Multi-Hit Cancer Drivers Without Massive Parallelization: A CP, MIP, and Column Generation Framework
Rick S. H. Willemsen, Tenindra Abeywickrama, Ramu Anandakrishnan

TL;DR
This paper introduces a fast, efficient framework for identifying multi-hit gene mutation combinations driving cancer, outperforming previous methods that required massive parallelization, and provides the first provably optimal solutions for many benchmark cases.
Contribution
It formalizes the Multi-Hit Cancer Driver Set Cover Problem (MHCDSCP) and develops heuristics based on constraint programming and mixed integer programming that are computationally efficient and yield provably optimal solutions for many instances.
Findings
The framework matches state-of-the-art supercomputing methods on real-world cancer genomics data using a single commodity CPU in under a minute.
A proposed heuristic provides the first provably optimal solutions for over half of the benchmark instances.
The results indicate that the problem is less computationally demanding than previously believed.
Abstract
Cancer is often driven by specific combinations of an estimated two to nine gene mutations, known as multi-hit combinations. Identifying these multi-hit combinations of gene mutations that drive cancer is critical for understanding carcinogenesis and designing targeted therapies. We formalize this challenge as the Multi-Hit Cancer Driver Set Cover Problem (MHCDSCP), optimizing the selection of gene combinations to maximize tumor coverage while strictly minimizing normal sample misclassification. While existing approaches rely on exhaustive enumeration and massive parallelization, we introduce fast heuristics based on constraint programming and mixed integer programming formulations. Evaluated on real-world cancer genomics data, our framework matches state-of-the-art supercomputing methods using a single commodity CPU in under a minute. We also propose a price-and-branch heuristic which,…
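To make the set cover view of the problem concrete, here is a minimal sketch on toy data. It is not the paper's CP, MIP, or price-and-branch method: it is a plain greedy baseline, and the function names, the toy samples, and the scoring rule (tumors newly covered minus a penalty per misclassified normal sample) are all assumptions made for illustration. A combination is taken to "cover" a sample when every gene in the combination is mutated in that sample.

```python
from itertools import combinations

def covers(combo, sample):
    """True when every gene in the combination is mutated in the sample."""
    return all(gene in sample for gene in combo)

def greedy_multi_hit(tumors, normals, hits=2, fp_penalty=1.0):
    """Greedy baseline (illustrative only, not the paper's algorithm).

    tumors, normals: lists of sets of mutated gene names (hypothetical data).
    Repeatedly picks the `hits`-gene combination with the best score, where
    score = newly covered tumors - fp_penalty * covered normals (an assumed rule).
    """
    genes = sorted(set().union(*tumors, *normals))
    candidates = list(combinations(genes, hits))
    uncovered = list(range(len(tumors)))
    chosen = []
    while uncovered:
        best, best_score = None, 0.0
        for combo in candidates:
            tp = sum(covers(combo, tumors[i]) for i in uncovered)
            fp = sum(covers(combo, n) for n in normals)
            score = tp - fp_penalty * fp
            if score > best_score:  # only accept strictly useful combinations
                best, best_score = combo, score
        if best is None:
            break  # no remaining combination helps; leave the rest uncovered
        chosen.append(best)
        uncovered = [i for i in uncovered if not covers(best, tumors[i])]
    return chosen, uncovered

# Toy mutation data: each sample is the set of genes mutated in it.
tumors = [{"A", "B"}, {"A", "B", "C"}, {"C", "D"}, {"B", "D"}]
normals = [{"A"}, {"C", "D"}, {"B"}]

chosen, uncovered = greedy_multi_hit(tumors, normals)
print(chosen, uncovered)
```

On this toy input the greedy pass leaves the third tumor uncovered, because the only 2-hit combination that covers it also covers a normal sample. Exhaustive scoring of all candidates, as done here, grows combinatorially with the number of genes and hits, which is the cost the abstract attributes to enumeration-based approaches.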
