Searching for Gene Sets with Mutually Exclusive Mutations
Paul Ginzberg, Federico Giorgi, Andrea Califano

TL;DR
This paper introduces MOCA (Mutation anti-co-OCcurrence Algorithm), a greedy search and testing algorithm for finding gene sets whose mutations are approximately mutually exclusive in cancer data, with guaranteed control of the familywise error rate or false discovery rate.
Contribution
The paper presents MOCA, a greedy search and testing algorithm that combines under-appreciated ideas from frequentist hypothesis testing, including a novel exact conditional test, to find approximately mutually exclusive gene sets in the TCGA Pancancer dataset.
Findings
MOCA effectively identifies mutually exclusive gene sets.
The method guarantees control of the familywise error rate or the false discovery rate.
It demonstrates high statistical power in large-scale testing.
Abstract
Cancer cells evolve through random somatic mutations. "Beneficial" mutations that disrupt key pathways (e.g. cell cycle regulation) are subject to natural selection. Multiple mutations may lead to the same "beneficial" effect, in which case there is no selective advantage to having more than one of these mutations. Hence we are interested in finding sets of genes whose mutations are approximately mutually exclusive (anti-co-occurring) within the TCGA Pancancer dataset. In principle, finding the best set is NP-hard. Nevertheless, we will show how a new Mutation anti-co-OCcurrence Algorithm (MOCA) provides an effective greedy search and testing algorithm with guaranteed control of the familywise error rate or false discovery rate, by combining some under-appreciated ideas from frequentist hypothesis testing. These ideas include: (a) A novel exact conditional test for the tendency of…
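To make the idea of testing for anti-co-occurrence concrete, here is a minimal sketch for a single pair of genes. It is not the paper's novel exact conditional test, whose description is cut off above. Instead it swaps in the standard one-sided Fisher (hypergeometric) test on the overlap of two binary mutation vectors, followed by a plain Bonferroni correction as one simple way to control the familywise error rate. The function names `exclusivity_pvalue` and `bonferroni` and the toy data are assumptions made for illustration.

```python
from math import comb


def exclusivity_pvalue(a, b):
    """Lower-tail hypergeometric p-value for the overlap of two 0/1 vectors.

    Illustrative stand-in (standard Fisher exact test), NOT the paper's test.
    Conditional on how often each gene is mutated, a small number of samples
    carrying both mutations is evidence of mutual exclusivity.
    """
    n = len(a)
    ka, kb = sum(a), sum(b)                      # mutation counts per gene
    overlap = sum(x & y for x, y in zip(a, b))   # samples with both mutated
    total = comb(n, kb)
    # P(X <= overlap) where X ~ Hypergeometric(n, ka, kb)
    tail = sum(comb(ka, k) * comb(n - ka, kb - k) for k in range(overlap + 1))
    return tail / total


def bonferroni(pvalues, alpha=0.05):
    """Reject p-values at level alpha / m, which controls the familywise error rate."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


# Toy data (hypothetical): 8 samples, two genes that never co-occur.
gene_a = [1, 1, 1, 1, 0, 0, 0, 0]
gene_b = [0, 0, 0, 0, 1, 1, 1, 1]
p_exclusive = exclusivity_pvalue(gene_a, gene_b)
p_identical = exclusivity_pvalue(gene_a, gene_a)
decisions = bonferroni([0.001, 0.03, 0.2])
```

On the toy data the two perfectly exclusive genes give a p-value of 1/70, while a gene compared with itself gives 1. A greedy search like the one the abstract describes would apply many such tests while growing a gene set, which is why the multiple-testing correction matters.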
Taxonomy
Topics: Cancer Genomics and Diagnostics · Gene expression and cancer classification · Molecular Biology Techniques and Applications
