A New Approach for Large Scale Multiple Testing with Application to FDR Control for Graphically Structured Hypotheses
Wenge Guo, Gavin Lynch, Joseph P. Romano

TL;DR
This paper introduces a general approach for large-scale multiple testing that exploits known graphical structure among hypotheses, such as a directed acyclic graph (DAG), to improve power while controlling error rates such as the false discovery rate (FDR). The resulting procedures are evaluated through simulations and real data.
Contribution
It develops a framework for multiple testing of graphically structured hypotheses that supports proven error rate control and preserves the graph structure among rejections, with a focus on DAG-structured hypotheses.
Findings
Proposed a PFER controlling procedure for DAG-structured hypotheses
Developed an FDR controlling procedure that maintains DAG structure among rejected hypotheses
Demonstrated good performance through simulations and real data analysis
Abstract
In many large scale multiple testing applications, the hypotheses often have a known graphical structure, such as gene ontology in gene expression data. Exploiting this graphical structure in multiple testing procedures can improve power as well as aid in interpretation. However, incorporating the structure into large scale testing procedures and proving that an error rate, such as the false discovery rate (FDR), is controlled can be challenging. In this paper, we introduce a new general approach for large scale multiple testing, which can aid in developing new procedures under various settings with proven control of desired error rates. This approach is particularly useful for developing FDR controlling procedures, which is simplified as the problem of developing per-family error rate (PFER) controlling procedures. Specifically, for testing hypotheses with a directed acyclic graph…
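The two error rates named in the abstract can be made concrete with a small sketch. Under the standard definitions, PFER is the expected number of false rejections, E[V], and FDR is the expected false discovery proportion, E[V / max(R, 1)], where R is the total number of rejections. The code below computes V and the false discovery proportion for a single rejection set, and checks one plausible reading of "maintaining the DAG structure": a hypothesis is rejected only if all of its parents are rejected. That reading, the helper names `respects_dag` and `false_discovery_counts`, and the toy graph are assumptions made for illustration; this is not the authors' procedure.

```python
from typing import Dict, List, Set, Tuple


def respects_dag(parents: Dict[str, List[str]], rejected: Set[str]) -> bool:
    """Return True if every rejected hypothesis has all of its parents rejected.

    Assumption: this is one common way to formalize "structure preservation"
    on a DAG; the paper may define it differently.
    """
    return all(
        all(p in rejected for p in parents.get(h, []))
        for h in rejected
    )


def false_discovery_counts(rejected: Set[str], true_nulls: Set[str]) -> Tuple[int, float]:
    """Return (V, FDP) for one rejection set.

    V   = number of rejected hypotheses that are true nulls.
    FDP = V / max(R, 1), with R the number of rejections.
    Averaging V over repetitions estimates PFER; averaging FDP estimates FDR.
    """
    v = len(rejected & true_nulls)
    fdp = v / max(len(rejected), 1)
    return v, fdp


# Hypothetical toy DAG: H1 -> H2, H1 -> H3, and both H2 and H3 -> H4.
parents = {"H1": [], "H2": ["H1"], "H3": ["H1"], "H4": ["H2", "H3"]}
true_nulls = {"H2", "H3", "H4"}  # hypothetical ground truth, known only in simulation

rejected = {"H1", "H2"}
v, fdp = false_discovery_counts(rejected, true_nulls)
structure_ok = respects_dag(parents, rejected)
```

In a simulation study, these quantities would be averaged over many replications, since a single rejection set gives only one realization of V and the false discovery proportion.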
Taxonomy
Topics: Statistical Methods in Clinical Trials · Gene expression and cancer classification · Bioinformatics and Genomic Networks
