FDR Controlled Multiple Testing for Union Null Hypotheses: A Knockoff-based Approach
Ran Dai, Cheng Zheng

TL;DR
This paper introduces a knockoff-based method for controlling the false discovery rate when testing union null hypotheses across multiple independent datasets, ensuring reliable feature selection in high-dimensional studies.
Contribution
It proposes a novel simultaneous knockoff approach that guarantees exact FDR control for union null hypotheses in diverse model settings with multiple data sources.
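As a rough illustration of how a knockoff filter turns feature statistics into a selected set, the sketch below applies the standard knockoff+ threshold to statistics combined across datasets. It is a minimal sketch under stated assumptions, not the authors' procedure: the per-dataset statistics `w_per_dataset` are taken as given, and the helper names (`combine_by_product`, `knockoff_plus_threshold`, `select`) and the product combination rule are hypothetical choices made for illustration. The product is used only because a feature that is null in at least one dataset has a symmetric sign there, which makes the sign of the product symmetric as well.

```python
import numpy as np


def combine_by_product(w_per_dataset):
    """Combine per-dataset knockoff statistics feature by feature.

    Assumption (illustrative only): the combined statistic is the product
    across datasets. Input shape is (n_datasets, n_features).
    """
    return np.prod(w_per_dataset, axis=0)


def knockoff_plus_threshold(w, q):
    """Return the knockoff+ threshold for statistics w at target level q.

    The threshold is the smallest t among the nonzero |w_j| such that
    (1 + #{j: w_j <= -t}) / max(1, #{j: w_j >= t}) <= q.
    Returns infinity when no candidate satisfies the bound.
    """
    candidates = np.sort(np.abs(w[w != 0]))
    for t in candidates:
        fdp_estimate = (1 + np.sum(w <= -t)) / max(1, np.sum(w >= t))
        if fdp_estimate <= q:
            return t
    return np.inf


def select(w_per_dataset, q):
    """Indices of features whose combined statistic reaches the threshold."""
    w = combine_by_product(np.asarray(w_per_dataset, dtype=float))
    t = knockoff_plus_threshold(w, q)
    return np.flatnonzero(w >= t)


# Toy statistics for 8 features in 2 datasets (made-up numbers).
w_toy = [
    [3.0, 2.0, 4.0, -1.0, 0.5, 2.0, 3.0, 1.0],
    [2.0, 3.0, 1.0, 2.0, -1.0, 1.0, 2.0, 2.0],
]
selected = select(w_toy, q=0.25)
```

A feature is selected only when its combined statistic is large and positive, so a feature with a weak or wrong-signed statistic in either dataset tends to be excluded, which matches the intent of testing a union null.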
Findings
The method achieves exact FDR control in finite samples.
It performs well in simulations across various models.
It successfully identifies genetic biomarkers in real data examples.
Abstract
False discovery rate (FDR) controlling procedures provide important statistical guarantees for replicability in signal identification based on multiple hypothesis testing. In many fields of study, FDR controlling procedures are used in high-dimensional (HD) analyses to discover features that are truly associated with the outcome. In some recent applications, data on the same set of candidate features are independently collected in multiple different studies. For example, gene expression data are collected at different facilities and with different cohorts to identify the genetic biomarkers of multiple types of cancers. These studies provide opportunities to identify signals by jointly considering information from different sources (with potential heterogeneity). This paper addresses how to provide FDR control guarantees for the tests of union null hypotheses of conditional…
Taxonomy
Topics: Statistical Methods in Clinical Trials · Gene expression and cancer classification · Statistical Methods and Inference
