FDR Controlled Multiple Testing for Union Null Hypotheses: A Knockoff-based Approach

Ran Dai; Cheng Zheng

arXiv:2106.12719·stat.ME·October 4, 2022·1 cites

Open Access

TL;DR

This paper introduces a knockoff-based method for controlling the false discovery rate when testing union null hypotheses across multiple independent datasets, ensuring reliable feature selection in high-dimensional studies.

Contribution

It proposes a novel simultaneous knockoff approach that guarantees exact FDR control for union null hypotheses in diverse model settings with multiple data sources.
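As a rough illustration of the selection step that knockoff-based procedures typically share, the sketch below implements the standard knockoff+ threshold from the original knockoff filter. It is not the paper's simultaneous procedure: this summary does not describe how statistics from multiple datasets are combined, so the input `W` is assumed to be a single vector of already-computed feature statistics (large positive values suggest a signal, and null features are equally likely to be positive or negative). The names `knockoff_plus_select`, `W`, and `q` are hypothetical.

```python
import numpy as np


def knockoff_plus_select(W, q=0.1):
    """Return indices selected by the knockoff+ threshold at target FDR level q.

    W is assumed to be a vector of knockoff feature statistics; how it is
    computed (or combined across datasets) is outside this sketch.
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds: the distinct nonzero magnitudes of W, ascending.
    candidates = np.unique(np.abs(W[W != 0]))
    for t in candidates:
        # Estimated false discovery proportion at threshold t; the "+1"
        # in the numerator is what distinguishes knockoff+ from knockoff.
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            # Smallest threshold meeting the target: select features above it.
            return np.flatnonzero(W >= t)
    # No threshold meets the target, so nothing is selected.
    return np.array([], dtype=int)


# Usage with made-up statistics: one negative value, nine positive values.
selected = knockoff_plus_select([5, 4, 3, 2.5, -1, 2, 1.5, 6, 7, 8], q=0.2)
```

The loop stops at the smallest threshold whose estimated false discovery proportion falls below `q`, which maximizes the number of selections subject to that estimate.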

Findings

01. The method achieves exact FDR control in finite samples.

02. It performs well in simulations across various models.

03. It successfully identifies genetic biomarkers in real data examples.

Abstract

False discovery rate (FDR) controlling procedures provide important statistical guarantees for the replicability in signal identification based on multiple hypotheses testing. In many fields of study, FDR controlling procedures are used in high-dimensional (HD) analyses to discover features that are truly associated with the outcome. In some recent applications, data on the same set of candidate features are independently collected in multiple different studies. For example, gene expression data are collected at different facilities and with different cohorts, to identify the genetic biomarkers of multiple types of cancers. These studies provide us opportunities to identify signals by considering information from different sources (with potential heterogeneity) jointly. This paper is about how to provide FDR control guarantees for the tests of union null hypotheses of conditional…

Taxonomy

Topics: Statistical Methods in Clinical Trials · Gene expression and cancer classification · Statistical Methods and Inference