NeurT-FDR: Controlling FDR by Incorporating Feature Hierarchy

Lin Qiu; Nils Murrugarra-Llerena; Vítor Silva; Lin Lin; Vernon M. Chinchilli

arXiv:2101.09809·stat.ML·January 26, 2021

TL;DR

NeurT-FDR is a neural network-based method that effectively controls the false discovery rate in multiple hypothesis testing by incorporating hierarchical feature information, leading to increased statistical power.

Contribution

It introduces a novel neural network framework that models hierarchical covariates for improved FDR control and discovery in large-scale testing problems.

Findings

01. NeurT-FDR achieves strong FDR control in synthetic and real datasets.
02. The method makes significantly more discoveries than existing baselines.
03. It efficiently handles high-dimensional hierarchical features through end-to-end training.

Abstract

Controlling the false discovery rate (FDR) while leveraging the side information of multiple hypothesis testing is an emerging research topic in modern data science. Existing methods rely on test-level covariates while ignoring any hierarchy among those covariates. This strategy may not be optimal for complex large-scale problems, where hierarchical information often exists among the test-level covariates. We propose NeurT-FDR, which boosts statistical power and controls FDR for multiple hypothesis testing while leveraging the hierarchy among test-level covariates. Our method parametrizes the test-level covariates as a neural network and adjusts the feature hierarchy through a regression framework, which enables flexible handling of high-dimensional features as well as efficient end-to-end optimization. We show that NeurT-FDR has strong FDR guarantees and makes substantially more…
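
For readers new to FDR control, the sketch below shows the classic Benjamini-Hochberg step-up procedure, the standard covariate-free approach to multiple hypothesis testing. It is not NeurT-FDR and uses no covariates, hierarchy, or neural network; the function name `benjamini_hochberg` and the example p-values are assumptions made purely for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Return the sorted indices of hypotheses rejected at FDR level alpha.

    Illustrative sketch of the Benjamini-Hochberg step-up procedure;
    it does not use any side information about the tests.
    """
    m = len(p_values)
    # Rank hypotheses from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= alpha * k / m.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff = rank
    # Reject every hypothesis ranked at or below that cutoff.
    return sorted(order[:cutoff])


# Hypothetical p-values, chosen only to exercise the function.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p_values, alpha=0.05)
print(rejected)
```

Covariate-aware methods such as the one described in the abstract aim to make more discoveries than a single global threshold like this one while keeping the FDR at the same level.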

Taxonomy

Topics: Machine Learning and Data Classification · Statistical Methods in Clinical Trials · Adversarial Robustness in Machine Learning