False Discovery Rate Control for Fast Screening of Large-Scale Genomics Biobanks
Jasin Machkour, Michael Muma, Daniel P. Palomar

TL;DR
This paper introduces the Screen-T-Rex selector, a fast and scalable method for controlling the false discovery rate (FDR) when screening large-scale genomics biobanks. It improves reproducibility and needs significantly less computation time than benchmark knockoff methods.
Contribution
The paper presents a novel FDR-controlling method tailored for large-scale biobank screening that requires no additional parameter tuning and outperforms existing benchmark methods.
Findings
Superior performance in simulations and real-world HIV-1 drug resistance data
Significantly lower computation time than benchmark knockoff methods
Effective in high-dimensional, multivariate genomic screening
Abstract
Genomics biobanks are information treasure troves with thousands of phenotypes (e.g., diseases, traits) and millions of single nucleotide polymorphisms (SNPs). The development of methodologies that provide reproducible discoveries is essential for the understanding of complex diseases and precision drug development. Without statistical reproducibility guarantees, valuable efforts are spent on researching false positives. Therefore, scalable multivariate and high-dimensional false discovery rate (FDR)-controlling variable selection methods are urgently needed, especially for complex polygenic diseases and traits. In this work, we propose the Screen-T-Rex selector, a fast FDR-controlling method based on the recently developed T-Rex selector. The method is tailored to screening large-scale biobanks, and it does not require choosing additional parameters (sparsity parameter, target FDR…
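For readers new to the terminology, the quantity being controlled can be illustrated with a minimal sketch. It uses the standard definitions: the false discovery proportion (FDP) is the share of selected variables that are not truly active, and the FDR is the expected FDP over repeated experiments. This is not the Screen-T-Rex procedure itself. The function names and the toy SNP indices are hypothetical and chosen only for illustration.

```python
def fdp(selected, true_active):
    """False discovery proportion: fraction of selected variables that are not truly active."""
    selected, true_active = set(selected), set(true_active)
    false_discoveries = len(selected - true_active)
    # max(..., 1) avoids division by zero when nothing is selected (FDP is then 0).
    return false_discoveries / max(len(selected), 1)


def tpp(selected, true_active):
    """True positive proportion: fraction of truly active variables that were selected."""
    selected, true_active = set(selected), set(true_active)
    return len(selected & true_active) / max(len(true_active), 1)


def empirical_fdr(runs):
    """Average FDP over (selected, true_active) pairs, a Monte Carlo estimate of the FDR."""
    values = [fdp(sel, act) for sel, act in runs]
    return sum(values) / max(len(values), 1)


# Hypothetical toy example: the SNP indices below are made up for illustration.
true_active = {2, 5, 7, 11}
selected = {2, 5, 7, 13, 17}

print(fdp(selected, true_active))  # 2 false discoveries among 5 selections
print(tpp(selected, true_active))  # 3 of the 4 active SNPs recovered
```

A selector controls the FDR at a target level if the expectation estimated by `empirical_fdr` stays at or below that level. The TPP measures how much detection power remains under that constraint.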
Taxonomy
Topics: Statistical Methods in Clinical Trials · Ethics in Clinical Research · Genomics and Rare Diseases
