Exploratory data analysis for large-scale multiple testing problems and its application in gene expression studies
Paramita Chakraborty, Chong Ma, John Grego, James Lynch

TL;DR
This paper introduces a novel empirical Bayes-based method with sample splitting for large-scale multiple testing, improving false discovery rate control and discovery reliability in gene expression studies.
Contribution
It proposes a cross-validation approach with detection frequency thresholds to enhance Fdr control and explore relationships among significant findings.
Findings
Effective Fdr control demonstrated on microarray and RNA-seq data
Method reduces overfitting through resampling and cross-validation
Power analysis shows improved detection efficiency
Abstract
In large-scale multiple testing problems, a two-class empirical Bayes approach can be used to control the false discovery rate (Fdr) for the entire array of hypotheses under study. That approach is modified by incorporating a sample-splitting step: one part of the data is used for model fitting and the other part for detecting significant cases with a screening technique featuring the empirical Bayes mode of Fdr control. Cases with high detection frequency across repeated random sample splits are considered true discoveries. A critical detection frequency is set to control the overall false discovery rate. The proposed method helps to balance out unwanted sources of variation and addresses potential statistical overfitting of the core empirical model by cross-validation through resampling. Further, concurrent detection frequencies are used to provide visual tools to explore the…
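The split-fit-screen-count loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`tail_fdr`, `detection_frequencies`), the simulated data, the pooled-standard-deviation "model fit", the tail-area Fdr estimate with the null proportion fixed at 1, and the critical detection frequency of 0.5 are all assumptions chosen for illustration. The paper sets its critical frequency to control the overall Fdr, which this sketch does not attempt.

```python
import math
import numpy as np

def tail_fdr(z, pi0=1.0):
    """Two-sided tail-area Fdr estimate: pi0 * P0(|Z| >= |z|) / empirical P(|Z| >= |z|).

    Assumes a standard normal null; pi0 = 1 is a conservative stand-in.
    """
    a = np.abs(z)
    # P(|N(0,1)| >= v) = erfc(v / sqrt(2))
    null_tail = np.array([math.erfc(v / math.sqrt(2.0)) for v in a])
    # Empirical tail: rank of each case when sorted by decreasing |z|.
    order = np.argsort(-a)
    ranks = np.empty(len(a))
    ranks[order] = np.arange(1, len(a) + 1)
    return np.minimum(1.0, pi0 * null_tail / (ranks / len(a)))

def detection_frequencies(x, y, n_splits=50, q=0.1, seed=0):
    """x, y: (cases, samples) arrays for two conditions (hypothetical layout)."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(x.shape[0])
    hx, hy = x.shape[1] // 2, y.shape[1] // 2
    for _ in range(n_splits):
        px, py = rng.permutation(x.shape[1]), rng.permutation(y.shape[1])
        fit_x, test_x = x[:, px[:hx]], x[:, px[hx:]]
        fit_y, test_y = y[:, py[:hy]], y[:, py[hy:]]
        # Fitting half: per-case pooled standard deviation (illustrative "model").
        sd = np.sqrt((fit_x.var(axis=1, ddof=1) + fit_y.var(axis=1, ddof=1)) / 2.0)
        # Screening half: z-scores for the difference in means, using the fitted sd.
        se = sd * np.sqrt(1.0 / test_x.shape[1] + 1.0 / test_y.shape[1])
        z = (test_x.mean(axis=1) - test_y.mean(axis=1)) / se
        counts += tail_fdr(z) <= q  # flag cases screened in on this split
    return counts / n_splits

# Simulated example: 500 cases, 40 samples per condition, first 25 cases shifted.
sim = np.random.default_rng(1)
x = sim.normal(size=(500, 40))
y = sim.normal(size=(500, 40))
x[:25] += 2.0

freq = detection_frequencies(x, y)
declared = np.flatnonzero(freq >= 0.5)  # 0.5 is an arbitrary illustrative cutoff
```

Because the splits reuse the same samples, the per-split detections are dependent; the detection frequency summarizes how stable a finding is across splits rather than acting as an independent replication count.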
Taxonomy
Topics: Gene expression and cancer classification · Statistical Methods in Clinical Trials · Molecular Biology Techniques and Applications
