Multiple hypothesis testing adjusted for latent variables, with an application to the AGEMAP gene expression data
Yunting Sun, Nancy R. Zhang, Art B. Owen

TL;DR
This paper introduces LEAPP, a two-stage method that improves the ranking of hypotheses in high throughput testing by accounting for latent variables, illustrated with an analysis of gene expression and aging in the AGEMAP data.
Contribution
LEAPP is a two-stage analysis that statistically isolates latent variables from the primary variable, improving the ranking of hypotheses in high throughput studies.
Findings
In simulations, LEAPP orders hypotheses better than SVA and EIGENSTRAT.
LEAPP produces gene rankings that are more consistent across tissues than those of the competing methods.
Application to AGEMAP data shows improved hypothesis ordering.
Abstract
In high throughput settings we inspect a great many candidate variables (e.g., genes) searching for associations with a primary variable (e.g., a phenotype). High throughput hypothesis testing can be made difficult by the presence of systemic effects and other latent variables. It is well known that those variables alter the level of tests and induce correlations between tests. They also change the relative ordering of significance levels among hypotheses. Poor rankings lead to wasteful and ineffective follow-up studies. The problem becomes acute for latent variables that are correlated with the primary variable. We propose a two-stage analysis to counter the effects of latent variables on the ranking of hypotheses. Our method, called LEAPP, statistically isolates the latent variables from the primary one. In simulations, it gives better ordering of hypotheses than competing methods…
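The ranking problem described in the abstract can be illustrated with a small simulation. The sketch below is not LEAPP. It only compares a naive per-gene regression on the primary variable with an "oracle" regression that is also given the latent variable, which would be unobserved in practice. The function name `top_k_hits`, the linear data model, and all parameter values (sample size, number of genes, effect sizes, the correlation `rho`) are assumptions chosen for illustration.

```python
import numpy as np

def top_k_hits(n=100, N=1000, n_signal=50, rho=0.7, k=50, seed=0):
    """Count truly associated genes among the top-k ranked by |t|.

    Hypothetical setup: the first `n_signal` genes depend on the primary
    variable g; every gene also loads on a latent variable z that is
    correlated with g (correlation about `rho`).
    """
    rng = np.random.default_rng(seed)
    g = rng.standard_normal(n)                       # primary variable
    z = rho * g + np.sqrt(1 - rho**2) * rng.standard_normal(n)  # latent
    gamma = np.zeros(N)
    gamma[:n_signal] = 1.0                           # true primary effects
    u = rng.standard_normal(N)                       # latent loadings
    Y = np.outer(gamma, g) + np.outer(u, z) + rng.standard_normal((N, n))

    def abs_t(X):
        # Ordinary least squares of every gene (row of Y) on the columns
        # of X; returns |t| for column 1, the primary variable.
        p = X.shape[1]
        xtx_inv = np.linalg.inv(X.T @ X)
        beta = Y @ X @ xtx_inv                       # shape (N, p)
        resid = Y - beta @ X.T
        sigma2 = (resid**2).sum(axis=1) / (n - p)
        return np.abs(beta[:, 1]) / np.sqrt(sigma2 * xtx_inv[1, 1])

    ones = np.ones(n)
    t_naive = abs_t(np.column_stack([ones, g]))      # ignores z
    t_oracle = abs_t(np.column_stack([ones, g, z]))  # adjusts for z

    def hits(t):
        # Genes with index < n_signal are the truly associated ones.
        return int(np.sum(np.argsort(-t)[:k] < n_signal))

    return hits(t_naive), hits(t_oracle)

naive_hits, oracle_hits = top_k_hits()
print(f"true signals in top 50: naive={naive_hits}, oracle={oracle_hits}")
```

In this setup the naive ranking tends to promote null genes with large latent loadings, because the latent variable leaks into the estimated primary effect. Adjusting for the latent variable restores most of the true signals to the top of the list.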
