Mapping beyond diseases: Controlled variable selection for secondary phenotypes using tilted knockoffs
Qian Zhao, Susan Service, Carrie E. Bearden, Carlos Lopez-Jaramillo, Nelson Freimer, Chiara Sabatti

TL;DR
This paper introduces a method that uses tilted knockoffs to control the false discovery rate (FDR) when selecting important variables from biased samples, such as case-control studies, so that secondary phenotypes can be analyzed validly.
Contribution
It develops a novel tilted knockoff approach that accounts for biased sampling, enabling reliable variable selection with FDR control in complex biomedical studies.
Findings
Tilted knockoffs effectively control FDR in biased sampling scenarios.
The method demonstrates good power in simulated examples.
Application to genetic data identifies variables associated with secondary phenotypes.
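To make "controlling FDR" concrete, here is a minimal sketch of the selection step used by the standard knockoff+ filter (Barber and Candès). It is an assumption that the tilted method ends with this same thresholding rule; the tilted knockoff construction itself is not reproduced here. The function name `knockoff_plus_select` and the input `w` (one feature statistic per variable, large and positive when the real variable looks more important than its knockoff) are hypothetical names chosen for illustration.

```python
import numpy as np


def knockoff_plus_select(w, q=0.1):
    """Select variables with the standard knockoff+ threshold.

    w : feature statistics, one per variable (hypothetical input;
        how they are computed is outside the scope of this sketch).
    q : target FDR level.
    """
    w = np.asarray(w, dtype=float)
    # Candidate thresholds are the nonzero magnitudes of the statistics.
    candidates = np.sort(np.abs(w[w != 0]))
    for t in candidates:
        # Negative statistics estimate the number of false positives;
        # the "+1" is the knockoff+ correction.
        fdp_hat = (1 + np.sum(w <= -t)) / max(1, np.sum(w >= t))
        if fdp_hat <= q:
            return np.flatnonzero(w >= t)
    # No threshold achieves the target level: select nothing.
    return np.array([], dtype=int)
```

The smallest threshold whose estimated false discovery proportion falls below `q` is used, so the rule selects as many variables as the target level allows.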
Abstract
Researchers in biomedical studies often work with samples that are not selected uniformly at random from the population of interest, a major example being a case-control study. While these designs are motivated by specific scientific questions, it is often of interest to use the data collected to pursue secondary lines of investigation. In these cases, ignoring the fact that observations are not sampled uniformly at random can lead to spurious results. For example, in a case-control study, one might identify a spurious association between an exposure and a secondary phenotype when both affect the case-control status. This phenomenon is known as collider bias in the causal inference literature. While tests of independence under biased sampling are available, these methods typically do not apply when the number of variables is large. Here, we are interested in using the biased sample…
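The collider bias described in the abstract can be illustrated with a small simulation. This is a sketch under assumed settings, not the paper's experiment: the logistic disease model, its coefficients, the sample sizes, and the function name `simulate_collider_bias` are all hypothetical. An exposure and a secondary phenotype are generated independently in the population, both raise disease risk, and a balanced case-control sample is then drawn.

```python
import numpy as np


def simulate_collider_bias(n=500_000, n_cases=2_000, n_controls=2_000, seed=0):
    """Compare exposure-phenotype correlation in the population
    and in a case-control sample (all settings are illustrative)."""
    rng = np.random.default_rng(seed)

    # Exposure and secondary phenotype are independent in the population.
    exposure = rng.normal(size=n)
    phenotype = rng.normal(size=n)

    # Both variables raise the risk of the (rare) disease.
    logit = -5.0 + exposure + phenotype
    disease = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

    # Case-control design: sample cases and controls separately.
    cases = rng.choice(np.flatnonzero(disease), size=n_cases, replace=False)
    controls = rng.choice(np.flatnonzero(~disease), size=n_controls, replace=False)
    sample = np.concatenate([cases, controls])

    pop_corr = np.corrcoef(exposure, phenotype)[0, 1]
    sample_corr = np.corrcoef(exposure[sample], phenotype[sample])[0, 1]
    return pop_corr, sample_corr


if __name__ == "__main__":
    pop_corr, sample_corr = simulate_collider_bias()
    print(f"population correlation:   {pop_corr:.3f}")
    print(f"case-control correlation: {sample_corr:.3f}")
```

In the full population the correlation is close to zero, while the pooled case-control sample shows a clearly nonzero correlation even though the two variables were generated independently. A naive secondary analysis of the sample would report this as an association.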
