One-at-a-time knockoffs: controlled false discovery rate with higher power

Charlie K. Guan; Zhimei Ren; Daniel W. Apley

arXiv:2502.18750·stat.ME·February 27, 2025

TL;DR

The paper introduces one-at-a-time knockoffs (OATK), a new method for variable selection in linear regression that controls the false discovery rate (FDR) with higher power and better computational efficiency than existing approaches.

Contribution

OATK simplifies and relaxes the knockoff filter by generating knockoffs one column at a time, enabling higher power, better computational efficiency, and additional enhancements for FDR control.

Findings

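01  OATK asymptotically controls the FDR at any desired level under a mild correlation assumption on the original design matrix.

02  OATK achieves higher power than existing methods such as the knockoff filter of Barber and Candès (BC).

03  OATK offers computational advantages and flexibility for further improvements.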

Abstract

We propose one-at-a-time knockoffs (OATK), a new methodology for detecting important explanatory variables in linear regression models while controlling the false discovery rate (FDR). For each explanatory variable, OATK generates a knockoff design matrix that preserves the Gram matrix by replacing one-at-a-time only the single corresponding column of the original design matrix. OATK is a substantial relaxation and simplification of the knockoff filter by Barber and Candès (BC), which simultaneously generates all columns of the knockoff design matrix to satisfy a much larger set of constraints. To test each variable's importance, statistics are then constructed by comparing the original vs. knockoff coefficients. Under a mild correlation assumption on the original design matrix, OATK asymptotically controls the FDR at any desired level. Moreover, OATK consistently achieves (often…
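The Gram-matrix-preserving replacement of a single column described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `one_column_knockoff`, the requirement that n > p, and the choice of a random direction orthogonal to all columns of the design matrix are assumptions made only for illustration. The sketch keeps the part of column j that lies in the span of the other columns and swaps its residual for a vector of the same length in a direction orthogonal to every original column.

```python
import numpy as np

def one_column_knockoff(X, j, rng):
    """Hypothetical helper: return a replacement for column j of X such that
    swapping it in leaves the Gram matrix X.T @ X unchanged (assumes n > p)."""
    n, p = X.shape
    if n <= p:
        raise ValueError("this sketch assumes more rows than columns (n > p)")
    X_rest = np.delete(X, j, axis=1)
    x_j = X[:, j]

    # Projection of x_j onto the span of the remaining columns.
    coef, *_ = np.linalg.lstsq(X_rest, x_j, rcond=None)
    fitted = X_rest @ coef
    resid_norm = np.linalg.norm(x_j - fitted)

    # Random unit vector orthogonal to every column of X (assumed choice).
    Q, _ = np.linalg.qr(X)  # reduced QR: columns of Q span col(X)
    z = rng.standard_normal(n)
    z -= Q @ (Q.T @ z)
    u = z / np.linalg.norm(z)

    # Same inner products with the other columns, same squared norm as x_j.
    return fitted + resid_norm * u

# Usage on synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
j = 2
X_ko = X.copy()
X_ko[:, j] = one_column_knockoff(X, j, rng)
print(np.allclose(X_ko.T @ X_ko, X.T @ X))
```

Because the new direction is orthogonal to the other columns, the cross terms with those columns are unchanged, and matching the residual length keeps the diagonal entry for column j the same. How the paper actually chooses this direction, and how it builds the test statistics from the original and knockoff coefficients, is not covered by this sketch.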

Code & Models

Repositories

charlie-guan/oatk (Official)

Taxonomy

Topics: Advanced Causal Inference Techniques · Statistical Methods and Inference · Generative Adversarial Networks and Image Synthesis