Two-sample testing of high-dimensional linear regression coefficients via complementary sketching
Fengnan Gao, Tengyao Wang

TL;DR
This paper introduces complementary sketching, a method for two-sample testing of high-dimensional linear regression coefficients that achieves essentially optimal asymptotic power under a Gaussian design without requiring the coefficients to be individually estimable.
Contribution
Introduces complementary sketching for two-sample testing in high-dimensional regression, yielding two test statistics, one suited to sparse and one to dense differences between the regression coefficients, neither of which requires estimating the individual coefficients.
Findings
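The two test statistics have essentially optimal asymptotic power under a Gaussian design when the coefficient difference is sparse and dense, respectively.
Simulations show the methods perform well in a broad class of settings.
An application to a large single-cell RNA sequencing dataset demonstrates practical utility.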
Abstract
We introduce a new method for two-sample testing of high-dimensional linear regression coefficients without assuming that those coefficients are individually estimable. The procedure works by first projecting the matrices of covariates and response vectors along directions that are complementary in sign in a subset of the coordinates, a process which we call 'complementary sketching'. The resulting projected covariates and responses are aggregated to form two test statistics, which are shown to have essentially optimal asymptotic power under a Gaussian design when the difference between the two regression coefficients is sparse and dense respectively. Simulations confirm that our methods perform well in a broad class of settings and an application to a large single-cell RNA sequencing dataset demonstrates their utility in the real world.
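The projection step described in the abstract can be illustrated with a minimal sketch. It shows one way to build a projection of this kind, under assumptions not stated on this page: the combined sample size n1 + n2 exceeds the dimension p, and the stacked design matrix has full column rank. The function name `complementary_sketch` and the symbols `A`, `W`, and `Z` are hypothetical labels chosen for illustration. This is not the authors' implementation, and the two test statistics themselves are not shown. The idea is to pick a matrix `A` with orthonormal columns orthogonal to the stacked covariates, so that the two projected covariate blocks are equal up to sign and the projected response depends only on the difference of the coefficient vectors.

```python
import numpy as np


def complementary_sketch(X1, Y1, X2, Y2):
    """Project two regression samples so that only beta1 - beta2 remains.

    Assumes n1 + n2 > p and that the stacked design has full column rank.
    Returns W of shape (n1 + n2 - p, p) and Z of shape (n1 + n2 - p,).
    """
    n1, p = X1.shape
    X = np.vstack([X1, X2])
    n = X.shape[0]
    if n <= p:
        raise ValueError("need n1 + n2 > p for a non-trivial complement")

    # Complete QR: the last n - p columns of Q form an orthonormal basis
    # of the orthogonal complement of the column space of X.
    Q, _ = np.linalg.qr(X, mode="complete")
    A = Q[:, p:]
    A1, A2 = A[:n1], A[n1:]

    # Since A.T @ X = 0, we have A1.T @ X1 = -(A2.T @ X2).
    W = A1.T @ X1
    Z = A1.T @ Y1 + A2.T @ Y2
    return W, Z


# Noise-free check with p larger than either sample size on its own.
rng = np.random.default_rng(0)
n1, n2, p = 30, 30, 40
X1 = rng.standard_normal((n1, p))
X2 = rng.standard_normal((n2, p))
beta1 = rng.standard_normal(p)
beta2 = beta1.copy()
beta2[:3] += 1.0  # sparse difference in the first three coordinates

W, Z = complementary_sketch(X1, X1 @ beta1, X2, X2 @ beta2)
print(np.allclose(Z, W @ (beta1 - beta2)))
```

In this toy setting neither sample can identify its own coefficient vector (p = 40 exceeds both n1 and n2), yet the sketched pair `(W, Z)` forms a regression problem in the difference `beta1 - beta2` alone. Because the columns of `A` are orthonormal, independent Gaussian noise with a common variance in the responses would remain independent with the same variance after projection.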
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Gene expression and cancer classification
