Online Debiasing for Adaptively Collected High-dimensional Data with Applications to Time Series Analysis
Yash Deshpande, Adel Javanmard, Mohammad Mehrabi

TL;DR
This paper introduces an online debiasing method for high-dimensional linear regression with adaptively collected data, enabling valid inference such as p-values and confidence intervals despite bias from regularization and adaptivity.
Contribution
The paper proposes a novel online debiasing procedure that corrects biases in LASSO estimates under adaptive data collection, applicable to time series and batched data contexts.
Findings
Online debiasing effectively removes bias in high-dimensional LASSO estimates.
Debiased estimators enable accurate p-values and confidence intervals.
Method achieves optimal debiasing when sparsity is below a certain threshold.
Abstract
Adaptive collection of data is commonplace in applications throughout science and engineering. From the point of view of statistical inference, however, adaptive data collection induces memory and correlation in the samples, and poses a significant challenge. We consider high-dimensional linear regression, where the samples are collected adaptively, and the sample size n can be smaller than p, the number of covariates. In this setting, there are two distinct sources of bias: the first due to regularization imposed for consistent estimation, e.g. using the LASSO, and the second due to adaptivity in collecting the samples. We propose "online debiasing", a general procedure for estimators such as the LASSO, which addresses both sources of bias. In two concrete contexts, time series analysis and batched data collection, we demonstrate that online debiasing optimally…
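The first source of bias named in the abstract, regularization, can be seen in a toy setting. The sketch below is not the paper's procedure and does not model adaptivity. It assumes an orthonormal design and noiseless responses, in which case the LASSO solution (for the objective one-half squared error plus lam times the l1 norm) reduces to coordinate-wise soft-thresholding. The values `theta_true` and `lam` are hypothetical, chosen only for illustration.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator.

    Under the assumption of an orthonormal design, this is the
    coordinate-wise closed form of the LASSO solution.
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical true parameter (sparse) and regularization level.
theta_true = [3.0, -2.0, 0.0, 0.0]
lam = 0.5

# With an orthonormal design and no noise, the least-squares estimate
# equals theta_true, so the LASSO estimate is its soft-thresholded version.
lasso_est = [soft_threshold(t, lam) for t in theta_true]

# Each nonzero coordinate is shrunk toward zero by exactly lam.
bias = [e - t for e, t in zip(lasso_est, theta_true)]

print(lasso_est)  # [2.5, -1.5, 0.0, 0.0]
print(bias)       # [-0.5, 0.5, 0.0, 0.0]
```

Even in this idealized case the nonzero coefficients are systematically shrunk, which is why a debiasing step is needed before computing p-values or confidence intervals. The bias from adaptive collection is a separate effect that this sketch does not capture.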
