The noise level in linear regression with dependent data
Ingvar Ziemann, Stephen Tu, George J. Pappas, Nikolai Matni

TL;DR
This paper derives upper bounds for random design linear regression with dependent data, without assuming realizability. Up to constant factors, the bounds recover the variance term predicted by the CLT, which is the noise level of the problem.
Contribution
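It provides non-asymptotic bounds for linear regression with dependent data without realizability assumptions, a setting in which no sharp instance-optimal non-asymptotics were previously available, and matches the CLT variance prediction up to constant factors.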
Findings
Bounds recover the CLT variance term up to constant factors
Past a burn-in, results are sharp in the moderate deviations regime
The leading-order term is not inflated by mixing-time factors
Abstract
We derive upper bounds for random design linear regression with dependent (β-mixing) data absent any realizability assumptions. In contrast to the strictly realizable martingale noise regime, no sharp instance-optimal non-asymptotics are available in the literature. Up to constant factors, our analysis correctly recovers the variance term predicted by the Central Limit Theorem -- the noise level of the problem -- and thus exhibits graceful degradation as we introduce misspecification. Past a burn-in, our result is sharp in the moderate deviations regime, and in particular does not inflate the leading order term by mixing time factors.
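The role of the CLT variance term can be illustrated with a small Monte Carlo sketch. This is not the paper's construction: the scalar Gaussian AR(1) covariates, the misspecified target sin(x), the parameter values, and the function name `simulate_ols_noise_level` are all assumptions chosen for illustration. The sketch compares n times the excess risk of least squares (measured against the best linear predictor) with the variance the CLT predicts, namely the variance of the normalized score n^{-1/2} Σ x_t w_t divided by E[x²], where w_t is the residual of the best linear predictor.

```python
import numpy as np

def simulate_ols_noise_level(n=2000, trials=300, a=0.8, noise_std=0.5, seed=0):
    """Compare n * excess risk of OLS with the CLT-predicted variance term.

    Illustrative assumptions (not from the paper): scalar covariates follow a
    stationary Gaussian AR(1) process x_t = a * x_{t-1} + e_t, and the target
    is y_t = sin(x_t) + noise, so no linear model is realizable.
    """
    rng = np.random.default_rng(seed)
    s2 = 1.0 / (1.0 - a**2)            # stationary variance E[x^2]
    # Best linear predictor: E[x sin x] / E[x^2] = exp(-s2 / 2) for Gaussian x.
    theta_star = np.exp(-s2 / 2.0)

    # One row per independent trajectory, started at stationarity.
    x = np.empty((trials, n))
    x[:, 0] = rng.normal(scale=np.sqrt(s2), size=trials)
    innovations = rng.normal(size=(trials, n))
    for t in range(1, n):
        x[:, t] = a * x[:, t - 1] + innovations[:, t]
    y = np.sin(x) + noise_std * rng.normal(size=(trials, n))

    # Least squares on each trajectory, then excess risk over the best
    # linear predictor: E[x^2] * (theta_hat - theta_star)^2.
    theta_hat = (x * y).sum(axis=1) / (x * x).sum(axis=1)
    scaled_excess_risk = n * s2 * (theta_hat - theta_star) ** 2

    # CLT prediction: variance of the normalized score, divided by E[x^2].
    # Estimating it across trajectories captures the temporal correlations.
    w = y - theta_star * x
    score = (x * w).sum(axis=1) / np.sqrt(n)
    clt_variance_term = np.mean(score**2) / s2

    return scaled_excess_risk.mean(), clt_variance_term

if __name__ == "__main__":
    risk, clt = simulate_ols_noise_level()
    print(f"n * excess risk: {risk:.3f}, CLT variance term: {clt:.3f}")
```

Under these assumptions the two quantities should be close for large n, which is the sense in which the CLT variance term sets the noise level even though the model is misspecified.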
Taxonomy
Topics: Statistical Methods and Inference · Markov Chains and Monte Carlo Methods · Gaussian Processes and Bayesian Inference
Methods: Linear Regression
