Performance of Empirical Risk Minimization for Linear Regression with Dependent Data
Christian Brownlees, Guðmundur Stefán Guðmundsson

TL;DR
This paper bounds the performance of empirical risk minimization (ERM) in large-dimensional linear regression with dependent, heavy-tailed data. It shows that ERM achieves the optimal performance up to a logarithmic factor, without specifying the relationship between the regressand and the regressors.
Contribution
The paper generalizes existing performance bounds for ERM in linear regression to dependent and heavy-tailed data, and its analysis is nonparametric.
Findings
ERM achieves the optimal performance, up to a logarithmic factor, in a dependent data setting
Results cover both identically and heterogeneously distributed observations
The analysis is nonparametric: the relationship between the regressand and the regressors is left unspecified
Abstract
This paper establishes bounds on the performance of empirical risk minimization for large-dimensional linear regression. We generalize existing results by allowing the data to be dependent and heavy-tailed. The analysis covers both the cases of identically and heterogeneously distributed observations. Our analysis is nonparametric in the sense that the relationship between the regressand and the regressors is not specified. The main results of this paper show that the empirical risk minimizer achieves the optimal performance (up to a logarithmic factor) in a dependent data setting.
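To make the setting concrete, here is a minimal Python sketch of ERM for linear regression on dependent, heavy-tailed data. It is an illustration only, not the paper's procedure or experiments. The squared loss, the AR(1) regressors, the Student-t innovations, the nonlinear regression function, and all function names are assumptions chosen for the example. The regression function is deliberately not linear, which mirrors the nonparametric framing: ERM looks for the best linear predictor, whatever the true relationship is.

```python
import numpy as np


def simulate_dependent_data(n, p, rho=0.5, df=5, seed=0):
    """Hypothetical data generator: AR(1) regressors with Student-t innovations.

    The AR(1) recursion makes observations dependent over time, and the
    Student-t draws make them heavy-tailed. Requires p >= 2.
    """
    rng = np.random.default_rng(seed)
    innovations = rng.standard_t(df, size=(n, p))
    X = np.zeros((n, p))
    X[0] = innovations[0]
    for t in range(1, n):
        X[t] = rho * X[t - 1] + innovations[t]
    noise = rng.standard_t(df, size=n)
    # Assumed regression function: nonlinear in the first regressor,
    # so no linear model is correctly specified.
    y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + noise
    return X, y


def empirical_risk(theta, X, y):
    """Average squared prediction error of the linear rule x -> x @ theta."""
    return float(np.mean((y - X @ theta) ** 2))


def erm_linear(X, y):
    """Empirical risk minimizer over linear predictors under squared loss.

    With squared loss this is ordinary least squares.
    """
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta


X, y = simulate_dependent_data(n=2000, p=10)
theta_hat = erm_linear(X, y)
print("in-sample empirical risk:", empirical_risk(theta_hat, X, y))
```

The quantity the paper's bounds concern is how close the risk of the empirical risk minimizer is to the risk of the best linear predictor. The sketch only computes the in-sample empirical risk; estimating that gap would require evaluating `theta_hat` on fresh data from the same process.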
