Asymptotic Theory of Iterated Empirical Risk Minimization, with Applications to Active Learning
Hugo Cui, Yue M. Lu

TL;DR
This paper develops an asymptotic theory for iterated empirical risk minimization (ERM), in which two successive ERMs reuse the same dataset and the second loss depends on the first estimator's predictions, and applies it to expose tradeoffs in active learning.
Contribution
The paper provides the first sharp asymptotic analysis of iterated ERM with data reuse, especially in active learning, and uncovers a labeling-budget tradeoff along with phenomena such as double descent.
Findings
Explicit asymptotic predictions for second-stage test error.
Removal of oracle and sample-splitting assumptions in active learning.
Identification of a fundamental tradeoff in labeling budget allocation.
Abstract
We study a class of iterated empirical risk minimization (ERM) procedures in which two successive ERMs are performed on the same dataset, and the predictions of the first estimator enter as an argument in the loss function of the second. This setting, which arises naturally in active learning and reweighting schemes, introduces intricate statistical dependencies across samples and fundamentally distinguishes the problem from classical single-stage ERM analyses. For linear models trained with a broad class of convex losses on Gaussian mixture data, we derive a sharp asymptotic characterization of the test error in the high-dimensional regime where the sample size and ambient dimension scale proportionally. Our results provide explicit, fully asymptotic predictions for the performance of the second-stage estimator despite the reuse of data and the presence of prediction-dependent losses.…
Taxonomy
Topics: Machine Learning and Algorithms · Statistical Methods and Inference · Markov Chains and Monte Carlo Methods
