Minimax Excess Risk of First-Order Methods for Statistical Learning with Data-Dependent Oracles
Kevin Scaman, Mathieu Even, Batiste Le Bars, Laurent Massoulié

TL;DR
This paper analyzes the worst-case excess risk of first-order optimization methods in various statistical learning scenarios with data-dependent gradient oracles, providing sharp bounds linked to gradient estimation errors.
Contribution
It introduces a framework for bounding minimax excess risk with data-dependent oracles, unifying analysis across multiple learning paradigms.
Findings
Upper and lower bounds are proportional to the smallest mean square error achievable by gradient estimators.
The bounds are sharp for strongly convex and smooth statistical learning problems.
The framework applies to supervised, transfer, robust, and federated learning scenarios.
Abstract
In this paper, our aim is to analyse the generalization capabilities of first-order methods for statistical learning in multiple, different yet related, scenarios including supervised learning, transfer learning, robust learning and federated learning. To do so, we provide sharp upper and lower bounds for the minimax excess risk of strongly convex and smooth statistical learning when the gradient is accessed through partial observations given by a data-dependent oracle. This novel class of oracles can query the gradient with any given data distribution, and is thus well suited to scenarios in which the training data distribution does not match the target (or test) distribution. In particular, our upper and lower bounds are proportional to the smallest mean square error achievable by gradient estimators, thus allowing us to easily derive multiple sharp bounds in the aforementioned…
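As a rough illustration of the link between excess risk and gradient estimation error, the sketch below uses a toy one-dimensional mean-estimation problem, which is not taken from the paper. The objective is L(x) = 0.5 E[(x - Z)^2], so the true gradient is x - E[Z] and the excess risk of any point x is exactly 0.5 (x - E[Z])^2. The oracle here sees only a fixed sample of n points, so its gradient estimate has mean square error sigma^2 / n, and gradient descent on it converges to the sample mean. All function names and parameter values are hypothetical choices made for this example.

```python
import random


def empirical_gradient(x, samples):
    """Gradient of 0.5 * mean((x - z)^2) over a fixed sample: x - mean(samples)."""
    return x - sum(samples) / len(samples)


def gradient_descent(samples, steps=60, lr=0.5, x0=0.0):
    """Plain gradient descent using only the data-dependent (empirical) gradient."""
    x = x0
    for _ in range(steps):
        x -= lr * empirical_gradient(x, samples)
    return x


def average_excess_risk(n, sigma=1.0, true_mean=2.0, trials=2000, seed=0):
    """Monte Carlo estimate of E[L(x_hat) - L(x*)] for samples of size n.

    Assumption (toy setting): Z ~ N(true_mean, sigma^2), so the excess risk
    of x_hat is 0.5 * (x_hat - true_mean)^2.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        samples = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_hat = gradient_descent(samples)
        total += 0.5 * (x_hat - true_mean) ** 2
    return total / trials


if __name__ == "__main__":
    for n in (10, 40):
        # The gradient estimator's mean square error in this toy setting is sigma^2 / n.
        print(n, average_excess_risk(n), 0.5 * 1.0 / n)
```

In this toy setting the simulated excess risk should track half the gradient estimator's mean square error, so quadrupling n should cut it by roughly a factor of four. The paper's results concern general strongly convex and smooth problems and minimax rates, which this sketch does not attempt to reproduce.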
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Stochastic Gradient Optimization Techniques
