Complexity of Vector-valued Prediction: From Linear Models to Stochastic Convex Optimization
Matan Schliserman, Tomer Koren

TL;DR
This paper provides new theoretical insights into the sample complexity of vector-valued linear predictors with convex Lipschitz loss functions, connecting linear models and stochastic convex optimization.
Contribution
It offers a tight characterization of ERM sample complexity and a black-box reduction from stochastic convex optimization to vector-valued prediction.
Findings
ERM requires on the order of $\frac{k}{\epsilon^2}$ samples to reach excess risk $\epsilon$
The results improve exponentially on the bounds of Magen and Shamir (2023) in the dependence on the target dimension $k$
Any stochastic convex optimization problem of dimension d can be embedded as a vector-valued prediction problem whose number of outputs is proportional to d
Abstract
We study the problem of learning vector-valued linear predictors: these are prediction rules parameterized by a matrix that maps a feature vector to a $k$-dimensional target. We focus on the fundamental case with a convex and Lipschitz loss function, and show several new theoretical results that shed light on the complexity of this problem and its connection to related learning models. First, we give a tight characterization of the sample complexity of Empirical Risk Minimization (ERM) in this setting, establishing that on the order of $k/\epsilon^2$ examples are necessary for ERM to reach $\epsilon$ excess (population) risk; this provides an exponential improvement over recent results by Magen and Shamir (2023) in terms of the dependence on the target dimension $k$, and matches a classical upper bound due to Maurer (2016). Second, we present a black-box…
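For readers less familiar with the setting, the quantities named in the abstract can be written out as follows. This is only a sketch in assumed notation: the symbols $W$, $x$, $y$, $\ell$, $n$, $\mathcal{D}$, and the constraint set $\mathcal{W}$ are illustrative choices and may not match the paper's own.

```latex
% Notation assumed for illustration; the paper's own symbols may differ.
% A matrix W maps a feature vector x to a k-dimensional prediction Wx.
% The loss \ell(\cdot, y) is assumed convex and Lipschitz in its first argument.
\[
  F(W) = \mathbb{E}_{(x,y)\sim\mathcal{D}}\bigl[\ell(Wx, y)\bigr],
  \qquad
  \widehat{F}(W) = \frac{1}{n}\sum_{i=1}^{n} \ell(W x_i, y_i).
\]
% ERM returns any minimizer of the empirical risk over an assumed
% (for example norm-bounded) class of matrices \mathcal{W}.
\[
  \widehat{W} \in \arg\min_{W \in \mathcal{W}} \widehat{F}(W),
  \qquad
  \text{excess risk} \;=\; F(\widehat{W}) - \min_{W \in \mathcal{W}} F(W).
\]
```

In this notation, the sample complexity question asks how large $n$ must be before the excess risk of an empirical risk minimizer falls below $\epsilon$; the abstract's first result says this number scales as $k/\epsilon^2$.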
Taxonomy
Topics: Neural Networks and Applications
Methods: Focus
