Complexity of Vector-valued Prediction: From Linear Models to Stochastic Convex Optimization

Matan Schliserman; Tomer Koren

arXiv:2412.04274·cs.LG·December 6, 2024

Open Access

TL;DR

This paper provides new theoretical insights into the sample complexity of vector-valued linear predictors with convex Lipschitz loss functions, connecting linear models and stochastic convex optimization.

Contribution

It offers a tight characterization of ERM sample complexity and a black-box reduction from stochastic convex optimization to vector-valued prediction.

Findings

01

ERM requires $Ω(k / ϵ^{2})$ samples to reach $ϵ$ excess (population) risk

02

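The lower bound is an exponential improvement, in terms of the dependence on the target dimension $k$, over recent results by Magen and Shamir (2023), and it matches a classical upper bound due to Maurer (2016)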

03

Any stochastic convex optimization problem can be embedded as a vector-valued prediction problem whose number of outputs is proportional to the input dimension $d$

Abstract

We study the problem of learning vector-valued linear predictors: these are prediction rules parameterized by a matrix that maps an $m$-dimensional feature vector to a $k$-dimensional target. We focus on the fundamental case with a convex and Lipschitz loss function, and show several new theoretical results that shed light on the complexity of this problem and its connection to related learning models. First, we give a tight characterization of the sample complexity of Empirical Risk Minimization (ERM) in this setting, establishing that $Ω(k / ϵ^{2})$ examples are necessary for ERM to reach $ϵ$ excess (population) risk; this provides for an exponential improvement over recent results by Magen and Shamir (2023) in terms of the dependence on the target dimension $k$, and matches a classical upper bound due to Maurer (2016). Second, we present a black-box…
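
To make the setting in the abstract concrete, the following is a minimal Python sketch of a vector-valued linear predictor and of ERM over a sample. It is an illustration under stated assumptions, not the authors' construction: the Euclidean-distance loss is just one example of a convex, 1-Lipschitz loss, and the function names (`predict`, `loss`, `empirical_risk`, `erm_subgradient_descent`), the step-size schedule, and the absence of any norm constraint on the matrix are all choices made here for illustration.

```python
import math


def predict(W, x):
    """Apply a k-by-m matrix W (list of rows) to an m-dimensional feature vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def loss(pred, y):
    """Euclidean distance ||pred - y||_2 (assumed example loss):
    convex and 1-Lipschitz in the prediction."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, y)))


def empirical_risk(W, samples):
    """Average loss of the predictor W over a list of (x, y) pairs."""
    return sum(loss(predict(W, x), y) for x, y in samples) / len(samples)


def subgradient(W, samples):
    """One subgradient of the empirical risk with respect to W."""
    k, m, n = len(W), len(W[0]), len(samples)
    G = [[0.0] * m for _ in range(k)]
    for x, y in samples:
        residual = [p - t for p, t in zip(predict(W, x), y)]
        norm = math.sqrt(sum(r * r for r in residual))
        if norm == 0.0:
            continue  # zero is a valid subgradient where the loss is not differentiable
        for i in range(k):
            for j in range(m):
                G[i][j] += residual[i] * x[j] / (norm * n)
    return G


def erm_subgradient_descent(samples, k, m, steps=200, lr=0.1):
    """Approximate an empirical risk minimizer with plain subgradient descent.

    Hypothetical helper: returns the best iterate seen and its empirical risk.
    """
    W = [[0.0] * m for _ in range(k)]
    best_W, best_risk = W, empirical_risk(W, samples)
    for t in range(1, steps + 1):
        G = subgradient(W, samples)
        step = lr / math.sqrt(t)  # decaying step size, a standard choice
        W = [[W[i][j] - step * G[i][j] for j in range(m)] for i in range(k)]
        risk = empirical_risk(W, samples)
        if risk < best_risk:
            best_W, best_risk = W, risk
    return best_W, best_risk
```

Calling `erm_subgradient_descent(samples, k, m)` on a list of `(x, y)` pairs returns a matrix with low empirical risk. The sample complexity question in the abstract concerns how many such pairs are needed before low empirical risk also implies low population risk.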

Taxonomy

Topics: Neural Networks and Applications

Methods: Focus