Convex Risk Minimization and Conditional Probability Estimation
Matus Telgarsky, Miroslav Dudík, Robert Schapire

TL;DR
This paper establishes that convex risk minimization selects a unique conditional probability model in very general settings, including cases in which no risk minimizer exists, and proves that minimizing sequences of predictors converge to this model.
Contribution
It generalizes previous results to cases in which no minimum exists, and it proves convergence both for risk minimization over the source distribution and for empirical risk minimization.
Findings
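Any sequence of predictors minimizing convex risk over the source distribution converges to the unique model when the predictor class is linear, even if it is infinite dimensional.
The same convergence holds for empirical risk minimization when the predictor class is finite dimensional.
The essential technical contribution behind the empirical result is a norm-free generalization bound.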
Abstract
This paper proves, in very general settings, that convex risk minimization is a procedure to select a unique conditional probability model determined by the classification problem. Unlike most previous work, we give results that are general enough to include cases in which no minimum exists, as occurs typically, for instance, with standard boosting algorithms. Concretely, we first show that any sequence of predictors minimizing convex risk over the source distribution will converge to this unique model when the class of predictors is linear (but potentially of infinite dimension). Secondly, we show the same result holds for empirical risk minimization whenever this class of predictors is finite dimensional, where the essential technical contribution is a norm-free generalization bound.
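The phenomenon described in the abstract, a risk with no minimizer whose minimizing sequences still settle on one conditional probability model, can be seen in a toy case. The sketch below is only an illustration and is not taken from the paper: the three-point separable dataset, the choice of logistic loss with the sigmoid link, plain gradient descent, and all function names are assumptions made for this example.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softplus(z):
    # Stable evaluation of log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def logistic_risk(w, xs, ys):
    # Empirical logistic risk of the 1-D linear predictor x -> w * x.
    return sum(softplus(-y * w * x) for x, y in zip(xs, ys)) / len(xs)


def risk_gradient(w, xs, ys):
    # Derivative of logistic_risk with respect to w.
    return sum(-y * x * sigmoid(-y * w * x) for x, y in zip(xs, ys)) / len(xs)


def run_gradient_descent(xs, ys, steps, step_size=1.0):
    # Plain gradient descent from w = 0 (an assumed optimizer, chosen for simplicity).
    w = 0.0
    for _ in range(steps):
        w -= step_size * risk_gradient(w, xs, ys)
    return w


# Hypothetical separable data: the label equals the sign of x, so the risk
# has infimum 0 but no finite w attains it.
XS = [1.0, 2.0, -1.0]
YS = [1, 1, -1]

w_short = run_gradient_descent(XS, YS, 100)
w_long = run_gradient_descent(XS, YS, 10000)

# Estimated conditional probabilities P(y = +1 | x) under the sigmoid link.
probs_short = [sigmoid(w_short * x) for x in XS]
probs_long = [sigmoid(w_long * x) for x in XS]

print("w after 100 steps:", w_short, "risk:", logistic_risk(w_short, XS, YS))
print("w after 10000 steps:", w_long, "risk:", logistic_risk(w_long, XS, YS))
print("probabilities after 100 steps:", probs_short)
print("probabilities after 10000 steps:", probs_long)
```

In this toy setting the weight keeps growing with more iterations, so the predictors themselves have no limit, while the induced probability estimates move toward 1 on the positive points and 0 on the negative point. That is the sense in which a norm-based analysis of the predictor is unhelpful here and the conditional probability model is the natural object to track.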
Taxonomy
Topics: Machine Learning and Algorithms · Statistical Methods and Inference · Bayesian Modeling and Causal Inference
