Second-Order Bounds for [0,1]-Valued Regression via Betting Loss
Yinan Li, Kwang-Sung Jun

TL;DR
This paper introduces the betting loss, a new loss function for [0,1]-valued regression that achieves variance-dependent (second-order) bounds without prior knowledge of the variance, improving upon first-order bounds.
Contribution
The paper proposes the betting loss, a novel variance-adaptive loss function that provides second-order bounds in [0,1]-valued regression without needing variance estimation.
Findings
Betting loss achieves variance-dependent bounds.
It is variance-adaptive without prior variance knowledge.
Improves upon first-order bounds in regression.
Abstract
We consider the [0,1]-valued regression problem in the i.i.d. setting. In a related problem called cost-sensitive classification, Foster and Krishnamurthy (2021) have shown that the log loss minimizer achieves an improved generalization bound compared to that of the squared loss minimizer, in the sense that the bound scales with the cost of the best classifier, which can be arbitrarily small depending on the problem at hand. Such a result is often called a first-order bound. For [0,1]-valued regression, we first show that the log loss minimizer leads to a similar first-order bound. We then ask if there exists a loss function that achieves a variance-dependent bound (also known as a second-order bound), which is a strict improvement upon first-order bounds. We answer this question in the affirmative by proposing a novel loss function called the betting loss. Our result is…
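To make the gap between first-order and second-order quantities concrete, here is a minimal sketch, not taken from the paper. It relies only on the standard fact that a label y in [0,1] satisfies y^2 <= y, so Var(y) <= E[y](1 - E[y]) <= E[y]. Using the mean label as a stand-in for the first-order quantity is an assumption made for illustration, and the helper name `mean_and_variance` is hypothetical.

```python
import random
import statistics


def mean_and_variance(labels):
    """Return (mean, population variance) of a list of [0,1]-valued labels."""
    mean = statistics.fmean(labels)
    var = statistics.pvariance(labels, mu=mean)
    return mean, var


# Deterministic labels: the mean is far from zero, but the variance is zero.
# A bound scaling with the mean stays large here, while a bound scaling
# with the variance vanishes.
const_mean, const_var = mean_and_variance([0.5, 0.5, 0.5, 0.5])

# Low-noise labels concentrated near 0.3 (illustrative synthetic data only).
rng = random.Random(0)
noisy = [min(1.0, max(0.0, rng.gauss(0.3, 0.02))) for _ in range(10_000)]
noisy_mean, noisy_var = mean_and_variance(noisy)

# For labels in [0,1]: Var(y) <= E[y] * (1 - E[y]) <= E[y].
for mean, var in [(const_mean, const_var), (noisy_mean, noisy_var)]:
    assert var <= mean * (1.0 - mean) + 1e-12
    assert mean * (1.0 - mean) <= mean
```

In both examples the variance is much smaller than the mean, which is the regime where a variance-dependent bound is tighter than one that scales with the mean.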
Peer Reviews
Decision: Submitted to ICLR 2026
The paper is clear and easy to follow.
Honestly, this paper gives me a hard time evaluating it fairly. On one hand, the __theoretical__ finding of this paper, though niche, is interesting enough. The proof technique is simple and natural, nothing to write home about. On the other hand, one might argue that the result only makes sense theoretically, and in practice, it is absolutely intractable due to the multi-level (not even bi-level) optimization nature. As a result, the experiments are very simple and only for demonstration purpos
- The authors proved improved bounds (in certain regimes) for a fundamental problem in learning theory.
- The results seem to follow in a non-trivial way.
- I found the intuition behind the betting loss function a bit unclear.
- The result requires that the label space $y$ is bounded in $[0,1]$.
The problem studied by this paper, namely giving variance-adaptive / second-order generalization bounds, is a nice theoretical problem in statistical learning theory. The method proposed (minimizing a surrogate betting loss) seems appealingly simple and practical to use. The paper is generally written in a clear way.
The main weakness of the paper is that it does not seem to be aware of / engage with a large body of closely related prior work in generalization theory for supervised learning. This prior work is usually associated with keywords such as "optimistic rates", "optimal / localized generalization bounds", "benign overfitting", and "Moreau envelope theory", among others. The authors may claim that these works focus on so-called "first-order" bounds, whereas they are concerned with "second-order" boun
Taxonomy
Topics: Advanced Statistical Methods and Models
