Second-Order Bounds for [0,1]-Valued Regression via Betting Loss

Yinan Li; Kwang-Sung Jun

arXiv:2507.12584·cs.LG·July 18, 2025

Open Access · 3 Reviews

TL;DR

This paper introduces the betting loss, a new loss function for [0,1]-valued regression that achieves a variance-dependent (second-order) bound without prior knowledge of the variance, improving on first-order bounds.

Contribution

The paper proposes the betting loss, a novel variance-adaptive loss function that provides second-order bounds in [0,1]-valued regression without needing variance estimation.

Findings

01. The betting loss achieves variance-dependent bounds.

02. It is variance-adaptive: no prior knowledge of the variance is needed.

03. It improves upon first-order bounds for [0,1]-valued regression.

Abstract

We consider the $[0, 1]$-valued regression problem in the i.i.d. setting. In a related problem called cost-sensitive classification, Foster and Krishnamurthy (2021) have shown that the log loss minimizer achieves an improved generalization bound compared to that of the squared loss minimizer, in the sense that the bound scales with the cost of the best classifier, which can be arbitrarily small depending on the problem at hand. Such a result is often called a first-order bound. For $[0, 1]$-valued regression, we first show that the log loss minimizer leads to a similar first-order bound. We then ask whether there exists a loss function that achieves a variance-dependent bound (also known as a second-order bound), which is a strict improvement upon first-order bounds. We answer this question in the affirmative by proposing a novel loss function called the betting loss. Our result is…
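The following is a minimal illustrative sketch, not code from the paper. It evaluates the squared loss and the log loss on hypothetical $[0, 1]$-valued labels, and it checks numerically that the label variance is at most mean × (1 − mean), which in turn is at most the mean. That standard inequality gives some intuition for why a bound scaling with the variance can be much smaller than one scaling with a first-order quantity. The label values, the function names, and the grid search over constant predictors are all assumptions made for illustration. The betting loss is not defined in this excerpt, so it is not implemented here.

```python
import math


def squared_loss(pred, y):
    """Squared loss for a prediction and label, both in [0, 1]."""
    return (pred - y) ** 2


def log_loss(pred, y, eps=1e-12):
    """Log loss (cross-entropy) for a [0, 1]-valued label y.

    The prediction is clipped away from 0 and 1 to keep the logs finite.
    """
    p = min(max(pred, eps), 1.0 - eps)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    mu = mean(xs)
    return mean([(x - mu) ** 2 for x in xs])


# Hypothetical labels: small mean and even smaller variance.
labels = [0.02, 0.03, 0.02, 0.04, 0.03, 0.02, 0.03, 0.03]

m = mean(labels)
v = variance(labels)

# For any values in [0, 1]: variance <= mean * (1 - mean) <= mean.
print("mean:", m, "variance:", v, "mean*(1-mean):", m * (1 - m))

# Among constant predictors on a grid, find the empirical minimizer of
# each loss. Both should land next to the sample mean.
grid = [k / 1000 for k in range(1, 1000)]
best_p = min(grid, key=lambda p: mean([log_loss(p, y) for y in labels]))
best_p_sq = min(grid, key=lambda p: mean([squared_loss(p, y) for y in labels]))
print("log loss minimizer:", best_p, "squared loss minimizer:", best_p_sq)
```

In this toy setting both losses select roughly the same constant predictor; the distinction drawn in the abstract concerns how the generalization bounds of the respective minimizers scale, which this sketch does not attempt to reproduce.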

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 6 · Confidence 1

Strengths

The paper is clear and easy to follow.

Weaknesses

Honestly, this paper gives me a hard time evaluating it fairly. On one hand, the __theoretical__ finding of this paper, though niche, is interesting enough. The proof technique is simple and natural, nothing to write home about. On the other hand, one might argue that the result only makes sense theoretically, and in practice, it is absolutely intractable due to the multi-level (not even bi-level) optimization nature. As a result, the experiments are very simple and only for demonstration purpos

Reviewer 02 · Rating 8 · Confidence 2

Strengths

- The authors proved improved bounds (in certain regimes) for a fundamental problem in learning theory.
- The results seem to be following in a non-trivial way.

Weaknesses

- I found the intuition behind the betting loss function a bit unclear.
- The result requires that the label space $y$ is bounded in $[0,1]$.

Reviewer 03 · Rating 2 · Confidence 4

Strengths

The problem studied by this paper, namely giving variance-adaptive / second-order generalization bounds, is a nice theoretical problem in statistical learning theory. The method proposed (minimizing a surrogate betting loss) seems appealingly simple and practical to use. The paper is generally written in a clear way.

Weaknesses

The main weakness of the paper is that it does not seem to be aware of / engage with a large body of closely related prior work in generalization theory for supervised learning. This prior work is usually associated with keywords such as "optimistic rates", "optimal / localized generalization bounds", "benign overfitting", and "Moreau envelope theory", among others. The authors may claim that these works focus on so-called "first-order" bounds, whereas they are concerned with "second-order" boun

Taxonomy

Topics: Advanced Statistical Methods and Models