Generalization Bounds and Stopping Rules for Learning with Self-Selected Data

Julian Rodemann; James Bailie

arXiv:2505.07367·cs.LG·May 13, 2025

TL;DR

This paper proves universal generalization bounds and derives stopping rules for reciprocal learning, a framework that encompasses active learning, semi-supervised learning, bandits, and boosting. The guarantees on out-of-sample performance require no assumptions on the distribution of the self-selected data, only verifiable conditions on the algorithms.

Contribution

It provides generalization bounds and stopping rules for reciprocal learning, the framework of Rodemann et al. (2024) that unifies learning paradigms which self-select their training data.

Findings

01 Universal generalization bounds for reciprocal learning, proved using covering numbers and Wasserstein ambiguity sets.

02 Anytime-valid bounds for finite-iteration solutions, which yield stopping rules that guarantee the out-of-sample performance of a reciprocal learning algorithm.

03 Bounds and stopping rules illustrated for the semi-supervised learning case.

Abstract

Many learning paradigms self-select training data in light of previously learned parameters. Examples include active learning, semi-supervised learning, bandits, or boosting. Rodemann et al. (2024) unify them under the framework of "reciprocal learning". In this article, we address the question of how well these methods can generalize from their self-selected samples. In particular, we prove universal generalization bounds for reciprocal learning using covering numbers and Wasserstein ambiguity sets. Our results require no assumptions on the distribution of self-selected data, only verifiable conditions on the algorithms. We prove results for both convergent and finite iteration solutions. The latter are anytime valid, thereby giving rise to stopping rules for a practitioner seeking to guarantee the out-of-sample performance of their reciprocal learning algorithm. Finally, we illustrate…
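To make the idea of self-selected training data concrete, here is a minimal sketch of one reciprocal learning instance: self-training (pseudo-labeling) in semi-supervised learning. It is an illustration under stated assumptions, not the authors' algorithm. The nearest-centroid model, the margin-based selection rule, the function names, and the synthetic data are all hypothetical choices made for this example. In particular, the stopping check (parameter change below `tol`) is a simple convergence heuristic, not the stopping rule derived in the paper.

```python
import numpy as np

def fit_centroids(X, y):
    """Model parameters: one centroid per class (classes 0 and 1)."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict_with_margin(theta, X):
    """Predicted label and a confidence margin for each row of X."""
    # Distance of every point to both centroids, shape (n, 2).
    d = np.linalg.norm(X[:, None, :] - theta[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    margin = np.abs(d[:, 0] - d[:, 1])  # larger margin = more confident
    return labels, margin

def self_training(X_lab, y_lab, X_pool, tol=1e-3, max_iter=100):
    """Self-training loop: the data added at each step depend on the
    current parameters, and the parameters are then refit on those data."""
    X_lab, y_lab, X_pool = X_lab.copy(), y_lab.copy(), X_pool.copy()
    theta = fit_centroids(X_lab, y_lab)
    history = [theta]
    for _ in range(max_iter):
        if len(X_pool) == 0:
            break
        # Data selection depends on the current parameters theta.
        labels, margin = predict_with_margin(theta, X_pool)
        i = int(margin.argmax())
        X_lab = np.vstack([X_lab, X_pool[i:i + 1]])
        y_lab = np.append(y_lab, labels[i])
        X_pool = np.delete(X_pool, i, axis=0)
        # Parameters are refit on the self-selected sample.
        new_theta = fit_centroids(X_lab, y_lab)
        history.append(new_theta)
        change = np.linalg.norm(new_theta - theta)
        theta = new_theta
        # Hypothetical heuristic stop, NOT the paper's stopping rule.
        if change < tol:
            break
    return theta, history

# Synthetic two-cluster data (an assumption for the demo).
rng = np.random.default_rng(0)
X_lab = np.vstack([rng.normal(-2.0, 0.5, (2, 2)), rng.normal(2.0, 0.5, (2, 2))])
y_lab = np.array([0, 0, 1, 1])
X_pool = np.vstack([rng.normal(-2.0, 0.5, (20, 2)), rng.normal(2.0, 0.5, (20, 2))])

theta, history = self_training(X_lab, y_lab, X_pool)
print("final centroids:\n", theta)
print("iterations run:", len(history) - 1)
```

The loop shows why the usual i.i.d. reasoning does not apply directly: the labeled sample at each iteration is a function of earlier parameter estimates, which is the dependence the paper's bounds are designed to handle.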

Taxonomy

Topics: Machine Learning and Algorithms · Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques