Fast Rate Generalization Error Bounds: Variations on a Theme

Xuetong Wu; Jonathan H. Manton; Uwe Aickelin; Jingge Zhu

arXiv:2205.03131 · cs.IT · May 16, 2022

Open Access

TL;DR

This paper identifies conditions under which information-theoretic generalization error bounds achieve the fast O(1/n) rate, challenging the common belief that their square-root dependence forces a slow rate.

Contribution

The paper introduces the (η,c)-central condition, which enables fast-rate bounds, and shows how information-theoretic bounds apply under this condition to specific learning algorithms.

Findings

01. Fast-rate (O(1/n)) bounds are possible under certain assumptions.

02. The (η,c)-central condition is key for achieving fast rates.

03. Analytical examples validate the effectiveness of the proposed bounds.

Abstract

A recent line of work, initiated by Russo and Xu, has shown that the generalization error of a learning algorithm can be upper bounded by information measures. In most of the relevant works, the convergence rate of the expected generalization error takes the form O(√(λ/n)), where λ is an information-theoretic quantity such as the mutual information between the data sample and the learned hypothesis. However, such a learning rate is typically considered "slow" compared to the "fast rate" of O(1/n) attained in many learning scenarios. In this work, we first show that the square root does not necessarily imply a slow rate, and that a fast-rate (O(1/n)) result can still be obtained from this bound under appropriate assumptions. Furthermore, we identify the key condition needed for a fast-rate generalization error, which we call the (η,c)-central condition. Under this…
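
The difference between the slow and fast rates described in the abstract can be seen in a small simulation. The sketch below is an illustration only and is not taken from the paper: it assumes a Gaussian mean-estimation task with squared loss, where the empirical risk minimizer is the sample mean and the expected generalization error equals 2σ²/n exactly, which is a fast rate. The function name `expected_gen_gap` and all of its parameters are hypothetical.

```python
import numpy as np


def expected_gen_gap(n, sigma=1.0, trials=20000, seed=0):
    """Monte Carlo estimate of E[population risk - empirical risk]
    for the sample mean under squared loss (illustrative assumption,
    not the paper's code)."""
    rng = np.random.default_rng(seed)
    # Each row is one training set of n i.i.d. N(0, sigma^2) samples.
    z = rng.normal(0.0, sigma, size=(trials, n))
    # ERM hypothesis for squared loss is the sample mean of each row.
    w = z.mean(axis=1)
    # Population risk E[(w - Z)^2] = w^2 + sigma^2 when the true mean is 0.
    pop_risk = w**2 + sigma**2
    # Empirical risk is the average squared residual on the training set.
    emp_risk = ((z - w[:, None]) ** 2).mean(axis=1)
    return float((pop_risk - emp_risk).mean())


if __name__ == "__main__":
    for n in (10, 40, 160):
        est = expected_gen_gap(n)
        print(f"n={n:4d}  estimated gap={est:.4f}  2*sigma^2/n={2.0 / n:.4f}")
```

Quadrupling n should shrink the estimated gap by roughly a factor of four, as expected for a 1/n rate; under a 1/√n rate it would only halve.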

Taxonomy

Topics: Machine Learning and Algorithms · Sparse and Compressive Sensing Techniques · Domain Adaptation and Few-Shot Learning