Generalization of the Gibbs algorithm with high probability at low temperatures

Andreas Maurer

arXiv:2502.11071·cs.LG·April 7, 2025

TL;DR

This paper gives a high-probability bound on the generalization error of the Gibbs algorithm that covers both the high- and low-temperature ranges. At low temperatures the bound depends on the data-dependent loss landscape and favors flat minima, and it extends to a class of similar stochastic algorithms.

Contribution

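The bound recovers known data-independent bounds at high temperatures and extends them to low temperatures, where generalization is tied to the data-dependent loss landscape and to the prior volume of hypotheses with similar or smaller empirical error. This supports the importance of flat minima.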

Findings

01 High-probability bounds on generalization error at low temperatures

02 Generalization depends on the data-dependent loss landscape

03 Supports the benefit of flat minima in optimization

Abstract

The paper gives a bound on the generalization error of the Gibbs algorithm, which recovers known data-independent bounds for the high-temperature range and extends to the low-temperature range, where generalization depends critically on the data-dependent loss landscape. It is shown that, with high probability, the generalization error of a single hypothesis drawn from the Gibbs posterior decreases with the total prior volume of all hypotheses with similar or smaller empirical error. This gives theoretical support to the belief in the benefit of flat minima. The zero-temperature limit is discussed and the bound is extended to a class of similar stochastic algorithms.
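The quantities named in the abstract can be illustrated with a minimal sketch on a finite hypothesis class. It assumes the standard form of the Gibbs posterior, with weights proportional to prior(h) * exp(-beta * empirical_loss(h)), where beta is the inverse temperature. The function names, the toy losses, the uniform prior, and the `tol` parameter standing in for "similar" error are all hypothetical and not taken from the paper. The sketch does not compute the paper's bound.

```python
import math
import random


def gibbs_posterior(emp_losses, prior, beta):
    """Return weights proportional to prior[i] * exp(-beta * emp_losses[i])."""
    shift = min(emp_losses)  # subtract the minimum loss for numerical stability
    weights = [p * math.exp(-beta * (loss - shift))
               for p, loss in zip(prior, emp_losses)]
    total = sum(weights)
    return [w / total for w in weights]


def sublevel_prior_volume(emp_losses, prior, index, tol=0.0):
    """Prior mass of hypotheses whose empirical loss is at most
    emp_losses[index] + tol (tol is an illustrative stand-in for 'similar')."""
    threshold = emp_losses[index] + tol
    return sum(p for p, loss in zip(prior, emp_losses) if loss <= threshold)


# Hypothetical toy problem: 8 hypotheses, 5 of which share the minimal loss
# (a "flat" minimum in the sense of large prior volume).
emp_losses = [0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 1.0]
prior = [1.0 / 8] * 8

hot = gibbs_posterior(emp_losses, prior, beta=0.0)    # infinite temperature: equals the prior
cold = gibbs_posterior(emp_losses, prior, beta=50.0)  # low temperature: concentrates on minima

# Draw a single hypothesis from the low-temperature posterior.
rng = random.Random(0)
h = rng.choices(range(len(prior)), weights=cold, k=1)[0]
volume = sublevel_prior_volume(emp_losses, prior, h)
print(h, emp_losses[h], volume)
```

In this toy setting, a wider set of minimizers gives the sampled hypothesis a larger sublevel prior volume, which is the quantity the abstract says the generalization error decreases with.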

Taxonomy

Topics: Neural Networks and Applications