Generalization error bounds for iterative learning algorithms with bounded updates

Jingwen Fu; Nanning Zheng

arXiv:2309.05077 · cs.LG · October 17, 2023

TL;DR

This paper develops new information-theoretic bounds on the generalization error of iterative learning algorithms with bounded updates, especially for non-convex loss functions, enhancing understanding of their theoretical properties.

Contribution

It introduces a novel mutual information reformulation and a variance decomposition technique to derive tighter generalization bounds for bounded-update algorithms.

Findings

01 Improved generalization bounds under various settings

02 New perspective on mutual information as update uncertainty

03 Analysis of large language models' scaling behavior

Abstract

This paper explores the generalization characteristics of iterative learning algorithms with bounded updates for non-convex loss functions, employing information-theoretic techniques. Our key contribution is a novel bound for the generalization error of these algorithms with bounded updates. Our approach introduces two main novelties: 1) we reformulate the mutual information as the uncertainty of updates, providing a new perspective, and 2) instead of using the chain rule of mutual information, we employ a variance decomposition technique to decompose information across iterations, allowing for a simpler surrogate process. We analyze our generalization bound under various settings and demonstrate improved bounds. To bridge the gap between theory and practice, we also examine the previously observed scaling behavior in large language models. Ultimately, our work takes a further step…
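
For context, the standard identities behind the terms used in the abstract can be sketched as follows. The notation is an assumption and is not taken from this page: $S$ is the training sample of size $n$, $W$ is the algorithm's output, $W_1,\dots,W_T$ are the iterates, $h$ is differential entropy, and the loss is assumed $\sigma$-sub-Gaussian. The first line is the well-known mutual-information generalization bound of Xu and Raginsky (2017), given as background. The second writes mutual information as a difference of entropies, which is one way to read it as uncertainty. The third is the chain rule that the paper says it avoids. The fourth is the law of total variance, presumably the kind of decomposition the abstract alludes to. The paper's own bound is not reproduced here.

```latex
\begin{align}
  \bigl|\mathrm{gen}(\mu, P_{W \mid S})\bigr|
    &\le \sqrt{\frac{2\sigma^{2}}{n}\, I(S; W)} \\
  I(S; W) &= h(W) - h(W \mid S) \\
  I(S; W_1, \dots, W_T)
    &= \sum_{t=1}^{T} I(S; W_t \mid W_1, \dots, W_{t-1}) \\
  \mathrm{Var}(X)
    &= \mathbb{E}\bigl[\mathrm{Var}(X \mid Y)\bigr]
     + \mathrm{Var}\bigl(\mathbb{E}[X \mid Y]\bigr)
\end{align}
```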

Taxonomy

Topics: Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM