On information captured by neural networks: connections with memorization and generalization

Hrayr Harutyunyan

arXiv:2306.15918 · cs.LG · June 29, 2023

TL;DR

This paper studies the information neural networks capture from their training data and links it to memorization and generalization. It covers label-noise information stored in the weights, the unique information contributed by individual samples, and generalization gap bounds.

Contribution

It provides an information-theoretic framework for understanding neural network learning, including a learning algorithm that limits label-noise information in the weights and a notion of the unique information an individual example provides to training.

Findings

01

Derives a learning algorithm that limits label-noise information in the weights

02

Defines a notion of the unique information an individual sample provides to training

03

Relates example informativeness to generalization through nonvacuous generalization gap bounds

Abstract

Despite the popularity and success of deep learning, there is limited understanding of when, how, and why neural networks generalize to unseen examples. Since learning can be seen as extracting information from data, we formally study information captured by neural networks during training. Specifically, we start with viewing learning in the presence of noisy labels from an information-theoretic perspective and derive a learning algorithm that limits label noise information in weights. We then define a notion of unique information that an individual sample provides to the training of a deep network, shedding some light on the behavior of neural networks on examples that are atypical, ambiguous, or belong to underrepresented subpopulations. We relate example informativeness to generalization by deriving nonvacuous generalization gap bounds. Finally, by studying knowledge distillation, we…

Code & Models

Repositories

awslabs/aws-cv-unique-information (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Data Classification