Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior

Charles H. Martin; Michael W. Mahoney

arXiv:1710.09553 · cs.LG · February 19, 2019 · 29 cites

TL;DR

This paper revisits classical statistical mechanics ideas to better understand the complex and counterintuitive generalization behaviors of deep neural networks, moving beyond traditional capacity control theories.

Contribution

It introduces a prototypical Very Simple Deep Learning (VSDL) model governed by two control parameters, an effective load and an effective temperature, and uses statistical mechanics concepts to explain neural network generalization phenomena.

Findings

01

Explains overfitting and sharp transitions in generalization.

02

Describes how adding noise to the input lowers the effective load on the network and how early stopping raises the effective temperature.

03

Provides qualitative insights into empirical neural network behaviors.

Abstract

We describe an approach to understand the peculiar and counterintuitive generalization properties of deep neural networks. The approach involves going beyond worst-case theoretical capacity control frameworks that have been popular in machine learning in recent years to revisit old ideas in the statistical mechanics of neural networks. Within this approach, we present a prototypical Very Simple Deep Learning (VSDL) model, whose behavior is controlled by two control parameters, one describing an effective amount of data, or load, on the network (that decreases when noise is added to the input), and one with an effective temperature interpretation (that increases when algorithms are early stopped). Using this model, we describe how a very simple application of ideas from the statistical mechanics theory of generalization provides a strong qualitative description of recently-observed…
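The two control parameters named in the abstract can be illustrated with a small toy. The sketch below is not the paper's VSDL model: it assumes the conventional statistical-mechanics definition of load as examples per adjustable parameter, and it uses a standard Boltzmann (Gibbs) weighting over a few candidate states to show what an effective temperature does. The function names `load` and `gibbs_weights` and the example training errors are hypothetical, chosen only for illustration.

```python
import math


def load(num_examples: int, num_params: int) -> float:
    """Load alpha = m / N: training examples per adjustable parameter.

    Assumption: this is the conventional definition, used here for illustration.
    """
    return num_examples / num_params


def gibbs_weights(energies, temperature):
    """Boltzmann weights p_i proportional to exp(-E_i / T) over candidate states."""
    e_min = min(energies)  # shift by the minimum for numerical stability
    unnorm = [math.exp(-(e - e_min) / temperature) for e in energies]
    z = sum(unnorm)  # partition function of the shifted energies
    return [u / z for u in unnorm]


# Hypothetical training errors of three candidate weight configurations.
train_errors = [0.0, 0.1, 0.5]

# Low temperature concentrates weight on the lowest-training-error state;
# high temperature spreads weight across all states.
cold = gibbs_weights(train_errors, temperature=0.05)
hot = gibbs_weights(train_errors, temperature=5.0)
```

In this toy, raising the temperature moves probability mass away from the zero-training-error state, which is one way to read the abstract's statement that early stopping acts like a higher effective temperature. How noise reduces the effective load is not modeled here, since the abstract does not give the functional form.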

Taxonomy

Topics: Neural Networks and Applications · Statistical Mechanics and Entropy · Model Reduction and Neural Networks