Deep Learning is Not So Mysterious or Different
Andrew Gordon Wilson

TL;DR
This paper argues that deep neural networks' surprising generalization behaviors are not unique or mysterious, and can be understood through classical frameworks like PAC-Bayes, emphasizing soft inductive biases.
Contribution
It demonstrates that phenomena like benign overfitting and double descent are explainable within existing theories, challenging the notion that deep learning is fundamentally different.
Findings
Generalization phenomena are explainable by PAC-Bayes and countable hypothesis bounds.
Soft inductive biases favor simpler, data-consistent solutions.
Deep learning's uniqueness lies in representation learning and mode connectivity.
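For orientation, one standard textbook form of a countable hypothesis bound is sketched below. The notation (prior P, loss range Δ, sample size n, confidence δ) is an assumption of this sketch and is not taken from the paper, which may state the bound differently. Fix a prior P over a countable hypothesis class before seeing the data, and let the loss take values in an interval of width Δ. Then, with probability at least 1 − δ over n i.i.d. training samples, every hypothesis h in the class satisfies:

```latex
R(h) \;\le\; \hat{R}(h) \;+\; \Delta \sqrt{\frac{\log \frac{1}{P(h)} + \log \frac{1}{\delta}}{2n}}
```

Here R(h) is the expected risk and \hat{R}(h) the empirical risk. The class can be arbitrarily large; the bound stays tight for any hypothesis that fits the data and receives high prior mass, which is the sense in which a soft preference, rather than a hard restriction, controls generalization.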
Abstract
Deep neural networks are often seen as different from other model classes by defying conventional notions of generalization. Popular examples of anomalous generalization behaviour include benign overfitting, double descent, and the success of overparametrization. We argue that these phenomena are not distinct to neural networks, or particularly mysterious. Moreover, this generalization behaviour can be intuitively understood, and rigorously characterized, using long-standing generalization frameworks such as PAC-Bayes and countable hypothesis bounds. We present soft inductive biases as a key unifying principle in explaining these phenomena: rather than restricting the hypothesis space to avoid overfitting, embrace a flexible hypothesis space, with a soft preference for simpler solutions that are consistent with the data. This principle can be encoded in many model classes, and thus deep…
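The soft inductive bias described in the abstract can be illustrated with a small sketch. This is a toy example under stated assumptions, not the paper's code: it fits a degree-15 polynomial to 20 noisy points, keeping the full flexible hypothesis space but penalizing the coefficient of x^j more heavily as j grows. The function names, the target function, and the penalty schedule (`base_penalty`, `growth`) are all hypothetical choices made for illustration.

```python
import numpy as np

def poly_features(x, degree):
    """Design matrix with columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

def fit_soft_bias(x, y, degree, base_penalty=1e-3, growth=4.0):
    """Ridge-style fit whose penalty on the x^j coefficient grows as growth**j.

    Minimizes ||Phi w - y||^2 + sum_j base_penalty * growth**j * w_j^2,
    so high-order terms stay available but are softly discouraged.
    """
    phi = poly_features(x, degree)
    penalties = base_penalty * growth ** np.arange(degree + 1)
    return np.linalg.solve(phi.T @ phi + np.diag(penalties), phi.T @ y)

def fit_unregularized(x, y, degree):
    """Plain least squares over the same hypothesis space, with no preference."""
    phi = poly_features(x, degree)
    w, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return w

def mse(w, x, y):
    """Mean squared error of the polynomial with coefficients w."""
    return float(np.mean((poly_features(x, len(w) - 1) @ w - y) ** 2))

# Hypothetical toy data: a smooth target observed with noise.
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=20)
y_train = np.sin(3.0 * x_train) + 0.1 * rng.normal(size=20)
x_test = np.linspace(-1.0, 1.0, 200)
y_test = np.sin(3.0 * x_test)

degree = 15
w_soft = fit_soft_bias(x_train, y_train, degree)
w_free = fit_unregularized(x_train, y_train, degree)

print("train MSE  soft:", mse(w_soft, x_train, y_train),
      " free:", mse(w_free, x_train, y_train))
print("test  MSE  soft:", mse(w_soft, x_test, y_test),
      " free:", mse(w_free, x_test, y_test))
```

Both fits search the same 16-parameter space; only the preference differs. The unregularized fit always attains the lower training error, so any difference on the test grid reflects the soft preference for low-order solutions rather than a smaller hypothesis space.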
Videos
The Real Reason Huge AI Models Actually Work [Prof. Andrew Wilson] · YouTube
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Stochastic Gradient Optimization Techniques · Explainable Artificial Intelligence (XAI)
