Deep Learning is Not So Mysterious or Different

Andrew Gordon Wilson

arXiv:2503.02113 · cs.LG · July 11, 2025 · 2 cites

TL;DR

This paper argues that the surprising generalization behaviors of deep neural networks are neither unique to them nor mysterious: classical frameworks such as PAC-Bayes account for them, with soft inductive biases as the unifying principle.

Contribution

The paper argues that benign overfitting, double descent, and the success of overparametrization can be explained within existing generalization theory, challenging the notion that deep learning is fundamentally different.

Findings

01. Generalization phenomena are explainable by PAC-Bayes and countable hypothesis bounds.

02. Soft inductive biases favor simpler, data-consistent solutions.

03. Deep learning's uniqueness lies in representation learning and mode connectivity.
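
For reference, one standard form of a countable hypothesis bound is sketched below. This is the textbook statement (Hoeffding's inequality combined with a union bound weighted by a prior), not a formula quoted from the paper; the symbols P, R, \hat{R}, n, and \delta are notation assumed here for illustration.

```latex
% Assumptions: a countable hypothesis class \mathcal{H}, a prior P over \mathcal{H}
% fixed before seeing the data, a loss bounded in [0, 1], and n i.i.d. training examples.
% R(h) is the expected risk and \hat{R}(h) the empirical (training) risk.
% With probability at least 1 - \delta, simultaneously for all h \in \mathcal{H}:
R(h) \;\le\; \hat{R}(h) + \sqrt{\frac{\log\frac{1}{P(h)} + \log\frac{1}{\delta}}{2n}}
```

The bound depends on the prior mass of the particular hypothesis, not on the size of the hypothesis space. A very flexible model class therefore incurs no penalty as long as the solution it finds has low training error and high prior mass, which is one way to read a soft preference for simpler solutions.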

Abstract

Deep neural networks are often seen as different from other model classes by defying conventional notions of generalization. Popular examples of anomalous generalization behaviour include benign overfitting, double descent, and the success of overparametrization. We argue that these phenomena are not distinct to neural networks, or particularly mysterious. Moreover, this generalization behaviour can be intuitively understood, and rigorously characterized, using long-standing generalization frameworks such as PAC-Bayes and countable hypothesis bounds. We present soft inductive biases as a key unifying principle in explaining these phenomena: rather than restricting the hypothesis space to avoid overfitting, embrace a flexible hypothesis space, with a soft preference for simpler solutions that are consistent with the data. This principle can be encoded in many model classes, and thus deep…
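
The soft inductive bias idea in the abstract can be illustrated with a small sketch. The example below is an assumption chosen for illustration, not code from the paper: it fits a degree-15 polynomial (a flexible hypothesis space) with a penalty that grows with the order of each coefficient, so higher-order terms remain available but are discouraged. The names `fit_soft_bias`, `lam`, and `gamma` are hypothetical.

```python
import numpy as np

def poly_features(x, degree):
    # Columns are x**0, x**1, ..., x**degree.
    return np.vander(x, degree + 1, increasing=True)

def fit_soft_bias(x, y, degree=15, lam=1e-3, gamma=4.0):
    """Fit a polynomial with penalty lam * gamma**j on the j-th coefficient.

    Illustrative assumption: the order-dependent penalty stands in for a
    soft preference for simpler (lower-order) solutions.
    """
    phi = poly_features(x, degree)
    penalties = lam * gamma ** np.arange(degree + 1)
    # Closed-form minimizer of ||phi @ w - y||^2 + sum_j penalties[j] * w[j]**2.
    return np.linalg.solve(phi.T @ phi + np.diag(penalties), phi.T @ y)

def predict(w, x):
    return poly_features(x, len(w) - 1) @ w

# Nearly linear data: 12 points, far fewer than needed to pin down
# 16 coefficients without some preference among solutions.
rng = np.random.default_rng(0)
x_train = np.linspace(-1.0, 1.0, 12)
y_train = 1.0 + 2.0 * x_train + 0.05 * rng.standard_normal(x_train.size)

w = fit_soft_bias(x_train, y_train)
train_mse = np.mean((predict(w, x_train) - y_train) ** 2)
```

The hypothesis space is not restricted: all 16 coefficients are free. The penalty only makes high-order terms expensive, so on simple data the fit stays close to a low-order polynomial, while the same model could still use the higher orders if the data demanded them.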

Videos

The Real Reason Huge AI Models Actually Work [Prof. Andrew Wilson] · YouTube

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Stochastic Gradient Optimization Techniques · Explainable Artificial Intelligence (XAI)