Uniform convergence may be unable to explain generalization in deep learning

Vaishnavh Nagarajan; J. Zico Kolter

arXiv:1902.04742·cs.LG·October 19, 2021·42 cites

TL;DR

This paper demonstrates that uniform convergence bounds often fail to explain why overparameterized deep networks generalize well, as these bounds can be vacuous or even increase with dataset size, highlighting limitations of current theoretical explanations.

Contribution

The paper provides empirical evidence and theoretical examples showing uniform convergence cannot fully account for generalization in deep learning, challenging existing bounds.

Findings

01 Uniform convergence bounds can increase with dataset size in practice.
02 Uniform convergence cannot explain generalization in certain overparameterized models.
03 Existing bounds often yield vacuous guarantees for models with low test error.
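
For context, the kind of statement these findings refer to can be written in the standard learning-theoretic form below. This is a sketch of the textbook definition, not a formula taken from this page; the symbols (hypothesis class H, data distribution D, training sample S of size m, population loss L_D, empirical loss \hat{L}_S, failure probability \delta, and bound \epsilon_unif) are notation assumed here for illustration.

```latex
% Uniform convergence bound (standard form; notation assumed for illustration):
% with probability at least 1 - \delta over the draw of the training sample S,
% the empirical loss is close to the population loss for EVERY h in H at once.
\Pr_{S \sim \mathcal{D}^m}\left[
  \sup_{h \in \mathcal{H}}
  \left| L_{\mathcal{D}}(h) - \hat{L}_{S}(h) \right|
  \le \epsilon_{\mathrm{unif}}(m, \delta)
\right] \ge 1 - \delta

% For the 0-1 loss, both losses lie in [0, 1], so any bound with
% \epsilon_{\mathrm{unif}}(m, \delta) \ge 1 is vacuous: it holds trivially.
```

Under this reading, a bound "explains generalization" only if \epsilon_unif is small for the learned classifier and shrinks as m grows, which is the behavior the findings above call into question.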

Abstract

Aimed at explaining the surprisingly good generalization behavior of overparameterized deep networks, recent works have developed a variety of generalization bounds for deep learning, all based on the fundamental learning-theoretic technique of uniform convergence. While it is well-known that many of these existing bounds are numerically large, through numerous experiments, we bring to light a more concerning aspect of these bounds: in practice, these bounds can increase with the training dataset size. Guided by our observations, we then present examples of overparameterized linear classifiers and neural networks trained by gradient descent (GD) where uniform convergence provably cannot "explain generalization", even if we take into account the implicit bias of GD to the fullest extent possible. More precisely, even if we consider only the set of classifiers output by GD,…

Code & Models

Repositories

locuslab/uniform-convergence-NeurIPS19 (tf, Official)

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning