Provable Tempered Overfitting of Minimal Nets and Typical Nets

Itamar Harel; William M. Hoza; Gal Vardi; Itay Evron; Nathan Srebro; Daniel Soudry

arXiv:2410.19092 · cs.LG · October 28, 2024

TL;DR

This paper proves that fully connected deep neural networks with binary weights exhibit tempered overfitting when they are fitted to perfectly classify a noisy training set, without requiring a very high or very low input dimension.

Contribution

It introduces a new bound on the size of a threshold circuit consistent with a partial function and uses it to prove that overfitting in deep NNs with binary weights is tempered, without assuming a very high or very low input dimension.

Findings

01

Overfitting is proven to be tempered in deep NNs with binary weights.

02

A new bound is established on the size of a threshold circuit consistent with a partial function.

03

To the authors' knowledge, these are the first results on benign or tempered overfitting that apply to deep NNs and do not require a very high or very low input dimension.

Abstract

We study the overfitting behavior of fully connected deep Neural Networks (NNs) with binary weights fitted to perfectly classify a noisy training set. We consider interpolation using both the smallest NN (having the minimal number of weights) and a random interpolating NN. For both learning rules, we prove overfitting is tempered. Our analysis rests on a new bound on the size of a threshold circuit consistent with a partial function. To the best of our knowledge, ours are the first theoretical results on benign or tempered overfitting that: (1) apply to deep NNs, and (2) do not require a very high or very low input dimension.
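The second learning rule in the abstract, a random interpolating NN, can be pictured with a small rejection-sampling sketch. It is not the authors' construction: the weight alphabet {-1, +1}, the integer thresholds, the layer widths, the target function, the 25% label-flip rate, and the names `sample_net`, `forward`, and `random_interpolator` are all assumptions made for illustration, and the parameterization used in the paper may differ.

```python
import numpy as np

def sample_net(rng, widths):
    """Draw one threshold network (assumed form: +/-1 weights, integer thresholds)."""
    layers = []
    for fan_in, fan_out in zip(widths[:-1], widths[1:]):
        W = rng.choice([-1, 1], size=(fan_out, fan_in))
        b = rng.integers(-fan_in, fan_in + 1, size=fan_out)
        layers.append((W, b))
    return layers

def forward(layers, X):
    """Apply each layer as a hard threshold; return the single output bit per row."""
    H = X
    for W, b in layers:
        H = (H @ W.T + b > 0).astype(int)
    return H[:, 0]

def random_interpolator(X, y, widths, rng, max_tries=20_000):
    """Rejection sampling: keep drawing networks until one fits every training label."""
    for _ in range(max_tries):
        net = sample_net(rng, widths)
        if np.array_equal(forward(net, X), y):
            return net
    return None  # no interpolator found within the try budget

rng = np.random.default_rng(0)

# Eight distinct inputs from {0,1}^4, so that interpolation is not ruled out
# by two identical inputs carrying different noisy labels.
idx = rng.choice(16, size=8, replace=False)
X = (idx[:, None] >> np.arange(4)) & 1

y_clean = X[:, 0]                      # hypothetical target: the first input bit
flips = (rng.random(8) < 0.25).astype(int)
y_noisy = y_clean ^ flips              # independent label-flip noise

net = random_interpolator(X, y_noisy, widths=[4, 8, 1], rng=rng)
```

If `net` is not `None`, it classifies the noisy training set perfectly by construction. A rough analogue of the first learning rule (the smallest interpolating NN) would instead search architectures in order of increasing number of weights and stop at the first one that admits an interpolator; that search is omitted here.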

Videos

Provable Tempered Overfitting of Minimal Nets and Typical Nets · SlidesLive

Taxonomy

Topics: Scheduling and Optimization Algorithms · Neural Networks and Applications · Petri Nets in System Modeling