A Theoretical-Empirical Approach to Estimating Sample Complexity of DNNs

Devansh Bisla; Apoorva Nandini Saridena; Anna Choromanska

arXiv:2105.01867 · cs.LG · May 6, 2021

TL;DR

This paper introduces a method for estimating the generalization error of deep neural networks that does not rely on traditional capacity measures such as VC dimension. Instead, it assumes that the network reaches zero training error and that the probability of error on a test point is tied to data distance, and it validates the resulting estimate empirically.

Contribution

It proposes a novel empirical-theoretical approach to estimate DNN generalization error based on data distance assumptions, applicable to complex networks and validated on real datasets.

Findings

01

Estimates scale as O(1/(δ N^{1/d})) with training-set size N

02

Empirical validation shows good match with actual error behavior

03

Provides practical data requirements for deploying DNNs in safety-critical applications
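
The scaling in finding 01 can be inverted to get a rough data requirement for a target error. The sketch below is only an illustration under stated assumptions: it treats the estimate as exactly c / (δ N^{1/d}), where the constant `c`, the function names, and the example values of `d` and `delta` are hypothetical and not taken from the paper.

```python
import math


def estimated_error(n_samples, d, delta, c=1.0):
    """Error estimate of the assumed form c / (delta * N^(1/d)).

    c is a hypothetical proportionality constant; the O(.) notation
    in the finding does not specify it.
    """
    return c / (delta * n_samples ** (1.0 / d))


def samples_needed(target_error, d, delta, c=1.0):
    """Invert the assumed form: N >= (c / (delta * target_error))^d."""
    return math.ceil((c / (delta * target_error)) ** d)


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    for d in (2, 4, 8):
        n = samples_needed(target_error=0.1, d=d, delta=0.5)
        print(f"d={d}: about {n} samples for a target error of 0.1")
```

Under this assumed form, the required N grows as the d-th power of 1/(δ · target error), so for a fixed target the exponent d dominates the data requirement.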

Abstract

This paper focuses on understanding how the generalization error scales with the amount of the training data for deep neural networks (DNNs). Existing techniques in statistical learning require computation of capacity measures, such as VC dimension, to provably bound this error. It is however unclear how to extend these measures to DNNs and therefore the existing analyses are applicable to simple neural networks, which are not used in practice, e.g., linear or shallow ones or otherwise multi-layer perceptrons. Moreover, many theoretical error bounds are not empirically verifiable. We derive estimates of the generalization error that hold for deep networks and do not rely on unattainable capacity measures. The enabling technique in our approach hinges on two major assumptions: i) the network achieves zero training error, ii) the probability of making an error on a test point is…
