A Theoretical-Empirical Approach to Estimating Sample Complexity of DNNs
Devansh Bisla, Apoorva Nandini Saridena, Anna Choromanska

TL;DR
This paper introduces a method for estimating the generalization error of deep neural networks without relying on traditional capacity measures. It relies instead on assumptions that relate the probability of error to data distance, and it validates the resulting estimates empirically.
Contribution
It proposes a novel empirical-theoretical approach to estimate DNN generalization error based on data distance assumptions, applicable to complex networks and validated on real datasets.
Findings
Estimates scale as O(1/(\delta N^{1/d})) with training data size N
Empirical validation shows a good match with the observed error behavior
Provides practical data requirements for deploying DNNs in safety-critical applications
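
To make the scaling in the findings concrete, here is a minimal Python sketch of how an estimate of the form c / (\delta N^{1/d}) behaves as N grows, and how it could be inverted to get a rough data requirement. It is an illustration only, not the authors' procedure. The constant `c`, the function names, and the example values are all hypothetical. The summary does not define `delta` or `d`, so they are treated here as plain positive parameters.

```python
import math


def estimate_error(n, delta, d, c=1.0):
    """Hypothetical estimate c / (delta * n**(1/d)).

    c is an assumed proportionality constant; the O(.) notation in the
    summary does not specify it.
    """
    return c / (delta * n ** (1.0 / d))


def samples_for_target(target_error, delta, d, c=1.0):
    """Invert the estimate: roughly the smallest n whose estimate
    does not exceed target_error (up to floating-point rounding)."""
    return math.ceil((c / (delta * target_error)) ** d)


# Under this scaling, multiplying the data by 2**d halves the estimate.
# Here d = 3, so going from 1000 to 8000 samples halves it.
ratio = estimate_error(1000, delta=0.5, d=3) / estimate_error(8000, delta=0.5, d=3)

# Rough data requirement for a target estimate of 0.1 with delta = 0.5, d = 2.
n_needed = samples_for_target(0.1, delta=0.5, d=2)
```

The inversion shows why the exponent matters: under this form, the data needed to reach a fixed target grows as the d-th power of 1/(delta * target), so larger values of `d` make each further reduction in error much more expensive.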
Abstract
This paper focuses on understanding how the generalization error scales with the amount of the training data for deep neural networks (DNNs). Existing techniques in statistical learning require computation of capacity measures, such as VC dimension, to provably bound this error. It is however unclear how to extend these measures to DNNs and therefore the existing analyses are applicable to simple neural networks, which are not used in practice, e.g., linear or shallow ones or otherwise multi-layer perceptrons. Moreover, many theoretical error bounds are not empirically verifiable. We derive estimates of the generalization error that hold for deep networks and do not rely on unattainable capacity measures. The enabling technique in our approach hinges on two major assumptions: i) the network achieves zero training error, ii) the probability of making an error on a test point is…
