Providing theoretical learning guarantees to Deep Learning Networks
Rodrigo Fernandes de Mello, Martha Dais Ferreira, Moacir Antonelli Ponti

TL;DR
This paper uses Statistical Learning Theory to analyze the convergence and complexity of Deep Neural Networks, providing theoretical guarantees and insights into their learning capabilities and potential overfitting issues.
Contribution
It introduces a method to estimate the Shattering coefficient of deep networks, offering a theoretical framework for understanding their learning guarantees and biases.
Findings
Shattering coefficient estimation for CNNs like AlexNet and VGG16
Conditions under which deep networks can theoretically learn effectively
The gap between empirical risk minimization and model complexity
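To make the role of the Shattering coefficient concrete, the sketch below evaluates one commonly stated form of the uniform-convergence bound from Statistical Learning Theory, P(sup |R_emp(f) - R(f)| > eps) <= 2 N(F, 2n) exp(-n eps^2 / 4), where N(F, 2n) is the Shattering coefficient on 2n examples. The polynomial growth N(F, m) = m^degree and all function names are assumptions chosen for illustration; they are not the estimates the paper derives for AlexNet or VGG16.

```python
import math


def log_generalization_bound(log_shattering_2n: float, n: int, eps: float) -> float:
    """Natural log of 2 * N(F, 2n) * exp(-n * eps**2 / 4)."""
    return math.log(2.0) + log_shattering_2n - n * eps ** 2 / 4.0


def log_polynomial_shattering(m: int, degree: int) -> float:
    """Log of a HYPOTHETICAL Shattering coefficient N(F, m) = m**degree."""
    return degree * math.log(m)


def bound(n: int, eps: float, degree: int) -> float:
    """Probability bound, clipped to 1.0 when it is vacuous."""
    log_b = log_generalization_bound(
        log_polynomial_shattering(2 * n, degree), n, eps
    )
    # Work in log space so a huge Shattering coefficient does not overflow.
    return math.exp(log_b) if log_b < 0 else 1.0


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5, 10**6):
        print(n, bound(n, eps=0.1, degree=10))
```

Under this assumed polynomial growth the bound is vacuous for small samples and then shrinks toward zero as n grows, because log N(F, 2n) / n tends to zero. If the Shattering coefficient grew exponentially in n instead, the exponential term could not compensate and the bound would give no guarantee.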
Abstract
Deep Learning (DL) is one of the most common subjects when Machine Learning and Data Science approaches are considered. There are clearly two movements related to DL: the first aggregates researchers in quest to outperform other algorithms from literature, trying to win contests by considering often small decreases in the empirical risk; and the second investigates overfitting evidences, questioning the learning capabilities of DL classifiers. Motivated by such opposed points of view, this paper employs the Statistical Learning Theory (SLT) to study the convergence of Deep Neural Networks, with particular interest in Convolutional Neural Networks. In order to draw theoretical conclusions, we propose an approach to estimate the Shattering coefficient of those classification algorithms, providing a lower bound for the complexity of their space of admissible functions, a.k.a. algorithm…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms
