Providing theoretical learning guarantees to Deep Learning Networks

Rodrigo Fernandes de Mello; Martha Dais Ferreira; Moacir Antonelli Ponti

arXiv:1711.10292 · cs.LG · November 29, 2017 · 6 cites

Open Access

TL;DR

This paper uses Statistical Learning Theory to analyze the convergence and complexity of Deep Neural Networks, providing theoretical guarantees and insights into their learning capabilities and potential overfitting issues.

Contribution

It introduces a method to estimate the Shattering coefficient of deep networks, offering a theoretical framework for understanding their learning guarantees and biases.
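To make the role of the Shattering coefficient concrete, the sketch below evaluates the standard Vapnik-style uniform convergence bound from Statistical Learning Theory, P(sup_f |R_emp(f) - R(f)| > eps) <= 2 N(F, 2n) exp(-n eps^2 / 4), where N(F, m) is the Shattering coefficient of the function class F on m samples. This is a minimal illustration, not the paper's estimation procedure: the exact constants are assumed from the textbook form of the bound, and the helper names and the polynomial growth rate used in the example are hypothetical.

```python
import math


def vapnik_bound(n, eps, log_shattering):
    """Upper bound on P(sup_f |R_emp(f) - R(f)| > eps) for n i.i.d. samples.

    log_shattering(m) returns ln N(F, m), the natural log of the
    Shattering coefficient on m samples (working in log space avoids
    overflow for large coefficients). Assumes the textbook form
    2 * N(F, 2n) * exp(-n * eps**2 / 4).
    """
    log_bound = math.log(2.0) + log_shattering(2 * n) - n * eps ** 2 / 4.0
    # A probability bound above 1 carries no information, so clip it.
    if log_bound >= 0.0:
        return 1.0
    return math.exp(log_bound)


def polynomial_log_shattering(degree):
    """Hypothetical class whose Shattering coefficient grows as m**degree."""
    return lambda m: degree * math.log(m)


def exponential_log_shattering(m):
    """Class that shatters every sample: N(F, m) = 2**m."""
    return m * math.log(2.0)


if __name__ == "__main__":
    poly = polynomial_log_shattering(10)  # illustrative degree, not from the paper
    for n in (1_000, 10_000, 100_000):
        print(n, vapnik_bound(n, 0.1, poly), vapnik_bound(n, 0.1, exponential_log_shattering))
```

With polynomial growth the bound eventually drops below 1 and then decays exponentially as n increases, which is the sense in which learning is guaranteed. With exponential growth (every labeling of the sample is realizable) the bound never leaves 1, so empirical risk says nothing about expected risk regardless of sample size.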

Findings

01 Shattering coefficient estimation for CNNs such as AlexNet and VGG16

02 Conditions under which deep networks can theoretically learn effectively

03 The gap between empirical risk minimization and model complexity

Abstract

Deep Learning (DL) is one of the most common subjects when Machine Learning and Data Science approaches are considered. There are clearly two movements related to DL: the first gathers researchers in a quest to outperform other algorithms from the literature, trying to win contests through often small decreases in the empirical risk; the second investigates evidence of overfitting, questioning the learning capabilities of DL classifiers. Motivated by these opposing points of view, this paper employs Statistical Learning Theory (SLT) to study the convergence of Deep Neural Networks, with particular interest in Convolutional Neural Networks. In order to draw theoretical conclusions, we propose an approach to estimate the Shattering coefficient of those classification algorithms, providing a lower bound for the complexity of their space of admissible functions, a.k.a. algorithm…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms