Norm-based generalisation bounds for multi-class convolutional neural networks

Antoine Ledent; Waleed Mustafa; Yunwen Lei; Marius Kloft

arXiv:1905.12430·cs.LG·February 23, 2021·5 cites

Open Access

TL;DR

This paper derives new generalisation error bounds for deep convolutional neural networks that depend only logarithmically on the number of classes and incorporate weight sharing, improving theoretical understanding of CNNs.

Contribution

The authors derive generalisation bounds for CNNs that have no explicit dependence on the number of classes beyond logarithmic factors. They adapt the classic Rademacher analysis to account for weight sharing, and give further refinements that exploit pooling and sparse connections.

Findings

01. Bounds depend on weight norms, not parameter count
02. Bounds are asymptotically tight near initialization
03. Incorporates weight sharing and pooling effects

Abstract

We show generalisation error bounds for deep learning with two main improvements over the state of the art. (1) Our bounds have no explicit dependence on the number of classes except for logarithmic factors. This holds even when formulating the bounds in terms of the $L^{2}$-norm of the weight matrices, where previous bounds exhibit at least a square-root dependence on the number of classes. (2) We adapt the classic Rademacher analysis of DNNs to incorporate weight sharing -- a task of fundamental theoretical importance which was previously attempted only under very restrictive assumptions. In our results, each convolutional filter contributes only once to the bound, regardless of how many times it is applied. Further improvements exploiting pooling and sparse connections are provided. The presented bounds scale as the norms of the parameter matrices, rather than the number of parameters.…
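To make the weight-sharing point concrete, here is a minimal NumPy sketch. It is an illustration only and is not taken from the paper: the helper `conv_as_matrix`, the filter size, and the input length are all hypothetical choices. It compares the norm of a single 1D filter with the Frobenius norm of the dense matrix that applies the same filter at every position, which is roughly the quantity a bound would see if it treated the layer as fully connected and ignored weight sharing.

```python
import numpy as np

def conv_as_matrix(w, input_len):
    """Dense matrix of a stride-1, unpadded 1D cross-correlation with filter w.

    Hypothetical helper for illustration; each row holds one shifted copy of w.
    """
    k = len(w)
    n_out = input_len - k + 1
    M = np.zeros((n_out, input_len))
    for i in range(n_out):
        M[i, i:i + k] = w
    return M

rng = np.random.default_rng(0)
w = rng.normal(size=3)           # a single filter with 3 weights (assumed size)
input_len = 32                   # assumed input length
M = conv_as_matrix(w, input_len)

filter_norm = np.linalg.norm(w)  # L2 norm of the filter, counted once
matrix_norm = np.linalg.norm(M)  # Frobenius norm of the unrolled layer
n_applications = M.shape[0]      # number of positions the filter is applied at

# The unrolled norm grows like sqrt(n_applications) times the filter norm.
ratio = matrix_norm / filter_norm
print(n_applications, ratio, np.sqrt(n_applications))
```

In this toy setting the unrolled matrix norm exceeds the filter norm by a factor of the square root of the number of applications, which gives a sense of what is saved when a filter contributes to the bound only once. The actual bounds in the paper involve further terms that this sketch does not model.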

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques