Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent

Michael Kohler; Adam Krzyzak; Benjamin Walter

arXiv:2405.07619 · stat.ML · May 14, 2024 · 1 citation

Open Access

TL;DR

This paper analyzes how quickly the misclassification risk of an over-parametrized convolutional neural network image classifier, trained by gradient descent, approaches the minimal possible value, and it derives a theoretical bound on that rate of convergence.

Contribution

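The paper introduces a convolutional neural network estimate with a global average-pooling layer, whose weights are learned by gradient descent, and derives a bound on the rate at which its misclassification risk converges to the minimal possible value.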
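
For orientation, the quantity whose rate of convergence is bounded can be written as an excess misclassification risk. The notation below is an assumption made for illustration and is not taken from this page: $(X, Y)$ denotes a random image and its class label, $\mathcal{D}_n$ a training sample of size $n$, and $f_n$ the classifier built from $\mathcal{D}_n$.

```latex
\mathbf{P}\{ f_n(X) \neq Y \mid \mathcal{D}_n \}
\;-\;
\min_{f : \mathcal{X} \to \mathcal{Y}} \mathbf{P}\{ f(X) \neq Y \}
```

The first term is the misclassification risk of the learned classifier, and the second is the minimal possible value over all classifiers. A rate of convergence result bounds how fast this difference shrinks as $n$ grows.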

Findings

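01

Derives a bound on the rate of convergence of the misclassification risk toward the minimal possible value

02

Offers insight into learning with over-parametrized CNNs whose weights are learned by gradient descent

03

Improves understanding of how CNN image classifiers with a global average-pooling layer behave under gradient-descent training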

Abstract

Image classification based on over-parametrized convolutional neural networks with a global average-pooling layer is considered. The weights of the network are learned by gradient descent. A bound on the rate of convergence of the difference between the misclassification risk of the newly introduced convolutional neural network estimate and the minimal possible value is derived.
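
The setting described in the abstract can be illustrated with a small NumPy sketch: one convolutional layer, a global average-pooling layer, a linear output, and full-batch gradient descent. This is not the estimate defined in the paper. The toy data, the sigmoid activation, the logistic loss, the random initialization, the step size, and all function names (`make_toy_images`, `train`, `predict`, and so on) are assumptions chosen only to keep the example short and runnable.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def make_toy_images(n=64, size=8, seed=0):
    """Hypothetical data: noise images; class 1 also contains a bright 3x3 square."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 0.3, size=(n, size, size))
    y = np.arange(n) % 2  # balanced labels in {0, 1}
    for i in np.flatnonzero(y):
        r, c = rng.integers(0, size - 2, size=2)
        X[i, r:r + 3, c:c + 3] += 0.7
    return X, y.astype(float)


def extract_patches(X, k):
    """All k x k windows of each image, flattened to shape (n, P, k*k)."""
    windows = sliding_window_view(X, (k, k), axis=(1, 2))
    return windows.reshape(X.shape[0], -1, k * k)


def forward(params, patches):
    W, b, v = params
    act = sigmoid(patches @ W.T + b)  # convolution + activation: (n, P, K)
    pooled = act.mean(axis=1)         # global average pooling: (n, K)
    z = pooled @ v                    # linear output layer: (n,)
    return act, pooled, z


def logistic_loss(z, y):
    return float(np.mean(np.logaddexp(0.0, z) - y * z))


def train(X, y, num_filters=16, k=3, steps=200, lr=0.1, seed=0):
    """Full-batch gradient descent on the empirical logistic loss (assumed loss)."""
    rng = np.random.default_rng(seed)
    patches = extract_patches(X, k)
    n, P, d = patches.shape
    W = rng.normal(0.0, 0.5, size=(num_filters, d))  # assumed initialization
    b = np.zeros(num_filters)
    v = rng.normal(0.0, 0.1, size=num_filters)
    losses = []
    for _ in range(steps):
        act, pooled, z = forward((W, b, v), patches)
        losses.append(logistic_loss(z, y))
        dz = (sigmoid(z) - y) / n                     # d loss / d z
        grad_v = pooled.T @ dz
        # Backpropagate through the pooling mean (factor 1/P) and the sigmoid.
        dpre = (dz[:, None, None] * v / P) * act * (1.0 - act)
        grad_W = np.einsum("npk,npd->kd", dpre, patches)
        grad_b = dpre.sum(axis=(0, 1))
        W, b, v = W - lr * grad_W, b - lr * grad_b, v - lr * grad_v
    losses.append(logistic_loss(forward((W, b, v), patches)[2], y))
    return (W, b, v), losses


def predict(params, X):
    """Plug-in rule: predict class 1 when the estimated probability is >= 1/2."""
    k = int(round(np.sqrt(params[0].shape[1])))
    z = forward(params, extract_patches(X, k))[2]
    return (z >= 0.0).astype(int)


X, y = make_toy_images()
params, losses = train(X, y)
print(f"empirical loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
print("training error:", np.mean(predict(params, X) != y))
```

With the defaults above the network has more trainable weights (176) than training images (64), which loosely mirrors the over-parametrized regime. The printed training error is an empirical quantity on the toy sample, whereas the paper's bound concerns the misclassification risk on new data.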

Taxonomy

Topics: Neural Networks and Applications