Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent
Michael Kohler, Adam Krzyzak, Benjamin Walter

TL;DR
This paper analyzes how quickly the misclassification risk of over-parametrized convolutional neural networks trained by gradient descent approaches the minimal possible value, and derives a theoretical bound on this rate of convergence.
Contribution
It introduces an over-parametrized CNN estimate with a global average-pooling layer, learned by gradient descent, and derives a theoretical bound on its rate of convergence for image classification.
Findings
Derives a bound on the rate at which the misclassification risk of the estimate converges to the minimal possible value
Provides insights into the learning dynamics of over-parametrized CNNs
Enhances understanding of training efficiency for CNN classifiers
Abstract
Image classification based on over-parametrized convolutional neural networks with a global average-pooling layer is considered. The weights of the network are learned by gradient descent. A bound on the rate of convergence of the difference between the misclassification risk of the newly introduced convolutional neural network estimate and the minimal possible value is derived.
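The quantity bounded in the abstract can be written out explicitly. The following is a sketch in standard statistical-learning notation, not the paper's own: the symbols X, Y, D_n, f_n, f^* and eta, as well as the restriction to two classes, are assumptions made here for illustration.

```latex
% Assumed notation (not taken from the source):
%   (X, Y)         random image and its class label, here Y in {0, 1}
%   \mathcal{D}_n  training sample of size n
%   f_n            classifier obtained from the trained CNN estimate
%
% Difference between the misclassification risk of the estimate
% and the minimal possible value:
\[
  \mathbf{P}\{ f_n(X) \neq Y \mid \mathcal{D}_n \}
  \;-\;
  \min_{f} \mathbf{P}\{ f(X) \neq Y \}
\]
% The minimum is taken over all measurable classifiers f and is attained
% by the Bayes classifier
\[
  f^*(x) =
  \begin{cases}
    1, & \eta(x) > 1/2, \\
    0, & \text{otherwise,}
  \end{cases}
  \qquad
  \eta(x) = \mathbf{P}\{ Y = 1 \mid X = x \}.
\]
```

A rate-of-convergence result of the kind described bounds how fast this difference tends to zero as the sample size n grows.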
Taxonomy
Topics: Neural Networks and Applications
