A Margin-based Multiclass Generalization Bound via Geometric Complexity

Michael Munn; Benoit Dherin; Javier Gonzalvo

arXiv:2405.18590·stat.ML·May 30, 2024

TL;DR

This paper derives a new margin-based multiclass generalization bound for neural networks built on geometric complexity, and examines it empirically for a ResNet-18 trained with SGD on CIFAR-10 and CIFAR-100.

Contribution

The paper introduces an upper bound on the generalization error that scales with the margin-normalized geometric complexity of the network and holds for a broad family of data distributions and model classes.

Findings

01 Bound scales with margin-normalized geometric complexity

02 Empirical investigation on ResNet-18 with CIFAR-10 and CIFAR-100

03 Evaluated with both original and random labels

Abstract

There has been considerable effort to better understand the generalization capabilities of deep neural networks, both as a means to unlock a theoretical understanding of their success and as a source of directions for further improvements. In this paper, we investigate margin-based multiclass generalization bounds for neural networks that rely on a recent complexity measure developed for neural networks, the geometric complexity. We derive a new upper bound on the generalization error which scales with the margin-normalized geometric complexity of the network and which holds for a broad family of data distributions and model classes. Our generalization bound is empirically investigated for a ResNet-18 model trained with SGD on the CIFAR-10 and CIFAR-100 datasets with both original and random labels.
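The two ingredients named in the abstract, geometric complexity and the multiclass margin, can be illustrated with a minimal sketch. It assumes the definition used in earlier work on geometric complexity (the mean squared Frobenius norm of the model's input-output Jacobian over a dataset) and the standard multiclass margin (true-class score minus the best competing score). The helper names, the finite-difference Jacobian, and the toy linear model are hypothetical choices for illustration. Whether the Jacobian is taken on logits or probabilities, and exactly how the margin normalizes the complexity in the bound, are not stated in this summary and are not reproduced here.

```python
import numpy as np

def jacobian_fd(f, x, eps=1e-5):
    """Central finite-difference Jacobian of f at x, shape (out_dim, in_dim).

    A stand-in for automatic differentiation; hypothetical helper.
    """
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        cols.append((f(x + step) - f(x - step)) / (2 * eps))
    return np.stack(cols, axis=1)

def geometric_complexity(f, X):
    """Mean squared Frobenius norm of the Jacobian of f over the rows of X.

    Assumed definition, following prior work on geometric complexity.
    """
    return float(np.mean([np.sum(jacobian_fd(f, x) ** 2) for x in X]))

def multiclass_margins(scores, labels):
    """Per-example margin: true-class score minus the best other score."""
    scores = np.asarray(scores, dtype=float)
    idx = np.arange(len(labels))
    true_scores = scores[idx, labels]
    masked = scores.copy()
    masked[idx, labels] = -np.inf  # exclude the true class from the max
    return true_scores - masked.max(axis=1)

# Toy 3-class linear "network" on 2-D inputs (illustrative only).
W = np.array([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
f = lambda x: W @ x

X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([0, 1])

gc = geometric_complexity(f, X)
margins = multiclass_margins(np.stack([f(x) for x in X]), y)
```

For a linear model the Jacobian is the weight matrix at every input, so the geometric complexity here is just the squared Frobenius norm of `W`. For a real network such as the ResNet-18 mentioned above, the finite-difference helper would be replaced by an automatic-differentiation Jacobian evaluated on a batch.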

Taxonomy

Topics: Rough Sets and Fuzzy Logic

Methods: Stochastic Gradient Descent