What does a deep neural network confidently perceive? The effective dimension of high certainty class manifolds and their low confidence boundaries

Stanislav Fort; Ekin Dogus Cubuk; Surya Ganguli; Samuel S. Schoenholz

arXiv:2210.05546·cs.LG·October 12, 2022


TL;DR

This paper investigates the geometry of class manifolds in deep neural networks, revealing that higher-performing models tend to have higher-dimensional class regions, which impacts generalization and robustness.

Contribution

It introduces a tractable method to estimate the effective dimension of class manifolds using Gaussian width and Gordon's escape theorem, linking geometry to model performance.
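
As a rough illustration of the Gaussian width mentioned above, the sketch below estimates it by Monte Carlo for a toy set, using one common definition, w(S) = E sup_{x in S} <g, x> with g a standard Gaussian vector. The toy set (the unit sphere of a k-dimensional coordinate subspace) and the function name are assumptions made for this example and are not taken from the paper. For this set the supremum has a closed form, so the estimate should land close to sqrt(k), which is why the squared width behaves like a dimension.

```python
import numpy as np


def gaussian_width_subspace_sphere(D, k, n_samples=2000, seed=0):
    """Monte Carlo estimate of w(S) = E sup_{x in S} <g, x>.

    S is a toy set chosen for illustration: the unit sphere inside the
    k-dimensional coordinate subspace span(e_1, ..., e_k) of R^D.
    """
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n_samples, D))
    # For this S, the supremum of <g, x> over unit vectors x supported on the
    # first k coordinates is the Euclidean norm of g's first k coordinates.
    sup_values = np.linalg.norm(g[:, :k], axis=1)
    return float(sup_values.mean())


if __name__ == "__main__":
    w = gaussian_width_subspace_sphere(D=100, k=30)
    print("width:", w, "width^2:", w ** 2, "k:", 30)
```

The squared width of this toy set tracks k rather than the ambient dimension D, which is the sense in which Gaussian width can act as an effective dimension.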

Findings

01 Higher-performing models have higher-dimensional class manifolds.

02 Model robustness correlates with increased class manifold dimension.

03 Ensembling effects can be understood through intersections of class manifolds.

Abstract

Deep neural network classifiers partition input space into high confidence regions for each class. The geometry of these class manifolds (CMs) is widely studied and intimately related to model performance; for example, the margin depends on CM boundaries. We exploit the notions of Gaussian width and Gordon's escape theorem to tractably estimate the effective dimension of CMs and their boundaries through tomographic intersections with random affine subspaces of varying dimension. We show several connections between the dimension of CMs, generalization, and robustness. In particular we investigate how CM dimension depends on 1) the dataset, 2) architecture (including ResNet, WideResNet and Vision Transformer), 3) initialization, 4) stage of training, 5) class, 6) network width, 7) ensemble size, 8) label randomization, 9) training set size, and 10) robustness to data corruption. Together a…
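
The idea of probing a set with random affine subspaces of varying dimension can be sketched on a toy problem. The example below is only an illustration under strong assumptions and is not the paper's procedure: the "class manifold" is replaced by a known k-dimensional coordinate subspace of R^D, and the search for a high-confidence point inside each random subspace is replaced by a least-squares solve. All function names and parameters are hypothetical. The smallest subspace dimension d* at which intersections become reliable gives a dimension estimate of D - d*.

```python
import numpy as np


def hits_manifold(D, k, d, rng, tol=1e-8):
    """Check whether a random d-dim affine subspace meets the toy set
    S = {x in R^D : x[k:] == 0}, a k-dimensional linear subspace (k < D)."""
    x0 = rng.standard_normal(D)          # random offset of the affine subspace
    M = rng.standard_normal((D, d))      # random basis directions
    # A point x0 + M @ theta lies in S iff M[k:] @ theta == -x0[k:].
    A, b = M[k:], -x0[k:]
    theta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return bool(np.linalg.norm(A @ theta - b) < tol)


def hit_probability(D, k, d, trials=20, seed=0):
    """Fraction of random d-dim affine subspaces that intersect S."""
    rng = np.random.default_rng(seed)
    return float(np.mean([hits_manifold(D, k, d, rng) for _ in range(trials)]))


def estimate_dimension(D, k, threshold=0.5):
    """Estimate dim(S) as D - d*, where d* is the smallest subspace
    dimension whose hit probability reaches the threshold."""
    for d in range(1, D + 1):
        if hit_probability(D, k, d) >= threshold:
            return D - d
    return 0


if __name__ == "__main__":
    print(estimate_dimension(D=50, k=30))
```

In this toy setting the transition is sharp, because a linear system with D - k equations is generically solvable only once d reaches D - k. For a trained classifier the set is not known in closed form, so the least-squares step would have to be replaced by an optimization over the subspace coordinates, and the transition would be expected to be softer.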


Code & Models

Repositories

stanislavfort/slice-dice-optimize (Official)


Taxonomy

Topics: Advanced Neural Network Applications · AI in cancer detection · Medical Imaging and Analysis

Methods: Average Pooling · Max Pooling · Residual Connection · Dropout · 1x1 Convolution · Bottleneck Residual Block · Batch Normalization · Convolution · Wide Residual Block