A separability-based approach to quantifying generalization: which layer is best?
Luciano Dyballa, Evan Gerritz, Steven W. Zucker

TL;DR
This paper introduces a new method to evaluate which layers of deep neural networks best generalize to unseen data by analyzing latent embeddings, revealing that deeper layers do not always have superior generalization capacity.
Contribution
The authors propose a novel approach to quantify layer-wise generalization in deep networks using latent embeddings, applicable in both supervised and unsupervised settings.
Findings
High classification accuracy does not guarantee high generalization.
Deeper layers do not always generalize better, which has implications for pruning strategies.
The method consistently reveals intrinsic generalization capacities across datasets.
Abstract
Generalization to unseen data remains poorly understood for deep learning classification and foundation models, especially in the open set scenario. How can one assess the ability of networks to adapt to new or extended versions of their input space in the spirit of few-shot learning, out-of-distribution generalization, domain adaptation, and category discovery? Which layers of a network are likely to generalize best? We provide a new method for evaluating the capacity of networks to represent a sampled domain, regardless of whether the network has been trained on all classes in that domain. Our approach is the following: after fine-tuning state-of-the-art pre-trained models for visual classification on a particular domain, we assess their performance on data from related but distinct variations in that domain. Generalization power is quantified as a function of the latent embeddings of…
Taxonomy
Topics: Neural Networks and Applications
Methods: Sparse Evolutionary Training
