Feature diversity in self-supervised learning
Pranshu Malviya, Arjun Vaithilingam Sudhakar

TL;DR
This paper investigates how feature diversity impacts the generalization performance of self-supervised CNN models, revealing that diversity peaks in the last layer and is influenced by model width and training epochs.
Contribution
It provides new insights into the relationship between feature diversity and model generalization in self-supervised learning, especially regarding layer-wise diversity and model width effects.
Findings
Last layer exhibits the highest feature diversity during training.
Model width positively correlates with feature diversity.
Diversity decreases as test error improves over epochs.
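The findings above compare feature diversity across layers, widths, and epochs, but this summary does not say how diversity is measured. As a minimal sketch only, the snippet below assumes a stand-in metric, the mean pairwise cosine distance between feature vectors; the paper's actual metric may differ. The function name `feature_diversity` and the input layout (one row per feature vector) are hypothetical choices made for illustration.

```python
import numpy as np


def feature_diversity(features: np.ndarray) -> float:
    """Mean pairwise cosine distance between the rows of `features`.

    ASSUMPTION: this is an illustrative stand-in metric, not necessarily
    the diversity measure used in the paper.
    `features` has shape (n_vectors, dim), one feature vector per row.
    """
    features = np.asarray(features, dtype=np.float64)
    n = features.shape[0]
    if features.ndim != 2 or n < 2:
        raise ValueError("expected a 2-D array with at least two rows")

    # Normalize each row to unit length, guarding against zero vectors.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)

    # Cosine similarity between every pair of rows.
    cos = unit @ unit.T

    # Average the cosine distance over distinct pairs (upper triangle).
    rows, cols = np.triu_indices(n, k=1)
    return float(np.mean(1.0 - cos[rows, cols]))
```

Under this assumed metric, identical vectors score 0 and mutually orthogonal vectors score 1. To compare layers, one could flatten each layer's activations to shape (n_vectors, dim) and call the function once per layer and per epoch.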
Abstract
Many studies on scaling laws consider basic factors such as model size, model shape, dataset size, and compute power. These factors are easily tunable and represent the fundamental elements of any machine learning setup. Researchers have also used more complex factors to estimate test error and generalization performance with high predictability. These factors are generally specific to a domain or application; for example, Chen et al. (2021) used feature diversity primarily to promote syn-to-real transfer. Given the many scaling factors defined in previous work, it is worth investigating how they affect overall generalization performance in self-supervised learning with CNN models. How do individual factors, such as depth, width, or the number of training epochs with early stopping, promote generalization?…
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Data Classification
Methods: Test
