On the Relationship Between Representation Geometry and Generalization in Deep Neural Networks
Sumit Yadav

TL;DR
This paper demonstrates that the effective dimension, an unsupervised geometric metric, strongly predicts neural network performance across various models, tasks, and noise conditions, establishing a causal relationship between representation geometry and accuracy.
Contribution
It introduces effective dimension as a domain-agnostic, label-free predictor of neural network performance and shows its causal influence through noise experiments across multiple architectures.
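This summary does not give the formula used for effective dimension. As a minimal sketch, the example below assumes one common definition, the participation ratio of the feature covariance spectrum, (sum of eigenvalues)^2 / (sum of squared eigenvalues). The function name `effective_dimension` and the input layout (rows are samples, columns are features) are illustrative assumptions, not the paper's code.

```python
import numpy as np


def effective_dimension(features):
    """Participation ratio of the covariance spectrum (assumed definition).

    features: array of shape (n_samples, n_features), e.g. activations
    collected from one layer. No labels are required.
    """
    x = np.asarray(features, dtype=np.float64)
    # Center the features so the spectrum reflects variance, not the mean.
    x = x - x.mean(axis=0, keepdims=True)
    # Squared singular values are proportional to covariance eigenvalues;
    # the proportionality constant cancels in the ratio below.
    singular_values = np.linalg.svd(x, compute_uv=False)
    eigenvalues = singular_values ** 2
    total = eigenvalues.sum()
    if total == 0.0:
        return 0.0  # constant features carry no variance
    return float(total ** 2 / np.sum(eigenvalues ** 2))


# Variance spread evenly over 3 axes versus variance on a single axis.
isotropic = np.vstack([np.eye(3), -np.eye(3)])
rank_one = np.outer(np.arange(10.0), [1.0, 2.0, 3.0])
print(effective_dimension(isotropic), effective_dimension(rank_one))
```

Under this definition the value ranges from 1 (all variance on one direction) up to the number of features (variance spread evenly), which is why it can be read as a count of the directions a representation actually uses.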
Findings
Effective dimension predicts accuracy with high correlation across models and tasks.
Degrading geometry with noise reduces accuracy, while improving geometry via PCA preserves accuracy.
The relationship holds across different noise types and is independent of model size.
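The PCA intervention reported for this paper retains 95% of the variance in the representation. A minimal sketch of that kind of projection is shown below; the function name `pca_project`, the NumPy implementation, and the toy data are assumptions for illustration, and the paper's actual procedure may differ (for example, in which layer is projected).

```python
import numpy as np


def pca_project(features, variance_kept=0.95):
    """Project features onto the fewest principal components whose
    cumulative explained variance reaches `variance_kept`.

    Assumes the features have nonzero total variance.
    Returns (projected_features, number_of_components_kept).
    """
    x = np.asarray(features, dtype=np.float64)
    mean = x.mean(axis=0, keepdims=True)
    centered = x - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    # First index where the cumulative ratio reaches the threshold.
    k = int(np.searchsorted(explained, variance_kept)) + 1
    k = min(k, len(s))
    basis = vt[:k]
    # Project onto the kept subspace and map back to the original space.
    projected = centered @ basis.T @ basis + mean
    return projected, k


# Toy data: almost all variance lies along the first axis.
toy = np.array([[10.0, 0.0], [-10.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
projected, kept = pca_project(toy, variance_kept=0.95)
print(kept)
```

Mapping the projection back into the original feature space keeps the shape unchanged, so downstream layers can consume the result without modification.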
Abstract
We investigate the relationship between representation geometry and neural network performance. Analyzing 52 pretrained ImageNet models across 13 architecture families, we show that effective dimension -- an unsupervised geometric metric -- strongly predicts accuracy. Output effective dimension achieves partial r=0.75 after controlling for model capacity, while total compression achieves partial r=-0.72. These findings replicate across ImageNet and CIFAR-10, and generalize to NLP: effective dimension predicts performance for 8 encoder models on SST-2/MNLI and 15 decoder-only LLMs on AG News (r=0.69, p=0.004), while model size does not (r=0.07). We establish bidirectional causality: degrading geometry via noise causes accuracy loss (r=-0.94), while improving geometry via PCA maintains accuracy across architectures (-0.03pp at 95% variance). This…
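The partial correlations quoted above control for model capacity. One standard way to compute a partial correlation is to regress both variables on the control and correlate the residuals. The sketch below assumes a single control variable (for instance log parameter count, which is an assumption here, not something the abstract specifies); the function name `partial_correlation` and the toy numbers are hypothetical.

```python
import numpy as np


def partial_correlation(x, y, control):
    """Pearson correlation between x and y after linearly removing
    the effect of `control` (plus an intercept) from both."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    z = np.asarray(control, dtype=np.float64)
    design = np.column_stack([np.ones_like(z), z])
    # Residuals of x and y after an ordinary least-squares fit on the control.
    resid_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    resid_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(resid_x, resid_y)[0, 1])


# Hypothetical toy data: both variables depend on capacity, and they also
# share a component that capacity does not explain.
capacity = np.arange(8.0)
shared = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
eff_dim = capacity + shared
accuracy = 2.0 * capacity + shared
print(partial_correlation(eff_dim, accuracy, capacity))
```

A raw correlation between the two toy variables would be dominated by their shared dependence on capacity; the residual-based version isolates the part of the relationship that capacity cannot account for.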
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques
