On genuine invariance learning without weight-tying
Artem Moskalev, Anna Sepliarskaia, Erik J. Bekkers, Arnold Smeulders

TL;DR
This paper compares the invariance that neural networks learn from data with the genuine invariance obtained through weight-tying. It shows that learned invariance depends on the input data and proposes regularization that steers training toward genuine invariance that holds under input distribution shifts.
Contribution
It introduces metrics to quantify learned invariance and demonstrates regularization techniques that promote genuine invariance comparable to weight-tying models.
Findings
Learned invariance is data-dependent and unreliable under distribution shifts.
Regularization can guide networks to achieve genuine invariance similar to weight-tying.
A spectral decay phenomenon reveals how networks reduce sensitivity to specific transformations.
Abstract
In this paper, we investigate properties and limitations of invariance learned by neural networks from the data compared to the genuine invariance achieved through invariant weight-tying. To do so, we adopt a group theoretical perspective and analyze invariance learning in neural networks without weight-tying constraints. We demonstrate that even when a network learns to correctly classify samples on a group orbit, the underlying decision-making in such a model does not attain genuine invariance. Instead, learned invariance is strongly conditioned on the input data, rendering it unreliable if the input distribution shifts. We next demonstrate how to guide invariance learning toward genuine invariance by regularizing the invariance of a model during training. To this end, we propose several metrics to quantify learned invariance: (i) predictive distribution invariance, (ii) logit…
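To make the distinction between an unconstrained model and a weight-tied one concrete, here is a minimal sketch in NumPy. It is not the paper's code or one of its metrics: the group (90-degree rotations of a square image), the linear classifier, and the names `c4_orbit`, `orbit_logit_gap`, and `tie_weights` are all assumptions chosen for illustration. The sketch measures how far the logits move along a group orbit, first for free weights and then for weights averaged over the group.

```python
import numpy as np

def c4_orbit(x):
    """Return the four 90-degree rotations of a square array x."""
    return [np.rot90(x, k) for k in range(4)]

def logits(W, x):
    """Toy linear classifier (an assumption): W has shape (classes, H*W)."""
    return W @ x.reshape(-1)

def orbit_logit_gap(W, x):
    """Largest deviation of any logit from its mean over the orbit of x.

    Hypothetical illustrative score; 0 means the logits do not change
    along the orbit of this particular input.
    """
    z = np.stack([logits(W, gx) for gx in c4_orbit(x)])
    return float(np.abs(z - z.mean(axis=0)).max())

def tie_weights(W, side):
    """Average each class filter over its own orbit (one form of weight-tying)."""
    filters = W.reshape(-1, side, side)
    tied = np.stack([np.mean(c4_orbit(f), axis=0) for f in filters])
    return tied.reshape(W.shape[0], -1)

rng = np.random.default_rng(0)
side, classes = 8, 3
W_free = rng.normal(size=(classes, side * side))  # unconstrained weights
W_tied = tie_weights(W_free, side)                # group-averaged weights
x = rng.normal(size=(side, side))                 # arbitrary probe input

gap_free = orbit_logit_gap(W_free, x)
gap_tied = orbit_logit_gap(W_tied, x)
print(f"free weights gap: {gap_free:.3e}")
print(f"tied weights gap: {gap_tied:.3e}")
```

With group-averaged weights the gap is zero up to floating-point error for any input, including inputs unlike the training data. With free weights the gap depends on the input, so a small value on one dataset says nothing about inputs outside it.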
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Neural Networks and Applications
