Do Invariances in Deep Neural Networks Align with Human Perception?
Vedant Nanda, Ayan Majumdar, Camila Kolling, John P. Dickerson, Krishna P. Gummadi, Bradley C. Love, Adrian Weller

TL;DR
This paper evaluates how well the invariances learned by deep neural networks align with human perception. It shows that the loss function used to generate identically represented inputs (IRIs) strongly affects the conclusions drawn, and it identifies model components that improve human-like invariance learning.
Contribution
It introduces an adversarial regularizer for IRI generation, clarifies the influence of loss functions, and investigates model components that enhance human-like invariance alignment.
Findings
Residual architectures with contrastive loss improve alignment with human invariances.
Regularizer-free IRI generation provides more meaningful model comparisons.
Adversarial data augmentation influences the invariance properties of models.
Abstract
An evaluation criterion for safe and trustworthy deep learning is how well the invariances captured by representations of deep neural networks (DNNs) are shared with humans. We identify challenges in measuring these invariances. Prior works used gradient-based methods to generate identically represented inputs (IRIs), i.e., inputs which have identical representations (on a given layer) of a neural network, and thus capture invariances of a given network. One necessary criterion for a network's invariances to align with human perception is for its IRIs to look 'similar' to humans. Prior works, however, have mixed takeaways; some argue that later layers of DNNs do not learn human-like invariances (\cite{jenelle2019metamers}) yet others seem to indicate otherwise (\cite{mahendran2014understanding}). We argue that the loss function used to generate IRIs can heavily affect takeaways about…
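The gradient-based IRI generation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy single-layer "network" `h = tanh(W x)` in place of a DNN, a plain squared L2 distance in representation space as the loss (no regularizer), and hypothetical names (`represent`, `rep_loss`, `generate_iri`). Starting from a different seed input, it runs gradient descent on the input itself until its representation approaches that of the target.

```python
import math
import random


def represent(W, x):
    """Toy stand-in for one DNN layer: h = tanh(W x)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]


def rep_loss(W, x, target_rep):
    """Squared L2 distance between x's representation and the target's."""
    return sum((h - t) ** 2 for h, t in zip(represent(W, x), target_rep))


def generate_iri(W, target, seed, lr=0.02, steps=5000):
    """Gradient descent on the input (not the weights), starting from seed."""
    target_rep = represent(W, target)
    x = list(seed)
    for _ in range(steps):
        h = represent(W, x)
        # Chain rule through tanh: dL/dz_i = 2 (h_i - t_i) (1 - h_i^2)
        delta = [2 * (hi - ti) * (1 - hi * hi) for hi, ti in zip(h, target_rep)]
        # dL/dx_j = sum_i delta_i * W[i][j]
        grad = [sum(d * row[j] for d, row in zip(delta, W)) for j in range(len(x))]
        x = [xj - lr * g for xj, g in zip(x, grad)]
    return x


rng = random.Random(0)
# Assumed toy sizes: 6-dimensional inputs mapped to a 3-dimensional representation.
W = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(3)]
target = [rng.uniform(-1, 1) for _ in range(6)]
seed = [rng.uniform(-1, 1) for _ in range(6)]

target_rep = represent(W, target)
initial_loss = rep_loss(W, seed, target_rep)
iri = generate_iri(W, target, seed)
final_loss = rep_loss(W, iri, target_rep)

# Distance in input space: the IRI can stay far from the target even
# when the two are (nearly) indistinguishable in representation space.
input_dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(iri, target)))
print(f"representation loss: {initial_loss:.4f} -> {final_loss:.6f}")
print(f"input-space distance between IRI and target: {input_dist:.4f}")
```

Because the toy layer maps six input dimensions to three, many distinct inputs share one representation, and the choice of seed decides which of them the descent reaches. Changing the loss in `rep_loss`, for example by adding a regularizer, would change which IRI is found, which is the kind of sensitivity the abstract points to.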
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
Methods: ALIGN
