The Effect of Learning Strategy versus Inherent Architecture Properties on the Ability of Convolutional Neural Networks to Develop Transformation Invariance
Megha Srivastava, Kalanit Grill-Spector

TL;DR
This study investigates how learning strategies and inherent architecture properties influence CNNs' ability to develop invariance to spatial transformations, highlighting pre-training as a key factor for robustness.
Contribution
It provides a comprehensive empirical analysis of seven CNN architectures, emphasizing the impact of pre-training and architecture on transformation invariance.
Findings
Pre-training on large datasets significantly improves invariance.
VGG and ResNet architectures show higher robustness than AlexNet.
Learning strategy and architecture both influence CNNs' transformation invariance.
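A toy sketch can make the notion of "transformation invariance" in these findings concrete. It is not the paper's metric or models: the feature functions, the circular horizontal shift, and the cosine-similarity score are all assumptions chosen for illustration, and every name below is hypothetical. The idea is to compare a representation of an image with the representation of its shifted copies; a score near 1 means the representation barely changes under the transformation.

```python
import math


def shift_right(image, dx):
    """Circularly shift every row of a 2D list image by dx pixels."""
    width = len(image[0])
    dx %= width
    return [row[width - dx:] + row[:width - dx] for row in image]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def flat_features(image):
    """Hypothetical position-sensitive representation: raw pixels, flattened."""
    return [pixel for row in image for pixel in row]


def pooled_features(image):
    """Hypothetical pooled representation: one sum per row, which discards
    horizontal position and is unchanged by a circular horizontal shift."""
    return [sum(row) for row in image]


def invariance_score(features, image, shifts):
    """Mean cosine similarity between features of the image and of its shifts."""
    base = features(image)
    sims = [cosine_similarity(base, features(shift_right(image, dx)))
            for dx in shifts]
    return sum(sims) / len(sims)


# A 4x4 toy image containing a small 2x2 square.
image = [
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
shifts = [1, 2, 3]

flat_score = invariance_score(flat_features, image, shifts)
pooled_score = invariance_score(pooled_features, image, shifts)
print("flat:", flat_score, "pooled:", pooled_score)
```

On this toy input the pooled representation scores essentially 1 while the flattened pixels score much lower. In a study like this one, the feature function would presumably be replaced by activations from a CNN layer and the shift by the translations, rotations, and scalings under test.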
Abstract
As object recognition becomes an increasingly common ML task, and as recent research demonstrates CNNs' vulnerability to attacks and small image perturbations, it becomes necessary to fully understand the foundations of object recognition. We focus on understanding the mechanisms behind how neural networks generalize to spatial transformations of complex objects. While humans excel at discriminating between objects shown at new positions, orientations, and scales, past results demonstrate that this may be limited to familiar objects - humans demonstrate low tolerance to spatial variance for purposefully constructed novel objects. Because training artificial neural networks from scratch is similar to showing novel objects to humans, we seek to understand the factors influencing the tolerance of CNNs to spatial transformations. We conduct a thorough empirical examination of seven Convolutional Neural…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Face recognition and analysis
Methods: 1x1 Convolution · Local Response Normalization · Grouped Convolution · Dropout · Dense Connections · Max Pooling · Softmax · Convolution
