Quantifying Translation-Invariance in Convolutional Neural Networks
Eric Kauderer-Abrams

TL;DR
This paper introduces translation-sensitivity maps to analyze how CNNs achieve translation invariance, revealing that training data augmentation plays a more crucial role than architectural choices.
Contribution
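The study provides a new visualization tool, the translation-sensitivity map, and shows that training data augmentation, rather than architecture, is the primary factor behind translation invariance in CNNs. This challenges the hypotheses that attribute invariance to growing receptive fields or to pooling.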
Findings
Architectural choices have limited impact on translation-invariance.
Training data augmentation significantly enhances translation-invariance.
Receptive field size and pooling are secondary factors.
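As a rough illustration of the augmentation finding, here is a minimal sketch of random-translation augmentation in NumPy. The summary does not specify the augmentation scheme used in the paper, so the function name `random_translate`, the zero padding, and the uniform sampling of shifts are all assumptions made for this example.

```python
import numpy as np

def random_translate(img, max_shift, rng):
    """Return a copy of a 2-D image shifted by a random (dx, dy).

    Assumption: shifts are drawn uniformly from [-max_shift, max_shift]
    and vacated pixels are filled with zeros. This is an illustrative
    scheme, not necessarily the one used in the paper.
    """
    h, w = img.shape
    dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
    # Zero-pad on every side, then crop a window offset by the shift.
    padded = np.pad(img, max_shift)
    y0 = max_shift - dy
    x0 = max_shift - dx
    return padded[y0:y0 + h, x0:x0 + w]

# Toy usage: a 3x3 blob in a 9x9 frame stays fully visible for shifts <= 2.
rng = np.random.default_rng(0)
img = np.zeros((9, 9))
img[3:6, 3:6] = 1.0
augmented = random_translate(img, max_shift=2, rng=rng)
```

Applying such a transform to each training image on every epoch exposes the network to the same object at many positions, which is the kind of training signal the findings credit for translation invariance.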
Abstract
A fundamental problem in object recognition is the development of image representations that are invariant to common transformations such as translation, rotation, and small deformations. There are multiple hypotheses regarding the source of translation invariance in CNNs. One idea is that translation invariance is due to the increasing receptive field size of neurons in successive convolution layers. Another possibility is that invariance is due to the pooling operation. We develop a simple tool, the translation-sensitivity map, which we use to visualize and quantify the translation-invariance of various architectures. We obtain the surprising result that architectural choices such as the number of pooling layers and the convolution filter size have only a secondary effect on the translation-invariance of a network. Our analysis identifies training data augmentation as the most…
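The abstract does not define the translation-sensitivity map in detail. One plausible reading, sketched below, is a grid indexed by pixel shift, where each cell holds the distance between a model's output for a base image and its output for the shifted image. The helper names (`shift_image`, `translation_sensitivity_map`), the Euclidean distance, and the zero-fill shifting are assumptions for this sketch. `features` stands in for any function that maps an image to a vector, such as a trained network's output layer.

```python
import numpy as np

def shift_image(img, dx, dy):
    """Translate a 2-D image by (dx, dy) pixels, zero-filling vacated pixels.

    Assumes |dx| < width and |dy| < height.
    """
    h, w = img.shape
    out = np.zeros_like(img)
    # Windows of the source and destination that stay inside the frame.
    src_y = slice(max(0, -dy), min(h, h - dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_y = slice(max(0, dy), min(h, h + dy))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out

def translation_sensitivity_map(features, img, max_shift):
    """Distance between features(img) and features(shifted img) per shift.

    Row i corresponds to dy = i - max_shift, column j to dx = j - max_shift,
    so the centre cell is the unshifted image and is always zero.
    """
    base = features(img)
    size = 2 * max_shift + 1
    tmap = np.zeros((size, size))
    for i, dy in enumerate(range(-max_shift, max_shift + 1)):
        for j, dx in enumerate(range(-max_shift, max_shift + 1)):
            shifted = features(shift_image(img, dx, dy))
            tmap[i, j] = np.linalg.norm(base - shifted)
    return tmap

# Toy usage with a stand-in "model": raw pixels are maximally shift-sensitive.
img = np.zeros((9, 9))
img[3:6, 3:6] = 1.0
pixel_map = translation_sensitivity_map(lambda x: x.ravel(), img, max_shift=2)
```

Under this reading, a perfectly translation-invariant model would give a map that is zero everywhere, while a sensitive one gives values that grow away from the centre. Averaging such maps over many images would give a per-architecture summary.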
Taxonomy
Topics: Advanced Neural Network Applications · Neural Networks and Applications · Cell Image Analysis Techniques
Methods: Convolution
