Stride and Translation Invariance in CNNs
Coenraad Mouton, Johannes C. Myburgh, Marelie H. Davel

TL;DR
This paper investigates how stride and local homogeneity influence translation invariance in CNNs, revealing dataset-specific relationships and trade-offs with generalization, and evaluates alternative solutions like pooling and data augmentation.
Contribution
It demonstrates that stride can enhance translation invariance when combined with local homogeneity and analyzes the dataset-specific interplay between pooling size and invariance.
Findings
Stride benefits translation invariance with local homogeneity
Larger pooling kernels improve invariance but reduce generalization
Global average pooling and data augmentation have varying effects
Abstract
Convolutional Neural Networks have become the standard for image classification tasks; however, these architectures are not invariant to translations of the input image. This lack of invariance is attributed to the use of stride, which ignores the sampling theorem, and to fully connected layers, which lack spatial reasoning. We show that stride can greatly benefit translation invariance, provided that it is combined with sufficient similarity between neighbouring pixels, a characteristic which we refer to as local homogeneity. We also observe that this characteristic is dataset-specific and dictates the relationship between pooling kernel size and stride required for translation invariance. Furthermore, we find that a trade-off exists between generalization and translation invariance in the case of pooling kernel size, as larger kernel sizes lead to better invariance but poorer generalization.…
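The link between local homogeneity and shift robustness under strided pooling can be illustrated with a toy 1-D experiment. The sketch below is not the paper's method or metric: the helper names (max_pool_1d, shift_sensitivity), the circular shift via np.roll, the kernel and stride of 2, and the synthetic signals are all assumptions chosen for illustration. It compares how much a strided max-pooled output changes after a one-pixel shift, for a signal whose neighbouring values are identical in runs of four versus one with independent random values.

```python
import numpy as np

def max_pool_1d(x, kernel, stride):
    """Max-pool a 1-D array with the given kernel size and stride (no padding)."""
    n_out = (len(x) - kernel) // stride + 1
    return np.array([x[i * stride : i * stride + kernel].max() for i in range(n_out)])

def shift_sensitivity(x, kernel, stride, shift=1):
    """Mean absolute change in the pooled output after a circular input shift.

    Hypothetical measure for this sketch; lower means the pooled output is
    closer to invariant under the shift.
    """
    pooled = max_pool_1d(x, kernel, stride)
    pooled_shifted = max_pool_1d(np.roll(x, shift), kernel, stride)
    return float(np.mean(np.abs(pooled - pooled_shifted)))

rng = np.random.default_rng(0)

# Low local homogeneity: every sample is independent of its neighbours.
noisy = rng.random(4096)

# High local homogeneity: each value is repeated four times in a row.
homogeneous = np.repeat(rng.random(1024), 4)

s_noisy = shift_sensitivity(noisy, kernel=2, stride=2)
s_homog = shift_sensitivity(homogeneous, kernel=2, stride=2)

print(f"shift sensitivity, noisy signal:       {s_noisy:.3f}")
print(f"shift sensitivity, homogeneous signal: {s_homog:.3f}")
```

In this toy setting the homogeneous signal should show a noticeably lower sensitivity, because many pooling windows still cover identical values after the shift. Varying kernel and stride against the run length of the repeated values gives a rough feel for why the required pooling size would depend on the data.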
