On the Shift Invariance of Max Pooling Feature Maps in Convolutional Neural Networks
Hubert Leterme, Kévin Polisano, Valérie Perrier, Karteek Alahari

TL;DR
This paper analyzes how max pooling in CNNs can approximate shift invariance under certain conditions, emphasizing the importance of filter frequency and orientation, and validates the theory with wavelet-based experiments.
Contribution
The paper provides a mathematical framework linking max pooling to the complex modulus, which is nearly shift invariant, and introduces a measure of shift invariance that accounts for the filter's frequency and orientation.
Findings
Max pooling approximates a complex modulus under specific conditions.
Filter frequency and orientation are crucial for shift stability.
Experimental validation using wavelet transforms supports the theory.
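The first two findings can be illustrated numerically. The following is a minimal sketch in a simplified 1D setting, not the paper's 2D analysis or its experimental setup. It filters white noise with a complex Gabor-like filter and measures how much three feature maps change when the input is shifted by one sample: the subsampled real part, the max-pooled real part, and the subsampled complex modulus. The helper names (`gabor_1d`, `max_pool_1d`, `shift_sensitivity`) and all parameter values (frequency `xi`, envelope width `sigma`, pooling window) are assumptions chosen for illustration.

```python
import numpy as np


def gabor_1d(size, xi, sigma):
    """Complex Gabor-like filter: Gaussian envelope times a complex exponential."""
    n = np.arange(size) - size // 2
    return np.exp(-(n ** 2) / (2.0 * sigma ** 2)) * np.exp(1j * xi * n)


def max_pool_1d(v, window):
    """Non-overlapping max pooling (stride equal to the window size)."""
    usable = (len(v) // window) * window
    return v[:usable].reshape(-1, window).max(axis=1)


def relative_change(a, b):
    """Relative L2 distance between two feature vectors."""
    return np.linalg.norm(a - b) / np.linalg.norm(a)


def shift_sensitivity(x, w, window, shift=1):
    """Compare how three feature maps react to an input shift of `shift` samples."""
    z = np.convolve(x, w, mode="valid")
    # Before subsampling, convolution commutes with shifts, so the response to
    # the shifted input is the same response offset by `shift` samples.
    z_ref, z_shifted = z[:-shift], z[shift:]
    return {
        # Real part, subsampled with stride `window` (no pooling).
        "raw": relative_change(z_ref.real[::window], z_shifted.real[::window]),
        # Real part followed by max pooling over each window.
        "max_pool": relative_change(
            max_pool_1d(z_ref.real, window), max_pool_1d(z_shifted.real, window)
        ),
        # Complex modulus, subsampled with stride `window`.
        "modulus": relative_change(
            np.abs(z_ref)[::window], np.abs(z_shifted)[::window]
        ),
    }


# Illustrative (assumed) settings: the filter period 2*pi/xi equals the
# pooling window, so each window sees roughly one full oscillation.
rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
w = gabor_1d(size=129, xi=np.pi / 4, sigma=16.0)
results = shift_sensitivity(x, w, window=8)
for name, value in results.items():
    print(f"{name:>8}: {value:.3f}")
```

With these settings, the subsampled real part should change noticeably more under the shift than either the max-pooled output or the modulus. Changing `xi` relative to the pooling window gives a rough feel for why the filter's frequency matters, although the paper's precise conditions are stated for 2D filters and are not reproduced here.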
Abstract
This paper focuses on improving the mathematical interpretability of convolutional neural networks (CNNs) in the context of image classification. Specifically, we tackle the instability issue arising in their first layer, which tends to learn parameters that closely resemble oriented band-pass filters when trained on datasets like ImageNet. Subsampled convolutions with such Gabor-like filters are prone to aliasing, causing sensitivity to small input shifts. In this context, we establish conditions under which the max pooling operator approximates a complex modulus, which is nearly shift invariant. We then derive a measure of shift invariance for subsampled convolutions followed by max pooling. In particular, we highlight the crucial role played by the filter's frequency and orientation in achieving stability. We experimentally validate our theory by considering a deterministic feature…
Taxonomy
Topics: Neural Networks and Applications · Image and Signal Denoising Methods · Face and Expression Recognition
Methods: Max Pooling
