How deep convolutional neural networks lose spatial information with training
Umberto M. Tomasini, Leonardo Petrini, Francesco Cagnetta, Matthieu Wyart

TL;DR
This paper investigates how deep convolutional neural networks lose spatial information during training, showing empirically and analytically how pooling and ReLU units shape sensitivity to diffeomorphisms and to noise.
Contribution
It provides empirical evidence and an analytical model explaining how spatial pooling and strides influence sensitivity loss and noise amplification in deep CNNs.
Findings
Stability to image diffeomorphisms is achieved by both spatial and channel pooling.
Sensitivity to noise increases with depth due to pooling and ReLU effects.
Strides in architecture affect the scaling of sensitivity to transformations.
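The first two findings can be illustrated with a toy one-dimensional sketch. This is not the paper's experiment or its analytical model: the one-pixel cyclic shift standing in for a diffeomorphism, the one-spike inputs, the normalization, and the function names `avg_pool`, `shift_sensitivity`, and `pooled_noise_rms` are all assumptions made for illustration.

```python
import numpy as np

def avg_pool(x, w):
    """Non-overlapping average pooling of width w along the last axis."""
    return x.reshape(*x.shape[:-1], -1, w).mean(axis=-1)

def shift_sensitivity(n, w):
    """Mean of ||P(shift(x)) - P(x)||^2 / ||P(x)||^2 over all one-spike inputs x.

    P is average pooling of width w; the shift is a cyclic one-pixel
    translation, used here as a crude stand-in for a small deformation.
    """
    x = np.eye(n)                    # row p is a unit spike at position p
    shifted = np.roll(x, 1, axis=1)  # move every spike one pixel to the right
    num = ((avg_pool(shifted, w) - avg_pool(x, w)) ** 2).sum(axis=1)
    den = (avg_pool(x, w) ** 2).sum(axis=1)
    return float((num / den).mean())

def pooled_noise_rms(n, w, rectify, seed=0):
    """Root-mean-square of pooled white noise, optionally passed through a ReLU first."""
    rng = np.random.default_rng(seed)
    eta = rng.standard_normal(n)     # unit-variance Gaussian white noise
    if rectify:
        eta = np.maximum(eta, 0.0)   # ReLU gives the noise a positive mean
    return float(np.sqrt((avg_pool(eta, w) ** 2).mean()))

if __name__ == "__main__":
    for w in (1, 4, 16):
        print("pool width", w, "shift sensitivity", shift_sensitivity(64, w))
    for rectify in (False, True):
        print("rectified", rectify, "pooled noise RMS", pooled_noise_rms(2**16, 256, rectify))
```

In this toy setting the shift sensitivity works out to 2/w, because a spike changes the pooled output only when it crosses a window boundary. Unrectified noise averages out under pooling, with an RMS that falls roughly as 1/sqrt(w), whereas rectified noise keeps a positive mean near 1/sqrt(2π) ≈ 0.40 that pooling does not remove.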
Abstract
A central question of machine learning is how deep nets manage to learn tasks in high dimensions. An appealing hypothesis is that they achieve this feat by building a representation of the data where information irrelevant to the task is lost. For image datasets, this view is supported by the observation that after (and not before) training, the neural representation becomes less and less sensitive to diffeomorphisms acting on images as the signal propagates through the net. This loss of sensitivity correlates with performance, and surprisingly correlates with a gain of sensitivity to white noise acquired during training. These facts are unexplained, and as we demonstrate still hold when white noise is added to the images of the training set. Here, we (i) show empirically for various architectures that stability to image diffeomorphisms is achieved by both spatial and channel pooling,…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Stochastic Gradient Optimization Techniques · Neural Networks and Applications
