Mechanism of feature learning in convolutional neural networks
Daniel Beaglehole, Adityanarayanan Radhakrishnan, Parthe Pandit, Mikhail Belkin

TL;DR
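This paper identifies a mechanism behind feature learning in CNNs: the covariances of a convolutional layer's filters track the average gradient outer product (AGOP) taken with respect to patches of that layer's input. The claim is supported by empirical and theoretical evidence, and the paper uses it to build a new kernel-based deep learning method.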
Contribution
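The paper proposes the Convolutional Neural Feature Ansatz, which states that the covariances of filters in any convolutional layer are proportional to the patch-based AGOP of the input to that layer. It also introduces Deep ConvRFM, a kernel-based method that uses the patch-based AGOP to learn features akin to those of CNNs.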
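Written out, the ansatz takes roughly the following form. The notation is an assumption made here for illustration and is not taken from this page: W_ℓ is the matrix whose rows are the flattened filters of layer ℓ, f is the network (treated as scalar-valued for simplicity), x_1, ..., x_n are the training inputs, and v_{i,p} is the p-th patch of the input to layer ℓ for sample x_i.

```latex
% Hedged restatement of the ansatz; all symbols are illustrative assumptions.
% Left side: (uncentered) covariance of the filters in layer \ell.
% Right side: patch-based AGOP, i.e. gradient outer products averaged over
% samples and accumulated over patch positions.
W_\ell^\top W_\ell \;\propto\; \frac{1}{n} \sum_{i=1}^{n} \sum_{p}
    \nabla_{v_{i,p}} f(x_i)\, \nabla_{v_{i,p}} f(x_i)^\top
```

Because the relation is one of proportionality, the choice between summing and averaging over patch positions only changes the constant.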
Findings
High correlation between filter covariances and patch-based AGOPs in standard architectures (AlexNet, VGG, and ResNets pre-trained on ImageNet).
Deep ConvRFM recovers similar features to CNNs, including edge detectors.
Deep ConvRFM improves performance over fixed convolutional kernels.
Abstract
Understanding the mechanism of how convolutional neural networks learn features from image data is a fundamental problem in machine learning and computer vision. In this work, we identify such a mechanism. We posit the Convolutional Neural Feature Ansatz, which states that covariances of filters in any convolutional layer are proportional to the average gradient outer product (AGOP) taken with respect to patches of the input to that layer. We present extensive empirical evidence for our ansatz, including identifying high correlation between covariances of filters and patch-based AGOPs for convolutional layers in standard neural architectures, such as AlexNet, VGG, and ResNets pre-trained on ImageNet. We also provide supporting theoretical evidence. We then demonstrate the generality of our result by using the patch-based AGOP to enable deep feature learning in convolutional kernel…
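To make the two quantities in the ansatz concrete, here is a minimal NumPy sketch that computes a filter covariance and a patch-based AGOP for a toy single-layer model and then correlates them. Everything in it is an assumption for illustration: the toy model f(x) = sum over patches and filters of tanh(w_k · patch), the helper names (`extract_patches`, `filter_covariance`, `patch_agop`, `matrix_correlation`), and the random untrained filters. It is not the authors' code and does not reproduce their experiments; the paper's claim concerns trained networks, so the number printed here carries no evidential weight.

```python
import numpy as np


def extract_patches(images, q):
    """Flatten all q x q patches (stride 1, no padding).

    images: (n, c, h, w)  ->  patches: (n, num_patches, c * q * q)
    """
    windows = np.lib.stride_tricks.sliding_window_view(images, (q, q), axis=(2, 3))
    n, c, out_h, out_w, _, _ = windows.shape
    # Move the channel axis next to the window axes so each patch is (c, q, q).
    return windows.transpose(0, 2, 3, 1, 4, 5).reshape(n, out_h * out_w, c * q * q)


def filter_covariance(W):
    """Uncentered covariance of the filters. W: (k, d), one flattened filter per row."""
    return W.T @ W


def patch_agop(W, patches):
    """Patch-based AGOP for the toy model f(x) = sum_{p,k} tanh(w_k . patch_p).

    The gradient of f with respect to one patch is W^T (1 - tanh^2(W patch)).
    Outer products are averaged over all samples and patch positions.
    """
    pre = patches @ W.T                       # (n, P, k) pre-activations
    grads = (1.0 - np.tanh(pre) ** 2) @ W     # (n, P, d) per-patch gradients
    g = grads.reshape(-1, grads.shape[-1])
    return g.T @ g / g.shape[0]


def matrix_correlation(A, B):
    """Pearson correlation between the entries of two equally shaped matrices."""
    return np.corrcoef(A.ravel(), B.ravel())[0, 1]


rng = np.random.default_rng(0)
images = rng.normal(size=(16, 3, 6, 6))       # hypothetical toy inputs
W = 0.1 * rng.normal(size=(8, 3 * 3 * 3))     # 8 random 3x3 filters over 3 channels

patches = extract_patches(images, q=3)
cov = filter_covariance(W)
agop = patch_agop(W, patches)
print("correlation:", matrix_correlation(cov, agop))
```

For a real network, the analytic gradient in `patch_agop` would be replaced by gradients obtained from automatic differentiation with respect to each layer's input patches, and `W` would hold the trained filters of that layer.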
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
Methods: Convolution · Dropout · Max Pooling · Softmax · Dense Connections
