Mechanism of feature learning in convolutional neural networks

Daniel Beaglehole; Adityanarayanan Radhakrishnan; Parthe Pandit; Mikhail Belkin

arXiv:2309.00570 · stat.ML · September 4, 2023 · 2 citations

TL;DR

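This paper identifies a mechanism behind feature learning in CNNs: in each convolutional layer, the covariance of the filters is proportional to the average gradient outer product (AGOP) taken with respect to patches of that layer's input. The claim is supported by empirical and theoretical evidence, and it motivates a new kernel-based deep learning method.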

Contribution

The paper proposes the Convolutional Neural Feature Ansatz, which links the filter covariances of a convolutional layer to gradients taken with respect to patches of that layer's input, and introduces Deep ConvRFM, a kernel-based method that learns features akin to those of CNNs.

Findings

01 Filter covariances and patch-based AGOPs are highly correlated in standard architectures.

02 Deep ConvRFM recovers features similar to those learned by CNNs, including edge detectors.

03 Deep ConvRFM improves performance over fixed convolutional kernels.

Abstract

Understanding the mechanism of how convolutional neural networks learn features from image data is a fundamental problem in machine learning and computer vision. In this work, we identify such a mechanism. We posit the Convolutional Neural Feature Ansatz, which states that covariances of filters in any convolutional layer are proportional to the average gradient outer product (AGOP) taken with respect to patches of the input to that layer. We present extensive empirical evidence for our ansatz, including identifying high correlation between covariances of filters and patch-based AGOPs for convolutional layers in standard neural architectures, such as AlexNet, VGG, and ResNets pre-trained on ImageNet. We also provide supporting theoretical evidence. We then demonstrate the generality of our result by using the patch-based AGOP to enable deep feature learning in convolutional kernel…
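The two quantities the ansatz compares can be made concrete on a toy model. The sketch below is not the authors' code (the official aradha/convrfm repository is the reference). It assumes a hypothetical one-layer 1D model f(x) = sum over patches p of sum_c a[c] * relu(W[c] . p), with a single input channel, stride 1, no padding, and random untrained weights. All function names are illustrative. Because nothing is trained here, the resulting correlation is only a demonstration of the computation and says nothing about the paper's empirical results.

```python
import random

def patches_1d(x, q):
    """All length-q sliding windows of x (stride 1, no padding)."""
    return [x[i:i + q] for i in range(len(x) - q + 1)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def patch_gradient(W, a, p):
    """Gradient of sum_c a[c] * relu(W[c] . p) with respect to the patch p."""
    g = [0.0] * len(p)
    for w_c, a_c in zip(W, a):
        if dot(w_c, p) > 0:  # ReLU derivative is 1 where the unit is active
            for j, w_cj in enumerate(w_c):
                g[j] += a_c * w_cj
    return g

def patch_agop(W, a, inputs, q):
    """Gradient outer products summed over patches, averaged over inputs."""
    M = [[0.0] * q for _ in range(q)]
    for x in inputs:
        for p in patches_1d(x, q):
            g = patch_gradient(W, a, p)
            for r in range(q):
                for s in range(q):
                    M[r][s] += g[r] * g[s] / len(inputs)
    return M

def filter_covariance(W):
    """W^T W: the q x q (uncentered) covariance of the k filters."""
    q = len(W[0])
    return [[sum(w[r] * w[s] for w in W) for s in range(q)] for r in range(q)]

def pearson(A, B):
    """Pearson correlation between the flattened entries of A and B."""
    fa = [v for row in A for v in row]
    fb = [v for row in B for v in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    da = [v - ma for v in fa]
    db = [v - mb for v in fb]
    return dot(da, db) / (dot(da, da) ** 0.5 * dot(db, db) ** 0.5)

# Assumed toy sizes: patch width q, number of filters k, signal length n.
rng = random.Random(0)
q, k, n = 3, 8, 16
W = [[rng.gauss(0, 1) for _ in range(q)] for _ in range(k)]
a = [rng.gauss(0, 1) for _ in range(k)]
inputs = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(32)]

agop = patch_agop(W, a, inputs, q)
cov = filter_covariance(W)
corr = pearson(agop, cov)
print("correlation between patch AGOP and W^T W:", corr)
```

For a real network, the per-patch gradients would come from automatic differentiation rather than the hand-derived formula used here, and the comparison would be repeated for each convolutional layer of the trained model.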

Code & Models

Repositories

aradha/convrfm (JAX, official)

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI

Methods: Convolution · Dropout · Max Pooling · Softmax · Dense Connections