Learning with invariances in random features and kernel models
Song Mei, Theodor Misiakiewicz, Andrea Montanari

TL;DR
This paper quantifies the benefit of building invariance into machine learning models. It shows that invariant kernel methods and invariant random features models need substantially fewer samples and hidden units than unstructured models in high-dimensional settings.
Contribution
The paper introduces invariant random features models and invariant kernel methods, characterizes their test error theoretically, and quantifies their efficiency gains over unstructured models.
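As a rough illustration of what an invariant random features map can look like, the sketch below averages a nonlinearity over a group orbit. The choice of group (cyclic shifts of the coordinates), the tanh activation, the Gaussian weights, and the helper names `cyclic_orbit` and `invariant_random_features` are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def cyclic_orbit(x):
    """Return all d cyclic shifts of a vector x as rows of a (d, d) array."""
    d = x.shape[0]
    return np.stack([np.roll(x, s) for s in range(d)])

def invariant_random_features(x, W, activation=np.tanh):
    """Average activation(<w_k, g.x>) over all cyclic shifts g, for each row w_k of W.

    Hypothetical helper: the group and the activation are illustrative choices.
    """
    orbit = cyclic_orbit(x)              # shape (d, d)
    pre = orbit @ W.T                    # shape (d, N); entry [s, k] = <w_k, shift_s(x)>
    return activation(pre).mean(axis=0)  # shape (N,)

rng = np.random.default_rng(0)
d, N = 8, 16                             # input dimension, number of hidden units
W = rng.standard_normal((N, d)) / np.sqrt(d)
x = rng.standard_normal(d)
x *= np.sqrt(d) / np.linalg.norm(x)      # assumed normalization: sphere of radius sqrt(d)

phi = invariant_random_features(x, W)
phi_shifted = invariant_random_features(np.roll(x, 3), W)

# Shifting the input only permutes the orbit, so the averaged features agree.
print(np.allclose(phi, phi_shifted))
```

Because the features are averaged over the whole orbit, any model built on top of them is invariant to shifts by construction, which is the structural property the summary refers to.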
Findings
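Invariant methods need fewer samples and fewer hidden units than unstructured models, by a factor polynomial in the dimension.
The saving holds at equal test error: exploiting invariance lowers the sample size and model size needed to reach a given error.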
Data augmentation with an unstructured kernel estimator is statistically equivalent to using an invariant kernel estimator.
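The last finding can be checked numerically on a toy problem. The sketch below compares kernel ridge regression on a fully augmented dataset against kernel ridge regression with a group-averaged kernel on the original data. Everything specific here is an assumption for illustration: the cyclic-shift group, the kernel exp(<x, y>/d), the toy target, and the helper names. The ridge rescaling `lam / d` comes from a short calculation for a finite group of size d and is not a formula quoted from the source.

```python
import numpy as np

def base_kernel(X, Y):
    """Inner-product kernel exp(<x, y>/d); unchanged when both inputs are shifted together."""
    d = X.shape[1]
    return np.exp(X @ Y.T / d)

def orbit(X):
    """Stack all cyclic shifts of every row; output shape (d * n, d), ordered shift by shift."""
    d = X.shape[1]
    return np.concatenate([np.roll(X, s, axis=1) for s in range(d)], axis=0)

def invariant_kernel(X, Y):
    """Average the base kernel over all cyclic shifts of the second argument."""
    d = X.shape[1]
    K = base_kernel(X, orbit(Y))                         # shape (n_x, d * n_y)
    return K.reshape(X.shape[0], d, Y.shape[0]).mean(axis=1)

def krr_predict(K_train, K_test, y, lam):
    """Kernel ridge regression prediction: K_test @ (K_train + lam * I)^{-1} y."""
    alpha = np.linalg.solve(K_train + lam * np.eye(len(y)), y)
    return K_test @ alpha

rng = np.random.default_rng(0)
n, m, d, lam = 12, 5, 6, 0.5
X = rng.standard_normal((n, d))
X_test = rng.standard_normal((m, d))
y = X.sum(axis=1) ** 2                                   # toy shift-invariant target

# Route 1: unstructured kernel on the augmented data (every shift of every sample).
X_aug, y_aug = orbit(X), np.tile(y, d)
pred_aug = krr_predict(base_kernel(X_aug, X_aug), base_kernel(X_test, X_aug), y_aug, lam)

# Route 2: invariant kernel on the original data, with the ridge rescaled by the group size.
pred_inv = krr_predict(invariant_kernel(X, X), invariant_kernel(X_test, X), y, lam / d)

print(np.allclose(pred_aug, pred_inv))
```

In this toy setting the augmented route solves a linear system d times larger than the invariant-kernel route, so the invariant kernel reaches the same predictions at lower computational cost.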
Abstract
A number of machine learning tasks entail a high degree of invariance: the data distribution does not change if we act on the data with a certain group of transformations. For instance, labels of images are invariant under translations of the images. Certain neural network architectures -- for instance, convolutional networks -- are believed to owe their success to the fact that they exploit such invariance properties. With the objective of quantifying the gain achieved by invariant architectures, we introduce two classes of models: invariant random features and invariant kernel methods. The latter includes, as a special case, the neural tangent kernel for convolutional networks with global average pooling. We consider uniform covariates distributions on the sphere and hypercube and a general invariant target function. We characterize the test error of invariant methods in a…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Radiomics and Machine Learning in Medical Imaging · Model Reduction and Neural Networks
