Provably Strict Generalisation Benefit for Invariance in Kernel Methods

Bryn Elesedy

arXiv:2106.02346 · stat.ML · December 21, 2021

TL;DR

This paper proves that enforcing invariance in kernel ridge regression yields a strictly non-zero generalisation benefit when the target function is invariant to the action of a compact group.

Contribution

It builds on the function space perspective of Elesedy and Zaidi (arXiv:2102.10333) to quantify the benefit of invariance in kernel methods, based on the interplay between the kernel and the group action.

Findings

01

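Enforcing invariance by feature averaging yields a strictly non-zero generalisation benefit when the target is invariant to the action of a compact group.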

02

Generalisation is governed by a notion of effective dimension that arises from the interplay between the kernel and the group.

03

The action of the group induces an orthogonal decomposition of both the reproducing kernel Hilbert space (RKHS) and its kernel.
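The decomposition in the third finding can be sketched in symbols. The notation below is an assumption made for illustration and is not taken from this page: \mathcal{O} denotes averaging over the group \mathcal{G} with respect to its normalised Haar measure \lambda, \mathcal{H} is the RKHS with kernel k, and bars and \perp mark the invariant part and its complement. The precise conditions under which the splitting is orthogonal in \mathcal{H} are those stated in the paper and are not reproduced here.

```latex
% Averaging operator and the induced splitting of a function
\[
  (\mathcal{O} f)(x) = \int_{\mathcal{G}} f(gx)\,\mathrm{d}\lambda(g),
  \qquad
  f = \underbrace{\mathcal{O} f}_{\bar{f}} + \underbrace{(f - \mathcal{O} f)}_{f^{\perp}} .
\]
% Corresponding splitting of the RKHS and of its kernel
\[
  \mathcal{H} = \overline{\mathcal{H}} \oplus \mathcal{H}^{\perp},
  \qquad
  k = \bar{k} + k^{\perp} .
\]
```

Because \lambda is invariant and normalised, \mathcal{O} is idempotent, so \bar{f} is invariant and f^{\perp} averages to zero over the group.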

Abstract

It is a commonly held belief that enforcing invariance improves generalisation. Although this approach enjoys widespread popularity, it is only very recently that a rigorous theoretical demonstration of this benefit has been established. In this work we build on the function space perspective of Elesedy and Zaidi (arXiv:2102.10333) to derive a strictly non-zero generalisation benefit of incorporating invariance in kernel ridge regression when the target is invariant to the action of a compact group. We study invariance enforced by feature averaging and find that generalisation is governed by a notion of effective dimension that arises from the interplay between the kernel and the group. In building towards this result, we find that the action of the group induces an orthogonal decomposition of both the reproducing kernel Hilbert space and its kernel, which may be of interest in its own…
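Feature averaging in kernel ridge regression can be illustrated with a small numerical sketch. Everything concrete here is an assumption chosen for illustration rather than the paper's setup: a finite group of cyclic coordinate shifts stands in for a general compact group, the kernel is Gaussian, the target is sin of the coordinate sum, and the helper names (`gaussian_kernel`, `cyclic_shifts`, `averaged_kernel`, `krr_predict`) are hypothetical. The ridge scaling `n * ridge` is one common convention.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    # Pairwise Gaussian kernel between rows of X and rows of Y.
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def cyclic_shifts(d):
    # Permutation matrices for the d cyclic shifts of d coordinates (a finite group).
    eye = np.eye(d)
    return [np.roll(eye, s, axis=1) for s in range(d)]

def averaged_kernel(X, Y, group, kernel=gaussian_kernel):
    # Average the kernel over the group acting on both arguments.
    K = np.zeros((X.shape[0], Y.shape[0]))
    for g in group:
        for h in group:
            K += kernel(X @ g.T, Y @ h.T)
    return K / len(group) ** 2

def krr_predict(K_train, y_train, K_test, ridge):
    # Kernel ridge regression: solve (K + n * ridge * I) alpha = y, then predict.
    n = K_train.shape[0]
    alpha = np.linalg.solve(K_train + n * ridge * np.eye(n), y_train)
    return K_test @ alpha

# Illustrative data: the target depends only on the coordinate sum,
# so it is invariant to any permutation of coordinates.
rng = np.random.default_rng(0)
d = 4
group = cyclic_shifts(d)
X_train = rng.normal(size=(30, d))
y_train = np.sin(X_train.sum(axis=1)) + 0.1 * rng.normal(size=30)
X_test = rng.normal(size=(200, d))
y_test = np.sin(X_test.sum(axis=1))

pred_plain = krr_predict(gaussian_kernel(X_train, X_train), y_train,
                         gaussian_kernel(X_test, X_train), ridge=1e-2)
pred_inv = krr_predict(averaged_kernel(X_train, X_train, group), y_train,
                       averaged_kernel(X_test, X_train, group), ridge=1e-2)

print("test MSE, plain kernel:   ", np.mean((pred_plain - y_test) ** 2))
print("test MSE, averaged kernel:", np.mean((pred_inv - y_test) ** 2))
```

The averaged kernel is invariant by construction, so predictions do not change when a test point is cyclically shifted. The printed errors come from a single random draw and are not a substitute for the paper's result, which concerns the generalisation gap in theory.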

Videos

Provably Strict Generalisation Benefit for Invariance in Kernel Methods · SlidesLive

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning