Inductive Bias of Multi-Channel Linear Convolutional Networks with Bounded Weight Norm

Meena Jagadeesan; Ilya Razenshteyn; Suriya Gunasekar

arXiv:2102.12238 · cs.LG · July 12, 2022 · 6 cites

TL;DR

This paper characterizes the function-space inductive bias of multi-channel linear convolutional networks with bounded weight norm, showing how the bias depends on the number of input channels and on the network width, with implications for understanding implicit regularization.

Contribution

The paper gives a theoretical analysis of the induced regularizer of two-layer linear convolutional networks, shows how it depends on the number of input channels and on the network width (the number of output channels $C$), and relates it to known norms such as the nuclear norm and the group-sparse norm.

Findings

01

For single-channel inputs, the induced regularizer is independent of the number of output channels $C$ for any kernel size $K$.

02

For multi-channel inputs, the regularizer depends on the number of output channels $C$, but becomes independent of $C$ once $C$ is sufficiently large.

03

Closed-form regularizers include the nuclear norm and the group-sparse norm of the Fourier coefficients of the linear predictor.
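To make the two norms named in the findings concrete, here is a minimal NumPy sketch that evaluates both on the Fourier coefficients of a matrix-valued linear predictor. Everything about the setup is an assumption for illustration, not taken from the paper: the predictor `W` has shape `(D, R)` (spatial positions by input channels), the DFT is taken along the spatial axis with unitary normalization, and the group-sparse norm groups the coefficients by frequency. The function names are hypothetical, and any scaling constants the paper may use are omitted.

```python
import numpy as np

def fourier_coefficients(W):
    """Unitary DFT of each column (input channel) of W along the spatial axis."""
    return np.fft.fft(W, axis=0, norm="ortho")

def nuclear_norm(M):
    """Sum of the singular values of M."""
    return np.linalg.svd(M, compute_uv=False).sum()

def group_sparse_norm(M):
    """l_{2,1} norm: l2 norm of each row (one frequency), summed over rows."""
    return np.linalg.norm(M, axis=1).sum()

# Hypothetical predictor: D = 4 spatial positions, R = 2 input channels.
W = np.ones((4, 2))
W_hat = fourier_coefficients(W)
print(nuclear_norm(W_hat), group_sparse_norm(W_hat))
```

With the unitary normalization assumed here, the DFT is a left multiplication by a unitary matrix, so the nuclear norm of `W_hat` equals that of `W`. The group-sparse norm, by contrast, generally changes under the transform, since it rewards predictors supported on few frequencies.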

Abstract

We provide a function space characterization of the inductive bias resulting from minimizing the $\ell_2$ norm of the weights in multi-channel convolutional neural networks with linear activations, and empirically test our resulting hypothesis on ReLU networks trained using gradient descent. We define an induced regularizer in the function space as the minimum $\ell_2$ norm of weights of a network required to realize a function. For two-layer linear convolutional networks with $C$ output channels and kernel size $K$, we show the following: (a) If the inputs to the network are single-channel, the induced regularizer for any $K$ is independent of the number of output channels $C$. Furthermore, we show that the regularizer is a norm given by a semidefinite program (SDP). (b) In contrast, for multi-channel inputs, multiple output channels can be necessary to merely realize all matrix-valued…
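The definition of the induced regularizer above (the minimum weight norm needed to realize a function) can be illustrated with a small NumPy sketch. The architecture here is an assumption chosen for illustration, not necessarily the paper's exact setup: a single-channel input `x` of length `D`, a first layer of `C` circular convolutions with kernels of size `K` (cross-correlation convention), and a fully connected second layer producing a scalar. All function names are hypothetical. The sketch shows that two different weight settings can realize the same linear predictor with different $\ell_2$ weight norms, which is why the definition takes a minimum over weights.

```python
import numpy as np

def circ_conv(u, x):
    """Circular cross-correlation: out[i] = sum_k u[k] * x[(i + k) % D]."""
    return sum(u[k] * np.roll(x, -k) for k in range(len(u)))

def network(U, V, x):
    """Two-layer linear network. U: (C, K) kernels, V: (C, D) weights, x: (D,)."""
    return sum(V[c] @ circ_conv(U[c], x) for c in range(U.shape[0]))

def end_to_end_predictor(U, V):
    """Recover w with network(U, V, x) == w @ x by probing standard basis vectors."""
    D = V.shape[1]
    return np.array([network(U, V, e) for e in np.eye(D)])

def weight_norm(U, V):
    """l2 norm of all weights of the network."""
    return np.sqrt(np.sum(U ** 2) + np.sum(V ** 2))

# Hypothetical sizes: C = 2 output channels, K = 3, D = 5.
U = np.ones((2, 3))
V = np.ones((2, 5))
w = end_to_end_predictor(U, V)

# Rescaling the layers leaves the realized function unchanged...
w_rescaled = end_to_end_predictor(2.0 * U, V / 2.0)
# ...but changes the weight norm.
print(weight_norm(U, V), weight_norm(2.0 * U, V / 2.0))
```

Probing with basis vectors is a simple way to read off the end-to-end predictor of any linear network. Computing the induced regularizer itself would require minimizing `weight_norm` over all `(U, V)` that realize `w`, which this sketch does not attempt.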

Code & Models

Repositories

mjagadeesan/inductive-bias-multi-channel-CNN (official)

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Graphene research and applications