Inductive Bias of Multi-Channel Linear Convolutional Networks with Bounded Weight Norm
Meena Jagadeesan, Ilya Razenshteyn, Suriya Gunasekar

TL;DR
This paper characterizes the function-space inductive bias of multi-channel linear convolutional networks with bounded weight norm, showing how the bias depends on the number of input channels and on the network width, with implications for understanding implicit regularization.
Contribution
The paper gives a theoretical analysis of the induced regularizer in linear convolutional networks, shows how it depends on the number of input channels and on the network width, and connects it to known norms such as the nuclear norm and the group-sparse norm.
Findings
For single-channel inputs, the induced regularizer is independent of the number of output channels.
For multi-channel inputs, the induced regularizer depends on the number of output channels C, but becomes independent of C once C is sufficiently large.
In special cases the regularizer has a closed form: the nuclear norm or the group-sparse norm of the Fourier coefficients of the linear predictor.
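To make the two norms named above concrete, here is a minimal NumPy sketch that evaluates them on the Fourier coefficients of a matrix-valued linear predictor. Everything specific in it is an assumption for illustration and is not taken from the paper: the predictor is stored as a real array W of shape (D, R) with D spatial positions and R input channels, the DFT is taken along the spatial axis with unitary normalization, and the group-sparse norm groups the R channel entries at each frequency. The function names are hypothetical.

```python
import numpy as np


def fourier_coefficients(W):
    """Unitary DFT of each input channel along the spatial axis.

    Assumed layout: W has shape (D, R) = (spatial positions, input channels).
    """
    return np.fft.fft(W, axis=0, norm="ortho")


def nuclear_norm_fourier(W):
    """Sum of singular values of the (D, R) matrix of Fourier coefficients."""
    W_hat = fourier_coefficients(W)
    return float(np.linalg.svd(W_hat, compute_uv=False).sum())


def group_sparse_norm_fourier(W):
    """l_{2,1} norm: l2 norm across channels at each frequency, summed over frequencies.

    The choice of one group per frequency is an assumption of this sketch.
    """
    W_hat = fourier_coefficients(W)
    return float(np.linalg.norm(W_hat, axis=1).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((8, 3))  # hypothetical predictor: D = 8, R = 3
    print("nuclear norm of Fourier coefficients:", nuclear_norm_fourier(W))
    print("group-sparse norm of Fourier coefficients:", group_sparse_norm_fourier(W))
```

With the unitary normalization assumed here, the DFT leaves singular values unchanged, so the nuclear norm of the Fourier coefficients coincides with the nuclear norm of W itself. The group-sparse norm, by contrast, does change under the transform, which is why the Fourier domain matters for it.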
Abstract
We provide a function space characterization of the inductive bias resulting from minimizing the norm of the weights in multi-channel convolutional neural networks with linear activations and empirically test our resulting hypothesis on ReLU networks trained using gradient descent. We define an induced regularizer in the function space as the minimum norm of weights of a network required to realize a function. For two layer linear convolutional networks with C output channels and kernel size K, we show the following: (a) If the inputs to the network are single channeled, the induced regularizer for any K is independent of the number of output channels C. Furthermore, we derive the regularizer is a norm given by a semidefinite program (SDP). (b) In contrast, for multi-channel inputs, multiple output channels can be necessary to merely realize all matrix-valued…
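The definition of the induced regularizer given in the abstract can be written as a constrained minimization. The following is a hedged formalization only: the symbols \theta (all network weights), f_\theta (the function realized by those weights), and \mathcal{R} (the induced regularizer) are notation assumed here, and the source text does not say whether the weight norm is squared.

```latex
\[
  \mathcal{R}(f) \;=\; \min_{\theta} \; \|\theta\|
  \quad \text{subject to} \quad f_{\theta} = f .
\]
```

Under this reading, two weight settings that realize the same function f are assigned the same value \mathcal{R}(f), so the regularizer is a property of the function rather than of any particular parameterization.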
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Graphene research and applications
