Implicit Regularization in ReLU Networks with the Square Loss

Gal Vardi; Ohad Shamir

arXiv:2012.05156 · cs.LG · June 9, 2021 · 6 cites

TL;DR

This paper investigates the implicit regularization effects of gradient descent in ReLU neural networks with square loss, revealing fundamental limitations in characterizing these effects explicitly and suggesting the need for new theoretical frameworks.

Contribution

It proves that implicit regularization cannot be fully characterized by explicit functions of parameters in simple ReLU models, highlighting the complexity of nonlinear neural network regularization.

Findings

01

Implicit regularization cannot be explicitly characterized for single ReLU neurons.

02

For one hidden-layer networks, only the 'balancedness' property can be characterized explicitly.

03

Results indicate the need for more general frameworks to understand implicit regularization in nonlinear models.
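The "balancedness" property in finding 02 refers to a known conservation law from Du et al. [2018]: for a one-hidden-layer ReLU network f(x) = sum_j v_j * relu(w_j . x) trained by gradient flow, the per-neuron quantity ||w_j||^2 - v_j^2 stays constant over time. The following is a minimal numerical sketch of that property, not code from the paper. It uses plain gradient descent with a small step size, so the quantity is only approximately conserved. The data, network sizes, initialization scale, learning rate, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def balancedness_gap(W, v):
    """Per-neuron quantity ||w_j||^2 - v_j^2 (conserved under gradient flow)."""
    return np.sum(W ** 2, axis=1) - v ** 2


def square_loss(X, y, W, v):
    """Mean square loss (with a 1/2 factor) of f(x) = sum_j v_j * relu(w_j . x)."""
    resid = np.maximum(X @ W.T, 0.0) @ v - y
    return 0.5 * np.mean(resid ** 2)


def train(X, y, W, v, lr=1e-3, steps=2000):
    """Full-batch gradient descent on the square loss; returns updated (W, v)."""
    n = X.shape[0]
    for _ in range(steps):
        pre = X @ W.T                      # (n, k) pre-activations
        act = np.maximum(pre, 0.0)         # (n, k) ReLU outputs
        resid = act @ v - y                # (n,) prediction errors
        grad_v = act.T @ resid / n         # (k,) gradient w.r.t. output weights
        # Gradient w.r.t. w_j: mean_i resid_i * v_j * 1[pre_ij > 0] * x_i
        grad_W = (resid[:, None] * (pre > 0) * v[None, :]).T @ X / n  # (k, d)
        W = W - lr * grad_W
        v = v - lr * grad_v
    return W, v


# Hypothetical toy problem: 20 random samples, 5 inputs, 8 hidden neurons.
rng = np.random.default_rng(0)
n, d, k = 20, 5, 8
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
W0 = 0.5 * rng.normal(size=(k, d))
v0 = 0.5 * rng.normal(size=k)

gap_before = balancedness_gap(W0, v0)
loss_before = square_loss(X, y, W0, v0)

W1, v1 = train(X, y, W0, v0)

gap_after = balancedness_gap(W1, v1)
loss_after = square_loss(X, y, W1, v1)

# How much the conserved quantity moved versus how much the weights moved.
max_drift = np.max(np.abs(gap_after - gap_before))
param_change = np.linalg.norm(W1 - W0) + np.linalg.norm(v1 - v0)

print("loss before / after:", loss_before, loss_after)
print("max drift in ||w_j||^2 - v_j^2:", max_drift)
print("total parameter change:", param_change)
```

In this sketch the first-order change of ||w_j||^2 - v_j^2 cancels exactly at every step, so the remaining drift comes only from the second-order term of the discrete update and shrinks as the learning rate is reduced.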

Abstract

Understanding the implicit regularization (or implicit bias) of gradient descent has recently been a very active research area. However, the implicit regularization in nonlinear neural networks is still poorly understood, especially for regression losses such as the square loss. Perhaps surprisingly, we prove that even for a single ReLU neuron, it is impossible to characterize the implicit regularization with the square loss by any explicit function of the model parameters (although on the positive side, we show it can be characterized approximately). For one hidden-layer networks, we prove a similar result, where in general it is impossible to characterize implicit regularization properties in this manner, except for the "balancedness" property identified in Du et al. [2018]. Our results suggest that a more general framework than the one considered so far may be needed to understand…
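As a reading aid, the setting described in the abstract can be written out as follows. This is a hedged sketch of one common way to formalize "characterizing the implicit regularization by an explicit function", not the paper's exact theorem statement. The symbols x_i, y_i, w, and R are notation assumed here.

```latex
% Single ReLU neuron trained with the square loss (notation assumed for this sketch)
\[
  \mathcal{L}(w) \;=\; \frac{1}{2}\sum_{i=1}^{n}\bigl(\sigma(w^\top x_i) - y_i\bigr)^2,
  \qquad \sigma(z) = \max\{0, z\}.
\]
% Gradient flow started from an initialization w(0)
\[
  \frac{\mathrm{d}w(t)}{\mathrm{d}t} \;=\; -\nabla \mathcal{L}\bigl(w(t)\bigr).
\]
% An explicit characterization would be a function R of the parameters such that,
% whenever gradient flow converges to a zero-loss solution, the limit satisfies
\[
  \lim_{t\to\infty} w(t) \;\in\; \arg\min_{w}\; \mathcal{R}(w)
  \quad \text{subject to} \quad \mathcal{L}(w) = 0.
\]
```

Under this reading, the negative result in the abstract says that no such explicit function can do this job exactly for a single ReLU neuron, while an approximate characterization remains possible.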

Code & Models

Repositories

aminfadaei116/Implicit-Regularization-in-ReLU-Networks-Paper-code

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks · Machine Learning and ELM
