Partial differential equation regularization for supervised machine learning
Adam M Oberman

TL;DR
This paper surveys supervised machine learning and its regularization techniques, showing how implicit methods such as data augmentation, adversarial training, and additive noise can be reframed as explicit gradient regularization.
Contribution
It offers a unified perspective on implicit regularization methods in deep learning by framing them as explicit gradient regularization techniques.
Findings
Implicit regularization methods can be interpreted as explicit gradient regularization.
Generalization bounds that do not depend on the data dimension are discussed.
Various regularization strategies like data augmentation and adversarial training are analyzed.
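As a rough illustration of how adversarial training can be read as gradient regularization, the standard first-order Taylor argument is sketched below. The symbols are assumptions for illustration, not notation taken from the paper: \ell is the loss as a function of the input x, \delta is a perturbation, \varepsilon is its budget, and \|\cdot\|_* is the dual of the norm that constrains \delta.

```latex
\max_{\|\delta\| \le \varepsilon} \ell(x + \delta)
  \;\approx\; \max_{\|\delta\| \le \varepsilon}
      \bigl( \ell(x) + \nabla_x \ell(x) \cdot \delta \bigr)
  \;=\; \ell(x) + \varepsilon \, \|\nabla_x \ell(x)\|_{*}
```

Under this approximation, training on worst-case perturbed inputs behaves like training on clean inputs with an added penalty on the norm of the input gradient of the loss.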
Abstract
This article is an overview of supervised machine learning problems for regression and classification. Topics include: kernel methods, training by stochastic gradient descent, deep learning architecture, losses for classification, statistical learning theory, and dimension-independent generalization bounds. Examples of implicit regularization in deep learning are presented, including data augmentation, adversarial training, and additive noise. These methods are reframed as explicit gradient regularization.
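The link between additive noise and gradient regularization can be checked numerically in the simplest case. The sketch below is not the paper's code; it assumes a linear model f(x) = w·x with squared loss, for which the expected loss under Gaussian input noise equals the clean loss plus sigma^2 times the squared norm of the input gradient (here, w). The function names and the sample values are hypothetical.

```python
import numpy as np

def noisy_loss(w, x, y, sigma, n_samples, rng):
    """Monte Carlo estimate of E[(w.(x + sigma*xi) - y)^2] with xi ~ N(0, I)."""
    xi = rng.standard_normal((n_samples, x.shape[0]))
    preds = (x + sigma * xi) @ w          # model output on each noisy copy of x
    return np.mean((preds - y) ** 2)

def regularized_loss(w, x, y, sigma):
    """Clean squared loss plus sigma^2 * ||grad_x f(x)||^2.

    For the linear model f(x) = w.x the input gradient is simply w.
    """
    return (x @ w - y) ** 2 + sigma ** 2 * np.dot(w, w)

# Hypothetical values chosen only for illustration.
rng = np.random.default_rng(0)
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 0.3, -0.7])
y = 0.2
sigma = 0.1

mc = noisy_loss(w, x, y, sigma, 200_000, rng)
exact = regularized_loss(w, x, y, sigma)
print(f"noise-averaged loss: {mc:.4f}")
print(f"gradient-regularized loss: {exact:.4f}")
```

The two printed values should agree up to Monte Carlo error. For nonlinear models the identity holds only approximately, to leading order in sigma.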
