Partial differential equation regularization for supervised machine learning

Adam M Oberman

arXiv:1910.01612 · cs.LG · October 4, 2019 · 75 Years of Mathematics of Computation

TL;DR

This paper gives an overview of supervised machine learning for regression and classification, and shows how implicit regularization methods such as data augmentation, adversarial training, and additive noise can be reframed as explicit gradient regularization.

Contribution

It offers a unified perspective on implicit regularization methods in deep learning by framing them as explicit gradient regularization techniques.

Findings

1. Implicit regularization methods can be interpreted as explicit gradient regularization.

2. Generalization bounds that do not depend on the data dimension are discussed as part of the statistical learning theory overview.

3. Data augmentation, adversarial training, and additive noise are presented as examples of implicit regularization in deep learning.

Abstract

This article is an overview of supervised machine learning problems for regression and classification. Topics include: kernel methods, training by stochastic gradient descent, deep learning architecture, losses for classification, statistical learning theory, and dimension-independent generalization bounds. Examples of implicit regularization in deep learning are presented, including data augmentation, adversarial training, and additive noise. These methods are reframed as explicit gradient regularization.
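
As a small illustration of the reframing described in the abstract (a standard textbook case, not taken from the paper): for a linear model f(x) = w·x with squared loss, the expected loss on inputs perturbed by Gaussian noise of standard deviation sigma equals the clean loss plus sigma^2 times the squared norm of the input gradient of f, which here is w. The sketch below checks this identity numerically. The function names, the example values, and the restriction to a linear model are assumptions made for illustration only.

```python
import numpy as np

def noisy_loss(w, x, y, sigma, n_samples, rng):
    """Monte Carlo estimate of E[(w.(x + sigma*xi) - y)^2] with xi ~ N(0, I)."""
    xi = rng.standard_normal((n_samples, x.shape[0]))
    preds = (x + sigma * xi) @ w  # model output on each noisy copy of x
    return np.mean((preds - y) ** 2)

def regularized_loss(w, x, y, sigma):
    """Clean squared loss plus sigma^2 * ||grad_x f(x)||^2.

    For the linear model f(x) = w.x, the gradient with respect to x is w.
    """
    return (w @ x - y) ** 2 + sigma ** 2 * (w @ w)

# Hypothetical example values, chosen only for this check.
rng = np.random.default_rng(0)
w = np.array([1.0, -2.0, 0.5])
x = np.array([0.3, 0.1, -0.7])
y = 1.0
sigma = 0.1

mc = noisy_loss(w, x, y, sigma, 200_000, rng)
exact = regularized_loss(w, x, y, sigma)
print(f"noise-averaged loss: {mc:.4f}")
print(f"clean loss + gradient penalty: {exact:.4f}")
```

The two printed values should agree up to Monte Carlo error. For nonlinear models the identity holds only approximately for small sigma, with additional curvature terms.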
