How Data Augmentation affects Optimization for Linear Regression

Boris Hanin; Yi Sun

arXiv:2010.11171 · cs.LG · October 28, 2021 · 1 citation

TL;DR

This paper analyzes how data augmentation schedules influence optimization in linear regression, revealing complex interactions with hyperparameters and providing convergence guarantees for augmented gradient descent.

Contribution

It offers a theoretical analysis of augmented gradient descent in linear regression, characterizing convergence and minimum points for arbitrary augmentation schemes.

Findings

01 Under suitable joint schedules for the learning rate and the augmentation scheme, augmented gradient descent provably converges.

02 Augmentation interacts in complex ways with the learning rate, even in the convex setting.

03 The analysis provides convergence rates and convergence conditions for augmented gradient descent.

Abstract

Though data augmentation has rapidly emerged as a key tool for optimization in modern machine learning, a clear picture of how augmentation schedules affect optimization and interact with optimization hyperparameters such as learning rate is nascent. In the spirit of classical convex optimization and recent work on implicit bias, the present work analyzes the effect of augmentation on optimization in the simple convex setting of linear regression with MSE loss. We find joint schedules for learning rate and data augmentation scheme under which augmented gradient descent provably converges and characterize the resulting minimum. Our results apply to arbitrary augmentation schemes, revealing complex interactions between learning rates and augmentations even in the convex setting. Our approach interprets augmented (S)GD as a stochastic optimization method for a time-varying sequence of…
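The setting described in the abstract can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the authors' implementation: it assumes additive Gaussian input noise as the augmentation (the paper covers arbitrary schemes), and the function name `augmented_gd`, the synthetic data, and the power-law schedules are hypothetical choices made here for demonstration. They are not the schedules or conditions derived in the paper.

```python
import numpy as np

def augmented_gd(X, y, lr_schedule, sigma_schedule, steps, seed=0):
    """Gradient descent on MSE where each step sees a freshly augmented copy of X.

    lr_schedule(t) and sigma_schedule(t) are hypothetical callables that return
    the learning rate and the augmentation noise level at step t.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(steps):
        # Assumed augmentation: additive Gaussian input noise with std sigma_t.
        X_aug = X + sigma_schedule(t) * rng.standard_normal(X.shape)
        # Gradient of the mean squared error (1/n) * ||X_aug w - y||^2.
        grad = (2.0 / n) * X_aug.T @ (X_aug @ w - y)
        w = w - lr_schedule(t) * grad
    return w

# Synthetic, noiseless regression problem (illustrative data only).
data_rng = np.random.default_rng(1)
X = data_rng.standard_normal((200, 5))
w_true = np.arange(1.0, 6.0)
y = X @ w_true

# Baseline: no augmentation, constant learning rate.
w_plain = augmented_gd(X, y, lambda t: 0.1, lambda t: 0.0, steps=2000)

# Joint schedule: learning rate and noise level both decay over time.
w_decay = augmented_gd(
    X, y,
    lr_schedule=lambda t: 0.1 / (1.0 + t) ** 0.5,
    sigma_schedule=lambda t: 0.5 / (1.0 + t) ** 0.5,
    steps=3000,
)

print("distance to w_true, no augmentation:  ", np.linalg.norm(w_plain - w_true))
print("distance to w_true, decaying schedule:", np.linalg.norm(w_decay - w_true))
```

For this particular augmentation, averaging the per-step loss over the noise gives the unaugmented MSE plus sigma_t^2 times the squared norm of w, so each step targets a ridge-style objective whose penalty changes with t. This is one simple instance of the time-varying sequence of objectives that the abstract refers to. Varying the two schedules independently shows how the final iterate depends on both.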

Videos

How Data Augmentation affects Optimization for Linear Regression (SlidesLive)

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Domain Adaptation and Few-Shot Learning

Methods: Stochastic Gradient Descent