A Dynamic Sampling Adaptive-SGD Method for Machine Learning

Achraf Bahamou; Donald Goldfarb

arXiv:1912.13357 · cs.LG · March 4, 2020 · 1 citation

Open Access

TL;DR

This paper introduces an adaptive stochastic gradient method that automatically adjusts batch size and learning rate, leveraging local curvature for efficient training of machine learning models without manual tuning.

Contribution

It presents an adaptive sampling and step-size control method that uses an acute-angle test to ensure search directions are descent directions with high probability, and that converges at a global linear rate on self-concordant functions with high probability, so the user does not have to tune the learning rate.

Findings

01 Compares favorably with fine-tuned SGD when training logistic regression models and DNNs

02 Eliminates the need to tune the learning rate for SGD and the base learning rate for ADAM

03 Produces descent directions with high probability and converges linearly on self-concordant functions

Abstract

We propose a stochastic optimization method for minimizing loss functions, expressed as an expected value, that adaptively controls the batch size used in the computation of gradient approximations and the step size used to move along such directions, eliminating the need for the user to tune the learning rate. The proposed method exploits local curvature information and ensures that search directions are descent directions with high probability using an acute-angle test and can be used as a method that has a global linear rate of convergence on self-concordant functions with high probability. Numerical experiments show that this method is able to choose the best learning rates and compares favorably to fine-tuned SGD for training logistic regression and DNNs. We also propose an adaptive version of ADAM that eliminates the need to tune the base learning rate and compares favorably to…
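As a rough illustration of how a variance-based descent check can drive the batch size, the sketch below tests whether the projections of per-sample gradients onto the mini-batch gradient vary little enough for that gradient to be trusted, and suggests a larger batch when they do not. It is a simplified inner-product-style check, not the paper's exact acute-angle test; the function name `descent_check`, the tolerance `theta`, and the batch-size rule are assumptions made for this example.

```python
import numpy as np

def descent_check(per_sample_grads, theta=0.9):
    """Return (passes, suggested_batch_size) for a variance-based descent check.

    per_sample_grads: array-like of shape (batch, dim), batch >= 2.
    theta: hypothetical tolerance in (0, 1); not a value taken from the paper.
    """
    grads = np.asarray(per_sample_grads, dtype=float)
    batch = grads.shape[0]
    g = grads.mean(axis=0)              # mini-batch gradient estimate
    g_norm_sq = float(g @ g)
    if g_norm_sq == 0.0:
        # Zero gradient estimate: nothing to test, keep the batch size.
        return True, batch
    # Sample variance of each per-sample gradient projected onto g.
    projections = grads @ g
    var = float(projections.var(ddof=1))
    # Accept the direction if var / batch <= theta^2 * ||g||^4.
    bound = theta ** 2 * g_norm_sq ** 2
    passes = var / batch <= bound
    # On failure, pick the batch size that would satisfy the bound
    # if the variance estimate stayed the same.
    suggested = batch if passes else int(np.ceil(var / bound))
    return passes, suggested
```

With identical per-sample gradients the check passes and the batch size is kept; with strongly disagreeing gradients it fails and a larger batch is suggested. A training loop would resample at the suggested size before taking a step.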

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Gaussian Processes and Bayesian Inference

Methods: Test · Logistic Regression · Adam · Stochastic Gradient Descent