"Oddball SGD": Novelty Driven Stochastic Gradient Descent for Training   Deep Neural Networks

Andrew J.R. Simpson

arXiv:1509.05765 · cs.LG · September 21, 2015 · 2 cites

"Oddball SGD": Novelty Driven Stochastic Gradient Descent for Training Deep Neural Networks

TL;DR

This paper introduces Oddball SGD, an error-driven sampling method that speeds up deep neural network training by selecting training examples with probability proportional to their error magnitude. In the paper's DNN example, it trains some 50 times faster than regular SGD.

Contribution

The paper places SGD in a feedback loop in which each training example's probability of selection is proportional to its error magnitude, so the examples with the largest novelty (error) are prioritised during training.

Findings

01 In the paper's DNN example, Oddball SGD trains some 50x faster than regular SGD.

02 Prioritising the training-set elements with the largest error accelerates learning.

03 The speed-up comes from feeding error magnitude back into the probability of selection.

Abstract

Stochastic Gradient Descent (SGD) is arguably the most popular of the machine learning methods applied to training deep neural networks (DNN) today. It has recently been demonstrated that SGD can be statistically biased so that certain elements of the training set are learned more rapidly than others. In this article, we place SGD into a feedback loop whereby the probability of selection is proportional to error magnitude. This provides a novelty-driven oddball SGD process that learns more rapidly than traditional SGD by prioritising those elements of the training set with the largest novelty (error). In our DNN example, oddball SGD trains some 50x faster than regular SGD.
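
The feedback loop described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy linear model, the function names `selection_probabilities` and `oddball_sgd_linear`, the learning rate, the step count, and the small `eps` floor are all assumptions made for the example. It recomputes every per-example error at each step, which is affordable only for a toy problem.

```python
import numpy as np


def selection_probabilities(errors, eps=1e-12):
    """Map per-example errors to sampling probabilities proportional to |error|.

    `eps` is an assumed floor so that no example ever has exactly zero probability.
    """
    mags = np.abs(np.asarray(errors, dtype=float)) + eps
    return mags / mags.sum()


def oddball_sgd_linear(X, y, steps=2000, lr=0.05, seed=0):
    """Error-proportional SGD on a toy linear least-squares model (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        residuals = X @ w - y                    # per-example error under current weights
        p = selection_probabilities(residuals)   # feedback: error magnitude -> probability
        i = rng.choice(n, p=p)                   # draw one training example
        w -= lr * residuals[i] * X[i]            # ordinary SGD step on squared error
    return w
```

On noise-free data generated from a known weight vector, the returned weights should approach that vector. How often the errors are refreshed when training a real DNN is a design choice the abstract does not specify.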

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Gaussian Processes and Bayesian Inference · Stochastic Gradient Optimization Techniques

Methods: Stochastic Gradient Descent