"Oddball SGD": Novelty Driven Stochastic Gradient Descent for Training Deep Neural Networks
Andrew J.R. Simpson

TL;DR
This paper introduces Oddball SGD, an error-driven sampling method that speeds up deep neural network training by selecting training examples with probability proportional to their error magnitude. In the paper's DNN example, it trains some 50x faster than regular SGD.
Contribution
The paper proposes a novelty-driven feedback mechanism for SGD that significantly improves training speed by focusing on the most error-prone data points.
Findings
In the paper's DNN example, oddball SGD trains some 50x faster than regular SGD.
Prioritizing high-error elements accelerates convergence.
The method enhances training efficiency by leveraging error magnitude feedback.
Abstract
Stochastic Gradient Descent (SGD) is arguably the most popular of the machine learning methods applied to training deep neural networks (DNN) today. It has recently been demonstrated that SGD can be statistically biased so that certain elements of the training set are learned more rapidly than others. In this article, we place SGD into a feedback loop whereby the probability of selection is proportional to error magnitude. This provides a novelty-driven oddball SGD process that learns more rapidly than traditional SGD by prioritising those elements of the training set with the largest novelty (error). In our DNN example, oddball SGD trains some 50x faster than regular SGD.
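The feedback loop described in the abstract can be illustrated with a minimal sketch. The abstract only states that the probability of selection is proportional to error magnitude; everything else here is an assumption made for illustration, not the author's implementation: a linear least-squares model stands in for the DNN, one example is drawn per step, errors are recomputed for the whole training set at every step, and the names `selection_probabilities`, `oddball_sgd`, and the smoothing term `eps` are hypothetical.

```python
import numpy as np

def selection_probabilities(errors, eps=1e-12):
    """Turn per-example errors into a sampling distribution.

    Assumption: p_i = |e_i| / sum_j |e_j|, with a tiny eps added so the
    distribution stays valid even when every error is zero.
    """
    mags = np.abs(np.asarray(errors, dtype=float)) + eps
    return mags / mags.sum()

def oddball_sgd(X, y, steps=200, lr=0.1, seed=0):
    """Fit y ~ X @ w, drawing one example per step with probability
    proportional to its current absolute error (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        residuals = X @ w - y                   # per-example error: the feedback signal
        p = selection_probabilities(residuals)  # larger error -> more likely to be picked
        i = rng.choice(len(y), p=p)             # error-proportional selection
        w -= lr * residuals[i] * X[i]           # ordinary SGD update on that example
    return w
```

For instance, `selection_probabilities([1.0, 3.0])` gives approximately `[0.25, 0.75]`, so the example with three times the error is drawn about three times as often. Regular SGD corresponds to replacing `p` with a uniform distribution.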
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Gaussian Processes and Bayesian Inference · Stochastic Gradient Optimization Techniques
Methods: Stochastic Gradient Descent
