Uniform Learning in a Deep Neural Network via "Oddball" Stochastic Gradient Descent
Andrew J.R. Simpson

TL;DR
This paper applies "Oddball" SGD, a recently introduced training method that samples training examples with frequency proportional to their error magnitude, and shows on a video-encoding task that it can enforce uniform error across the training set of a deep neural network.
Contribution
The paper demonstrates, using a deep neural network trained to encode a video, that oddball SGD can enforce uniform error across the training set. Conventional training instead assumes that all examples are equally difficult and uses each one an equal number of times.
Findings
Oddball SGD can enforce uniform error across the training set of a deep neural network.
The demonstration uses a deep neural network trained to encode a video.
Training frequency for each example is proportional to the magnitude of its training error.
Abstract
When training deep neural networks, it is typically assumed that the training examples are uniformly difficult to learn. Or, to restate, it is assumed that the training error will be uniformly distributed across the training examples. Based on this assumption, each training example is used an equal number of times. However, this assumption may not be valid in many cases. "Oddball SGD" (novelty-driven stochastic gradient descent) was recently introduced to drive training probabilistically according to the error distribution: training frequency is proportional to training error magnitude. In this article, using a deep neural network to encode a video, we show that oddball SGD can be used to enforce uniform error across the training set.
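The sampling rule described in the abstract (training frequency proportional to training error magnitude) might be sketched as follows. This is a minimal illustration and not the author's implementation: the function names `oddball_probabilities` and `oddball_sgd_linear`, the use of absolute error normalised to sum to one, the uniform fallback when all errors vanish, and the linear model standing in for the deep network are all assumptions made here for brevity.

```python
import numpy as np


def oddball_probabilities(errors, eps=1e-12):
    """Map per-example errors to a sampling distribution.

    Assumption: "proportional to error magnitude" is taken to mean
    absolute error divided by the sum of absolute errors.
    """
    magnitudes = np.abs(np.asarray(errors, dtype=float))
    total = magnitudes.sum()
    if total <= eps:
        # All errors are (near) zero: fall back to uniform sampling.
        return np.full(magnitudes.shape, 1.0 / magnitudes.size)
    return magnitudes / total


def oddball_sgd_linear(X, y, steps=5000, lr=0.05, seed=0):
    """Fit y ~ X @ w, picking each SGD example with probability
    proportional to its current error magnitude.

    Assumption: a linear model is used here only as a stand-in
    for the deep network discussed in the paper.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        errors = X @ w - y                    # per-example residuals
        p = oddball_probabilities(errors)     # error-driven sampling
        i = rng.choice(len(y), p=p)           # draw one example
        w -= lr * errors[i] * X[i]            # squared-error SGD step
    return w


# Hypothetical toy data: 20 examples, 3 features, noiseless targets.
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
w_fit = oddball_sgd_linear(X, y)
```

Under this rule, examples with large residuals are revisited more often, which tends to pull the largest errors down first. Recomputing all errors at every step is affordable only for a toy problem; how often the error distribution is refreshed in the paper is not stated in the abstract.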
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning
Methods: Stochastic Gradient Descent
