Uniform Learning in a Deep Neural Network via "Oddball" Stochastic Gradient Descent

Andrew J.R. Simpson

arXiv:1510.02442·cs.LG·October 9, 2015

TL;DR

This paper applies "oddball" SGD, a recently introduced training method that selects each training example with frequency proportional to its error magnitude, and shows, using a deep neural network trained to encode a video, that it can enforce uniform error across the training set.

Contribution

The paper demonstrates that Oddball SGD can be used to achieve uniform error distribution in deep neural network training, challenging the assumption of uniform difficulty among training examples.

Findings

01 Oddball SGD can be used to enforce uniform error across the training set of a deep network.

02 The demonstration uses a deep neural network trained to encode a video.

03 The method sets training frequency in proportion to training error magnitude.

Abstract

When training deep neural networks, it is typically assumed that the training examples are uniformly difficult to learn. Or, to restate, it is assumed that the training error will be uniformly distributed across the training examples. Based on this assumption, each training example is used an equal number of times. However, this assumption may not be valid in many cases. "Oddball SGD" (novelty-driven stochastic gradient descent) was recently introduced to drive training probabilistically according to the error distribution: training frequency is proportional to training error magnitude. In this article, using a deep neural network to encode a video, we show that oddball SGD can be used to enforce uniform error across the training set.
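The selection rule described in the abstract (training frequency proportional to training error magnitude) can be illustrated with a small sketch. This is not the paper's implementation: the abstract describes a deep network encoding a video, whereas the code below substitutes a plain linear model so that it stays short and runnable. The function names `oddball_probabilities` and `oddball_sgd`, the learning rate, the step count, the uniform fallback when all errors vanish, and the choice to recompute every residual at each step are all assumptions made for illustration.

```python
import numpy as np

def oddball_probabilities(errors, eps=1e-12):
    """Turn per-example errors into sampling probabilities (hypothetical helper).

    Each example's probability is its error magnitude divided by the total,
    so examples with larger error are drawn more often.
    """
    magnitudes = np.abs(np.asarray(errors, dtype=float))
    total = magnitudes.sum()
    if total < eps:
        # Assumption: if every error is (near) zero, fall back to uniform sampling.
        return np.full(magnitudes.shape, 1.0 / magnitudes.size)
    return magnitudes / total

def oddball_sgd(X, y, steps=2000, lr=0.05, seed=0):
    """Fit a linear model y ~= X @ w with error-proportional example selection.

    A linear model stands in for the deep network here (assumption).
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        residuals = X @ w - y                  # current error of every example
        p = oddball_probabilities(residuals)   # selection probabilities
        i = rng.choice(len(y), p=p)            # pick one example by error magnitude
        w -= lr * residuals[i] * X[i]          # SGD step on that example's squared error
    return w

if __name__ == "__main__":
    X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
    y = X @ np.array([2.0, -1.0])
    print(oddball_probabilities([1.0, -3.0]))  # [0.25 0.75]
    print(oddball_sgd(X, y))
```

With conventional SGD the index `i` would be drawn uniformly; here an example whose error is three times larger is drawn three times as often, which is what pushes the per-example errors toward a common level.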

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning

Methods: Stochastic Gradient Descent