Weight Update Skipping: Reducing Training Time for Artificial Neural Networks

Pooneh Safayenikoo; Ismail Akturk

arXiv:2012.02792·cs.LG·December 8, 2020

TL;DR

This paper introduces a novel training method for ANNs that skips weight updates during periods of minimal accuracy change, significantly reducing training time while maintaining accuracy.

Contribution

The paper proposes a new approach to skip weight updates based on accuracy variation, lowering training costs without sacrificing model performance.

Findings

01

Weight Update Skipping (WUS) reduced training time by up to 54% on CIFAR-10.

02

WUS+LR achieved a 50% reduction in training time on CIFAR-10.

03

The method maintained accuracy comparable to the baseline models.

Abstract

Artificial Neural Networks (ANNs) are known as state-of-the-art techniques in Machine Learning (ML) and have achieved outstanding results in data-intensive applications, such as recognition, classification, and segmentation. These networks mostly use deep stacks of convolutional or fully connected layers with many filters in each layer, demanding a large amount of data and tunable hyperparameters to achieve competitive accuracy. As a result, the storage, communication, and computational costs of training (in particular, training time) become limiting factors in scaling them up. In this paper, we propose a new training methodology for ANNs that exploits the observation that the improvement in accuracy shows temporal variations, which allows us to skip updating weights when the variation is minuscule. During such time windows, we keep updating the bias, which ensures the network still trains and avoids…
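The skipping idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the model (a NumPy logistic regression), the function name `train_with_skipping`, the `threshold` value, and the rule "skip the next epoch's weight update when accuracy changed by less than `threshold`" are all assumptions made for illustration, since the excerpt does not give the exact criterion. The bias is updated every epoch, as the abstract states.

```python
import numpy as np

def train_with_skipping(X, y, epochs=50, lr=0.5, threshold=0.01):
    """Toy logistic regression with a hypothetical accuracy-based skipping rule."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.01, size=X.shape[1])
    b = 0.0
    prev_acc = None
    skip_weights = False   # decided at the end of the previous epoch
    skipped = 0            # number of epochs in which weights were frozen
    acc = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # forward pass
        err = p - y                              # gradient of the loss w.r.t. logits
        b -= lr * err.mean()                     # bias is always updated
        if skip_weights:
            skipped += 1                         # weight gradient is never computed
        else:
            w -= lr * (X.T @ err) / len(y)       # full weight update
        acc = float(((p >= 0.5) == (y == 1)).mean())
        # Assumed rule: skip the next weight update if accuracy barely moved.
        skip_weights = prev_acc is not None and abs(acc - prev_acc) < threshold
        prev_acc = acc
    return w, b, acc, skipped

# Two well-separated Gaussian blobs as a stand-in dataset (illustrative only).
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(-2, 1, (100, 2)), data_rng.normal(2, 1, (100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])
w, b, acc, skipped = train_with_skipping(X, y)
print(f"final accuracy: {acc:.3f}, epochs with weights skipped: {skipped}")
```

In this toy setting the saving comes from never forming the weight gradient `X.T @ err` during skipped epochs. Any measured speedup here would say nothing about the CIFAR-10 figures reported above, which come from the paper's own criterion and networks.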

Taxonomy

Methods: Convolution