AutoAssist: A Framework to Accelerate Training of Deep Neural Networks

Jiong Zhang; Hsiang-fu Yu; Inderjit S. Dhillon

arXiv:1905.03381 · cs.LG · May 10, 2019 · 6 cites

TL;DR

AutoAssist is a framework that accelerates deep neural network training by filtering out less informative instances using a lightweight assistant network, reducing training time significantly while maintaining accuracy.

Contribution

The paper introduces AutoAssist, a novel instance filtering framework with an assistant network to speed up deep neural network training, outperforming traditional importance sampling methods.

Findings

1. Reduces training epochs by 40% for ResNet.
2. Saves 30% training time for transformer models.
3. Maintains comparable accuracy and BLEU scores.

Abstract

Deep neural networks have yielded superior performance in many applications; however, the gradient computation in a deep model with millions of instances leads to a lengthy training process even with modern GPU/TPU hardware acceleration. In this paper, we propose AutoAssist, a simple framework to accelerate training of a deep neural network. Typically, as the training procedure evolves, the amount of improvement in the current model by a stochastic gradient update on each instance varies dynamically. In AutoAssist, we utilize this fact and design a simple instance shrinking operation, which is used to filter out instances with relatively low marginal improvement to the current model; thus the computationally intensive gradient computations are performed on informative instances as much as possible. We prove that the proposed technique outperforms vanilla SGD with existing importance…
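
The instance shrinking idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a logistic-regression main model, a linear assistant trained to predict whether an instance's loss exceeds a running mean, and a hypothetical `keep_floor` probability that lets some filtered instances through. All function and parameter names here are invented for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def make_toy_data(n=300, margin=0.2, seed=1):
    # Linearly separable 2-D points with a constant bias feature.
    rng = random.Random(seed)
    data = []
    while len(data) < n:
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if abs(x1 + x2) < margin:
            continue
        data.append(((x1, x2, 1.0), 1 if x1 + x2 > 0 else 0))
    return data


def train_with_filter(data, epochs=5, lr=0.1, keep_floor=0.1, seed=0):
    """Toy instance filtering (assumed design, not the paper's exact method)."""
    rng = random.Random(seed)
    dim = len(data[0][0])
    main_w = [0.0] * dim          # main model weights
    asst_w = [0.0] * dim          # lightweight assistant weights
    mean_loss = math.log(2.0)     # loss of the zero-initialized main model
    used = skipped = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            # Assistant estimates how likely the instance is informative.
            p_informative = sigmoid(dot(asst_w, x))
            if rng.random() > max(keep_floor, p_informative):
                skipped += 1      # no main-model gradient is computed
                continue
            used += 1
            # Main model: one SGD step on the logistic loss.
            p = sigmoid(dot(main_w, x))
            loss = -math.log(max(p if y == 1 else 1.0 - p, 1e-12))
            for j in range(dim):
                main_w[j] -= lr * (p - y) * x[j]
            # Assistant: label the instance by comparing its loss
            # with the running mean loss, then take one SGD step.
            target = 1.0 if loss > mean_loss else 0.0
            for j in range(dim):
                asst_w[j] -= lr * (p_informative - target) * x[j]
            mean_loss = 0.99 * mean_loss + 0.01 * loss
    return main_w, used, skipped
```

Calling `train_with_filter(make_toy_data())` returns the main model weights together with counts of used and skipped instances. The floor on the keep probability is one plausible way to keep the assistant from starving the main model of low-loss examples; the paper's actual sampling scheme may differ.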

Code & Models

Repositories

zhangjiong724/autoassist-exp (PyTorch)

Taxonomy

Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Average Pooling · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling · Residual Block · Kaiming Initialization