Overcoming Challenges in Fixed Point Training of Deep Convolutional Networks

Darryl D. Lin; Sachin S. Talathi

arXiv:1607.02241 · cs.LG · July 11, 2016 · 29 cites

Open Access

TL;DR

This paper investigates the instability issues in training deep convolutional networks with low numerical precision and proposes methods to improve fixed point training stability through theoretical analysis and experimental validation.

Contribution

The paper provides a theoretical analysis of the training instability caused by low numerical precision and introduces methods that improve fixed point training of deep convolutional networks.

Findings

01. Theoretical link between low precision and training instability
02. Proposed methods improve fixed point training stability
03. Experimental validation shows enhanced training performance

Abstract

It is known that training deep neural networks, in particular, deep convolutional networks, with aggressively reduced numerical precision is challenging. The stochastic gradient descent algorithm becomes unstable in the presence of noisy gradient updates resulting from arithmetic with limited numeric precision. One of the well-accepted solutions facilitating the training of low precision fixed point networks is stochastic rounding. However, to the best of our knowledge, the source of the instability in training neural networks with noisy gradient updates has not been well investigated. This work is an attempt to draw a theoretical connection between low numerical precision and training algorithm stability. In doing so, we will also propose and verify through experiments methods that are able to improve the training performance of deep convolutional networks in fixed point.
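
The abstract names stochastic rounding as a well-accepted aid for low precision training. The sketch below is not taken from the paper; it is a minimal illustration of the general technique, assuming a fixed point grid with step 2^-frac_bits and a toy scalar "weight" that receives a constant update smaller than half a grid step. All names (`round_nearest`, `round_stochastic`, `accumulate`, `frac_bits`) are hypothetical and chosen only for this example.

```python
import math
import random


def round_nearest(x, frac_bits):
    """Quantize x to a grid with step 2**-frac_bits, rounding to nearest."""
    scale = 1 << frac_bits
    return math.floor(x * scale + 0.5) / scale


def round_stochastic(x, frac_bits, rng):
    """Quantize x to the same grid, rounding up with probability equal
    to the fractional remainder, so the result is unbiased in expectation."""
    scale = 1 << frac_bits
    scaled = x * scale
    lower = math.floor(scaled)
    prob_up = scaled - lower  # lies in [0, 1)
    return (lower + (1 if rng.random() < prob_up else 0)) / scale


def accumulate(update, steps, frac_bits, rounder):
    """Repeatedly add a small update to a quantized value (toy stand-in
    for a weight receiving many small gradient steps)."""
    w = 0.0
    for _ in range(steps):
        w = rounder(w + update, frac_bits)
    return w


rng = random.Random(0)   # fixed seed, assumption made for repeatability
frac_bits = 8            # grid step = 1/256, about 0.0039
update = 0.001           # smaller than half a grid step
steps = 10000            # ideal full precision sum: 10.0

w_nearest = accumulate(update, steps, frac_bits, round_nearest)
w_stochastic = accumulate(
    update, steps, frac_bits, lambda x, b: round_stochastic(x, b, rng)
)

print("round to nearest:", w_nearest)
print("stochastic rounding:", w_stochastic)
```

In this toy setting, round to nearest discards every update because each one falls below half a grid step, so the value never leaves zero. Stochastic rounding lands near the full precision sum on average, at the cost of added noise in each step, which is the kind of noisy update the abstract refers to.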

Taxonomy

Topics: Model Reduction and Neural Networks · Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques