Overcoming Challenges in Fixed Point Training of Deep Convolutional Networks
Darryl D. Lin, Sachin S. Talathi

TL;DR
This paper investigates the instability issues in training deep convolutional networks with low numerical precision and proposes methods to improve fixed point training stability through theoretical analysis and experimental validation.
Contribution
The paper provides a theoretical analysis of the instability caused by low numerical precision and introduces methods to improve fixed point training of deep networks.
Findings
Theoretical link between low precision and training instability
Proposed methods improve fixed point training stability
Experimental validation shows enhanced training performance
Abstract
It is known that training deep neural networks, in particular, deep convolutional networks, with aggressively reduced numerical precision is challenging. The stochastic gradient descent algorithm becomes unstable in the presence of noisy gradient updates resulting from arithmetic with limited numeric precision. One of the well-accepted solutions facilitating the training of low precision fixed point networks is stochastic rounding. However, to the best of our knowledge, the source of the instability in training neural networks with noisy gradient updates has not been well investigated. This work is an attempt to draw a theoretical connection between low numerical precision and training algorithm stability. In doing so, we will also propose and verify through experiments methods that are able to improve the training performance of deep convolutional networks in fixed point.
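The abstract names stochastic rounding as a well-accepted aid for low precision training but does not spell it out here. The following is a minimal sketch of the general technique, not the authors' implementation: the function names, the `frac_bits` parameter, and the choice of a power-of-two step size are assumptions made for illustration. It contrasts stochastic rounding with round-to-nearest on an update that is smaller than one quantization step.

```python
import math
import random


def round_to_nearest(x, frac_bits):
    """Deterministically round x to the nearest multiple of 2**-frac_bits."""
    scale = 1 << frac_bits
    return math.floor(x * scale + 0.5) / scale


def stochastic_round(x, frac_bits, rng=random):
    """Round x to a multiple of 2**-frac_bits (hypothetical helper).

    The value is rounded up with probability equal to its fractional
    remainder and down otherwise, so the result equals x in expectation.
    """
    scale = 1 << frac_bits
    scaled = x * scale
    lower = math.floor(scaled)
    remainder = scaled - lower  # lies in [0, 1)
    step_up = 1 if rng.random() < remainder else 0
    return (lower + step_up) / scale


if __name__ == "__main__":
    rng = random.Random(0)
    update = 0.01   # smaller than the quantization step
    frac_bits = 4   # assumed format: step size 2**-4 = 0.0625

    nearest = round_to_nearest(update, frac_bits)
    samples = [stochastic_round(update, frac_bits, rng) for _ in range(20000)]
    mean = sum(samples) / len(samples)

    print("round-to-nearest:", nearest)
    print("stochastic rounding, sample mean:", mean)
```

With round-to-nearest, an update below half a quantization step is always rounded to zero, so it never reaches the weights. Stochastic rounding occasionally rounds it up to a full step, so the average over many updates stays close to the true value, at the cost of extra noise in each individual update.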
Taxonomy
Topics: Model Reduction and Neural Networks · Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques
