The Hidden Power of Pure 16-bit Floating-Point Neural Networks
Juyoung Yun, Byungkon Kang, Zhoulai Fu

TL;DR
This paper demonstrates that pure 16-bit neural networks can outperform 32-bit models in classification tasks, supported by extensive experiments and theoretical analysis, challenging the assumption that lower precision always harms performance.
Contribution
It is the first comprehensive study of pure 16-bit neural networks, showing their unexpected advantages over 32-bit models through experiments and theoretical insights.
Findings
Pure 16-bit networks outperform 32-bit in certain classification tasks
Theoretical analysis supports empirical performance gains
Low-precision training can be detrimental in some scenarios
Abstract
Lowering the precision of neural networks from the prevalent 32-bit precision has long been considered harmful to performance, despite the gain in space and time. Many works propose various techniques to implement half-precision neural networks, but none study pure 16-bit settings. This paper investigates the unexpected performance gain of pure 16-bit neural networks over 32-bit networks in classification tasks. We present extensive experimental results that compare the performance of various 16-bit neural networks favorably with that of their 32-bit counterparts. In addition, we provide a theoretical analysis of the efficiency of 16-bit models and back it with empirical evidence. Finally, we discuss situations in which low-precision training is indeed detrimental.
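To make the 16-bit versus 32-bit comparison concrete, here is a minimal NumPy sketch. It is not the authors' experimental setup: it assumes that "pure 16-bit" means every tensor (inputs, weights, bias, outputs) is stored and computed in IEEE half precision, and it only checks inference on a randomly initialized linear classifier, not training. The names `logits` and `prediction_agreement`, the layer sizes, and the seed are all hypothetical choices made for illustration.

```python
import numpy as np


def logits(x, w, b):
    """Linear classifier scores; the output dtype follows the input dtype."""
    return x @ w + b


def prediction_agreement(seed=0, n=1000, d=64, k=10):
    """Fraction of samples where float16 and float32 predict the same class.

    Illustrative only: random data and random weights, no training involved.
    """
    rng = np.random.default_rng(seed)
    x32 = rng.standard_normal((n, d)).astype(np.float32)
    w32 = (rng.standard_normal((d, k)) / np.sqrt(d)).astype(np.float32)
    b32 = np.zeros(k, dtype=np.float32)

    # "Pure" 16-bit path (assumption): cast everything, keep no float32 copy.
    x16, w16, b16 = (a.astype(np.float16) for a in (x32, w32, b32))
    out16 = logits(x16, w16, b16)
    assert out16.dtype == np.float16  # the result stays in half precision

    pred32 = logits(x32, w32, b32).argmax(axis=1)
    pred16 = out16.argmax(axis=1)
    return float((pred32 == pred16).mean())


# Machine epsilon: the gap between 1.0 and the next representable value.
eps16 = float(np.finfo(np.float16).eps)  # 2**-10
eps32 = float(np.finfo(np.float32).eps)  # 2**-23
agreement = prediction_agreement()

print(f"float16 eps = {eps16}, float32 eps = {eps32}")
print(f"fraction of identical class predictions: {agreement:.3f}")
```

Because classification depends only on which logit is largest, rounding errors matter only when the top two scores are nearly tied, which is one intuition for why reduced precision need not hurt accuracy. The sketch says nothing about training dynamics, where the abstract notes that low precision can be detrimental.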
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Neural Networks and Reservoir Computing
Methods: None
