Training DNNs with Hybrid Block Floating Point
Mario Drumond, Tao Lin, Martin Jaggi, Babak Falsafi

TL;DR
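This paper introduces hybrid block floating point (HBFP), a DNN training approach that performs dot products in block floating point (BFP) and all other operations in floating point (FP). It combines the accuracy of floating point with the hardware efficiency of fixed point, achieving up to 8.5x higher throughput.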
Contribution
The paper proposes HBFP, a hybrid BFP-FP training method. BFP gives a wide dynamic range while letting most DNN operations run on fixed-point logic, and floating point handles the operations for which BFP alone is not directly applicable.
Findings
HBFP matches floating point accuracy across various models.
HBFP achieves up to 8.5x higher throughput.
HBFP uses BFP for dot products and floating point for all other operations.
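The split between BFP dot products and floating-point everything else can be illustrated with a small sketch. This is not the authors' implementation: the function names (`to_bfp`, `bfp_dot`, `hbfp_style_neuron`), the 8-bit mantissa width, the one-block-per-vector layout, and the round-to-nearest rule are all assumptions chosen for illustration. It shows a block of values sharing one exponent, so the multiply-accumulate runs on integers, and only the final result returns to floating point.

```python
import math

def to_bfp(values, mantissa_bits=8):
    """Quantize a block of floats to (integer mantissas, shared exponent).

    Assumed layout: one exponent per block, signed mantissas of
    `mantissa_bits` bits. Each value is approximately mantissa * 2**shared_exp.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0] * len(values), 0
    # frexp returns (m, e) with max_abs == m * 2**e and 0.5 <= m < 1.
    _, e = math.frexp(max_abs)
    # Pick the exponent so the largest magnitude uses the full mantissa width.
    shared_exp = e - (mantissa_bits - 1)
    limit = 2 ** (mantissa_bits - 1) - 1
    mantissas = []
    for v in values:
        q = round(math.ldexp(v, -shared_exp))  # scale, then round to an integer
        mantissas.append(max(-limit, min(limit, q)))  # saturate on overflow
    return mantissas, shared_exp

def bfp_dot(a, b, mantissa_bits=8):
    """Dot product in which the multiply-accumulate uses integers only."""
    ma, ea = to_bfp(a, mantissa_bits)
    mb, eb = to_bfp(b, mantissa_bits)
    acc = sum(x * y for x, y in zip(ma, mb))  # fixed-point style accumulation
    return math.ldexp(acc, ea + eb)  # one exponent addition, then back to float

def hbfp_style_neuron(x, w, bias):
    """Hypothetical layer fragment: BFP dot product, FP bias add and ReLU."""
    z = bfp_dot(x, w) + bias  # bias addition stays in floating point
    return max(0.0, z)        # activation stays in floating point

x = [0.5, -1.25, 3.0, 0.125]
w = [2.0, 0.75, -0.5, 4.0]
print(to_bfp(x))                     # mantissas and shared exponent for x
print(bfp_dot(x, w))                 # BFP dot product
print(hbfp_style_neuron(x, w, 1.0))  # dot product plus FP bias and ReLU
```

The inputs here were picked so that they are exactly representable with 8-bit mantissas; arbitrary values would incur rounding error that shrinks as the mantissa widens.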
Abstract
The wide adoption of DNNs has given birth to unrelenting computing requirements, forcing datacenter operators to adopt domain-specific accelerators to train them. These accelerators typically employ densely packed full precision floating-point arithmetic to maximize performance per area. Ongoing research efforts seek to further increase that performance density by replacing floating-point with fixed-point arithmetic. However, a significant roadblock for these attempts has been fixed point's narrow dynamic range, which is insufficient for DNN training convergence. We identify block floating point (BFP) as a promising alternative representation since it exhibits wide dynamic range and enables the majority of DNN operations to be performed with fixed-point logic. Unfortunately, BFP alone introduces several limitations that preclude its direct applicability. In this work, we introduce HBFP,…
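The dynamic-range argument in the abstract can be made concrete with a toy comparison. The sketch below is illustrative only: the 8-bit widths, the choice of 7 fractional bits for the fixed-point format, and the helper names `fixed_point` and `block_floating_point` are assumptions, not details from the paper. It quantizes a block of very small values (as small gradients might be) with a single fixed scale and with a per-block shared exponent.

```python
import math

def fixed_point(values, frac_bits=7, total_bits=8):
    """Quantize with one global scale of 2**-frac_bits (assumed format)."""
    limit = 2 ** (total_bits - 1) - 1
    scale = 2 ** frac_bits
    return [max(-limit, min(limit, round(v * scale))) / scale for v in values]

def block_floating_point(values, mantissa_bits=8):
    """Quantize with an exponent chosen per block from its largest magnitude."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0.0] * len(values)
    _, e = math.frexp(max_abs)            # max_abs == m * 2**e, 0.5 <= m < 1
    shared_exp = e - (mantissa_bits - 1)  # exponent shared by the whole block
    limit = 2 ** (mantissa_bits - 1) - 1
    out = []
    for v in values:
        q = max(-limit, min(limit, round(math.ldexp(v, -shared_exp))))
        out.append(math.ldexp(q, shared_exp))  # dequantize for comparison
    return out

# A block of tiny values, well below the fixed-point resolution of 2**-7.
small_block = [2.0 ** -20, -3 * 2.0 ** -20, 2.0 ** -19]
print(fixed_point(small_block))           # every entry underflows to zero
print(block_floating_point(small_block))  # relative magnitudes are preserved
```

With the fixed scale, the whole block collapses to zero; with a shared exponent, the same 8 bits per value track the block's magnitude, which is the property that makes BFP attractive for training.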
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Ferroelectric and Negative Capacitance Devices
