FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training

Yonggan Fu; Haoran You; Yang Zhao; Yue Wang; Chaojian Li; Kailash Gopalakrishnan; Zhangyang Wang; Yingyan Celine Lin

arXiv:2012.13113·cs.CV·January 7, 2025·30 cites

TL;DR

FracTrain is a low-precision training method that cuts the cost and energy of DNN training by raising precision progressively along the training trajectory and selecting precision dynamically per input.

Contribution

The paper proposes a fractional quantization method that adapts precision both per input and per training stage, improving training efficiency while keeping accuracy comparable.

Findings

01 Achieves up to 77.6% computational cost savings

02 Reduces training latency by up to 53.5%

03 Maintains comparable accuracy, within 1.87% of SOTA baselines

Abstract

Recent breakthroughs in deep neural networks (DNNs) have fueled a tremendous demand for intelligent edge devices featuring on-site learning, while the practical realization of such systems remains a challenge due to the limited resources available at the edge and the required massive training costs for state-of-the-art (SOTA) DNNs. As reducing precision is one of the most effective knobs for boosting training time/energy efficiency, there has been a growing interest in low-precision DNN training. In this paper, we explore from an orthogonal direction: how to fractionally squeeze out more training cost savings from the most redundant bit level, progressively along the training trajectory and dynamically per input. Specifically, we propose FracTrain that integrates (i) progressive fractional quantization which gradually increases the precision of activations, weights, and gradients that…
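As an illustration of the progressive idea described in the abstract, the following Python sketch raises a bit-width in stages over the course of training and applies a symmetric uniform quantizer at the current bit-width. The function names, the bit-width stages (3, 4, 6, 8), and the evenly spaced, epoch-based switching are assumptions made for this example; they are not taken from the paper or from the RICE-EIC/FracTrain repository.

```python
def precision_schedule(epoch, total_epochs, bit_stages=(3, 4, 6, 8)):
    """Return the bit-width to use at `epoch`.

    Hypothetical schedule: training is split into equal-length stages and
    each later stage uses a higher precision. The stage values and the
    epoch-based switching rule are illustrative assumptions.
    """
    if not 0 <= epoch < total_epochs:
        raise ValueError("epoch must satisfy 0 <= epoch < total_epochs")
    stage = epoch * len(bit_stages) // total_epochs
    return bit_stages[stage]


def quantize_uniform(values, bits):
    """Symmetric uniform quantization of a list of floats to `bits` bits."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return list(values)
    levels = 2 ** (bits - 1) - 1  # number of positive quantization levels
    scale = max_abs / levels      # step size between adjacent levels
    return [round(v / scale) * scale for v in values]


def max_error(original, quantized):
    """Largest absolute difference between matching elements."""
    return max(abs(a - b) for a, b in zip(original, quantized))


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.26, 0.9]
    total_epochs = 100
    for epoch in (0, 30, 60, 99):
        bits = precision_schedule(epoch, total_epochs)
        q = quantize_uniform(weights, bits)
        print(f"epoch {epoch:3d}: {bits} bits, max error {max_error(weights, q):.4f}")
```

Early stages use coarse quantization and the quantization error shrinks as the schedule moves to higher bit-widths. In an actual training loop the same bit-width would be applied to activations, weights, and gradients, and the switching rule could depend on training progress rather than a fixed epoch count.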

Code & Models

Repositories

RICE-EIC/FracTrain (PyTorch, official)

Videos

FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training · SlidesLive

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning