Low-Precision Training of Large Language Models: Methods, Challenges, and Opportunities
Zhiwei Hao, Jianyuan Guo, Li Shen, Yong Luo, Han Hu, Guoxia Wang, Dianhai Yu, Yonggang Wen, Dacheng Tao

TL;DR
This survey reviews low-precision training methods for large language models, categorizing approaches by numerical formats, discussing challenges, and highlighting future research directions to improve efficiency and scalability.
Contribution
It provides a systematic organization of low-precision training techniques into three main categories, offering a unified overview of the fragmented research landscape.
Findings
Categorized low-precision methods into fixed-point, floating-point, and customized formats.
Discussed the relationship between low-precision training and quantization-aware training.
Highlighted promising future research directions in low-precision training.
Abstract
Large language models (LLMs) have achieved impressive performance across various domains. However, the substantial hardware resources required for their training present a significant barrier to efficiency and scalability. To mitigate this challenge, low-precision training techniques have been widely adopted, leading to notable advancements in training efficiency. Despite these gains, low-precision training involves several components, such as weights, activations, and gradients, each of which can be represented in different numerical formats. The resulting diversity has created a fragmented landscape in low-precision training research, making it difficult for researchers to gain a unified overview of the field. This survey provides a comprehensive review of existing low-precision training methods. To systematically organize these approaches, we categorize…
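Example
To make the distinction between fixed-point and floating-point formats concrete, the following is a minimal Python sketch, not code from the survey. The function names (quantize_fixed_point, dequantize_fixed_point, round_to_float_format) and the choices of symmetric per-tensor scaling and an unbounded exponent range are assumptions made purely for illustration.

```python
import math


def quantize_fixed_point(values, bits=8):
    """Symmetric per-tensor fixed-point quantization (illustrative assumption).

    Maps each value to a signed integer code in [-qmax, qmax] using one
    shared scale factor for the whole list.
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize_fixed_point(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


def round_to_float_format(value, mantissa_bits=3):
    """Round to a float with `mantissa_bits` explicit mantissa bits.

    The exponent range is left unbounded (an assumption), so this models
    only the precision loss of a narrow floating-point format, not
    overflow or underflow.
    """
    if value == 0.0:
        return 0.0
    m, e = math.frexp(value)          # value == m * 2**e with 0.5 <= |m| < 1
    steps = 2 ** (mantissa_bits + 1)  # implicit leading bit + explicit bits
    return math.ldexp(round(m * steps) / steps, e)


if __name__ == "__main__":
    weights = [2.0, -1.0, 0.3]
    codes, scale = quantize_fixed_point(weights, bits=8)
    print(codes, dequantize_fixed_point(codes, scale))
    print([round_to_float_format(w, mantissa_bits=3) for w in weights])
```

In this sketch the fixed-point path has a uniform step size (the shared scale), so small values lose relatively more precision, while the floating-point path keeps a roughly constant relative error across magnitudes. That difference is one reason weights, activations, and gradients may be assigned different formats.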
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
