Low-Precision Training of Large Language Models: Methods, Challenges, and Opportunities

Zhiwei Hao; Jianyuan Guo; Li Shen; Yong Luo; Han Hu; Guoxia Wang; Dianhai Yu; Yonggang Wen; Dacheng Tao

arXiv:2505.01043·cs.LG·May 5, 2025

Open Access

TL;DR

This survey reviews low-precision training methods for large language models, categorizing approaches by numerical formats, discussing challenges, and highlighting future research directions to improve efficiency and scalability.

Contribution

It provides a systematic organization of low-precision training techniques into three main categories, offering a unified overview of the fragmented research landscape.

Findings

01. Categorized low-precision methods into fixed-point, floating-point, and customized formats.

02. Discussed the relationship between low-precision training and quantization-aware training.

03. Highlighted promising future research directions in low-precision training.
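
The first finding separates fixed-point from floating-point formats. As a rough illustration of that distinction (not taken from the survey), the sketch below stores the same values two ways: rounded to IEEE 754 half precision, and quantized to 8-bit integers with a shared scale. The helper names `to_fp16`, `quantize_int8`, and `dequantize_int8` are hypothetical, and the symmetric per-tensor int8 scheme is an assumed, commonly used choice rather than a method the paper prescribes.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The struct format character "e" is a 16-bit float (binary16).
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_int8(values):
    """Symmetric per-tensor quantization to integers in [-127, 127].

    Assumption for illustration: one scale for the whole tensor,
    chosen so that the largest magnitude maps to 127.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize_int8(q, scale):
    """Map the stored integers back to approximate real values."""
    return [qi * scale for qi in q]


weights = [0.5, -1.0, 0.003, 0.75]

# Floating-point route: each value keeps its own exponent.
fp16_weights = [to_fp16(w) for w in weights]

# Fixed-point route: all values share a single scale factor.
q, scale = quantize_int8(weights)
int8_weights = dequantize_int8(q, scale)

for w, f, i in zip(weights, fp16_weights, int8_weights):
    print(f"original={w!r} fp16={f!r} int8={i!r}")
```

In this sketch the small value 0.003 survives the half-precision round trip with a small relative error, but collapses to zero under the shared int8 scale, which hints at why the choice of format matters for components with different dynamic ranges.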

Abstract

Large language models (LLMs) have achieved impressive performance across various domains. However, the substantial hardware resources required for their training present a significant barrier to efficiency and scalability. To mitigate this challenge, low-precision training techniques have been widely adopted, leading to notable advancements in training efficiency. Despite these gains, low-precision training involves several components (such as weights, activations, and gradients), each of which can be represented in different numerical formats. The resulting diversity has created a fragmented landscape in low-precision training research, making it difficult for researchers to gain a unified overview of the field. This survey provides a comprehensive review of existing low-precision training methods. To systematically organize these approaches, we categorize…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques